Skip to content

AuthZService: improve authz caching#9

Closed
ShashankFC wants to merge 1 commit into
cache-optimization-baselinefrom
authz-service-improve-caching-pr
Closed

AuthZService: improve authz caching#9
ShashankFC wants to merge 1 commit into
cache-optimization-baselinefrom
authz-service-improve-caching-pr

Conversation

@ShashankFC

@ShashankFC ShashankFC commented Feb 13, 2026

Copy link
Copy Markdown

User description

Test 2nn

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced permission caching mechanism to accelerate authorization checks.
    • Implemented fast-path evaluation for previously cached permissions.
    • Added denial caching to eliminate redundant permission evaluations.
  • Performance

    • Optimized identity permission lookups through improved caching layer.
    • Reduced database queries for authorization checks.
    • Enhanced authorization metrics tracking and tracing.

✏️ Tip: You can customize this high-level summary in your review settings.

nn---n*Replicated from [ai-code-review-evaluation/grafana-coderabbit#2](https://github.com/ai-code-review-evaluation/grafana-coderabbit/pull/2)*

CodeAnt-AI Description

Improve authorization permission caching with explicit denial entries and cache-first checks

What Changed

  • Authorization checks (Check) now first consult a denial cache and immediately deny when an explicit deny entry exists.
  • Check and List operations try to use cached permissions before querying the database; when cached data is missing or outdated they fall back to the DB.
  • Negative check results are recorded in a denial cache so subsequent identical checks are short-circuited.
  • In-process authz client no longer uses a client-side cache to avoid duplicate client caching.
  • Added tests covering cached allow/deny, cache miss fallback, outdated cache behavior, and list-from-cache.

Impact

✅ Fewer DB permission queries
✅ Faster authz checks for repeated denies and cached permissions
✅ Consistent deny behavior for repeated permission checks

💡 Usage Guide

Checking Your Pull Request

Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again, by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.

* remove the use of client side cache for in-proc authz client

Co-authored-by: Gabriel MABILLE <gabriel.mabille@grafana.com>

* add a permission denial cache, fetch perms if not in either of the caches

Co-authored-by: Gabriel MABILLE <gabriel.mabille@grafana.com>

* Clean up tests

Co-authored-by: Ieva <ieva.vasiljeva@grafana.com>

* Cache tests

Co-authored-by: Ieva <ieva.vasiljeva@grafana.com>

* Add test to list + cache

Co-authored-by: Ieva <ieva.vasiljeva@grafana.com>

* Add outdated cache test

Co-authored-by: Ieva <ieva.vasiljeva@grafana.com>

* Re-organize metrics

Co-authored-by: Ieva <ieva.vasiljeva@grafana.com>

---------

Co-authored-by: Gabriel MABILLE <gabriel.mabille@grafana.com>
@ShashankFC

Copy link
Copy Markdown
Author

@CodeAnt-AI: review

@codeant-ai

codeant-ai Bot commented Feb 13, 2026

Copy link
Copy Markdown

CodeAnt AI is running the review.

@codeant-ai codeant-ai Bot added the size:L This PR changes 100-499 lines, ignoring generated files label Feb 13, 2026
@codeant-ai

codeant-ai Bot commented Feb 13, 2026

Copy link
Copy Markdown

Nitpicks 🔍

🔒 No security issues identified
⚡ Recommended areas for review

  • Stale denial cache
    The new denial cache (permDenialCache) stores negative check results per user/action/name/parent. If underlying permissions change (role, teams, or permissions updates), existing denial entries may become stale and cause incorrect denies until the denial TTL expires. Ensure cache invalidation hooks or a broader invalidation strategy exist when permissions are updated.

  • Inconsistent cache access paths
    The PR introduces getCachedIdentityPermissions and uses it in Check/List, but getUserPermissions no longer checks the permCache itself. Calls to getIdentityPermissions will call getUserPermissions and always perform the DB fetch (though they set the cache afterwards). This creates two different cache access patterns and could lead to redundant DB queries if callers bypass getCachedIdentityPermissions. Consider centralizing cache reads or making getUserPermissions perform a cache check as well.

  • NoopCache behavior
    The in-process client is created with a NoopCache that always returns ErrNotFound and no-ops on Set/Delete. This is probably intentional for inproc GRPC to avoid double-caching, but reviewers should confirm the behaviour is expected (no caching in-process) and that no other component relies on these cache calls having effect.

  • Denial cache semantics
    The new permDenialCache is set and used in tests to assert deny precedence over other caches. Ensure production code checks permDenialCache before permissive caches so tests reflect real logic. Also consider adding tests for overlapping cache states (explicit deny + cached allow) for other combinations (e.g. folder-less requests).

  • Missing DB-call assertions
    The new cache-focused tests exercise cache fast-paths but don't assert that the backing store was not invoked. This makes the tests weaker / more brittle: they can pass even if cache lookups are skipped and DB fallbacks are used. Add explicit assertions that the fake store's call counters remain zero for pure cache-hit cases.

s.permDenialCache.Set(ctx, userPermDenialCacheKey("org-12", "test-uid", "dashboards:read", "dash1", "fold1"), true)

// Allow access to the dashboard to prove this is not checked
s.permCache.Set(ctx, userPermCacheKey("org-12", "test-uid", "dashboards:read"), map[string]bool{"dashboards:uid:dash1": false})

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: In the "Should deny on explicit cache deny entry" test, the cached permission map is set to false for the dashboard scope even though the comment and intent are to simulate an allowed permission that is overridden by a deny-cache entry; with the current false value the permission cache alone would also deny access, so the test would still pass even if the denial cache were not consulted, failing to verify the intended behavior. [logic error]

Severity Level: Major ⚠️
- ⚠️ Denial-cache override behavior not validated by this unit test.
- ⚠️ Future regressions in `permDenialCache` may go undetected.
Suggested change
s.permCache.Set(ctx, userPermCacheKey("org-12", "test-uid", "dashboards:read"), map[string]bool{"dashboards:uid:dash1": false})
s.permCache.Set(ctx, userPermCacheKey("org-12", "test-uid", "dashboards:read"), map[string]bool{"dashboards:uid:dash1": true})
Steps of Reproduction ✅
1. Open `pkg/services/authz/rbac/service_test.go` and locate `TestService_CacheCheck`,
subtest `"Should deny on explicit cache deny entry"` at lines 973–995 (from the current
file contents).

2. Observe that this test sets the permission-denial cache to true at line 979:
`s.permDenialCache.Set(ctx, userPermDenialCacheKey(...\"dash1\", \"fold1\"), true)`, and
then sets the permission cache at line 982 with `map[string]bool{"dashboards:uid:dash1":
false}` while the preceding comment says `// Allow access to the dashboard to prove this
is not checked`.

3. Follow the production call path in `pkg/services/authz/rbac/service.go`:
`Service.Check` (lines 93–159) calls `getCachedIdentityPermissions` (lines 342–368), which
returns the cached map from `permCache`, and then `checkPermission` (lines 536–567) checks
`scopeMap[t.scope(req.Name)]` and only grants access if the value is `true`.

4. Because the test's cached permission map contains `"dashboards:uid:dash1": false`,
`checkPermission` will deny access even if the denial-cache lookup at lines 116–121
(`permDenialCache.Get(...)`) were removed or broken; thus the test would still pass
without the denial-cache behavior, proving that the current test does not actually verify
that the deny cache overrides an otherwise allowed permission.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** pkg/services/authz/rbac/service_test.go
**Line:** 982:982
**Comment:**
	*Logic Error: In the "Should deny on explicit cache deny entry" test, the cached permission map is set to `false` for the dashboard scope even though the comment and intent are to simulate an allowed permission that is overridden by a deny-cache entry; with the current `false` value the permission cache alone would also deny access, so the test would still pass even if the denial cache were not consulted, failing to verify the intended behavior.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
👍 | 👎

@codeant-ai

codeant-ai Bot commented Feb 13, 2026

Copy link
Copy Markdown

CodeAnt AI finished running the review.

@github-actions

Copy link
Copy Markdown
Contributor

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 2 weeks if no further activity occurs. Please feel free to give a status update or ping for review. Thank you for your contributions!

@github-actions github-actions Bot added the stale label Mar 16, 2026
@github-actions

Copy link
Copy Markdown
Contributor

This pull request has been automatically closed because it has not had any further activity in the last 2 weeks. Thank you for your contributions!

@github-actions github-actions Bot closed this Mar 30, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:L This PR changes 100-499 lines, ignoring generated files stale

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants