recipe: add gatekeeper as optional component #821
|
Welcome to AICR, @JaydipGabani! Thanks for your first pull request. Before review, please ensure:
A maintainer will review this soon. |
📝 Walkthrough
This PR adds Gatekeeper as an optional system-tier component: a registry entry (Helm chart defaults, defaultVersion 3.22.2, defaultNamespace gatekeeper-system, valueOverride key, and node-scheduling paths), a Helm values file with conservative defaults (replicas, resources, disableMutation: true, auditInterval: 60, constraintViolationsLimit: 20), and a Kyverno Chainsaw health-check that asserts both controller-manager and audit Deployments have available replicas and no Pods are in Pending/Failed/Unknown.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
🚥 Pre-merge checks: ✅ 4 passed
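The health-check conditions named in the walkthrough can be sketched in Python as a rough illustration. This is not the Chainsaw test from the PR: the function name `gatekeeper_problems`, the Deployment and Pod names, and the sample objects are assumptions made up for the example, and the objects only mimic the shape of Kubernetes API responses.

```python
from __future__ import annotations

from typing import Any

# Pod phases the walkthrough says must not be present.
BAD_PHASES = {"Pending", "Failed", "Unknown"}


def gatekeeper_problems(
    deployments: list[dict[str, Any]], pods: list[dict[str, Any]]
) -> list[str]:
    """Return human-readable problems; an empty list means the checks pass."""
    problems: list[str] = []
    for dep in deployments:
        name = dep["metadata"]["name"]
        # availableReplicas is absent from status when no replica is available.
        available = dep.get("status", {}).get("availableReplicas") or 0
        if available < 1:
            problems.append(f"deployment {name} has no available replicas")
    for pod in pods:
        name = pod["metadata"]["name"]
        phase = pod.get("status", {}).get("phase", "Unknown")
        if phase in BAD_PHASES:
            problems.append(f"pod {name} is in phase {phase}")
    return problems


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real cluster.
    sample_deployments = [
        {"metadata": {"name": "gatekeeper-controller-manager"},
         "status": {"availableReplicas": 1}},
        {"metadata": {"name": "gatekeeper-audit"}, "status": {}},
    ]
    sample_pods = [
        {"metadata": {"name": "gatekeeper-audit-abc"},
         "status": {"phase": "Pending"}},
    ]
    for problem in gatekeeper_problems(sample_deployments, sample_pods):
        print(problem)
```

Returning a list of problems rather than a boolean keeps the failure reason visible, which is roughly what an assertion failure in a declarative check would surface.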
|
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@recipes/components/gatekeeper/values.yaml`:
- Around line 52-55: Update the misleading comment: instead of referencing the
unrelated flag "gatekeeper:disableMutation=false", instruct users to re-enable
the PDB via the actual Helm value for this key (podDisruptionBudget.enabled) —
e.g., tell them to use "--set gatekeeper.podDisruptionBudget.enabled=true" (or
the equivalent overlay) so the comment correctly points to the
podDisruptionBudget.enabled setting.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 15dfa5e4-9d35-4c3a-856d-2fe6258734e2
📒 Files selected for processing (3)
recipes/checks/gatekeeper/health-check.yaml
recipes/components/gatekeeper/values.yaml
recipes/registry.yaml
♻️ Duplicate comments (1)
recipes/components/gatekeeper/values.yaml (1)
53-53: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Fix the Helm override example syntax for PDB re-enable.
Line 53 uses `--set gatekeeper:podDisruptionBudget.enabled=true`; Helm `--set` expects dot-separated nested paths. This example can fail to apply the intended value.
Suggested fix
-# Production overlays can re-enable via --set gatekeeper:podDisruptionBudget.enabled=true.
+# Production overlays can re-enable via --set gatekeeper.podDisruptionBudget.enabled=true.
What is the correct Helm --set syntax for nested values, and is "gatekeeper:podDisruptionBudget.enabled=true" valid compared to "gatekeeper.podDisruptionBudget.enabled=true"?
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@recipes/components/gatekeeper/values.yaml` at line 53, The Helm override example uses invalid colon notation; change the example to use Helm's dot-separated nested path syntax so the dashboard PDB value is set correctly (replace "gatekeeper:podDisruptionBudget.enabled=true" with "gatekeeper.podDisruptionBudget.enabled=true"), and update the comment in values.yaml accordingly so users run the correct --set flag.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@recipes/components/gatekeeper/values.yaml`:
- Line 53: The Helm override example uses invalid colon notation; change the
example to use Helm's dot-separated nested path syntax so the dashboard PDB
value is set correctly (replace "gatekeeper:podDisruptionBudget.enabled=true"
with "gatekeeper.podDisruptionBudget.enabled=true"), and update the comment in
values.yaml accordingly so users run the correct --set flag.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 22d83453-7fc3-4806-a9fd-adf8d04b420b
📒 Files selected for processing (1)
recipes/components/gatekeeper/values.yaml
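The dot-path point raised in the review comment above can be illustrated with a deliberately simplified Python model of how a dot-separated `key.path=value` expression expands into nested values. This is a sketch, not Helm's parser: it ignores escaping, list indexes, and comma-separated pairs, and the function name `set_path_to_dict` is hypothetical. It also assumes the flag is consumed by a tool that splits only on dots; a CLI that defines its own `component:path` convention would behave differently.

```python
from __future__ import annotations

from typing import Any


def set_path_to_dict(expr: str) -> dict[str, Any]:
    """Expand 'a.b.c=value' into {'a': {'b': {'c': value}}} (simplified model)."""
    path, _, raw = expr.partition("=")
    # Only the literals 'true' and 'false' are converted; anything else stays a string.
    value: Any = {"true": True, "false": False}.get(raw, raw)
    keys = path.split(".")  # In this model, '.' is the only separator.
    result: dict[str, Any] = {}
    node = result
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return result


if __name__ == "__main__":
    print(set_path_to_dict("gatekeeper.podDisruptionBudget.enabled=true"))
    print(set_path_to_dict("gatekeeper:podDisruptionBudget.enabled=true"))
```

Under this model the dotted form nests three levels deep, while the colon form yields a single top-level key literally named `gatekeeper:podDisruptionBudget`, so the intended value would not land where a chart expects it.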
|
Hi @JaydipGabani — thanks for kicking this off and for the Slack discussion behind it. Before we dig into the YAML, can we get the PR description filled in to match the PR template? Right now the Summary, Motivation / Context, and Implementation Notes sections are empty, the Type of Change boxes aren't checked, and there's no Concretely, what would help reviewers most is an issue (filed in this repo) that captures the parts of the Slack thread that justify landing Gatekeeper in AICR — specifically:
Once an issue exists, please link it via Also worth noting that the branch is |
|
Addressed the durable-context and template gaps:
|
Signed-off-by: Jaydip Gabani <gabanijaydip@gmail.com>
Signed-off-by: Jaydip Gabani <gabanijaydip@gmail.com>
Signed-off-by: Jaydip Gabani <gabanijaydip@gmail.com>
2e1b03a to c45a9e1
|
Added context |
mchmarny
left a comment
Let's hold off on this PR for now to align on the scope. See: #860 (comment)
Summary
Adds Gatekeeper as an optional AICR recipe component for clusters that need admission-controller support for AI workload guardrails.
Motivation / Context
The direct Slack discussion in `#aicr` started from the Gatekeeper maintainers asking whether defining AI workload Kubernetes policies belongs in AICR. The thread identified three possible integration shapes: policy-as-component, validation integration through `gator policy` and `--no-cluster`, and conformance evidence.

Reviewers in the thread asked for concrete AI-specific policy examples and for the distinction from AICR/Chainsaw checks. The discussion converged on workload-shape guardrails such as `nvidia.com/gpu` request/limit parity, memory request/limit parity for GPU pods, CPU request requirements, training shared-memory requirements, and optional max-GPU caps for shared clusters. The resulting split was that AICR and Chainsaw cover platform correctness and component health, while Gatekeeper and `gator` cover reusable workload guardrails through offline policy evaluation or admission-time enforcement.

This PR lands only the deployable Gatekeeper component slice. It does not ship policy bundles, run `gator`, connect Gatekeeper to AICR validation, or capture policy results as conformance evidence. The broader Slack-derived rationale and follow-up integration paths are tracked in #860.

Fixes: N/A
Related: #860
Related discussion: https://kubernetes.slack.com/archives/C0AQMPP1BK7/p1775874775990329
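To make the workload-shape guardrails from the thread concrete, here is a rough Python sketch of three of them (GPU request/limit parity, memory request/limit parity for GPU containers, and a CPU request requirement) plus an optional max-GPU cap. It is illustrative only: it is not a Gatekeeper policy, this PR ships no policies, and the function name `check_gpu_container`, the violation messages, and the sample container are assumptions. The shared-memory requirement is omitted because it depends on volume definitions not modeled here.

```python
from __future__ import annotations

from typing import Any, Optional

GPU = "nvidia.com/gpu"


def check_gpu_container(
    container: dict[str, Any], max_gpus: Optional[int] = None
) -> list[str]:
    """Return guardrail violations for one container; CPU-only containers pass."""
    resources = container.get("resources", {})
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})

    gpu_limit = limits.get(GPU)
    # Kubernetes defaults an omitted extended-resource request to the limit.
    gpu_request = requests.get(GPU, gpu_limit)
    if gpu_limit is None and gpu_request is None:
        return []  # Not a GPU container, so these guardrails do not apply.

    violations: list[str] = []
    if str(gpu_request) != str(gpu_limit):
        violations.append("nvidia.com/gpu request and limit differ")
    memory_request = requests.get("memory")
    if memory_request is None or memory_request != limits.get("memory"):
        violations.append("memory request and limit differ or are missing")
    if "cpu" not in requests:
        violations.append("cpu request is missing")
    if max_gpus is not None and gpu_limit is not None and int(str(gpu_limit)) > max_gpus:
        violations.append(f"nvidia.com/gpu limit exceeds cap of {max_gpus}")
    return violations


if __name__ == "__main__":
    # Hypothetical training container used only for demonstration.
    trainer = {
        "name": "trainer",
        "resources": {
            "requests": {"cpu": "4", "memory": "16Gi", GPU: "2"},
            "limits": {"memory": "16Gi", GPU: "2"},
        },
    }
    print(check_gpu_container(trainer))
    print(check_gpu_container(trainer, max_gpus=1))
```

The same rules could in principle be evaluated offline against manifests or at admission time, which is the split between `gator` and in-cluster enforcement described above.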
Type of Change
Component(s) Affected
(`cmd/aicr`, `pkg/cli`)
(`cmd/aicrd`, `pkg/api`, `pkg/server`)
(`pkg/recipe`)
(`pkg/bundler`, `pkg/component/*`)
(`pkg/collector`, `pkg/snapshotter`)
(`pkg/validator`)
(`pkg/errors`, `pkg/k8s`)
(`docs/`, `examples/`)
Implementation Notes
`recipes/registry.yaml`.
`3.22.2` and installs it into `gatekeeper-system` when recipes opt in.
`gator` validation, AICR validator integration, and conformance evidence integration to follow-up work tracked in Epic: Track Gatekeeper integration for AI workload guardrails #860.
Testing
Passed locally. This covers race-enabled unit tests, coverage enforcement, linting, Chainsaw CLI e2e tests, vulnerability scan, license checks, and repository-specific documentation checks.
Risk Assessment
Rollout notes: Gatekeeper remains opt-in. Existing recipes are unchanged unless they explicitly reference the component, and this PR does not install or enforce any workload policies.
Checklist
(`make test` with `-race`)
(`make lint`)
(`git commit -S`) - GPG signing info