diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md
index 9f9e5fb1b..72a68d8ec 100644
--- a/.claude/CLAUDE.md
+++ b/.claude/CLAUDE.md
@@ -267,6 +267,8 @@ slog.Error("operation failed", "error", err, "component", "gpu-collector")
 
 **Note:** A component must have either `helm` OR `kustomize` configuration, not both.
 
+**After any change to `recipes/registry.yaml`, a component's values file, or a chart version pin (in registry, overlay, or mixin):** run `make bom-docs` and commit the regenerated `docs/user/container-images.md` in the same PR. The BOM is rendered fresh from each Helm chart's actual templates, so an unbumped pin can still pick up upstream image drift; running `make bom-docs` locally is the only reliable way to know whether the doc needs an update. `make bom-check` verifies the committed BOM matches a fresh regen, but it is **opt-in only**: it is not wired into `make qualify`, `make lint`, or the merge gate today. Do not rely on any of them to catch a missed regen.
+
 **Using mixins for shared OS/platform content:**
 ```yaml
 # Leaf overlay referencing mixins instead of duplicating content
diff --git a/AGENTS.md b/AGENTS.md
index 4126eec50..beadd6843 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -267,6 +267,8 @@ slog.Error("operation failed", "error", err, "component", "gpu-collector")
 
 **Note:** A component must have either `helm` OR `kustomize` configuration, not both.
 
+**After any change to `recipes/registry.yaml`, a component's values file, or a chart version pin (in registry, overlay, or mixin):** run `make bom-docs` and commit the regenerated `docs/user/container-images.md` in the same PR. The BOM is rendered fresh from each Helm chart's actual templates, so an unbumped pin can still pick up upstream image drift; running `make bom-docs` locally is the only reliable way to know whether the doc needs an update. `make bom-check` verifies the committed BOM matches a fresh regen, but it is **opt-in only**: it is not wired into `make qualify`, `make lint`, or the merge gate today. Do not rely on any of them to catch a missed regen.
+
 **Using mixins for shared OS/platform content:**
 ```yaml
 # Leaf overlay referencing mixins instead of duplicating content
diff --git a/Makefile b/Makefile
index 6de95b1a7..f48c50369 100644
--- a/Makefile
+++ b/Makefile
@@ -291,7 +291,7 @@ bom-docs: ## Regenerates the auto-generated section of $(BOM_DOC_PATH) from the
 	echo "Updated $(BOM_DOC_PATH) (prose preserved, auto-generated section refreshed)"
 
 .PHONY: bom-check
-bom-check: ## Verifies $(BOM_DOC_PATH) is up to date with the live registry (CI gate, opt-in locally)
+bom-check: ## Verifies $(BOM_DOC_PATH) is up to date with the live registry (opt-in; not wired into qualify/lint/merge gate)
 	@set -e; \
 	$(MAKE) bom-docs; \
 	if ! git diff --quiet -- $(BOM_DOC_PATH); then \
diff --git a/docs/contributor/data.md b/docs/contributor/data.md
index fc89986f3..3d38999a9 100644
--- a/docs/contributor/data.md
+++ b/docs/contributor/data.md
@@ -613,6 +613,20 @@ componentRefs:
 
 The system performs **topological sort** to compute deployment order, ensuring dependencies are deployed before dependents. The resulting order is exposed in `RecipeResult.DeploymentOrder`.
 
+### Regenerating the BOM
+
+`docs/user/container-images.md` is an auto-generated bill of materials listing every container image AICR pulls across all components. It is regenerated by `make bom-docs`, which renders each Helm chart against its live OCI source and extracts image references from the rendered templates.
+
+**Run `make bom-docs` and commit the regenerated `docs/user/container-images.md` in the same PR whenever you:**
+
+- Add or remove a component in `recipes/registry.yaml`
+- Bump a chart version (in `registry.yaml`, an overlay, or a mixin)
+- Modify a component's `values.yaml` in a way that changes which images render (image repo override, subchart enable/disable, etc.)
+
+The regen can also surface drift from *upstream* chart updates, where a chart bumps an image inside its own templates without a registry pin change on our side. That drift will appear in the BOM diff whether you expected it or not. Land it as part of the same PR that triggered the regen, or split it out as a separate "BOM catch-up" PR if the unrelated diff would obscure the primary change.
+
+**Freshness is not gated.** `make bom-check` verifies that the committed `docs/user/container-images.md` matches a fresh regen, but it is opt-in: neither `make qualify` nor `make lint` runs it today, and the merge gate has no PR-time BOM-staleness check (it runs only `bom-pinning-check`, the chart-pin verification per ADR-006). Run `make bom-docs` explicitly whenever you touch a component; do not rely on local qualify or CI to catch a missed regen. Wiring `bom-check` into the gate is a desirable follow-up.
+
 ## Criteria Matching Algorithm
 
 The recipe system uses an **asymmetric rule matching algorithm** where recipe criteria (rules) match against user queries (candidates).
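Because the freshness check described in the `data.md` hunk above is opt-in, a contributor may want to run the same comparison by hand before pushing. The following is a minimal sketch only, not part of this change: `check_bom_fresh` is a hypothetical helper name, and it assumes the same logic the `bom-check` target shows in the Makefile hunk (regenerate, then `git diff --quiet` on the doc path).

```shell
#!/bin/sh
# Hypothetical helper (not in the repository): mirrors the regen-then-diff
# logic visible in the bom-check Makefile target.
# Usage: check_bom_fresh <doc-path> <regen-command> [args...]
check_bom_fresh() {
  doc_path=$1
  shift
  # Run the regen command (for example `make bom-docs`); report 2 if it fails.
  "$@" || return 2
  # `git diff --quiet` exits non-zero when the working tree differs from the index.
  if ! git diff --quiet -- "$doc_path"; then
    echo "stale: $doc_path differs from a fresh regen; commit the update" >&2
    return 1
  fi
  echo "fresh: $doc_path matches a fresh regen"
}
```

Assuming the paths from this diff, an invocation would look like `check_bom_fresh docs/user/container-images.md make bom-docs`. Passing the regen command as arguments keeps the helper independent of any one target name.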
diff --git a/docs/user/container-images.md b/docs/user/container-images.md
index ac5e46c2e..6cb3e74cb 100644
--- a/docs/user/container-images.md
+++ b/docs/user/container-images.md
@@ -19,8 +19,8 @@ A machine-readable **CycloneDX 1.6 JSON** companion to this page is produced by
 
 ## Summary
 
-- Components: **22**
-- Unique images: **69**
+- Components: **24**
+- Unique images: **71**
 - Distinct registries: **11**
 
 Registries: `602401143452.dkr.ecr.us-west-2.amazonaws.com`, `cr.kgateway.dev`, `docker.io`, `gcr.io`, `ghcr.io`, `gke.gcr.io`, `nvcr.io`, `public.ecr.aws`, `quay.io`, `registry.k8s.io`, `us-docker.pkg.dev`
@@ -51,6 +51,8 @@ Registries: `602401143452.dkr.ecr.us-west-2.amazonaws.com`, `cr.kgateway.dev`, `
 | nvidia-dra-driver-gpu | helm | nvidia/nvidia-dra-driver-gpu | 25.12.0 | 1 |
 | nvsentinel | helm | nvsentinel | v1.3.0 | 6 |
 | prometheus-adapter | helm | prometheus-community/prometheus-adapter | 5.3.0 | 1 |
+| slinky-slurm-operator | helm | slurm-operator | 1.1.0 | 2 |
+| slinky-slurm-operator-crds | helm | slurm-operator-crds | 1.1.0 | 0 |
 
 ## Images by component
 
@@ -141,7 +143,7 @@ _No images extracted._
 ### kubeflow-trainer
 
 - `ghcr.io/kubeflow/trainer/trainer-controller-manager:v2.2.0`
-- `pytorch/pytorch:2.9.1-cuda12.8-cudnn9-runtime@sha256:7b324d212a4450795b49edba9949b7cdc72429148a64e974334bfe5774d51385`
+- `pytorch/pytorch:2.11.0-cuda12.8-cudnn9-runtime@sha256:eee11b3b3872a8c838e35ef48f08b2d5def2080902c7f666831310ca1a0ef2be`
 - `registry.k8s.io/jobset/jobset:v0.11.0`
 
 ### kueue
@@ -150,7 +152,7 @@ _No images extracted._
 
 ### network-operator
 
-- `busybox:1.36@sha256:73aaf090f3d85aa34ee199857f03fa3a95c8ede2ffd4cc2cdb5b94e566b11662`
+- `busybox:1.37@sha256:1487d0af5f52b4ba31c7e465126ee2123fe3f2305d638e7827681e7cf6c83d5e`
 - `nvcr.io/nvidia/cloud-native/network-operator:v26.1.1`
 - `nvcr.io/nvidia/doca/doca_telemetry:1.22.5-doca3.1.0-host`
 - `nvcr.io/nvidia/mellanox/doca-driver:doca3.2.0-25.10-1.2.8.0-2`
@@ -190,6 +192,15 @@ _No images extracted._
 
 - `registry.k8s.io/prometheus-adapter/prometheus-adapter:v0.12.0`
 
+### slinky-slurm-operator
+
+- `ghcr.io/slinkyproject/slurm-operator-webhook:1.1.0`
+- `ghcr.io/slinkyproject/slurm-operator:1.1.0`
+
+### slinky-slurm-operator-crds
+
+_No images extracted._
+
 ## How to read this list