chore(recipes): align overlay network-operator pins to v26.1.1 #886

Merged
mchmarny merged 2 commits into NVIDIA:main from yuanchen8911:chore/network-operator-v26.1.1
May 14, 2026

Conversation

@yuanchen8911
Contributor

Summary

Aligns two overlay-level network-operator chart pins to match the registry default (26.1.1):

  • recipes/overlays/kind.yaml: 25.1.0 → 26.1.1 (a year behind; this is the 25.1.0 → v26.1.1 major bump that #698's audit flagged)
  • recipes/overlays/aks.yaml: "v26.1.0" → "26.1.1" (also drops the spurious "v" prefix — NGC publishes the chart as 26.1.1 without v; Helm was warning "unable to find exact version requested; falling back to closest available version" on every render)

Motivation / Context

The audit in #698 lists network-operator: 25.1.0 → v26.1.1, major. The registry default at recipes/registry.yaml was bumped to 26.1.1 in #777, but two overlays carry their own explicit pins that didn't get updated alongside the registry — the classic overlay-pin drift pattern we've cleaned up in dynamo (#725) and aws-efa (#872).

Fixes: follow-up #3 from #698 (network-operator: 25.1.0 → v26.1.1, major)
Related: #698, #777 (registry default bump)
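
The drift pattern described above (explicit overlay pins falling behind the registry default) can be checked mechanically. The following is a minimal sketch, not the repository's chart-pin verification: `report_drift` is a hypothetical helper, and the pins are passed in as `name=version` arguments rather than parsed from the overlay files, since the overlay schema is not shown here.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): report overlay pins that
# differ from the registry default.
# Usage: report_drift DEFAULT name=version [name=version ...]
report_drift() {
  local default="$1"; shift
  local pair name pin status=0
  for pair in "$@"; do
    name="${pair%%=*}"   # text before the first "="
    pin="${pair#*=}"     # text after the first "="
    if [ "$pin" != "$default" ]; then
      printf '%s: pinned %s, registry default %s\n' "$name" "$pin" "$default"
      status=1
    fi
  done
  # Non-zero exit status when at least one pin drifts.
  return "$status"
}
```

Called as `report_drift 26.1.1 kind=25.1.0 aks=v26.1.0`, this flags both overlays. The comparison is a plain string match, so a pin written as "v26.1.1" would also be reported as drift, which matches the intent of pinning the chart version exactly.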

Type of Change

  • Refactoring (no functional changes — overlay pins now match registry default that bundles already use)

Component(s) Affected

  • Recipe engine / data (recipes/overlays/kind.yaml, recipes/overlays/aks.yaml)

Implementation Notes

No image changes. make bom-docs produces zero diff against main after this PR, because the registry default at 26.1.1 was already the BOM source — every bundle built from the registry default has been using v26.1.1 since #777. The two overlays were the only places still pinning older versions, and the kind overlay's 25.1.0 pin was a year stale.

v-prefix correction on aks.yaml. NGC publishes the chart as version 26.1.1 (no v prefix; the appVersion is v26.1.1). The "v26.1.0" pin in aks.yaml was emitting WARN: unable to find exact version requested; falling back to closest available version on every render. Helm was falling back to chart 26.1.0 correctly, but the warning is noisy and the pin should match the chart version exactly.
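
As a rough illustration of the prefix fix, a pin can be normalized by dropping a single leading "v" before it is compared with the published chart version. `normalize_chart_version` is a hypothetical name used only for this sketch, and the repository URL in the comment is an assumption about where the chart is pulled from.

```shell
#!/usr/bin/env bash
# Hypothetical helper: drop one leading "v" from a chart pin so it can be
# compared with the chart's `version` field (not its `appVersion`).
normalize_chart_version() {
  printf '%s\n' "${1#v}"
}

# To inspect the published `version` and `appVersion` for a pin, something
# like the following could be used (repo URL is an assumption):
#   helm show chart network-operator \
#     --repo https://helm.ngc.nvidia.com/nvidia --version 26.1.1
```

For example, `normalize_chart_version v26.1.0` prints `26.1.0`, while an already clean pin such as `26.1.1` passes through unchanged.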

Dead config in kind values.yaml (noted, not addressed). recipes/components/network-operator/values.yaml carries top-level keys (deployCR, nicClusterPolicy, nvIpam, secondaryNetwork) that the v26.1.x chart schema no longer accepts. Helm silently ignores them and the chart renders correctly; the keys are simply dead config. Cleaning up the kind values file to match the new schema is out of scope for this chart-pin alignment: it would conflate "bump pin" with "rewrite values schema". It is also a low-priority cleanup, since kind environments don't have RDMA hardware and the network-operator deployment there is largely a stub for recipe completeness.
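
For the follow-up cleanup, a quick way to confirm which of the listed keys are still present at the top level of a values file is a column-0 grep. This is a sketch only: `top_level_keys_present` is a hypothetical helper, and it treats any line starting with `key:` in the first column as a top-level key, which is a simplification of YAML parsing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each of the given keys that appears as a
# top-level key (starting in column 0) in a YAML values file.
# Usage: top_level_keys_present FILE key [key ...]
top_level_keys_present() {
  local file="$1"; shift
  local key
  for key in "$@"; do
    # Indented (nested) occurrences of the key do not match "^key:".
    if grep -q "^${key}:" "$file"; then
      printf '%s\n' "$key"
    fi
  done
}
```

It could be run as `top_level_keys_present recipes/components/network-operator/values.yaml deployCR nicClusterPolicy nvIpam secondaryNetwork`; an empty result would mean none of the listed keys remain at the top level.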

Testing

make qualify   # Go tests, golangci-lint + yamllint, BOM regen, chart-pin
               # verification (ADR-006), 20/20 chainsaw, vulnerability scan,
               # license headers, agents-sync check — all green

Also manually verified helm template against the v26.1.1 chart with both AICR values files (values.yaml for kind, values-aks.yaml for AKS) — renders cleanly, no errors.
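
The manual render check could be scripted roughly as follows. This is a sketch under assumptions, not the exact commands that were run: `render_with_values` is a hypothetical helper, the release name `network-operator` is arbitrary, and the chart reference is supplied by the caller.

```shell
#!/usr/bin/env bash
# Sketch: render the chart once per values file and stop at the first
# failure. HELM can be overridden (e.g. HELM=echo) for a dry run.
# Usage: render_with_values CHART VERSION values-file [values-file ...]
render_with_values() {
  local chart="$1" version="$2"; shift 2
  local values
  for values in "$@"; do
    # Rendered manifests are discarded; only the exit status matters here.
    "${HELM:-helm}" template network-operator "$chart" \
      --version "$version" -f "$values" >/dev/null || return 1
    printf 'ok: %s\n' "$values"
  done
}
```

Assuming the NGC repository has been added under the alias `nvidia` and that values-aks.yaml sits next to values.yaml (neither is stated above), a call might look like `render_with_values nvidia/network-operator 26.1.1 recipes/components/network-operator/values.yaml recipes/components/network-operator/values-aks.yaml`.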

Risk Assessment

  • Low: overlay pins align to the registry default that production bundles already use. The set of rendered images is unchanged (make bom-docs produces zero diff). The only behavioral change is that bundles built from the kind and aks recipes now pull chart 26.1.1 instead of 25.1.0 (kind) or 26.1.0 (AKS). Both moves are forward-compatible: the AKS values file uses only forward-compatible keys, and the kind values file has dead keys that are silently ignored on v26.1.1 the same way they were on prior versions.

Rollout notes: Existing AICR-deployed clusters running kind or AKS recipes can helm-upgrade in place by regenerating the bundle and running deploy.sh. The network-operator controller pod rotates once. No CRD migrations are required (the v26.1.x schema is unchanged from v26.0).

Checklist

  • Tests pass locally (make test with -race)
  • Linter passes (make lint)
  • I did not skip/disable tests to make CI green
  • I added/updated tests for new functionality (n/a — version bump only)
  • I updated docs if user-facing behavior changed (n/a — no user-visible change)
  • Changes follow existing patterns in the codebase
  • Commits are cryptographically signed (git commit -S)

`recipes/registry.yaml` already pins network-operator at `26.1.1`, but
two overlays carry their own explicit pins that drift from the default:

- recipes/overlays/kind.yaml: 25.1.0 → 26.1.1 (was a year behind; this
  is the `25.1.0 → v26.1.1` bump that NVIDIA#698's audit flagged)
- recipes/overlays/aks.yaml: "v26.1.0" → "26.1.1" (also drops the
  spurious "v" prefix — NGC's chart version is "26.1.1" without "v";
  Helm was emitting a warning on every render: "unable to find exact
  version requested; falling back to closest available version")

No image changes — verified by `make bom-docs` producing zero diff
against main, since the registry default at 26.1.1 was already the
BOM source. This PR is purely overlay-consistency cleanup so the
explicit pins match what bundles built from the registry default
already use.

Pre-existing dead config noted (not addressed here, follow-up):
`recipes/components/network-operator/values.yaml` carries top-level
keys (`deployCR`, `nicClusterPolicy`, `nvIpam`, `secondaryNetwork`)
that the v26.1.x chart schema no longer accepts. Helm silently
ignores them. The chart renders correctly; the keys are simply dead
config. Cleaning up the kind values file to match the new schema is
out of scope for this chart-pin alignment.

Closes follow-up #3 from issue NVIDIA#698.

Validation:
- `make qualify`: all Go unit tests pass with race detector, 20/20
  chainsaw tests pass, golangci-lint + yamllint clean, vulnerability
  scan + license headers OK
- `make bom-docs`: regenerated docs/user/container-images.md;
  zero diff vs main (registry default unchanged)
- `helm template` against the v26.1.1 chart with both AICR values
  files renders without errors
@yuanchen8911 requested a review from a team as a code owner on May 14, 2026, 15:42
@yuanchen8911 added the enhancement and area/recipes labels on May 14, 2026
@coderabbitai

coderabbitai Bot commented May 14, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 8fb227a3-f1c6-4d22-be6f-df31d82dac6a

📥 Commits

Reviewing files that changed from the base of the PR and between f548633 and 2774500.

📒 Files selected for processing (2)
  • recipes/overlays/aks.yaml
  • recipes/overlays/kind.yaml

Walkthrough

This PR updates the network-operator Helm chart version referenced in two recipe overlay files. The AKS recipe upgrades from v26.1.0 to 26.1.1 with removal of the leading v prefix. The kind recipe upgrades from 25.1.0 to 26.1.1. No other configuration, dependencies, or validation settings are changed in either file.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Suggested reviewers

  • mchmarny

🚥 Pre-merge checks | ✅ 4 passed

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: aligning overlay network-operator chart version pins to v26.1.1, which is the primary focus of the changeset. |
| Description check | ✅ Passed | The description is comprehensive and directly related to the changeset, providing detailed context about version pin alignment, motivation, testing, and risk assessment. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@mchmarny enabled auto-merge (squash) on May 14, 2026, 17:38
@mchmarny merged commit 02d3a33 into NVIDIA:main on May 14, 2026
60 checks passed