update prometheus-community dependencies#109
Conversation
There was a problem hiding this comment.
CONTENT OF THIS REVIEW IS AI GENERATED
Overall assessment: This PR updates two prometheus-community dependencies with major version bumps (prometheus 27→29, kube-state-metrics 6→7), but the corresponding values.yaml files and chart version fields have not been updated to reflect the breaking changes these upgrades introduce. The mechanical diff (Chart.lock + Chart.yaml constraint) is correct, but the downstream configuration compatibility has not been addressed.
Risk level: High
Critical issues
None (no crashes guaranteed, but high misconfiguration risk)
Major issues
-
kubex-automation-stack/Chart.yaml— Two-major-version prometheus jump (27→29)
Skipping an entire major release (28) doubles the migration surface area. The prometheus community chart introduced breaking changes in both v28 and v29 (e.g. sub-chart value schema changes, RBAC restructuring, deprecated field removals). The existingvalues.yamlmust be validated against every major that was skipped. -
kubex-collection-gke-autopilot/Chart.yaml(and embedded inkubex-automation-stack) — kube-state-metrics 6→7 value compatibility
Thevalues.yamlfiles for both charts configurekube-state-metricsusing keys such asmetricAnnotationsAllowList,metricLabelsAllowlist,collectors, andrbac.create. These key names and structures changed in the kube-state-metrics chart v7 release. Unrecognised values are silently ignored by Helm, meaning the deployed KSM instance will fall back to upstream defaults — potentially collecting fewer/more metrics or using incorrect RBAC. This is a silent misconfiguration risk, not a loud failure.
Minor issues
- Chart
versionnot bumped in eitherChart.yaml
kubex-automation-stackremains at1.0.11andkubex-collection-gke-autopilotat1.0.6. Per Helm/SemVer conventions, any change to a dependency — especially a major upstream version bump — warrants incrementing the chart version. Without this, helm repository consumers will not see a new release and automated upgrade pipelines may silently receive a changed stack.Location: charts/kubex-automation-stack/Chart.yaml:4(unchanged line, not in diff)Location: charts/kubex-collection-gke-autopilot/Chart.yaml:4(unchanged line, not in diff)
DRY improvement opportunities
- Both
values.yamlfiles define nearly identicalkube-state-metricsconfiguration blocks (samecollectorslist, samemetricLabelsAllowlist, same tolerations). Consider extracting a shared defaults document or README table to keep them in sync across charts.
Suggested next steps
- Review the upstream changelogs for prometheus community chart v28 and v29 and kube-state-metrics chart v7, and explicitly document in the PR description that breaking changes were assessed.
- Run
helm upgrade --dry-run(orhelm template) against a live/test cluster for both charts and compare rendered manifests before and after to surface any silently-changed configuration. - Update
metricAnnotationsAllowList,metricLabelsAllowlist,collectors, andrbackeys in bothvalues.yamlfiles if they have been renamed or restructured in kube-state-metrics v7. - Bump the
versionfield in bothChart.yamlfiles (at minimum a minor bump, e.g.1.1.0) to signal the dependency change to chart consumers.
| repository: https://densify-dev.github.io/helm-charts | ||
| - name: prometheus | ||
| version: "27.*.*" | ||
| version: "29.*.*" |
There was a problem hiding this comment.
CONTENT OF THIS REVIEW IS AI GENERATED
[Severity: Major] [Confidence: High]
Location: charts/kubex-automation-stack/Chart.yaml:12
Issue: The prometheus dependency jumps two major versions (27.*.* → 29.*.*), skipping all of version 28.
Why it matters: The prometheus community Helm chart introduced breaking changes across both major 28 and major 29 releases (e.g. changes to sub-chart value structure, removal of deprecated fields, RBAC and serviceAccount schema changes). Skipping an entire major is double the surface area for breakage, and any per-major migration steps documented upstream would need to be applied in sequence. If a deployment is upgraded from a chart using v27 to one using v29 without going through v28 first, subtle runtime misconfigurations can be silently deployed.
Suggested fix: Confirm the upstream release notes for both prometheus chart v28 and v29 have been reviewed end-to-end. In particular, verify:
- No key renames under
prometheus.server,prometheus.kube-state-metrics, orprometheus.prometheus-node-exporterthat conflict with this chart'svalues.yaml. - Any deprecated fields still in use (e.g.
prometheusScrape,metricLabelsAllowlist) are still valid in v29. - Consider an incremental bump (v28 first, then v29) in separate PRs to isolate regressions.
| repository: https://densify-dev.github.io/helm-charts | ||
| - name: kube-state-metrics | ||
| version: "6.*.*" | ||
| version: "7.*.*" |
There was a problem hiding this comment.
CONTENT OF THIS REVIEW IS AI GENERATED
[Severity: Major] [Confidence: High]
Location: charts/kubex-collection-gke-autopilot/Chart.yaml:12
Issue: kube-state-metrics is bumped a full major version (6.*.* → 7.*.*), but the downstream values.yaml has not been audited for breaking value key changes introduced in kube-state-metrics chart v7.
Why it matters: The kube-state-metrics Helm chart v7 renamed and restructured several values. Notably:
metricAnnotationsAllowListandmetricLabelsAllowlistkey casing/naming changed across versions.- The
collectorslist format and default set changed in v7. rbac.createwas reorganised under a new schema.
The values.yaml for this chart (and the embedded kube-state-metrics block inside kubex-automation-stack/values.yaml) still uses the old key names. Silently-ignored unrecognised values will cause the deployed KSM to use upstream defaults rather than the intended configuration, which could result in missing metrics or excess RBAC permissions.
Suggested fix:
- Diff the
values.yamlschema between kube-state-metrics chart v6 and v7. - Update
metricAnnotationsAllowList,metricLabelsAllowlist,collectors, andrbacblocks accordingly in bothvalues.yamlfiles. - Add an explicit
helm upgrade --dry-runstep to CI or document it in the PR description.
There was a problem hiding this comment.
CONTENT OF THIS REVIEW IS AI GENERATED
Overall Assessment
This PR performs routine prometheus-community dependency bumps across two charts — a sensible maintenance task. However, two of the four version changes are major-version jumps (prometheus 27→29, kube-state-metrics 6→7), and the existing values.yaml files contain keys that are no longer valid in the target versions. No values.yaml changes were included in this PR, which is the primary risk.
Risk level: High
Critical issues
None (no data-loss or cluster-breaking regressions identified with certainty).
Major issues
-
prometheusScrape: falsesilently ignored after prometheus v29 / KSM v7 upgrade (kubex-automation-stack/values.yaml)
This key was removed in kube-state-metrics chart v7, which prometheus v29 bundles. Without it, the Prometheus server's annotation-based scrape job will pick up the KSM service and double-ingest all KSM metrics. See inline comment onChart.yaml:12. -
metricLabelsAllowlistkey casing changed in KSM v7 (both charts)
The field was renamedmetricLabelsAllowList(capitalL) in v7. Using the old casing causes KSM to start with no extra label allowlisting, breaking all per-label metadata in Densify reports. See inline comment. -
prometheuschart skips major version 28 (kubex-automation-stack)
Two major-version releases were skipped in a single PR. Both v28 and v29 each carried breaking value schema changes. The jump was pre-noted in existing review comments but no evidence of a migration-notes review orhelm upgrade --dry-runoutput is included in the PR description. -
kube-state-metricschart major bump 6→7 (kubex-collection-gke-autopilotand embedded inkubex-automation-stack)
KSM v7 reorganised several value keys beyond those already called out. Ahelm templatediff between old and new renders should be attached to the PR.
Minor issues
- The PR description is empty; there is no link to upstream release notes, no dry-run output, and no rollback plan documented.
- Chart version bumps (
1.0.11→1.0.12and1.0.6→1.0.7) are patch-level, but the dependency changes are major-version. Consider whether these should be minor bumps per SemVer convention to signal the breaking-change surface area to downstream consumers.
Suggested next steps
- Run
helm template <release> ./charts/kubex-automation-stack --values charts/kubex-automation-stack/values.yamlagainst both old and new dependency versions and diff the rendered manifests. - Fix
prometheus.kube-state-metrics.prometheusScrape: false→service.annotations.prometheus.io/scrape: "false"inkubex-automation-stack/values.yaml. - Fix
metricLabelsAllowlist→metricLabelsAllowListin bothvalues.yamlfiles. - Review the kube-state-metrics v6→v7 and prometheus v27→v29 changelogs and add a brief migration note to the PR description.
- Consider bumping chart versions to minor (
1.1.0/1.1.0) to communicate the breaking-change surface area.
| repository: https://densify-dev.github.io/helm-charts | ||
| - name: prometheus | ||
| version: "27.*.*" | ||
| version: "29.*.*" |
There was a problem hiding this comment.
CONTENT OF THIS REVIEW IS AI GENERATED
[Severity: Major] [Confidence: High]
Location: charts/kubex-automation-stack/values.yaml (kube-state-metrics sub-chart block, ~line 47)
Issue: prometheus.kube-state-metrics.prometheusScrape: false is a key that was removed in kube-state-metrics chart v7. The prometheus chart v29 bundles kube-state-metrics v7, so this key will be silently ignored after this upgrade.
Why it matters: With prometheusScrape no longer suppressing the default prometheus.io/scrape: "true" annotation, the Prometheus server's kubernetes-service-endpointslice job will start scraping the KSM service via annotation-based discovery in addition to the static scrape job already configured. This results in double-ingestion of all KSM metrics, inflating TSDB storage and potentially causing Densify analysis issues from duplicated time-series.
Suggested fix:
- Remove the
prometheusScrapekey (it no longer has any effect). - Instead, explicitly set the KSM service annotation to opt out of Prometheus scraping via the
values.yaml:
prometheus:
kube-state-metrics:
service:
annotations:
prometheus.io/scrape: "false"- Validate with
helm templatethat the rendered KSMServiceno longer carries aprometheus.io/scrape: "true"annotation after the upgrade.
| repository: https://densify-dev.github.io/helm-charts | ||
| - name: kube-state-metrics | ||
| version: "6.*.*" | ||
| version: "7.*.*" |
There was a problem hiding this comment.
CONTENT OF THIS REVIEW IS AI GENERATED
[Severity: Major] [Confidence: Medium]
Location: charts/kubex-collection-gke-autopilot/Chart.yaml:12
Issue: metricLabelsAllowlist (lowercase l) is used in both values.yaml files, but kube-state-metrics chart v7 renamed the field to metricLabelsAllowList (capital L). The same applies to metricAnnotationsAllowList which already uses the right casing in isolation, but the all-lowercase metricLabelsAllowlist form is not accepted in v7 and will be silently dropped.
Why it matters: If KSM silently ignores these values, it falls back to collecting no extra label/annotation metadata. The downstream Densify platform relies on node, pod, deployment, and namespace labels being present in the KSM metrics. Loss of these labels will cause incomplete resource tagging and degraded optimization recommendations.
Suggested fix: After bumping to v7, verify the exact accepted key names against the kube-state-metrics v7 values.yaml schema. The current best-known mapping is:
metricLabelsAllowlist→metricLabelsAllowListmetricAnnotationsAllowList→ unchanged (already correct casing)
Update both charts/kubex-collection-gke-autopilot/values.yaml and the prometheus.kube-state-metrics block in charts/kubex-automation-stack/values.yaml accordingly, and confirm with helm template output that the KSM deployment args contain --metric-labels-allowlist.
No description provided.