Skip to content

update prometheus-community dependencies#109

Open
tsipo wants to merge 2 commits into
masterfrom
dependency-update
Open

update prometheus-community dependencies#109
tsipo wants to merge 2 commits into
masterfrom
dependency-update

Conversation

@tsipo
Copy link
Copy Markdown
Member

@tsipo tsipo commented May 14, 2026

No description provided.

@tsipo tsipo requested a review from a team as a code owner May 14, 2026 20:37
Copy link
Copy Markdown
Contributor

@kubexautomation kubexautomation Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CONTENT OF THIS REVIEW IS AI GENERATED

Overall assessment: This PR updates two prometheus-community dependencies with major version bumps (prometheus 27→29, kube-state-metrics 6→7), but the corresponding values.yaml files and chart version fields have not been updated to reflect the breaking changes these upgrades introduce. The mechanical diff (Chart.lock + Chart.yaml constraint) is correct, but the downstream configuration compatibility has not been addressed.

Risk level: High


Critical issues

None (no crashes guaranteed, but high misconfiguration risk)

Major issues

  1. kubex-automation-stack/Chart.yaml — Two-major-version prometheus jump (27→29)
    Skipping an entire major release (28) doubles the migration surface area. The prometheus community chart introduced breaking changes in both v28 and v29 (e.g. sub-chart value schema changes, RBAC restructuring, deprecated field removals). The existing values.yaml must be validated against every major that was skipped.

  2. kubex-collection-gke-autopilot/Chart.yaml (and embedded in kubex-automation-stack) — kube-state-metrics 6→7 value compatibility
    The values.yaml files for both charts configure kube-state-metrics using keys such as metricAnnotationsAllowList, metricLabelsAllowlist, collectors, and rbac.create. These key names and structures changed in the kube-state-metrics chart v7 release. Unrecognised values are silently ignored by Helm, meaning the deployed KSM instance will fall back to upstream defaults — potentially collecting fewer/more metrics or using incorrect RBAC. This is a silent misconfiguration risk, not a loud failure.

Minor issues

  1. Chart version not bumped in either Chart.yaml
    kubex-automation-stack remains at 1.0.11 and kubex-collection-gke-autopilot at 1.0.6. Per Helm/SemVer conventions, any change to a dependency — especially a major upstream version bump — warrants incrementing the chart version. Without this, helm repository consumers will not see a new release and automated upgrade pipelines may silently receive a changed stack.
    • Location: charts/kubex-automation-stack/Chart.yaml:4 (unchanged line, not in diff)
    • Location: charts/kubex-collection-gke-autopilot/Chart.yaml:4 (unchanged line, not in diff)

DRY improvement opportunities

  • Both values.yaml files define nearly identical kube-state-metrics configuration blocks (same collectors list, same metricLabelsAllowlist, same tolerations). Consider extracting a shared defaults document or README table to keep them in sync across charts.

Suggested next steps

  1. Review the upstream changelogs for prometheus community chart v28 and v29 and kube-state-metrics chart v7, and explicitly document in the PR description that breaking changes were assessed.
  2. Run helm upgrade --dry-run (or helm template) against a live/test cluster for both charts and compare rendered manifests before and after to surface any silently-changed configuration.
  3. Update metricAnnotationsAllowList, metricLabelsAllowlist, collectors, and rbac keys in both values.yaml files if they have been renamed or restructured in kube-state-metrics v7.
  4. Bump the version field in both Chart.yaml files (at minimum a minor bump, e.g. 1.1.0) to signal the dependency change to chart consumers.

repository: https://densify-dev.github.io/helm-charts
- name: prometheus
version: "27.*.*"
version: "29.*.*"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CONTENT OF THIS REVIEW IS AI GENERATED

[Severity: Major] [Confidence: High]

Location: charts/kubex-automation-stack/Chart.yaml:12

Issue: The prometheus dependency jumps two major versions (27.*.*29.*.*), skipping all of version 28.

Why it matters: The prometheus community Helm chart introduced breaking changes across both major 28 and major 29 releases (e.g. changes to sub-chart value structure, removal of deprecated fields, RBAC and serviceAccount schema changes). Skipping an entire major is double the surface area for breakage, and any per-major migration steps documented upstream would need to be applied in sequence. If a deployment is upgraded from a chart using v27 to one using v29 without going through v28 first, subtle runtime misconfigurations can be silently deployed.

Suggested fix: Confirm the upstream release notes for both prometheus chart v28 and v29 have been reviewed end-to-end. In particular, verify:

  1. No key renames under prometheus.server, prometheus.kube-state-metrics, or prometheus.prometheus-node-exporter that conflict with this chart's values.yaml.
  2. Any deprecated fields still in use (e.g. prometheusScrape, metricLabelsAllowlist) are still valid in v29.
  3. Consider an incremental bump (v28 first, then v29) in separate PRs to isolate regressions.

repository: https://densify-dev.github.io/helm-charts
- name: kube-state-metrics
version: "6.*.*"
version: "7.*.*"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CONTENT OF THIS REVIEW IS AI GENERATED

[Severity: Major] [Confidence: High]

Location: charts/kubex-collection-gke-autopilot/Chart.yaml:12

Issue: kube-state-metrics is bumped a full major version (6.*.*7.*.*), but the downstream values.yaml has not been audited for breaking value key changes introduced in kube-state-metrics chart v7.

Why it matters: The kube-state-metrics Helm chart v7 renamed and restructured several values. Notably:

  • metricAnnotationsAllowList and metricLabelsAllowlist key casing/naming changed across versions.
  • The collectors list format and default set changed in v7.
  • rbac.create was reorganised under a new schema.

The values.yaml for this chart (and the embedded kube-state-metrics block inside kubex-automation-stack/values.yaml) still uses the old key names. Silently-ignored unrecognised values will cause the deployed KSM to use upstream defaults rather than the intended configuration, which could result in missing metrics or excess RBAC permissions.

Suggested fix:

  1. Diff the values.yaml schema between kube-state-metrics chart v6 and v7.
  2. Update metricAnnotationsAllowList, metricLabelsAllowlist, collectors, and rbac blocks accordingly in both values.yaml files.
  3. Add an explicit helm upgrade --dry-run step to CI or document it in the PR description.

Copy link
Copy Markdown
Contributor

@kubexautomation kubexautomation Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CONTENT OF THIS REVIEW IS AI GENERATED

Overall Assessment

This PR performs routine prometheus-community dependency bumps across two charts — a sensible maintenance task. However, two of the four version changes are major-version jumps (prometheus 27→29, kube-state-metrics 6→7), and the existing values.yaml files contain keys that are no longer valid in the target versions. No values.yaml changes were included in this PR, which is the primary risk.

Risk level: High


Critical issues

None (no data-loss or cluster-breaking regressions identified with certainty).


Major issues

  1. prometheusScrape: false silently ignored after prometheus v29 / KSM v7 upgrade (kubex-automation-stack/values.yaml)
    This key was removed in kube-state-metrics chart v7, which prometheus v29 bundles. Without it, the Prometheus server's annotation-based scrape job will pick up the KSM service and double-ingest all KSM metrics. See inline comment on Chart.yaml:12.

  2. metricLabelsAllowlist key casing changed in KSM v7 (both charts)
    The field was renamed metricLabelsAllowList (capital L) in v7. Using the old casing causes KSM to start with no extra label allowlisting, breaking all per-label metadata in Densify reports. See inline comment.

  3. prometheus chart skips major version 28 (kubex-automation-stack)
    Two major-version releases were skipped in a single PR. Both v28 and v29 each carried breaking value schema changes. The jump was pre-noted in existing review comments but no evidence of a migration-notes review or helm upgrade --dry-run output is included in the PR description.

  4. kube-state-metrics chart major bump 6→7 (kubex-collection-gke-autopilot and embedded in kubex-automation-stack)
    KSM v7 reorganised several value keys beyond those already called out. A helm template diff between old and new renders should be attached to the PR.


Minor issues

  • The PR description is empty; there is no link to upstream release notes, no dry-run output, and no rollback plan documented.
  • Chart version bumps (1.0.11→1.0.12 and 1.0.6→1.0.7) are patch-level, but the dependency changes are major-version. Consider whether these should be minor bumps per SemVer convention to signal the breaking-change surface area to downstream consumers.

Suggested next steps

  1. Run helm template <release> ./charts/kubex-automation-stack --values charts/kubex-automation-stack/values.yaml against both old and new dependency versions and diff the rendered manifests.
  2. Fix prometheus.kube-state-metrics.prometheusScrape: falseservice.annotations.prometheus.io/scrape: "false" in kubex-automation-stack/values.yaml.
  3. Fix metricLabelsAllowlistmetricLabelsAllowList in both values.yaml files.
  4. Review the kube-state-metrics v6→v7 and prometheus v27→v29 changelogs and add a brief migration note to the PR description.
  5. Consider bumping chart versions to minor (1.1.0 / 1.1.0) to communicate the breaking-change surface area.

repository: https://densify-dev.github.io/helm-charts
- name: prometheus
version: "27.*.*"
version: "29.*.*"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CONTENT OF THIS REVIEW IS AI GENERATED

[Severity: Major] [Confidence: High]

Location: charts/kubex-automation-stack/values.yaml (kube-state-metrics sub-chart block, ~line 47)

Issue: prometheus.kube-state-metrics.prometheusScrape: false is a key that was removed in kube-state-metrics chart v7. The prometheus chart v29 bundles kube-state-metrics v7, so this key will be silently ignored after this upgrade.

Why it matters: With prometheusScrape no longer suppressing the default prometheus.io/scrape: "true" annotation, the Prometheus server's kubernetes-service-endpointslice job will start scraping the KSM service via annotation-based discovery in addition to the static scrape job already configured. This results in double-ingestion of all KSM metrics, inflating TSDB storage and potentially causing Densify analysis issues from duplicated time-series.

Suggested fix:

  1. Remove the prometheusScrape key (it no longer has any effect).
  2. Instead, explicitly set the KSM service annotation to opt out of Prometheus scraping via the values.yaml:
prometheus:
  kube-state-metrics:
    service:
      annotations:
        prometheus.io/scrape: "false"
  1. Validate with helm template that the rendered KSM Service no longer carries a prometheus.io/scrape: "true" annotation after the upgrade.

repository: https://densify-dev.github.io/helm-charts
- name: kube-state-metrics
version: "6.*.*"
version: "7.*.*"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CONTENT OF THIS REVIEW IS AI GENERATED

[Severity: Major] [Confidence: Medium]

Location: charts/kubex-collection-gke-autopilot/Chart.yaml:12

Issue: metricLabelsAllowlist (lowercase l) is used in both values.yaml files, but kube-state-metrics chart v7 renamed the field to metricLabelsAllowList (capital L). The same applies to metricAnnotationsAllowList which already uses the right casing in isolation, but the all-lowercase metricLabelsAllowlist form is not accepted in v7 and will be silently dropped.

Why it matters: If KSM silently ignores these values, it falls back to collecting no extra label/annotation metadata. The downstream Densify platform relies on node, pod, deployment, and namespace labels being present in the KSM metrics. Loss of these labels will cause incomplete resource tagging and degraded optimization recommendations.

Suggested fix: After bumping to v7, verify the exact accepted key names against the kube-state-metrics v7 values.yaml schema. The current best-known mapping is:

  • metricLabelsAllowlistmetricLabelsAllowList
  • metricAnnotationsAllowList → unchanged (already correct casing)

Update both charts/kubex-collection-gke-autopilot/values.yaml and the prometheus.kube-state-metrics block in charts/kubex-automation-stack/values.yaml accordingly, and confirm with helm template output that the KSM deployment args contain --metric-labels-allowlist.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant