Sync kubex charts from automation-controller main @ f2541e4#106
Conversation
There was a problem hiding this comment.
CONTENT OF THIS REVIEW IS AI GENERATED
Overall assessment: This PR cleanly introduces optional Prometheus ServiceMonitor support and migrates metrics from HTTPS/8443 to plain-HTTP/8080, including the corresponding controller manager flag. The implementation is generally well-structured, but there is one logic flaw that silently degrades endpoint security beyond the intended scope, and a tpl-in-annotations pattern that is fragile under user overrides.
Risk level: Medium
Critical issues:
None.
Major issues:
-
_helpers.tpl—--metrics-secure=falseemitted regardless ofserviceMonitor.enabled(_helpers.tpl:129): The security flag is driven bymetrics.serviceMonitor.scheme == "http", butserviceMonitor.enableddefaults tofalse. When a user hasmetrics.enabled=trueandserviceMonitor.enabled=false(the default out-of-box state), the controller still starts with--metrics-secure=falsebecauseschemedefaults to"http". This is an unintentional security downgrade. The condition should also gate onserviceMonitor.enabled=true, or a dedicatedmetrics.secureflag should drive this argument independently. -
values.yaml—tplon user-supplied annotations creates a fragile contract (values.yaml:191): The defaultannotationsblock embeds Helm template expressions ({{ .Values.metrics.port }}), which requiresmetrics-service.yamlto calltplon the annotation map. Any user who overridesannotationswith a plain string value containing{{(e.g. a Go format string, Datadog template) will get a hard Helm render error with no helpful message. This contract must be documented, or the annotation values should be resolved inside the template rather than stored as raw template strings invalues.yaml.
Minor issues:
-
metrics-servicemonitor.yaml:22—selectorLabelsincludes version label: TheServiceMonitorselector useskubex-automation-engine.selectorLabels, which likely includesapp.kubernetes.io/version. Version churn during chart upgrades can briefly break scraping. Use only stable labels (name/instance) in the ServiceMonitor selector. -
metrics-servicemonitor.yaml:28— Commented-out TLS config block: The# tlsConfig:comment block is dead code in a shipped template. Remove it or implement it properly behind ametrics.serviceMonitor.tlsConfigvalue. As-is,scheme: httpsis listed as a valid enum value in the schema but has no TLS config support, making it non-functional. -
values.schema.json:138— MissingadditionalProperties: falseonmetricsandserviceMonitorobjects: Without this constraint, typos in key names silently pass schema validation and the feature simply never activates. Add"additionalProperties": falseto both objects. -
Configuration-Reference.md:119— Incomplete documentation: The newmetrics.serviceMonitor.interval,metrics.serviceMonitor.additionalLabels, andmetrics.serviceMonitor.pathfields are not listed in the reference table.
DRY improvement opportunities:
- The port number
8080now appears in three independent places:controllerManager.metricsBindAddress(:8080),metrics.port(8080), and the annotation templateprometheus.io/port: '{{ .Values.metrics.port }}'. The bind address and service port being set independently means they can drift. Consider derivingcontrollerManager.metricsBindAddressfrommetrics.portin a helper (e.g.printf ":%d" .Values.metrics.port) rather than duplicating the literal.
Suggested next steps:
- Fix the
metricsSecureArghelper to requireserviceMonitor.enabled=true(or add an explicitmetrics.secureflag) before emitting--metrics-secure=false. - Resolve the
tpl-in-annotations pattern: either move template rendering into themetrics-service.yamltemplate itself, or prominently document the escaping requirement. - Remove or implement the commented-out
tlsConfigblock inmetrics-servicemonitor.yaml, and add a guard or note explaining thatscheme: httpsis not fully supported yet.
| {{- end }} | ||
| {{- end }} | ||
|
|
||
| {{/* |
There was a problem hiding this comment.
CONTENT OF THIS REVIEW IS AI GENERATED
[Severity: Major] [Confidence: High]
Location: charts/kubex-automation-engine/templates/_helpers.tpl:129
Issue: --metrics-secure=false is injected solely based on metrics.serviceMonitor.scheme, but serviceMonitor can be disabled (metrics.serviceMonitor.enabled=false) while scheme still defaults to "http". In that case the flag is still emitted even though no ServiceMonitor exists and a user may actually want a secured endpoint while using pod-annotation-based scraping.
Why it matters: The security posture of the metrics endpoint is silently degraded to unauthenticated HTTP any time metrics.enabled=true and metrics.serviceMonitor.scheme is "http" (the default), regardless of whether the ServiceMonitor feature is used. This could expose controller metrics without the operator realising.
Suggested fix: Introduce a dedicated, explicit metrics.secure (or metrics.serviceMonitor.secure) boolean that defaults to false, and drive --metrics-secure=false from that value instead of inferring it from scheme. Alternatively, scope the condition to also require metrics.serviceMonitor.enabled=true:
-{{- if and .Values.metrics.enabled (eq .Values.metrics.serviceMonitor.scheme "http") }}
+{{- if and .Values.metrics.enabled .Values.metrics.serviceMonitor.enabled (eq .Values.metrics.serviceMonitor.scheme "http") }}| prometheus.io/port: '{{ .Values.metrics.port }}' | ||
| prometheus.io/path: '{{ .Values.metrics.serviceMonitor.path }}' | ||
| prometheus.io/scheme: '{{ .Values.metrics.serviceMonitor.scheme }}' | ||
| # -- Optional ServiceMonitor configuration for Prometheus Operator |
There was a problem hiding this comment.
CONTENT OF THIS REVIEW IS AI GENERATED
[Severity: Major] [Confidence: High]
Location: charts/kubex-automation-engine/values.yaml:191
Issue: The default annotations block embeds Go template expressions (e.g. '{{ .Values.metrics.port }}') directly in values.yaml. These are not rendered by helm template when read from values.yaml at the values-layer; they are only rendered because metrics-service.yaml now calls tpl on the annotation block. However, values.schema.json declares annotations as "type": "object" with no special treatment — a user who overrides annotations with plain values (no template expressions) will still have them passed through tpl, which is safe, but any user who passes a value containing {{ in a legitimate non-template string (e.g. a Go format string) will get a Helm render error.
Why it matters: Using tpl on user-supplied annotation values is a well-known footgun: it silently breaks if annotation values happen to contain {{ characters, and it creates an implicit contract (users must escape {{ as {{"{{"}}) that is not documented.
Suggested fix: If templated annotations are really required, document this contract clearly in the values comments, and consider using a named helper with a specific prefix convention instead of templating the entire annotations block. Alternatively, move the templated annotations out of the default values.yaml and into a helper or hardcode the known values so plain toYaml can be used again.
| any: true | ||
| {{- end }} | ||
| selector: | ||
| matchLabels: |
There was a problem hiding this comment.
CONTENT OF THIS REVIEW IS AI GENERATED
[Severity: Minor] [Confidence: High]
Location: charts/kubex-automation-engine/templates/metrics-servicemonitor.yaml:22
Issue: The selector.matchLabels reuses kubex-automation-engine.selectorLabels, which typically includes the app.kubernetes.io/version label. If the chart version changes between the Service and the ServiceMonitor being applied (e.g. a rolling upgrade), the version label mismatch can cause Prometheus to temporarily lose scrape targets.
Why it matters: Pod selector labels (used for workload matching) are appropriate for Deployments, but ServiceMonitor selectors should typically only match stable, non-version labels. Version churn during upgrades silently breaks scraping until the reconciler catches up.
Suggested fix: Consider using a subset of labels that excludes app.kubernetes.io/version, or define a dedicated kubex-automation-engine.metricsServiceSelectorLabels helper that only emits the stable app.kubernetes.io/name and app.kubernetes.io/instance labels.
| - port: metrics | ||
| path: {{ .Values.metrics.serviceMonitor.path }} | ||
| scheme: {{ .Values.metrics.serviceMonitor.scheme }} | ||
| # tlsConfig: |
There was a problem hiding this comment.
CONTENT OF THIS REVIEW IS AI GENERATED
[Severity: Minor] [Confidence: High]
Location: charts/kubex-automation-engine/templates/metrics-servicemonitor.yaml:28
Issue: Commented-out TLS config blocks (# tlsConfig: ...) are left in the shipped template. Commented-out code in a production Helm template adds noise, signals an incomplete design, and can be misleading to operators who may not realize those fields are inoperative.
Why it matters: If TLS scraping is a future-planned feature, this is better tracked as a GitHub issue or a NOTES.txt notice. If it is not needed when scheme: https is chosen, users will look for TLS config options that do not exist and have no way to configure them.
Suggested fix: Remove the commented-out block entirely. If HTTPS scraping is intended to be supported, add a metrics.serviceMonitor.tlsConfig value key and render it conditionally, or note the limitation in NOTES.txt.
| } | ||
| }, | ||
| "scheme": { | ||
| "type": "string", |
There was a problem hiding this comment.
CONTENT OF THIS REVIEW IS AI GENERATED
[Severity: Minor] [Confidence: High]
Location: charts/kubex-automation-engine/values.schema.json:138
Issue: The JSON schema metrics object does not set "additionalProperties": false and does not list required for any of its properties, which means Helm schema validation will silently accept typos like serviceMoniotr or enabeld without error.
Why it matters: Schema validation is the last line of defence against silent misconfiguration. Without additionalProperties: false, unknown keys pass through unchallenged and features like serviceMonitor simply never activate, with no user-visible error.
Suggested fix: Add "additionalProperties": false to both the outer metrics object and the nested serviceMonitor object, consistent with how other Helm chart schemas are typically structured.
| | `metrics.serviceMonitor.scrapeTimeout` | `""` | Optional scrape timeout; must be less than or equal to `metrics.serviceMonitor.interval` when set | | ||
| | `gateway.securityContext` | chart default | Gateway sidecar container security context | | ||
| | `cleanup.podSecurityContext` | `{}` | Optional pod security context for the pre-delete cleanup job | | ||
| | `cleanup.securityContext` | chart default | Container security context for the pre-delete cleanup job | |
There was a problem hiding this comment.
CONTENT OF THIS REVIEW IS AI GENERATED
[Severity: Minor] [Confidence: Medium]
Location: charts/kubex-automation-engine/docs/Configuration-Reference.md:119
Issue: The documentation table documents several new metrics.serviceMonitor.* fields but is missing metrics.serviceMonitor.interval, metrics.serviceMonitor.additionalLabels, metrics.serviceMonitor.path, and metrics.serviceMonitor.enabled is only listed once without the co-related guard condition (metrics.enabled must also be true).
Why it matters: Incomplete documentation is a usability issue — operators scanning the reference table cannot discover all available knobs, particularly for the interval and path values that affect scrape behaviour.
Suggested fix: Add rows for metrics.serviceMonitor.interval ("") and metrics.serviceMonitor.additionalLabels ({}), metrics.serviceMonitor.path (/metrics), and update the metrics.serviceMonitor.enabled row description to note that metrics.enabled must also be true.
This PR was created automatically to sync managed Kubex chart content.
Source commit:
f2541e4Managed chart scope:
charts/kubex-automation-enginecharts/kubex-crds