feat: Add helm chart for QueryFlux #75

Open
themoah wants to merge 3 commits into lakeops-org:main from themoah:feat/add-helm-chart

themoah commented Jun 1, 2026

Adds a provider-neutral Helm chart for deploying QueryFlux on Kubernetes, plus CI validation.

Notable design points:

  • Secrets out of ConfigMaps: QueryFlux mounts config.yaml verbatim and does not interpolate environment variables into it. Any secret in the config (e.g. a Postgres URL with a password) would otherwise sit in plaintext in a ConfigMap. config.existingSecret lets operators mount the whole config from a Secret instead; it takes precedence over existingConfigMap and config.create.
  • Security defaults: non-root, read-only root filesystem, all capabilities dropped, seccomp RuntimeDefault.
  • Admin credentials: auto-generated password stored in a Secret, or bring your own via existingSecret. Generated password is preserved across upgrades (re-read from the existing Secret).
  • In-memory persistence vs. replicas: the default inMemory persistence is per-pod. The chart emits an install-time warning (and the README documents) that multi-replica / autoscaled deployments should configure Postgres persistence to avoid divergent state.
  • CI / validation: make helm-check runs scripts/check-helm-chart.sh, which checks required files and runs helm lint + helm template against default values and every example. Wired into a helm job in .github/workflows/ci.yml.
  • Examples: examples/production-values.yaml (ingress, TLS, HPA, PDB, ServiceMonitor, NetworkPolicy, topology spread) and examples/external-config-values.yaml (externally managed config + server-only image).
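
The config-source precedence described in the first bullet can be sketched as a small shell function. This is only an illustration of the selection order (config.existingSecret, then existingConfigMap, then config.create); the function name `resolve_config_source` and its output strings are hypothetical and are not part of the chart.

```shell
# Illustrative only: mirrors the documented precedence
# config.existingSecret > existingConfigMap > config.create.
# resolve_config_source and its output format are hypothetical.
resolve_config_source() {
  local existing_secret="$1" existing_configmap="$2" create="$3"
  if [ -n "$existing_secret" ]; then
    # A Secret wins, so credentials never land in a ConfigMap.
    echo "secret:$existing_secret"
  elif [ -n "$existing_configmap" ]; then
    echo "configmap:$existing_configmap"
  elif [ "$create" = "true" ]; then
    # Fall back to the ConfigMap rendered by the chart itself.
    echo "configmap:generated"
  else
    echo "none"
    return 1
  fi
}
```

For example, `resolve_config_source my-secret my-cm true` selects the Secret even though a ConfigMap name is also supplied.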

Testing

  • make helm-check passes locally (helm lint + helm template clean for default values and both examples).
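
A minimal sketch of the lint-and-template loop that the description attributes to scripts/check-helm-chart.sh. The script itself is not shown in this conversation, so the function name `lint_and_template`, the `HELM` override variable, and the release name `ci-check` are assumptions; only `helm lint`, `helm template`, and their `-f` values flag are standard Helm CLI.

```shell
# Hypothetical sketch; the real scripts/check-helm-chart.sh may differ.
# HELM can be overridden; it defaults to the helm binary on PATH.
lint_and_template() {
  local chart_dir="$1"
  local helm_bin="${HELM:-helm}"
  local values

  # Default values first.
  "$helm_bin" lint "$chart_dir" || return 1
  "$helm_bin" template ci-check "$chart_dir" >/dev/null || return 1

  # Then every example values file shipped with the chart.
  for values in "$chart_dir"/examples/*.yaml; do
    [ -e "$values" ] || continue  # glob did not match: no examples present
    "$helm_bin" lint "$chart_dir" -f "$values" || return 1
    "$helm_bin" template ci-check "$chart_dir" -f "$values" >/dev/null || return 1
  done
}
```

Under these assumptions it would be invoked as `lint_and_template charts/queryflux`, returning non-zero on the first failing lint or render.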

Summary by CodeRabbit

Release Notes

  • New Features

    • Added Helm chart for deploying QueryFlux on Kubernetes with support for persistence, ingress, autoscaling, network policies, and configurable admin credentials.
  • Documentation

    • Updated README with Kubernetes deployment quick-start guide.
    • Added comprehensive Helm chart documentation with production and example configurations.
  • Chores

    • Added CI workflow and validation scripts for Helm chart checks.

coderabbitai Bot commented Jun 1, 2026

Review Change Stack

Warning

Review limit reached

@themoah, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 23 minutes and 58 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 337a0ce5-d15e-4bba-89f0-6dc14db7004c

📥 Commits

Reviewing files that changed from the base of the PR and between 4e352a0 and 30608a2.

📒 Files selected for processing (8)
  • .github/workflows/ci.yml
  • charts/queryflux/Chart.yaml
  • charts/queryflux/examples/external-config-values.yaml
  • charts/queryflux/examples/production-values.yaml
  • charts/queryflux/templates/hpa.yaml
  • charts/queryflux/templates/service.yaml
  • charts/queryflux/templates/tests/test-connection.yaml
  • scripts/check-helm-chart.sh
📝 Walkthrough

This PR introduces a complete, production-ready Helm chart for QueryFlux. It adds CI validation via a new helm-check job; chart metadata, schema, and default values; reusable Helm template helpers; core Kubernetes resources (Deployment, Service, ConfigMap, Secret, ServiceAccount); optional features (HPA, Ingress, NetworkPolicy, PDB, ServiceMonitor); post-install instructions; and comprehensive documentation with examples for external-config and production deployments.

Changes

QueryFlux Helm Chart

| Layer | File(s) | Summary |
|---|---|---|
| CI Integration and Chart Validation | .github/workflows/ci.yml, Makefile, scripts/check-helm-chart.sh | New helm CI job runs make helm-check, which executes a comprehensive Bash validation script that checks file structure, Chart.yaml metadata, JSON schema validity, security defaults, and helm lint/helm template across default and example values. |
| Chart Metadata, Schema, and Default Configuration | charts/queryflux/Chart.yaml, charts/queryflux/values.schema.json, charts/queryflux/values.yaml | Chart metadata defines apiVersion, name, version, maintainers; JSON Schema validates image, autoscaling, ingress, network policy; default values configure image, admin credentials, QueryFlux config (frontends, persistence, cluster routing), Kubernetes service/probe/resource defaults, and feature toggles. |
| Template Helper Functions | charts/queryflux/templates/_helpers.tpl | Reusable Helm template helpers generate chart/app names, labels, selectors, service account resolution, and derived names for ConfigMap and Secret, with support for user overrides and existing resource references. |
| Core Kubernetes Resources | charts/queryflux/templates/configmap.yaml, charts/queryflux/templates/secret.yaml, charts/queryflux/templates/service.yaml, charts/queryflux/templates/serviceaccount.yaml, charts/queryflux/templates/deployment.yaml | Deployment with container image/probe/volume/scheduling configuration and checksum-triggered pod restarts; Service with dynamic port mapping; ConfigMap for rendered QueryFlux config via tpl; Secret with password reuse/generation via lookup; ServiceAccount with conditional creation and automount control. |
| Optional Features | charts/queryflux/templates/hpa.yaml, charts/queryflux/templates/ingress.yaml, charts/queryflux/templates/networkpolicy.yaml, charts/queryflux/templates/pdb.yaml, charts/queryflux/templates/servicemonitor.yaml | Conditional Kubernetes resources: HorizontalPodAutoscaler with CPU/memory metrics, Ingress with TLS support, NetworkPolicy with ingress/egress rules, PodDisruptionBudget for availability, ServiceMonitor for Prometheus metrics scraping. |
| Post-Install Notes and Testing | charts/queryflux/templates/NOTES.txt, charts/queryflux/templates/tests/test-connection.yaml | Post-install instructions for port-forwarding and credential retrieval; conditional warning for in-memory persistence with replicas; Helm test hook that validates service health via wget to admin /health endpoint. |
| Documentation and Configuration Examples | README.md, charts/queryflux/README.md, charts/queryflux/examples/external-config-values.yaml, charts/queryflux/examples/production-values.yaml | Parent README Kubernetes section with quick install/port-forward; comprehensive chart README covering configuration, persistence, secrets, optional features, and validation; external-config example with existing ConfigMap/Secret references; production example with ingress/TLS, autoscaling, PDB, network policy, and resource constraints. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A fluffy chart takes shape so grand,
With Helm helpers throughout the land,
ConfigMaps, Secrets, Deployments too,
Kubernetes magic, all fresh and new!
Hop along and validate with glee,
QueryFlux on k8s, wild and free! 🚀

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding a Helm chart for QueryFlux to enable Kubernetes deployment. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

themoah changed the title from "Add helm chart for QueryFlux" to "feat: Add helm chart for QueryFlux" on Jun 1, 2026
coderabbitai Bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (5)
scripts/check-helm-chart.sh (1)

54-57: 💤 Low value

Consider refactoring the boolean logic for clarity.

The A && B || C construct on lines 55-57 is functionally correct but triggers a Shellcheck info notice because this pattern can be confusing (it's not equivalent to if-then-else in all contexts).

♻️ Clearer alternative
-# Admin Secret must use configurable key names rather than hardcoded ones.
-grep -q '{{ .Values.existingSecret.usernameKey }}' "$CHART_DIR/templates/secret.yaml" \
-  && grep -q '{{ .Values.existingSecret.passwordKey }}' "$CHART_DIR/templates/secret.yaml" \
-  || fail "templates/secret.yaml must use configurable admin Secret key names"
+# Admin Secret must use configurable key names rather than hardcoded ones.
+grep -q '{{ .Values.existingSecret.usernameKey }}' "$CHART_DIR/templates/secret.yaml" \
+  || fail "templates/secret.yaml must use configurable usernameKey"
+grep -q '{{ .Values.existingSecret.passwordKey }}' "$CHART_DIR/templates/secret.yaml" \
+  || fail "templates/secret.yaml must use configurable passwordKey"

This makes each check independent and easier to understand.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/check-helm-chart.sh` around lines 54 - 57, The boolean chain using
"grep -q ... && grep -q ... || fail" is confusing and triggers shellcheck;
replace it with explicit, independent checks so each grep result is evaluated
clearly and failure is invoked only if either check misses: run grep -q for '{{
.Values.existingSecret.usernameKey }}' against
"$CHART_DIR/templates/secret.yaml" and separately for '{{
.Values.existingSecret.passwordKey }}', and call the existing fail function
(referenced as fail) if either check fails; this keeps CHART_DIR and
templates/secret.yaml references and makes the intent explicit and
shellcheck-compliant.
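
To illustrate the independent-check style, here is a self-contained sketch. `require_pattern` is a hypothetical helper (not from the PR) that uses `grep -qF` for fixed-string matching, and `fail` is redefined locally so the snippet runs on its own; the script's actual `fail` function is not shown here and may exit rather than return.

```shell
# Hypothetical helpers; names and behavior are assumptions, not the PR's code.
fail() {
  echo "ERROR: $*" >&2
  return 1
}

# Check for one fixed string in one file and report a specific message on a miss.
# Each call stands alone, so there is no "A && B || C" chain to reason about.
require_pattern() {
  local file="$1" pattern="$2" message="$3"
  if ! grep -qF -- "$pattern" "$file"; then
    fail "$message"
    return 1
  fi
}
```

Each key then gets its own call, for example `require_pattern "$CHART_DIR/templates/secret.yaml" '{{ .Values.existingSecret.usernameKey }}' "templates/secret.yaml must use configurable usernameKey"`, and the error message names the exact key that is missing.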
charts/queryflux/templates/servicemonitor.yaml (1)

25-26: ⚡ Quick win

Consider validating the scrapeTimeout vs interval relationship.

Prometheus requires scrapeTimeout to be less than interval. If misconfigured (scrapeTimeout >= interval), Prometheus may log warnings or fail to scrape metrics properly. Adding a template validation would catch this at install time rather than runtime.

🛡️ Proposed validation check

Add this check near the top of the template:

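 {{- if .Values.serviceMonitor.enabled }}
+{{- $timeout := .Values.serviceMonitor.scrapeTimeout | toString | trimSuffix "s" | atoi }}
+{{- $interval := .Values.serviceMonitor.interval | toString | trimSuffix "s" | atoi }}
+{{- if ge $timeout $interval }}
+{{- fail "serviceMonitor.scrapeTimeout must be less than serviceMonitor.interval" }}
+{{- end }}
 apiVersion: monitoring.coreos.com/v1

Note: This assumes both values are second-based durations (e.g., "30s", "60s") or plain integers. The trailing "s" is stripped before the integer comparison, because casting "30s" directly with int yields 0 and would make the check fail on every install. Other units such as "1m" are not handled and would need extra parsing.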

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@charts/queryflux/templates/servicemonitor.yaml` around lines 25 - 26,
Prometheus requires scrapeTimeout < interval, but the template uses
.Values.serviceMonitor.interval and .Values.serviceMonitor.scrapeTimeout without
validation; add a Helm template pre-check that parses both values to numeric
seconds (e.g., strip trailing "s" with replace and convert with atoi) and call
fail (Helm's fail function) if scrapeTimeout >= interval so installation fails
early; implement this check near the top of the servicemonitor.yaml template
referencing .Values.serviceMonitor.interval and
.Values.serviceMonitor.scrapeTimeout and include a clear error message
indicating the offending values.
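
The same comparison can also be run outside Helm, for example in a CI script before templating. The sketch below assumes single-unit durations such as 30s or 1m and applies the strict "less than" rule stated above; `duration_to_seconds` and `scrape_timeout_ok` are hypothetical helper names, and compound durations like 1m30s are deliberately rejected rather than parsed.

```shell
# Hypothetical helpers for illustration; not part of the chart or its scripts.
# Convert a single-unit duration (Ns, Nm, Nh) to seconds; fail on anything else.
duration_to_seconds() {
  local value="$1" number unit
  unit="${value##*[0-9]}"    # whatever follows the last digit
  number="${value%"$unit"}"  # the part before the unit
  case "$number" in
    ''|*[!0-9]*) return 1 ;; # empty or not purely numeric (e.g. 1m30)
  esac
  case "$unit" in
    s) echo "$number" ;;
    m) echo $((number * 60)) ;;
    h) echo $((number * 3600)) ;;
    *) return 1 ;;           # missing or unsupported unit
  esac
}

# Succeeds only when scrapeTimeout ($1) is strictly less than interval ($2).
scrape_timeout_ok() {
  local timeout interval
  timeout="$(duration_to_seconds "$1")" || return 2
  interval="$(duration_to_seconds "$2")" || return 2
  [ "$timeout" -lt "$interval" ]
}
```

A call such as `scrape_timeout_ok 10s 30s` succeeds, while `scrape_timeout_ok 1m 30s` fails because 60 seconds is not less than 30.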
charts/queryflux/examples/external-config-values.yaml (1)

19-26: 💤 Low value

Simplify disabled port configuration.

When studio.enabled is false, the port details (name, port, targetPort, protocol) are not used by the Helm templates. Consider removing the redundant fields for clarity:

♻️ Suggested simplification
 service:
   ports:
     studio:
       enabled: false
-      name: studio
-      port: 3000
-      targetPort: 3000
-      protocol: TCP
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@charts/queryflux/examples/external-config-values.yaml` around lines 19 - 26,
The service.ports.studio block contains redundant fields when studio.enabled is
false; remove the nested keys name, port, targetPort and protocol under
service.ports.studio and keep only enabled: false to simplify the values file
and avoid confusion with unused configuration (look for the service.ports.studio
entry and the studio.enabled flag to make the change).
charts/queryflux/examples/production-values.yaml (1)

28-31: ⚡ Quick win

Use placeholder credentials in production example.

Although the comment clearly states "Demo value only," having literal credentials (postgres://queryflux:queryflux@...) in a file named production-values.yaml creates a copy-paste risk. Consider replacing with explicit placeholders to prevent accidental credential exposure:

🔒 Suggested placeholder format
       persistence:
         type: postgres
-        # Demo value only — see the note above; do not commit real credentials.
-        url: postgres://queryflux:queryflux@postgres.queryflux.svc.cluster.local:5432/queryflux
+        # Replace USERNAME, PASSWORD, and HOST before deploying.
+        # For production, use config.existingSecret instead (see note above).
+        url: postgres://USERNAME:PASSWORD@postgres.queryflux.svc.cluster.local:5432/queryflux
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@charts/queryflux/examples/production-values.yaml` around lines 28 - 31,
Replace the literal Postgres connection string under the persistence.url key
with a clear placeholder (e.g., a named environment variable or templated value)
so the production-values example contains no real credentials; update the
persistence.type comment if needed to indicate it's a placeholder and ensure any
documentation or templating (e.g., values templating that reads persistence.url)
expects the placeholder name instead of a hard-coded URL.
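
One way to guard against this in CI is to scan example values files for URLs with embedded credentials. The sketch below is a heuristic illustration only: `has_inline_credentials` is a hypothetical helper, the regular expression is an assumption about what a credential-bearing URL looks like, and the `USERNAME:PASSWORD` placeholder follows the format suggested above.

```shell
# Hypothetical helper; the pattern is a heuristic, not an exhaustive detector.
# Succeeds (exit 0) when the file contains scheme://user:password@host with
# anything other than the USERNAME:PASSWORD placeholder.
has_inline_credentials() {
  local file="$1" matches
  matches="$(grep -E '[a-z][a-z0-9+.-]*://[^/:@[:space:]]+:[^/@[:space:]]+@' "$file" \
    | grep -v '://USERNAME:PASSWORD@' || true)"
  [ -n "$matches" ]
}
```

A check script could then refuse the example with something like `has_inline_credentials charts/queryflux/examples/production-values.yaml && exit 1`.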
charts/queryflux/templates/service.yaml (1)

16-17: ⚡ Quick win

Clarify Service targetPort: it’s using the port name, while Deployment uses numeric targetPort

  • charts/queryflux/templates/service.yaml sets targetPort: {{ $port.name }} (so Service routes by named port).
  • charts/queryflux/templates/deployment.yaml defines - name: {{ $port.name }} and containerPort: {{ $port.targetPort }}, so the named-port wiring is consistent and the routing works.
  • Remaining concern is clarity: service.ports.targetPort is used by the Deployment for containerPort, not by the Service, so the value name can be misleading for users.
♻️ Suggested fix

Option A: Use the numeric targetPort in the Service (keeping the named port/name field for Service + Ingress):

     - port: {{ $port.port }}
-      targetPort: {{ $port.name }}
+      targetPort: {{ $port.targetPort }}
       protocol: {{ $port.protocol }}
       name: {{ $port.name }}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@charts/queryflux/templates/service.yaml` around lines 16 - 17, The Service
currently sets targetPort to the port's name ({{ $port.name }}), which is
confusing because the Deployment's containerPort is numeric ({{ $port.targetPort
}}); change the Service to use the numeric targetPort ({{ $port.targetPort }})
while keeping the Service port/name mapping ({{ $port.port }} and {{ $port.name
}}) so the Service forwards directly to the Deployment's containerPort; verify
the Deployment still defines the container port with name {{ $port.name }} and
containerPort {{ $port.targetPort }} for consistency.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/ci.yml:
- Around line 47-53: Update the helm job to follow least-privilege: on the
actions/checkout step (the checkout action used in the helm job) set
persist-credentials: false to avoid leaking runner credentials, and add an
explicit job-level permissions block (e.g., permissions: contents: read) so the
job only has minimal read access; ensure these changes are applied within the
helm job definition that contains the checkout and helm steps.

In `@charts/queryflux/Chart.yaml`:
- Line 6: Replace the non-deterministic appVersion value by setting appVersion
to the concrete release tag (e.g., "0.1.0") instead of "latest" in Chart.yaml;
update this field on each release and ensure CI publishes the matching image tag
(ghcr.io/lakeops-org/queryflux:<that-version>) so chart installs are
reproducible and align with image.tag defaults.

In `@charts/queryflux/templates/hpa.yaml`:
- Around line 15-31: The HPA template currently renders an empty metrics array
when both .Values.autoscaling.targetCPUUtilizationPercentage and
.Values.autoscaling.targetMemoryUtilizationPercentage are unset, causing
Kubernetes validation to fail; fix it by adding a guard around the metrics block
(or around the whole HPA resource) that only renders the HPA/metrics when at
least one of .Values.autoscaling.targetCPUUtilizationPercentage or
.Values.autoscaling.targetMemoryUtilizationPercentage is set (e.g., use a
conditional like if or .Values.autoscaling.targetCPUUtilizationPercentage
.Values.autoscaling.targetMemoryUtilizationPercentage), or alternatively emit a
sensible default single metric (e.g., CPU) when none are provided; update the
template section that contains the metrics: block to implement this check so the
final rendered HPA always contains at least one metric.

In `@charts/queryflux/templates/tests/test-connection.yaml`:
- Line 15: The test YAML currently pins the container image as "image:
busybox:1.36" which is outdated and has known CVEs; update the image key to a
patched BusyBox tag (for example "busybox:1.36.1-r2" or a newer 1.38.x tag) or
pin to a known-safe digest in the same "image" field in the test-connection.yaml
template to ensure the test uses a CVE-patched BusyBox.

In `@charts/queryflux/values.schema.json`:
- Around line 106-109: The schema and Service template are inconsistent:
values.schema.json defines targetPort as an integer but templates/service.yaml
uses {{ $port.name }}; either remove/rename targetPort from the schema to
document that services use named ports (update values.schema.json and any docs
for the ports list) or change the Service template to reference the integer
targetPort (replace {{ $port.name }} with the port's targetPort value lookup) so
the template consumes the integer defined in values.schema.json; locate the
ports loop in templates/service.yaml and the targetPort property in
values.schema.json to apply the corresponding change across both files.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 555b2363-9872-498f-8474-ac81be12f2c3

📥 Commits

Reviewing files that changed from the base of the PR and between dac817e and 4e352a0.

📒 Files selected for processing (23)
  • .github/workflows/ci.yml
  • Makefile
  • README.md
  • charts/queryflux/Chart.yaml
  • charts/queryflux/README.md
  • charts/queryflux/examples/external-config-values.yaml
  • charts/queryflux/examples/production-values.yaml
  • charts/queryflux/templates/NOTES.txt
  • charts/queryflux/templates/_helpers.tpl
  • charts/queryflux/templates/configmap.yaml
  • charts/queryflux/templates/deployment.yaml
  • charts/queryflux/templates/hpa.yaml
  • charts/queryflux/templates/ingress.yaml
  • charts/queryflux/templates/networkpolicy.yaml
  • charts/queryflux/templates/pdb.yaml
  • charts/queryflux/templates/secret.yaml
  • charts/queryflux/templates/service.yaml
  • charts/queryflux/templates/serviceaccount.yaml
  • charts/queryflux/templates/servicemonitor.yaml
  • charts/queryflux/templates/tests/test-connection.yaml
  • charts/queryflux/values.schema.json
  • charts/queryflux/values.yaml
  • scripts/check-helm-chart.sh

Comment thread .github/workflows/ci.yml
Comment thread charts/queryflux/Chart.yaml Outdated
Comment thread charts/queryflux/templates/hpa.yaml
Comment thread charts/queryflux/templates/tests/test-connection.yaml Outdated
Comment thread charts/queryflux/values.schema.json