feat(infra): Tamp build pipeline + Helm chart for self-hosted deploy (HOL-54) #134

Open
BrewingCoder wants to merge 12 commits into main from hol-54/airm5-build-tool

BrewingCoder (Owner) commented May 12, 2026

Summary

  • Replaces ad-hoc dotnet/yarn/docker shell scripting with a Tamp-driven build pipeline in build/Build.cs. One command (dotnet tamp Ci) covers Restore → Compile → Test → Publish → YarnInstall → FrontendBuild → DockerBuildBackend. A second command (dotnet tamp SmokeQa) chains push → helm install → /health verification.
  • Adds an AGPL-operator-consumable Helm chart at infra/helm/holdfast/ — two-pod deployment (backend + postgres), community-idiomatic labels, sensible operator defaults, lab-cluster overrides in values.lab.yaml.
  • Cutover criterion proven against the lab cluster: dotnet tamp SmokeQa --registry registry.home.local ships the image and verifies https://holdfast.brewingcoder.com/health returns Healthy in 3.3 seconds (cache-warm).
  • Wave 2 (post-cutover): bumps Tamp to current (Core 1.7.0) and wires the supply-chain + coverage + codegen satellites: Syft, Grype, TruffleHog, GraphQLCodegen, Coverlet, ReportGenerator. 20 targets total; Compliance aggregate fans out to SbomScan + CveGate + SecretScan.

What lands

| Layer | Files | Net |
|---|---|---|
| Tamp build script | build/Build.cs, build/Build.csproj | wave 1 + wave 2 |
| Helm chart | infra/helm/holdfast/ (Chart.yaml, values.yaml, values.lab.yaml, README, .helmignore, 8 templates) | ~750 |
| .gitignore tweak | un-ignore /build/ (the **/build rule otherwise hides it) | 10 |
| CHANGELOG | one focused entry documenting this scope only (HOL-53 docs sweep deferred) | 78 |

Existing compose -f compose.yml -f compose.hobby-dotnet.yml up hobby flow is unchanged.

Backend pipeline (build/Build.cs)

20 targets enumerated by dotnet tamp --list:

Ci · Clean · Compile · Compliance · CoverageReport · CoverageTest · CveGate
DeployQa · DockerBuildBackend · DockerPush · FrontendBuild · FrontendCodegen
Info · Publish · Restore · SbomScan · SecretScan · SmokeQa · Test · YarnInstall

Idiomatic 5-line Tamp shape:

[FromPath("yarn")]                              readonly Tool YarnTool       = null!;
[FromPath("helm")]                              readonly Tool HelmTool       = null!;
[FromPath("syft", Optional = true)]             readonly Tool SyftTool       = null!;
[FromNodeModules("turbo")]                      readonly Tool TurboTool      = null!;

Target SbomScan => _ => _.Executes(() => Syft.Scan(SyftTool, s => s.SetDirectorySource(RootDirectory).AddOutputCycloneDxJson(Sbom)));
Target CveGate  => _ => _.DependsOn(SbomScan).Executes(() => Grype.Scan(GrypeTool, s => s.SetSbomSource(Sbom).SetFailOn("high").SetByCve(true)));
Target Compliance => _ => _.DependsOn(SbomScan, CveGate, SecretScan);
Target Ci => _ => _.Default().DependsOn(Test, Publish, FrontendBuild, DockerBuildBackend);

Pin set (post-Wave-2):

Tamp.Core           1.7.0   <!-- TAMP001-004 analyzers, async overloads, Secret.Reveal public -->
Tamp.NetCli.V10     1.4.0
Tamp.Docker.V27     0.3.1   ·   Tamp.Helm.V3   0.1.0   ·   Tamp.Http   0.1.1
Tamp.Yarn.V4        0.1.1   ·   Tamp.Turbo.V2  0.2.1   ·   Tamp.Vite.V5  0.1.1
Tamp.GraphQLCodegen.V5  0.1.1
Tamp.Coverlet.V6   0.1.0    ·   Tamp.ReportGenerator.V5  0.1.1
Tamp.Syft  0.1.0    ·   Tamp.Grype  0.1.0    ·   Tamp.TruffleHog.V3  0.1.1

Compliance (SBOM + CVE + secret scan) is deliberately not part of Ci so the fast iteration path stays fast; release-prep runs dotnet tamp Compliance on demand. Same shape for CoverageTest / CoverageReport (run on demand, not per-build).

Helm chart (infra/helm/holdfast/)

Two pods. Renders 7 resources:

ServiceAccount  holdfast
Secret          holdfast-postgres            (chart-managed OR existingSecret)
ConfigMap       holdfast-backend             (PSQL_*, STORAGE__ANALYTICS, etc.)
Service         holdfast-backend  :8082      (ClusterIP, named `http`)
Service         holdfast-postgres :5432      (ClusterIP, internal only)
Deployment      holdfast-backend             (1 replica, /health probes)
StatefulSet     holdfast-postgres            (1 replica, volumeClaimTemplate)

Chart-managed Postgres by default (TimescaleDB-HA pg16, fsGroup: 1000 for AGPL-portable PSA-restricted compatibility). Operators bring-your-own via postgres.enabled=false + externalPostgres.*.

Auth: auth.mode=dev only in v1 — README documents the operator-must-front-with-ZTA-proxy guidance. Enterprise in-app auth is roadmapped.

Trial findings (frictions caught + filed)

Tamp side, wave 1 (16 frictions, all filed with airm5, all closed in coordinated waves):

Tamp side, wave 2 (4 new frictions filed):

TAMP001 saved a real bug: my first cut of CoverageTest dropped the DotNet.Test plan inside a multi-statement Executes(Action) lambda — the analyzer caught it on first compile and pointed me at the fix. Day-1-catch confirmed twice.

HoldFast side (5 chart bugs caught and fixed inline during wave 1):

  • Postgres probe used Unix socket TimescaleDB-HA doesn't expose → use -h 127.0.0.1 TCP loopback
  • ConfigMap had STORAGE_ANALYTICS not STORAGE__ANALYTICS (.NET config-key convention)
  • ConfigMap had AUTH_MODE, which is unmapped and silently ignored → should be REACT_APP_AUTH_MODE (GoEnvCompat maps it to Auth:Mode)
  • Dockerfile binds :8082, chart had :8080 everywhere
  • Probes hit /health/live — SPA fallback returned 200 (silent lie). Real endpoint is /health

Each captured in a commit on this branch; see git log for the post-mortem.

What's NOT in this PR

Deliberately scoped out — opening as follow-ups after merge:

  • In-cluster SmokeQa default URL — microk8s's recommendation. SmokeQa should default to http://holdfast-backend.holdfast.svc.cluster.local:8082/health (no CF hop) and a separate SmokeQaPublic target probes the public URL.
  • Tamp.GitVersion.V6 semver image tags — blocked on friction #17 (missing [GitVersion] injection attribute). Currently uses Git.Commit[..7].
  • HOL-53 — broader docs/HOLDFAST-NOTES.md + docs/CHANGELOG-FORK.md sweep to reflect the full post-rewrite architecture (this PR adds one focused entry for the build/deploy scope only).
  • CI/CD re-enablement — workflows stay disabled. Flipping gh workflow enable after this lands is its own concern.

Test plan

  • dotnet tamp --list — 20 targets enumerate clean on a machine without syft/grype/trufflehog (Optional flag works)
  • dotnet build build/Build.csproj — 0 warnings, 0 errors against Tamp.Core 1.7.0
  • dotnet tamp Info — 6 ms, prints config / git / image tag / QA URL
  • dotnet tamp Test (wave 1) — 3,172/3,172 green in 21.4s
  • dotnet tamp DockerBuildBackend — 485 MB image, byte-size parity with the legacy docker build baseline
  • dotnet tamp Publish → artifacts/publish/HoldFast.Api/ byte-identical to raw dotnet publish
  • dotnet tamp FrontendBuild — 17/17 turbo tasks green via direct Turbo.Run
  • dotnet tamp DeployQa --registry registry.home.local → helm upgrade --install against va-mk8s-1..6, both pods reach Ready
  • dotnet tamp SmokeQa --registry registry.home.local → https://holdfast.brewingcoder.com/health returns Healthy via Cloudflare tunnel
  • helm lint infra/helm/holdfast clean; helm install --dry-run validates against live API server

Coordinated work

  • airm5 — shipped 40+ Tamp packages across 9+ release waves to address frictions surfaced here. Tamp.Helm.V3 0.1.0 was authored specifically for the cutover. Wave 2 additions (Syft, Grype, TruffleHog wrappers) shipped 2026-05-13.
  • microk8s — provisioned namespace holdfast, ARC runner RBAC (edit in holdfast ns), CF tunnel (holdfast.brewingcoder.com:8082), DNS, WAF Skip rule, pre-created holdfast-postgres Secret.

🤖 Generated with Claude Code

BrewingCoder and others added 12 commits May 11, 2026 14:40
…rgets (HOL-54)

First slice of AIRM5/Tamp build-tool integration. Adds:

- build/Build.csproj — .NET 10 console project referencing Tamp.Core 1.0.7
  and Tamp.NetCli.V10 1.0.5
- build/Build.cs — minimal Build class extending TampBuild with four
  side-by-side targets:
    * Info     — prints config/solution/root/git context
    * Restore  — DotNet.Restore on src/dotnet/HoldFast.Backend.slnx
    * Compile  — DependsOn Restore; --no-restore build
    * Test     — DependsOn Compile; --no-build test with TRX logger
                 writing to artifacts/test-results/
- .gitignore — un-ignore the root /build/ directory so the script is tracked
  (the **/build rule still hides nested build/ dirs inside packages)

Existing pipeline (dotnet build / dotnet test directly against the slnx)
remains the source of truth; this branch runs Tamp side-by-side per the
adoption plan in HOL-54.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ckend (HOL-54)

Adds five more side-by-side targets to build/Build.cs against Tamp 1.0.7 +
satellites:

  * Clean              — AbsolutePath delete/ensure on artifacts/
  * Publish            — DotNet.Publish HoldFast.Api -> artifacts/publish/.
                         Verified byte-identical to raw dotnet publish.
  * YarnInstall        — Yarn.Install --immutable against workspace root.
                         Berry 4.x workspace tree fully recognised; 12.7s warm.
  * FrontendBuild      — Yarn.Run build:frontend (npm-script proxy to turbo)
                         until Turbo's chicken/egg bootstrap is addressed.
                         17/17 turbo tasks green, 1m41s.
  * DockerBuildBackend — authored but NOT yet run; held until BuildKit-aware
                         Buildx.Build vs legacy Build choice is patched.

Package refs extended in Build.csproj: Tamp.Yarn.V4 0.1.0, Tamp.Turbo.V2
0.1.0, Tamp.Vite.V5 0.1.0, Tamp.Docker.V27 0.2.0.

This commit deliberately carries workaround stubs that should be removed
once airm5 ships the friction-fix wave (see HOL-54 thread):

  * ResolveOnPath helper (~25 lines) — replaces missing [FromPath] / Tool
    discovery for native tools (yarn, docker, turbo).
  * Console.WriteLine in Info target — Tamp.Logger surface is instance-only,
    no Log.Information static; standing in until clarified.
  * Glob-based bin/obj cleanup dropped from Clean — AbsolutePath.GlobDirectories
    returns 0 hits for "**/bin"/"**/obj" patterns (probable Tamp.Core bug).
  * FrontendBuild routes via Yarn.Run because Tamp.Turbo.V2 needs a Tool that
    only exists at node_modules/.bin/turbo after YarnInstall runs.

Backend Restore/Compile/Test/Publish + frontend YarnInstall/FrontendBuild are
all green via Tamp at parity with the legacy pipeline. Existing pipeline still
runs unchanged side-by-side. No cutover yet.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Tamp.Core + NetCli.V10 jumped to 1.2.0 in airm5's coordinated wave. The
satellites bump independently (0.x.1 patches rebuilt against Core 1.2.0);
those are blocked on NuGet flatcontainer propagation as of this commit and
will follow in a separate cleanup once the CDN catches up.

Build.cs surface changes:

  * .TopLevel() — stripped from every target. 1.1.0+ makes top-level the
    default; the call is a no-op marked [Obsolete]. (NB: .Internal() is the
    new inverse marker if a target should be hidden from --list.)
  * .DependsOn(nameof(Target)) → .DependsOn(Target). The new
    [CallerArgumentExpression] overloads inject the identifier name
    literally; existing nameof()/bare-string forms still compile, but the
    bare form reads as English.
  * New `Ci` target marked `.Default()` — the canonical no-args entry. It
    fans out into Test, Publish, FrontendBuild, DockerBuildBackend so a
    cold `dotnet tamp` exercises the entire pipeline. Note: DependsOn is
    chained per-target rather than varargs — the varargs overload takes
    `string[]`, not `Target[]`, so the natural `DependsOn(A, B, C)` shape
    doesn't bind. Reported observation, not a blocker.

Net: 88 → 56 lines (-36%) with no semantic change. Test suite still 3,172
green in 20.2s post-refactor. Frictions #1 through #12 all stay closed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Standard-shape helm chart at infra/helm/holdfast/ for the cutover from
docker-compose hobby deploy → kubernetes-native deploy. Lives in the
HoldFast repo (not a separate infra repo) so operators consuming the
AGPL fork get one canonical chart in the source tree alongside the
Dockerfile.

Architecture: two pods.

  * Deployment/<release>-backend
    The single .NET 10 Kestrel container (API + frontend bundle +
    workers + OTLP receivers). Resource defaults derived from the 43h
    soak: 200m/512Mi requests, 2000m/2Gi limits. Liveness on
    /health/live, readiness on /health/ready.

  * StatefulSet/<release>-postgres
    TimescaleDB-HA pg16 with a single volumeClaimTemplate. PGDATA
    pinned to /home/postgres/pgdata/data (not the upstream postgres
    default — TimescaleDB-HA's layout differs). pg_isready exec probes.
    Operator override: postgres.enabled=false + externalPostgres.*
    to bring your own database.

Templates (all standard helm shape):

  templates/
    _helpers.tpl              labels, fullname, image, postgres host
                              composition (chart-managed vs external)
    NOTES.txt                 post-install runbook
    serviceaccount.yaml
    configmap.yaml            backend env (URIs, storage selector, auth)
    secret.yaml               PSQL_PASSWORD (or operator references
                              an existing Secret via passwordExistingSecret)
    backend-service.yaml      ClusterIP, port 8080
    backend-deployment.yaml
    postgres-service.yaml     ClusterIP, port 5432, internal-only
    postgres-statefulset.yaml

Labels follow community-standard kubernetes.io/* conventions
(name/instance/version/component/managed-by/part-of) per microk8s's
"lean toward bitnami/prometheus-operator shape, not lab conventions"
guidance. Operator-facing distribution audience wins over lab-internal
convention matching.

values.yaml defaults are operator-safe (ghcr.io registry, no storage
class hint, no hardcoded URLs — all REQUIRED fields are documented).
values.lab.yaml carries the BrewingCoder microk8s overrides:
localhost:32000 registry, nfs-va-vm storage class, the four
holdfast.brewingcoder.com URL knobs the backend needs.

Auth: chart v1 supports only auth.mode=dev. Enterprise mode (in-app
JWT) is roadmapped; the chart should support `--set auth.mode=enterprise`
and a JWT issuer config when that lands. Until then, operators must
front the deployment with a zero-trust proxy (Cloudflare Access,
Authelia, oauth2-proxy). README documents this explicitly.

Lints clean (helm 3.17.4); template renders against values.lab.yaml
produce the expected 7 resources with correct lab-cluster overrides
applied. No Ingress shipped — Cloudflare tunnel routes
holdfast.brewingcoder.com → holdfast-backend.holdfast.svc:8080
directly.

Wire-up of `dotnet tamp DeployQa` (and `Tamp.Helm.V3` if airm5 ships
the wrapper, hand-rolled Tool.Plan() if not) is the next commit.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Tamp ecosystem bumped to Wave 9 — coordinated cut that lands Helm.V3
0.1.0, Tamp.Http 0.1.1, the params Target[] overload on lifecycle
methods (friction #14 fix), and patch satellites across the fleet.

Pin moves:
  * Tamp.Core              1.2.0 → 1.3.0
  * Tamp.NetCli.V10        1.2.0 → 1.3.0
  * Tamp.Yarn.V4           0.1.0 → 0.1.1
  * Tamp.Turbo.V2          0.2.0 → 0.2.1
  * Tamp.Vite.V5           0.1.0 → 0.1.1
  * Tamp.Docker.V27        0.3.0 (0.3.1 still on the flatcontainer CDN
                                  lag — follow-up bump pending)
  + Tamp.Helm.V3           0.1.0 (new — the cutover deploy verb)
  + Tamp.Http              0.1.1 (new — HttpProbe for SmokeQa)

Build.cs additions:

  * [Parameter] Registry, QaUrl, PostgresPassword (the third via env
    var HOLDFAST_PG_PASSWORD)
  * [FromPath("helm")] HelmTool
  * ImageTag = short git SHA; LocalImageRef + RegistryImageRef helpers
  * Info target prints all three plus the deploy URL
  * DockerBuildBackend now tags BOTH the local-friendly name and the
    registry-prefixed name in one buildx pass
  * DockerPush — depends on DockerBuildBackend, calls Docker.Push
    against the registry-prefixed tag
  * DeployQa — depends on DockerPush, calls Helm.Upgrade with
    --install --wait --atomic --timeout 5m against
    infra/helm/holdfast/ + values.lab.yaml, image.tag overridden to
    the current SHA, postgres password sourced from the Parameter
  * SmokeQa — depends on DeployQa, polls QaUrl/health/live for up to
    2 minutes via HttpProbe.WaitForHealthy
  * Ci.DependsOn(Test, Publish, FrontendBuild, DockerBuildBackend)
    refactored to params Target[] one-liner (friction #14 paid off
    immediately)
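
The image-tag scheme in the list above (short git SHA, registry-prefixed reference) can be sketched in plain shell. This is only an illustration of the naming rule, not the Build.cs helpers themselves; the function names and the image name `holdfast-backend` used in the usage line are assumptions.

```shell
# short_sha FULL_SHA: first 7 characters of a commit SHA.
short_sha() {
  printf '%s\n' "$1" | cut -c1-7
}

# image_ref REGISTRY NAME FULL_SHA: registry-prefixed reference tagged with
# the short SHA. NAME is a hypothetical image name; the real one lives in
# Build.cs.
image_ref() {
  printf '%s/%s:%s\n' "$1" "$2" "$(short_sha "$3")"
}
```

Usage, assuming a checked-out repo: `image_ref registry.home.local holdfast-backend "$(git rev-parse HEAD)"`.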

Test still 3,172/3,172 green in 21.4s on the bumped stack. DeployQa +
SmokeQa unverified locally — both require cluster reachability
(localhost:32000 only resolves inside the lab cluster, helm needs
credentials, QA URL doesn't route yet). First end-to-end run will
happen on the ARC runner once microk8s finishes cluster prep
(namespace + CF tunnel + RBAC).
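
For readers without Tamp.Http at hand, the "poll until healthy or give up after a deadline" behaviour that SmokeQa relies on can be approximated in shell. This is a generic sketch, not HttpProbe.WaitForHealthy's implementation; `wait_until` and its arguments are hypothetical names.

```shell
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it exits 0 or the deadline has passed.
wait_until() {
  timeout="$1"
  interval="$2"
  shift 2
  deadline=$(( $(date +%s) + timeout ))
  while :; do
    # Success on any attempt ends the wait immediately.
    if "$@"; then
      return 0
    fi
    # Give up once the deadline is reached; otherwise back off and retry.
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval"
  done
}
```

A two-minute poll every five seconds would then look like `wait_until 120 5 some_probe_command`, where the probe command is whatever check the caller trusts.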

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…and-suspenders (HOL-54)

Cluster prep delivered by microk8s — namespace, RBAC, CF tunnel rule,
WAF, and a pre-created `holdfast-postgres` Secret are all in place.
Two chart-side adjustments to consume that work:

  * values.lab.yaml — postgres.auth.existingSecret = holdfast-postgres
    (passwordKey defaults to "password"; chart's secret.yaml template
    is gated on `not .existingSecret` so it won't try to overwrite)

  * values.yaml — postgres.podSecurityContext.fsGroup = 1000

    Default fsGroup for chart-managed postgres matches the postgres
    UID in `timescale/timescaledb-ha:pg16` (probed: uid=1000(postgres)).
    The lab NFS export is permissive (no_root_squash) so this isn't
    strictly required there, but PSA-restricted clusters require it,
    so the chart needs to ship a sensible default for the AGPL
    operator audience. Operators swapping the image to one with a
    different UID override.

  * Build.cs — drop the [Parameter] HOLDFAST_PG_PASSWORD plumbing and
    the .SetValue("postgres.auth.password", ...) on the Helm.Upgrade
    call. Password is now resolved entirely via existingSecret on the
    chart side; runner pod doesn't need any env var injected. Also
    obviates the runner-pod-spec patching microk8s offered.

Verified:
  * helm lint clean
  * helm template renders fsGroup: 1000 on postgres StatefulSet,
    PSQL_PASSWORD valueFrom secretKeyRef.name=holdfast-postgres on
    backend deployment
  * `helm install --dry-run` against live cluster (k8s-lab) succeeds
  * holdfast-postgres Secret confirmed present in namespace via kubectl

Ready for first `dotnet tamp SmokeQa` end-to-end whenever Scott says go.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…0m timeout (HOL-54)

First end-to-end deploy attempt surfaced two issues, plus one temporary debugging change, captured here:

1. **Postgres readiness probe wrong path.** `pg_isready -U postgres`
   without `-h` defaults to the Unix socket at /var/run/postgresql,
   which TimescaleDB-HA does not reliably expose. Pod stayed NotReady;
   event log: `Readiness probe failed: /var/run/postgresql:5432 - no
   response`. Backend cascaded into CrashLoopBackOff trying to connect.
   Fix: probe via `-h 127.0.0.1` to force TCP-loopback check through
   postgres's TCP listener, which is reliably bound regardless of
   socket configuration.

2. **5m helm timeout too tight for first deploy.** TimescaleDB-HA image
   is 1.73 GB; first pull on each node is 3-4 minutes. Atomic rollback
   triggered before postgres could even finish pulling on cold nodes.
   Bumped DeployQa timeout to 10 minutes for headroom.

3. **Disable --atomic temporarily** so a failed deploy leaves the
   namespace populated for `kubectl get / logs` post-mortem. Re-enable
   once the chart has a few clean runs under it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…nges (HOL-54)

First end-to-end deploy crashed the backend with `Connection refused
(localhost:8123)` from ClickHouseMigrationService.StartAsync. Root cause:
ConfigMap was writing env-var names that the .NET host doesn't bind.

Three name fixes in templates/configmap.yaml:

  * STORAGE_ANALYTICS → STORAGE__ANALYTICS
    .NET configuration uses double-underscore to express nested keys
    (Storage:Analytics). Single underscore → value never loaded →
    defaultBackend falls back to "clickhouse" → ClickHouseMigrationService
    registers → crash on connection refused. The Program.cs gate that
    skips ClickHouse when Storage:Analytics=Postgres is correct; the
    chart just wasn't delivering the value.

  * AUTH_MODE → REACT_APP_AUTH_MODE
    HoldFast.Shared.Runtime.GoEnvCompat maps REACT_APP_AUTH_MODE to
    Auth:Mode (legacy Go env-var contract preserved on the .NET side).
    AUTH_MODE alone is unmapped and silently ignored.

  * COLLECTOR_OTLP_ENDPOINT → OTEL_EXPORTER_OTLP_ENDPOINT
    The backend hosts OTLP receivers — it's not an OTLP client to a
    separate collector. The "OTLP endpoint" value here is for the
    backend to export its OWN telemetry. OTel SDK convention is
    OTEL_EXPORTER_OTLP_ENDPOINT.
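
The double-underscore rule behind the first fix is easy to get wrong, so here is a small shell sketch of the name translation the .NET environment-variable configuration provider applies (`__` becomes the `:` section separator; keys are matched case-insensitively). The helper name `to_config_key` is made up for illustration.

```shell
# to_config_key ENV_NAME: apply the "__" -> ":" rule used by the .NET
# environment-variable configuration provider.
to_config_key() {
  printf '%s\n' "$1" | sed 's/__/:/g'
}

to_config_key STORAGE__ANALYTICS   # nested: section STORAGE, key ANALYTICS
to_config_key STORAGE_ANALYTICS    # unchanged: a flat key that nothing reads
```

The first call yields a key that matches Storage:Analytics; the second stays a single flat key, which is why the value never reached the backend.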

Plus one chart hygiene fix in templates/backend-deployment.yaml:

  * Add checksum/config annotation to the pod template, computed as
    sha256sum of configmap.yaml's rendered content. Standard helm
    idiom — without it, `helm upgrade` of env-only changes silently
    leaves pods serving with stale config. With it, ConfigMap edits
    trigger a rolling restart automatically.
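
Why the checksum annotation works can be shown outside Helm: any change to the rendered ConfigMap changes its SHA-256, so the pod template differs and the Deployment rolls. The sketch below is a shell analogue only (the chart computes the hash inside the template); the file contents are illustrative values, and `config_checksum` is a hypothetical helper. It assumes `sha256sum` from coreutils is on the PATH.

```shell
# config_checksum FILE: hex SHA-256 digest of the file's bytes.
config_checksum() {
  sha256sum "$1" | cut -d ' ' -f 1
}

tmp="$(mktemp)"
# Illustrative "rendered ConfigMap" content before and after an env-only edit.
printf 'STORAGE__ANALYTICS: Postgres\n' > "$tmp"
before="$(config_checksum "$tmp")"
printf 'STORAGE__ANALYTICS: ClickHouse\n' > "$tmp"
after="$(config_checksum "$tmp")"
rm -f "$tmp"

if [ "$before" != "$after" ]; then
  echo "config changed: the pod template annotation would differ"
fi
```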

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…HOL-54)

Third end-to-end deploy attempt put the backend past the ClickHouse
crash but into a probe-port mismatch crashloop. Root cause: Dockerfile
sets `ENV ASPNETCORE_URLS=http://+:8082` (line 137) and `EXPOSE 8082`,
so Kestrel binds on 8082 — but my chart hardcoded 8080 throughout.

Backend log captured "Now listening on: http://[::]:8082" → readiness
probe on 8080 → connection refused → kubelet liveness-failure-kill →
restart loop. Image is fine; chart was lying about the port.

Fixes:

  * values.yaml — backend.service.port 8080 → 8082, probes ports → 8082,
    with a comment that points future readers at the Dockerfile so the
    bind port stays the single source of truth.
  * backend-deployment.yaml — containerPort 8080 → 8082.

NB: this is a coordinated change with the cluster operator — the
Cloudflare tunnel rule on the microk8s side previously routed to
:8080 and needs to update to :8082 before external traffic resolves.
Internal helm install proceeds independently.
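
One way to keep the Dockerfile as the single source of truth for the bind port is a small pre-deploy guard. The following is a sketch under assumptions: it expects an `ENV ASPNETCORE_URLS=...:<port>` line like the one quoted above, and the function names are hypothetical; nothing like this ships in the branch.

```shell
# dockerfile_port FILE: port from the first "ENV ASPNETCORE_URLS=...:<port>" line.
dockerfile_port() {
  sed -n 's/^ENV ASPNETCORE_URLS=.*:\([0-9][0-9]*\)[[:space:]]*$/\1/p' "$1" | head -n 1
}

# ports_match DOCKERFILE CHART_PORT: succeed only when the two agree.
# An unparseable Dockerfile yields an empty port and therefore a failure.
ports_match() {
  [ "$(dockerfile_port "$1")" = "$2" ]
}
```

Called as `ports_match path/to/Dockerfile 8082`, it would have failed fast on the original 8080 value instead of surfacing as a restart loop.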

Plus one observability hygiene fix:

  * configmap.yaml — gate OTEL_EXPORTER_OTLP_ENDPOINT on non-empty.
    The third deploy logged the backend self-exporting metrics to
    https://holdfast.brewingcoder.com/otel and getting 502 from CF
    edge. Not crashing the app but adding noise. The "OTLP endpoint"
    in HoldFast's context is for incoming receivers (hosted in the
    backend itself), not for the backend to ship its own traces
    outbound; the latter is opt-in and operators may not want it.
  * values.lab.yaml — collectorOtlpEndpoint = "" disables self-export
    in QA. Operators wanting backend-traces-elsewhere set it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…split) (HOL-54)

Backend's Program.cs uses app.MapHealthChecks("/health") — single endpoint,
no /live or /ready paths. My chart was probing /health/live, /health/ready,
and SmokeQa was hitting /health/live too. All three "passed" because the
backend serves a React SPA from wwwroot with a fallback that returns
index.html (HTTP 200) for unmapped paths — so the probes were lying.
Actual /health returns plain-text "Healthy" and is what we should be hitting.

Fixes:
  * values.yaml — liveness + readiness probes path: /health/{live,ready}
    → /health, with a comment about the SPA-fallback trap so future
    readers don't fall back into it.
  * Build.cs — SmokeQa probes /health instead of /health/live.

Verified locally via `kubectl port-forward svc/holdfast-backend 18082:8082`
plus `curl http://localhost:18082/health` → "Healthy" (200). The /health/live
URL on the running pod returns the SPA bundle's index.html (also 200 —
which is exactly why the lie was so quiet).
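
Because a 200 alone cannot distinguish /health from the SPA fallback, a manual check should compare the body as well. A minimal shell sketch, assuming the plain-text "Healthy" body described above; `is_healthy` and `check_health_url` are hypothetical helper names, and the curl wrapper is not exercised here.

```shell
# is_healthy STATUS BODY: succeed only for HTTP 200 with the exact body
# "Healthy", so an index.html served with 200 does not count.
is_healthy() {
  status="$1"
  body="$2"
  [ "$status" = "200" ] && [ "$body" = "Healthy" ]
}

# check_health_url URL: fetch the URL, split the status code (appended on
# its own last line by curl's -w format) from the body, then delegate.
check_health_url() {
  url="$1"
  response="$(curl -sS --max-time 5 -w '\n%{http_code}' "$url")" || return 1
  status="$(printf '%s\n' "$response" | tail -n 1)"
  body="$(printf '%s\n' "$response" | sed '$d')"
  is_healthy "$status" "$body"
}
```

With the port-forward above running, `check_health_url http://localhost:18082/health` should succeed, while the same call against /health/live should fail on the body comparison.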

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Captures the build/deploy work that lands in this branch: Tamp build
script targets, helm chart surface, what's preserved alongside (compose
hobby flow still works), and the cutover-criterion proof against the
lab cluster.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wave 2 of the Tamp cutover. Adds six new satellite wrappers and bumps
core+wrapper pins to current versions:

  Tamp.Core         1.3.0 -> 1.7.0   (TAMP001-004 analyzers, async overloads)
  Tamp.NetCli.V10   1.3.0 -> 1.4.0
  Tamp.Turbo.V2     0.2.0 -> 0.2.1
  Tamp.Docker.V27   0.3.0 -> 0.3.1

New satellites + targets:

  Tamp.Syft / Tamp.Grype         -> SbomScan + CveGate + Compliance
  Tamp.TruffleHog.V3             -> SecretScan
  Tamp.GraphQLCodegen.V5         -> FrontendCodegen
  Tamp.Coverlet.V6 +
    Tamp.ReportGenerator.V5      -> CoverageTest + CoverageReport

Optional-flagged the new tool injections so `dotnet tamp --list` works on
machines without syft/grype/trufflehog/reportgenerator/graphql-codegen
installed — they only fail when the relevant target is actually invoked.

Compliance (SBOM + CVE + secret scan) is deliberately not in Ci so the
fast iteration path stays fast; release-prep runs `dotnet tamp Compliance`
on demand. TAMP001 caught a real bug in CoverageTest while authoring this
(dropped DotNet.Test plan inside a multi-statement Executes lambda) — the
analyzer paid for itself in this single session.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>