Skip to content

feat(minio): 4 SA prefix-policy isolation + ephemeral CI#1288

Merged
zbnerd merged 27 commits into
developfrom
feature/minio-sa-isolation
Jun 15, 2026
Merged

feat(minio): 4 SA prefix-policy isolation + ephemeral CI#1288
zbnerd merged 27 commits into
developfrom
feature/minio-sa-isolation

Conversation

@zbnerd

@zbnerd zbnerd commented Jun 15, 2026

Copy link
Copy Markdown
Owner

Summary

  • Replace shared minioadmin root credential with 4 prefix-scoped service accounts.
  • Single bucket maple-expectation preserved. No data migration. No Spring source change.
  • New one-shot minio-bootstrap container creates bucket + ILM (1-rule invariant) + 4 SA + 4 policies.
  • Per-module env files (.env.<module>) replace shared .env for credentials.
  • Root credential lives only in .env.bootstrap (read by minio-bootstrap container).
  • scripts/dev-bootstrap.sh regenerates the full env set in one line for local dev.
  • module-rest-controller's storage.minio.* block removed (no ObjectStorage caller).
  • CI uses ephemeral MinIO + per-job random SA keys (no GitHub Secrets).

Service account matrix

SA Get Put Resource
ext-api runs/*, snapshots/*, ocid-mapping/*
calculator runs/*, data/snapshots/* (Get) + calculator/runs/* (Put)
synchronizer runs/*, calculator/runs/*
cleanup ✓, List, ✓ runs/*, calculator/runs/* (no wildcard)

Files changed

  • docker/minio/policies/*.json × 4 (new)
  • docker/minio/bootstrap.sh (new, idempotent; uses text-mode mc ilm ls + bash read, no jq)
  • scripts/dev-bootstrap.sh (new)
  • docker-compose.yml (modify: minio-init → minio-bootstrap)
  • .env (modify: trim per-module creds)
  • .env.<module> × 4 (new, gitignored, mode 600)
  • .env.<module>.template × 4 + .env.bootstrap.template (new, committed)
  • module-rest-controller/src/main/resources/application.yml (modify: remove dead MinIO block)
  • .gitignore (modify: add env file patterns + scripts allowlist tweak)
  • docs/01_ADR/ADR-728_minio-key-rotation-deferred.md (new, with runbook)
  • module-infra/src/test/kotlin/.../MinioPolicyJsonTest.kt (new)
  • module-infra/src/test/kotlin/.../MinioPolicyScopeIT.kt (new, gated INTEGRATION_MINIO + -PintegrationTest)
  • module-infra/src/test/kotlin/.../MinioBootSmokeIT.kt (deleted; replaced by 4 per-SA IT classes)
  • module-infra/src/test/kotlin/.../ExtApiBootSmokeIT.kt (new)
  • module-infra/src/test/kotlin/.../CalculatorBootSmokeIT.kt (new)
  • module-infra/src/test/kotlin/.../SynchronizerBootSmokeIT.kt (new)
  • module-infra/src/test/kotlin/.../CleanupBootSmokeIT.kt (new)
  • .github/workflows/ci.yml (modify: add minio-it job with ephemeral MinIO)

Test plan

  • ./gradlew :module-infra:test --tests "*MinioPolicyJsonTest*" — structural assertions on all 4 policy JSONs (6/6 pass)
  • ./gradlew :module-infra:test --tests "*BootSmokeIT*" -PintegrationTest -DINTEGRATION_MINIO=true — module boots with each SA (4/4 pass)
  • ./gradlew :module-infra:test --tests "*MinioPolicyScopeIT*" -PintegrationTest -DINTEGRATION_MINIO=true — per-SA in-scope OK, out-of-scope 403 (3/3 pass)
  • Manual boot smoke: all 4 MinIO-using modules + rest-controller boot cleanly with new env files (Task 12)
  • CI minio-it job — ephemeral MinIO, random SA keys, scope IT runs in PR gate

Notes for reviewer

Issues found & resolved during implementation

  1. s3:HeadObject not in MinIO's IAM subset — removed from all 4 policy JSONs. MinIO's s3:GetObject covers HeadObject semantics. Fix-up commit 2128dc5e4.

  2. jq and awk not in minio/mc Alpine image — bootstrap ILM loop rewritten to use text-mode mc ilm ls + tr + bash read. Fix-up commit 83ccde4e0.

  3. Credential drift — if .env.bootstrap is regenerated (e.g., via scripts/dev-bootstrap.sh) without re-running docker compose up minio-bootstrap, the SAs in MinIO will have stale secrets. The bootstrap script is intentionally idempotent on user existence (does NOT update existing SA secrets). The runbook in ADR-728 documents the manual sync procedure.

Follow-up (out of scope for this PR)

  • Full pipeline E2E test (Airflow + DB + pipeline-test skill) — deferred to PR review or first deploy. The 4 IT suites prove credential scoping; the actual end-to-end use of the SAs for a real write hasn't been exercised in CI.
  • A scripts/sync-sas.sh helper that does mc admin user remove × 4 + docker compose up minio-bootstrap would close the credential-drift gap. Out of scope for this PR.
  • A read-api audit (re-introducing a 5th SA if rest-controller ever gains an ObjectStorage caller) — out of scope.

🤖 Generated with Claude Code

zbnerd added 18 commits June 15, 2026 01:22
…p container

Single bucket preserved, 5 service accounts (ext-api, calculator,

synchronizer, cleanup, read-api), prefix-scoped policies, one-shot

minio-bootstrap container for SA/policy creation, root credential

isolated to bootstrap env only. Zero Spring source change. Key

rotation deferred to ADR.
5 policy JSONs, idempotent bootstrap.sh, .env split, scope IT,

rotation-deferred ADR. Zero Spring source change.
Spec: 5→4 SA (drop read-api per Q4 codebase audit), reassign

ocid-mapping/* from synchronizer to ext-api (Q5), add CI/dev

strategy to spec. Plan: 14 tasks, ILM 1-rule invariant, dev-bootstrap.sh,

ephemeral CI, 4 per-SA BootSmokeIT classes, ADR with runbook.
MinIO IAM subset does not include s3:HeadObject as a separate action;
it is implicit in s3:GetObject. mc admin policy create rejects it.

All 4 policy JSONs updated. Structural test still passes (the test does
not assert s3:HeadObject presence). Verified against running MinIO via
the Task 5 smoke test.
The minio/mc Alpine image does not ship jq, awk, grep, or sed. The
previous '|| true' silently masked the jq error, so the ILM cleanup
loop was a no-op and re-runs accumulated duplicate rules (5 per
prefix observed in the live environment before this fix).

Switch to text-mode 'mc ilm ls', strip the box-drawing characters
with tr, and use bash's read + positional params to extract
(ID, PREFIX) pairs. Wrap the inner tokenizer in 'set +u' to handle
stripped lines with fewer than 3 tokens. Drop the '|| true' on
mc ilm rm so a failed removal fails the script loudly.

Verified by running the bootstrap container twice against the live
MinIO: after run 2, exactly 1 rule per prefix (snapshots/, runs/,
calculator/, ocid-mapping/).
@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

zbnerd added 9 commits June 15, 2026 03:44
…rvice consumer)

OcidLookupService.kt:29 reads ocid-mapping/ocid-mapping-<runId>.jsonl.gz
files produced by OcidLookupPhase to populate synchronizer's in-memory
mapping state. Without Read on ocid-mapping/* the synchronizer SA gets
403 on every ocid-lookup event.

The security invariant is write-ownership, not read-ownership: ext-api
is the sole WRITER of ocid-mapping/*; synchronizer is a READER. The
synchronizer policy deliberately has no s3:PutObject action.
Asserts synchronizer can read runs/*, calculator/runs/*, and
ocid-mapping/* and gets 403 on PutObject to ocid-mapping/*. The 403
on write is the load-bearing assertion: it proves ext-api is the
sole writer (write-ownership invariant).
Previously the step only ran 'mc admin user list local' for human
inspection. Now it greps for each of the 4 SAs and exits 1 with an
explicit error message if any are missing. set -euo pipefail makes
the step fail the job instead of silently passing.
…ite)

Original Q5 audit incorrectly dropped synchronizer's ocid-mapping read
access. Re-reading OcidLookupService.kt:29 shows synchronizer consumes
ocid-mapping/ocid-mapping-*.jsonl.gz to populate its mapping state.
The security invariant is write-ownership: ext-api is the sole writer;
synchronizer is a reader. Spec text, Appendix A note, and revision
history updated to reflect the corrected invariant.
@zbnerd zbnerd merged commit f09fc9d into develop Jun 15, 2026
@zbnerd zbnerd deleted the feature/minio-sa-isolation branch June 15, 2026 02:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant