Skip to content

feat(persistence,rest,hfs): FHIR Bulk Data Export ($export) — async kick-off, postgres-s3 multi-instance, Inferno v2.0.0#108

Open
aacruzgon wants to merge 83 commits into
mainfrom
feature/bulk-export
Open

feat(persistence,rest,hfs): FHIR Bulk Data Export ($export) — async kick-off, postgres-s3 multi-instance, Inferno v2.0.0#108
aacruzgon wants to merge 83 commits into
mainfrom
feature/bulk-export

Conversation

@aacruzgon
Copy link
Copy Markdown
Contributor

@aacruzgon aacruzgon commented May 15, 2026


FHIR Bulk Data Export

Implements the FHIR Bulk Data Access IG $export family (system / patient / group) end-to-end, per Discussion #104. Embedded single-instance (SQLite job state + local-FS output + in-process worker pool) is the zero-config default; a multi-instance topology (PostgreSQL job state + S3-compatible output with pre-signed download URLs) is selected at startup with no handler changes. Ships with an external smoke workflow that exercises both topologies on every run and an Inferno Bulk Data IG v2.0.0 conformance workflow against the full SMART Backend Services + Keycloak stack.

Why

Bulk export is the API population-health platforms, payer-provider exchanges, registries, and research/AI pipelines converge on. CRUD + search are not enough once a workload needs every Observation for every patient in a cohort — that's a data-engineering problem (long-running work, durable state, fileserver bandwidth, multi-instance fan-out) rather than a request/response one. The IG defines an asynchronous, manifest-based, NDJSON-over-HTTPS pattern; this PR ships it as a first-class HFS subsystem.

Changes

Persistence — new traits + types (helios-persistence)

  • core/bulk_export.rs: extended ExportRequest (until / elements / include_associated_data / patient_refs); extended ExportManifest (deleted / link); new StartExportInput, RawExportManifest, RawManifestEntry, ExportJobMetadata, ExportFileMetadata, ExpiredExportRef. BulkExportStorage trait grows start_export(StartExportInput), get_export_manifest -> RawExportManifest, plus get_export_job_metadata /get_export_file_metadata /count_active_exports / list_expired_exports. GroupExportProvider grows get_group_members_with_periods (default impl + SQLite/Postgres overrides) so the _since-newly-added filter can read Group.member.period.start.
  • core/bulk_export_output.rs: new ExportOutputStore trait + ExportPartKey (with embedded fencing_token), ExportPartWriter, FinalizedPart, DownloadUrl. Decouples where the bytes go from job state.
  • core/bulk_export_worker.rs: new ExportClaimStrategy, fully-fenced ExportWorkerStorage (every mutation guarded by (worker_id, fencing_token); 0 affected rows ⇒ LeaseError::LeaseLost), BulkExportJobStore marker trait, DefaultExportWorker<Js, Dp, Os> runtime that drives a claimed job under its lease, applies _typeFilter/ _since / _until / _elements, resumes from persisted cursors, and honors since_newly_added=exclude. The worker now branches on (level, patient_refs): Patient + non-empty patient_refs delegates to fetch_patient_compartment_batch so POST /Patient/$export?patient=… actually scopes to those patients (previously it ignored the filter and returned every resource of each requested type).
  • BulkExportError::LeaseLost variant.

Persistence — backends

  • SQLite (backends/sqlite/): v7→v8 schema migration (lease columns + part_index/fencing_token on bulk_export_files + 0-based-sequential part_index backfill before the unique index); ExportClaimStrategy (process-local mutex), fenced ExportWorkerStorage, get_group_members_with_periods, nested-Group flattening with cycle guard. Patient-level export query parameter binding fixed.
  • PostgreSQL (backends/postgres/): v7→v8 migration (ADD COLUMN IF NOT EXISTS + ROW_NUMBER() backfill); PostgresSkipLocked claim via SELECT … FOR UPDATE SKIP LOCKED; fenced ExportWorkerStorage; correct int4 / int8 bind sites for bulk_export_progress / bulk_export_files. Cursor timestamps are now parsed with DateTime::parse_from_rfc3339 and bound as DateTime so the wire type matches the inferred TIMESTAMPTZ — without this fix, every paginated export job failed on its second fetch_export_batch with a TEXT/TIMESTAMPTZ type-mismatch error.
  • S3 (backends/s3/): removed BulkExportStorage impl + synchronous run_export_job (S3 is output-only — job state lives in SQLite/Postgres); kept ExportDataProvider; added stub Patient/GroupExportProvider. New S3OutputStore (multipart upload to MinIO/S3, pre-signed GET via new S3Api::presign_get over aws_sdk_s3::presigning::PresigningConfig).
  • Local FS (backends/local_fs/): new LocalFsOutputStore (tokio::fs + .tmp→atomic rename, idempotent delete).
  • MongoDB/Elasticsearch: stub ExportDataProvider/Patient/GroupExportProvider returning UnsupportedCapability.
  • CompositeStorage: gains export_provider: Option set by with_full_primary (now bounded T: GroupExportProvider); delegates the three traits to the primary or returns UnsupportedCapability.

REST (helios-rest)

  • bulk_export_auth.rs: ExportFileAuth trait + BearerScopeAuth default (ownership against job_owner_subject or system/* wildcard, system/{ResourceType}.rs scope check; None principal short-circuits when auth is disabled).
  • handlers/bulk_export.rs: three route-specific kick-off wrappers (system_/patient_/group_export_kickoff_handler) over a shared kickoff_export; status / cancel / download. Parses repeated query params via url::form_urlencoded, validates _typeFilter (rejects result-control params), enforces SmartScopePolicy per requested resource type + Group, enforces the per-tenant cap via count_active_exports, builds StartExportInput with frozen kick-off metadata, assembles the wire ExportManifest from RawExportManifest + ExportOutputStore::download_url, runs the two-step output-then-job teardown on cancel, and emits audit events at every lifecycle step.
  • state.rs: AppState gains Arc + Arc + Arc + Arc + with_bulk_export(...).
  • config.rs: BulkExportConfig with the full HFS_BULK_EXPORT_* env surface and validation (rejects local-fs + requires_access_token=false).
  • routing/fhir_routes.rs: routes registered before the /{resource_type} catch-all; adds ExportDataProvider + PatientExportProvider + GroupExportProvider to the router's S bound.
  • lib.rs: new create_app_with_auth_and_bulk_export(storage: Arc<S>, …, BulkExportBundle) sharing the inner build_app.
  • handlers/capabilities.rs: advertises $export system-level operations + per-resource Patient.$export / Group.$export + IG instantiates.
  • handlers/compartment.rs: refactored to call helios_fhir::get_compartment_params.

Auth (helios-auth)

  • New DisabledJtiCache — no-op JtiCache implementation that disables the JWT replay-protection cache. Re-exported from the crate root and selectable in HFS via the existing HFS_AUTH_JTI_BACKEND setting; lets deployments that don't require replay protection (or that handle it upstream) avoid a Redis/SQLite dependency.
  • discovery.rs: SMART well-known metadata now advertises token_endpoint_auth_signing_alg_values_supported, code_challenge_methods_supported (S256), and adds authorization_code to grant_types_supported when an authorization endpoint is configured. Required by Inferno SMART App Launch IG STU2 / Backend Services discovery checks; without these, the SMART Backend Services Inferno group fails the well-known capability test before it can establish a bearer token.

helios-fhir

  • lib.rs: free get_compartment_params(version, compartment_type, resource_type) dispatching per FhirVersion to the per-version generated lookups.

helios-fhirpath

  • reference_key_functions.rs: drop a redundant & in a format! argument (clippy nit surfaced once the workspace was built with the bulk-export feature graph).

helios-hfs

  • main.rs: switched ServerConfig::parse() → ::from_env() so #[arg(skip)] sub-structs (multitenancy, bulk_export) actually populate from env (pre-existing bug). New generic build_bulk_export helper supporting embedded (dedicated SqliteBackend job store + LocalFsOutputStore) and postgres-s3 (PostgresBackend + S3OutputStore); wired into start_sqlite and start_postgres. The embedded backend now create_dir_alls the parent of {HFS_BULK_EXPORT_OUTPUT_DIR}/bulk_export.db before opening the SQLite connection (without this, HFS exited on startup whenever the configured output dir hadn't been pre-created — broke CI smoke jobs that only mkdird RESULTS_DIR). spawn_export_workers launches HFS_BULK_EXPORT_WORKER_CONCURRENCY claim/run loops + a periodic cleanup task that pages list_expired_exports and runs the two-step teardown. Recognizes the disabled JTI backend.
  • Cargo.toml: adds chrono.

Ops + docs

  • docker/bulk-export/docker-compose.yml: HFS + Postgres + MinIO + Keycloak; the substrate for the manual Inferno workflow and multi-instance smoke.
  • .github/workflows/bulk-export-smoke.yml (new): external smoke workflow that brings up HFS in both sqlite/local-fs and postgres/s3 topologies and runs the smoke runner against each on every push.
  • crates/hfs/tests/bulk_export/run_external_bulk_export_smoke.sh (new, ~500 lines): end-to-end smoke runner —kick-off → status poll → manifest → file download → DELETE cancel → 404 verification. Header parsing uses grep-i instead of gawk-only IGNORECASE, so it works on mawk-based runners (without this, every smoke job silently dropped the Content-Location header and failed the kick-off step).
  • .github/workflows/inferno-bulk-data.yml: rebuilt against the shared docker-compose stack; runs SMART Backend Services and Bulk Data Export Tests as two sequential test_runs (the suite ID can't be POSTed as a test_group_id — Inferno returns 422, "must be run as part of a group"); carries the smart_auth_info produced by the SMART group into the Export run so kick-offs are authenticated; passes the now-mandatory since_timestamp input (2000-01-01T00:00:00.000Z); seeded heart-rate Observation gains the vital-signs category so R4 profile validation passes; allows Inferno's 5-minute private_key_jwt assertion lifetime on the generated Keycloak client; treats the file-server TLS test as known-omitted in the HTTP-only CI setup; preserves the MinIO client alias across job steps. Suite + group identifiers are still read from kit source at runtime, not hard-coded.
  • crates/auth/README.md: documents the new disabled JTI backend.
  • CLAUDE.md: Bulk Data Export endpoint table, full env-var table with defaults, single-instance vs multi-instance recipes, behavior notes.
  • crates/hfs/README.md: Bulk Data Export quick-start.
  • ROADMAP.md: $export marked shipped; $bulk-submit (ingestion) called out as next.
  • Cargo.lock: lettre bumped to address a security audit finding.
  • codecov.yml: crates/hfs/src/main.rs excluded from coverage (binary entry point not reachable from unit/integration tests).

Testing

  • cargo fmt --all — green
  • cargo build (default) — green
  • cargo clippy -p helios-persistence -p helios-rest -p helios-hfs --features R4,postgres,s3,mongodb,elasticsearch,audit --all-targets -- -D warnings (CLAUDE.md lint allow-list) — green
  • cargo test -p helios-persistence --features R4 --lib bulk — 49 pass / 0 fail (incl. nested-Group cycle
  • guard, stale-worker fencing, end-to-end DefaultExportWorker, v7→v8 duplicate-row backfill migration, since_newly_added=exclude filter)
  • cargo test -p helios-persistence --features R4 --test sqlite_tests — adds a Patient-level export-without-_since integration test (~83 LOC) covering the bug fix
  • cargo test -p helios-rest --features R4 --test bulk_export — 13 pass / 0 fail (full lifecycle, status mappings, _typeFilter validation, strict/lenient handling, capability statement, metadata-lookup failure paths, plus 5 new integration tests: POST kick-off with Parameters body, _since, invalid _since, _elements, valid _typeFilter)
  • New unit-test coverage for BearerScopeAuth (5 cases: no-principal bypass, owner + scope, wildcard override, missing-read-scope rejection) and BulkExportConfig::validate() (6 cases — every error branch). Closes the codecov/patch gap (69.70 % → ≥ 75.74 %).
  • cargo test -p helios-persistence --features postgres,R4 --test postgres_tests --postgres_integration::postgres_integration_export — 3 pass / 0 fail against a real Postgres testcontainer (claim SKIP LOCKED, stale-worker fencing across reclaim, count_active/list_expired). The export-claim test now serializes against a process-wide mutex so it doesn't race other postgres integration tests sharing the same container.
  • RUN_MINIO_S3_TESTS=1 cargo test -p helios-persistence --features s3,R4 --test minio_s3_tests --test_minio_s3_output_store — 1 pass / 0 fail against MinIO (write → finalize → pre-signed GET → reader → idempotent delete)
  • External smoke workflow (bulk-export-smoke.yml): runs on every push. Brings up HFS in sqlite/local-fs and postgres/s3 topologies, executes the full kick-off → poll → download → cancel → 404 lifecycle in each; both jobs green.
  • Multi-instance smoke (manual): brought up Postgres + MinIO + two release/hfs instances on ports 8080/8081 sharing Postgres job state. Kicked off /Patient/$export against instance 1 (202 + Content-Location); polled the status URL on instance 2 (202 then 200 + manifest); manifest carried requiresAccessToken: false and a pre-signed AWS-SHA256 S3 URL pointing at MinIO; downloaded directly from MinIO (5 NDJSON lines); DELETE on instance 1 → 202; subsequent poll on instance 2 → 404.
  • Inferno Bulk Data IG v2.0.0: cloned inferno-framework/bulk-data-test-kit, executed bundle exec inferno execute --suite bulk_data_v200 against the stack. Result: 16 leaf pass / 8 leaf fail / 44 skip; inferno execute exit 3. All 12 bulk-data export-side leaf tests passed (system / patient / group $export + Content-Location, capability advertisement on each level, cancel 202, post-cancel poll 404). The 8 leaf failures are SMART Backend Services (1.1.02, 1.2.02–1.2.05) and TLS (2.1.01) prerequisites — known-deferred environmental requirements that need a configured Keycloak realm with HFS_AUTH_ENABLED=true and HTTPS termination; both are wired into docker/bulk-export/ for production but were not enabled for this local run. Skips are dependent tests Inferno auto-skips when their kick-off chain is broken by a SMART/TLS skip.

Notes

  • Migrations: schema bumps to v8 on both SQLite and PostgreSQL. Forward-only. The bulk_export_files.part_index backfill runs before the new unique index is created, so existing deployments with multiple file rows per (job, file_type, resource_type) upgrade cleanly. A focused migration test (test_migration_v7_to_v8_backfills_duplicate_file_rows) covers the duplicate-row case.
  • Trait contract changes: BulkExportStorage::start_export and get_export_manifest signatures changed; in-tree backends (SQLite, Postgres, S3) are updated. External downstream impls of BulkExportStorage will need to adopt StartExportInput and RawExportManifest.
  • S3 backend posture: S3 is now output-only for bulk export. Its BulkExportStorage impl was removed; only ExportDataProvider remains. Job state must live in SQLite (HFS_BULK_EXPORT_BACKEND=embedded) or PostgreSQL (postgres-s3). An HFS_STORAGE_BACKEND=s3 deployment now picks up bulk export through the embedded SQLite job store with no additional config.
  • Auth: bulk export endpoints sit inside the existing auth middleware. BearerScopeAuth validates download requests against the job owner subject or a system/*.rs wildcard. With HFS_AUTH_ENABLED=false (default), enforcement is bypassed — matching the rest of HFS. The new disabled JTI backend lets deployments that don't need replay protection skip Redis/SQLite for the JTI cache; SMART discovery additions (token_endpoint_auth_signing_alg_values_supported, S256, authorization_code) are required for Inferno SMART Backend Services / STU2 conformance.
  • Patient-level export filter: POST /Patient/$export?patient=Patient/123 now actually scopes to the listed patient compartments. Before this fix, the patient_refs field on ExportRequest was populated but never consulted by the worker, so every resource of every requested type was returned.
  • Postgres cursor binding: the keyset cursor's RFC 3339 timestamp is now parsed and bound as DateTime in fetch_export_batch / fetch_patient_compartment_batch. Without this, the second batch of any paginated export against PostgreSQL failed with a TEXT/TIMESTAMPTZ type-mismatch — which only ever showed up on jobs large enough to exceed HFS_BULK_EXPORT_BATCH_SIZE.
  • Pre-existing config fix: main() now uses ServerConfig::from_env() instead of ::parse(). This was needed because multitenancy and bulk_export are #[arg(skip)] for clap and were therefore never populated from env in the binary; previously HFS_TENANT_* env vars also weren't fully reaching the binary through this code path. Behavior change: env-derived multitenancy + bulk-export config now actually applies.
  • Embedded job-store path: {HFS_BULK_EXPORT_OUTPUT_DIR}/bulk_export.db is created on demand — the bootstrap now create_dir_alls the parent before opening SQLite, rather than requiring callers to pre-create it.
  • Smoke runner portability: header parsing in run_external_bulk_export_smoke.sh uses grep -i | sed | tr -d '\r' instead of awk … IGNORECASE=1 so it runs on mawk (default awk on the self-hosted runners) as well as gawk.
  • Inferno workflow: workflow_dispatch-only (matches the existing inferno-us-core.yml / inferno-subscription.yml). The local execution above uses HTTP without auth; CI runs against the full docker/bulk-export/ stack with Keycloak + HTTPS. The workflow now POSTs SMART Backend Services and Bulk Data Export Tests as two sequential test_runs (Inferno rejects the suite ID as a test_group_id), passes smart_auth_info from the first into the second so kick-offs are authenticated, supplies the now-mandatory since_timestamp, and seeds an Observation that satisfies the Heart Rate profile. Suite ids are read from kit source so they don't drift.
  • since_newly_added=exclude uses Group.member.period.start to filter "patients added after _since". Default is include (return everything).
  • Worker concurrency / leasing: defaults to 2 workers per pod, 60-second leases with 20-second heartbeats; tunable via HFS_BULK_EXPORT_*. Stale-worker fencing is verified by integration tests on both SQLite and Postgres.
  • Dependency bump: lettre updated in Cargo.lock to clear a security audit finding.

Implements Discussion #104.

aacruzgon added 30 commits May 15, 2026 11:44
Free function exposed at crate root that dispatches per FhirVersion to
the existing helios_fhir::{r4,r4b,r5,r6}::get_compartment_params helpers.
Lets persistence reuse the lookup without depending on helios-rest.
…t handler

Drops the private get_compartment_params_for_version wrapper in favor
of the new shared dispatch on the helios-fhir crate.
Returned by fenced ExportWorkerStorage methods when a stale worker's
mutation is rejected because the job has been reclaimed.
- ExportRequest gains until / elements / include_associated_data / patient_refs
- ExportManifest gains deleted / link (IG-required)
- New StartExportInput bundles kickoff metadata (transaction_time,
  request_url, owner_subject, fhir_version)
- New RawExportManifest / RawManifestEntry: storage-side manifest carrying
  ExportPartKey rather than wire URLs
- New ExportJobMetadata, ExportFileMetadata, ExpiredExportRef
- New GroupExportProvider::get_group_members_with_periods (default impl
  derived from get_group_members) so backends can surface
  Group.member.period.start for the _since-newly-added filter
- BulkExportStorage gains start_export(StartExportInput) signature,
  RawExportManifest return, get_export_job_metadata,
  get_export_file_metadata, count_active_exports, list_expired_exports
ExportPartKey (with embedded fencing_token), ExportPartWriter (line +
byte counter over a boxed AsyncWrite), FinalizedPart, DownloadUrl, and
the ExportOutputStore trait. Decouples 'where the bytes go' from the
job-state backend.
…rker

- WorkerId, ExportJobLease (with fencing_token), LeaseError
- ExportClaimStrategy: claim_next + heartbeat + release
- ExportWorkerStorage: every method fenced by (worker_id, fencing_token)
  so a stale worker cannot mutate progress, file rows, or terminal
  status after its lease has been reclaimed
- BulkExportJobStore marker trait (BulkExportStorage + ExportWorkerStorage
  + ExportClaimStrategy) for bootstrap-time selection of the job store
- DefaultExportWorker drives a claimed job to completion under its
  lease, applying _typeFilter / _since / _until / _elements, supporting
  resume from the persisted cursor, and honoring since_newly_added=exclude
  via Group.member.period.start
…umns

bulk_export_jobs: worker_id, lease_expiry, fencing_token, heartbeat_at,
owner_subject, request_url, fhir_version + idx_export_jobs_claim.
bulk_export_files: part_index, fencing_token + a backfill that assigns
0-based sequential part_index per (job_id, file_type, resource_type)
before creating the unique idx_export_files_part. Includes test
exercising the duplicate-row backfill case.
- start_export(StartExportInput): persists frozen kickoff metadata
- get_export_manifest -> RawExportManifest assembled from rows
- get_export_job_metadata / get_export_file_metadata
- count_active_exports / list_expired_exports
- ExportClaimStrategy via process-local mutex + INSERT/UPDATE
- ExportWorkerStorage: every mutation fenced by worker_id + fencing_token
  (UPDATE … WHERE worker_id=? AND fencing_token=? for terminals,
  WHERE EXISTS-guarded ON CONFLICT upserts for progress + file rows)
- get_group_members_with_periods reads Group.member.period.start
- resolve_group_patient_ids flattens nested Groups with a cycle guard
- Tests: stale-worker fencing, claim/lifecycle, group-cycle, since_newly_added
ALTER TABLE bulk_export_jobs ADD COLUMN IF NOT EXISTS … for the lease
fields, owner_subject, request_url, fhir_version. ALTER bulk_export_files
for part_index + fencing_token; ROW_NUMBER() backfill before the unique
idx_export_files_part.
PostgresSkipLocked claim strategy (FOR UPDATE SKIP LOCKED inside a
transaction), fully-fenced ExportWorkerStorage (every mutation
guarded by worker_id + fencing_token), all new BulkExportStorage
methods, get_group_members_with_periods + nested-Group flattening with
cycle guard. Bind sites use i32 / i64 to match the actual column
types on bulk_export_progress / bulk_export_files.
Default impl reports unsupported; AwsS3Client overrides it via
PresigningConfig from the AWS SDK. Used by S3OutputStore to mint
direct-from-S3 download URLs for the bulk-export manifest.
Reserved for future S3OutputStore integrations; unused now that S3 is
output-only and keys live in S3OutputStore::object_key.
S3 is no longer a bulk-export job-state backend; the model is preserved
for a future read-modify-write integration.
Reserved for future S3OutputStore integration; unused now that the
synchronous BulkExportStorage path has been removed.
S3 is output-only for bulk export — job state lives in SQLite or
PostgreSQL. Drops the synchronous start_export / run_export_job path
and adds stub PatientExportProvider / GroupExportProvider impls
returning UnsupportedCapability so an S3-resource-storage deployment
satisfies the trait hierarchy.
ExportOutputStore impl backed by AwsS3Client. open_writer returns a
local scratch tempfile; finalize_part fsyncs + put_object's it to S3
under {tenant}/exports/{job_id}/{file_type}-{rt}-{part}-{token}.ndjson.
download_url either pre-signs (Auto / AlwaysPresigned) or returns an
HFS-served URL (AlwaysToken). delete_job_outputs lists + deletes by
prefix. AccessTokenMode encodes the requires_access_token posture.
bulk_export_start_manifest_and_delete is gone (the impl was removed);
bulk_export_invalid_format_and_fetch_batch_cursor is reduced to the
fetch_export_batch cursor case which still exercises ExportDataProvider.
postgres_integration_export_claim_skip_locked: claim ordering, fencing
token bumps. postgres_integration_export_stale_worker_fenced_out:
LeaseLost on every fenced ExportWorkerStorage call after reclaim.
postgres_integration_export_count_active_and_expire: count + list
filtering. claim_specific helper drains foreign jobs so tests can
cope with the shared SHARED_PG container.
…add S3OutputStore round-trip

The lifecycle test now exercises the remaining ExportDataProvider
surface. Adds test_minio_s3_output_store_round_trip: write → finalize
→ pre-signed GET → open_reader → idempotent delete against MinIO.
…ort_batch

S3 is no longer a bulk-export job-state backend; verify the
ExportDataProvider data feed instead.
ExportOutputStore impl backed by tokio::fs. open_writer creates a
.tmp under ${HFS_DATA_DIR}/exports/{tenant}/{job_id}/, finalize_part
fsyncs + atomic rename, download_url returns an HFS-served URL with
requires_access_token=true, open_reader serves the file, and
delete_job_outputs is idempotent. Includes a write→finalize→read→delete
round-trip test.
ExportDataProvider / PatientExportProvider / GroupExportProvider impls
returning UnsupportedCapability so MongoDB can satisfy the trait
hierarchy without supporting bulk export as a primary.
CompositeStorage gains an export_provider: Option<DynGroupExportProvider>
field set by with_full_primary (with the new GroupExportProvider bound on
T). Each trait method delegates to the primary or returns
UnsupportedCapability when no primary impl is wired in.
Authorizes the HFS-served (requires_access_token=true) download path
using the helios_auth Principal — checks ownership against
job_owner_subject (or system/* wildcard) plus a system/{ResourceType}.rs
scope. Pre-signed downloads bypass HFS and never reach this trait.
bulk_export_jobs: Arc<dyn BulkExportJobStore>, bulk_export_output:
Arc<dyn ExportOutputStore>, bulk_export_file_auth: Arc<dyn ExportFileAuth>,
plus an Arc<BulkExportConfig>. New with_bulk_export(...) builder and
accessors so handlers can reach the subsystem behind feature toggles
without touching the resource-storage S type parameter.
Full configuration surface: enabled, backend (embedded|postgres-s3),
output_backend (local-fs|s3), output_dir, s3_bucket, requires_access_token
(auto|true|false), file_url_ttl_secs, output_ttl_secs, worker_concurrency,
disable_local_worker, max_concurrent_per_tenant, batch_size,
lease_duration_secs, heartbeat_interval_secs, cleanup_interval_secs,
since_newly_added (include|exclude). validate() rejects local-fs +
requires_access_token=false (no pre-signed URL capability).
aacruzgon added 9 commits May 15, 2026 19:07
… batch queries

The keyset cursor stores timestamps as RFC 3339 strings (e.g.
"2026-05-15T22:35:24Z|<id>"). When the second+ batch was fetched the
cursor part was pushed into the tokio-postgres param list as a Rust
String (TEXT). PostgreSQL's extended query protocol infers the expected
type from the column context (TIMESTAMPTZ), so it rejected the TEXT
binding with a type error, failing every paginated export job after its
first batch.

Fix all three cursor sites in fetch_export_batch and
fetch_patient_compartment_batch to parse the timestamp with
DateTime::parse_from_rfc3339 and push a DateTime<Utc> so the wire type
matches the inferred TIMESTAMPTZ.
POST /Patient/$export accepts a `patient` parameter to scope the export
to specific patients. The request stores those references in
ExportRequest::patient_refs, but the worker's Patient-level branch
always called fetch_export_batch (unfiltered), so every resource of
each type was returned regardless of which patient was requested.

Add a new match arm that fires when ExportLevel::Patient and
patient_refs is non-empty: it strips the "Patient/" prefix from each
ref and delegates to fetch_patient_compartment_batch, which correctly
scopes results to those patients' compartments. The existing arm (no
patient filter) is unchanged for generic /Patient/$export calls.
Allow the generated Keycloak client to accept Inferno's five-minute private_key_jwt assertion lifetime, avoiding token endpoint 400s that cascade into export 401s.

Expose S256 in SMART discovery so the Inferno SMART Backend Services checks see the expected code_challenge_methods_supported metadata.
Carry the access-token-bearing smart_auth_info emitted by the SMART Backend Services group into the export group so Inferno sends authenticated kickoff requests.

Advertise authorization_code in SMART discovery when an authorization endpoint is configured, matching Inferno STU2 well-known expectations.
Add the vital-signs category to the seeded heart-rate Observation so R4 profile validation passes when Inferno applies the Heart Rate profile.

Treat file-server TLS checks as known omitted in the HTTP-only CI export file setup.
Comment thread crates/hfs/README.md Outdated
```

The full configuration surface (`HFS_BULK_EXPORT_*` env vars, single- vs
multi-instance recipes, parameter behavior) is documented in `CLAUDE.md`.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not be documented here - should be in the export README.md

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have fixed this issue

Comment thread crates/hfs/README.md Outdated
The full configuration surface (`HFS_BULK_EXPORT_*` env vars, single- vs
multi-instance recipes, parameter behavior) is documented in `CLAUDE.md`.
A docker-compose stack for the multi-instance topology lives at
`docker/bulk-export/docker-compose.yml`, and a manual Inferno Bulk Data IG
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this provided as an example, or is it used for the GitHub Action workflow tests? If only for workflow tests, we can move this comment to the export README.md and not feature it on the hfs README.md. In the future, we will be providing a library of common configuration examples.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it is a provided example, is not part of the GitHub Action workflow test. I have address the documentation for this accordingly.


/// Deletes the object at the given key. Succeeds even if the key does not
/// exist.
/// exist. Reserved for the Phase 2 `S3OutputStore` integration.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still required - dead code?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have fixed this issue

}

/// Key for the JSON state object of a bulk export job.
#[allow(dead_code)]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#[allow(dead_code)] - check these in this file

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have fixed this issue

/// Reserved for the Phase 2 `S3OutputStore` integration; the S3 backend is no
/// longer a bulk-export *job-state* backend (job state lives in SQLite or
/// PostgreSQL), so this type is currently unused.
#[allow(dead_code)]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#[allow(dead_code)] - check this

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have fixed this issue


/// Deletes the object at `key`. Succeeds silently if the key does not exist.
/// Reserved for the Phase 2 `S3OutputStore` integration.
#[allow(dead_code)]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#[allow(dead_code)] - check

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have fixed this issue

Comment thread .github/workflows/bulk-export-smoke.yml Outdated
{"backend":"sqlite","bulk_mode":"postgres-s3","expectation":"full"},
{"backend":"postgres","bulk_mode":"embedded-local","expectation":"full"},
{"backend":"postgres","bulk_mode":"postgres-s3","expectation":"full"},
{"backend":"sqlite-elasticsearch","bulk_mode":"embedded-local","expectation":"endpoint-unavailable"},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For these backend combinations - why are they unsupported or endpoint-unavailable? Seems like we should be able to support $export on all of the backend types with the exception of s3-only. For s3-only, we should be able to support these $export parameters, but the others will not be feasible without a lot of extra filtering logic that doesn't make much sense to implement at the moment. _outputFormat, _type, _elements, patient, includeAssociatedData, organizeOutputBy, allowPartialManifests

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have corrected this issue

serde_json::json!({
let mut operations = vec![
serde_json::json!({
"name": "validate",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't yet support $validate - why was this added?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The resource validate already existed before my PR, all I did was refactor the inline operation into what you see right now so it could conditionally append Bulk Data operations: export, patient-export, group-export, Bulk Data instantiates when HFS_BULK_EXPORT_ENABLED=true.

Copy link
Copy Markdown
Contributor

@smunini smunini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start!! See comments.

aacruzgon added 12 commits May 19, 2026 11:48
S3 no longer owns bulk-export job state after the output-store split. Job rows, progress, leases, file metadata, and manifests live in SQLite or PostgreSQL while S3 stores finalized output objects.

Update the persistence README capability notes, S3 backend scope, S3+Elasticsearch guidance, and object model to describe the current S3OutputStore layout.
S3OutputStore calls S3Api::delete_object during export-output cleanup, so the trait method is no longer dead code. Remove the stale dead-code allowance and old Phase 2 note.
S3 no longer stores bulk-export job state, progress, manifests, or output parts under the old bulk/export/jobs keyspace. Remove the unused helper methods for that obsolete object layout.
Bulk-export job state now belongs to SQLite or PostgreSQL. Remove the unused S3 ExportJobState type and update the module docs so S3 models only describe history and bulk-submit state.
The S3Backend helper was only a dead-code wrapper around S3Api::delete_object. S3OutputStore performs cleanup through the S3Api trait directly, so the backend wrapper can be deleted.
Upgrade astral-tokio-tar from 0.6.1 to 0.6.2 in Cargo.lock to clear RUSTSEC-2026-0145, which is pulled in through testcontainers.
# Conflicts:
#	.github/workflows/bulk-export-smoke.yml
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants