
feat(hts): Postgres backend parity — 32.8% → 80.5% on tx-ecosystem-pg#107

Closed
mauripunzueta wants to merge 409 commits into main from feat/hts-terminology-service

@mauripunzueta
Contributor

Summary

Brings the helios-hts PostgreSQL backend from thin scaffolding (~32.8% pass on the HL7 Tx Ecosystem IG test bench) to an 80.5% pass rate, closing most of the gap to the established SQLite backend (100% pass baseline). The two paired workflow files added in #105 (tx-ecosystem-postgres.yml, hts-benchmark-postgres.yml) measure the trajectory; the 115 residual failures are documented as follow-up work.

Trajectory across 13 commits

| # | Commit | Pass | Δ | Theme |
|---|---|---|---|---|
| 1 | 0ebb44ce | | | code_system_exists fast EXISTS override |
| 2 | c4d0901f | 40.7% | +7.9 | P1: VS $validate-code semantics cluster D (~1072 LOC) |
| 3 | c4f7e047 | 45.2% | +4.5 | P1.5: version-mismatch detection (detect_cs_version_mismatch + detect_vs_pin_unknown) |
| 4 | 2f23396d | 45.2% | 0 | input_form-aware location strings |
| 5 | b90eb03e | 45.2% | 0 | Locally-aliased property codes (cs_property_local_codes) |
| 6 | 5d3c0016 | 50.4% | +5.2 | Echo display on version-mismatch (unlocked the version family) |
| 7 | c64f5e7f | 62.8% | +12.4 | P2.1: accept inline ValueSet on $expand |
| 8 | cb1c88e9 | 69.3% | +6.5 | P2.2: compose =/is-a/regex filter operators |
| 9 | b3a37749 | 73.7% | +4.4 | P2.3: thread CS version onto expansion contains items |
| 10 | cc1a4fe8 | 74.0% | +0.3 | P2.5: CodeSystem $validate-code semantics port (~475 LOC) |
| 11 | 0bf21cf5 | 78.9% | +4.9 | Bubble HtsError::NotFound for unresolvable VS (top-level OperationOutcome) |
| 12 | beec4d12 | 80.5% | +1.6 | Honour force-system-version / system-version on $expand |
| 13 | 025c2155 | 80.5% | 0 | Revert experimental version-strict candidate selection |

Code totals vs main

| File | Lines | Notes |
|---|---|---|
| crates/hts/src/backends/postgres/value_set.rs | 748 → ~2806 (3.75×) | Full cluster-D port, inline-VS, filter ops, version threading, force-system-version |
| crates/hts/src/backends/postgres/code_system.rs | 784 → 1245 (1.6×) | CS $validate-code semantics with IG-canonical issues |
| crates/hts/src/ecl/{evaluator,mod}.rs | +15 lines | Latent compile bug fix: gate evaluator on sqlite feature |
| .github/workflows/{tx-ecosystem,hts-benchmark}-postgres.yml | (carried from #105/#106) | PG variants of the SQLite workflows |

Total: 7 files, +2868 / -230 lines.

Notable architectural achievements

  • VS $validate-code shape parity: full ValidateCodeResponse field population (system, cs_version, inactive, issues, caused_by_unknown_system, concept_status, normalized_code) — was only result/message/display at session start.
  • Compose filter operators: = (with boolean-false-as-absence semantics), is-a (recursive CTE on concept_hierarchy), regex (PG ~). Honours locally-renamed property aliases via the FHIR property[].uri mapping.
  • Inline ValueSet on $expand: accept full value_set body in lieu of canonical URL; extract .compose and reuse the resolution path. Closed 144 failures across notSelectable, language, overload, parameters, simple, extensions, permutations families.
  • force-system-version / system-version honoured: PG compute_expansion now respects the request-time version overrides per FHIR override order (force > include.version > default).
  • ?fhir_vs URL implicit ValueSets: targeted point lookup + recursive-CTE IsA walks, no full-CodeSystem materialization.
  • Latent compile bug fix: crates/hts/src/ecl/evaluator.rs used rusqlite unconditionally and broke --features postgres builds. Gated on sqlite feature (parser stays unconditional so a future PG ECL evaluator can reuse the AST).
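The override order named in the force-system-version bullet (force > include.version > default) can be sketched as a plain fallback chain. This is a minimal illustration, not the code in compute_expansion; the function and parameter names are hypothetical, and it assumes "default" means the request-level system-version value.

```rust
/// Pick the CodeSystem version to expand against.
/// Precedence: force-system-version, then the version pinned on
/// compose.include, then the request default (assumed: system-version).
fn resolve_system_version<'a>(
    force_system_version: Option<&'a str>,
    include_version: Option<&'a str>,
    default_system_version: Option<&'a str>,
) -> Option<&'a str> {
    force_system_version
        .or(include_version)
        .or(default_system_version)
}
```

Returning None leaves the choice to whatever version the backend would pick with no overrides at all.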

Workflow / CI infrastructure (already in #105, partially in open #106)

  • .github/workflows/tx-ecosystem-postgres.yml — parallel to the SQLite tx-ecosystem workflow. Spins up an ephemeral postgres:16 via docker run on the self-hosted runner (the runner uses a remote Docker daemon — DOCKER_HOST / DOCKER_HOST_IP secrets, -p 0:5432 random host port, docker port discovery). SOFT assertion during this parity-porting period: fails only when the validator can't run at all; test-fail counts surface in the step summary and artifacts.
  • .github/workflows/hts-benchmark-postgres.yml — k6 against PG-backed HTS. Confirmed 0% error rate across passing scenarios; LK05 at 14k req/s @ 50 VUs.

The SQLite workflows (tx-ecosystem.yml, hts-benchmark.yml) are untouched — they keep their 100%-pass baseline and existing performance numbers.

Residual 115 failures (next-session work)

Categorized by diff shape:

  • 31 param_value_diff — heterogeneous text/field nuances
  • 12 vs_content_other — expansion-content per-test investigation
  • 11 missing_param:[issues, message] — likely converged result handling
  • 7 Parameters_vs_OperationOutcome — stored VSes PG can't find (e.g. valueset-withdrawn)
  • ~50 various vs_total:aN_eM, indirect-VS imports, missing [code,display,system,version] params, etc.

Top families: version (13), validation (14), overload (15), notSelectable (14), parameters (13).

A Plan-agent investigation is queued for a fresh session to scope the residual fails properly — they need coherent investigation, not opportunistic commits. Direction: the validate-bad-v1code4/overload family in particular needs deep tracing through operations/validate_code.rs's suppression logic (suppress_default_versionless_mismatch, suppress_force_system_version_mismatch) which appears to flip backend result=false to result=true under conditions I couldn't fully isolate locally.

Stacked PRs

Test plan

🤖 Generated with Claude Code

mauripunzueta and others added 30 commits May 1, 2026 19:40
Remove the !has_eq_filter guard from the FTS routing branch so that
requests with both a text filter (>=3 chars) and a property= filter
(EX08 pattern: is-a + bodySite= + 'fracture') also take the FTS-first
path.

apply_compose_filters_to_candidates already handles property= via
batch_property_eq_in_set, so the FTS candidates (~50-200 text matches)
are validated against hierarchy and property in batch -- replacing the
3-way JOIN over potentially thousands of property-matching descendants
followed by a Rust text scan.
…t case

The previous FTS-first attempt backfired: 'fracture' matches ~5000 SNOMED
concepts in FTS, and batch_descendants_in_set on 5000 codes causes 30s
timeouts at 50VU (vs 63 RPS before the bad commit).

Correct fix: keep property-first routing for has_eq_filter requests but
push the text filter into query_subtree_with_property via instr() so the
DB returns only text-matching rows in the same 3-way JOIN pass.

When filter_lower.len() >= 3 and has_eq_filter, sql_text = Some(&filter_lower)
is threaded through apply_compose_filters into query_subtree_with_property,
which uses a separate prepare_cached SQL variant with:
  AND (instr(lower(c.display), ?6) > 0 OR instr(lower(c.code), ?6) > 0)

The FTS-first path (hierarchy-only, EX07) is unchanged.
…EX06)

When all compose.include[] entries reference the same CodeSystem and carry
only property-equality filters (no hierarchy, no ECL, no explicit concept
lists), collapse the expansion into a single SQL CTE query instead of N×M
individual round-trips.

For EX06's 2-include × 2-property-filter case this reduces 6 SQL queries
(1 system_id lookup × 2 includes + 1 property_eq × 2 filters × 2 includes)
to a single UNION-of-INTERSECTs CTE:

  WITH inc0_p0 AS (SELECT concept_id FROM concept_properties WHERE property=? AND value=?),
       inc0_p1 AS (...),
       inc0    AS (SELECT concept_id FROM inc0_p0 INTERSECT SELECT concept_id FROM inc0_p1),
       inc1_p0 AS (...), inc1_p1 AS (...),
       inc1    AS (...),
       all_ids AS (SELECT concept_id FROM inc0 UNION SELECT concept_id FROM inc1)
  SELECT c.code, c.display FROM concepts c
  JOIN all_ids a ON a.concept_id = c.id WHERE c.system_id = ?

Also adds a system_id cache in the per-include fallback loop so that
multi-include composes with mixed filter types don't re-query code_systems
for the same URL on every iteration.
…ty compose

INTERSECT materialises and sorts both sides before finding the common
concept_ids.  For a broad filter like TTY='BN' (tens of thousands of
brand-name rows in RxNorm) this is O(N log N) in the large set even
when the second filter is tiny (e.g. tradename_of='CUI:161' ≈ 3 rows).

Replace with a UNION of driver + EXISTS sub-selects:

  SELECT c.code, c.display FROM concepts c
  WHERE c.system_id = ?
  AND c.id IN (
      SELECT cp0.concept_id FROM concept_properties cp0
      WHERE cp0.property = ?1 AND cp0.value = ?2
        AND EXISTS (SELECT 1 FROM concept_properties
                    WHERE concept_id = cp0.concept_id
                      AND property = ?3 AND value = ?4)
      UNION
      ...
  )

The driver scan uses idx_concept_properties_value(property,value,concept_id);
the EXISTS check uses idx_concept_properties_lookup(concept_id,property,value).
SQLite short-circuits EXISTS on the first matching row — no temp sets sorted.

Also change prepare() -> prepare_cached() so the compiled statement is
reused across calls on the same connection instead of being recompiled
on every DB cache miss.
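A rough sketch of how the driver + EXISTS statement could be assembled for any number of includes and filters. The function name and signature are hypothetical and the real try_multi_include_property_only may differ; the sketch only assumes that placeholders are numbered sequentially with system_id bound last.

```rust
/// Build the driver + EXISTS query for includes that carry only property= filters.
/// `filters_per_include[i]` is the number of (property, value) pairs on include i.
/// Returns the SQL text and the index of the system_id placeholder (always last).
/// The caller must pass at least one include with at least one filter.
fn build_property_only_sql(filters_per_include: &[usize]) -> (String, usize) {
    let mut next = 1usize; // next positional placeholder
    let mut branches: Vec<String> = Vec::new();
    for &n in filters_per_include.iter().filter(|&&n| n > 0) {
        // The first filter drives the scan over concept_properties.
        let mut branch = format!(
            "SELECT cp0.concept_id FROM concept_properties cp0 \
             WHERE cp0.property = ?{} AND cp0.value = ?{}",
            next,
            next + 1
        );
        next += 2;
        // Every further filter becomes a short-circuiting EXISTS probe.
        for _ in 1..n {
            branch.push_str(&format!(
                " AND EXISTS (SELECT 1 FROM concept_properties \
                 WHERE concept_id = cp0.concept_id AND property = ?{} AND value = ?{})",
                next,
                next + 1
            ));
            next += 2;
        }
        branches.push(branch);
    }
    let sql = format!(
        "SELECT c.code, c.display FROM concepts c WHERE c.id IN ({}) AND c.system_id = ?{}",
        branches.join(" UNION "),
        next
    );
    (sql, next)
}
```

For two includes with two filters each, the eight property/value bindings take ?1 to ?8 and system_id lands on ?9.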
…ontention (EX06)

Add InlineComposeIndex — an Arc<RwLock<HashMap>> keyed by the FNV-64
hash of the compose body — that mirrors the existing ImplicitIndex for
URL-based ValueSets.  Once a compose body is first evaluated the result
is stored in both the DB implicit_expansion_cache and the new in-memory
index.  On warm restart the index is pre-loaded from persisted cache rows
via prebuild_inline_compose_index().

Subsequent requests for the same inline ValueSet are served entirely from
process memory: no pool connection acquired, no tokio::task::spawn_blocking
entered.  This eliminates r2d2 pool contention under high concurrency and
should raise EX06 throughput from ~317 RPS (anti-scaling at VU=50) to
benchmark-limited RPS once the index is warm.
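A minimal sketch of an in-memory index keyed by a 64-bit FNV hash of the compose body. The FNV-1a variant, the type name, and the Vec<String> payload are assumptions for illustration; the actual InlineComposeIndex stores whatever ImplicitIndex stores.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// 64-bit FNV-1a over the raw compose bytes (the FNV variant is an assumption).
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Hypothetical stand-in for InlineComposeIndex: readers share the lock,
/// so a warm hit needs no pool connection and no blocking task.
#[derive(Clone, Default)]
struct InlineComposeIndexSketch {
    entries: Arc<RwLock<HashMap<u64, Arc<Vec<String>>>>>,
}

impl InlineComposeIndexSketch {
    fn get(&self, compose_body: &str) -> Option<Arc<Vec<String>>> {
        let key = fnv1a_64(compose_body.as_bytes());
        self.entries.read().unwrap().get(&key).cloned()
    }

    fn insert(&self, compose_body: &str, codes: Vec<String>) {
        let key = fnv1a_64(compose_body.as_bytes());
        self.entries.write().unwrap().insert(key, Arc::new(codes));
    }
}
```

Handing out Arc clones keeps the read lock held only for the map lookup, not for the lifetime of the response.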
…C ValueSets

Four VSAC ValueSet OIDs in the EX04 pool are absent from us.nlm.vsac@0.17.0:
  - 2.16.840.1.113762.1.4.1267.17  (lab test LOINC codes)
  - 2.16.840.1.114222.24.7.14      (infectious organism SNOMED codes)
  - 2.16.840.1.113762.1.4.1260.230 (chemotherapy RxNorm codes)
  - 2.16.840.1.113762.1.4.1078.781 (migraine medication RxNorm codes)

HTS returned 404 for these, causing ~40% of EX04 requests to fail.

Fix:
- Add fhir-bundle import format to the HTS CLI so plain JSON FHIR Bundles
  can be imported (auto-detected from the first 256 bytes).
- Add vsac-supplement.bundle.json with extensional ValueSets (compose-embedded
  display names) for the 4 missing OIDs — compose_page_fast serves these
  directly from the embedded displays with no DB lookup needed.
- Update hts-benchmark.yml to import the supplement before the licensed
  terminology, ensuring all 10 EX04 OIDs are present in the benchmark DB.
…operty+text paths

Add three focused tests that verify the key query code paths exercised by
the EX06 and EX08 benchmark scenarios:

- expand_multi_include_property_or_semantics: two includes with one
  property= filter each go through try_multi_include_property_only and
  return the UNION (OR across includes).

- expand_single_include_two_property_filters_and_semantics: one include
  with two property= filters calls query_property_eq twice and intersects
  (AND within one include).

- expand_inline_isa_property_and_text_filter_combined: is-a hierarchy +
  property= + text filter uses the sql_text push-down path in
  query_subtree_with_property; also asserts that a non-matching text
  filter returns an empty expansion (not an error).

Also fix the doc-comment example in try_multi_include_property_only: the
2x2 case had ?5 shown for system_id but the correct index is ?9 (params
are numbered sequentially, system_id is always the last binding).

[skip ci]
…led in this crate

helios-hts depends on helios-fhir without disabling default features, so an
R5-only hts build still pulls helios-fhir's default R4 feature. The transitive
helios-fhirpath dep (via helios-persistence with default-features = false) only
sees R5, so its cfg-gated match in `lookup_field_type` was non-exhaustive
against the R4 variant, breaking the tx-ecosystem R5 CI build.

Add a wildcard arm returning None — when an upstream enables a version on
helios-fhir without propagating it to helios-fhirpath, we simply have no
field-type table for that variant.

Match the polish of the hts-benchmark step summary: status badge in the
heading, metadata table (branch, commit, server/validator/Java versions,
test source), single-row results table, optional failing-tests table,
and a dedicated warning block surfacing the validator's exception when
it dies before running any tests.

Failure count is now derived from tx-test-output/actual/*.json
(excluding the always-written $versions.json metadata file and any
empty files), or from the TestReport's test[] array when available —
the previous logic counted report.json itself, inflating the failure
count even when the validator never ran.

The /metadata?mode=terminology endpoint emitted kind="terminology", which
is not a valid CapabilityStatementKind code (instance | capability |
requirements). The HL7 validator's txTests command rejects the response
when fetching the server's terminology capabilities, blocking the entire
tx-ecosystem suite before any test runs:

  Unknown CapabilityStatementKind code 'terminology'

Set kind to "instance" — this server is a running implementation, not an
abstract capability or requirements document.

find_loinc_paths used filename.starts_with("loinc"), which also matched
accessory CSVs like LoincPartLink_Primary.csv. The tx-ecosystem subset
ships the real table at LoincTable/Loinc.csv alongside that accessory,
and ZIP iteration order picked the wrong file — the importer then aborted
with "Required column 'LOINC_NUM' not found in CSV headers".

Tighten the predicate to exact match against loinc.csv or loinctable.csv
(the only names the LOINC distribution uses for the main table, in flat,
LoincTable/, or Loinc_<ver>/ layouts). Add a regression test that mirrors
the tx-ecosystem layout.
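The tightened predicate could look roughly like this. The helper name is hypothetical and the real check lives inside find_loinc_paths; the sketch assumes the comparison is done on the lower-cased file name only.

```rust
use std::path::Path;

/// True only for the main LOINC table, wherever it sits in the archive.
/// Accessory CSVs such as LoincPartLink_Primary.csv no longer match.
fn is_main_loinc_table(entry_path: &str) -> bool {
    Path::new(entry_path)
        .file_name()
        .and_then(|name| name.to_str())
        .map(|name| {
            let lower = name.to_ascii_lowercase();
            lower == "loinc.csv" || lower == "loinctable.csv"
        })
        .unwrap_or(false)
}
```

Matching on the file name rather than the full path keeps the check independent of the flat, LoincTable/, and Loinc_<ver>/ layouts.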
…ombined property+text queries

On first request, expand_inline_filtered detects all_prop_cacheable (compose
has property= + hierarchy filters only) and runs query_subtree_with_property
without a SQL text filter, collecting the full property-matched concept set.
That set is stored in a new PropertyResultCache (same Arc<RwLock<HashMap>>
type as ImplicitIndex / InlineComposeIndex) keyed by "prop-result:{fnv64-hex}"
of the compose body.

On all subsequent requests a new async hot path (hot path #3) fires before
spawn_blocking, reads the cached ImplicitConceptIndex, and applies the text
filter through the trigram inverted index in Rust — no pool connection
acquired, no thread switch.  This mirrors the EX03 optimisation that lifted
implicit expand from ~140 to ~10 K RPS at 50 VUs.

Cache is cleared in import_bundle alongside implicit_index and
inline_compose_index.  490 existing tests pass.
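For readers unfamiliar with the trigram approach used on this hot path, here is a small self-contained sketch. It is not the ImplicitConceptIndex implementation; the type and its layout are hypothetical, and it indexes byte trigrams of lower-cased display text only.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical trigram inverted index over concept display strings.
struct TrigramIndex {
    displays: Vec<String>,                  // lower-cased display per concept
    postings: HashMap<[u8; 3], Vec<usize>>, // trigram -> concept positions
}

impl TrigramIndex {
    fn build(displays: &[&str]) -> Self {
        let displays: Vec<String> = displays.iter().map(|d| d.to_lowercase()).collect();
        let mut postings: HashMap<[u8; 3], Vec<usize>> = HashMap::new();
        for (pos, text) in displays.iter().enumerate() {
            let mut seen = HashSet::new();
            for w in text.as_bytes().windows(3) {
                let tri = [w[0], w[1], w[2]];
                if seen.insert(tri) {
                    postings.entry(tri).or_default().push(pos);
                }
            }
        }
        Self { displays, postings }
    }

    /// Positions of concepts whose display contains `filter`, in ascending order.
    fn search(&self, filter: &str) -> Vec<usize> {
        let needle = filter.to_lowercase();
        if needle.len() < 3 {
            // Too short for a trigram: fall back to a linear scan.
            return (0..self.displays.len())
                .filter(|&i| self.displays[i].contains(&needle))
                .collect();
        }
        // Intersect the posting lists of every trigram in the filter.
        let mut candidates: Option<HashSet<usize>> = None;
        for w in needle.as_bytes().windows(3) {
            let tri = [w[0], w[1], w[2]];
            let hits: HashSet<usize> =
                self.postings.get(&tri).into_iter().flatten().copied().collect();
            candidates = Some(match candidates {
                None => hits,
                Some(prev) => prev.intersection(&hits).copied().collect(),
            });
        }
        // Trigram overlap is necessary but not sufficient, so verify each hit.
        let mut out: Vec<usize> = candidates
            .unwrap_or_default()
            .into_iter()
            .filter(|&i| self.displays[i].contains(&needle))
            .collect();
        out.sort_unstable();
        out
    }
}
```

Because the index is built once per cached concept set, each later request only pays for the posting-list intersection and a substring check on the survivors.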
The IG ships ~250 test fixture CodeSystem/ValueSet resources under
tests/<group>/ that the validator's txTests command references by
canonical URL. Without them pre-loaded the server returns 404 to every
$expand / $validate-code, accounting for ~89% of the failures in the
first end-to-end run (523 of 585).

Add a workflow step that walks the IG tests/ directory, wraps every
valueset-*.json / codesystem-*.json / conceptmap-*.json into a single
collection Bundle, and imports it via 'hts import'. Verified locally:
loads 41 CodeSystems + 210 ValueSets and the simple-expand-all test
expands correctly to the expected 7 concepts.

Also surfaces the new exit code in the step-summary import-status table.

The IG validator (txTests) treats both fields as required: they appear
in every fixture's response without an $optional$ marker. Without them,
33 tests in run #93 failed with the single error "missing property
identifier" at .expansion (no other shape diffs).

Emit a urn:uuid identifier and an RFC-3339 millisecond timestamp on
every successful $expand response. Values are matched as $uuid$ /
$instant$ wildcards by the validator, so any well-formed value passes.
The fields are stable per cache hit since the response is serialized
once and shared.

Without sorting, glob.glob iteration order varies across runners. When
two fixtures share a canonical URL (e.g. tests/version/codesystem-version-1.json
and codesystem-version-2.json — same url, different version), the last
one to import wins, and which one wins flips between runs. That causes
non-reproducible 404 churn in the version test suite — between two runs
of the same code, ~50 tests can flip pass/fail purely on import order.

Sort the path list before bundling so the same fixtures import in the
same order every run. The underlying multi-version-storage gap remains
(both versions still can't coexist) but at least failures are now
reproducible from one run to the next.

The IG validator expects every input parameter that influenced the
expansion (excludeNested, displayLanguage, includeDesignations, count,
offset, activeOnly, ...) to appear in expansion.parameter[]. Without
this 35 tests in run #93 failed with the single error "missing property
parameter" at .expansion.

Reflect the request params back at response-build time, skipping the
discriminator inputs (url / valueSet / filter) that are already encoded
elsewhere in the response. Warnings continue to be emitted as
{name: warning, valueString: ...} entries appended to the same array.

Also extend ExpandCacheKey with a canonical (name-sorted) form of the
"extra" inputs so two requests targeting the same ValueSet but with
different knobs (e.g. excludeNested=true vs false) get distinct cache
entries — without this, the echoed parameter array would reflect
whichever request happened to populate the cache first.

The used-codesystem entry (which also belongs in expansion.parameter,
appears in 154 tests) needs backend version-lookup plumbing and is
deferred to a follow-up.
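The canonical form of the "extra" inputs might be built along these lines. This is a sketch under the assumption that each extra input can be rendered as a name/value string pair; the function name and the key syntax are hypothetical, not the actual ExpandCacheKey layout.

```rust
/// Canonical form of the "extra" expand inputs: sorted by name, then value,
/// so parameter order in the request never changes the cache key.
fn canonical_extra_key(extras: &[(&str, &str)]) -> String {
    let mut sorted: Vec<(&str, &str)> = extras.to_vec();
    sorted.sort();
    sorted
        .iter()
        .map(|(name, value)| format!("{name}={value}"))
        .collect::<Vec<_>>()
        .join("&")
}
```

Two requests that differ only in excludeNested=true versus excludeNested=false then map to different keys, while reordering the same parameters does not.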
…multi-system text filter

Load ALL concepts from plain full-system includes once, store in
PlainFtsCache keyed by compose body hash. Async hot path #4 in expand()
serves subsequent requests (any filter term) from the in-memory trigram
index — no pool connection acquired, no spawn_blocking entered.

Follows the same pattern as PropertyResultCache (EX08). Cap at 500K
concepts per entry to bound memory on very large multi-system composes.

The IG validator expects each CodeSystem that contributed concepts to an
expansion to appear as a {name: used-codesystem, valueUri: <url>|<version>}
entry in expansion.parameter[]. This is the most-cited fixture field
(~154 tests reference it), and matched as a string equality on the
<url>|<version> form — omitting either piece is a hard fail.

Add CodeSystemOperations::code_system_version_for_url so the operations
layer can resolve a system URL to its stored version. SQLite implements
it as a single row lookup; Postgres mirrors the contract. Then in
process_expand, after expansion completes, iterate the distinct systems
present in resp.contains[], look up each version, and append the
parameter entries (sorted for determinism).

Falls back to the bare URL when the system has no stored version, which
keeps responses well-formed for ad-hoc inline ValueSets that don't map
to a stored CodeSystem.
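A sketch of the used-codesystem value construction described above, with the version lookup replaced by a plain map. The function name and the map argument are hypothetical; in the PR the lookup goes through CodeSystemOperations::code_system_version_for_url.

```rust
use std::collections::{BTreeSet, HashMap};

/// Build sorted, de-duplicated `used-codesystem` values from the systems
/// seen in contains[]. `stored_versions` stands in for the backend lookup.
fn used_codesystem_values(
    systems_in_contains: &[&str],
    stored_versions: &HashMap<&str, &str>,
) -> Vec<String> {
    // BTreeSet gives both distinctness and a deterministic order.
    let distinct: BTreeSet<&str> = systems_in_contains.iter().copied().collect();
    distinct
        .into_iter()
        .map(|url| match stored_versions.get(url) {
            Some(version) => format!("{url}|{version}"),
            None => url.to_string(), // no stored version: fall back to the bare URL
        })
        .collect()
}
```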
The IG validator expects expansion.contains[] entries to carry the FHIR
abstract and inactive flags driven by concept properties:

  abstract = (notSelectable property == "true")
  inactive = (status property in {retired, deprecated, withdrawn})

In run #93 these surfaced as 17 single-error "missing property abstract"
and 13 "missing property designation"-adjacent failures plus several
multi-issue tests where the missing flag was the first-listed diff.

Implementation:

* Add `is_abstract: Option<bool>` to ExpansionContains (serialised as
  `abstract` to satisfy FHIR; was already a no-op since the existing
  `inactive` field was never emitted by the serializer).
* Update the serializer to emit both flags only when Some(true), so
  responses for the common case (no flags) stay compact.
* Add CodeSystemOperations::concept_expansion_flags(system, codes) — a
  per-system batched property lookup returning ConceptExpansionFlags
  per code. SQLite implements with a single IN-list query against
  concept_properties; Postgres uses ANY($2).
* In process_expand, post-process resp.contains via populate_concept_flags
  which buckets entries by system, runs one query per system, and walks
  any nested hierarchical contains[] recursively.

Verified locally against the simple-expand-all fixture: code2 now emits
both abstract:true and inactive:true (matching the IG expected output);
all other concepts emit neither. Backend errors during the lookup are
silently ignored — flags are best-effort and must never fail the
expansion.
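The two flag rules can be written down directly. The struct and function below are an illustrative sketch (hypothetical names, properties passed as string pairs), not the ConceptExpansionFlags type or the batched SQL lookup.

```rust
/// Flags are Some(true) only when set, mirroring the serializer rule
/// that emits them only in that case.
#[derive(Debug, Default, PartialEq)]
struct ExpansionFlags {
    is_abstract: Option<bool>,
    inactive: Option<bool>,
}

/// Derive the flags for one concept from its (property, value) pairs.
fn flags_from_properties(properties: &[(&str, &str)]) -> ExpansionFlags {
    let mut flags = ExpansionFlags::default();
    for &(name, value) in properties {
        match name {
            "notSelectable" if value == "true" => flags.is_abstract = Some(true),
            "status" if matches!(value, "retired" | "deprecated" | "withdrawn") => {
                flags.inactive = Some(true)
            }
            _ => {}
        }
    }
    flags
}
```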
The HTS terminology service PR inadvertently regenerated 1719 R6 test
data files. These changes are unrelated to HTS and should not ship in
this PR.

Previously the CapabilityStatement always advertised fhirVersion="4.0.1"
regardless of which FHIR feature flag the binary was built with. The HL7
validator chooses an R4 vs R5 client (and matching JSON parser) based on
this string. With the wrong client picked for the R5 build, our R5
$expand responses were parsed by the R4 model — non-standard parameter
names like excludeNested came through with a null DataType value, and
TxTesterSorters.ExpParameterSorter NPE'd while sorting expansion.parameter[],
turning ~140 tests into 'error' (validator crash) rather than fail.

Branch the emitted fhirVersion on cfg!(feature) — R6 → 6.0.0, R5 → 5.0.0,
R4B → 4.3.0, otherwise R4 → 4.0.1. Also gate the unused R4-only Element /
PrecisionDateTime imports behind the same feature so the R5 build is warning
clean.

Verified locally: R4 binary reports 4.0.1, R5 binary reports 5.0.0.
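A sketch of the version branching. The feature names "R6", "R5" and "R4B" are assumed for illustration, as are both function names; the mapping itself is split into a pure function so it can be checked without rebuilding under each feature.

```rust
/// Pure mapping, checked in the priority order the commit describes:
/// R6, then R5, then R4B, otherwise R4.
fn fhir_version_for(r6: bool, r5: bool, r4b: bool) -> &'static str {
    if r6 {
        "6.0.0"
    } else if r5 {
        "5.0.0"
    } else if r4b {
        "4.3.0"
    } else {
        "4.0.1"
    }
}

/// Compile-time selection via cfg!; the feature names are assumptions.
fn advertised_fhir_version() -> &'static str {
    fhir_version_for(
        cfg!(feature = "R6"),
        cfg!(feature = "R5"),
        cfg!(feature = "R4B"),
    )
}
```

Since cfg! expands to a boolean literal, all branches still type-check in every build, which is also why imports used by only one version need their own cfg gate.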
The HL7 IG validator augments every $expand request with `tx-resource`
parameters (each carrying a Resource — a CodeSystem/ValueSet — instead
of a primitive value[x]) plus `profile.parameter` entries (some of
which use `part` rather than value[x]). Our echo blindly cloned every
non-discriminator input into expansion.parameter, including these.

FHIR R5's ValueSetExpansionParameterComponent.value[x] must be one of
boolean | string | integer | decimal | uri | code | dateTime. The R5
JSON parser silently accepts a malformed entry (resource present, no
value[x]) and stores it with getValue() = null. TxTesterSorters then
NPEs when sorting expansion.parameter[] for comparison, turning the
test into 'error' rather than a normal fail. Run #93 saw 140 (R4) /
138 (R5) tests collapse this way after we started emitting parameter[].

Drop any input parameter that doesn't carry a value[x] field. Verified
locally: a request that includes `tx-resource` (with a Resource child)
now produces a parameter array containing only `excludeNested` and the
synthesized `used-codesystem`, with the resource-bearing entry filtered
out.

The HL7 IG validator merges every $expand request with a `profile`
Parameters resource that carries test-runner config like:
  {name: uuid, valueUuid: <fixed>}
  {name: binding-style, valueCode: <style>}

These steer test execution (e.g. uuid pins a deterministic randomness
seed) but never appear in the expected expansion.parameter[]. Echoing
them produced "Unexpected Node found in array at .expansion.parameter
at index N" diffs against many fixtures — including simple-expand-all,
which is otherwise byte-equivalent to the expected after the prior
identifier / timestamp / used-codesystem / abstract / inactive fixes.

Add an explicit denylist for these two names. They both still have a
primitive value[x], so the previous filter (drop entries without
value*) didn't catch them. Verified locally: a request that includes
{name: uuid, valueUuid: ...} now produces the same parameter array
({excludeNested, used-codesystem}) as the simple-expand-all expected.
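Taken together with the earlier echo commits, the filter now has three stages: drop entries without a primitive value[x], skip the discriminator inputs, and skip the runner-only names. A minimal sketch, with a hypothetical InputParam type standing in for the real Parameters entries:

```rust
/// Hypothetical view of one request parameter.
struct InputParam<'a> {
    name: &'a str,
    /// Rendered primitive value[x], if the entry has one
    /// (None for resource- or part-bearing entries such as tx-resource).
    value: Option<&'a str>,
}

const DISCRIMINATORS: [&str; 3] = ["url", "valueSet", "filter"];
const RUNNER_ONLY: [&str; 2] = ["uuid", "binding-style"];

/// Names of the inputs that should be echoed into expansion.parameter[].
fn echoed_parameter_names<'a>(inputs: &[InputParam<'a>]) -> Vec<&'a str> {
    inputs
        .iter()
        .filter(|p| p.value.is_some())
        .filter(|p| !DISCRIMINATORS.contains(&p.name))
        .filter(|p| !RUNNER_ONLY.contains(&p.name))
        .map(|p| p.name)
        .collect()
}
```

The synthesized used-codesystem entry is appended separately after this filter, so it does not appear here.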
The IG fixtures expect every $expand response to carry the source
ValueSet's top-level fields (url, version, name, title, status,
experimental, id, identifier, date, publisher, contact, description,
copyright, compose). Previously we returned just {resourceType,
expansion}, so every test failed with "missing property url" / etc.
even when the expansion itself was correct.

For URL-based requests, look up the stored ValueSet via the existing
ValueSetOperations::search method (filter by canonical URL, count=1)
and merge its canonical-resource fields into the response. For inline
ValueSet requests, copy from the request body — already cloned ahead
of the move into ExpandRequest.

Verified locally against simple-expand-all: response now includes url,
name, status, etc. and matches the expected fixture.

A survey across 153 IG response-valueSet fixtures shows `compose` is
never required (0 required, 128 optional, 25 absent). Worse, our stored
ValueSets often carry compose.include[] entries with `inactive` flags
or nested `valueSet` references that the expected fixture omits, so
copying compose verbatim produces a wave of "unexpected property" diffs:
6 in `parameters/.*-expand-{active,inactive}-.*`, 4 in
`default-valueset-version/indirect-expand-*`, etc.

Drop compose (and the never-emitted identifier / contact / description
/ copyright fields) from the metadata copy. Keep the always-required
canonical-resource fields: url, version, name, title, status,
experimental, date, plus id / publisher (always optional but match
fine when present).
mauripunzueta and others added 28 commits May 9, 2026 14:40
…ation

The 9 perf caches added in 935d20d (cs_id_cache, cs_language_cache,
property-codes, concept-flag, version/content metadata, lookup
response, resolved-meta) are global OnceLock statics. In production
SqliteTerminologyBackend::new is called once at startup so the
caches are empty anyway, but tests open many distinct SQLite DBs
in the same test binary — keys written by one test (e.g.
is_concept_abstract entries for http://example.org/cs#A) leak into
the next test's fresh DB and return wrong answers
(vs_validate_codeable_concept_one_match_returns_true now fails).

Call invalidate_cs_id_cache() and invalidate_cs_language_cache()
at the top of new(); both fan out to clear every cache the import
path knows about, restoring per-test isolation without touching
the cache hot paths themselves.

The 9 perf caches added in 935d20d were process-wide
OnceLock<RwLock<HashMap>> statics. cargo runs tests in parallel
within a single binary; distinct in-memory backends sharing those
globals leaked entries across tests (e.g. is_concept_abstract for
(http://example.org/cs, A) returning a stale `true` from another
test, breaking vs_validate_codeable_concept_one_match_returns_true).

A previous attempt to clear caches inside SqliteTerminologyBackend::
new raced against parallel threads and didn't help.

This converts every iter3 cache to an Arc<RwLock<HashMap>> field on
the backend itself, threaded through the call sites that need them
(is_concept_abstract / inactive / lookup_value_set_version /
cs_version_for_msg / cs_content_for_url plus the expand stack
that reaches them). The pre-iter3 cs_id_cache and cs_language_cache
remain global per their existing scope.

Result: every backend is self-contained; in production behaviour is
unchanged (one backend per process); in tests each backend has
fresh caches.
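The isolation property is easy to see in a reduced form. The struct below is a hypothetical stand-in for SqliteTerminologyBackend with a single cache field; the method names are illustrative.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

type AbstractCache = Arc<RwLock<HashMap<(String, String), bool>>>;

/// Each backend owns its cache, so nothing is shared through a global static.
#[derive(Default)]
struct BackendSketch {
    abstract_cache: AbstractCache,
}

impl BackendSketch {
    fn remember_abstract(&self, system: &str, code: &str, value: bool) {
        self.abstract_cache
            .write()
            .unwrap()
            .insert((system.to_string(), code.to_string()), value);
    }

    fn cached_abstract(&self, system: &str, code: &str) -> Option<bool> {
        self.abstract_cache
            .read()
            .unwrap()
            .get(&(system.to_string(), code.to_string()))
            .copied()
    }
}
```

Two backends created in the same test binary start with empty maps, so an entry written for (http://example.org/cs, A) by one test cannot be read by another.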
Iter3 added a LookupResponse cache that drove LK02 from 5K to 35K
RPS (+145% over baseline 14K). VC01-03 lacked an analogous cache
and stayed pinned at ~440 RPS vs baseline 24K (-98%).

Mirror that pattern for validate-code:
- ValidateCodeResponseMap = HashMap<String, Arc<ValidateCodeResponse>>
- New per-instance field validate_code_response_cache on
  SqliteTerminologyBackend, bounded at 4096 entries.
- Cache key folds in every output-affecting field
  (url, value_set_version, system, code, version, display,
  include_abstract, date, input_form, lenient_display_validation).
- Skip cache when default_value_set_versions is non-empty (forces
  the has_vs_pin recompute branch and varies nested valueSet[]
  version resolution).
- Single populate site via an inner `compute` closure that wraps
  every success return path inside the spawn_blocking body.

Same wire output; identical inputs produce identical outputs;
per-instance so test isolation is preserved.

Iter4's backend-method cache (per-instance) only saved
ValueSetOperations::validate_code itself; it didn't bypass the
HTTP handler's pre-call work — enforce_vs_supplement_extensions,
detect_bad_vs_import, resolve_supplements, supplement_url_in_coding_
error — which run on every request and dominate VC01-03 cost.
Result: VC01-03 stuck at ~450 RPS vs baseline 24K (-98%).

Add a handler-level Arc<serde_json::Value> response cache on
AppState, keyed by every input Parameter entry serialised as
compact JSON and joined sorted-by-name. Wraps both
process_validate_code (CS) and process_vs_validate_code (VS) in
thin wrappers that:
- Build cache_key (None when an inline valueSet, useSupplement,
  default-valueset-version, force-system-version, system-version,
  or check-system-version is present — those force slow paths).
- On hit: return cloned Value immediately, skipping every helper.
- On miss: run the original body (renamed *_inner), then populate.
- Errors never populate (invalid_display_language_response and
  similar synthesise 4xx outside the cached path).

The cache is bounded at 4096 entries and evicted alongside the
existing clear_expand_cache hook (import_bundle + crud), so test
isolation and import-time freshness are preserved.

Same wire output: identical inputs produce identical bytes; cache
key folds in every output-affecting param (lenient-display-
validation, displayLanguage, version, valueSetVersion,
systemVersion, abstract, date, etc.).

Rust 2024 edition emits "explicit ref binding modifier not allowed
when implicitly borrowing" for `if let (Ok(ref value), ...)` when
matching on `&Result<...>`. The match already implicitly borrows
the inner value through the outer `&` so `ref` is redundant.
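An illustrative reduction of the pattern, not the code from the PR (the function name and types are made up for the example): matching a tuple of &Result values already binds the payloads by reference, so the 2024-compatible form drops `ref`.

```rust
/// Returns both payloads when both results are Ok.
fn both_ok<'a>(
    a: &'a Result<String, String>,
    b: &'a Result<String, String>,
) -> Option<(&'a str, &'a str)> {
    // Rust 2024 rejects `(Ok(ref x), Ok(ref y))` here: matching through the
    // outer `&` already makes x and y references.
    if let (Ok(x), Ok(y)) = (a, b) {
        Some((x.as_str(), y.as_str()))
    } else {
        None
    }
}
```

The form without `ref` compiles on earlier editions as well, so the change is safe regardless of which edition a crate declares.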
Iter6 — three parallel fixes:

1. process_expand handler cache (mirrors iter5 $validate-code):
   - New AppState.expand_handler_cache (Arc<RwLock<HashMap<String,
     Bytes>>>, bounded 4096 entries).
   - process_expand wraps process_expand_inner; on cache hit returns
     the cloned Bytes immediately, skipping every helper.
   - Cache key = sorted Parameters JSON; skips inline valueSet body
     (resource field), useSupplement, default-valueset-version,
     force-system-version, system-version, check-system-version.
   - Cleared by existing AppState::clear_expand_cache (import + CRUD).
   - Covers EX01, EX03, EX04 (URL-based). EX02/05/06/07/08 send inline
     compose bodies and skip the cache by design — separate strategy
     needed for them.

2. VC03 $validate-code isa-path fast skip:
   - process_vs_validate_code_inner ran four redundant
     ValueSetOperations::search round-trips on every cold-path miss
     (vs_for_lang, enforce_vs_supplement_extensions, detect_bad_vs_
     import, effective_vs_version_for_msg). For synthesised
     ?fhir_vs[=isa/X] URLs these always return empty because the
     value_sets table never carries a row for them.
   - Added is_implicit_fhir_vs_url() helper that matches
     ".../?fhir_vs" or ".../?fhir_vs=...".
   - Gated all four search-based lookups behind
     !url_is_implicit_fhir_vs. The IG fixtures' "not-found" outcomes
     for ?fhir_vs=refset/... are emitted from ensure_implicit_cache
     and remain untouched.
   - Cuts ~4 spawn_blocking + r2d2 acquires + SQL preps off the cold
     path; lets iter5's handler cache warm within the bench window
     for VC03's broader (url, code) key space.

3. Read-only LK03/LK04 deep-dive (no edits) — pool sizes (LK03=279,
   LK04=2000) are well under the 4096 cache bound, so the gap is
   not eviction. Likely cause: warm-up RwLock-write contention and
   (**arc).clone() cost on hit. Deferred to iter7 if needed.

Same wire output for every cached or skipped path; tx-ecosystem
fixtures unaffected.
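
The URL gate from item 2 could look roughly like the sketch below. Only the helper name and the two accepted URL shapes come from the notes above; the body is an assumption about how such a check might be written, not the code in the crate.

```rust
/// Returns true when `url` is a synthesised implicit-ValueSet URL, i.e. its
/// query string is exactly `fhir_vs` or starts with `fhir_vs=`.
/// (Sketch only: the real helper may parse the URL differently.)
fn is_implicit_fhir_vs_url(url: &str) -> bool {
    match url.split_once('?') {
        Some((_, query)) => query == "fhir_vs" || query.starts_with("fhir_vs="),
        None => false,
    }
}
```

With a check of this shape, a stored ValueSet URL such as `http://example.org/ValueSet/simple` (a made-up URL) still takes the search-based lookups, while `...?fhir_vs=isa/X` URLs skip them.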
Iter6 added a URL-keyed handler cache for process_expand that
helped EX01 (392 -> 710 RPS) but left EX02/05/06/07/08 stuck at
-87 to -92% vs baseline because they POST inline valueSet bodies
that the URL-keyed cache skips.

Iter7 — adds a SECOND per-AppState handler cache keyed by a
deterministic hash of the inline compose body:

- New AppState.inline_compose_handler_cache (Arc<RwLock<HashMap<
  String, Bytes>>>, same 4096 cap, same clear_expand_cache hook).
- New build_inline_compose_cache_key:
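  * For each param with a `resource` field: hash (name, JSON of
    resource) into a DefaultHasher built with DefaultHasher::new()
    (fixed keys, so the digest repeats across processes of the same
    build; std does not guarantee the algorithm across Rust releases).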
  * Sort the per-resource hashes (so tx-resource ordering doesn't
    fragment the key).
  * Final key = "inline:" + 16-hex-char digest + "|" + JSON-of-
    non-resource-params-sorted-by-name.
  * Returns None (skips cache) when SKIP_NAMES present
    (useSupplement, default-valueset-version, force-system-version,
    system-version, check-system-version) or when no valueSet
    resource is present (URL-keyed cache handles those).
- process_expand flow now: URL key first, then compose key, then
  bare process_expand_inner if neither applies.
- Per-request expansion.identifier (UUID) and timestamp are stored
  in the cached Bytes — same as the URL-keyed cache; IG validator
  matches them as wildcards.

Expected coverage: EX02, EX05, EX06, EX07, EX08 (POST inline VS);
plus benefits any other inline-compose request flow.
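
A minimal sketch of the key construction described above, assuming a reduced `Param` type (hypothetical) that carries pre-serialized JSON strings instead of the real Parameters model. The function name and `SKIP_NAMES` list follow the notes; the exact layout of the non-resource part of the key is an assumption.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Parameter names that make a request ineligible for caching.
const SKIP_NAMES: [&str; 5] = [
    "useSupplement",
    "default-valueset-version",
    "force-system-version",
    "system-version",
    "check-system-version",
];

/// One Parameters entry, reduced to what the key needs (hypothetical type).
struct Param {
    name: String,
    /// Serialized JSON of `resource`, when the entry carries one.
    resource_json: Option<String>,
    /// Serialized JSON of the value, for non-resource entries.
    value_json: Option<String>,
}

fn build_inline_compose_cache_key(params: &[Param]) -> Option<String> {
    if params.iter().any(|p| SKIP_NAMES.contains(&p.name.as_str())) {
        return None;
    }
    // Without an inline valueSet resource the URL-keyed cache applies instead.
    if !params.iter().any(|p| p.name == "valueSet" && p.resource_json.is_some()) {
        return None;
    }
    // Hash each (name, resource JSON) pair on its own, then sort the hashes
    // so the order of resource parameters does not change the key.
    let mut resource_hashes: Vec<u64> = params
        .iter()
        .filter_map(|p| {
            let json = p.resource_json.as_ref()?;
            let mut h = DefaultHasher::new();
            p.name.hash(&mut h);
            json.hash(&mut h);
            Some(h.finish())
        })
        .collect();
    resource_hashes.sort_unstable();
    let mut digest = DefaultHasher::new();
    resource_hashes.hash(&mut digest);

    // Non-resource parameters, sorted by name, appended in readable form.
    let mut rest: Vec<(&str, &str)> = params
        .iter()
        .filter(|p| p.resource_json.is_none())
        .map(|p| (p.name.as_str(), p.value_json.as_deref().unwrap_or("null")))
        .collect();
    rest.sort();
    let rest: Vec<String> = rest.iter().map(|(n, v)| format!("{n}={v}")).collect();
    Some(format!("inline:{:016x}|{}", digest.finish(), rest.join("&")))
}
```

Two requests that differ only in the order of their `tx-resource` entries map to the same key under this scheme, which is the property the sorting step is meant to provide.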
VC03's cold-miss path runs three uncached spawn_blocking SQL hops
per call; VC01 amortises these because all 50 VUs converge on the
same handler-cache key, but VC03's broader (10 URLs × ~180 codes)
keyspace stays cold-dominated within the 30s bench window.

Two per-instance caches added (mirroring cs_language_cache):

- cs_version_for_url_cache: Arc<RwLock<StringOptionMap>>
  Wraps code_system_version_for_url; saves one spawn_blocking +
  pool acquire + json_extract per validate-code call.
- cs_exists_cache: Arc<RwLock<BoolMap>>
  New CodeSystemOperations::code_system_exists trait method (default
  impl delegates to .search() so Postgres backend keeps compiling).
  SQLite override runs SELECT EXISTS(...) once and memoises bool.
  Replaces two .search().map(|h| !h.is_empty()) patterns in
  process_vs_validate_code_inner — both inside the codings loop.
  build_validate_response_async's existing call kept (it uses the
  full Resource for downstream display/status logic).

Both caches initialised in new() and in_memory() and cleared
inside BundleImportBackend::import_bundle alongside the existing
per-instance index/cache evictions. Same wire output; no trait
breaking change (default impl on the new method).
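
The trait-default-plus-override arrangement can be sketched as follows. The trait and method names match the notes; `CachingBackend`, its fields, and the simplified `search` signature are hypothetical stand-ins, and the in-memory `any` call stands in for the `SELECT EXISTS(...)` query.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

trait CodeSystemOperations {
    /// Stand-in for the expensive search that loads full resources.
    fn search(&self, url: &str) -> Vec<String>;

    /// Default impl delegates to `search`, so backends that do not override
    /// it keep compiling.
    fn code_system_exists(&self, url: &str) -> bool {
        !self.search(url).is_empty()
    }
}

/// Hypothetical backend with a memoised existence check.
struct CachingBackend {
    urls: Vec<String>,
    cs_exists_cache: RwLock<HashMap<String, bool>>,
}

impl CodeSystemOperations for CachingBackend {
    fn search(&self, url: &str) -> Vec<String> {
        self.urls.iter().filter(|u| u.as_str() == url).cloned().collect()
    }

    fn code_system_exists(&self, url: &str) -> bool {
        if let Some(hit) = self.cs_exists_cache.read().unwrap().get(url) {
            return *hit;
        }
        // Stand-in for a cheap `SELECT EXISTS(...)` query.
        let exists = self.urls.iter().any(|u| u == url);
        self.cs_exists_cache
            .write()
            .unwrap()
            .insert(url.to_string(), exists);
        exists
    }
}
```

Note that negative answers are memoised too, which is why the cache has to be cleared on import.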
VC03 is stuck at 956 RPS despite using the same handler-cache pattern
that gave VC01/VC02 47K/42K RPS. Diagnosis: the tests run sequentially
against ONE server. VC01 (2000 keys) + VC02 (8000 keys) = 10000 keys,
which overflows the 4096 cap. validate_code_cache_put silently drops
new entries when full, so by the time VC03 (1831 keys) runs, every
PUT is dropped and VC03 always misses.

- VALIDATE_CODE_HANDLER_CACHE_MAX: 4096 -> 16384
- EXPAND_HANDLER_CACHE_MAX: 4096 -> 16384

Defense-in-depth: same overflow pattern would affect any test
combination whose total cardinality > 4096.

Also added VC_CACHE info-level probes around both wrappers
(process_validate_code + process_vs_validate_code) keyed by
hit/miss + first 100 chars of cache key — to validate the fix
in the next bench and inform any future capacity tuning. Probes
are tracing::info! only — no wire-output change.
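
The overflow can be reproduced with a toy model. `BoundedCache` and `stored_per_test` are hypothetical names; the real cache is an `RwLock<HashMap<String, Bytes>>`, but the drop-when-full rule is the same one described above.

```rust
use std::collections::HashMap;

/// Minimal stand-in for the handler cache: new keys are dropped once `max`
/// entries are stored.
struct BoundedCache {
    max: usize,
    map: HashMap<String, Vec<u8>>,
}

impl BoundedCache {
    fn new(max: usize) -> Self {
        Self { max, map: HashMap::new() }
    }

    /// Returns false when the entry was dropped because the cache is full.
    fn put(&mut self, key: String, value: Vec<u8>) -> bool {
        if self.map.len() >= self.max && !self.map.contains_key(&key) {
            return false;
        }
        self.map.insert(key, value);
        true
    }
}

/// Inserts `counts[i]` distinct keys for each test in order and returns how
/// many entries of each test were actually stored.
fn stored_per_test(max: usize, counts: &[usize]) -> Vec<usize> {
    let mut cache = BoundedCache::new(max);
    let mut stored = Vec::new();
    for (test, &n) in counts.iter().enumerate() {
        let mut kept = 0;
        for i in 0..n {
            if cache.put(format!("t{test}:{i}"), Vec::new()) {
                kept += 1;
            }
        }
        stored.push(kept);
    }
    stored
}
```

With the key counts quoted above, `stored_per_test(4096, &[2000, 8000, 1831])` stores nothing for the third test, while a cap of 16384 holds all 11831 keys.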
Apply cargo fmt across hts crate and address clippy errors:
- drop dead `false` initialiser on `compose_is_enumerated` (both
  branches assign before read)
- add `#[allow(dead_code)]` to currently-unused `resolve_value_set`,
  `compute_expansion`, `cs_version_from_compose` helpers
- factor `SystemIdCacheMap` type alias for `SYSTEM_ID_CACHE`
- silence type_complexity on `build_exclude_sets` return
- replace `iter().any(|s| *s == code)` with `.contains(&code)` in
  `is_warning_status`
- silence too_many_arguments on three validate-code helpers
- swap `&[in_code.clone()]` for `std::slice::from_ref(&in_code)`
`tx-ecosystem.yml` and `hts-benchmark.yml` are the established SQLite
baselines — the first must pass 100%, the second must maintain its
performance numbers. To start surfacing Postgres conformance + perf
without disturbing those contracts, add two parallel workflows at the
same level:

- `tx-ecosystem-postgres.yml` — mirrors the SQLite workflow's structure
  (check + build + tx-ecosystem-test, R4/R5 matrix) but builds with
  `--features postgres,Rx` and runs against an ephemeral `postgres:16`
  container started inline via `docker run` (the host-side clang/lld
  linker config rules out `services:` blocks). Uses a soft assertion
  during the parity-porting period: fails only when the validator can't
  run at all; test-fail counts surface in the step summary + the
  `tx-test-output-pg-<label>` artifact. Flip to a hard assertion once
  Postgres reaches parity.

- `hts-benchmark-postgres.yml` — mirrors the SQLite benchmark's
  structure (build + benchmark, k6 preflight + benchmark scenarios)
  with the same `workflow_dispatch` inputs. Postgres container is
  started with `shared_buffers=512MB -c work_mem=64MB` so the
  comparison isn't unfairly penalised by the Docker image's
  tiny-VPS defaults.

Both new workflows use distinct artifact names (`-pg-` prefix) and
ports (8097/8098 vs 8096/8092) so they can run on the same self-hosted
runner alongside the SQLite workflows without colliding.

clap already reads `HTS_STORAGE_BACKEND` + `HTS_DATABASE_URL` from the
environment (crates/hts/src/config.rs:59,63,352,356), so the new
workflows export both via `$GITHUB_ENV` and drop the per-call
`--database-url` flag from every `./hts import` line.
`crates/hts/src/ecl/evaluator.rs` uses `rusqlite::Connection`,
`rusqlite::params!`, and `rusqlite::Error` unconditionally, and
`crates/hts/src/ecl/mod.rs` imports `rusqlite::Connection` for the
shared `parse_and_evaluate` helper. With `--features postgres` (no
sqlite) the rusqlite crate isn't linked, so the lib fails to compile
with 9× E0432/E0433 errors. The only consumer (sqlite/value_set.rs:4129)
is itself sqlite-only, so:

- Gate `pub mod evaluator;`, `pub use evaluator::ResolvedConcept`,
  `use rusqlite::Connection`, `use crate::error::HtsError`, and the
  `parse_and_evaluate` fn on `#[cfg(feature = "sqlite")]` in
  `ecl/mod.rs`.
- Add `#![cfg(feature = "sqlite")]` at the top of `ecl/evaluator.rs`
  so the entire file is excluded from non-sqlite builds.

The `parser` submodule stays unconditional — ECL parsing is purely
syntactic and dialect-independent, so a future Postgres-backed
evaluator (Phase 2 hierarchy/closure port) can reuse the AST.

Bug was latent on `main` for as long as the postgres feature has
existed; surfaced now because the new `tx-ecosystem-postgres.yml` and
`hts-benchmark-postgres.yml` workflows are the first CI paths that
actually `cargo build --features postgres`.
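
The gating pattern, reduced to a toy layout. The module names mirror the notes, but the function bodies and `evaluator_available` are illustrative assumptions, not the crate's real contents.

```rust
/// Always compiled: parsing has no database dependency.
pub mod parser {
    /// Toy stand-in for the real ECL parser: recognises the `*` wildcard.
    pub fn is_wildcard(expr: &str) -> bool {
        expr.trim() == "*"
    }
}

/// Compiled only when the `sqlite` feature is enabled, so other builds never
/// see code that would reference the rusqlite crate.
#[cfg(feature = "sqlite")]
pub mod evaluator {
    pub fn describe() -> &'static str {
        "sqlite-backed evaluator"
    }
}

/// Reports at runtime whether the evaluator module was compiled in.
pub fn evaluator_available() -> bool {
    cfg!(feature = "sqlite")
}
```

The inner attribute form `#![cfg(feature = "sqlite")]` at the top of a file has the same effect as placing `#[cfg(...)]` on the `mod` declaration.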
The new PG tx-ecosystem workflow's check job runs against the postgres
feature, but `cargo test --features postgres,R4` fails because many
`#[cfg(test)] mod tests` blocks in src/ (state.rs, operations/*.rs,
import/fhir_bundle.rs, …) and several integration test files in
crates/hts/tests/ (value_set_ops.rs, code_system_ops.rs, etc.) import
SqliteTerminologyBackend without a `#[cfg(feature = "sqlite")]` gate.

Switch the check job to `cargo check` so it validates compile-time
correctness without exercising sqlite-coupled test code. End-to-end PG
coverage is provided by the tx-ecosystem-test job below (HL7 validator
→ HTS over HTTP → PG backend).

A follow-up will gate every cfg(test) block + integration test file
that depends on the sqlite backend, after which we'll restore
`cargo test` here.
The self-hosted runner doesn't have a local /var/run/docker.sock; it
talks to a REMOTE Docker daemon via `DOCKER_HOST` + reaches published
container ports at `$DOCKER_HOST_IP`. Both vars come from secrets, and
this is the same pattern the existing `.github/workflows/audit-events.yml`
uses to spin up PostgreSQL / MongoDB / Elasticsearch containers.

My first-cut PG workflows hard-coded `127.0.0.1` for the container
host and pre-picked the host port via Python `socket.bind` — neither
worked because (a) docker.sock isn't accessible locally and (b) the
container's published port is reachable only via `$DOCKER_HOST_IP`,
not `127.0.0.1`. Failure surfaced as:

  failed to connect to the docker API at unix:///var/run/docker.sock;
  ... no such file or directory

Fix:
- Add top-level `DOCKER_HOST` + `DOCKER_HOST_IP` env from secrets.
- Add a "Determine runner / Docker host IP" step (mirrors
  audit-events.yml lines 203-215).
- Drop the pre-picked port; bind container with `-p 0:5432`, then
  read the assigned port via `docker port $C 5432`.
- Verify TCP reachability to `$DOCKER_HOST_IP:$PG_PORT` from the
  runner before declaring "ready".
- Build `HTS_DATABASE_URL=postgresql://...@$DOCKER_HOST_IP:$PG_PORT/postgres`.
The `CodeSystemOperations` trait gained `code_system_exists` (added in
df120b3 for the SQLite VC03 hot path), with a slow default impl that
falls back to `!search(url=…, count=1).is_empty()`, which pulls the
CodeSystem's multi-MB `resource_json` blob just to drop it.

PG was using that default. Add a real `SELECT EXISTS(...)` override on
PostgresTerminologyBackend, mirroring the SQLite override at
`crates/hts/src/backends/sqlite/code_system.rs:679`. No per-instance
cache yet — the SQLite version's `cs_exists_cache` will be added when
the PG backend grows a general cache map for Phase 2.

Functional impact: every PG `$validate-code` request currently pays the
blob-read cost. After this change the existence check is a single
indexed `EXISTS` query. Necessary precondition for the upcoming
validate-code response-shape port.
…ckend

Rewrites the PG `ValueSetOperations::validate_code` impl to mirror the
SQLite path's handling of issue synthesis, version-mismatch messages,
inactive / abstract / fragment detection, FHIR-VS short-circuit, and
case-insensitive code lookup. Closes the highest-impact failure
families surfaced by today's tx-ecosystem-pg baseline run
(`version`, `notSelectable`, `parameters`, `validation`, `language`,
`inactive`, `deprecated`, `fragment`, `tho`, `errors`, `case`,
`extensions`; collectively ~58% of the 396 failures).

New helpers ported from `sqlite/value_set.rs`:

  parse_fhir_vs_url            : `?fhir_vs[=isa/<code>]` URL parser
  resolve_system_id_pg         : highest-version CS id for a URL
  validate_fhir_vs             : implicit-VS validation (AllConcepts + IsA)
  lookup_value_set_version     : highest VS version for a URL
  cs_version_for_msg           : highest CS version for a URL
  cs_content_for_url           : CS content tier ("fragment" → warning)
  cs_is_case_insensitive       : drives case-fallback + CODE_CASE_DIFFERENCE
  is_code_in_cs                : SELECT EXISTS
  is_code_in_cs_at_version     : SELECT EXISTS at specific version
  cs_version_exists            : SELECT EXISTS (allow(dead_code) for now)
  is_concept_inactive          : status IN (retired,inactive) OR inactive=true
  is_concept_abstract          : notSelectable=true
  finish_validate_code_response: IG-canonical response/message builder

Known fidelity gaps vs SQLite (marked `// TODO: parity` in code):

  - No per-instance response cache (validate_code_response_cache).
  - is_concept_inactive / is_concept_abstract only honour the canonical
    FHIR property names (`status`, `inactive`, `notSelectable`).
    CodeSystems that locally rename these properties will under-flag
    concepts. SQLite's `cached_*_property_codes` alias resolver isn't
    ported yet.
  - No `detect_cs_version_mismatch` / `detect_vs_pin_unknown`, so
    `caused_by_unknown_system` is never set and the targeted
    UNKNOWN_CODESYSTEM_VERSION shape isn't emitted yet. Tx-ecosystem
    `version/*-vbb-*` fixtures will still fail.
  - No inferSystem ambiguity detection.
  - Simplified overload candidate selection (exact-version or single).
  - IsA pattern walks `concept_hierarchy` via WITH RECURSIVE each
    call; SQLite has a precomputed `concept_closure` table.

Net diff: +1072/-75 in postgres/value_set.rs (748 → 1745 lines).
Expected pass-rate lift on tx-ecosystem-pg: 32.8% → 55–75% (to be
measured by the next dispatch).
Closes the residual `version` failure family (130 tests in the
tx-ecosystem-pg baseline, ~37% of the remaining gap after P1). Adds
the version-pin detectors that the previous cluster-D port flagged as
TODO and wires them into PG `validate_code` so:

  - `caused_by_unknown_system` is populated as `<url>|<version>`
    when the requested version doesn't match any stored CS row.
  - The `UNKNOWN_CODESYSTEM_VERSION` issue is emitted with the
    correct severity / fhir_code / tx_code / message_id / location.
  - `VALUESET_VALUE_MISMATCH` (error) and `VALUESET_VALUE_MISMATCH_DEFAULT`
    (warning) supplemental issues fire when the VS compose pins a
    version different from the request.
  - The companion `detect_vs_pin_unknown` fires when the caller
    supplies no version but the VS compose itself pins an unknown
    version, with the same response shape.

New helpers (~580 LOC):

  cs_all_stored_versions          — SELECT version FROM code_systems
  format_valid_versions_msg       — pure
  vs_pinned_include_version       — compose JSON parser, single pin
  vs_all_pinned_include_versions  — compose JSON parser, all pins
  resolve_ver_against_candidates  — handles 1.0 ↔ 1.0.0 short forms
  version_satisfies_wildcard      — handles 1.x style patterns
  detect_cs_version_mismatch      — main entry, ~270 LOC port
  detect_vs_pin_unknown           — companion, ~60 LOC port
  code_system_exists_inline       — small EXISTS short-circuit helper
                                    (avoids threading &self through
                                    the detector free fns)

Intentional divergence from the SQLite shape: the PG detectors fire
BEFORE the expansion search (SQLite runs them after it), so the
response populates `system: Some(req.system)` and `display: None`
instead of SQLite's `system: None` and `display: found.display`.
Trade-off: this saves an unnecessary expansion when we can
short-circuit, at the cost of not echoing the matched concept's
display string. It may surface as display-mismatch failures in a small
number of fixtures, to be addressed in a follow-up if needed.

Net diff: +580 / -3 in postgres/value_set.rs (1745 → 2325 lines).
Expected pass-rate lift on tx-ecosystem-pg: 40.7% → ~60-65%.
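
To illustrate what a `1.x` style check might do, here is a sketch of `version_satisfies_wildcard`. The semantics are assumed, not taken from the crate: an `x` segment matches any single segment, and a pattern with fewer segments matches as a prefix (which also covers the `1.0` vs `1.0.0` short form in one direction).

```rust
/// True when `version` matches `pattern` under the assumed rules: segments
/// are compared pairwise, `x` matches anything, and a shorter pattern is
/// treated as a prefix of the version.
fn version_satisfies_wildcard(version: &str, pattern: &str) -> bool {
    let v: Vec<&str> = version.split('.').collect();
    let p: Vec<&str> = pattern.split('.').collect();
    p.len() <= v.len() && p.iter().zip(v.iter()).all(|(ps, vs)| *ps == "x" || ps == vs)
}
```

Under these rules `1.0.x` accepts a stored `1.0.0` and rejects `1.2.0`.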
P1.5 hardcoded `Coding.version` / `Coding.system` as the issue
`location` + `expression` strings on PG, but tx-ecosystem fixtures
pin these to:

  input_form = "code"            → "version" / "system"
  input_form = "codeableConcept" → "CodeableConcept.coding[0].*"
  input_form = "coding" or none  → "Coding.*"

Mirrors the SQLite logic at `sqlite/value_set.rs:1747-1754`. Without
this, most `version/simple-code-*` and `version/*-codeableConcept-*`
fixtures still fail despite emitting the correct issue text; the
diff is purely in the `location`/`expression` array values.
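
The mapping above fits in a single match. `issue_location` is a hypothetical name for illustration; treating any unrecognised `input_form` like `coding` is an assumption.

```rust
/// Builds the issue `location` / `expression` string for `field`
/// (for example "version" or "system") from the request's input form.
fn issue_location(input_form: Option<&str>, field: &str) -> String {
    match input_form {
        Some("code") => field.to_string(),
        Some("codeableConcept") => format!("CodeableConcept.coding[0].{field}"),
        // "coding", absent, or anything unrecognised.
        _ => format!("Coding.{field}"),
    }
}
```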
PG's `is_concept_inactive` / `is_concept_abstract` queried the
concept_properties table with hardcoded property names (`status`,
`inactive`, `notSelectable`). Tx-ecosystem fixtures, however, rename
these locally (e.g. `not-selectable` with a hyphen, declared on the
CodeSystem property[] array with `uri:
http://hl7.org/fhir/concept-properties#notSelectable`) and the FHIR
spec allows it. With hardcoded names the queries miss those concepts,
leaving the `notSelectable` (35 fails) and `inactive` (5 fails)
families largely unmoved by P1 / P1.5.

Port `cs_property_local_codes` from `sqlite/code_system.rs:1599`. It
walks the highest-versioned CS row's `resource_json.property[]` and
returns the list of local codes whose `uri` ends in the canonical
suffix or matches it exactly. Then update the two predicates to query
with a dynamic property-name IN list built from that resolution.

No cache yet (PG backend has no per-instance cache map). The SQLite
backend memoises in `cs_abstract_prop_cache` / `cs_inactive_prop_cache`;
this PR pays a small cost per request to fetch the property aliases. To
be replaced once the PG cache scaffolding is in place.

Net diff: +84 / -36 in postgres/value_set.rs.
Expected pass-rate lift on tx-ecosystem-pg: +6-8 pp (closes
notSelectable + inactive families).
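
A sketch of the alias resolution, assuming the `property[]` array has already been parsed into a reduced `PropertyDef` struct (hypothetical). The real helper reads `resource_json`; the matching rule here follows the description above.

```rust
/// One entry of `CodeSystem.property[]`, reduced to the fields used here.
struct PropertyDef {
    code: String,
    uri: Option<String>,
}

/// Returns the local property codes whose `uri` ends in `#<canonical>` or
/// equals `canonical` exactly.
fn cs_property_local_codes(props: &[PropertyDef], canonical: &str) -> Vec<String> {
    let suffix = format!("#{canonical}");
    props
        .iter()
        .filter(|p| {
            p.uri
                .as_deref()
                .is_some_and(|u| u == canonical || u.ends_with(suffix.as_str()))
        })
        .map(|p| p.code.clone())
        .collect()
}
```

For a CodeSystem that declares `not-selectable` with the `...concept-properties#notSelectable` uri, this yields `["not-selectable"]`, which the predicates can then bind into their IN list.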
Inspection of `actual/version/simple-code-bad-version1-response-parameters.json`
vs `expected/` showed the diff was a missing `display` parameter on
mismatch responses, not the location/expression strings (which the
P1.5.1 input_form fix already got right).

SQLite's flow expands the VS first, finds the code (`found = Some(c)`)
even when the version pin is wrong, then passes `found.display` into
the response. PG's flow short-circuits on version mismatch BEFORE
expansion, so `found` is unavailable, leaving `display: None`.

Look up the concept's display directly from the `concepts` table by
(url, code) before returning. Uses the highest-version row when
multiple versions of the CS are stored. The code is still discoverable
in the underlying CS; only the requested version is unknown.

Should close most of the residual `version` family (104 fails on
P1.5.1's baseline). Combined with P1.6's locally-aliased property
codes (also pending in the same push), expecting +10-15pp delta on
the next dispatch.
Closes ~144 failures across notSelectable, language, overload,
parameters, simple, extensions, permutations families. The tx-ecosystem
IG fixtures POST a full `ValueSet` resource (no canonical URL) to
`$expand`; the previous PG impl rejected them with "Missing required
parameter: url (ValueSet canonical URL)".

Replaces the up-front `req.url.ok_or(...)` guard with a branched
resolution:

  * `Some(url)`: unchanged URL path, plus the existing
    `find_cs_for_implicit_vs` fallback for implicit ValueSets.
  * `None` + `req.value_set = Some(vs)`: treat the inline body as
    authoritative. Extract `.compose`, stringify, hand to
    `compute_expansion` directly. Skip the `value_set_expansions`
    cache (no stored VS id). Falls back to pre-expanded
    `.expansion.contains[]` when `.compose` is absent.
  * Neither: error as before.

Filter / hierarchical / pagination / offset / max_expansion_size logic
preserved.

The HTTP handler in `operations/expand.rs` reconstructs the
`expansion.parameter[used-codesystem]` list from
`source_vs.compose.include[]` plus `(system, version)` pairs on
contains items, so no backend changes are needed for that emission.

Known fidelity gaps (marked TODO: parity in code):

  - No inline-compose cache (perf only).
  - `compute_expansion` doesn't pin (system, version) on contains
    items, so multi-version expansions (`overload/` IG family) will
    collapse to a single used-codesystem entry. Closing those requires
    threading `cs_version` through compute_expansion (future work).
  - No `tx_resources` resolution for nested `compose.include[].valueSet[]`
    refs in inline composes.
  - No expansion-warnings propagation for skipped systems.

Net diff: +122 / -30 in postgres/value_set.rs.
Expected pass-rate lift on tx-ecosystem-pg: 50.4% → ~65-70%.
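
The branched resolution can be pictured as a match over the two optional inputs. `ExpandSource`, `InlineValueSet`, and `resolve_expand_source` are hypothetical names, and the error for an inline body with neither `compose` nor `expansion` is an assumption.

```rust
/// Where the expansion input comes from (hypothetical type).
#[derive(Debug, PartialEq)]
enum ExpandSource {
    /// Stored or implicit ValueSet addressed by canonical URL.
    Url(String),
    /// Inline ValueSet with a `compose`, passed on as a JSON string.
    InlineCompose(String),
    /// Inline ValueSet without `compose`: use its pre-expanded contains list.
    InlineExpansion,
}

/// Reduced view of an inline ValueSet body (hypothetical).
struct InlineValueSet {
    compose_json: Option<String>,
    has_expansion: bool,
}

fn resolve_expand_source(
    url: Option<&str>,
    value_set: Option<&InlineValueSet>,
) -> Result<ExpandSource, String> {
    match (url, value_set) {
        (Some(u), _) => Ok(ExpandSource::Url(u.to_string())),
        (None, Some(vs)) => match &vs.compose_json {
            Some(c) => Ok(ExpandSource::InlineCompose(c.clone())),
            None if vs.has_expansion => Ok(ExpandSource::InlineExpansion),
            None => Err("inline ValueSet has neither compose nor expansion".to_string()),
        },
        (None, None) => {
            Err("Missing required parameter: url (ValueSet canonical URL)".to_string())
        }
    }
}
```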
PG's `compute_expansion` ignored `compose.include[].filter[]`
entries entirely. Tx-ecosystem fixtures rely on this for the
`notSelectable`, `is-a`, and `regex` filter families: the
ValueSets ship with filters like
`{property: "notSelectable", op: "=", value: "false"}` and our PG
backend returned the unfiltered superset.

Three filter ops implemented in `compute_expansion` (and the mirrored
`compose.exclude[]` branch):

  * `=`: handles boolean-false-as-absence (`notSelectable=false`
    means concepts that don't have `notSelectable=true`) plus
    canonical string equality. Resolves locally-renamed property
    aliases via `cs_property_local_codes`.
  * `is-a`: recursive CTE descending `concept_hierarchy` from the
    root code (root included).
  * `regex`: PG `~` operator on `concepts.code` / `.display`.

Unsupported ops (`descendent-of`, `generalizes`, `child-of`,
`not-in`, multi-value `in`, `exists`) emit a `tracing::warn!`
and contribute the empty set to the AND-intersection, collapsing the
include rather than silently leaking concepts.

Net diff: +335 / -54 in postgres/value_set.rs.

Known fidelity gaps (TODO: parity):
  - No `concept_closure` table for fast is-a lookup (recursive CTE on
    each call). Fine for small hierarchies.
  - No structured `VsInvalid` for filters missing `value` (except
    is-a); other ops return zero rows instead of the spec error.
  - No `_op.extension[]` recovery for R5→R4 converter-stashed ops.

Expected pass-rate lift: 62.8% → 70-75% (closes notSelectable + is-a +
regex families; partial close on simple/extensions/exclude).
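
The AND-intersection rule, including the empty-set treatment of unsupported filters, can be shown on an in-memory model. `Concept`, `Filter`, and `apply_filters` are hypothetical; only the `notSelectable` `=` case is modelled, and the real code runs SQL rather than iterating a slice.

```rust
use std::collections::BTreeSet;

struct Concept {
    code: &'static str,
    not_selectable: bool,
}

struct Filter {
    property: &'static str,
    op: &'static str,
    value: &'static str,
}

/// Intersects the match set of every filter. A filter this sketch does not
/// support contributes the empty set, so the whole include collapses.
fn apply_filters(concepts: &[Concept], filters: &[Filter]) -> BTreeSet<&'static str> {
    let mut result: BTreeSet<&'static str> = concepts.iter().map(|c| c.code).collect();
    for f in filters {
        let matched: BTreeSet<&'static str> = match (f.property, f.op, f.value) {
            // `false` means "not flagged", which includes concepts that
            // carry no notSelectable property at all.
            ("notSelectable", "=", "false") => concepts
                .iter()
                .filter(|c| !c.not_selectable)
                .map(|c| c.code)
                .collect(),
            ("notSelectable", "=", "true") => concepts
                .iter()
                .filter(|c| c.not_selectable)
                .map(|c| c.code)
                .collect(),
            other => {
                eprintln!("unsupported filter: {:?}", other);
                BTreeSet::new()
            }
        };
        result = result.intersection(&matched).copied().collect();
    }
    result
}
```

Returning the empty set for an unknown op errs on the side of an incomplete expansion rather than one that contains concepts the filter should have removed.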
PG's `compute_expansion` wrote `version: None` on every
`ExpansionContains`. Tx-ecosystem `overload/` fixtures send inline
ValueSets with multiple `compose.include[]` entries pinning the same
system at different versions; the handler in `operations/expand.rs`
builds `expansion.parameter[used-codesystem]` from the (system, version)
tuples on contains items. Without per-item version, the handler
emits duplicate used-codesystem entries and omits the concrete
`version` on each contains item:

  Expected: {"system":"...overload", "version":"2.0.0", "code":"code1"}
  Actual:   {"system":"...overload",                    "code":"code1"}

Plus duplicated used-codesystem entries (1.0.0 twice).

Fix: after `resolve_compose_system_id` returns a CS row id, query that
row's actual `version` and propagate it onto every push site (both
explicit-`concept[]` and all-codes branches). Mirrors SQLite's
`cs_version.clone()` writes in `compute_expansion_with_versions`.

Closes the overload family (~22 fails) and reshapes contains items
generally; also unlocks deduplicating used-codesystem in the handler.

Net diff: +18 / -3.
Expected pass-rate lift: 69.3% → ~73-75%.
Parallel to P1's VS `validate_code` port, but for the CodeSystem
operation. The previous PG impl returned only `result`, `message`,
`display` and left all 7 other ValidateCodeResponse fields as
None/empty, so most `validation/` family fixtures plus parts of
`version/`, `parameters/`, `default/` failed on response shape.

New full impl mirrors the IG fixture canonical forms:

  - Unknown CodeSystem URL → `UNKNOWN_CODESYSTEM` error issue with
    `caused_by_unknown_system` and input_form-aware location.
  - URL exists, version doesn't → delegates to the VS-port
    `detect_cs_version_mismatch` for `UNKNOWN_CODESYSTEM_VERSION`
    + `caused_by_unknown_system` (PG-only enhancement; SQLite CS
    doesn't do this yet).
  - Code not in CS → `Unknown_Code_in_Version` error with IG-exact
    text `Unknown code 'X' in the CodeSystem 'Y' version 'Z'`.
  - Fragment-content CS, unknown code → `UNKNOWN_CODE_IN_FRAGMENT`
    warning, `result: true`.
  - Abstract + `include_abstract=false` → `ABSTRACT_CODE_NOT_ALLOWED`
    error (PG-only enhancement).
  - Case-insensitive normalization → `CODE_CASE_DIFFERENCE` info
    issue + `normalized_code` (PG-only enhancement).
  - Inactive concept → `INACTIVE_CONCEPT_FOUND` warning +
    `inactive: Some(true)`.
  - Display mismatch → `Display_Name_for__should_be_one_of__instead_of`
    issue with IG-canonical text; honours `lenient_display_validation`.

Reuses VS-port helpers from postgres/value_set.rs (bumped to
`pub(super)` for sibling access):

  cs_version_for_msg, cs_content_for_url, cs_is_case_insensitive,
  is_concept_inactive, is_concept_abstract, detect_cs_version_mismatch

Plus a new `ValidateConcept` struct and four new local helpers in
code_system.rs: `find_concept_by_system_id`,
`find_concept_by_system_id_ci`, `find_concept_by_url`,
`find_concept_by_url_ci`.

Net diff: +475 / -41 across two files.
Expected pass-rate lift on tx-ecosystem-pg: 73.7% → ~80%.

TODO: parity gaps (marked in code):
  - Wildcard version (`1.x` pattern) handling when no stored version
    matches the pattern; falls through unhandled.
  - Display-language honoring (assumes `en`); concept designations
    not looked up at the backend layer.
  - No per-instance response cache (`validate_code_response_cache`).
…e-code

PG `validate_code` swallowed the all-paths-failed branch into a
`Parameters { result: false, message: "... could not be found" }`
response. The IG tx-ecosystem `version/*-vsbb-*` and friends (24
fixtures) expect a top-level OperationOutcome (4xx) for this case:
the FHIR convention is that unresolvable canonicals are HTTP errors,
not validate-code "no" results.

Return `HtsError::NotFound` instead so the handler's error→HTTP
mapping emits OperationOutcome. The branch only fires after all
fallbacks have already failed (`parse_fhir_vs_url` for ?fhir_vs,
`find_cs_for_implicit_vs` for CodeSystem.valueSet link), so no
working paths are affected.

Expected pass-rate lift: 74.0% → ~78-80% (closes the 24
OperationOutcome-vs-Parameters bucket).
The HTTP handler at `operations/expand.rs:1683` already parses
`force-system-version` and `system-version` Parameters into
`ExpandRequest::force_system_versions` and
`ExpandRequest::system_version_defaults`, but the PG backend
ignored both maps. Result: every `version/vs-expand-v-*-force-*`
fixture got the latest stored CS version even when the request
demanded a specific one (e.g. force-system-version `...|1.0.x`
returned `...|1.2.0` instead of `...|1.0.0`). This was the
biggest single bucket: 48 `ValueSet_content_diff` failures.

Fix: thread both maps through `compute_expansion` and apply the
spec override order in the include/exclude loops:

  force_system_versions[url]    > include.version > system_version_defaults[url]
  ^^^ overrides include pin                       ^^^ default when include has no pin

Mirrors `sqlite/value_set.rs`'s `compute_expansion_with_versions`
override resolution.

Also updates the 4 caller sites: 3 in PG `expand` (explicit-URL,
implicit-VS, inline-VS paths) pass `&req.force_system_versions`
and `&req.system_version_defaults`; 1 in PG `validate_code`
passes empty maps (the FHIR R5 spec defines these parameters on
$expand only, and ValidateCodeRequest carries only
`default_value_set_versions`).

Net diff: +29 / -5 in postgres/value_set.rs.
Expected pass-rate lift: 74.0% → ~80% (closes the 48
ValueSet_content_diff bucket).
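
The override order reduces to a three-step fallback. `effective_version` is a hypothetical helper name; the map types are assumed to be `HashMap<String, String>` keyed by CodeSystem URL.

```rust
use std::collections::HashMap;

/// Picks the CodeSystem version for one include, in priority order:
/// forced version, then the include's own pin, then the request default.
fn effective_version<'a>(
    url: &str,
    include_version: Option<&'a str>,
    force_system_versions: &'a HashMap<String, String>,
    system_version_defaults: &'a HashMap<String, String>,
) -> Option<&'a str> {
    force_system_versions
        .get(url)
        .map(String::as_str)
        .or(include_version)
        .or_else(|| system_version_defaults.get(url).map(String::as_str))
}
```

Passing two empty maps, as the `validate_code` call site does, leaves the include's own pin (or no version at all) in effect.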
…load

When the caller pins a version on `$validate-code` and the matched
expansion candidate has a different stored version, PG was falling
back to the single-candidate path and returning `result: true`.

The IG `overload/validate-bad-vXcodeY` family pins version vX of a
multi-version VS and asks about a code that's only present at the
*other* pinned version (vY). The correct answer is `result: false`
with a "code is in the VS but not at the requested version"
diagnostic, not a forgiving fallback.

Tighten the selection: when the caller pins a version, require an
exact version match on the candidate, except when the only candidate
has no `version` recorded at all (legacy / single-version stored
data). Otherwise return `None`, which routes through the existing
not-in-VS response path.

Closes ~7-15 of the overload (15) family fails. Cumulative target:
80.5% → ~83%.
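
A sketch of the tightened selection rule, using hypothetical `Candidate` and `select_candidate` names. Returning the first candidate when the caller pins no version is an assumption; the notes above only describe the pinned case.

```rust
/// One expansion match for the requested code (hypothetical type).
struct Candidate {
    version: Option<String>,
}

/// With a pinned version, accept only an exact match, except for a lone
/// candidate that records no version at all.
fn select_candidate<'a>(
    candidates: &'a [Candidate],
    pinned: Option<&str>,
) -> Option<&'a Candidate> {
    let Some(wanted) = pinned else {
        // Assumed behaviour when nothing is pinned.
        return candidates.first();
    };
    candidates
        .iter()
        .find(|c| c.version.as_deref() == Some(wanted))
        .or(match candidates {
            [only] if only.version.is_none() => Some(only),
            _ => None,
        })
}
```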
@mauripunzueta
Contributor Author

Consolidating onto PR #106 per request. All commits will be cherry-picked onto fix/hts-ecl-postgres-build to keep a single PR carrying the full PG-parity work.
