Skip to content

fix: address self-review findings across engine, rocky-sdk, and dagster#924

Merged
hugocorreia90 merged 4 commits into
mainfrom
fix/engine/session-review-findings
Jun 18, 2026
Merged

fix: address self-review findings across engine, rocky-sdk, and dagster#924
hugocorreia90 merged 4 commits into
mainfrom
fix/engine/session-review-findings

Conversation

@hugocorreia90

Copy link
Copy Markdown
Contributor

Self-review of the unreleased delta (the tags → Dagster fan-out, config-group guard, surrogate keys, and state-format goldens). Each finding was traced consumer-first and fixed with a discriminating test that fails without the fix.

High

  • dagster: sanitize Dagster tag values, not just keys. get_model_tags sanitized governance tag keys but passed values through verbatim. Dagster validates values too (^[A-Za-z0-9_.-]{0,63}$) at AssetSpec construction, so a governance value carrying @, /, whitespace, or exceeding 63 chars raised DagsterInvalidDefinitionError when building the derived-model specs. Added _sanitize_tag_value (same charset as the key sanitizer; empty values preserved, since Dagster allows them) applied to governance and synthesized rocky/* values. New test pushes gnarly values through build_model_specs (fails on main).

Medium

  • rocky-sdk: TestResult / CiResult couldn't parse real failure output. Both hand-written shadows declared failures: list[list[str]] (positional tuples), but the engine serializes failures as [{name, error}] objects — so the shadow parsed only the all-pass case and raised on any real failure, and also lacked model_results / declarative / unit_tests. Soft-swapped both names to the generated TestOutput / CiOutput, matching the existing HistoryResult / ModelHistoryResult precedent in the same module; kept the Result names as aliases for import compatibility. Real wire shape captured from rocky test --output json; discriminating round-trip tests added on both the SDK and dagster sides (the old fixtures + assertions were stale-positional, internally consistent with the broken shadow).
  • engine: deny_unknown_fields on SurrogateKeySpec and TestRef. A typo'd key in a [[surrogate_key]] or [[use_test]] block was silently dropped; it now fails the load loudly.

Low / hardening

  • engine: resolve_group_schema validates the resolved schema with the same validate_identifier format_table_ref applies at SQL-gen, so a bad [args] value fails at load with a clear message instead of as broken SQL (load/run stay consistent).
  • engine: emit-sql guards the per-model output filename against path traversal (skip-with-report on an unsafe name) and the module doc no longer over-claims "identical to a run" for multi-adapter projects (one dialect is resolved for all models).
  • engine: apply_surrogate_keys documents its validate-before-interpolation precondition and re-checks it with a debug_assert.
  • engine: init_db notes the eager-table registration contract that keeps the on-disk-format golden complete.

Open decision (documented, not changed)

rocky test's model-execution path (execute_locally) does not apply declared surrogate keys, unlike rocky run and rocky emit-sql. A model with a [[surrogate_key]] is materialized without the surrogate column under rocky test, so a test depending on that column behaves differently from a real run. Left as a documented divergence rather than changed here, because aligning it would flip existing rocky test outcomes — flagging for a call on whether rocky test should mirror run-time surrogate keys.

Verification

  • cargo fmt --check, cargo clippy --all-targets --all-features, cargo test (rocky-core/engine/duckdb/cli) — green.
  • SDK + dagster pytest (incl. -W error::pytest.PytestCollectionWarning) and ruff check / ruff format --check — green.
  • just codegen and just regen-fixtures — no drift.

🤖 Generated with Claude Code

get_model_tags sanitized governance tag keys but passed values through
verbatim. Dagster validates tag *values* too (^[A-Za-z0-9_.-]{0,63}$) at
AssetSpec construction, so a governance value carrying @, /, whitespace, or
exceeding 63 chars raised DagsterInvalidDefinitionError when building the
derived-model specs. Add _sanitize_tag_value (same charset as the key
sanitizer, empty values preserved) and apply it to governance values and the
synthesized rocky/* values. Add a discriminating test that pushes gnarly
values through build_model_specs.
The hand-written TestResult/CiResult shadows declared failures: list[list[str]]
(positional [name, error] tuples), but the engine serializes failures as
[{name, error}] objects (the CLI maps its (name, error) tuples to the
TestFailure struct). The shadow shape only ever parsed the empty (all-pass)
case and raised a ValidationError on any real failure; it also lacked
model_results / declarative / unit_tests. Soft-swap both names to the generated
TestOutput / CiOutput, matching the existing HistoryResult / ModelHistoryResult
precedent in the same module, and keep the Result names as aliases for import
compatibility. Guard pytest collection on the Test*-named alias.

Captured the real wire shape from rocky test --output json and added
discriminating round-trip tests on both the sdk and dagster sides (the dagster
fixtures + the failures[0][0] assertion were stale-positional, internally
consistent with the broken shadow).
- deny_unknown_fields on SurrogateKeySpec and TestRef so a typo'd key in a
  [[surrogate_key]] or [[use_test]] block fails the load loudly instead of
  silently dropping the misspelled setting (NamedTest keeps flatten, so it
  can't take deny_unknown_fields; the SurrogateKeySidecar wrapper intentionally
  ignores other keys).
- resolve_group_schema validates the resolved schema with validate_identifier
  (the same validator format_table_ref applies at SQL-gen), so a bad [args]
  value fails at load with a clear message rather than as broken SQL, and
  load/run never disagree.
- apply_surrogate_keys documents its validate-before-interpolation precondition
  and re-checks it with a debug_assert.
- emit-sql guards the per-model output filename against path traversal
  (skip-with-report on an unsafe name) and softens the doc's single-dialect
  'identical to a run' claim for multi-adapter projects.
- init_db notes the eager-table registration contract for the F1 golden.
- execute_locally documents that rocky test's model-execution path does not
  apply surrogate keys (a known divergence from run/emit-sql).
@hugocorreia90 hugocorreia90 merged commit 58c6f8d into main Jun 18, 2026
17 checks passed
@hugocorreia90 hugocorreia90 deleted the fix/engine/session-review-findings branch June 18, 2026 13:33
hugocorreia90 added a commit that referenced this pull request Jun 19, 2026
… + 2 correctness fixes (#930)

Adds coverage for features that shipped in engine 1.52.0 after the last docs
overhaul, grounded against the source and verified file-by-file.

Reference:
- model-format.md: new [[surrogate_key]], [[tests]], [[use_test]], [[test]], and
  [columns.<name>] sidecar-block sections + group/[args] Fields-table rows.
- cli.md + commands/modeling.md: the new rocky emit-sql verb (+ backfill catalog),
  the rocky test unit_tests summary, per-column description in catalog output, and
  a fully-populated compile models_detail example with a tags object.
- json-output.md, configuration.md, glossary.md: unit_tests/tags/failures-shape
  notes, a config-groups pointer, and surrogate-key + config-group glossary entries.

Concepts: testing.md (declarative + unit-test sections), data-quality-checks.md
([[test]] vs [[tests]] callout), sql-generation.md (surrogate-key SQL), and
schema-patterns.md (config-groups vs source-schema-pattern distinction).

Dagster: translator/derived-models/types/resource for the model-tag projection
(get_model_tags), ModelDetail.tags, and the TestOutput/CiOutput aliases.

Correctness fixes:
- migrate-from-dbt.md: emitted [[test]] blocks now execute under rocky test as of
  engine-v1.52.0 (#899); the prior 'not yet wired' clause was stale.
- testing.md: test failures are {name, error} objects, not positional tuples (#924).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant