test(engine): gate on-disk state format (F1) + mixed-version no-clobber upload (F2) #922
Merged
…n no-clobber upload (F2) Two release-gate invariants for the consumed state surface. F1 — on-disk state-schema format golden. A new test pins CURRENT_SCHEMA_VERSION plus the exact set of redb tables a fresh store materializes (read back from `list_tables()`, so it's self-correcting and auto-detects additions/removals). The `state.redb` file is persisted and shared across pods, so any table add/remove/rename or version bump now fails CI until the golden is updated in the same PR. F2 — mixed-version no-clobber upload. The open-time half of the mixed-version invariant was already gated (open_with_policy_* tests cover forward-incompat hard-fail/recreate + backward auto-migrate). This adds the missing run-time half: extract the end-of-run "skip upload when the store was recreated for forward-incompat" decision into a named `state_sync::upload_state_unless_recreated`, and test that a recreated store skips the upload entirely (proven by pointing at a backend that hard-errors the moment dispatch is reached — deleting the guard turns the test red). Behavior-preserving: the run path keeps its warn-on-error. The periodic mid-run uploader is gated by the same flag at spawn time.
Two release-gate invariants for the consumed state surface — the buildable halves of F1 and F2 from the finite-work plan.
F1 — on-disk state-schema format golden
There was no test pinning the on-disk state layout. The `state.redb` file is persisted and shared across pods (`state_sync`), so its layout is a compatibility contract. New test `on_disk_state_schema_format_is_pinned`:

- pins `CURRENT_SCHEMA_VERSION == 9`, and
- pins the exact set of redb tables a fresh store materializes, read back from `list_tables()` so the golden is self-correcting and auto-detects any table add/remove/rename (`init_db` eagerly creates every table).

Any such change now fails CI until the golden (and, for a table change, the version + migration) is updated in the same PR.
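The shape of such a golden check can be sketched with the standard library alone. This is an illustrative model, not the test from this PR: the `Store` type, the table names, `GOLDEN_TABLES`, and `check_format` are all hypothetical stand-ins, and the real store is a redb file rather than an in-memory set. Only `CURRENT_SCHEMA_VERSION == 9`, `init_db`, and `list_tables()` are taken from the description above.

```rust
use std::collections::BTreeSet;

// Value taken from the PR description; everything else below is hypothetical.
const CURRENT_SCHEMA_VERSION: u32 = 9;

// Hypothetical golden: the table names a fresh store is expected to contain.
const GOLDEN_TABLES: &[&str] = &["runs", "schema_meta", "watermarks"];

// In-memory stand-in for the on-disk store (assumption: the real one is redb).
struct Store {
    tables: BTreeSet<String>,
}

// Mimics an eager init: every table exists as soon as the store is created.
fn init_db() -> Store {
    let tables = ["runs", "schema_meta", "watermarks"]
        .iter()
        .map(|t| t.to_string())
        .collect();
    Store { tables }
}

impl Store {
    // Returns table names in sorted order (BTreeSet iterates in key order).
    fn list_tables(&self) -> Vec<String> {
        self.tables.iter().cloned().collect()
    }
}

// Fails when either the version or the materialized table set drifts
// from the pinned values.
fn check_format(store: &Store, version: u32) -> Result<(), String> {
    if version != 9 {
        return Err(format!("schema version changed: {version}"));
    }
    let actual = store.list_tables();
    let golden: Vec<String> = GOLDEN_TABLES.iter().map(|t| t.to_string()).collect();
    if actual != golden {
        return Err(format!("table set drifted: {actual:?} != {golden:?}"));
    }
    Ok(())
}
```

Because the expected set is compared against what the store actually reports, adding, removing, or renaming a table in `init_db` without touching the golden makes `check_format` return an error, which is the property the pinned test relies on.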
Scoped to table-set + version on purpose: record values are deliberately not byte-pinned, because the `*_forward_deserializes_*` tests exist to allow field evolution.

F2 — mixed-version no-clobber upload
The open-time half of the mixed-version invariant was already gated: the `open_with_policy_*` tests cover forward-incompat hard-fail/`Fail`/`Recreate`, backward auto-migrate, and same-version no-op, and run in engine-ci. That half already had test coverage; this PR adds the missing run-time half.

The catastrophic form of the 06-07 failure mode is a downgraded pod (recreated under `on_schema_mismatch = recreate`) uploading its state back and clobbering the newer shared state. The end-of-run skip decision was inline in the 3600-line `run` fn and untested. This PR:

- extracts it into `state_sync::upload_state_unless_recreated(recreated, cfg, path)` (behavior-preserving: the run path keeps its `warn!`-on-error), and
- tests it against a backend that hard-errors the moment dispatch is reached (`on_upload_failure = Fail`): `recreated=true → Ok` (upload skipped), `recreated=false → Err(MissingConfig "s3")` (upload attempted). Deleting the guard turns the test red (verified).

Both upload sites honor the flag: the periodic mid-run uploader is gated at spawn time (not started when the store was recreated); the end-of-run upload goes through the new seam.
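A minimal sketch of the guard and of why a hard-erroring backend proves the skip. The function name and argument order follow the description above; the `SyncConfig` and `UploadError` types, the `upload_state` helper, and the `&str` path are assumptions made for illustration and may not match the real `state_sync` module.

```rust
// Hypothetical error type; the description only shows `MissingConfig "s3"`.
#[derive(Debug, PartialEq)]
enum UploadError {
    MissingConfig(&'static str),
}

// Hypothetical config: `backend` is None when no remote is configured,
// which makes any attempted upload fail immediately.
struct SyncConfig {
    backend: Option<String>,
}

// Stand-in for the dispatch step: errors as soon as it is reached
// without a configured backend.
fn upload_state(cfg: &SyncConfig, _path: &str) -> Result<(), UploadError> {
    match &cfg.backend {
        Some(_) => Ok(()), // a real implementation would transfer the file here
        None => Err(UploadError::MissingConfig("s3")),
    }
}

// The guard: a store recreated after a forward-incompatible open must not
// overwrite the newer shared state, so the upload is skipped entirely.
fn upload_state_unless_recreated(
    recreated: bool,
    cfg: &SyncConfig,
    path: &str,
) -> Result<(), UploadError> {
    if recreated {
        return Ok(());
    }
    upload_state(cfg, path)
}
```

With `backend: None`, calling the function with `recreated = true` returns `Ok(())` only because dispatch is never reached, while `recreated = false` surfaces `MissingConfig("s3")`. Removing the early return would make the first case fail too, which is how the test detects a deleted guard.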
Verification
`cargo test --all-features -p rocky-core -p rocky-cli` green (the new F1/F2 tests + 1361 rocky-core); `cargo fmt --check` + `clippy --all-targets --all-features` clean.

🤖 Generated with Claude Code