Test work units as a black box (golden + property-based) #982

Description

@scarmuega

Motivation

The boundary pipeline (RUPD, Ewrap, Estart) is now built out of sharded WorkUnit implementations whose contracts span multiple non-trivial axes: per-shard determinism, idempotent state writes, crash-recovery via persisted progress cursors, and equivalence with the unsharded path. Recent refactors (RUPD sharding, EWRAP split, ESTART account-snapshot sharding, progress-cursor resume) have repeatedly introduced regressions that only surface late — e.g. the live_pledge discrepancy where a shard-scoped lookup silently returned 0 for owners in other shards, manifesting only after the protocol-7 mark window propagated through the snapshot rotation.

We need a test layer that pins these contracts down independently of full-network integration runs.

Scope

Treat each WorkUnit (RupdWorkUnit, EwrapWorkUnit, EstartWorkUnit) as a black box. The test harness should drive the work unit through its lifecycle (initialize → load/compute/commit_state/commit_archive per shard → finalize) against a fixture-backed Domain, and assert on observable outputs only:

  • entities written to state (by namespace + key range)
  • archive log entries
  • EpochState.end accumulators / progress cursors / incentives
  • emitted deltas

No reaching into private fields of the work unit or its BoundaryWork/RupdWork context.
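
A rough sketch of what such a driver could look like. The `ShardedUnit` trait, `FixtureDomain` and `EchoUnit` below are hypothetical stand-ins written for illustration; the real WorkUnit and Domain signatures will differ, and only the phase order is taken from the description above.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory domain: only what a black-box test may observe.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct FixtureDomain {
    /// (namespace, key) -> encoded entity
    pub state: BTreeMap<(String, Vec<u8>), Vec<u8>>,
    /// Archive log entries, in write order.
    pub archive: Vec<String>,
}

/// Hypothetical stand-in for the work-unit trait (signatures are assumptions).
pub trait ShardedUnit {
    fn total_shards(&self) -> u32;
    fn initialize(&mut self, domain: &mut FixtureDomain);
    fn load(&mut self, shard: u32, domain: &FixtureDomain);
    fn compute(&mut self, shard: u32);
    fn commit_state(&mut self, shard: u32, domain: &mut FixtureDomain);
    fn commit_archive(&mut self, shard: u32, domain: &mut FixtureDomain);
    fn finalize(&mut self, domain: &mut FixtureDomain);
}

/// Drives the full lifecycle: initialize, then every phase per shard, then finalize.
pub fn run_lifecycle<U: ShardedUnit>(unit: &mut U, domain: &mut FixtureDomain) {
    unit.initialize(domain);
    for shard in 0..unit.total_shards() {
        unit.load(shard, domain);
        unit.compute(shard);
        unit.commit_state(shard, domain);
        unit.commit_archive(shard, domain);
    }
    unit.finalize(domain);
}

/// Toy unit used only to exercise the driver; it writes one entity and
/// one archive entry per shard, plus a marker on finalize.
pub struct EchoUnit {
    pub shards: u32,
}

impl ShardedUnit for EchoUnit {
    fn total_shards(&self) -> u32 {
        self.shards
    }
    fn initialize(&mut self, _domain: &mut FixtureDomain) {}
    fn load(&mut self, _shard: u32, _domain: &FixtureDomain) {}
    fn compute(&mut self, _shard: u32) {}
    fn commit_state(&mut self, shard: u32, domain: &mut FixtureDomain) {
        domain
            .state
            .insert(("toy".to_string(), vec![shard as u8]), vec![shard as u8]);
    }
    fn commit_archive(&mut self, shard: u32, domain: &mut FixtureDomain) {
        domain.archive.push(format!("shard {}", shard));
    }
    fn finalize(&mut self, domain: &mut FixtureDomain) {
        domain.archive.push("finalize".to_string());
    }
}
```

Tests would then assert on `FixtureDomain` only (state entries and archive log), never on fields of the unit itself.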

Test categories

1. Golden tests via predetermined fixtures

  • A small set of curated state snapshots (synthetic, not full network bootstraps) covering:
    • pre-Allegra, Allegra–Mary, Alonzo, Babbage, Conway boundaries
    • pools whose owners span multiple shards
    • pools whose owner credential ≠ operator credential
    • unregistered operators / unregistered delegators around the RUPD/EWRAP boundary
    • retiring pools, MIRs to unregistered accounts, expiring DReps, dropping/enacting proposals
  • For each fixture: run the full work-unit lifecycle and diff the resulting state + archive against a checked-in expected output. Treat unexpected diffs as test failures.
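
One possible shape for the golden comparison, assuming state and archive can be rendered to a deterministic text form. `State`, `render` and `golden_diff` are illustrative names, not existing helpers; the real fixtures may well be JSON or CBOR instead of this line format.

```rust
use std::collections::BTreeMap;

/// Hypothetical flattened view of state: (namespace, key) -> value, all as text.
pub type State = BTreeMap<(String, String), String>;

/// Renders state and archive into a stable, line-oriented form.
/// BTreeMap iteration is ordered, so the output is deterministic.
pub fn render(state: &State, archive: &[String]) -> String {
    let mut out = String::new();
    for ((ns, key), value) in state {
        out.push_str(&format!("state {}/{} = {}\n", ns, key, value));
    }
    for (i, entry) in archive.iter().enumerate() {
        out.push_str(&format!("archive[{}] = {}\n", i, entry));
    }
    out
}

/// Line-by-line comparison against the checked-in expected output.
/// Returns one message per differing line; empty means the golden matches.
pub fn golden_diff(expected: &str, actual: &str) -> Vec<String> {
    let exp: Vec<&str> = expected.lines().collect();
    let act: Vec<&str> = actual.lines().collect();
    let mut diffs = Vec::new();
    for i in 0..exp.len().max(act.len()) {
        let (e, a) = (exp.get(i), act.get(i));
        if e != a {
            diffs.push(format!("line {}: expected {:?}, got {:?}", i + 1, e, a));
        }
    }
    diffs
}
```

A golden test would fail whenever `golden_diff` returns a non-empty list, printing the messages so the unexpected change is visible in CI output.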

2. Property-based tests

Use proptest (already a dep) over generated Domain snapshots. Two properties to start:

  • Idempotency: running the work unit to completion, then re-running initialize + lifecycle from scratch on the post-commit state, must be a no-op (no new deltas, no entity changes, no archive growth).
  • Crash recovery: for every shard k ∈ [0, total_shards) and every lifecycle phase within that shard, simulate a crash after that phase, restart, and assert the final committed state matches a clean run. Combined with idempotency, this exercises the *_progress.committed cursor + the start_shard() resume path end-to-end.
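
To make the two properties concrete, here is a toy model (not the real work units) with a progress cursor and a crash injected after a chosen phase. `ToyDomain`, `Phase` and `run` are all hypothetical, and the sketch sticks to std so it stands alone; in the actual suite a proptest strategy would supply `inputs` and `total_shards` instead of fixed values.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase { Load, Compute, CommitState, CommitArchive }

pub const PHASES: [Phase; 4] =
    [Phase::Load, Phase::Compute, Phase::CommitState, Phase::CommitArchive];

/// Toy domain: inputs are bucketed by `value % total_shards`.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct ToyDomain {
    pub inputs: Vec<u64>,
    pub state: BTreeMap<u32, u64>,      // shard -> sum of its inputs
    pub archive: BTreeMap<u32, String>, // keyed by shard, so rewrites are idempotent
    pub committed: u32,                 // progress cursor: shards fully committed
}

/// Resumes from the cursor. Returns false if the simulated crash fired.
pub fn run(domain: &mut ToyDomain, total_shards: u32, crash: Option<(u32, Phase)>) -> bool {
    for shard in domain.committed..total_shards {
        let mut loaded: Vec<u64> = Vec::new();
        let mut sum = 0u64;
        for phase in PHASES {
            match phase {
                Phase::Load => {
                    loaded = domain.inputs.iter().copied()
                        .filter(|v| (*v % total_shards as u64) as u32 == shard)
                        .collect();
                }
                Phase::Compute => sum = loaded.iter().sum(),
                Phase::CommitState => { domain.state.insert(shard, sum); }
                Phase::CommitArchive => {
                    domain.archive.insert(shard, format!("shard {} sum {}", shard, sum));
                    domain.committed = shard + 1; // cursor advances last
                }
            }
            if crash == Some((shard, phase)) {
                return false; // simulated crash right after this phase
            }
        }
    }
    true
}

fn fresh(inputs: &[u64]) -> ToyDomain {
    ToyDomain { inputs: inputs.to_vec(), ..Default::default() }
}

/// Crash after every (shard, phase), restart, compare with a clean run.
pub fn crash_recovery_holds(inputs: &[u64], total_shards: u32) -> bool {
    let mut clean = fresh(inputs);
    run(&mut clean, total_shards, None);
    for shard in 0..total_shards {
        for phase in PHASES {
            let mut d = fresh(inputs);
            let _ = run(&mut d, total_shards, Some((shard, phase)));
            run(&mut d, total_shards, None); // restart from the cursor
            if d != clean { return false; }
        }
    }
    true
}

/// Re-running on the post-commit state must change nothing.
pub fn idempotent_holds(inputs: &[u64], total_shards: u32) -> bool {
    let mut d = fresh(inputs);
    run(&mut d, total_shards, None);
    let snapshot = d.clone();
    run(&mut d, total_shards, None);
    d == snapshot
}
```

The toy only recovers cleanly because its writes are keyed by shard and the cursor advances after the last write; that ordering is the kind of contract the real tests are meant to pin down.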

A third property worth adding once the harness exists: shard-count invariance — running with total_shards = 1 and total_shards = N (for several N that divide 256) must produce byte-identical state. This is the property that would have caught the live_pledge bug directly.
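
A toy illustration of why this property catches that class of bug. The model below is hypothetical and much simpler than the real pledge calculation: keys are single bytes, shards are contiguous key ranges, and a pool's pledge is the sum of its owners' balances. The `shard_scoped` flag reproduces the failure mode described in the motivation, where owners outside the current shard resolve to 0.

```rust
use std::collections::BTreeMap;

/// Key-range sharding over a one-byte key space (assumes total_shards divides 256).
pub fn shard_of(key: u8, total_shards: u32) -> u32 {
    key as u32 * total_shards / 256
}

/// Computes pledge per pool, shard by shard.
/// accounts: account key -> balance; pools: pool key -> owner account keys.
/// With `shard_scoped = true`, owner lookups only see accounts in the
/// current shard and fall back to 0 otherwise (the buggy behaviour).
pub fn live_pledge(
    accounts: &BTreeMap<u8, u64>,
    pools: &BTreeMap<u8, Vec<u8>>,
    total_shards: u32,
    shard_scoped: bool,
) -> BTreeMap<u8, u64> {
    let mut out = BTreeMap::new();
    for shard in 0..total_shards {
        // Accounts that belong to this shard's key range.
        let local: BTreeMap<u8, u64> = accounts
            .iter()
            .filter(|(k, _)| shard_of(**k, total_shards) == shard)
            .map(|(k, v)| (*k, *v))
            .collect();
        let lookup = if shard_scoped { &local } else { accounts };
        for (pool, owners) in pools
            .iter()
            .filter(|(p, _)| shard_of(**p, total_shards) == shard)
        {
            let pledge: u64 = owners
                .iter()
                .map(|o| lookup.get(o).copied().unwrap_or(0))
                .sum();
            out.insert(*pool, pledge);
        }
    }
    out
}

/// The property: every shard count must reproduce the unsharded result.
pub fn shard_count_invariant(
    accounts: &BTreeMap<u8, u64>,
    pools: &BTreeMap<u8, Vec<u8>>,
    shard_scoped: bool,
) -> bool {
    let baseline = live_pledge(accounts, pools, 1, shard_scoped);
    [2u32, 4, 16, 256]
        .iter()
        .all(|n| live_pledge(accounts, pools, *n, shard_scoped) == baseline)
}
```

With one owner at key 0x10 and another at 0xF0 for a pool at 0x20, the scoped variant loses the second owner's balance as soon as there are two shards, so the invariance check fails; the global lookup passes for every shard count tried.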

Out of scope

  • Full-network bootstrap tests (already covered by tests/cardano and tests/epoch_pots).
  • Performance/memory benchmarks.

Notes

  • Likely lives under crates/cardano/tests/ with a small in-memory Domain impl, or piggybacks on dolos-testing.
  • Fixture format should be small enough to commit (probably JSON or CBOR snapshots of a handful of entities, not full Mithril dumps).
