Test Gaps and Test Framework Limitations #425

owleyeview (Collaborator) · 2026-03-19T00:13:06Z

I spent some time analyzing our dance test framework against our target architecture: what it can do, what it can't, and which gaps unit tests can cover. Here's a report to keep us honest and to help us decide whether changes or additional test frameworks are needed.

MAP Testing Gaps Report

Context

MAP recently introduced a new host ingress pathway: MapIpcRequest → Runtime::dispatch → MapIpcResponse, exposed as a single Tauri command (dispatch_map_command). This surface covers transaction lifecycle management, holon CRUD operations, lookups, and commit — all dispatched through a wire→domain binding seam with lifecycle enforcement via CommandDescriptor.

The existing integration test infrastructure (sweettests) predates this surface. This report documents the gaps between what the test infrastructure can exercise today and what the system actually does at runtime.

Gap 1: Dance Test Harness — Negative-Path and Lifecycle Safety

What's affected: The existing sweettest harness (adders, executors, fixture state management)

Concrete defects:

Commit fixture state advances too early. The commit adder calls fixture_holons.commit() during fixture construction, which unconditionally advances staged holons to Saved state. This happens regardless of the expected outcome. A test case that expects commit to fail would still have its fixture state advanced to Saved, causing fixture/reality divergence for all subsequent steps.
Commit executor panics on valid non-OK outcomes. The executor pattern-matches the response body into a TransientReference at lines 34-38 and panics on any other shape. That match runs before the if response.status_code == ResponseStatusCode::OK guard at line 43, so a legitimate failure response with a different body shape (or no body) panics instead of being checked, which makes negative-path commit tests impossible.
Load/commit lifecycle-driving response properties are not asserted. Production transaction lifecycle transitions are driven by CommitRequestStatus (Complete vs Incomplete) and LoadCommitStatus on the response holon, not by HTTP-like status codes alone. The commit executor doesn't examine these properties. The load_holons executor asserts counters but not lifecycle semantics. This means the harness cannot distinguish between "commit succeeded and transaction is now Committed" vs "commit succeeded but transaction remains Open" (e.g., Incomplete status).
Other adders share the unconditional-advance pattern. For example, add_with_properties_step advances fixture head unconditionally. Any step that is expected to fail but whose adder mutates fixture state will cause the same kind of divergence.
No post-commit lifecycle assertions are expressible. There is no transaction lifecycle tracker in the fixture or execution state. The framework cannot model "mutation should be rejected because the transaction is now Committed." This class of test is currently unwritable.
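The first two defects above could be addressed along these lines. This is a minimal sketch, not the harness code: Status, HolonState, Body, Response, record_commit_step, and check_commit_response are all hypothetical stand-ins for the real adder, executor, and response types. It shows the adder advancing fixture state only when the step is expected to succeed, and the executor comparing status before it looks at the body.

```rust
// All types here are hypothetical stand-ins, not the real MAP harness types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Conflict,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HolonState {
    Staged,
    Saved,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Body {
    None,
    Reference(String),
}

pub struct Response {
    pub status: Status,
    pub body: Body,
}

/// Adder side: advance staged fixture holons to Saved only when the
/// commit step is expected to succeed.
pub fn record_commit_step(fixture: &mut [HolonState], expected: Status) {
    if expected != Status::Ok {
        return; // expected failure: fixture state must not move
    }
    for state in fixture.iter_mut() {
        if *state == HolonState::Staged {
            *state = HolonState::Saved;
        }
    }
}

/// Executor side: compare the status first, and inspect the body only on
/// the OK path, so a failure response with any body shape cannot panic.
pub fn check_commit_response(
    response: &Response,
    expected: Status,
) -> Result<Option<String>, String> {
    if response.status != expected {
        return Err(format!(
            "expected {:?}, got {:?}",
            expected, response.status
        ));
    }
    if response.status != Status::Ok {
        return Ok(None); // expected failure: nothing to extract
    }
    match &response.body {
        Body::Reference(id) => Ok(Some(id.clone())),
        other => Err(format!("OK commit response had unexpected body: {:?}", other)),
    }
}
```

Returning a Result instead of panicking is a choice made for the sketch; the real executor could just as well keep its assertions and only reorder them.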

Impact: The harness cannot safely express negative-path lifecycle tests. Existing positive-path tests are unaffected.
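To make post-commit lifecycle assertions expressible, the fixture or execution state would need some form of lifecycle tracker. The sketch below is one possible shape, under stated assumptions: TxLifecycle, after_commit, and check_lifecycle are hypothetical names, and CommandDescriptor is reduced to a name plus a requires_open_tx flag. Only the Complete/Incomplete distinction and the requires_open_tx idea come from the report above.

```rust
// Hypothetical lifecycle model for a test harness; not the production types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TxLifecycle {
    Open,
    Committed,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CommitRequestStatus {
    Complete,
    Incomplete,
}

/// Simplified descriptor: the real one carries more flags.
pub struct CommandDescriptor {
    pub name: &'static str,
    pub requires_open_tx: bool,
}

/// A commit moves an Open transaction to Committed only when the response
/// reports Complete; an Incomplete commit leaves it Open.
pub fn after_commit(current: TxLifecycle, status: CommitRequestStatus) -> TxLifecycle {
    match (current, status) {
        (TxLifecycle::Open, CommitRequestStatus::Complete) => TxLifecycle::Committed,
        (other, _) => other,
    }
}

/// Reject commands that need an open transaction once it is Committed.
pub fn check_lifecycle(
    descriptor: &CommandDescriptor,
    lifecycle: TxLifecycle,
) -> Result<(), String> {
    if descriptor.requires_open_tx && lifecycle != TxLifecycle::Open {
        return Err(format!(
            "{} rejected: transaction is {:?}",
            descriptor.name, lifecycle
        ));
    }
    Ok(())
}
```

With a tracker like this in the execution state, a step such as "stage a holon after a Complete commit" could declare rejection as its expected outcome instead of being unwritable.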

Gap 2: No Integration Test Coverage of the Runtime Dispatch Path

What's affected: Runtime::dispatch, wire→domain binding, CommandDescriptor enforcement, RuntimeSession transaction management

Current state: Sweettests call context.initiate_dance(DanceRequest) directly. This exercises the dance/zome pathway through TrustChannel to guest code, then back. The entire host ingress stack is bypassed:

| Layer | Exercised by sweettests? |
| --- | --- |
| `MapIpcRequest` / `MapIpcResponse` serialization | No |
| `Runtime::dispatch_inner` (bind → lifecycle check → dispatch → convert) | No |
| Wire→domain binding (`MapCommandWire` → `MapCommand`, `HolonReferenceWire` → `HolonReference`) | No |
| `CommandDescriptor` enforcement (`requires_open_tx`, `requires_commit_guard`, mutation entry) | No |
| `HostCommitExecutionGuard` concurrency protection | No |
| `RuntimeSession` multi-transaction management | No |

Existing host-side unit tests (host/crates/map_commands/src/tests/dispatch_tests.rs) do exercise the Runtime dispatch path, but without a live Holochain conductor. They validate the dispatch logic, binding, and lifecycle checks in isolation.

The gap: No test combines the Runtime dispatch pathway with a live conductor backend. The binding seam, lifecycle enforcement, and actual DHT persistence have never been tested together.

Gap 3: Extension Dance Dispatch via MAP Commands Is Not Implemented

What's affected: TransactionAction::Dance in the MAP command surface

Current state: The transaction dispatch handler returns NotImplemented for the Dance variant:

TransactionAction::Dance(_) => Err(HolonError::NotImplemented(
    "TransactionAction::Dance: extension dances pending API refactor".to_string(),
)),

TransactionAction::Query is similarly unimplemented.

Clarification: This does not mean the MAP command surface cannot reach the guest. Built-in commands that need guest interaction (commit, load_holons, delete, fetch, etc.) already do so — they delegate to ClientHolonService, which builds dance requests internally and executes them via context.initiate_dance(). The Dance variant exists specifically for dynamically-defined extension dance types that don't map to a built-in command. Until this variant is implemented, clients cannot dispatch arbitrary/custom dances through the MAP command surface.

Impact: Custom extension dance-based operations cannot be routed through the MAP command surface yet. All built-in operations work today.

Gap 4: Fixture Import Is Incompatible with the Runtime Path

What's affected: Test initialization — how fixture holons get loaded into a test session

Current state: Sweettests import fixture holons by:

Binding wire-level fixture holons to a TransactionContext via .bind(&transaction_context)
Calling transaction_context.import_transient_holons(bound_holons)

This requires a TransactionContext to exist before any test steps run.

The problem for Runtime-based tests: With a Runtime, no TransactionContext exists until a BeginTransaction command is dispatched. The fixture import mechanism would need to do one of the following:

Express fixture loading as MAP commands (e.g., BeginTransaction → NewHolon → StageNewHolon → ... for each fixture holon)
Reach behind the Runtime to get the TransactionContext after BeginTransaction, which defeats the purpose of testing through the command surface
Use LoadHolons with a wire-compatible bundle format

This is a bootstrapping problem that doesn't have a clean solution today.
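For illustration, the first option (fixture loading expressed as commands) might look roughly like this. The command enum and fixture struct are deliberately simplified and hypothetical: FixtureCommand, FixtureHolon, and fixture_to_commands do not exist in the codebase, and the real MAP command types carry transaction ids and references rather than plain string keys.

```rust
/// Hypothetical, simplified command shape used only for this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FixtureCommand {
    BeginTransaction,
    NewHolon { key: String },
    WithPropertyValue { key: String, name: String, value: String },
    StageNewHolon { key: String },
}

/// Hypothetical fixture holon: a key plus string-valued properties.
pub struct FixtureHolon {
    pub key: String,
    pub properties: Vec<(String, String)>,
}

/// Translate fixture holons into the command sequence a Runtime-based
/// test would dispatch before its first real step.
pub fn fixture_to_commands(holons: &[FixtureHolon]) -> Vec<FixtureCommand> {
    let mut commands = vec![FixtureCommand::BeginTransaction];
    for holon in holons {
        commands.push(FixtureCommand::NewHolon {
            key: holon.key.clone(),
        });
        for (name, value) in &holon.properties {
            commands.push(FixtureCommand::WithPropertyValue {
                key: holon.key.clone(),
                name: name.clone(),
                value: value.clone(),
            });
        }
        commands.push(FixtureCommand::StageNewHolon {
            key: holon.key.clone(),
        });
    }
    commands
}
```

The trade-off is visible even in the sketch: setup now depends on the same command surface the test is meant to verify, so a bug in dispatch would surface as a fixture failure rather than a test failure.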

Gap 5: No Multi-Transaction Test Support

What's affected: Testing sequential or concurrent transactions within a single test session

Current state: Each sweettest operates within a single TransactionContext for its entire lifetime. There is no mechanism to:

Begin a second transaction after committing the first
Test cross-transaction visibility (can transaction B see holons committed by transaction A?)
Test transaction isolation semantics
Test the RuntimeSession active transaction registry

Why it matters: RuntimeSession manages a HashMap<TxId, Arc<TransactionContext>> for exactly this purpose. Multi-transaction workflows are a core use case for the client (e.g., load type descriptors in transaction 1, create holons in transaction 2). None of this is tested at the integration level.
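The kind of scenario that is currently untestable can be shown with a toy model. Everything below is an assumption made for illustration: Session stands in for RuntimeSession, TransactionContext is reduced to a list of staged keys, and the rule that committed holons become visible session-wide is a guess at the intended semantics, not confirmed behaviour.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

pub type TxId = u64;

/// Hypothetical stand-in for the real TransactionContext.
#[derive(Default)]
pub struct TransactionContext {
    staged: Mutex<Vec<String>>,
}

/// Toy session holding the active transaction registry.
#[derive(Default)]
pub struct Session {
    next_id: TxId,
    active: HashMap<TxId, Arc<TransactionContext>>,
    saved: Vec<String>,
}

impl Session {
    /// Open a new transaction and register it as active.
    pub fn begin(&mut self) -> TxId {
        self.next_id += 1;
        self.active
            .insert(self.next_id, Arc::new(TransactionContext::default()));
        self.next_id
    }

    /// Stage a key in an active transaction; fails once it is committed.
    pub fn stage(&self, tx: TxId, key: &str) -> Result<(), String> {
        let context = self
            .active
            .get(&tx)
            .ok_or_else(|| format!("no open transaction {}", tx))?;
        context.staged.lock().unwrap().push(key.to_string());
        Ok(())
    }

    /// Commit: remove the transaction from the registry and persist its keys.
    pub fn commit(&mut self, tx: TxId) -> Result<(), String> {
        let context = self
            .active
            .remove(&tx)
            .ok_or_else(|| format!("no open transaction {}", tx))?;
        let mut staged = context.staged.lock().unwrap();
        self.saved.append(&mut staged);
        Ok(())
    }

    pub fn saved_keys(&self) -> Vec<String> {
        self.saved.clone()
    }

    pub fn active_count(&self) -> usize {
        self.active.len()
    }
}
```

A multi-transaction test would then read: begin transaction 1, stage and commit type descriptors, begin transaction 2, and assert both that transaction 2 sees the committed holons and that late writes to transaction 1 are rejected.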

Gap 6: Sweettest Operations Don't Exercise the Direct Mutation/Lookup Path

What's affected: The new context.mutation().* and context.lookup().* facade operations

Current state: Sweettests exercise holon operations exclusively through dances — context.initiate_dance(DanceRequest) → TrustChannel → guest zome code. This is the correct path for testing dance/choreography correctness.

However, the MAP command surface routes most operations directly through TransactionContext facades:

context.mutation().new_holon(), .stage_new_holon(), .with_property_value(), etc.
context.lookup().get_all_holons(), .get_staged_holon_by_base_key(), etc.

Only commit and load go through dances (via ClientHolonService).

The gap: The direct facade operations are tested in host-side unit tests but never against a live conductor. Any divergence between the dance path and the direct facade path for the same logical operation would go undetected.
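One way to detect such divergence would be a differential check that drives both paths with the same steps and compares observable state after each one. The sketch below assumes a hypothetical HolonPath trait that both the dance path and the facade path could be wrapped in; InMemoryPath is a test double with a switchable behavioural difference, not a model of either real path.

```rust
/// Hypothetical abstraction over "a way to perform holon operations".
pub trait HolonPath {
    fn stage_new_holon(&mut self, key: &str);
    fn staged_keys(&self) -> Vec<String>;
}

/// Run the same staging steps on both paths and report the first point
/// at which their observable state differs.
pub fn find_divergence<A: HolonPath, B: HolonPath>(
    dance: &mut A,
    facade: &mut B,
    keys: &[&str],
) -> Option<String> {
    for &key in keys {
        dance.stage_new_holon(key);
        facade.stage_new_holon(key);
        let (left, right) = (dance.staged_keys(), facade.staged_keys());
        if left != right {
            return Some(format!(
                "after staging {}: dance path {:?} vs facade path {:?}",
                key, left, right
            ));
        }
    }
    None
}

/// In-memory double used to exercise the comparison itself.
#[derive(Default)]
pub struct InMemoryPath {
    keys: Vec<String>,
    /// When true, staging an already-staged key is silently ignored.
    pub ignore_duplicates: bool,
}

impl HolonPath for InMemoryPath {
    fn stage_new_holon(&mut self, key: &str) {
        if self.ignore_duplicates && self.keys.iter().any(|k| k.as_str() == key) {
            return;
        }
        self.keys.push(key.to_string());
    }

    fn staged_keys(&self) -> Vec<String> {
        self.keys.clone()
    }
}
```

In a real harness both implementations would need a live conductor behind them, which is exactly the piece that is missing today.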

Summary Matrix

| Gap | What's Missing | Existing Partial Coverage |
| --- | --- | --- |
| 1. Harness negative-path safety | Fixture state guards, executor failure handling, lifecycle tracking | Positive-path tests pass correctly |
| 2. Runtime dispatch integration | End-to-end Runtime + live conductor | Host unit tests (no conductor); sweettests (no Runtime) |
| 3. Dance dispatch via MAP commands | `TransactionAction::Dance` implementation | Dance dispatch works through `initiate_dance` directly |
| 4. Fixture import for Runtime tests | Bootstrap mechanism for Runtime-based test sessions | `import_transient_holons` works for direct-context tests |
| 5. Multi-transaction tests | Sequential/concurrent transaction lifecycle tests | Single-transaction sweettests; RuntimeSession unit tests |
| 6. Direct facade path coverage | Facade operations against live conductor | Host unit tests (mocked); sweettests (via dances only) |
