promote: CLI hygiene, store-aware doctor, preservation guardrails + doc freshness #88

Merged: ericckzhou merged 4 commits into main from dev on Jun 7, 2026
@ericckzhou

Owner

Promotes accumulated dev work to main. Test/docs/hygiene hardening plus one diagnostic improvement; no runtime behavior change to the pipeline, no version bump.

uv run pytest is green (780) and ruff is clean on the dev tip. After this squash-merges, reset dev to main (git reset --hard origin/main, then push with --force-with-lease) per the branch workflow.

Read-only commands (doctor, verify, replay, inspect, diff, history, timeline, matrix, export) loaded litellm via two paths: main.py imported every command module at top level, and execution/__init__.py eagerly imported LiteLLMAdapter, so touching any execution submodule (e.g. a replay artifact's execution.models) dragged in litellm and its import-time warnings.
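
One way to observe the kind of import leak described above is to import a module in a fresh interpreter and inspect `sys.modules` afterwards. The sketch below is illustrative only: `leaked_modules` is a hypothetical helper, not the project's actual guard, and it is exercised here against standard-library modules rather than the real command modules.

```python
import subprocess
import sys

# Probe executed in a child interpreter: import the target, then list
# everything that ended up in sys.modules. __import__ keeps the probe
# itself from adding extra modules to the result.
_PROBE = (
    "import sys\n"
    "__import__(sys.argv[1])\n"
    "print('\\n'.join(sorted(sys.modules)))\n"
)


def leaked_modules(module, forbidden):
    """Return the forbidden modules (or their submodules) loaded by importing `module`."""
    result = subprocess.run(
        [sys.executable, "-c", _PROBE, module],
        capture_output=True,
        text=True,
        check=True,
    )
    loaded = result.stdout.splitlines()
    return [
        name
        for name in loaded
        if any(name == bad or name.startswith(bad + ".") for bad in forbidden)
    ]
```

Because each check runs in its own process, modules imported earlier by the test runner cannot mask a leak, and transitive imports are caught as well as direct ones. A guard for a read-only command would presumably assert that the returned list is empty for `litellm`.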

Fix both: dispatch imports each command module lazily inside its branch; execution/__init__.py defers LiteLLMAdapter via PEP 562 __getattr__ (public import unchanged). Add tests/meta/test_cli_import_hygiene.py, a subprocess-per-module guard that the dispatcher and read-only commands never import litellm/litellm_adapter. Sync README subcommand count (ten->eleven), add doctor to the CLI reference, and add a doctor troubleshooting note.
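
For readers unfamiliar with PEP 562, the following is a minimal sketch of a module-level `__getattr__` that defers an import until the attribute is first accessed. It is not the code from this PR: `make_lazy_module`, `execution_sketch`, and `HeavyAdapter` are hypothetical names, and `fractions.Fraction` stands in for a heavy dependency. In a real `__init__.py` the `__getattr__` function would simply be defined at module top level.

```python
import importlib
import types


def make_lazy_module(name, lazy_attrs):
    """Build a module whose listed attributes are imported on first access.

    lazy_attrs maps attribute name -> (module to import, attribute in that module).
    """
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Per PEP 562, this is called only when normal lookup on the module fails.
        try:
            target_module, target_attr = lazy_attrs[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(target_module), target_attr)
        setattr(module, attr, value)  # cache so later lookups skip __getattr__
        return value

    # Storing the function in the module namespace enables the PEP 562 hook.
    module.__getattr__ = __getattr__
    return module


# Stand-in for a package that exposes a heavy adapter class lazily.
pkg = make_lazy_module("execution_sketch", {"HeavyAdapter": ("fractions", "Fraction")})
```

Accessing `pkg.HeavyAdapter` triggers the import, while importing or touching anything else on `pkg` does not, which is the property that keeps the public import path unchanged while avoiding the eager load.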

doctor previously always probed SQLite writability regardless of --store-path, so 'falsifyai doctor --store-path postgres://...' reported the path as ok even with no postgres plugin installed -- then 'run' crashed. Now doctor resolves the store scheme the same way build_store does: it reports the selected scheme and registered backends, FAILs (exit 3) when no backend is registered for the scheme, and only write-probes the built-in SQLite store. Plugin stores are reported as registered but not probed -- constructing one could open a network connection, which would break doctor's diagnose-only contract.
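
The decision logic described above could look roughly like the sketch below. Everything here is an assumption made for illustration: `resolve_scheme`, `check_store`, the report wording, and the rule that a path without `://` means SQLite are hypothetical. Only the exit code 3 and the probe/no-probe split come from the description.

```python
EXIT_OK = 0
EXIT_FAIL = 3  # exit code stated in the PR description


def resolve_scheme(store_path):
    """Return the URL scheme of a store path; bare paths are assumed to mean SQLite."""
    scheme, sep, _rest = store_path.partition("://")
    return scheme.lower() if sep else "sqlite"


def check_store(store_path, registered_backends):
    """Return (exit_code, report_lines) without ever constructing a plugin store."""
    scheme = resolve_scheme(store_path)
    lines = [
        f"store scheme: {scheme}",
        f"registered backends: {', '.join(sorted(registered_backends))}",
    ]
    if scheme not in registered_backends:
        lines.append(f"FAIL: no backend registered for scheme {scheme!r}")
        return EXIT_FAIL, lines
    if scheme == "sqlite":
        # Only the built-in store is safe to write-probe locally.
        lines.append("sqlite: write probe would run here")
    else:
        # Constructing a plugin store might open a network connection.
        lines.append(f"{scheme}: registered, not probed")
    return EXIT_OK, lines
```

With only the built-in backend registered, `check_store("postgres://host/db", {"sqlite"})` returns exit code 3 instead of reporting the path as ok.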

Both .claude/CLAUDE.md and AGENTS.md still claimed all subpackages have empty __init__.py files with no implementation yet -- plainly false at 0.6.4, where the pipeline runs end-to-end. Replace with an accurate one-liner pointing at the CHANGELOG.
…hitecture docs (#87)

Turn two preservation-layer invariants the architecture doc only asserted in prose into executable guards, and refresh the doc to the 0.6.x surface. No runtime behavior changes.

- Store-lifecycle harness (tests/unit/test_cli_store_lifecycle.py): one parametrized test asserts every read-only consumer (diff, export, history, inspect, matrix, replay, timeline, verify) closes its ReplayStore on both normal return and post-construction read failure. run is a producer and is covered in its own test file where the execution stack is already mocked.

- Centralize the consumers-never-re-resolve guarantee into the import-hygiene meta-guard: forbid falsifyai.verdict.resolver for all read-only command modules and the dispatcher, replacing seven scattered per-command tests that used two weaker techniques (in-process sys.modules deletion; AST direct-import scans that miss transitive imports). Closes a gap where replay had no check.

- Refresh docs/ARCHITECTURE.md to the full 11-command CLI surface and current perturbation/invariant families, and fix the stale future-commands line. CHANGELOG [Unreleased] documentation note; no version bump.
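
The store-lifecycle invariant in the first bullet can be illustrated with a small stand-alone sketch. This is not the harness from tests/unit/test_cli_store_lifecycle.py: `RecordingStore`, `inspect_consumer`, `leaky_consumer`, and `closes_store` are hypothetical names, and plain functions are used here where the real test is parametrized.

```python
class RecordingStore:
    """Stand-in for a replay store that records whether close() was called."""

    def __init__(self, fail_on_read=False):
        self.fail_on_read = fail_on_read
        self.closed = False

    def load(self, run_id):
        if self.fail_on_read:
            raise OSError("simulated read failure")
        return {"run_id": run_id}

    def close(self):
        self.closed = True


def inspect_consumer(store, run_id):
    """Consumer that closes the store on both success and failure."""
    try:
        return store.load(run_id)
    finally:
        store.close()


def leaky_consumer(store, run_id):
    """Counter-example: close() is skipped when load() raises."""
    record = store.load(run_id)
    store.close()
    return record


def closes_store(consumer, fail_on_read):
    """Run a consumer against a fresh store and report whether it closed it."""
    store = RecordingStore(fail_on_read=fail_on_read)
    try:
        consumer(store, "run-1")
    except OSError:
        pass  # the failure itself is expected; only the close matters here
    return store.closed
```

Running `closes_store` over every consumer with both values of `fail_on_read` gives the two-case matrix the bullet describes; `leaky_consumer` passes the normal-return case and fails the read-failure case, which is the regression such a harness is meant to catch.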
@ericckzhou ericckzhou merged commit c6b391b into main Jun 7, 2026
2 checks passed