promote: CLI hygiene, store-aware doctor, preservation guardrails + doc freshness #88
Merged
Read-only commands (doctor, verify, replay, inspect, diff, history, timeline, matrix, export) loaded litellm via two paths: main.py imported every command module at top level, and execution/__init__.py eagerly imported LiteLLMAdapter, so touching any execution submodule (e.g. a replay artifact's execution.models) dragged in litellm and its import-time warnings. Fix both: dispatch imports each command module lazily inside its branch; execution/__init__.py defers LiteLLMAdapter via PEP 562 __getattr__ (public import unchanged). Add tests/meta/test_cli_import_hygiene.py, a subprocess-per-module guard that the dispatcher and read-only commands never import litellm/litellm_adapter. Sync README subcommand count (ten->eleven), add doctor to the CLI reference, and add a doctor troubleshooting note.
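The deferred-import half of this fix can be sketched as follows. This is a minimal illustration of the PEP 562 module-level `__getattr__` pattern, not the project's actual `execution/__init__.py`: the package name `lazy_demo`, the module `heavy_adapter`, and the class `HeavyAdapter` are hypothetical stand-ins for `execution`, `litellm_adapter`, and `LiteLLMAdapter`. The script writes a throwaway package to a temporary directory so it can be run as-is.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical package body: the heavy submodule is imported only when the
# public name is first requested, never at package import time.
INIT_SOURCE = textwrap.dedent('''
    def __getattr__(name):
        # PEP 562: called only when normal attribute lookup on the module fails.
        if name == "HeavyAdapter":
            from .heavy_adapter import HeavyAdapter
            return HeavyAdapter
        raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
''')

# Stand-in for a module with expensive or noisy import-time side effects.
ADAPTER_SOURCE = "class HeavyAdapter:\n    pass\n"


def build_demo_package(root: Path) -> None:
    """Write the two-file demo package under `root`."""
    pkg = root / "lazy_demo"
    pkg.mkdir()
    (pkg / "__init__.py").write_text(INIT_SOURCE)
    (pkg / "heavy_adapter.py").write_text(ADAPTER_SOURCE)


def demo() -> tuple[bool, bool]:
    """Return whether the heavy submodule was loaded (before, after) first access."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(Path(tmp))
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            pkg = importlib.import_module("lazy_demo")
            loaded_before = "lazy_demo.heavy_adapter" in sys.modules
            pkg.HeavyAdapter  # first access triggers the deferred import
            loaded_after = "lazy_demo.heavy_adapter" in sys.modules
            return loaded_before, loaded_after
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            for name in [m for m in sys.modules if m.startswith("lazy_demo")]:
                del sys.modules[name]


if __name__ == "__main__":
    print(demo())
```

Because `from lazy_demo import HeavyAdapter` also goes through the module's `__getattr__`, callers of the public name need no changes, which matches the "public import unchanged" note above.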
doctor previously always probed SQLite writability regardless of --store-path, so 'falsifyai doctor --store-path postgres://...' reported the path as ok even with no postgres plugin installed -- then 'run' crashed. Now doctor resolves the store scheme the same way build_store does: it reports the selected scheme and registered backends, FAILs (exit 3) when no backend is registered for the scheme, and only write-probes the built-in SQLite store. Plugin stores are reported as registered but not probed -- constructing one could open a network connection, which would break doctor's diagnose-only contract.
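The decision flow described above might look roughly like this. It is a sketch under stated assumptions, not the project's code: `resolve_scheme`, `check_store`, `StoreCheck`, and the status strings are hypothetical names, the rule that a bare path means SQLite is an assumption, and the real `build_store` resolution logic is not shown in this PR text. Only exit code 3 for an unregistered scheme and the "probe SQLite only" rule come from the commit message.

```python
from dataclasses import dataclass
from typing import Callable, Iterable
from urllib.parse import urlsplit

EXIT_OK = 0
EXIT_FAIL = 3  # exit code named in the commit message for an unregistered scheme


@dataclass
class StoreCheck:
    scheme: str
    registered: tuple[str, ...]
    status: str  # "ok", "registered-not-probed", or "fail"
    exit_code: int


def resolve_scheme(store_path: str) -> str:
    # Assumption: a path without "://" refers to the built-in SQLite store.
    if "://" not in store_path:
        return "sqlite"
    return urlsplit(store_path).scheme.lower()


def check_store(
    store_path: str,
    registered: Iterable[str],
    probe_sqlite: Callable[[str], bool],
) -> StoreCheck:
    """Report on the selected store without constructing plugin backends."""
    backends = tuple(sorted(registered))
    scheme = resolve_scheme(store_path)
    if scheme not in backends:
        # No backend for this scheme: fail now instead of letting `run` crash.
        return StoreCheck(scheme, backends, "fail", EXIT_FAIL)
    if scheme == "sqlite":
        # Only the built-in store is write-probed.
        # Assumption: a failed probe shares the same failing exit code.
        ok = probe_sqlite(store_path)
        return StoreCheck(scheme, backends, "ok" if ok else "fail",
                          EXIT_OK if ok else EXIT_FAIL)
    # Plugin store: registered, but never constructed, so no network I/O.
    return StoreCheck(scheme, backends, "registered-not-probed", EXIT_OK)
```

Passing the probe in as a callable keeps the check itself free of side effects, so the "plugin stores are never constructed" property is easy to assert in a test.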
Both .claude/CLAUDE.md and AGENTS.md still claimed all subpackages have empty __init__.py files with no implementation yet -- plainly false at 0.6.4, where the pipeline runs end-to-end. Replace with an accurate one-liner pointing at the CHANGELOG.
…hitecture docs (#87)

Turn two preservation-layer invariants the architecture doc only asserted in prose into executable guards, and refresh the doc to the 0.6.x surface. No runtime behavior changes.

- Store-lifecycle harness (tests/unit/test_cli_store_lifecycle.py): one parametrized test asserts every read-only consumer (diff, export, history, inspect, matrix, replay, timeline, verify) closes its ReplayStore on both normal return and post-construction read failure. run is a producer and is covered in its own test file where the execution stack is already mocked.
- Centralize the consumers-never-re-resolve guarantee into the import-hygiene meta-guard: forbid falsifyai.verdict.resolver for all read-only command modules and the dispatcher, replacing seven scattered per-command tests that used two weaker techniques (in-process sys.modules deletion; AST direct-import scans that miss transitive imports). Closes a gap where replay had no check.
- Refresh docs/ARCHITECTURE.md to the full 11-command CLI surface and current perturbation/invariant families, and fix the stale future-commands line.

CHANGELOG [Unreleased] documentation note; no version bump.
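The reason a subprocess-based guard catches what an AST scan misses can be shown with a small sketch. This is an illustration of the general technique, not the contents of tests/meta/test_cli_import_hygiene.py: the helper name `loads_forbidden` is hypothetical, and standard-library modules stand in for the project's command modules and for `falsifyai.verdict.resolver`.

```python
import subprocess
import sys


def loads_forbidden(module: str, forbidden: str) -> bool:
    """Import `module` in a fresh interpreter and report whether `forbidden`
    (or any of its submodules) ended up in sys.modules.

    A fresh process means nothing imported earlier in the test session can
    mask or fake the result, and transitive imports are counted too.
    """
    probe = (
        "import importlib, sys\n"
        f"importlib.import_module({module!r})\n"
        f"prefix = {forbidden!r}\n"
        "hit = any(m == prefix or m.startswith(prefix + '.') for m in sys.modules)\n"
        "print('LOADED' if hit else 'CLEAN')\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True,
        text=True,
        check=True,  # a module that fails to import should fail the guard loudly
    )
    return result.stdout.strip().splitlines()[-1] == "LOADED"


if __name__ == "__main__":
    # Stand-ins: `fractions` imports `decimal`; `colorsys` does not.
    print(loads_forbidden("fractions", "decimal"))
    print(loads_forbidden("colorsys", "decimal"))
```

In a real suite the helper would typically be driven by a parametrized test over the dispatcher and each read-only command module, asserting that the result is false for every forbidden module name.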
Promotes accumulated dev work to main. Test/docs/hygiene hardening plus one diagnostic improvement; no runtime behavior change to the pipeline, no version bump.

"no implementation yet" agent context; ARCHITECTURE.md refreshed to the 11-command surface.

uv run pytest green (780) and ruff clean on the dev tip. After this squash-merges, reset dev to main (git reset --hard origin/main + --force-with-lease) per the branch workflow.