AI-first code review using multiple models to surface issues before human review.
# Run directly with uvx (no install needed)
uvx --from "git+https://github.com/benthomasson/ftl-code-review" code-review --help
# Or install as a tool
uv tool install "git+https://github.com/benthomasson/ftl-code-review"

# Review a branch (recommended - uses observe/review loop)
code-review review-loop -b feature-branch
# Review staged changes
code-review review-loop

As AI-assisted development scales, human reviewers become the bottleneck. This tool runs reviews through multiple AI models (Claude, Gemini) to:
- Catch issues before human review
- Surface disagreements between models as signals for attention
- Validate against specs for compliance checking
- Provide CI gates for automated quality control
If models disagree, humans should look closer.
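
The disagreement signal can be illustrated with a few lines of Python. This is a minimal sketch, not the tool's implementation: the function name `find_disagreements` and the shape of the `verdicts` dictionary are assumptions made for illustration, and the sample verdicts are taken from the example report in this README.

```python
from collections import defaultdict


def find_disagreements(verdicts: dict[str, dict[str, str]]) -> dict[str, dict[str, str]]:
    """Return the files on which the models did not all reach the same verdict.

    `verdicts` maps model name -> {file path -> verdict string}.
    This input shape is hypothetical, chosen only for the sketch.
    """
    by_file: dict[str, dict[str, str]] = defaultdict(dict)
    for model, per_file in verdicts.items():
        for path, verdict in per_file.items():
            by_file[path][model] = verdict
    # Keep only files where more than one distinct verdict was given.
    return {
        path: models
        for path, models in by_file.items()
        if len(set(models.values())) > 1
    }


verdicts = {
    "claude": {"src/auth/client.py": "PASS", "src/utils/logger.py": "PASS"},
    "gemini": {"src/auth/client.py": "PASS", "src/utils/logger.py": "CONCERN"},
}
disagreements = find_disagreements(verdicts)
for path, models in disagreements.items():
    print(path, models)
```

Only `src/utils/logger.py` is reported, because it is the one file where the two models differ.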
Run the full observe → review loop with automatic context gathering.
code-review review-loop -b feature-branch
code-review review-loop -b feature-branch --base main
code-review review-loop -b feature-branch --max-iterations 3

This:
- Auto-looks up test coverage from coverage-map.json (if present)
- Gathers observations (exception hierarchies, call graphs, etc.)
- Runs review with observations as context
- Saves all artifacts to reviews/<branch>/<timestamp>/

Run a single-pass review without observation gathering.
code-review review -b feature-branch
code-review review -b feature-branch --spec spec.md
code-review review --observations obs.json # With pre-gathered observations

Gather observations without running the review (for debugging).
code-review observe -b feature-branch -o observations.json

CI-friendly command with exit codes.
code-review gate -b feature-branch
# Exit 0 = PASS, Exit 1 = CONCERN, Exit 2 = BLOCK

Run lint checks on changed files.
code-review lint -b feature-branch
code-review lint -b feature-branch --fix

Show only disagreements between models.
code-review compare -b feature-branch

Validate implementation against a specification.
code-review check-spec spec.md -b feature-branch

Review specific files directly (not diffs). Useful for reviewing existing code or entire modules.
# Review a single file
code-review files src/auth/client.py
# Review multiple files
code-review files src/auth/client.py src/auth/oauth.py
# Review all Python files in a directory
code-review files src/auth/
# Review with glob patterns
code-review files "src/**/*.py" --glob

The review system can request additional context via observation tools:
| Tool | Purpose |
|---|---|
| exception_hierarchy | Show exception MRO and subclasses |
| raises_analysis | What exceptions a function raises |
| call_graph | What a function calls |
| find_usages | Where a symbol is used |
| git_blame | Who changed specific lines |
| test_coverage | Find tests for a file |
| coverage_map_tests | Find tests from coverage-map.json |
| coverage_map_files | Find files covered by tests |
| file_imports | Extract imports from a file |
| project_dependencies | Get pyproject.toml/requirements.txt |

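
To give a feel for what an observation such as exception_hierarchy might return, here is a minimal sketch built on Python's standard class introspection (`__mro__` and `__subclasses__()`). It is not the tool's own code: the function signature, the returned dictionary shape, and the two example exception classes are all hypothetical.

```python
def exception_hierarchy(exc_type: type[BaseException]) -> dict[str, list[str]]:
    """Summarize an exception class: its MRO and its direct subclasses."""
    return {
        "mro": [cls.__name__ for cls in exc_type.__mro__],
        "subclasses": sorted(cls.__name__ for cls in exc_type.__subclasses__()),
    }


class TransportError(OSError):
    """Hypothetical exception, defined only for this illustration."""


class TimeoutTransportError(TransportError):
    """Hypothetical subclass of TransportError."""


info = exception_hierarchy(TransportError)
print(info["mro"])
print(info["subclasses"])
```

A reviewer (human or model) can use this kind of summary to check, for example, whether an `except OSError` clause would also catch `TransportError`.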
If you generate a coverage-map.json with coverage-map, reviews automatically include test coverage:
# Generate coverage map (one-time, or after test changes)
uvx --from "git+https://github.com/benthomasson/coverage-map" \
coverage-map collect --source src --tests tests
# Run review (auto-detects coverage-map.json)
code-review review-loop -b feature-branch

Output:
Auto-lookup: 2 Python file(s) changed
src/auth/client.py: 13 tests
src/utils/logger.py: 91 tests
Auto-lookup found tests for 2 file(s)
Each change is assessed on multiple axes:
| Dimension | Verdicts |
|---|---|
| Correctness | VALID / QUESTIONABLE / BROKEN |
| Spec Compliance | MEETS / PARTIAL / VIOLATES / N/A |
| Test Coverage | COVERED / PARTIAL / UNTESTED |
| Integration | WIRED / PARTIAL / MISSING |
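
One way to model these dimensions in code is a set of enums plus a small record type. The sketch below is an assumption about representation only; the class names (`Correctness`, `SpecCompliance`, `Coverage`, `Integration`, `Assessment`) are hypothetical and are not taken from the tool's source. The verdict strings come from the table above.

```python
from dataclasses import dataclass
from enum import Enum


class Correctness(Enum):
    VALID = "VALID"
    QUESTIONABLE = "QUESTIONABLE"
    BROKEN = "BROKEN"


class SpecCompliance(Enum):
    MEETS = "MEETS"
    PARTIAL = "PARTIAL"
    VIOLATES = "VIOLATES"
    NOT_APPLICABLE = "N/A"


class Coverage(Enum):
    COVERED = "COVERED"
    PARTIAL = "PARTIAL"
    UNTESTED = "UNTESTED"


class Integration(Enum):
    WIRED = "WIRED"
    PARTIAL = "PARTIAL"
    MISSING = "MISSING"


@dataclass(frozen=True)
class Assessment:
    """One file's verdict on each of the four dimensions."""

    correctness: Correctness
    spec_compliance: SpecCompliance
    coverage: Coverage
    integration: Integration


# Looking an enum up by value rejects verdict strings outside the table.
assessment = Assessment(
    correctness=Correctness("VALID"),
    spec_compliance=SpecCompliance("N/A"),
    coverage=Coverage("COVERED"),
    integration=Integration("PARTIAL"),
)
print(assessment)
```

Using enums rather than bare strings means a misspelled or unexpected verdict raises `ValueError` at parse time instead of silently passing through.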
## Review: feat/oauth-retry
### src/auth/client.py
VERDICT: PASS
CORRECTNESS: VALID
TEST_COVERAGE: COVERED
REASONING: Retry logic correctly handles OSError and TransportError.
13 tests verify the behavior.
### src/utils/logger.py
VERDICT: CONCERN
CORRECTNESS: VALID
INTEGRATION: PARTIAL
REASONING: New log_with_context() added but not called from client.py.
---
## Model Agreement
- claude: 8P / 1C / 0B
- gemini: 7P / 2C / 0B
## Disagreements
- src/utils/logger.py: claude=PASS, gemini=CONCERN
GATE: CONCERN (no BLOCKs)
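
The relationship between the per-model tallies, the gate verdict, and the exit codes of the gate command can be sketched as follows. The aggregation rule used here (BLOCK if any model reports a block, otherwise CONCERN if any model reports a concern, otherwise PASS) is an assumption inferred from the sample report, not a documented rule; the function names and the regular expression are likewise hypothetical. Only the exit-code mapping is taken from this README.

```python
import re

# Exit codes as documented for the gate command.
EXIT_CODES = {"PASS": 0, "CONCERN": 1, "BLOCK": 2}

# Matches tally lines such as "- claude: 8P / 1C / 0B".
TALLY = re.compile(r"^- (?P<model>\w+): (?P<p>\d+)P / (?P<c>\d+)C / (?P<b>\d+)B$")


def parse_tallies(report: str) -> dict[str, tuple[int, int, int]]:
    """Extract (pass, concern, block) counts per model from report text."""
    tallies = {}
    for line in report.splitlines():
        match = TALLY.match(line.strip())
        if match:
            tallies[match["model"]] = (int(match["p"]), int(match["c"]), int(match["b"]))
    return tallies


def gate_verdict(tallies: dict[str, tuple[int, int, int]]) -> str:
    """Assumed rule: the worst verdict reported by any model wins."""
    if any(blocks > 0 for _, _, blocks in tallies.values()):
        return "BLOCK"
    if any(concerns > 0 for _, concerns, _ in tallies.values()):
        return "CONCERN"
    return "PASS"


report = """\
## Model Agreement
- claude: 8P / 1C / 0B
- gemini: 7P / 2C / 0B
"""
tallies = parse_tallies(report)
verdict = gate_verdict(tallies)
print(verdict, EXIT_CODES[verdict])
```

Under this assumed rule, the sample tallies yield CONCERN, which corresponds to exit code 1.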
The review-loop command saves artifacts to reviews/<branch>/<timestamp>/:
reviews/feat-oauth-retry/2026-02-26_17-36-02/
├── 00-auto-coverage.json # Auto-looked up coverage
├── 01-observe-prompt.txt # Observation prompt
├── 01-observe-response.txt # Model's observation requests
├── 01-observations.json # Observation results
├── 01-review-prompt.txt # Review prompt
├── 01-claude-response.txt # Claude's review
├── 01-gemini-response.txt # Gemini's review
├── observations.json # All observations combined
└── report.md # Final report
- Python 3.11+
- claude CLI installed and authenticated
- gemini CLI installed and authenticated

Check availability:
code-review models

- coverage-map - Map source files to tests
- multi-model-review - Paper review (inspiration)
MIT