feat(doctor): ecp doctor health check + skill install diff by coseto6125 · Pull Request #451 · coseto6125/egent-code-plexus

coseto6125 · 2026-05-25T10:33:26Z

Why

Skill install was a blind destructive overwrite (remove_dir_all + copy) — no diff, no detection, no way to tell a user-edited skill from a stale one. That's how a stale ecp scan reference sat in an installed skill undetected. Two features over a shared diff engine fix both the reactive (install-time) and proactive (health-check) sides.

What

skill_diff engine (skill_fs.rs): pure comparison of a skill's repo source vs its installed copy — per-file Added / Removed / Modified / Unchanged, with a unified diff (via similar) for Modified files and a local-edit flag when the destination predates this install (a hand-edited skill about to be overwritten is surfaced before the copy).

install (ecp admin claude install skills <target>): now always prints the diff of what it changes. --dry-run prints the diff without writing. The target moved from a clap subcommand to a value-enum positional, so --dry-run works in any position:

ecp admin claude install skills simplify --dry-run   # flag last (natural)
ecp admin claude install skills --dry-run simplify   # flag first
ecp admin claude install skills --dry-run            # defaults to all

ecp doctor (new top-level verb): aggregates checks, each → ok/warn/fail + remediation:

skill:<name> — freshness via skill_diff (stale = repo source differs)
index — graph staleness via auto_ensure::ensure_index
host:<tool> — integration consistency via each host's status()
config:* — ECP_HOME writable, ~/.claude/skills present

Default read-only; --fix reinstalls stale skills + rebuilds a stale index (host/config are report-only — never rewrites user-owned host configs). Non-zero exit on any Fail so CI can gate on it.

Notes

similar 3.1 chosen over 2.x for the 3.0 Myers/Histogram diff improvements; MSRV 1.85 is well under the toolchain in use (1.95).
git diff in review left untouched — it compares two git commits and needs git's hunk line-numbers + rename detection; similar only compares in-memory text. They solve different problems; similar is for the non-version-controlled installed-vs-repo case.

Tests

skill_diff: 5-state classification + local-edit flag + missing-dst + identical-tree (tempdir fixtures).
doctor host map_status: installed→ok, outdated→warn+remediation, missing-optional→ok (not fail).
Manual: ecp doctor correctly flagged the live stale skill:ecp / skill:simplify drift; --dry-run prints the unified diff with the local-edit warning.

github-actions · 2026-05-25T10:44:43Z

ecp impact cache (0 symbols) — internal, used by ecp dev pr-analyze

[]

`<anonymous:line:col>` names shift whenever an edit moves line numbers, so two PRs editing the same file (e.g. builder.rs) share a large set of identical anonymous names by coincidence — a spurious cross-PR conflict that blocked this PR against #450/#451/#452/#453 with ~84 phantom overlaps each (none of which share a real changed symbol). Their true overlap is undetectable by name anyway, so filtering them removes only noise. Drops the phantom count to the real named-symbol set.

…ader hint (#455) * fix(inspect): type-stable tier/checks shape, drop UNKNOWN_TIER placeholder `ecp inspect` emitted `"checks": "[UNKNOWN_TIER] checks: <none recorded yet>"` on every heuristic edge — a 40-char constant that (a) wastes context window, (b) reads like a real tier named UNKNOWN_TIER so a consuming LLM may treat it as truth, and (c) is a string today but becomes structured data after T4-7, forcing a type flip that breaks LLM-generated parsing code. Mirror find-schema-bindings' established shape instead: a top-level `tier` label (the `unresolved` sentinel pre-T4-7) plus a `checks` object. Both are type-stable across the T4-7 transition — T4-7 only fills values, never restructures — so `inspect` and `find-schema-bindings` present one consistent schema to the same agent. A five-way neutral panel ranked the type-stable object over both the placeholder string and a null value. rename.rs text output and the inspect/rename tests updated to match. * feat(graph): ReadsField RelType + RawNode.field_reads schema scaffold Schema groundwork for recording cross-function reads of public struct/class fields (`obj.rel_path`, `self.count`), so `ecp impact <field>` returns readers instead of an empty result an LLM can't distinguish from "no impact". - RelType::ReadsField appended at enum end (rkyv discriminant-stable); from_str / as_str synced; not heuristic. - ResolveTarget::Field + NodeKind::is_property predicate, wired into all three resolver match arms (exhaustive check guards completeness). - RawNode.field_reads: Vec<String>, mechanically added to all 82 construction sites across 33 parsers. Parsers populate it next; builder emits edges filtered to exported targets next. - GRAPH_FORMAT_VERSION 11 -> 12 (RawNode serialized shape changed). The (A) graph-completeness justification and the deliberate public-only scope (consistent with the drop-locals panel decision) are documented on the RelType::ReadsField variant. * feat(builder): emit ReadsField edges from resolved field reads pass2_emit_node_edges now walks raw_node.field_reads (mirroring the calls loop), resolves each via ResolveTarget::Field, and emits a ReadsField edge to the Property target with reason "field_read". Self-edges are skipped. No public/private filter: the is_property resolver predicate already admits only Property nodes (locals are Variable, never matched), so the drop-locals fan-out the panel rejected cannot occur. Private-field readers are kept — they are real impact for in-module refactors. Schema docs on ReadsField and RawNode.field_reads corrected to match this resolved-target semantics. Parsers still populate field_reads as empty; per-language capture lands next, so no edges are produced yet. * feat(analyzer): capture field reads across 14 languages → ReadsField edges Adds calls::extract_field_reads, mirroring extract_calls: walks member-access read nodes, attaches the field name to the enclosing function/method's field_reads, skipping any access that is a call callee (handled by Calls). field_name_from descends Kotlin/Swift navigation_suffix to reach the field. Wired into 13 mainstream parsers with each language's member-access node kind: member_expression (TS/Dart), attribute (Python), field_access (Java), navigation_expression (Kotlin/Swift), member_access_expression (C#/PHP), selector_expression (Go), field_expression (Rust/C/C++). Two of the 14 are pinned negative cases (explicit, not silent gaps): - Ruby: obj.attr is a method call, already a Calls edge. - JavaScript: class fields aren't Property nodes yet (no queries.scm capture unlike TS), so the read has no resolvable target — tracked as a follow-up. Tests: reads_field_rust (end-to-end source→edge) + reads_field_all_langs (14-language coverage, 12 positive + 2 pinned negative). Full ecp-analyzer suite green (2323 passed) — no regression from the new capture. All 96 RawNode construction sites (prod + test) carry field_reads: Vec::new(). * feat(cli): render ReadsField + verify impact reaches field readers format::rel_to_str renders ReadsField as "reads_field". impact needs no change: its BFS skips only is_scope_containment edges, and ReadsField is neither containment nor heuristic, so a field's readers are reached automatically. cypher renders via the shared as_str path. reads_field_impact integration test pins the payoff end-to-end: `ecp impact timeout` reaches read_timeout through the ReadsField edge — the query that returned empty before this PR. * feat(cli): field-readers note when a field has zero ReadsField edges inspect adds `field_readers_note` and impact adds a stderr line when the target is a Property with no incoming ReadsField edges. Disambiguates the two empty-result meanings so an LLM doesn't read "no edges" as "safe to change": the field may be genuinely unread, OR its language doesn't capture field reads yet (JS class fields, Ruby attrs). The note is suppressed once any reader exists. Forwarded through the inspect wrapper like heuristic_note. reads_field_impact gains the inspect coverage: unread field carries the note, a read field does not. * chore: register v12 in VERSION_HISTORY + field_reads in ecp-cli RawNode sites - schema::VERSION_HISTORY gains the v12 entry (the bump test enforces it). - incremental.rs + its test construct RawNode, so they carry field_reads too. * test(analyzer): pin method-call is not captured as a field read Guards the is_call_callee + is_property resolver filter against the obvious false positive (obj.method() recorded as ReadsField to a method). * chore(bench): field_reads in resolver_lookup RawNode (pre-push clippy --all-targets) * fix(pr-analyze): exclude anonymous nodes from cross-PR conflict scan `<anonymous:line:col>` names shift whenever an edit moves line numbers, so two PRs editing the same file (e.g. builder.rs) share a large set of identical anonymous names by coincidence — a spurious cross-PR conflict that blocked this PR against #450/#451/#452/#453 with ~84 phantom overlaps each (none of which share a real changed symbol). Their true overlap is undetectable by name anyway, so filtering them removes only noise. Drops the phantom count to the real named-symbol set.

Two features over a shared skill_diff engine. skill_diff (skill_fs.rs): pure comparison of a skill's repo source against its installed copy — per-file Added/Removed/Modified/Unchanged, with a unified diff (via `similar`) for Modified files and a local-edit flag when the destination predates this install (so a hand-edited skill about to be overwritten is surfaced). install (admin claude install skills): now ALWAYS prints the diff of what it changes. `--dry-run` prints the diff without writing. The target moved from a clap subcommand to a value-enum positional so `--dry-run` works in any position (`skills simplify --dry-run` and `skills --dry-run simplify` both ok). `ecp doctor` (new top-level verb): aggregates checks — installed-skill freshness (reuses skill_diff), graph index staleness (reuses auto_ensure::ensure_index), host-integration consistency (reuses each host's status()), and config/path sanity (ECP_HOME writable, ~/.claude/skills present). Default is read-only; `--fix` reinstalls stale skills and rebuilds a stale index (host/config are report-only). Non-zero exit on any Fail so CI can gate on it. similar 3.1: picked over 2.x for the 3.0 Myers/Histogram improvements; MSRV 1.85 is well under the toolchain in use. Tests: skill_diff 5-state classification + local-edit flag (tempdir fixtures); doctor host map_status (installed/outdated/missing). docs/skills doctor.md added.

…eck --fix Expands `ecp doctor` per review feedback. Single-target runs: `ecp doctor [check]` runs one check; `ecp doctor [check] --fix` fixes just that one. `[check]` is a value-enum positional (skills/index/host/config/registry/version) so --fix sits in any position. New checks: - registry: reuses the diagnostics `registry_health` scan (promoted to pub(crate)) — orphan index dirs, missing graph/meta, corrupt meta. --fix removes orphan dirs only; missing/corrupt stay report-only (a rebuild, not a delete, is the safe fix). - version: compares CARGO_PKG_VERSION against the latest tag via `git ls-remote --tags` (no network-client dependency, reuses the hardened git wrapper). Report-only; offline degrades to a Warn. - config:git: git-on-PATH is a hard prerequisite → Fail when absent. Host coverage went from 3 to the full 10. Scripted hosts (claude/gemini mcp+native, codex mcp via run_install promoted to pub(crate)) auto-fix under --fix; stubs and interactive-only hosts stay report-only. --format json/toon added (CheckResult already Serialize). cli_surface tests: `doctor` registered in the inventory and removed from the old-command denylist (the old `doctor` was a coverage report folded into `summary`; this is a different feature reusing a freed name). Tests: version semver parse + latest-tag selection; host map_status. docs + SKILL.md command table updated.

…ests doctor is a maintenance/diagnostic command alongside `admin index` / `admin claude`, so it belongs under the hidden `admin` namespace, not at top level. Moved commands/doctor → commands/admin/doctor; AdminCommands::Doctor replaces the top-level Commands::Doctor. All remediation hints, SKILL.md, and doctor.md updated to `ecp admin doctor`. cli_surface tests: doctor registered in ADMIN_SUBCMDS, removed from TOP_LEVEL_COMMANDS, kept in the old-top-level denylist (it must never reappear at top level). Added registry-check coverage: extracted a pure `classify(health, orphan_fix)` so the warn/ok mapping and the "missing/corrupt are report-only, never fixed" invariant are unit-tested against hand-built RegistryHealth (fields promoted to pub(crate)). Orphan deletion stays in `check` (fs side-effect kept out of the pure fn). Doctor test count: registry 3 + host 3 + version 3 + skill_fs 4. Note: doctor registry's orphan category (unregistered repo dirs) is distinct from `admin prune --orphans` (registered repos whose worktree vanished) — they cover complementary gaps, no overlap.

doctor registry --fix was bare remove_dir_all; `admin prune` retires dirs via retire_dir_async (atomic rename + background delete). Use the same primitive in doctor so there's one retire path for the whole tool — no duplicate deletion logic. The orphan categories stay distinct (doctor: unregistered repo dirs; prune: registered repos whose worktree vanished), only the removal mechanism is now shared.

The --fix path had no test proving it actually changes the filesystem — only pure-function coverage of the classify/diff logic. Add near-E2E tests that build real fixtures and assert the on-disk outcome. registry --fix (3 tests): an ECP_HOME tempdir with an unregistered repo dir (orphan). `check_in(home, fix)` parameterises the home root so tests avoid mutating process-global ECP_HOME. Asserts: no --fix leaves the orphan on disk; --fix retires it from its original path (retire_dir_async renames synchronously) and reports fix_applied=true; a clean home reports ok. skills install (3 tests): `install_skills_at(target, dry_run, cwd, claude_home)` parameterises both the source root and the install root. Asserts: install copies the skill into claude_home; --dry-run writes nothing; a stale installed copy is overwritten with repo source. Doctor test count: 19 (registry 6, host 3, version 3, skill_fs 4, claude E2E 3).

install_skills now goes through install_skills_at (which inlines the SKILL.md existence check), leaving claude.rs's source_skill_dir with no callers — codex.rs has its own. Removing it clears the -D warnings push gate.

…ld leak on Windows Windows CI failures: - skill_fs FAIL: rel_path built via PathBuf::to_string_lossy rendered `\`, so `guides/new.md` became `guides\new.md`, breaking the status lookup (Option::unwrap on None at skill_fs.rs:185) and diff headers. Normalize to forward slashes via components join — stable across platforms. - ensure_fresh_warm_attach LEAK x2: warm-attach spawns a detached `sh` rebuild that outlives the test, which nextest flags as LEAK on Windows. Gate the spawn behind ECP_SKIP_BG_REBUILD; the two warm-attach tests set it (and EnvSnapshot restores it). Production behavior unchanged.

…d by #450

coseto6125 enabled auto-merge (squash) May 25, 2026 10:33

coseto6125 added the merge-queue Opt-in to Mergify merge queue label May 25, 2026

coseto6125 disabled auto-merge May 25, 2026 10:34

github-actions Bot added the ecp:risk-high ecp signal label May 25, 2026

coseto6125 enabled auto-merge (squash) May 25, 2026 10:50

coseto6125 removed the merge-queue Opt-in to Mergify merge queue label May 25, 2026

coseto6125 added 8 commits May 26, 2026 06:03

fix(doctor): ignore EnsureResult::Stale.needs_full_rebuild field adde…

08f0f1b

…d by #450

coseto6125 added the merge-queue Opt-in to Mergify merge queue label May 25, 2026

coseto6125 force-pushed the feat/ecp-doctor branch from e1d173f to 08f0f1b Compare May 25, 2026 22:05

github-actions Bot added ecp:risk-low ecp signal and removed ecp:risk-high ecp signal labels May 25, 2026

coseto6125 merged commit 6ae36e9 into main May 25, 2026
18 checks passed

coseto6125 deleted the feat/ecp-doctor branch May 25, 2026 22:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(doctor): ecp doctor health check + skill install diff#451

feat(doctor): ecp doctor health check + skill install diff#451
coseto6125 merged 8 commits into
mainfrom
feat/ecp-doctor

coseto6125 commented May 25, 2026

Uh oh!

github-actions Bot commented May 25, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

coseto6125 commented May 25, 2026

Why

What

Notes

Tests

Uh oh!

github-actions Bot commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions Bot commented May 25, 2026 •

edited

Loading