feat(graph): SchemaField nodes + MirrorsField heuristic edges (T4-7)#291
Merged
This was referenced May 21, 2026
coseto6125 added a commit that referenced this pull request May 21, 2026
Ubuntu CI was hitting os error 28 (No space left on device) at link time: 14 tree-sitter parsers × 30+ test binaries × default debug-info filled the runner's 14 GB of free space (last seen on PR #291 run 26240379100, where the tantivy.rlib link failed mid-cascade and callmeta_c_cpp was misreported as the failing test).

Two stacked mitigations:

1. Ubuntu-only `jlumbroso/free-disk-space` step before checkout: drops tool-cache + android + dotnet + haskell, reclaiming ~30 GB. macOS and Windows have 50+ GB free by default, so the step is skipped there.
2. Job-level `CARGO_PROFILE_TEST_DEBUG=0`: shrinks target/debug/deps/ by roughly 70%. Assertions stay on; only DWARF data is dropped.
coseto6125 added a commit that referenced this pull request May 21, 2026
Three recent PRs landed on main with overlapping schema changes but were not rebased against each other, leaving main in a state that fails to compile (CI red on 162c52d):

- #285 (T1-4 + T1-5 + T1-11): `Node.uid: StrRef → u64`, added `Node.owner_class: StrRef`, made `uid::compute` the canonical UID source.
- #291 (T4-7 SchemaField + MirrorsField): added `post_process/schema_field_mirrors.rs`, switched `RawSchemaField.{name, owner_class}` from `StrRef` to `Box<str>`.
- #292 (T7-2 per-symbol content_hash): added `Node.content_hash: u64`.

The five resulting compile errors are mechanical: each call site needs to be brought to the shape the PR author would have written if they had rebased against the other two. This commit applies that reconciliation without rewriting history (no force-push to main).

- `post_process/schema_field_mirrors.rs:99-118`: replace the `format!()` + `string_pool.add()` UID construction with `uid::compute(NodeKind::SchemaField, &path_str, Some(owner_name), field_name)`, the T1-5 canonical pattern. Adds the now-required `content_hash: 0` (synthetic mirror node, no source span, per the T7-2 doc convention) and `owner_class: string_pool.add(owner_name)` (T1-11 rename isolation key: a SchemaField like `Foo.id` correctly belongs to class `Foo`). Side effect: drops one heap allocation per mirror node (no more `format!()` String + pool.add round-trip).
- `protobuf/parser.rs:101-110`: replace `pool.add(&field_name)` and `pool.add(owner)` (StrRef-returning) with `field_name.into_boxed_str()` and `Box::from(owner.as_str())`. Drops the now-unused `pool: &mut StringPool` parameter from `extract_proto_fields` plus the `StringPool` allocation at the call site (3 cosmetic edits in one file).
- `python/parser.rs`: add the `use ecp_core::pool::StringPool;` import that the existing T5-2 event-topic wire-up at line 1094 already assumed.
Verified: `cargo check --workspace --all-targets --all-features`, `cargo clippy --workspace --all-targets --all-features -- -D warnings`, and `cargo test --workspace --no-fail-fast` (2805 passed, 15 ignored) all clean.
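For readers unfamiliar with the `Box<str>` conversions named in the `protobuf/parser.rs` bullet, the following is a minimal sketch of the idea. It does not reproduce the real `ecp-analyzer` types: the `RawSchemaField` struct here is a simplified stand-in and `make_field` is a hypothetical helper introduced only for illustration. It shows that a record owning its strings stays valid after the parser-local values that produced it go out of scope.

```rust
// Simplified stand-in for the real `RawSchemaField`; only the two string
// fields discussed in the commit message are modelled (assumption).
#[derive(Debug, Clone, PartialEq, Eq)]
struct RawSchemaField {
    name: Box<str>,
    owner_class: Box<str>,
}

// Hypothetical helper: builds a record that owns both strings.
// `field_name` is consumed; `owner` is borrowed and copied.
fn make_field(field_name: String, owner: &str) -> RawSchemaField {
    RawSchemaField {
        // Converts the String into a boxed str, dropping excess capacity.
        name: field_name.into_boxed_str(),
        // Copies the borrowed &str into a freshly allocated Box<str>.
        owner_class: Box::from(owner),
    }
}

fn main() {
    let field = {
        // These locals play the role of parser-scoped data.
        let owner = String::from("Foo");
        let name = String::from("id");
        make_field(name, owner.as_str())
        // `owner` is dropped at the end of this block; `field` is unaffected
        // because it holds its own copy.
    };
    assert_eq!(&*field.name, "id");
    assert_eq!(&*field.owner_class, "Foo");
    println!("{}.{}", field.owner_class, field.name);
}
```

Because the record owns its data, no pool handle has to be threaded through the extraction function, which is consistent with the commit dropping the `pool: &mut StringPool` parameter.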
coseto6125 added a commit that referenced this pull request May 21, 2026
… T7-2 (Node content_hash/owner_class)

The merges of #291 (T4-7) and #292 (T7-2) left two downstream files behind. Required by this branch's pre-push clippy gate; expected to drop on rebase once fix/main-compile-post-291-292 lands.

- crates/ecp-analyzer/src/protobuf/parser.rs: RawSchemaField now stores Box<str>, not StrRef; drop the per-call StringPool plumbing.
- crates/ecp-analyzer/src/post_process/schema_field_mirrors.rs: Node gained content_hash + owner_class fields, and SchemaField UIDs are now computed via uid::compute, not pool-interned format!() strings.
Summary
Closes the dead-data gap from T4-2/T4-3/T4-4: detector outputs (`RawSchemaField`) are now promoted to actual graph nodes + edges, unlocking the user-visible value of schema cross-binding.

What lands:

- `crates/ecp-analyzer/src/post_process/schema_field_mirrors.rs` runs after `class_membership`/`overrides`. It promotes each `RawSchemaField` to a `Node { kind: SchemaField, ... }` + `HasProperty` edge from the owning class, then buckets by `(name.to_lowercase(), SchemaType)` and emits pairwise `MirrorsField` heuristic edges (confidence 0.9) where the 4-point strict rubric is satisfied. D3 cluster semantics emit all pairs for k>=3 with uniform `(name, type, owner_class)`.
- `RawSchemaField.name`/`owner_class` switched from `StrRef` to `Box<str>`. The pre-T4-7 design interned into a per-file `StringPool` that the parser dropped at scope exit, leaving the StrRefs pointing to deallocated memory. Owning the string sidesteps the bug entirely (~16 B extra per field, no pool plumbing).
- `GRAPH_FORMAT_VERSION` bump: `NodeKind::SchemaField` and `RelType::MirrorsField` already exist on main (T0-1 / T-H1 shipped). The only addition is a `Hash` derive on `SchemaType` for the bucket key.

Acknowledged v1 gaps

`test_partial_match_emits_blindspot` + Phase 2 docs in `schema_field_mirrors.rs`.

Test plan

- `cargo test -p ecp-analyzer --test schema_field_mirror`: 6 pass, 1 ignored
- `cargo test -p ecp-analyzer --tests`: no regressions across the existing parser/schema/event suites
- `cargo clippy --workspace --tests --all-features -- -D warnings`: clean
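
The bucketing step described in the summary (group by lowercased name and `SchemaType`, then emit every pair within a bucket) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `schema_field_mirrors.rs`: the `Field` struct, the two-variant `SchemaType` enum, and `mirror_pairs` are hypothetical, and the 4-point strict rubric is not specified in this PR text, so a placeholder check (different owner classes) stands in for it.

```rust
use std::collections::HashMap;

// Hypothetical two-variant enum; the real `SchemaType` is not shown in the PR.
// `Hash` is derived so the type can be part of the bucket key.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum SchemaType {
    Int,
    Str,
}

// Simplified stand-in for a promoted schema field (assumption).
#[derive(Debug, Clone)]
struct Field {
    name: Box<str>,
    owner_class: Box<str>,
    ty: SchemaType,
}

// Placeholder for the rubric, which the PR text does not spell out.
// Here it only requires the two fields to belong to different classes.
fn passes_rubric(a: &Field, b: &Field) -> bool {
    a.owner_class != b.owner_class
}

// Returns sorted index pairs (i, j) with i < j for fields that share a bucket
// and pass the placeholder rubric. A bucket of k members yields up to
// k * (k - 1) / 2 pairs.
fn mirror_pairs(fields: &[Field]) -> Vec<(usize, usize)> {
    let mut buckets: HashMap<(String, SchemaType), Vec<usize>> = HashMap::new();
    for (i, f) in fields.iter().enumerate() {
        buckets
            .entry((f.name.to_lowercase(), f.ty))
            .or_default()
            .push(i);
    }

    let mut pairs = Vec::new();
    for members in buckets.values() {
        for (pos, &i) in members.iter().enumerate() {
            for &j in &members[pos + 1..] {
                if passes_rubric(&fields[i], &fields[j]) {
                    pairs.push((i, j));
                }
            }
        }
    }
    // HashMap iteration order is unspecified, so sort for stable output.
    pairs.sort_unstable();
    pairs
}

fn field(owner: &str, name: &str, ty: SchemaType) -> Field {
    Field { name: Box::from(name), owner_class: Box::from(owner), ty }
}

fn main() {
    let fields = vec![
        field("Foo", "id", SchemaType::Int),
        field("Bar", "ID", SchemaType::Int),
        field("Baz", "Id", SchemaType::Int),
        field("Foo", "name", SchemaType::Str),
        field("Bar", "id", SchemaType::Str),
    ];
    let pairs = mirror_pairs(&fields);
    assert_eq!(pairs, vec![(0, 1), (0, 2), (1, 2)]);
    println!("{:?}", pairs);
}
```

In this sketch the three `id`/`Int` fields form one bucket of size 3 and yield all three pairs, while the `Str`-typed `id` lands in its own bucket because the type is part of the key.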