diff --git a/CHANGELOG.md b/CHANGELOG.md index 0fb4b2d..273cb73 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,67 @@ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [1.5.1] - 2026-06-09 + +Release fixing a class of **macro-token blindness** that ran across the +codebase, plus a new (backward-compatible) `allow_prelude_glob` architecture +option. syn's visitor +treats a macro body as an opaque token stream, so calls, `self` uses, and +component renders inside `vec![…]`, `format!(…)`, and DSL macros like dioxus +`rsx!{…}` were invisible to every analyzer that only walks the AST — producing +false DEAD_CODE, TQ_NO_SUT, TQ_UNTESTED, and SRP-cohesion findings, and *missed* +forbidden calls in the architecture matchers. Seven sites had each reinvented a +weak `Punctuated` parse blind to the `;`-repeat and block-bodied +forms; they now share one recovery helper. + +### Added +- **`adapters/shared/macro_tokens.rs`** — the single source of macro-body + recovery. `recover_exprs` escalates comma-list → braced-block (handles + `vec![x; n]` repeat and `;`-separated bodies, lifting trailing `Stmt::Macro` + back to an `Expr::Macro` and keeping a `let … else { … }` diverging branch) → + single-expr. `idents_in_call_position` / + `tokens_reference_ident` are the raw, never-fail fallback for DSL bodies + (`rsx!{ Component { .. } }`) that parse as no structured expression. +- **`[[architecture.pattern]] allow_prelude_glob` (bool, default `true`).** When + a pattern sets `forbid_glob_import = true`, idiomatic `*::prelude::*` re-export + globs (`std`/`dioxus`/`bevy`; a `prelude` segment at any depth) are exempt by + default. Set `allow_prelude_glob = false` to forbid even prelude globs. The + `forbid_glob_import` matcher stays a "find all globs" primitive — the prelude + exemption is a policy decision applied at the analyzer layer, so projects keep + full control (unlike a hard-coded matcher exemption). + +### Fixed +- **Macro-embedded calls are now seen across all call-graph / cohesion sites.** + Dead-code (DRY-002), TQ-SUT (TQ-002), TQ reachability (TQ-003), SRP cohesion + (LCOM4), and the architecture `forbid_function_call`/`forbid_method_call`/ + `forbid_macro_call` matchers all route macro bodies through `recover_exprs`, + so a call inside `vec![helper(); 3]` or a method inside `format!(…)` is no + longer dropped. The reachability consumers additionally harvest idents in + *call/construction position* — a call `f(..)` (any case) or an UpperCamelCase + component/struct render `Component { .. }` (lowercase DSL element tags like + `div { .. }` and prop keys are excluded, so an unused `fn div()` is never + masked) — so a dioxus component referenced only via `rsx!{ Component { .. } }` + (struct syntax the structured parse sees but does not record as a call) is + recognised as used — clearing false DEAD_CODE / NO_SUT on component-heavy UI + crates. This is the safe direction for findings: an extra recovered reference + only ever *suppresses* a finding, never raises a false one (a rare colliding + name can mask a true positive — the accepted conservative bias). The + architecture `forbid_*` matchers deliberately use only the structured path, + never the positional fallback, since over-collection there would manufacture + false violations. +- **SLM (self-less method) no longer false-fires on macro `self` use.** A `self` + referenced only inside `format!("{}", self.x)`, `write!(…)`, or any macro is + now found via a raw token scan (`tokens_reference_ident`), replacing the old + `matches!`-only special case. + +### Notes +- **IOSP own-call counting stays deliberately macro-blind.** Counting own-calls + inside macro bodies would newly flag real macro-driven render code (a + component holding a conditional while rendering own components in `rsx!`) as + an IOSP Violation — a breaking change out of scope here. This is the safe + direction (under-reports, never false-positive) and is pinned by a + characterization test. + ## [1.5.0] - 2026-06-05 Minor release with one **breaking** config change. rustqual stops shipping a diff --git a/Cargo.lock b/Cargo.lock index b8c6a68..38816cc 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -704,7 +704,7 @@ dependencies = [ [[package]] name = "rustqual" -version = "1.5.0" +version = "1.5.1" dependencies = [ "clap", "clap_complete", diff --git a/Cargo.toml b/Cargo.toml index d567aa7..e0fc44d 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -1,6 +1,6 @@ [package] name = "rustqual" -version = "1.5.0" +version = "1.5.1" edition = "2021" description = "Comprehensive Rust code quality analyzer — seven dimensions: IOSP, Complexity, DRY, SRP, Coupling, Test Quality, Architecture" license = "MIT" diff --git a/book/reference-configuration.md b/book/reference-configuration.md index d210df2..218b071 100644 --- a/book/reference-configuration.md +++ b/book/reference-configuration.md @@ -256,6 +256,7 @@ Repeatable symbol-pattern rules. | `forbid_method_call` | List of method names to forbid (`unwrap`, `expect`, …) | | `forbid_macro_call` | List of macro names to forbid (`println`, `dbg`, …) | | `forbid_glob_import` | `true` to forbid `use foo::*;` | +| `allow_prelude_glob` | When forbidding globs, still allow `*::prelude::*` re-exports (`std`/`dioxus`/`bevy`). Default `true`; set `false` to forbid even prelude globs | | `forbidden_in` | Globs where the rule fires | | `allowed_in` | Globs where the rule is exempted | | `except` | Globs in `forbidden_in` that are exempted | diff --git a/src/adapters/analyzers/architecture/analyzer.rs b/src/adapters/analyzers/architecture/analyzer.rs index a3b4f27..502a073 100644 --- a/src/adapters/analyzers/architecture/analyzer.rs +++ b/src/adapters/analyzers/architecture/analyzer.rs @@ -168,7 +168,10 @@ fn build_globset(patterns: &[String]) -> Option { /// Run every active matcher of `pattern` on one parsed file. /// Integration: iterator-chain over matchers, collects findings. -fn run_pattern_matchers(file: &crate::ports::ParsedFile, pattern: &SymbolPattern) -> Vec { +pub(super) fn run_pattern_matchers( + file: &crate::ports::ParsedFile, + pattern: &SymbolPattern, +) -> Vec { let rule_id = format!("architecture/pattern/{}", pattern.name); let mut out = Vec::new(); @@ -204,6 +207,7 @@ fn run_pattern_matchers(file: &crate::ports::ParsedFile, pattern: &SymbolPattern let hits = find_glob_imports(&file.path, &file.ast); out.extend( hits.into_iter() + .filter(|h| !pattern.allow_prelude_glob || !is_prelude_glob(h)) .map(|h| match_to_finding(h, &rule_id, pattern)), ); } @@ -224,6 +228,18 @@ fn run_pattern_matchers(file: &crate::ports::ParsedFile, pattern: &SymbolPattern out } +/// True if a glob-import hit targets a `*::prelude::*` re-export (a `prelude` +/// segment at any depth) — the idiomatic `std`/`dioxus`/`bevy` pattern exempted +/// by `allow_prelude_glob`. The matcher reports every glob; this policy predicate +/// decides which are forgiven. Operation: kind match + segment scan, no own calls. +fn is_prelude_glob(hit: &MatchLocation) -> bool { + matches!( + &hit.kind, + ViolationKind::GlobImport { base_path } + if base_path.split("::").any(|seg| seg == "prelude") + ) +} + // ── layer rule ───────────────────────────────────────────────────────── /// Run the layer rule against the whole parsed workspace. diff --git a/src/adapters/analyzers/architecture/call_parity_rule/calls/mod.rs b/src/adapters/analyzers/architecture/call_parity_rule/calls/mod.rs index 018e6ee..1a02bdc 100644 --- a/src/adapters/analyzers/architecture/call_parity_rule/calls/mod.rs +++ b/src/adapters/analyzers/architecture/call_parity_rule/calls/mod.rs @@ -122,42 +122,13 @@ mod scope; mod visit; /// Best-effort extraction of expressions from a macro token stream. -/// Most macros accept comma-separated exprs (`assert!(a, b)`, -/// `format!("{}", x)`), but block-like bodies (`tokio::select! { ... }`) -/// and separator-`;` variants (`vec![x; n]`) don't. We try three -/// strategies in order: -/// 1. Comma-separated `syn::Expr` list (covers ~90% of macro calls). -/// 2. Brace-wrapped parse as a `syn::Block` — extracts every statement -/// expression, covering block-bodied and `;`-separated forms. -/// 3. Single `syn::Expr` — for macros whose argument is one expression. -/// -/// Still silent-skips on total parse failure (extern-DSL macros, custom -/// grammar) — a documented limitation of syntax-level call-graph -/// construction. +/// Thin alias for the shared [`crate::adapters::shared::macro_tokens::recover_exprs`] +/// — the single source of the comma-list / `;`-repeat / block-bodied / single-expr +/// recovery strategy used by every macro-aware collector. Kept as a local name +/// so the call_parity call-graph code reads in its own vocabulary. +/// Integration: delegates to the shared helper, no logic. pub(super) fn parse_macro_tokens(tokens: proc_macro2::TokenStream) -> Vec { - use syn::parse::Parser; - use syn::punctuated::Punctuated; - use syn::Token; - let parser = Punctuated::::parse_terminated; - if let Ok(exprs) = parser.parse2(tokens.clone()) { - return exprs.into_iter().collect(); - } - let braced = quote::quote! { { #tokens } }; - if let Ok(block) = syn::parse2::(braced) { - return block - .stmts - .into_iter() - .filter_map(|stmt| match stmt { - syn::Stmt::Expr(e, _) => Some(e), - syn::Stmt::Local(l) => l.init.map(|init| *init.expr), - _ => None, - }) - .collect(); - } - if let Ok(expr) = syn::parse2::(tokens) { - return vec![expr]; - } - Vec::new() + crate::adapters::shared::macro_tokens::recover_exprs(&tokens) } /// Project an inferred receiver type to the canonical call-graph diff --git a/src/adapters/analyzers/architecture/matcher/function_call.rs b/src/adapters/analyzers/architecture/matcher/function_call.rs index e8c33bf..e0fb298 100644 --- a/src/adapters/analyzers/architecture/matcher/function_call.rs +++ b/src/adapters/analyzers/architecture/matcher/function_call.rs @@ -74,15 +74,16 @@ impl<'ast> Visit<'ast> for FunctionCallVisitor<'_> { } fn visit_macro(&mut self, node: &'ast syn::Macro) { - // Macro token streams are invisible to the default visitor. Parse - // the token stream as comma-separated expressions so calls inside - // `format!("{}", Box::new(x))` are caught. - use syn::punctuated::Punctuated; - if let Ok(args) = syn::parse::Parser::parse2( - Punctuated::::parse_terminated, - node.tokens.clone(), - ) { - args.iter().for_each(|expr| visit::visit_expr(self, expr)); + // Macro token streams are invisible to the default visitor. Recover the + // embedded exprs (comma-list, `;`-repeat, and block-bodied forms) so + // forbidden calls inside `format!("{}", Box::new(x))`, `vec![f(); n]`, + // etc. are caught. Structured-only by design: unlike the reachability + // consumers, a forbid-matcher must NOT use the raw positional fallback + // (it would manufacture false violations), so a forbidden call inside an + // unparseable extern-DSL body is best-effort-missed rather than risk a + // false positive. + for expr in crate::adapters::shared::macro_tokens::recover_exprs(&node.tokens) { + visit::visit_expr(self, &expr); } visit::visit_macro(self, node); } diff --git a/src/adapters/analyzers/architecture/matcher/glob_import.rs b/src/adapters/analyzers/architecture/matcher/glob_import.rs index cc53047..d11df34 100644 --- a/src/adapters/analyzers/architecture/matcher/glob_import.rs +++ b/src/adapters/analyzers/architecture/matcher/glob_import.rs @@ -35,6 +35,11 @@ impl GlobImportVisitor<'_> { segments.pop(); } syn::UseTree::Glob(g) => { + // Report EVERY glob (including `self::*`, `super::*`, and + // `*::prelude::*`). Policy decides what is allowed: path scope + // via `forbidden_in`/`except`, and the prelude exemption via + // `allow_prelude_glob`, both applied at the analyzer layer. The + // matcher stays a dumb "find all globs" primitive. let base_path = segments.join("::"); let start = g.star_token.span().start(); self.hits.push(MatchLocation { diff --git a/src/adapters/analyzers/architecture/matcher/macro_call.rs b/src/adapters/analyzers/architecture/matcher/macro_call.rs index 2986428..c815169 100644 --- a/src/adapters/analyzers/architecture/matcher/macro_call.rs +++ b/src/adapters/analyzers/architecture/matcher/macro_call.rs @@ -41,14 +41,14 @@ impl<'ast> Visit<'ast> for MacroCallVisitor<'_> { } } // Descend into the macro token stream so nested macros (e.g. - // `vec![format!(...)]`) are caught. Matches the same parse trick - // the rustqual call-target collector uses. - use syn::punctuated::Punctuated; - if let Ok(args) = syn::parse::Parser::parse2( - Punctuated::::parse_terminated, - node.tokens.clone(), - ) { - args.iter().for_each(|expr| visit::visit_expr(self, expr)); + // `vec![format!(...)]`) are caught — recover the embedded exprs + // (comma-list, `;`-repeat, and block-bodied forms) via the shared + // helper the rustqual call-target collector also uses. Structured-only + // by design: a forbid-matcher must NOT use the raw positional fallback + // (it would manufacture false violations), so a nested macro inside an + // unparseable extern-DSL body is best-effort-missed. + for expr in crate::adapters::shared::macro_tokens::recover_exprs(&node.tokens) { + visit::visit_expr(self, &expr); } visit::visit_macro(self, node); } diff --git a/src/adapters/analyzers/architecture/matcher/method_call.rs b/src/adapters/analyzers/architecture/matcher/method_call.rs index 301e6e3..376fd05 100644 --- a/src/adapters/analyzers/architecture/matcher/method_call.rs +++ b/src/adapters/analyzers/architecture/matcher/method_call.rs @@ -80,16 +80,15 @@ impl<'ast> Visit<'ast> for MethodCallVisitor<'_> { } fn visit_macro(&mut self, node: &'ast syn::Macro) { - // Macro token streams are not parsed by syn's default visitor, but - // we want to catch method calls inside `format!("{}", x.unwrap())` - // and similar. Parse the token stream as a comma-separated list of - // expressions (works for most function-like macros). - use syn::punctuated::Punctuated; - if let Ok(args) = syn::parse::Parser::parse2( - Punctuated::::parse_terminated, - node.tokens.clone(), - ) { - args.iter().for_each(|expr| visit::visit_expr(self, expr)); + // Macro token streams are not parsed by syn's default visitor, but we + // want to catch method calls inside `format!("{}", x.unwrap())` and + // similar. Recover the embedded exprs (comma-list, `;`-repeat, and + // block-bodied forms) and feed them through the visitor. Structured-only + // by design: a forbid-matcher must NOT use the raw positional fallback + // (it would manufacture false violations), so a method call inside an + // unparseable extern-DSL body is best-effort-missed. + for expr in crate::adapters::shared::macro_tokens::recover_exprs(&node.tokens) { + visit::visit_expr(self, &expr); } visit::visit_macro(self, node); } diff --git a/src/adapters/analyzers/architecture/matcher/tests/function_call.rs b/src/adapters/analyzers/architecture/matcher/tests/function_call.rs index 4949d8c..3c9229c 100644 --- a/src/adapters/analyzers/architecture/matcher/tests/function_call.rs +++ b/src/adapters/analyzers/architecture/matcher/tests/function_call.rs @@ -136,6 +136,24 @@ fn descends_into_macro_tokens() { assert_eq!(hits.len(), 1, "{hits:?}"); } +#[test] +fn descends_into_repeat_form_macro_tokens() { + // `vec![Box::new(42); 3]` — the `;`-repeat body fails a comma-expr parse; + // the shared recovery's block fallback must still surface the forbidden + // call, otherwise `forbid_function_call` silently misses it. + let src = r#" + fn main() { + let _v = vec![Box::new(42); 3]; + } + "#; + let hits = find(src, &["Box::new"]); + assert_eq!( + hits.len(), + 1, + "forbidden call inside a vec![_; n] repeat macro must be caught: {hits:?}" + ); +} + // ── turbofish tolerance ────────────────────────────────────────────── #[test] diff --git a/src/adapters/analyzers/architecture/matcher/tests/glob_import.rs b/src/adapters/analyzers/architecture/matcher/tests/glob_import.rs index aa294f3..b2a433c 100644 --- a/src/adapters/analyzers/architecture/matcher/tests/glob_import.rs +++ b/src/adapters/analyzers/architecture/matcher/tests/glob_import.rs @@ -126,6 +126,27 @@ fn does_not_match_empty_file() { assert!(hits.is_empty()); } +#[test] +fn matches_prelude_glob_at_matcher_level() { + // The matcher is a dumb "find ALL globs" primitive — it reports prelude + // globs too (`crate::prelude::*`, `dioxus::prelude::*`, …). The idiomatic + // exemption is a POLICY decision applied at the analyzer layer via + // `allow_prelude_glob` (see `run_pattern_matchers`), not here — consistent + // with the `self::*`/`super::*` handling above. + for src in [ + "use crate::prelude::*;", + "use dioxus::prelude::*;", + "use some::nested::prelude::*;", + ] { + let hits = find(src); + assert_eq!( + hits.len(), + 1, + "matcher must report the prelude glob (policy decides exemption), got {hits:?} for `{src}`" + ); + } +} + #[test] fn line_number_points_to_glob_statement() { let src = "\n\n\nuse foo::*;\n"; diff --git a/src/adapters/analyzers/architecture/matcher/tests/macro_call.rs b/src/adapters/analyzers/architecture/matcher/tests/macro_call.rs index 374f279..66bd2c9 100644 --- a/src/adapters/analyzers/architecture/matcher/tests/macro_call.rs +++ b/src/adapters/analyzers/architecture/matcher/tests/macro_call.rs @@ -82,6 +82,24 @@ fn matches_macro_inside_macro_args() { ); } +#[test] +fn matches_macro_inside_repeat_form_macro_args() { + // `vec![format!("inner"); 3]` — the `;`-repeat outer body needs the shared + // recovery's block fallback for the nested forbidden macro to be found. + let src = r#" + fn run() { + let v = vec![format!("inner"); 3]; + let _ = v; + } + "#; + let hits = find(src, &["format"]); + assert_eq!( + hits.len(), + 1, + "nested macro inside a vec![_; n] repeat macro must be found: {hits:?}" + ); +} + // ── Negative matches ────────────────────────────────────────────────── #[test] diff --git a/src/adapters/analyzers/architecture/matcher/tests/method_call.rs b/src/adapters/analyzers/architecture/matcher/tests/method_call.rs index ce5adc2..8a27054 100644 --- a/src/adapters/analyzers/architecture/matcher/tests/method_call.rs +++ b/src/adapters/analyzers/architecture/matcher/tests/method_call.rs @@ -187,3 +187,21 @@ fn matches_method_call_inside_macro_arg() { "method inside macro arg must match: {hits:?}" ); } + +#[test] +fn matches_method_call_inside_repeat_form_macro() { + // `vec![x.unwrap(); 3]` — the `;`-repeat body needs the shared recovery's + // block fallback; a plain comma-expr parse would miss the forbidden method. + let src = r#" + fn run() { + let x: Option = Some(1); + let _v = vec![x.unwrap(); 3]; + } + "#; + let hits = find(src, &["unwrap"]); + assert_eq!( + hits.len(), + 1, + "method inside a vec![_; n] repeat macro must match: {hits:?}" + ); +} diff --git a/src/adapters/analyzers/architecture/tests/analyzer.rs b/src/adapters/analyzers/architecture/tests/analyzer.rs index 5492da7..cf8ae6e 100644 --- a/src/adapters/analyzers/architecture/tests/analyzer.rs +++ b/src/adapters/analyzers/architecture/tests/analyzer.rs @@ -85,6 +85,7 @@ fn pattern( forbid_item_kind: None, forbid_derive: None, forbid_glob_import: None, + allow_prelude_glob: true, regex: None, reason: String::new(), } @@ -230,3 +231,38 @@ fn analyze_emits_findings_for_layer_violation_only_when_enabled() { "disabled → empty" ); } + +#[test] +fn glob_policy_prelude_exemption_is_config_gated() { + // `forbid_glob_import` reports ALL globs at the matcher level; the analyzer + // policy then forgives `*::prelude::*` iff `allow_prelude_glob` (default + // true). The plain `crate::guts::*` glob always fires. + use crate::adapters::analyzers::architecture::analyzer::run_pattern_matchers; + let src = "use dioxus::prelude::*;\nuse crate::guts::*;\n"; + let file = crate::ports::ParsedFile { + path: "src/a.rs".into(), + content: src.into(), + ast: syn::parse_file(src).expect("parse fixture"), + }; + let mut pat = pattern(None, None); + pat.forbid_path_prefix = None; + pat.forbid_glob_import = Some(true); + + // Default (true): prelude glob (line 1) forgiven, only the plain glob (line 2) fires. + pat.allow_prelude_glob = true; + let exempt = run_pattern_matchers(&file, &pat); + assert_eq!(exempt.len(), 1, "prelude must be exempt, got {exempt:?}"); + assert_eq!( + exempt[0].line, 2, + "the surviving finding is the non-prelude glob" + ); + + // Opt-out (false): every glob fires, including the prelude one. + pat.allow_prelude_glob = false; + let strict = run_pattern_matchers(&file, &pat); + assert_eq!( + strict.len(), + 2, + "allow_prelude_glob = false forbids even prelude globs, got {strict:?}" + ); +} diff --git a/src/adapters/analyzers/architecture/tests/rendering.rs b/src/adapters/analyzers/architecture/tests/rendering.rs index 1788dd3..ecc3633 100644 --- a/src/adapters/analyzers/architecture/tests/rendering.rs +++ b/src/adapters/analyzers/architecture/tests/rendering.rs @@ -172,6 +172,7 @@ fn match_to_finding_projects_every_field() { forbid_item_kind: None, forbid_derive: None, forbid_glob_import: None, + allow_prelude_glob: true, regex: None, reason: "no anon fns".into(), }; diff --git a/src/adapters/analyzers/dry/call_targets.rs b/src/adapters/analyzers/dry/call_targets.rs index eb67d77..201af43 100644 --- a/src/adapters/analyzers/dry/call_targets.rs +++ b/src/adapters/analyzers/dry/call_targets.rs @@ -173,16 +173,25 @@ impl<'ast> Visit<'ast> for CallTargetCollector { } fn visit_macro(&mut self, node: &'ast syn::Macro) { - // Parse macro arguments as expressions to find embedded function calls. - // Works for assert!(), assert_eq!(), format!(), vec![], etc. - use syn::punctuated::Punctuated; - if let Ok(args) = syn::parse::Parser::parse2( - Punctuated::::parse_terminated, - node.tokens.clone(), - ) { - args.iter() - .for_each(|expr| syn::visit::visit_expr(self, expr)); - } + // Macro bodies are opaque to syn's visitor; recover embedded exprs so + // calls inside assert!(), format!(), vec![] — including the `;`-repeat + // and block-bodied forms — become call-graph edges. + crate::adapters::shared::macro_tokens::recover_exprs(&node.tokens) + .iter() + .for_each(|expr| syn::visit::visit_expr(self, expr)); + // Always also harvest call/construction-position idents from the body. + // DSL component invocations use struct syntax (`Component { .. }`) that + // the structured visit parses but does NOT record as a call. Harvesting + // only idents in call/construction position (not prop keys or locals) + // keeps the over-collection tight; for reachability it only ever + // *suppresses* a finding, never raises a false one (a rare colliding + // name can mask a true positive — the accepted conservative bias). + let target = self.target(); + crate::adapters::shared::macro_tokens::idents_in_call_position(&node.tokens).for_each( + |id| { + target.insert(id); + }, + ); syn::visit::visit_macro(self, node); } diff --git a/src/adapters/analyzers/dry/tests/dead_code.rs b/src/adapters/analyzers/dry/tests/dead_code.rs index 076a6c8..5a80dfc 100644 --- a/src/adapters/analyzers/dry/tests/dead_code.rs +++ b/src/adapters/analyzers/dry/tests/dead_code.rs @@ -1197,6 +1197,117 @@ fn helper_reached_via_trait_blanket_dispatch_is_not_dead_code() { ); } +#[test] +fn fn_called_only_in_repeat_macro_arg_not_dead_code() { + // `caller` invokes `helper()` ONLY inside a `vec![_; n]` *repeat* macro. + // Before the macro-token fix, `visit_macro` (call_targets.rs) re-parsed the + // body only as a comma-separated expr list; the `;` separator failed that + // parse and EVERY call edge in the macro was dropped, so `helper` was + // wrongly reported dead. `recover_exprs`' block fallback now recovers it. + // + // The comma-expr form `vec![helper()]` was already handled; the fix covers + // the forms that don't parse as an expr list — repeat (`;`) and block + // bodies. See the bound-local control below for the non-macro baseline. + let code = r#" + fn helper() -> i32 { 42 } + fn caller() -> Vec { vec![helper(); 3] } + #[cfg(test)] + mod tests { + #[test] + fn t() { let _ = super::caller(); } + } + "#; + let parsed = parse(code); + let warnings = dead_code_warnings(&parsed); + assert!( + !warnings.iter().any(|w| w.function_name == "helper"), + "helper() called inside a vec![_; n] repeat macro must not be flagged dead, got {warnings:?}" + ); +} + +#[test] +fn component_referenced_only_in_dsl_macro_not_dead_code() { + // Dioxus-style: `LiveLogViewer` is a production component referenced only + // inside an `rsx!` DSL body whose nested elements + attributes parse as + // none of expr-list / block / expr. `recover_exprs` yields nothing, so the + // raw `idents_in_tokens` fallback must still register the reference — + // otherwise the component is wrongly flagged dead (the sovard symptom). + let code = r#" + fn LiveLogViewer() { let _ = 1; } + fn app() { rsx! { div { class: "log", LiveLogViewer {} } }; } + fn main() { app(); } + "#; + let parsed = parse(code); + let warnings = dead_code_warnings(&parsed); + assert!( + !warnings.iter().any(|w| w.function_name == "LiveLogViewer"), + "component referenced only inside an rsx! DSL macro must not be dead, got {warnings:?}" + ); +} + +#[test] +fn fn_named_like_a_lowercase_dsl_tag_is_still_dead_code() { + // `view` is an unused production fn whose name also appears as a lowercase + // DSL element tag (`view { .. }`) in a rendered body. A lowercase tag is not + // a component/struct, so the positional harvest must NOT record it — `view` + // stays flagged dead. Guards the UpperCamelCase brace-group gate. + let code = r#" + fn view() { let _ = 1; } + fn app() { rsx! { view { class: "x", Widget {} } }; } + fn main() { app(); } + "#; + let parsed = parse(code); + let warnings = dead_code_warnings(&parsed); + assert!( + warnings.iter().any(|w| w.function_name == "view"), + "a fn matching only a lowercase DSL tag must still be flagged dead, got {warnings:?}" + ); +} + +#[test] +fn fn_named_like_a_dsl_prop_key_is_still_dead_code() { + // Negative control for the positional raw harvest: `render_label` appears in + // the rsx! body ONLY as a prop key (`Widget { render_label: "x" }`), not in + // call/construction position. It must NOT be harvested as a call, so the + // genuinely-uncalled `render_label` stays flagged dead — guarding against + // the over-collection false-negative the raw-ident fallback would have had. + let code = r#" + fn render_label() { let _ = 1; } + fn app() { rsx! { Widget { render_label: "x" } }; } + fn main() { app(); } + "#; + let parsed = parse(code); + let warnings = dead_code_warnings(&parsed); + assert!( + warnings.iter().any(|w| w.function_name == "render_label"), + "a fn matching only a DSL prop key must still be flagged dead, got {warnings:?}" + ); +} + +#[test] +fn fn_called_via_bound_local_not_dead_code_control() { + // Control for `fn_called_only_in_repeat_macro_arg_not_dead_code`: the same + // call routed through a bound local (`let h = helper(); vec![h]`) is a + // plain AST call expression — not buried in macro tokens — so the edge is + // recorded and `helper` is correctly NOT dead. Isolates the bug to macro + // token streams, not to `vec!` itself. + let code = r#" + fn helper() -> i32 { 42 } + fn caller() -> Vec { let h = helper(); vec![h] } + #[cfg(test)] + mod tests { + #[test] + fn t() { let _ = super::caller(); } + } + "#; + let parsed = parse(code); + let warnings = dead_code_warnings(&parsed); + assert!( + !warnings.iter().any(|w| w.function_name == "helper"), + "helper() via a bound local must not be flagged dead, got {warnings:?}" + ); +} + #[test] fn test_only_called_fn_is_not_uncalled() { // A production fn called only from tests is TestOnly, NOT Uncalled — guards diff --git a/src/adapters/analyzers/dry/tests/wildcards.rs b/src/adapters/analyzers/dry/tests/wildcards.rs index 6fd037a..1c7fc1c 100644 --- a/src/adapters/analyzers/dry/tests/wildcards.rs +++ b/src/adapters/analyzers/dry/tests/wildcards.rs @@ -87,6 +87,25 @@ fn test_excludes_custom_prelude_glob() { assert!(warnings.is_empty(), "crate::prelude::* should be excluded"); } +#[test] +fn test_excludes_extern_crate_prelude_glob() { + // An external crate's prelude (`dioxus::prelude::*`, `bevy::prelude::*`) is + // the same idiomatic re-export pattern as std/crate preludes and must be + // excluded too — not only `prelude::*` / `crate::prelude::*`. + for code in [ + "use dioxus::prelude::*;", + "use bevy::prelude::*;", + "use some_crate::nested::prelude::*;", + ] { + let parsed = parse(code); + let warnings = detect_wildcard_imports(&parsed); + assert!( + warnings.is_empty(), + "extern-crate prelude glob must be excluded, got {warnings:?} for `{code}`" + ); + } +} + #[test] fn test_external_glob_detected() { let parsed = parse("use std::collections::*;"); diff --git a/src/adapters/analyzers/iosp/tests/root/own_calls_b.rs b/src/adapters/analyzers/iosp/tests/root/own_calls_b.rs index 7054fff..b957f96 100644 --- a/src/adapters/analyzers/iosp/tests/root/own_calls_b.rs +++ b/src/adapters/analyzers/iosp/tests/root/own_calls_b.rs @@ -148,3 +148,38 @@ fn test_mixed_leaf_and_integration_calls_safe() { f.classification ); } + +#[test] +fn own_call_only_inside_macro_is_not_counted_deliberate_safe_direction() { + // DELIBERATE non-fix (macro-token-blindness sweep): IOSP own-call counting + // does NOT descend into macro bodies — `iosp::visitor::walk_macro` only + // counts panic!/todo! for error-handling, never own-calls. So `helper()`, + // reachable only inside `vec![helper(); 1]`, is absent from `own_calls`. + // + // This is the safe direction (under-reports, never false-positive) and is + // kept on purpose: descending would newly flag real macro-driven render + // code — e.g. a dioxus component that renders its own components inside + // `rsx!` while holding a conditional — as an IOSP Violation. That is a + // breaking behaviour change, out of scope for fixing the macro *call-graph* + // blindness (dead-code / TQ / SRP / architecture), which only ever removes + // false positives. Flipping this must be a separately-scoped decision; this + // test guards the boundary. + let code = r#" + fn helper() -> i32 { 42 } + fn renders(flag: bool) -> Vec { + if flag { + vec![helper(); 1] + } else { + vec![] + } + } + "#; + let results = parse_and_analyze(code); + let f = results.iter().find(|f| f.name == "renders").unwrap(); + assert!( + !f.own_calls.iter().any(|c| c == "helper"), + "IOSP deliberately does not count an own-call hidden in a macro body; \ + got own_calls = {:?}", + f.own_calls + ); +} diff --git a/src/adapters/analyzers/srp/mod.rs b/src/adapters/analyzers/srp/mod.rs index 3f9387c..f1201ed 100644 --- a/src/adapters/analyzers/srp/mod.rs +++ b/src/adapters/analyzers/srp/mod.rs @@ -317,21 +317,17 @@ impl<'ast> Visit<'ast> for MethodBodyVisitor { } fn visit_macro(&mut self, node: &'ast syn::Macro) { - // Parse macro arguments as comma-separated expressions and feed - // them through our own `visit_expr` override, so `self.field` - // accesses and `self.method()` calls inside `debug_assert!(...)`, - // `assert_eq!(...)`, `format!(...)` etc. contribute to LCOM4 - // just like non-macro code. We must call `self.visit_expr(e)` - // (the override) rather than `syn::visit::visit_expr(self, e)` - // (the dispatcher) — the dispatcher skips our outer match arm - // and walks straight into sub-expressions, so the method-call - // ident itself would never be recorded. - use syn::punctuated::Punctuated; - if let Ok(args) = syn::parse::Parser::parse2( - Punctuated::::parse_terminated, - node.tokens.clone(), - ) { - args.iter().for_each(|expr| self.visit_expr(expr)); + // Recover the macro body's expressions and feed them through our own + // `visit_expr` override, so `self.field` accesses and `self.method()` + // calls inside `debug_assert!(...)`, `assert_eq!(...)`, `format!(...)` + // — including the `;`-repeat and block-bodied forms — contribute to + // LCOM4 just like non-macro code. We must call `self.visit_expr(e)` + // (the override) rather than `syn::visit::visit_expr(self, e)` (the + // dispatcher) — the dispatcher skips our outer match arm and walks + // straight into sub-expressions, so the method-call ident itself + // would never be recorded. + for expr in crate::adapters::shared::macro_tokens::recover_exprs(&node.tokens) { + self.visit_expr(&expr); } syn::visit::visit_macro(self, node); } diff --git a/src/adapters/analyzers/srp/tests/cohesion/lcom4_and_collector.rs b/src/adapters/analyzers/srp/tests/cohesion/lcom4_and_collector.rs index b256fd5..af7310a 100644 --- a/src/adapters/analyzers/srp/tests/cohesion/lcom4_and_collector.rs +++ b/src/adapters/analyzers/srp/tests/cohesion/lcom4_and_collector.rs @@ -132,6 +132,33 @@ fn method_body_visitor_sees_self_calls_inside_debug_assert_macro() { ); } +#[test] +fn method_body_visitor_sees_self_calls_inside_repeat_form_macro() { + // The `;`-repeat macro body `vec![self.validate(); 1]` fails a comma-expr + // parse; the shared recovery's block fallback must still surface the + // `self.validate()` link so cohesion (LCOM4) is not under-counted. + let code = r#" + struct Storage { buf: usize, active: bool } + impl Storage { + fn validate(&self) -> bool { self.active && self.buf > 0 } + fn append(&mut self, n: usize) { + self.buf = n; + let _ = vec![self.validate(); 1]; + } + } + "#; + let methods = collect_methods_for(code); + let append = methods + .iter() + .find(|m| m.method_name == "append") + .expect("append collected"); + assert!( + append.self_method_calls.contains("validate"), + "append should see self.validate() inside a vec![_; n] repeat macro, got: {:?}", + append.self_method_calls + ); +} + #[test] fn lcom4_unites_methods_linked_via_debug_assert_macro() { // Bug 2 reproducer. `append` only writes `extra`, which no other diff --git a/src/adapters/analyzers/structural/slm.rs b/src/adapters/analyzers/structural/slm.rs index 15d93f3..2288426 100644 --- a/src/adapters/analyzers/structural/slm.rs +++ b/src/adapters/analyzers/structural/slm.rs @@ -116,25 +116,15 @@ impl<'ast> Visit<'ast> for SelfRefChecker { } } if let syn::Expr::Macro(m) = expr { - if m.mac - .path - .segments - .last() - .map(|s| s.ident == "matches") - .unwrap_or(false) + // A macro body is an opaque token stream that syn's visitor never + // descends into, so a `self` used only inside `format!("{}", self.x)`, + // `matches!(self, …)`, `write!(f, "{}", self)` etc. would otherwise be + // invisible — falsely flagging the method as self-less. Scan the raw + // tokens for a `self` reference (robust for any macro grammar). + if crate::adapters::shared::macro_tokens::tokens_reference_ident(&m.mac.tokens, "self") { - let first_is_self = m - .mac - .tokens - .clone() - .into_iter() - .next() - .map(|t| t.to_string() == "self") - .unwrap_or(false); - if first_is_self { - self.has_self_ref = true; - return; - } + self.has_self_ref = true; + return; } } syn::visit::visit_expr(self, expr); diff --git a/src/adapters/analyzers/structural/tests/slm.rs b/src/adapters/analyzers/structural/tests/slm.rs index c25800d..7d70c4c 100644 --- a/src/adapters/analyzers/structural/tests/slm.rs +++ b/src/adapters/analyzers/structural/tests/slm.rs @@ -102,3 +102,19 @@ fn test_matches_macro_self_not_flagged() { "matches!(self, ...) should count as self reference" ); } + +#[test] +fn test_self_in_macro_arg_not_flagged() { + // A method whose ONLY self-reference is a macro argument (here `format!`) + // genuinely uses self — it must not trip SLM. `syn::visit` treats a macro + // body as an opaque token stream, so the `SelfRefChecker` has to scan macro + // tokens for `self` (it special-cases only `matches!` today → false positive). + let w = detect_in( + "struct S { name: String } impl S { fn label(&self) -> String { format!(\"[{}]\", self.name) } }", + ); + assert!( + w.is_empty(), + "self inside a macro arg (format!) must count as a self reference: {} warning(s)", + w.len() + ); +} diff --git a/src/adapters/analyzers/tq/mod.rs b/src/adapters/analyzers/tq/mod.rs index 121b9e0..323ba6e 100644 --- a/src/adapters/analyzers/tq/mod.rs +++ b/src/adapters/analyzers/tq/mod.rs @@ -125,16 +125,24 @@ impl<'ast> Visit<'ast> for FullCallGraphCollector { } fn visit_macro(&mut self, node: &'ast syn::Macro) { - // Parse macro arguments as expressions so calls inside vec![], assert!(), format!() - // etc. become edges in the graph. Without this, TQ-003 misses reachability through - // macro-wrapped calls (e.g. `vec![make_unmatched(path)]`). - use syn::punctuated::Punctuated; - if let Ok(args) = syn::parse::Parser::parse2( - Punctuated::::parse_terminated, - node.tokens.clone(), - ) { - args.iter() - .for_each(|expr| syn::visit::visit_expr(self, expr)); + // Macro bodies are opaque to syn's visitor; recover embedded exprs so + // calls inside vec![], assert!(), format!() — including the `;`-repeat + // and block-bodied forms — become edges in the graph. Without this, + // TQ-003 misses reachability through macro-wrapped calls + // (e.g. `vec![make_unmatched(path)]`). + crate::adapters::shared::macro_tokens::recover_exprs(&node.tokens) + .iter() + .for_each(|expr| syn::visit::visit_expr(self, expr)); + // Always also harvest call/construction-position idents so reachability + // flows through component renders. DSL components use struct syntax + // (`Component { .. }`) the structured visit parses but does not record as + // a call. Harvesting only call/construction position (not prop keys) + // keeps it tight; for TQ-003 it can only suppress a false "untested", + // never raise one. + if self.current_fn.is_some() { + self.current_calls.extend( + crate::adapters::shared::macro_tokens::idents_in_call_position(&node.tokens), + ); } syn::visit::visit_macro(self, node); } diff --git a/src/adapters/analyzers/tq/tests/coverage.rs b/src/adapters/analyzers/tq/tests/coverage.rs index 43a9f20..4e9b698 100644 --- a/src/adapters/analyzers/tq/tests/coverage.rs +++ b/src/adapters/analyzers/tq/tests/coverage.rs @@ -84,6 +84,46 @@ fn test_function_not_in_lcov_no_warning() { assert!(warnings.is_empty()); } +#[test] +fn cfg_excluded_fn_present_in_results_but_absent_from_lcov_not_flagged() { + // Repro for the wasm-only / cfg-gated coverage discrepancy: rustqual parses + // source cfg-agnostically (so `excluded_fn` IS in `all_results`), but the + // coverage run built a target where it was cfg'd out — so the file IS in the + // LCOV (its sibling `native_fn` was covered) while `excluded_fn` appears in + // NEITHER function_hits NOR line_hits. It must NOT be reported as Uncovered + // (TQ-004) or UntestedLogic (TQ-005): "not built in this target" ≠ "0 hits". + let native = make_func("native_fn", "src/lib.rs", 5); + let mut excluded = make_func("excluded_fn", "src/lib.rs", 12); + excluded.complexity = Some(ComplexityMetrics { + logic_occurrences: vec![LogicOccurrence { + kind: "if".to_string(), + line: 13, + }], + ..Default::default() + }); + let results = vec![native, excluded]; + let mut lcov = HashMap::new(); + // Only native_fn is in the coverage artifact (hit + its line); excluded_fn + // and its lines are entirely absent, exactly as llvm-cov emits when the + // function is cfg'd out of the built target. + lcov.insert( + "src/lib.rs".to_string(), + make_lcov_data(&[("native_fn", 3)], &[(5, 3)]), + ); + let uncovered = detect_uncovered_functions(&results, &lcov); + assert!( + !uncovered.iter().any(|w| w.function_name == "excluded_fn"), + "cfg-excluded fn absent from LCOV must not be Uncovered, got {uncovered:?}" + ); + let untested_logic = detect_untested_logic(&results, &lcov); + assert!( + !untested_logic + .iter() + .any(|w| w.function_name == "excluded_fn"), + "cfg-excluded fn's logic lines absent from LCOV must not be UntestedLogic, got {untested_logic:?}" + ); +} + #[test] fn test_function_excluded_by_is_test_flag() { // Test entry points are excluded via the analyzer-computed `is_test` diff --git a/src/adapters/analyzers/tq/tests/sut.rs b/src/adapters/analyzers/tq/tests/sut.rs index 6014be7..268adda 100644 --- a/src/adapters/analyzers/tq/tests/sut.rs +++ b/src/adapters/analyzers/tq/tests/sut.rs @@ -182,6 +182,86 @@ fn calls_prod_fn_known_only_via_declared_set() { ); } +#[test] +fn tq_no_sut_abc_triangulation() { + // A/B/C triangulation isolating the SUT mapper from macro blindness. + // A — plain call: has a SUT → must NOT fire TQ_NO_SUT (rule baseline). + // B — call only inside a comma-expr macro (`vec![target_b()]`): the SUT + // collector re-parses macro tokens as an expr list, so this IS seen → + // must NOT fire. (If it fired, TQ_NO_SUT would be a pure macro cascade.) + // C — genuinely no SUT: TQ_NO_SUT MUST fire (rule works). + let declared = vec![ + make_declared("target_a", false), + make_declared("target_b", false), + ]; + let warnings = detect_in( + r#" + #[test] + fn test_a() { assert_eq!(target_a(), 7); } + #[test] + fn test_b() { assert_eq!(vec![target_b()][0], 7); } + #[test] + fn test_c() { assert_eq!(1 + 1, 2); } + "#, + &declared, + "fn target_a() -> i32 { 7 } fn target_b() -> i32 { 7 }", + &[], + ); + let fired: Vec<&str> = warnings.iter().map(|w| w.function_name.as_str()).collect(); + assert_eq!( + fired, + vec!["test_c"], + "only test_c (no SUT) must fire; A=plain B=comma-macro are real SUT calls, got {fired:?}" + ); +} + +#[test] +fn tq_no_sut_satisfied_by_component_render_in_dsl_macro() { + // A UI test whose SUT call is rendering a component inside an `rsx!` DSL + // body (which parses as no structured exprs). The raw ident fallback must + // let the component count as the SUT, so NO_SUT does not fire — the sovard + // `live_log_is_a_polite_live_region` symptom. + let declared = vec![make_declared("LiveLogViewer", false)]; + let warnings = detect_in( + r#" + #[test] + fn live_log_renders() { + let _ = rsx! { div { class: "log", LiveLogViewer {} } }; + } + "#, + &declared, + "fn LiveLogViewer() {}", + &[], + ); + assert!( + warnings.is_empty(), + "rendering a component inside an rsx! DSL must satisfy SUT, got {warnings:?}" + ); +} + +#[test] +fn tq_no_sut_satisfied_by_repeat_macro_call() { + // Regression guard for the repeat-macro SUT blindness fixed in the + // macro-token sweep. The SUT mapper routes macro bodies through + // `recover_exprs` (in `test_references.rs`), whose block fallback handles the + // `;`-repeat form a comma-only parse missed. So `target()` inside + // `vec![target(); 1]` IS a real SUT call and TQ_NO_SUT must stay silent. + let declared = vec![make_declared("target", false)]; + let warnings = detect_in( + r#" + #[test] + fn test_repeat() { assert_eq!(vec![target(); 1][0], 7); } + "#, + &declared, + "fn target() -> i32 { 7 }", + &[], + ); + assert!( + warnings.is_empty(), + "target() called inside a vec![_; n] repeat macro is a real SUT call, got {warnings:?}" + ); +} + #[test] fn calls_fn_known_only_via_scope_functions() { // `scope_fn` is in the analysed scope but NOT in the declared set; the diff --git a/src/adapters/analyzers/tq/tests/untested.rs b/src/adapters/analyzers/tq/tests/untested.rs index d058a7a..df3c6a5 100644 --- a/src/adapters/analyzers/tq/tests/untested.rs +++ b/src/adapters/analyzers/tq/tests/untested.rs @@ -233,3 +233,22 @@ fn build_reaches_prod_set_seeds_prod_and_walks_callers_backward() { "a test fn reaching no prod code is excluded from the seed" ); } + +#[test] +fn full_call_graph_recovers_repeat_form_macro_call_edge() { + // `caller`'s only call to `helper` is inside a `vec![helper(); 3]` repeat + // macro. The comma-expr parse fails on the `;`; the shared recovery's block + // fallback must still produce a `caller → helper` edge so TQ-003 + // reachability flows through macro-wrapped calls. + let src = "fn helper() {} fn caller() { let _ = vec![helper(); 3]; }"; + let ast = syn::parse_file(src).expect("parse"); + let parsed = vec![("lib.rs".to_string(), src.to_string(), ast)]; + let graph = crate::adapters::analyzers::tq::build_full_call_graph(&parsed); + assert!( + graph + .get("caller") + .is_some_and(|callees| callees.iter().any(|c| c == "helper")), + "caller must have a macro-embedded edge to helper, got: {:?}", + graph.get("caller") + ); +} diff --git a/src/adapters/config/architecture.rs b/src/adapters/config/architecture.rs index cc9cc2f..6b2c27d 100644 --- a/src/adapters/config/architecture.rs +++ b/src/adapters/config/architecture.rs @@ -96,6 +96,12 @@ fn default_unmatched_behavior() -> String { "composition_root".to_string() } +/// Serde default for `allow_prelude_glob` — prelude globs are exempt unless a +/// project explicitly opts in to forbidding them. +pub(crate) fn default_true() -> bool { + true +} + /// `[architecture.layers.]` — path globs assigning files to a layer. #[derive(Debug, Deserialize, Clone)] #[serde(deny_unknown_fields)] @@ -182,6 +188,12 @@ pub struct SymbolPattern { /// Match glob imports (`use foo::*`) when set to true. #[serde(default)] pub forbid_glob_import: Option, + /// When `forbid_glob_import` is set, still allow idiomatic `*::prelude::*` + /// re-export globs (the common `std`/`dioxus`/`bevy` pattern — a `prelude` + /// segment at any depth). Set to `false` to forbid even prelude globs. + /// Default `true`. + #[serde(default = "default_true")] + pub allow_prelude_glob: bool, /// Escape hatch: raw regex applied to AST-aware source (comments/strings masked). #[serde(default)] pub regex: Option, diff --git a/src/adapters/config/tests/architecture.rs b/src/adapters/config/tests/architecture.rs index 29bee85..0835a5b 100644 --- a/src/adapters/config/tests/architecture.rs +++ b/src/adapters/config/tests/architecture.rs @@ -145,9 +145,33 @@ fn test_symbol_pattern_all_matchers_parse() { &vec!["Serialize".to_string()] ); assert_eq!(p.forbid_glob_import, Some(true)); + // Omitted → `default_true`: prelude globs are exempt unless opted out. + assert!( + p.allow_prelude_glob, + "allow_prelude_glob must default to true" + ); assert_eq!(p.regex.as_deref(), Some(r"some\s+pattern")); } +#[test] +fn forbid_glob_import_can_opt_out_of_prelude_exemption() { + let toml_str = r#" + [[pattern]] + name = "no-globs-at-all" + forbidden_in = ["src/**"] + forbid_glob_import = true + allow_prelude_glob = false + reason = "strict: even prelude globs are banned" + "#; + let c: ArchitectureConfig = toml::from_str(toml_str).unwrap(); + let p = &c.patterns[0]; + assert_eq!(p.forbid_glob_import, Some(true)); + assert!( + !p.allow_prelude_glob, + "explicit allow_prelude_glob = false must be honoured" + ); +} + #[test] fn test_symbol_pattern_allowed_in_alternative() { let toml_str = r#" @@ -319,3 +343,11 @@ fn test_call_parity_default_call_depth_is_3() { // connection. Direct test pins the value. assert_eq!(default_call_depth(), 3); } + +#[test] +fn test_allow_prelude_glob_default_is_true() { + // `#[serde(default = "default_true")]` is dispatched by dynamic string + // lookup, invisible to the call-graph-based untested check. Direct call + // pins the value (prelude globs exempt unless opted out). + assert!(default_true()); +} diff --git a/src/adapters/shared/macro_tokens.rs b/src/adapters/shared/macro_tokens.rs new file mode 100644 index 0000000..3805fba --- /dev/null +++ b/src/adapters/shared/macro_tokens.rs @@ -0,0 +1,163 @@ +//! Best-effort recovery of what a macro's opaque token stream references. +//! +//! `syn`'s visitor treats a macro body as an opaque [`proc_macro2::TokenStream`], +//! so calls, identifiers, and `self` uses inside `vec![…]`, `format!(…)`, +//! `rsx!{…}` etc. are invisible to every analyzer that only walks the AST. +//! This module is the single place that re-derives that information, so the +//! call-graph / cohesion / structural sites don't each reinvent a weaker copy +//! (the historical bug: ~7 independent `Punctuated` parses, each +//! blind to the `;`-repeat and block-bodied forms). +//! +//! Two levels, matching the two safe directions: +//! - [`recover_exprs`] — *structured*: returns real [`syn::Expr`]s, so callers +//! can run their normal expr visitors. Safe for everyone (no false positives, +//! only best-effort recall). The architecture `forbid_*` matchers use ONLY +//! this — a raw over-collecting fallback there would manufacture false +//! forbidden-call violations. +//! - [`idents_in_call_position`] — *raw, positional*: for reachability consumers +//! (dead-code, TQ) where a recovered call/component reference only ever +//! *suppresses* a finding, never raises a false one. It harvests idents in +//! call/construction position only, so DSL prop keys and locals don't pollute +//! the call set. +//! - [`tokens_reference_ident`] — *raw, presence*: SLM's yes/no "does this body +//! reference `self`" check. + +use proc_macro2::{Delimiter, TokenStream, TokenTree}; + +/// Best-effort extraction of the expressions embedded in a macro token stream. +/// Most macros accept comma-separated exprs (`assert!(a, b)`, `format!("{}", x)`), +/// but block-like bodies (`tokio::select! { … }`) and separator-`;` variants +/// (`vec![x; n]`) don't. Three strategies in order: +/// 1. Comma-separated `syn::Expr` list (covers ~90% of macro calls). +/// 2. Brace-wrapped parse as a `syn::Block` — extracts every statement +/// expression, covering block-bodied and `;`-separated forms. +/// 3. Single `syn::Expr` — for macros whose argument is one expression. +/// +/// Silent-skips (returns empty) on total parse failure (extern-DSL macros with +/// custom grammar) — recover their calls via [`idents_in_call_position`] where +/// the consumer can safely over-collect. +/// Operation: parser dispatch over fallback strategies, no own calls. +pub fn recover_exprs(tokens: &TokenStream) -> Vec { + use syn::parse::Parser; + use syn::punctuated::Punctuated; + use syn::Token; + let parser = Punctuated::::parse_terminated; + if let Ok(exprs) = parser.parse2(tokens.clone()) { + return exprs.into_iter().collect(); + } + let braced = quote::quote! { { #tokens } }; + if let Ok(block) = syn::parse2::(braced) { + return block.stmts.into_iter().flat_map(stmt_exprs).collect(); + } + if let Ok(expr) = syn::parse2::(tokens.clone()) { + return vec![expr]; + } + Vec::new() +} + +/// The expressions carried by a block statement (zero or more): a bare +/// expression; a `let`'s initialiser **plus** its `else { … }` diverging block +/// (let-else), so a call reachable only through the diverging branch is not +/// dropped; or a trailing-`;` macro statement (`format!(…);`) lifted back to an +/// `Expr::Macro` so nested macro calls inside a `;`-separated body survive. +/// `Stmt::Item` (e.g. a `const X = f();` nested in a macro body) is deliberately +/// dropped — an uncommon shape; its calls are recovered by the reachability +/// consumers' positional fallback. Operation: stmt-shape match, no own calls. +fn stmt_exprs(stmt: syn::Stmt) -> Vec { + match stmt { + syn::Stmt::Expr(e, _) => vec![e], + syn::Stmt::Local(l) => l + .init + .map(|init| { + let mut exprs = vec![*init.expr]; + if let Some((_, diverge)) = init.diverge { + exprs.push(*diverge); + } + exprs + }) + .unwrap_or_default(), + syn::Stmt::Macro(m) => vec![syn::Expr::Macro(syn::ExprMacro { + attrs: m.attrs, + mac: m.mac, + })], + _ => Vec::new(), + } +} + +/// Identifiers in `tokens` that sit in *call or construction position*, +/// recursing into nested groups (order unspecified): +/// - an `Ident` followed by a `(…)` group — a call (`helper(…)`), any case; +/// - an **UpperCamelCase** `Ident` followed by a `{…}` group — a struct literal +/// or DSL component render (`Component { … }`). +/// +/// The UpperCamelCase gate on brace-groups is what separates a component/struct +/// render from a lowercase DSL element tag (`div { … }`, `button { … }`): a +/// lowercase tag is never a Rust function/struct, so harvesting it could only +/// mask an unused `fn div()` — excluding it keeps the over-collection tight. A +/// real lowercase function is still captured via its `(…)` call form. +/// +/// This is the precise raw fallback for DSL macro bodies (`rsx!{ Component { +/// prop: x } }`) that [`recover_exprs`] cannot parse: it captures the +/// component/function reference while EXCLUDING prop keys, field names, locals, +/// type-path segments, element tags, and string contents — so a prop named like +/// a production function (`Widget { helper: "x" }`) does NOT spuriously mark +/// `helper` as called. The consumer intersects the result against known +/// function/component names. Operation: positional token-tree walk, no own calls. +pub fn idents_in_call_position(tokens: &TokenStream) -> impl Iterator { + let mut out = Vec::new(); + let mut stack: Vec = vec![tokens.clone()]; + while let Some(stream) = stack.pop() { + let trees: Vec = stream.into_iter().collect(); + for (i, tt) in trees.iter().enumerate() { + match tt { + TokenTree::Ident(id) if is_call_position(id, trees.get(i + 1)) => { + out.push(id.to_string()); + } + TokenTree::Group(g) => stack.push(g.stream()), + _ => {} + } + } + } + out.into_iter() +} + +/// Whether `id`, given the token immediately after it, is in call/construction +/// position: a `(…)` group is a call (any case); a `{…}` group counts only when +/// `id` is UpperCamelCase (a component/struct, not a lowercase element tag). +/// Operation: delimiter + casing predicate, no own calls. +fn is_call_position(id: &proc_macro2::Ident, next: Option<&TokenTree>) -> bool { + let Some(TokenTree::Group(g)) = next else { + return false; + }; + match g.delimiter() { + Delimiter::Parenthesis => true, + Delimiter::Brace => id + .to_string() + .chars() + .next() + .is_some_and(char::is_uppercase), + _ => false, + } +} + +/// True if `name` appears as an identifier anywhere in `tokens` (recursing into +/// groups). Never fails — for yes/no presence checks like SLM's "does this macro +/// body reference `self`", where parsing the body as exprs is both unnecessary +/// and fragile (e.g. `matches!(self, Variant(_))` won't parse as an expr list). +/// A spurious hit only errs toward "reference present", the safe direction for a +/// self-less-method check. Note: idents inside string literals (capture +/// interpolation like `format!("{self}")`) are NOT seen — `proc_macro2` lexes a +/// string as one `Literal`, not idents. Operation: token-tree presence scan. +pub fn tokens_reference_ident(tokens: &TokenStream, name: &str) -> bool { + let mut stack: Vec = vec![tokens.clone()]; + while let Some(stream) = stack.pop() { + for tt in stream { + match tt { + TokenTree::Ident(id) if id == name => return true, + TokenTree::Group(g) => stack.push(g.stream()), + _ => {} + } + } + } + false +} diff --git a/src/adapters/shared/mod.rs b/src/adapters/shared/mod.rs index e5480d9..6bc0739 100644 --- a/src/adapters/shared/mod.rs +++ b/src/adapters/shared/mod.rs @@ -10,6 +10,7 @@ pub mod declared_function; pub mod file_to_module; pub mod file_visitor; pub mod macro_expansion; +pub mod macro_tokens; pub mod normalize; pub mod project_scope; pub mod test_references; diff --git a/src/adapters/shared/test_references.rs b/src/adapters/shared/test_references.rs index 0317a7f..067e0e8 100644 --- a/src/adapters/shared/test_references.rs +++ b/src/adapters/shared/test_references.rs @@ -43,16 +43,21 @@ struct TestReferenceCollector { impl<'ast> Visit<'ast> for TestReferenceCollector { fn visit_macro(&mut self, node: &'ast syn::Macro) { if self.in_test_fn { - // Parse macro arguments (assert!, assert_eq!, vec!, etc.) as expressions - // to find references embedded inside macros. - use syn::punctuated::Punctuated; - if let Ok(args) = syn::parse::Parser::parse2( - Punctuated::::parse_terminated, - node.tokens.clone(), - ) { - args.iter() - .for_each(|expr| syn::visit::visit_expr(self, expr)); - } + // Macro bodies are opaque to syn's visitor; recover embedded exprs + // so SUT references inside assert!(), vec!(), `;`-repeat / block + // forms are seen as the test exercising production code. + crate::adapters::shared::macro_tokens::recover_exprs(&node.tokens) + .iter() + .for_each(|expr| syn::visit::visit_expr(self, expr)); + // Always also harvest call/construction-position idents. A test that + // exercises the SUT by rendering a component (`rsx!{ Component { .. } + // }`) uses struct syntax the structured visit parses but does not + // record as a call. Harvesting only call/construction position (not + // prop keys) keeps it tight; for SUT detection it can only suppress a + // false NO_SUT, never raise one. + self.current_calls.extend( + crate::adapters::shared::macro_tokens::idents_in_call_position(&node.tokens), + ); } syn::visit::visit_macro(self, node); } diff --git a/src/adapters/shared/tests/macro_tokens.rs b/src/adapters/shared/tests/macro_tokens.rs new file mode 100644 index 0000000..097ca23 --- /dev/null +++ b/src/adapters/shared/tests/macro_tokens.rs @@ -0,0 +1,211 @@ +//! Tests for the shared macro-token recovery helpers — the single place that +//! re-derives calls/idents/`self` from a macro's opaque token stream. + +use crate::adapters::shared::macro_tokens::{ + idents_in_call_position, recover_exprs, tokens_reference_ident, +}; +use proc_macro2::TokenStream; +use syn::visit::Visit; + +/// Parse a fragment of macro-body tokens (the stuff between the delimiters). +fn ts(src: &str) -> TokenStream { + src.parse().expect("token stream must lex") +} + +/// Collect free-call callee idents (`foo()` → "foo") from recovered exprs, so a +/// test can assert a macro-embedded call survived recovery. +fn call_idents(exprs: &[syn::Expr]) -> Vec { + #[derive(Default)] + struct V(Vec); + impl<'ast> Visit<'ast> for V { + fn visit_expr_call(&mut self, n: &'ast syn::ExprCall) { + if let syn::Expr::Path(p) = &*n.func { + if let Some(seg) = p.path.segments.last() { + self.0.push(seg.ident.to_string()); + } + } + syn::visit::visit_expr_call(self, n); + } + } + let mut v = V::default(); + exprs.iter().for_each(|e| v.visit_expr(e)); + v.0 +} + +// ── recover_exprs: the three escalating strategies ───────────────────── + +#[test] +fn recover_exprs_comma_list_form() { + // `assert_eq!(a(), b())` body — the common comma-separated expr list. + let calls = call_idents(&recover_exprs(&ts("a(), b()"))); + assert!( + calls.contains(&"a".to_string()) && calls.contains(&"b".to_string()), + "comma form must recover both calls: {calls:?}" + ); +} + +#[test] +fn recover_exprs_repeat_semicolon_form() { + // `vec![helper(); 3]` body — the historical blind spot: the `;` separator + // makes the comma-list parse fail; the braced-block fallback recovers it. + let calls = call_idents(&recover_exprs(&ts("helper(); 3"))); + assert!( + calls.contains(&"helper".to_string()), + "repeat (`;`) form must recover the call via the block fallback: {calls:?}" + ); +} + +#[test] +fn recover_exprs_repeat_form_recovers_nested_macro_stmt() { + // `vec![format!("inner"); 3]` body — the block fallback turns + // `format!("inner");` into a `Stmt::Macro`; it must be lifted back to an + // `Expr::Macro` so a nested macro call is not dropped. + let exprs = recover_exprs(&ts(r#"format!("inner"); 3"#)); + let has_format = exprs + .iter() + .any(|e| matches!(e, syn::Expr::Macro(m) if m.mac.path.is_ident("format"))); + assert!(has_format, "nested format! in a `;`-body must be recovered"); +} + +#[test] +fn recover_exprs_recovers_let_else_diverging_branch() { + // `let Some(x) = opt else { bail() };` — the block fallback parses this as a + // `Stmt::Local` with a let-else `diverge`. The bound initialiser (`opt`) is + // a path, but the diverging `else { bail() }` carries the only call; it must + // be recovered too, not dropped. + let calls = call_idents(&recover_exprs(&ts("let Some(x) = opt else { bail() };"))); + assert!( + calls.contains(&"bail".to_string()), + "a call in the let-else diverging branch must be recovered: {calls:?}" + ); +} + +#[test] +fn recover_exprs_single_expr_form() { + let calls = call_idents(&recover_exprs(&ts("only_call()"))); + assert!( + calls.contains(&"only_call".to_string()), + "single-expr form must recover the call: {calls:?}" + ); +} + +#[test] +fn recover_exprs_unparseable_dsl_returns_empty_without_panic() { + // A custom-DSL body that parses as none of expr-list / block / expr must + // yield an empty Vec rather than panic — reachability falls back to + // `idents_in_call_position`. `=> =>` is not valid in any of the three grammars. + let exprs = recover_exprs(&ts("=> => =>")); + assert!( + exprs.is_empty(), + "unparseable DSL must recover nothing, got {} exprs", + exprs.len() + ); +} + +// ── tokens_reference_ident: the SLM yes/no primitive ─────────────────── + +#[test] +fn tokens_reference_ident_finds_self_in_format_args() { + // The SLM bug: `format!("[{}]", self.name)` references self in macro args. + assert!( + tokens_reference_ident(&ts(r#""[{}]", self.name"#), "self"), + "self inside format! args must be found" + ); +} + +#[test] +fn tokens_reference_ident_recurses_into_nested_groups() { + assert!( + tokens_reference_ident(&ts("outer(inner(self))"), "self"), + "self nested in groups must be found" + ); +} + +#[test] +fn tokens_reference_ident_false_when_absent() { + assert!( + !tokens_reference_ident(&ts(r#""[{}]", other.name"#), "self"), + "must not find self when it is not referenced" + ); +} + +#[test] +fn tokens_reference_ident_ignores_string_literal_and_substring() { + // A `self` only inside a string literal is not an ident (proc_macro2 lexes + // it as one Literal), and `local_self` must not substring-match `self` — + // both are false, guarding SLM against this class of false positive. + assert!( + !tokens_reference_ident(&ts(r#""self is great""#), "self"), + "self only inside a string literal is not a reference" + ); + assert!( + !tokens_reference_ident(&ts("local_self.field"), "self"), + "must not substring-match `self` inside `local_self`" + ); +} + +// ── idents_in_call_position: the precise reachability fallback ────────── + +#[test] +fn idents_in_call_position_harvests_calls_and_component_renders() { + // `Component { child: Leaf(x) }`-style DSL: the component (UpperCamelCase + // ident + `{`) and the nested call `Leaf(x)` (ident + `(`) must surface... + let ids: Vec = idents_in_call_position(&ts("Component { child: Leaf(x) }")).collect(); + for name in ["Component", "Leaf"] { + assert!( + ids.iter().any(|i| i == name), + "call/construction-position ident `{name}` must be collected, got {ids:?}" + ); + } + // ...but a prop key (`child:`) and a bare argument (`x`) must NOT — this is + // the false-negative gap fix: a prop named like a production fn must not be + // recorded as a call. + for name in ["child", "x"] { + assert!( + !ids.iter().any(|i| i == name), + "non-call ident `{name}` must be excluded, got {ids:?}" + ); + } +} + +#[test] +fn idents_in_call_position_excludes_lowercase_dsl_element_tag() { + // `div { class: "log", Widget {} }` — `div` is a lowercase DSL element tag, + // not a component/struct, so it must NOT be harvested (else an unused + // `fn div()` would be masked by a UI tag). The UpperCamelCase `Widget` + // render is still captured. + let ids: Vec = + idents_in_call_position(&ts(r#"div { class: "log", Widget {} }"#)).collect(); + assert!( + !ids.iter().any(|i| i == "div"), + "lowercase element tag `div` must NOT be harvested, got {ids:?}" + ); + assert!( + ids.iter().any(|i| i == "Widget"), + "UpperCamelCase component `Widget` must still be harvested, got {ids:?}" + ); +} + +#[test] +fn idents_in_call_position_keeps_lowercase_call_form() { + // A lowercase *call* `helper()` (paren group) is still harvested regardless + // of case — only the brace-group construction form is case-gated. + let ids: Vec = idents_in_call_position(&ts("div { onclick: helper() }")).collect(); + assert!( + ids.iter().any(|i| i == "helper"), + "lowercase call `helper()` must still be harvested, got {ids:?}" + ); +} + +#[test] +fn idents_in_call_position_excludes_prop_named_like_a_function() { + // The exact Codex example: `Widget { helper: "x" }` — `helper` is a prop + // key, not a call. It must not be harvested, or a `fn helper` would be + // wrongly marked used/tested. + let ids: Vec = idents_in_call_position(&ts(r#"Widget { helper: "x" }"#)).collect(); + assert!(ids.iter().any(|i| i == "Widget"), "Widget render harvested"); + assert!( + !ids.iter().any(|i| i == "helper"), + "prop key `helper` must NOT be harvested, got {ids:?}" + ); +} diff --git a/src/adapters/shared/tests/mod.rs b/src/adapters/shared/tests/mod.rs index f87e93a..c7656c5 100644 --- a/src/adapters/shared/tests/mod.rs +++ b/src/adapters/shared/tests/mod.rs @@ -2,6 +2,7 @@ mod cfg_test; mod cfg_test_files; mod file_to_module; mod macro_expansion; +mod macro_tokens; mod normalize; mod normalize_coverage; mod scope;