What happened?
The PHP call-graph extractor misses all file-scope (top-level) calls, and include/require are not captured as dependencies. On procedural PHP codebases (where most code runs at file scope rather than inside functions/methods), this yields ~0% call-site coverage — calls edges and file→file dependency edges are almost entirely absent, even though the deterministic tree-sitter parse succeeds.
Root cause 1 — top-level calls dropped. In packages/core/src/plugins/extractors/php-extractor.ts, extractCallGraph() only records a call when it is lexically inside a function/method body:
// Extract call expressions
if (functionStack.length > 0) { // <-- file-scope calls never recorded
  const caller = functionStack[functionStack.length - 1];
  // ... push { caller, callee } ...
}
Any function_call_expression / member_call_expression / scoped_call_expression at file scope produces no entry. This is fine for OOP/modular code, but page-style procedural PHP (e.g. a controller/view file that calls helpers at the top level) ends up with an empty call graph.
Root cause 2 — include/require not mapped. extractStructure() maps imports only from namespace_use_declaration (PHP use). include / include_once / require / require_once — the actual dependency mechanism in non-namespaced/legacy PHP — are never turned into import/dependency edges. So file→file edges are also missing for these codebases.
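To make the include/require gap concrete, here is a minimal sketch of turning one include-style expression into a dependency edge. It assumes the extractor has already pulled the keyword and the argument's source text out of the parse tree. `IncludeEdge`, `toIncludeEdge`, and `joinPosix` are hypothetical names for illustration, not part of `@understand-anything/core`. Only the two simplest static forms are handled (`__DIR__ . '/x.php'` and a bare string literal); anything else is kept as an unresolved edge.

```typescript
// Hypothetical shapes for this sketch; not the types used in @understand-anything/core.
type IncludeKind = "include" | "include_once" | "require" | "require_once";

interface IncludeEdge {
  from: string;          // project-relative path of the including file
  kind: IncludeKind;
  target: string | null; // resolved project-relative path, or null if not static
  raw: string;           // argument source text as written
}

// One quoted literal without "$" interpolation, optionally prefixed by `__DIR__ .`
const STATIC_ARG = /^(?:(__DIR__)\s*\.\s*)?(['"])([^'"$]*)\2$/;

// Join forward-slash paths and collapse "." and ".." segments.
function joinPosix(dir: string, rel: string): string {
  const out: string[] = [];
  for (const part of `${dir}/${rel}`.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === ".." && out.length > 0 && out[out.length - 1] !== "..") out.pop();
    else out.push(part);
  }
  return out.join("/");
}

function toIncludeEdge(from: string, kind: IncludeKind, raw: string): IncludeEdge {
  const match = STATIC_ARG.exec(raw.trim());
  if (match === null) return { from, kind, target: null, raw };

  const literal = match[3] ?? "";
  // A bare absolute literal points outside the project tree; leave it unresolved.
  if (match[1] === undefined && literal.startsWith("/")) {
    return { from, kind, target: null, raw };
  }

  const slash = from.lastIndexOf("/");
  const dir = slash === -1 ? "" : from.slice(0, slash);
  // Resolve against the including file's directory. For a bare literal this is
  // a heuristic: PHP may also consult include_path and the working directory.
  return { from, kind, target: joinPosix(dir, literal), raw };
}

// (A) from the reproduction, plus one nested and one dynamic example.
const fromRepro = toIncludeEdge("demo.php", "require_once", "__DIR__ . '/helpers.php'");
const nested = toIncludeEdge("pages/demo.php", "include", "'../lib/helpers.php'");
const dynamic = toIncludeEdge("demo.php", "include", "$theme . '/header.php'");
```

Keeping unresolved edges (with `target: null` and the raw argument) rather than dropping them would still let downstream analysis see that a file has dynamic includes.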
The LLM file-analyzer layer does not recover this: per agents/file-analyzer.md (~line 145) it is told not to re-read source for code files, and it infers calls edges from the import map + neighbor symbols — both empty here.
Expected: file-scope calls represented in the call graph (e.g. with a synthetic file/module-scope caller), and include/require(_once) mapped to dependency edges.
Actual: 0 call edges and 0 include-based dependency edges for top-level procedural PHP files.
Minimal reproduction
Minimal PHP file (demo.php):
<?php
require_once __DIR__ . '/helpers.php'; // (A) not captured as a dependency edge
function helper() { return 1; }
helper(); // (B) top-level call — DROPPED
function caller() {
helper(); // (C) in-function call — captured
}
Run the bundled extractor (same engine file-analyzer uses):
import fs from 'node:fs';
import { TreeSitterPlugin, builtinLanguageConfigs } from '@understand-anything/core';
const p = new TreeSitterPlugin(builtinLanguageConfigs);
await p.init();
const src = fs.readFileSync('demo.php', 'utf8');
console.log(p.extractCallGraph('demo.php', src)); // -> only the (C) helper() call; (B) is missing
console.log(p.analyzeFile('demo.php', src).imports); // -> [] ; the require_once (A) is missing
Measured on a real procedural PHP CMS (~380 PHP files, ~84k lines), extractCallGraph per file vs grep ground-truth of call sites:
| File | Type | Real call sites (grep) | Captured | Coverage |
| --- | --- | --- | --- | --- |
| page file (top-level heavy) | view/controller | ~645 | 0 | 0% |
| another page file | view/controller | ~1218 | 0 | 0% |
| helpers file (function defs) | functions | ~959 | 737 | ~77% |
| theme helpers file | functions | ~875 | 456 | ~52% |
For one heavily-used helper, 99/99 call sites in the two page files were invisible, while 57/58 in-function call sites were captured. So in-function extraction is healthy; the gap is specifically file-scope calls + include/require.
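For clarity on how the coverage column is derived, the sketch below recomputes it as captured call edges divided by grep-counted call sites, rounded to a whole percent. The grep counts are the approximate figures from the table, so the results are approximate too; `Row` and `coveragePct` are names made up for this example.

```typescript
interface Row {
  file: string;
  grepSites: number; // approximate call sites found by grep
  captured: number;  // call edges returned by extractCallGraph
}

const rows: Row[] = [
  { file: "page file (top-level heavy)", grepSites: 645, captured: 0 },
  { file: "another page file", grepSites: 1218, captured: 0 },
  { file: "helpers file (function defs)", grepSites: 959, captured: 737 },
  { file: "theme helpers file", grepSites: 875, captured: 456 },
];

// Coverage as a whole-number percentage; 0 when there is nothing to capture.
function coveragePct(row: Row): number {
  return row.grepSites === 0 ? 0 : Math.round((100 * row.captured) / row.grepSites);
}

const coverage = rows.map(coveragePct);
for (const [i, row] of rows.entries()) {
  console.log(`${row.file}: ${coverage[i]}%`);
}
```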
Suggested direction (optional)
- In extractCallGraph, also record calls when functionStack is empty, attributing them to a synthetic file/module-scope caller (e.g. <file> or the module node), so impact analysis can answer "who calls X" for procedural code.
- In the PHP extractor, recognize include / include_once / require / require_once expressions and emit import/dependency edges (resolving the path argument where statically determinable).
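The first suggestion can be sketched on a toy AST. This is not the actual `php-extractor.ts` code: `ToyNode` is a simplified stand-in for a tree-sitter node, and the `<file>` label for the synthetic caller is just one possible choice. The only change from the current behavior is that an empty `functionStack` falls back to that label instead of skipping the call.

```typescript
// Toy stand-in for a tree-sitter node; the real node API differs.
interface ToyNode {
  type: string;
  name?: string; // function name for definitions, callee name for calls
  children?: ToyNode[];
}

interface CallEdge {
  caller: string;
  callee: string;
}

// Hypothetical label for calls made outside any function or method.
const FILE_SCOPE_CALLER = "<file>";

const CALL_TYPES = new Set([
  "function_call_expression",
  "member_call_expression",
  "scoped_call_expression",
]);
const DEFINITION_TYPES = new Set(["function_definition", "method_declaration"]);

function collectCalls(root: ToyNode): CallEdge[] {
  const edges: CallEdge[] = [];
  const functionStack: string[] = [];

  const visit = (node: ToyNode): void => {
    const definedName = DEFINITION_TYPES.has(node.type) ? node.name : undefined;
    if (definedName !== undefined) functionStack.push(definedName);

    if (CALL_TYPES.has(node.type) && node.name !== undefined) {
      // Fall back to the synthetic caller instead of dropping the call.
      const caller = functionStack[functionStack.length - 1] ?? FILE_SCOPE_CALLER;
      edges.push({ caller, callee: node.name });
    }

    for (const child of node.children ?? []) visit(child);
    if (definedName !== undefined) functionStack.pop();
  };

  visit(root);
  return edges;
}

// demo.php from the reproduction, hand-written as a toy tree.
const demo: ToyNode = {
  type: "program",
  children: [
    { type: "function_definition", name: "helper" },
    { type: "function_call_expression", name: "helper" }, // (B) top-level call
    {
      type: "function_definition",
      name: "caller",
      children: [{ type: "function_call_expression", name: "helper" }], // (C)
    },
  ],
};

const demoEdges = collectCalls(demo);
```

With this fallback, both (B) and (C) from the reproduction produce an edge, and a "who calls helper" query would list the file-scope caller alongside `caller`.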
Happy to help test against the procedural codebase if useful.
Plugin version
2.7.5 (main @ HEAD, cloned 2026-06-02; tree-sitter-php@0.23.12)
Platform / client
Claude Code (CLI) — reproduced by invoking the bundled @understand-anything/core TreeSitterPlugin.extractCallGraph directly (the same path file-analyzer uses via extract-structure.mjs).
OS + Node version
Windows 11 (x64), Node v24.2.0, pnpm 10.33.4
Primary language of the analyzed project
PHP (legacy/procedural — non-namespaced, include/require-based)
Approximate file count
~380 PHP files (~84k lines)