fix(php-extractor): capture file-scope calls + map include/require as imports (#367) #400

Open
tirth8205 wants to merge 1 commit into Egonex-AI:main from tirth8205:fix/php-extractor-file-scope-and-includes


@tirth8205

Contributor

Summary

Fixes two extractor bugs that caused ~0% call-graph + import coverage on procedural (page-style) PHP — the dominant style outside Composer/PSR-4 projects (#367).

Bug A — top-level calls dropped. extractCallGraph() only recorded a call when functionStack.length > 0, so any function_call_expression / member_call_expression / scoped_call_expression at file scope produced no entry. On procedural PHP, that meant the call graph was empty.

Fix: when functionStack is empty, attribute the call to a synthetic <file> caller (a stable, language-neutral identifier) so file→callee edges are still emitted.
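
The Bug A fix can be illustrated with a minimal sketch. It is not the extractor's actual code: `SketchNode`, `collectCalls`, and the pre-resolved `name` field are hypothetical simplifications of the tree-sitter nodes that extractCallGraph() walks. Only the three call node types, the functionStack check, and the `<file>` caller come from the description above.

```typescript
// Hypothetical, simplified node shape. The real extractor walks tree-sitter nodes.
interface SketchNode {
  type: string;
  name?: string; // function name (definitions) or callee name (calls), assumed pre-resolved
  children?: SketchNode[];
}

interface CallEdge {
  caller: string;
  callee: string;
}

// Synthetic caller used when a call is not enclosed by any function.
const FILE_SCOPE_CALLER = "<file>";

const CALL_TYPES = new Set([
  "function_call_expression",
  "member_call_expression",
  "scoped_call_expression",
]);

function collectCalls(root: SketchNode): CallEdge[] {
  const edges: CallEdge[] = [];
  const functionStack: string[] = [];

  function visit(node: SketchNode): void {
    const isFunction = node.type === "function_definition";
    if (isFunction) functionStack.push(node.name ?? "<anonymous>");

    if (CALL_TYPES.has(node.type) && node.name !== undefined) {
      // Before the fix, calls seen with an empty stack were dropped.
      // Here they fall back to the synthetic file-scope caller instead.
      const caller = functionStack[functionStack.length - 1] ?? FILE_SCOPE_CALLER;
      edges.push({ caller, callee: node.name });
    }

    for (const child of node.children ?? []) visit(child);

    if (isFunction) functionStack.pop();
  }

  visit(root);
  return edges;
}

// Usage: a page-style file with two top-level calls and one in-function call.
const program: SketchNode = {
  type: "program",
  children: [
    { type: "function_call_expression", name: "bootstrap" },
    {
      type: "function_definition",
      name: "handleRequest",
      children: [{ type: "function_call_expression", name: "process_input" }],
    },
    { type: "function_call_expression", name: "handleRequest" },
  ],
};

const edges = collectCalls(program);
// edges:
//   <file>        -> bootstrap
//   handleRequest -> process_input
//   <file>        -> handleRequest
```

Falling back to a single constant keeps the edge shape identical for file-scope and in-function calls, so downstream consumers need no special case beyond recognising the `<file>` name.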

Bug B — include / require not mapped. extractStructure() only looked at namespace_use_declaration for imports. include, include_once, require, and require_once — the file→file dependency mechanism for non-namespaced / legacy PHP — were ignored.

Fix: recognise expression_statement children whose inner node is include_expression / include_once_expression / require_expression / require_once_expression. Walk the path argument:

  • String literal ('config.php', "lib/util.php") → captured as-is.
  • Magic-constant concatenation (__DIR__ . '/helpers.php') → reconstructed by recursing into binary_expression and preserving __DIR__ / __FILE__ verbatim so the consumer can resolve relative to the current file.
  • Non-resolvable expressions (variables, function calls) → emitted with the raw argument text so the dependency edge isn't silently lost.
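
The three cases above can be sketched as a small recursive resolver. This is an illustration under stated assumptions, not the code in this PR: the `PathArg` union, `tryResolve`, and `includeSource` are hypothetical names, and the sketch works on an already-classified argument rather than on real tree-sitter nodes.

```typescript
// Hypothetical, pre-classified view of an include/require argument.
type PathArg =
  | { kind: "string"; value: string }                    // 'config.php' with quotes already stripped
  | { kind: "magic"; name: "__DIR__" | "__FILE__" }      // magic constant, kept verbatim
  | { kind: "concat"; left: PathArg; right: PathArg }    // PHP "." concatenation
  | { kind: "other" };                                   // variables, function calls, ...

// Returns the reconstructed path, or null if any part cannot be resolved statically.
function tryResolve(arg: PathArg): string | null {
  switch (arg.kind) {
    case "string":
      return arg.value;
    case "magic":
      return arg.name;
    case "concat": {
      const left = tryResolve(arg.left);
      const right = tryResolve(arg.right);
      return left !== null && right !== null ? left + right : null;
    }
    case "other":
      return null;
  }
}

// Falls back to the raw argument text so the dependency edge is never dropped.
function includeSource(arg: PathArg, rawText: string): string {
  return tryResolve(arg) ?? rawText;
}

// require_once __DIR__ . '/helpers.php';
const fromMagic = includeSource(
  {
    kind: "concat",
    left: { kind: "magic", name: "__DIR__" },
    right: { kind: "string", value: "/helpers.php" },
  },
  "__DIR__ . '/helpers.php'",
);
// fromMagic === "__DIR__/helpers.php"

// include $path;
const fromVariable = includeSource({ kind: "other" }, "$path");
// fromVariable === "$path"
```

In this sketch a concatenation with any unresolvable operand falls back to the raw text of the whole argument rather than a partial path. Whether the actual extractor does the same is an assumption.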

Test plan

All 35 tests in php-extractor.test.ts pass (29 pre-existing + 6 new).

New tests cover:

  • attributes top-level calls to the synthetic <file> caller (#367): main(), $obj->run(), and Helper::go() at file scope are all captured under <file>.
  • mixes file-scope and in-function calls in a single file (#367): file-scope bootstrap() and handleRequest(), plus in-function process_input(), all appear with the correct callers.
  • captures include/require with a plain string literal as imports (#367): all four variants (include, include_once, require, require_once) produce import entries.
  • resolves __DIR__ . '/path' concatenation in require_once (#367): produces source "__DIR__/helpers.php".
  • captures include/require alongside use statements (#367): coexists with namespace_use_declaration extraction.
  • records non-resolvable include path as raw text (#367): include $path; still emits an entry.
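
The extractor leaves a source such as "__DIR__/helpers.php" for the consumer to resolve against the including file. A minimal sketch of what that consumer step might look like follows. `resolveIncludeSource` and `normalizeSegments` are hypothetical helpers, not part of this PR, and the sketch assumes repository-relative, forward-slash paths.

```typescript
// Collapses "." and ".." segments in a relative, forward-slash path.
function normalizeSegments(path: string): string {
  const out: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === ".." && out.length > 0 && out[out.length - 1] !== "..") {
      out.pop();
    } else {
      out.push(segment);
    }
  }
  return out.join("/");
}

// Resolves a "__DIR__/..." import source against the file that contains the include.
// Any other source is returned unchanged; resolving it is out of scope for this sketch.
function resolveIncludeSource(source: string, currentFile: string): string {
  const prefix = "__DIR__/";
  if (!source.startsWith(prefix)) return source;

  const lastSlash = currentFile.lastIndexOf("/");
  const dir = lastSlash >= 0 ? currentFile.slice(0, lastSlash) : "";
  return normalizeSegments(dir + "/" + source.slice(prefix.length));
}

const sibling = resolveIncludeSource("__DIR__/helpers.php", "src/pages/index.php");
// sibling === "src/pages/helpers.php"

const parent = resolveIncludeSource("__DIR__/../lib/util.php", "src/pages/index.php");
// parent === "src/lib/util.php"
```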

Existing behaviour (in-function calls, instance/static method calls, use imports, grouped/aliased use, block-scoped namespaces, return types, exports) is preserved — the original test that asserted top-level calls were silently dropped was updated to reflect the new (correct) behaviour.

Verification commands run:

  • pnpm test --root understand-anything-plugin/packages/core php-extractor → 35/35 pass, 0 type errors (--typecheck).
  • pnpm lint → clean.

Closes #367.

… imports (Egonex-AI#367)

Two bugs in the PHP extractor caused ~0% call-graph + import coverage on
procedural (page-style) PHP:

Bug A: extractCallGraph() only recorded calls when functionStack.length > 0,
silently dropping every function_call_expression / member_call_expression /
scoped_call_expression at file scope. Top-level calls (the norm in
procedural PHP) now flow into the graph attributed to a synthetic
"<file>" caller so file→callee edges are preserved.

Bug B: extractStructure() only mapped namespace_use_declaration ("use")
to imports, leaving include / include_once / require / require_once
expressions invisible. These now emit import entries with the path
captured. Plain string literals resolve to their content; __DIR__ /
__FILE__ concatenations are reconstructed by walking the binary_expression
and preserving the magic constant verbatim so the consumer can resolve it
relative to the current file. Non-resolvable arguments (variables, etc.)
still emit an edge with the raw argument text so the dependency isn't
silently lost.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

bug: PHP call graph drops file-scope (top-level) calls; include/require not mapped → ~0% coverage on procedural PHP
