fix(pipeline): graph routing infinite loop + search allow_tools by bjridicodes · Pull Request #82 · bayrem/AJSAA

bjridicodes · 2026-05-19T22:05:15Z

Summary

Infinite loop in generate_queries: _needs_generate_queries routed based on state["raw_queries"], but the node writes state["queries"] — so the router always saw an empty list and looped back. Replaced the conditional self-edge with a direct add_edge("generate_queries", "search_jobs"). Also fixed the cache-hit path which was reading from state["raw_queries"] (always [] on first run) instead of the queries file directly.
anthropic_web search returned nothing: allow_tools: false in config.yaml was applied globally via the factory, including to the search LLM. The Claude CLI needs --dangerously-skip-permissions to invoke its web-search tool. build_llm() now overrides allow_tools=True when task="search", regardless of config.
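
For reviewers who want to see the routing bug in isolation, here is a minimal sketch in plain Python, with no graph library involved. The names `PipelineState`, `buggy_router`, `FIXED_EDGES` and `run` are hypothetical stand-ins and are not taken from the repository; only the two state keys and the node names come from the description above.

```python
from typing import TypedDict


class PipelineState(TypedDict, total=False):
    raw_queries: list[str]  # the key the old router inspected
    queries: list[str]      # the key generate_queries actually writes


def generate_queries(state: PipelineState) -> PipelineState:
    # Hypothetical node body: it writes "queries", never "raw_queries".
    return {**state, "queries": ["python backend remote"]}


def buggy_router(state: PipelineState) -> str:
    # Reads the wrong key, so the list is always empty and we loop back.
    return "search_jobs" if state.get("raw_queries") else "generate_queries"


# After the fix there is no router at all, just an unconditional edge.
FIXED_EDGES = {"generate_queries": "search_jobs"}


def run(next_node, max_steps: int = 5) -> list[str]:
    """Walk the graph from generate_queries, capped at max_steps visits."""
    state: PipelineState = {}
    node = "generate_queries"
    visited: list[str] = []
    for _ in range(max_steps):
        visited.append(node)
        if node != "generate_queries":
            break
        state = generate_queries(state)
        node = next_node(state)
    return visited


looping = run(buggy_router)
fixed = run(lambda _state: FIXED_EDGES["generate_queries"])
```

With the step cap in place, `looping` only ever contains `generate_queries`, while `fixed` reaches `search_jobs` on the second step.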

Test plan

205 tests pass, ruff + mypy clean
Full pipeline run: infisical run --env=dev -- python run.py — generate_queries advances to search_jobs without looping; anthropic_web search returns URL candidates

🤖 Generated with Claude Code

Two bugs found during the first full pipeline run after the search milestone:

1. generate_queries self-loop: _needs_generate_queries checked state["raw_queries"] but the node writes state["queries"], so the router always saw an empty list and looped. Replaced the conditional self-edge with a direct edge to search_jobs. Also fixed the cache-hit path which read from state["raw_queries"] (always []) instead of the queries file.

2. anthropic_web search returned nothing: allow_tools: false in config was applied globally, including to the search LLM. The Claude CLI needs --dangerously-skip-permissions to invoke web-search tools. Factory now overrides allow_tools=True when task="search".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
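
The task-based override can be sketched roughly as follows. This is an illustration under assumptions, not the factory's real code: `LLMSettings`, `resolve_llm_settings` and `cli_flags` are hypothetical names, and only the `allow_tools` key, the `task` values and the `--dangerously-skip-permissions` flag come from the commit message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMSettings:
    task: str
    allow_tools: bool


def resolve_llm_settings(llm_cfg: dict, task: str = "default") -> LLMSettings:
    """Resolve allow_tools from config, forcing it on for search."""
    allow_tools = bool(llm_cfg.get("allow_tools", False))
    if task == "search":
        # Search needs the CLI's web-search tool regardless of config.
        allow_tools = True
    return LLMSettings(task=task, allow_tools=allow_tools)


def cli_flags(settings: LLMSettings) -> list[str]:
    # Hypothetical helper: maps the resolved setting to the CLI flag.
    return ["--dangerously-skip-permissions"] if settings.allow_tools else []


cfg = {"allow_tools": False}
search_flags = cli_flags(resolve_llm_settings(cfg, task="search"))
default_flags = cli_flags(resolve_llm_settings(cfg))
```

Keeping the override inside the resolver means callers cannot forget it, as long as they pass `task="search"`.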

Drop the redundant "→" link column; the run ID cell now carries the href so the table is one column narrower and every run is still one click away. Fixed the test that checked for the old trailing link cell pattern. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

_node_row_html computed total_tokens = in + out, dropping cache_read and cache_creation. This caused the pipeline execution table to show lower numbers than the grand total, making the two sections appear inconsistent. Now matches _usage_row_html which already counted all four buckets. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
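
A small sketch of the difference between the two totals, assuming a usage dict with four hypothetical bucket keys (the real field names in `_node_row_html` and `_usage_row_html` are not shown in this PR):

```python
# Hypothetical bucket names, chosen only for this illustration.
ALL_BUCKETS = ("input_tokens", "output_tokens", "cache_read", "cache_creation")


def total_tokens(usage: dict) -> int:
    """Sum all four token buckets, treating missing ones as zero."""
    return sum(int(usage.get(key, 0)) for key in ALL_BUCKETS)


def in_plus_out_only(usage: dict) -> int:
    """The old behaviour: cache buckets are silently dropped."""
    return int(usage.get("input_tokens", 0)) + int(usage.get("output_tokens", 0))


usage = {
    "input_tokens": 100,
    "output_tokens": 50,
    "cache_read": 1000,
    "cache_creation": 200,
}
new_total = total_tokens(usage)
old_total = in_plus_out_only(usage)
```

For a cache-heavy node the two numbers differ by almost an order of magnitude, which is why the per-node table looked inconsistent with the grand total.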

build_llm(cfg["llm"]) without a task arg defaults to task="default" which resolves allow_tools=False. Company searches with url: hints invoke the Claude CLI and need --dangerously-skip-permissions to browse the web — same fix as search_jobs already had. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Grand total line now shows '≈Xk effective compute' in green when cache tokens are present (formula: new_in + out + 0.1×cache_read), making it clear that high cache-read is efficient, not wasteful.

Per-node pipeline table replaces the single token total with 'Xin / Yout' and adds '/ Zcached' (green) when cache-read tokens are present for that node.

Same change applied to the live-page JS so the live view is consistent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
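
The formula quoted above can be worked through in a few lines. The function names and the zero-decimal rounding of the "Xk" label are assumptions made for this sketch; only the formula itself comes from the commit message.

```python
def effective_compute(new_in: int, out: int, cache_read: int) -> float:
    """new_in + out + 0.1 x cache_read, as stated in the commit message."""
    return new_in + out + 0.1 * cache_read


def format_effective(tokens: float) -> str:
    # Assumed presentation: thousands, rounded to a whole number.
    return f"≈{tokens / 1000:.0f}k effective compute"


value = effective_compute(new_in=2000, out=1000, cache_read=50000)
label = format_effective(value)
```

Here 50,000 cache-read tokens contribute only 5,000 to the effective figure, so the label reads about 8k rather than 53k.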

…ctive

Three connected fixes to improve search volume and quality:

1. url_validator: drop known aggregator/listing-page URL patterns before hitting Tavily (builtin.com, hnhiring, arc.dev listing pages, etc.). These pass URL validation because Tavily can fetch them, but they are search-result category pages, not individual postings. Scoring rejects them at a near-100% rate, wasting Tavily extract quota.

2. search_jobs: raise _DIRECTIVE_TARGET 30→50 and _DIRECTIVE_LLM_MAX 50→80 to target 30-50 validated individual postings per run.

3. web_search SEARCH_DIRECTIVE: instruct the LLM to search each major job board (WTTJ, LinkedIn /jobs/view, Lever, Greenhouse, Ashby, Workday) with dedicated queries rather than relying on broad web results. Explicitly forbid listing/search pages in the FORBIDDEN block so the LLM understands what counts as an individual posting.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
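
A pre-filter along the lines of fix 1 might look like the sketch below. The host set and path prefixes are hypothetical examples picked for illustration (including the `hnhiring.com` domain and the `/remote-jobs` prefix); the actual pattern list in url_validator is not shown in this PR.

```python
from urllib.parse import urlparse

# Hypothetical: hosts treated as listing pages in their entirety.
_AGGREGATOR_HOSTS = {"builtin.com", "hnhiring.com"}

# Hypothetical: hosts where only some path prefixes are listing pages.
_LISTING_PATH_PREFIXES = {"arc.dev": ("/remote-jobs",)}


def is_listing_page(url: str) -> bool:
    """Return True when the URL looks like an aggregator or listing page."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower().removeprefix("www.")
    if host in _AGGREGATOR_HOSTS:
        return True
    prefixes = _LISTING_PATH_PREFIXES.get(host, ())
    # str.startswith accepts a tuple; an empty tuple never matches.
    return parsed.path.startswith(prefixes)


def drop_listing_pages(urls: list[str]) -> list[str]:
    """Filter candidates before spending any extract quota on them."""
    return [url for url in urls if not is_listing_page(url)]


candidates = [
    "https://www.builtin.com/jobs/remote",
    "https://jobs.lever.co/acme/123",
    "https://arc.dev/remote-jobs/python",
]
kept = drop_listing_pages(candidates)
```

Running the filter before extraction is the point of the change: rejected URLs never cost a Tavily call.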

…s in query/

Prompts extracted from web_search.py now live in:

query/SEARCH_DIRECTIVE_PROMPT.md: directive (jobs search, URL candidates)
query/SEARCH_COMPANY_PROMPT.md: company single-query search

Edit these files to tune search behaviour without touching Python code.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
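
Loading such a prompt at runtime could be as simple as the following sketch. `load_prompt` and its `base_dir` parameter are hypothetical; how web_search.py actually reads the files is not shown here, and only the two file names come from the commit message.

```python
from pathlib import Path

# Assumed location, relative to the working directory.
PROMPT_DIR = Path("query")


def load_prompt(name: str, base_dir: Path = PROMPT_DIR) -> str:
    """Read <base_dir>/<name>_PROMPT.md and return its text.

    Raises FileNotFoundError with the full path if the file is missing,
    which is easier to debug than an empty prompt reaching the LLM.
    """
    path = base_dir / f"{name}_PROMPT.md"
    if not path.is_file():
        raise FileNotFoundError(f"prompt file not found: {path}")
    return path.read_text(encoding="utf-8")
```

A caller would then use `load_prompt("SEARCH_DIRECTIVE")` or `load_prompt("SEARCH_COMPANY")`, so that edits to the Markdown files take effect on the next run.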

bjridicodes and others added 8 commits May 19, 2026 22:05

fix(gitignore): track SEARCH_DIRECTIVE and SEARCH_COMPANY prompt file…

ab2c014

…s in query/

bayrem approved these changes May 20, 2026

bayrem merged commit 872aab3 into main May 20, 2026
5 checks passed

bayrem merged 8 commits into main from fix/graph-routing-and-search-tools

2 participants
