feat(monitor+scoring): live dashboard counters + store discarded jobs by bjridicodes · Pull Request #84 · bayrem/AJSAA

bjridicodes · 2026-06-02T21:46:32Z

Summary

Live dashboard: duration counter, spinning node arrow, per-node live timer, jobs treated column, discarded jobs section
Scoring pipeline: LLM now scores all jobs (not just passing ones), score_jobs_batch returns (passed, discarded) tuple, discarded jobs stored to .data/discarded_jobs.jsonl and shown in dashboard with real scores and reasoning
Also includes softer scoring prompt (transferable experience counts, hard blocks vs soft gaps)
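
The (passed, discarded) return shape described above can be pictured with a minimal sketch. This is not the project's `score_jobs_batch`; the helper name `split_by_threshold`, the `score` key, and the threshold value are assumptions made for illustration only.

```python
from typing import Any

Job = dict[str, Any]


def split_by_threshold(
    scored_jobs: list[Job], threshold: float
) -> tuple[list[Job], list[Job]]:
    """Split already-scored jobs into (passed, discarded).

    Hypothetical helper: assumes every job carries a numeric "score" key.
    """
    passed = [job for job in scored_jobs if job["score"] >= threshold]
    discarded = [job for job in scored_jobs if job["score"] < threshold]
    return passed, discarded


# Callers unpack the tuple instead of treating the result as a plain list.
jobs = [{"url": "a", "score": 80}, {"url": "b", "score": 40}]
passed, discarded = split_by_threshold(jobs, threshold=60)
```

Because every job is scored, the discarded list keeps real scores rather than being dropped silently.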

Live dashboard changes

Duration — increments live via requestAnimationFrame using run_start_time from state
Spinning arrow — CSS @keyframes spin applied to ⟳ on running node
Per-node timer — counts while node runs using node_start_times from graph; shows final time on complete
Jobs column — shows treated count for search_jobs, search_companies, analyze_jobs, store_results
Discarded section — below stored jobs, shows all sub-threshold jobs with their real LLM score and one-line reason

Test plan

pytest tests/ — 57 tests pass
Run infisical run --env=dev -- .venv/bin/python run.py and open http://127.0.0.1:8765 to confirm live counters work
After run: verify query/jobs_discarded.jsonl and .data/discarded_jobs.jsonl exist with scored entries

🤖 Generated with Claude Code

Two bugs found during the first full pipeline run after the search milestone:

1. generate_queries self-loop: _needs_generate_queries checked state["raw_queries"] but the node writes state["queries"], so the router always saw an empty list and looped. Replaced the conditional self-edge with a direct edge to search_jobs. Also fixed the cache-hit path, which read from state["raw_queries"] (always []) instead of the queries file.

2. anthropic_web search returned nothing: allow_tools: false in config was applied globally, including to the search LLM. The Claude CLI needs --dangerously-skip-permissions to invoke web-search tools. Factory now overrides allow_tools=True when task="search".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
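
The task-based override in the second fix could look roughly like the sketch below. It is a guess at the shape of the logic, not the actual factory: `resolve_allow_tools` and the flat config dict are hypothetical names introduced here.

```python
def resolve_allow_tools(llm_cfg: dict, task: str = "default") -> bool:
    """Decide whether the LLM may invoke tools for a given task.

    Hypothetical helper: the global allow_tools setting applies to every
    task except "search", which always needs tool access for web search.
    """
    if task == "search":
        return True
    return bool(llm_cfg.get("allow_tools", False))


cfg = {"allow_tools": False}
search_tools = resolve_allow_tools(cfg, task="search")   # override wins
default_tools = resolve_allow_tools(cfg)                 # global setting applies
```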

Drop the redundant "→" link column; the run ID cell now carries the href so the table is one column narrower and every run is still one click away. Fixed the test that checked for the old trailing link cell pattern. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

_node_row_html computed total_tokens = in + out, dropping cache_read and cache_creation. This caused the pipeline execution table to show lower numbers than the grand total, making the two sections appear inconsistent. Now matches _usage_row_html which already counted all four buckets. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
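
A minimal sketch of summing all four token buckets, as the fix describes. The dictionary keys (`in`, `out`, `cache_read`, `cache_creation`) and the function name are assumptions; the real usage records may use different field names.

```python
# Hypothetical bucket names; the commit only states that four buckets exist.
TOKEN_BUCKETS = ("in", "out", "cache_read", "cache_creation")


def total_tokens(usage: dict) -> int:
    """Sum every token bucket so per-node rows agree with the grand total."""
    return sum(int(usage.get(bucket, 0)) for bucket in TOKEN_BUCKETS)


usage = {"in": 100, "out": 50, "cache_read": 1000, "cache_creation": 200}
node_total = total_tokens(usage)
```

Counting only `in + out` for this record would report 150 tokens, while the four-bucket sum is 1350, which is the kind of mismatch the commit describes.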

build_llm(cfg["llm"]) without a task arg defaults to task="default" which resolves allow_tools=False. Company searches with url: hints invoke the Claude CLI and need --dangerously-skip-permissions to browse the web — same fix as search_jobs already had. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Grand total line now shows '≈Xk effective compute' in green when cache tokens are present (formula: new_in + out + 0.1×cache_read), making it clear that high cache-read is efficient, not wasteful. Per-node pipeline table replaces the single token total with 'Xin / Yout' and adds '/ Zcached' (green) when cache-read tokens are present for that node. Same change applied to the live-page JS so the live view is consistent. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
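
The effective-compute formula quoted above (new_in + out + 0.1×cache_read) can be worked through in a short sketch. The function names and the one-decimal rounding in the label are assumptions; only the formula itself comes from the commit message.

```python
def effective_compute(new_in: int, out: int, cache_read: int) -> float:
    """Weight cache-read tokens at 0.1, per the formula in the commit message."""
    return new_in + out + 0.1 * cache_read


def format_effective(tokens: float) -> str:
    """Render the value in thousands; the rounding precision is an assumption."""
    return f"≈{tokens / 1000:.1f}k effective compute"


# 1,000 new input + 500 output + 20,000 cache-read tokens.
value = effective_compute(new_in=1000, out=500, cache_read=20000)
label = format_effective(value)
```

With these numbers the raw token count is 21,500, but the effective figure is about 3,500, which is why a high cache-read share reads as efficient.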

…ctive

Three connected fixes to improve search volume and quality:

1. url_validator: drop known aggregator/listing-page URL patterns before hitting Tavily (builtin.com, hnhiring, arc.dev listing pages, etc.). These pass URL validation because Tavily can fetch them, but they are search-result category pages, not individual postings; scoring rejects them at a near-100% rate, wasting Tavily extract quota.

2. search_jobs: raise _DIRECTIVE_TARGET 30→50 and _DIRECTIVE_LLM_MAX 50→80 to target 30-50 validated individual postings per run.

3. web_search SEARCH_DIRECTIVE: instruct the LLM to search each major job board (WTTJ, LinkedIn /jobs/view, Lever, Greenhouse, Ashby, Workday) with dedicated queries rather than relying on broad web results. Explicitly forbid listing/search pages in the FORBIDDEN block so the LLM understands what counts as an individual posting.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
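
The first fix, dropping aggregator URLs before extraction, might be sketched as below. This is a simplification: it blocks whole hosts, whereas the commit describes URL patterns (for arc.dev, only listing pages). The host set, including the `hnhiring.com` domain, and both function names are assumptions.

```python
from urllib.parse import urlparse

# Hypothetical blocklist; the real validator matches URL patterns, not hosts.
AGGREGATOR_HOSTS = {"builtin.com", "hnhiring.com", "arc.dev"}


def is_aggregator_url(url: str) -> bool:
    """Return True when the URL's host is a known aggregator or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host in AGGREGATOR_HOSTS or any(
        host.endswith("." + blocked) for blocked in AGGREGATOR_HOSTS
    )


def drop_aggregators(urls: list[str]) -> list[str]:
    """Filter candidates before spending extract quota on them."""
    return [url for url in urls if not is_aggregator_url(url)]


candidates = [
    "https://www.builtin.com/jobs/remote",
    "https://jobs.lever.co/acme/123",
]
kept = drop_aggregators(candidates)
```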

…s in query/

Prompts extracted from web_search.py now live in:

- query/SEARCH_DIRECTIVE_PROMPT.md: directive (jobs search, URL candidates)
- query/SEARCH_COMPANY_PROMPT.md: company single-query search

Edit these files to tune search behaviour without touching Python code.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
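
Reading a prompt from one of these files at runtime could be as small as the sketch below. The `load_prompt` helper and its default directory argument are hypothetical; only the `query/` location and the file names come from the commit message.

```python
from pathlib import Path


def load_prompt(name: str, prompt_dir: Path = Path("query")) -> str:
    """Read a prompt template such as SEARCH_DIRECTIVE_PROMPT from disk.

    Hypothetical helper: assumes prompts are stored as <name>.md files.
    """
    return (prompt_dir / f"{name}.md").read_text(encoding="utf-8")
```

Calling `load_prompt("SEARCH_DIRECTIVE_PROMPT")` from the repository root would return the directive text, so prompt edits take effect on the next run without a code change.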

… fallback

Primary path uses linkedin-api (unofficial voyager API, email/password auth). Falls back transparently to stickerdaniel/linkedin-mcp-server (Patchright browser) on any primary failure. MCP server installed under mcp_servers/ and ignored by git.

- providers/search/connectors/linkedin.py: full implementation replacing stub
- requirements.txt: add linkedin-api>=2.3.1, mcp>=1.0.0
- config/config.yaml: enable linkedin connector, max_concurrent: 1
- agent/nodes/search_jobs.py: add linkedin to _DEFAULT_MAX_CONCURRENT
- tests/test_linkedin_connector.py: 19 unit tests, all passing
- .gitignore: exclude mcp_servers/ (third-party local installs)

One-time setup required: cd mcp_servers/linkedin-mcp-server && uv run -m linkedin_mcp_server --login

Secrets needed in Infisical: LINKEDIN_EMAIL, LINKEDIN_PASSWORD

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
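
The "fall back on any primary failure" behaviour can be illustrated generically. The sketch below does not call linkedin-api or the MCP server; `search_with_fallback` and the two stub backends are hypothetical stand-ins for the real connector paths.

```python
from typing import Callable

SearchFn = Callable[[str], list[dict]]


def search_with_fallback(primary: SearchFn, fallback: SearchFn, query: str) -> list[dict]:
    """Try the primary backend; on any exception, use the fallback instead."""
    try:
        return primary(query)
    except Exception:
        # The caller never sees the primary failure, only the fallback result.
        return fallback(query)


def failing_primary(query: str) -> list[dict]:
    raise RuntimeError("primary backend unavailable")


def browser_fallback(query: str) -> list[dict]:
    return [{"title": query, "source": "fallback"}]


results = search_with_fallback(failing_primary, browser_fallback, "python developer")
```

Catching a bare `Exception` matches the "any primary failure" wording, at the cost of also hiding programming errors in the primary path; logging the exception before falling back is a common mitigation.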

mypy cannot infer the Linkedin type from a lazy import inside a method, so it types self._client as None and flags .search_jobs() as attr-defined. The ignore is safe: the assignment on the line above guarantees non-None. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Live dashboard:

- Duration counter increments live via JS requestAnimationFrame loop
- Running node arrow (⟳) now rotates with CSS spin animation
- Per-node time column counts live while node is active, shows final on complete
- Jobs treated column (search_jobs, search_companies, analyze_jobs, store_results)
- Discarded jobs section shows below stored jobs with real scores and reasoning

Scoring pipeline:

- LLM now scores ALL jobs instead of omitting low scorers
- score_jobs_batch() returns (passed, discarded) tuple
- discarded_jobs added to AgentState and live snapshots
- analyze_jobs writes query/jobs_discarded.jsonl per run
- store_results appends new discarded jobs to .data/discarded_jobs.jsonl (deduped by URL)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
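
The URL-deduplicated append to a JSONL file could work along the lines of this sketch. It is not the code from store_results.py; `append_discarded` is a hypothetical helper, and skipping malformed lines and jobs without a URL are assumptions about the desired behaviour.

```python
import json
from pathlib import Path


def append_discarded(path: Path, jobs: list[dict]) -> int:
    """Append jobs whose URL is not already in the JSONL file; return the count added."""
    existing_urls: set[str] = set()
    if path.exists():
        with path.open(encoding="utf-8") as f:
            for line in f:
                try:
                    existing_urls.add(json.loads(line).get("url", ""))
                except json.JSONDecodeError:
                    continue  # assumption: ignore blank or corrupt lines

    added = 0
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        for job in jobs:
            url = job.get("url", "")
            if not url or url in existing_urls:
                continue
            f.write(json.dumps(job, ensure_ascii=False) + "\n")
            existing_urls.add(url)  # also dedupes within the same batch
            added += 1
    return added
```

Re-running the pipeline with the same discarded jobs then leaves the file unchanged, since every URL is already present.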

…column

Both conflicts were main reverting the pipeline table to the old 5-column format (no Jobs column). Kept the HEAD version with the Jobs column and jobs_str logic intact.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The merge-base changed after approval.

+            for line in f:
+                try:
+                    existing_urls.add(json.loads(line).get("url", ""))
+                except json.JSONDecodeError:


score_jobs_batch return type changed to tuple in the scoring refactor; scoring_baseline.py was still treating the result as a plain list. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

bjridicodes and others added 11 commits May 19, 2026 22:05

fix(gitignore): track SEARCH_DIRECTIVE and SEARCH_COMPANY prompt files in query/

ab2c014

bayrem previously approved these changes Jun 2, 2026

github-advanced-security AI found potential problems Jun 2, 2026

Comment thread agent/nodes/store_results.py

fix(baseline): unpack (passed, discarded) tuple from score_jobs_batch

0ac6872

bjridicodes dismissed bayrem’s stale review via 0ac6872 June 2, 2026 21:53

bayrem approved these changes Jun 2, 2026

bayrem merged commit 13a756d into main Jun 2, 2026
5 checks passed

bjridicodes mentioned this pull request Jun 2, 2026

docs(readme): update for LinkedIn, discarded jobs, live monitor features #85

Open

bayrem merged 13 commits into main from fix/graph-routing-and-search-tools

3 participants
