🐛 Keep Scoutbot on local authority by arach · Pull Request #178 · arach/openscout

arach · 2026-06-01T16:41:48Z

Summary

- keep node-local product agents like scoutbot from being overwritten by peer mesh sync
- make Scoutbot bootstrap refresh registration when ownership drifts away from the local node
- clear stale local-session endpoint failure metadata and mark warmed sessions idle so Scoutbot can run again

Verification

- bun test packages/web/server/scoutbot/runner.test.ts
- bun test packages/runtime/src/broker-daemon.test.ts --test-name-pattern "keeps node-local scoutbot authority"
- bun test packages/runtime/src/local-agents.test.ts --test-name-pattern "session warmup|warmed local session"
- npm --prefix packages/runtime run check
- manual web-runner smoke: /api/send to Scoutbot completed flight flt-mpvek2j5-9t1uhl with output OK

SCO-059 session-knowledge exploration as an interactive studio surface. Six-stage pipeline (discover → drilldown) walking a real Codex or Claude session through preparation. Discover scans ~/.codex/sessions and ~/.claude/projects against the actual filesystem; Normalize opens the selected JSONL and parses the head into uniform records. Introduces a studio-internal primitive: Command<I, O> + runCommand with per-input TTL caching, plus a CommandSurface chrome component owning copyable shell line, ran/cached badge, and output frame. Renderer registry is parked for the next consumer. Also lands the parallel KnowledgeSearchScreen sketch on the web product side for future cross-reference.
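The commit names `Command<I, O>` and `runCommand` with per-input TTL caching but does not show them. The following is a minimal sketch of how such a primitive could look. Every field name (`ttlMs`, `cacheKey`, `wallMs`), the in-memory `Map`, and the injectable `now` clock are assumptions made for illustration, not the PR's actual code.

```typescript
// Hypothetical shapes: the commit only names Command<I, O> and runCommand,
// so the fields below are illustrative guesses.
interface Command<I, O> {
  id: string;
  ttlMs: number;                      // how long a cached output stays valid
  cacheKey: (input: I) => string;     // distinguishes one input from another
  run: (input: I) => Promise<O>;
}

interface CommandRun<O> {
  output: O;
  cached: boolean;
  wallMs: number;                     // 0 when served from cache
}

interface CacheEntry {
  output: unknown;
  expiresAt: number;
}

const cache = new Map<string, CacheEntry>();

async function runCommand<I, O>(
  command: Command<I, O>,
  input: I,
  now: () => number = Date.now,       // injectable clock, handy for tests
): Promise<CommandRun<O>> {
  const key = `${command.id}:${command.cacheKey(input)}`;
  const hit = cache.get(key);
  if (hit !== undefined && hit.expiresAt > now()) {
    return { output: hit.output as O, cached: true, wallMs: 0 };
  }
  const startedAt = now();
  const output = await command.run(input);
  const wallMs = now() - startedAt;
  cache.set(key, { output, expiresAt: now() + command.ttlMs });
  return { output, cached: false, wallMs };
}
```

Keying the cache on command id plus a per-input key is what lets two stages that parse the same session with the same arguments share one result.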

Remove Extract/Index/Query/Drilldown panels and the WeekBudgetFooter: their contents were synthesized hand-coded outputs, not real runs of the described work. Per direction to "not do anything fake", the page now only shows stages with real implementations behind them:

- Discover: real filesystem scan of ~/.codex/sessions and ~/.claude/projects
- Normalize: real JSONL parse of the selected session, with raw → normalized inspect view

Pipeline strip narrows to 2 chips. ~700 lines of dead illustration code removed (StudySelection narrows to {sessionId, stageId}, helpers gone).

Also unwraps Codex response_item / event_msg / turn_context wrappers in the normalizer so the stream shows real user_turn / assistant_turn / command_or_tool / observation kinds instead of opaque system_record.

Real Extract stage on the session-search workbench. Selecting Extract on a real session runs one extraction end-to-end and writes 6 files to $TMPDIR/scout-study/qmd/<session>/:

Mechanical pass (no LLM, iterates normalized records):
- manifest.json: source path, harness, recordsScanned, bytesRead
- files.md: paths pulled from tool args, grouped by hits + tools touched
- tool-calls.md: counts per tool name + first 30 invocations
- events-NNN.md: windowed event lines with source-ordered indices

LLM pass (one MiniMax-M2 call per session, cached for 1h):
- overview.md: what the session was about (2 paragraphs)
- decisions.md: decisions made + open follow-ups
- _llm-call.json: model, usage, latency for inspection

Adds the studio command primitive's first real LLM consumer. Secret access via lib/secrets.ts shells to `secret get` (keychain) so no env files touched. MiniMax client in lib/llm/minimax.ts returns content + reasoning + structured usage so the panel can surface real cost.

ExtractPanel renders the file list (mech / llm tagged), live preview of the selected artifact, and a footnote with real disk path + timings + token counts. Pipeline strip now shows Discover → Normalize → Extract, all real.
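As an illustration of the mechanical pass, here is a rough sketch of how a `tool-calls.md` artifact (counts per tool name plus the first 30 invocations) could be rendered from normalized records. The `ToolRecord` shape, the function name `renderToolCalls`, and the markdown layout are all assumptions; the commit does not show the real record type or file format.

```typescript
// Hypothetical normalized record for a tool invocation.
interface ToolRecord {
  index: number;    // source-ordered position in the session
  tool: string;     // tool name, e.g. "shell"
  summary: string;  // one-line description of the call
}

function renderToolCalls(records: ToolRecord[], maxListed = 30): string {
  // Count invocations per tool name.
  const counts = new Map<string, number>();
  for (const record of records) {
    counts.set(record.tool, (counts.get(record.tool) ?? 0) + 1);
  }
  // Most-used tools first; ties broken alphabetically for stable output.
  const sorted = Array.from(counts.entries()).sort(
    (a, b) => b[1] - a[1] || a[0].localeCompare(b[0]),
  );

  const lines: string[] = ["# Tool calls", "", "## Counts"];
  for (const [tool, count] of sorted) {
    lines.push(`- ${tool}: ${count}`);
  }
  lines.push("", `## First ${maxListed} invocations`);
  for (const record of records.slice(0, maxListed)) {
    lines.push(`- [${record.index}] ${record.tool}: ${record.summary}`);
  }
  return lines.join("\n") + "\n";
}
```

Because the pass only iterates records and formats strings, it needs no model call and its output is deterministic for a given session.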

Every page render now ends with a run trace showing what just happened underneath: one row per command (inventory, parse-session, extract-qmd), with wall time, cached status (with "saved ~Nms" callout), and the LLM model + token breakdown for the entries that ran a model. Header summarizes the request as a whole — wall ms (cached entries contribute 0), number cached and ms saved, cumulative LLM tokens, rough USD cost using a per-model rate table (MiniMax-M2 in for now). Page handler now orchestrates all commands explicitly so the run log can be built before rendering. NormalizePanel and ExtractPanel become presentational (receive their CommandRun as a prop). This is the "what's happening underneath" the workbench needed — when a fresh session triggers a real LLM call you see Extract QMD · ran · 7385 ms · MiniMax-M2 · 1002+681t, and on re-visit you see cached (saved ~Nms) with the original cost visible so you understand what the cache bought.
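The header arithmetic described above can be sketched as a small reducer over run-log entries. This is a hedged illustration: the `RunLogEntry` and `LlmUsage` shapes are hypothetical, the rates in `USD_PER_MILLION_TOKENS` are placeholder numbers (not MiniMax's published pricing), and counting tokens only for entries that actually ran is an assumption about how the header treats cached LLM calls.

```typescript
// Hypothetical shapes for one row of the run trace.
interface LlmUsage {
  model: string;
  inputTokens: number;
  outputTokens: number;
}

interface RunLogEntry {
  commandId: string;
  cached: boolean;
  wallMs: number;
  savedMs?: number;   // original run time, shown as "saved ~Nms" on a cache hit
  llm?: LlmUsage;
}

// Placeholder rates for illustration only; substitute real pricing.
const USD_PER_MILLION_TOKENS: Record<string, { input: number; output: number }> = {
  "MiniMax-M2": { input: 1.0, output: 2.0 },
};

interface RunSummary {
  wallMs: number;
  cachedCount: number;
  savedMs: number;
  llmTokens: number;
  usd: number;
}

function summarizeRun(entries: RunLogEntry[]): RunSummary {
  const summary: RunSummary = { wallMs: 0, cachedCount: 0, savedMs: 0, llmTokens: 0, usd: 0 };
  for (const entry of entries) {
    if (entry.cached) {
      // Cached entries contribute 0 wall time but report what they saved.
      summary.cachedCount += 1;
      summary.savedMs += entry.savedMs ?? 0;
      continue;
    }
    summary.wallMs += entry.wallMs;
    if (entry.llm) {
      summary.llmTokens += entry.llm.inputTokens + entry.llm.outputTokens;
      const rate = USD_PER_MILLION_TOKENS[entry.llm.model];
      if (rate) {
        summary.usd +=
          (entry.llm.inputTokens / 1_000_000) * rate.input +
          (entry.llm.outputTokens / 1_000_000) * rate.output;
      }
    }
  }
  return summary;
}
```

With the figures quoted above (a 7385 ms Extract run using 1002 input and 681 output tokens), the token total would be 1683; the dollar figure depends entirely on the rate table.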

The page handler now only awaits the cheap inventory before returning. Everything panel-dependent (parse, extract, enrich) moves into an async <StageBody> wrapped in <Suspense>. When you click a session pill or a pipeline chip, chrome (header, picker, pipeline strip, stage header, prev/next nav) renders instantly while the stage body shows a structured in-flight skeleton. Skeleton lists the commands that will run for the active stage with expected timing — Inventory · Parse session · Extract QMD · Enrich (LLM). For Enrich the user immediately sees "first run: 5–15 s" so the wait is explained, not opaque. Suspense boundary keyed on session+stage so each click resets the boundary cleanly. RunSummary moves inside StageBody so its log is built after all panel commands resolve. This pairs with the existing run trace footer: skeleton = expected sequence, trace = what actually ran.

Three workbench polish moves prompted by usage:

1. ArtifactPicker client island: Extract / Enrich panels pre-read every artifact's content server-side and hand the array to a "use client" component that swaps preview body via local state. URL stays in sync via router.replace (pushState) so links are shareable, but clicking a file does not navigate, does not refetch, does not flash the chrome. Verified zero document loads + zero HTTP doc requests on click.
2. Force re-run: runCommand learns an optional { force: true } that bypasses the cache lookup. URL ?force=<stageId> threads it through to the active stage; CommandSurface gains a "re-run ↻" link in the command header. Lets you see Extract / Normalize / Enrich actually run instead of always reading "cached".
3. Drop "(mechanical)" labels now that Enrich is split out: Extract's single output kind speaks for itself.

Suspense boundary key includes the force param so a re-run triggers the in-flight skeleton properly.

- parseSessionCommand reads from the head in 64 KiB chunks until it has `limit` newlines (or EOF). Previously it read one 128 KiB buffer, which capped any session at whatever fit in that window: Normalize at limit=14 was always ~20 ms regardless of source size.
- Lift Normalize's record limit from 14 to 1500, matching Extract / Enrich so all three share the parse-session cache key. Force-re-running Normalize now shows a meaningful workload (~150–340 ms on codex-large parsing 1500 records).
- NormalizedStreamBody caps the visible stream at 30 rows + a "N parsed but hidden · M not parsed" tail so the page stays compact.
- Force handling expands: ?force=all bypasses every command's cache in the active pipeline; ?force=<stageId> still works for a single stage. Inventory honors force too via the discover stage id.
- Three new re-run affordances:
  * "re-run all ↻" link in the page header (force=all)
  * "re-run ↻" already exists in each CommandSurface header
  * per-row "re-run ↻" links in the run trace footer, mapping command id to its stage so you can rerun any single step from the trace

Suspense key includes the force param so the in-flight skeleton appears on force-rerun.
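The force rules above (`?force=all` for the whole pipeline, `?force=<stageId>` for one stage, and trace rows mapping a command id back to its stage) could be expressed roughly as follows. Only the `force` query parameter and the command ids come from the commit messages; the stage ids, the `session` and `stage` parameter names, and the `/workbench` path used in the usage note are hypothetical.

```typescript
// Assumed mapping from command id to the stage that owns it.
const STAGE_FOR_COMMAND: Record<string, string> = {
  "inventory": "discover",
  "parse-session": "normalize",
  "extract-qmd": "extract",
  "enrich-session": "enrich",
};

// True when the cache should be bypassed for a command in the given stage.
function shouldForce(forceParam: string | undefined, stageId: string): boolean {
  return forceParam === "all" || forceParam === stageId;
}

// Builds the href for a per-row "re-run" link in the run trace.
// Returns undefined for a command id with no known stage.
function rerunHref(basePath: string, sessionId: string, commandId: string): string | undefined {
  const stageId = STAGE_FOR_COMMAND[commandId];
  if (stageId === undefined) return undefined;
  const params = new URLSearchParams({ session: sessionId, stage: stageId, force: stageId });
  return `${basePath}?${params.toString()}`;
}
```

For example, `rerunHref("/workbench", "s1", "parse-session")` would produce a link that forces only the Normalize stage, while a header link would simply set `force=all`.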

Two responsiveness wins:

1. Run-in-place: RerunLink Client Component wraps every re-run affordance (header "re-run all", in-CommandSurface "re-run", per-row trace re-runs). It uses useTransition so the previous UI stays mounted while the navigation streams in. The link swaps its label to "running ↻", gets aria-busy=true, and pulses (animate-pulse + status-info-fg) while the request is in flight. The stage Suspense key drops the force param so force-rerun navigation no longer unmounts the panel: old data stays visible until the new render lands. Session / stage switches still show the skeleton (key still changes on those).
2. Trace inspect: each run-trace row is now a <details> that expands to show the actual shell-equivalent, input summary, output summary, and resolved cache key for that command's run. makeRunLogEntry gains summarizeInput / summarizeOutput hooks so each command projects its inputs and outputs into one-line strings for the inspect drawer.

All three command inputs (parse-session, extract-qmd, enrich-session) now accept limit: number | "all". Defaults flip to "all" so every session is parsed end-to-end, not capped at a 1500-record head slice. readHeadLines streams from the file until EOF when the limit is Infinity, so memory stays bounded by the file size rather than a fixed buffer. parseSessionCommand.shell switches to `cat …` when limit is "all" so the copyable shell line stays honest.

Measured on a fresh process:
- codex-large (13 MiB, 4,220 events) normalize force: ~620 ms
- claude-large (52 MiB, 12,009 events) normalize force: ~5.3 s

The NormalizedStreamBody display cap of 30 rows still applies; tail text now reflects "parsed but hidden" rather than "not parsed" once nothing is being skipped.
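A chunked head reader along these lines might look like the sketch below. It keeps the name `readHeadLines`, the 64 KiB chunk size, and the `number | "all"` limit from the commit messages; the signature, the synchronous `node:fs` calls, and the choice to decode once at the end (so a multi-byte UTF-8 character split across two chunks is not corrupted) are assumptions rather than the project's actual implementation.

```typescript
import { Buffer } from "node:buffer";
import { closeSync, mkdtempSync, openSync, readSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

const CHUNK_BYTES = 64 * 1024;

type Limit = number | "all";

// Reads complete lines from the start of a file, stopping once `limit`
// newlines have been seen or EOF is reached. "all" maps to Infinity.
function readHeadLines(path: string, limit: Limit): string[] {
  const maxLines = limit === "all" ? Infinity : limit;
  const fd = openSync(path, "r");
  const chunks: Buffer[] = [];
  let newlines = 0;
  try {
    while (newlines < maxLines) {
      const chunk = Buffer.alloc(CHUNK_BYTES);
      // position = null reads from the current file offset and advances it.
      const bytesRead = readSync(fd, chunk, 0, CHUNK_BYTES, null);
      if (bytesRead === 0) break; // EOF
      for (let i = 0; i < bytesRead; i++) {
        if (chunk[i] === 0x0a) newlines++;
      }
      chunks.push(chunk.subarray(0, bytesRead));
    }
  } finally {
    closeSync(fd);
  }
  const lines = Buffer.concat(chunks).toString("utf8").split("\n");
  // A trailing newline leaves an empty final element; drop it.
  if (lines[lines.length - 1] === "") lines.pop();
  return lines.slice(0, maxLines);
}

// Usage: write a tiny JSONL file to a temp directory and read it back.
const demoDir = mkdtempSync(join(tmpdir(), "read-head-lines-"));
const demoPath = join(demoDir, "session.jsonl");
writeFileSync(demoPath, '{"n":1}\n{"n":2}\n{"n":3}\n');
const firstTwo = readHeadLines(demoPath, 2);        // first two records only
const everything = readHeadLines(demoPath, "all");  // reads to EOF
```

Note that this sketch holds every chunk it has read in memory, so with `"all"` its footprint grows with the file, which matches the "bounded by the file size" wording above.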

…x.db

Pipeline gains a fifth stage that turns the QMD sidecar files into a real, queryable data store. Selecting Index on a session walks every $TMPDIR/scout-study/qmd/<session>/ directory, splits each markdown file into H2 sections, and writes rows into a better-sqlite3 db with FTS5.

Schema:
- sessions (id, harness, indexed_at)
- documents (id, session_id, kind, path, bytes)
- chunks (id, document_id, ordinal, source_ref, text)
- chunks_fts (virtual FTS5 over chunks.text, with triggers for sync)

Tried `bun --bun next dev` first for bun:sqlite; it ships but breaks the client Router with "Router action dispatched before initialization" errors, so the RerunLink useTransition pattern fails. Reverted to `next dev` + better-sqlite3: boring, works, FTS5 included.

IndexPanel shows the schema with live row counts, plus a "this session" breakdown so you can see what just landed. Run trace footer treats index-corpus as a regular command with input / output summaries and a re-run link.

Verified end to end with a sqlite3 CLI snippet from outside the app:
- 1.17 MB db file on disk after one session
- chunks_fts MATCH 'VOX' returns real source-anchored snippets from decisions.md and events-001.md
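The "splits each markdown file into H2 sections" step can be illustrated with a small pure function that yields rows shaped like the `chunks` table (ordinal, source_ref, text). This is a sketch under assumptions: the function name, the `path#heading` format for `sourceRef`, and the `(preamble)` label for text before the first H2 are hypothetical, and it does not special-case `## ` lines that appear inside fenced code blocks.

```typescript
// Mirrors the chunks columns named in the schema above (minus ids).
interface Chunk {
  ordinal: number;
  sourceRef: string;
  text: string;
}

function splitIntoH2Sections(path: string, markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  let heading = "(preamble)"; // label for content before the first H2
  let buffer: string[] = [];

  // Emit the buffered section, skipping sections that are only whitespace.
  const flush = (): void => {
    const text = buffer.join("\n").trim();
    if (text !== "") {
      chunks.push({ ordinal: chunks.length, sourceRef: `${path}#${heading}`, text });
    }
    buffer = [];
  };

  for (const line of markdown.split("\n")) {
    // Matches "## Title" but not "# Title" or "### Title".
    const match = /^## +(.*)$/.exec(line);
    if (match) {
      flush();
      heading = (match[1] ?? "").trim();
    }
    buffer.push(line); // the heading line stays part of its own section
  }
  flush();
  return chunks;
}
```

Keeping the heading inside the chunk text means a full-text match on a section title still lands on the section it introduces.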

New /studies/data surface that makes the shape of the session-search index legible: schema, shortcuts deck, unified MATCH/SELECT Query card with a strategy registry, and an Ask field that turns a question into FTS5 hits via local stopword stripping — sub-millisecond, no proprietary tokeniser in the loop. Same Ask surface also lands as Step 6 in the session-search pipeline workbench with schema-aware suggestion chips.
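Turning a question into FTS5 hits via local stopword stripping might work roughly like this. The stopword list, the function name `questionToFtsQuery`, and the decision to join terms with `OR` are assumptions for illustration; the commit does not say which words are stripped or how the remaining terms are combined.

```typescript
// Small illustrative stopword list; the real list is not shown in the PR.
const STOPWORDS = new Set([
  "a", "an", "the", "is", "are", "was", "were", "what", "which", "who",
  "how", "why", "when", "where", "did", "do", "does", "we", "i", "you",
  "it", "in", "on", "of", "to", "for", "about", "and", "or", "with",
]);

// Builds an FTS5 MATCH expression from a natural-language question.
// Returns an empty string when nothing but stopwords remains.
function questionToFtsQuery(question: string): string {
  // Keep only ASCII word characters so no term can contain a double quote.
  const terms = question.toLowerCase().match(/[a-z0-9_]+/g) ?? [];
  const kept = terms.filter((term) => !STOPWORDS.has(term));
  const unique = Array.from(new Set(kept));
  // Quote each term so FTS5 treats it as a string, never as an operator.
  return unique.map((term) => `"${term}"`).join(" OR ");
}
```

The resulting string would be bound as the parameter of a query such as `SELECT … FROM chunks_fts WHERE chunks_fts MATCH ?`; callers should skip the query when the function returns an empty string, since an empty MATCH expression is a syntax error in FTS5.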

vercel · 2026-06-01T16:41:55Z

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

Project	Deployment	Actions	Updated (UTC)
openscout	Skipped		Jun 1, 2026 5:04pm

arach added 13 commits May 30, 2026 14:41

✨ Add Studio session search enrichment

4713373

🐛 Keep Scoutbot on local authority

faa7264

🔀 Merge main into Scoutbot fix PR

390412a

vercel Bot deployed to Preview June 1, 2026 16:45 View deployment

🧹 Cut over conversation identity metadata

7506003

vercel Bot temporarily deployed to Preview June 1, 2026 17:04 Inactive

arach merged commit 1f84868 into main Jun 1, 2026
3 checks passed

arach deleted the codex/search-workbench-studio-view branch June 1, 2026 17:06

arach merged 15 commits into main from codex/search-workbench-studio-view
