
test(examples-chat): aimock research-subagent scenario — Phase 2d #330

Merged

blove merged 1 commit into main from claude/aimock-subagent-panel on May 15, 2026

Conversation

@blove (Contributor) commented May 15, 2026

Summary

Adds an end-to-end Playwright scenario for the research subagent flow through the harness:

  • Parent LLM emits a tool_call to research(topic="...")
  • LangGraph dispatches the research subgraph
  • Subagent's LLM call returns a focused factual summary
  • Parent receives the tool result and composes a final answer
  • The test asserts that a "research" tool-call chip is present in the DOM AND that subagent-emitted content phrases ("standalone components", "NgModule") surface in the conversation body

The fixture is captured from a real gpt-5-mini run for the canonical demo prompt — three LLM calls in a single JSON file (parent first call, subagent call, parent continuation).
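For illustration, the three captured entries might be shaped roughly like this. This is a sketch only: the match keys (`userMessage`, `hasToolResult`, `toolName`) come from the PR description, but the surrounding `FixtureFileEntry` structure and the response payloads are assumptions, with placeholders where the real captured text is elided:

```json
[
  {
    "match": { "hasToolResult": true },
    "response": "<captured final-answer text from the parent continuation>"
  },
  {
    "match": { "userMessage": "<canonical demo prompt>" },
    "response": { "toolCall": { "name": "research", "arguments": { "topic": "<captured topic>" } } }
  },
  {
    "match": { "toolName": "research" },
    "response": "<captured focused factual summary from the subagent>"
  }
]
```

Note the ordering: the `hasToolResult` continuation entry is listed first, which matters for the loop-breaking behavior described below.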

Sits on Phase 2c (#322).

Test plan

  • New spec passes solo locally: 1/1 in ~1.0s after warm-up
  • Full suite passes 3/3 consecutive runs (with port cooldown between)
  • Runner unit tests still pass (3/3, including the directory-mode test)
  • No production code touched
  • CI green

Notes for reviewers

Multi-turn matching — the real change here

Without this PR, the parent LLM loops up to the langgraph recursion limit (we saw ~17 iterations, ~2.8 min wall-clock per suite run). The reason: aimock's default userMessage-only matching returns the same response on every parent call, including the post-tool-round continuation, which should emit text instead of another tool_call.

The fixture uses match.hasToolResult: true (aimock's FixtureFileEntry.match discriminator) on the continuation entry — order-sensitive, listed BEFORE the first-call entry — so aimock returns the captured final-answer text once the parent has a tool result in its history. This breaks the loop; the test runs in ~1 s.

The runner switched from mock.onMessage(userMessage, response) to mock.addFixturesFromJSON(entries) so the richer match shapes (hasToolResult, toolName, toolCallId, turnIndex) survive the load path. Existing single-userMessage fixtures keep working unchanged.

Single-<chat-subagents>-panel assertion intentionally NOT included

The smoke checklist says the <chat-subagents> panel renders while the subagent is running. With aimock the subagent completes essentially instantly, so the panel exists for a sub-millisecond window — not catchable without artificial latency. The chosen assertions (tool-call chip + content phrases) catch the same regression class without timing fragility.

…n loop

Captures parent (tool_call) + subagent (summary) + parent continuation
(final text) into a single fixture file. The continuation entry uses
match.hasToolResult=true so aimock returns the final text answer instead
of looping back to the first-call tool_call response. Without this, the
parent re-emits research up to the langgraph recursion limit and the test
takes ~3 minutes wall-clock.

Runner switched from mock.onMessage() to mock.addFixturesFromJSON() so the
JSON fixture's richer match shapes (toolName, hasToolResult, turnIndex,
etc.) survive the load path. Existing fixtures use just userMessage and
keep working.

vercel Bot commented May 15, 2026

The latest updates on your projects.

cacheplane — deployment Ready (Preview) — updated May 15, 2026 7:25pm UTC


@blove merged commit fdfbbf6 into main on May 15, 2026
14 checks passed
