test(examples-chat): aimock research-subagent scenario — Phase 2d #330
Merged
…n loop

Captures parent (tool_call) + subagent (summary) + parent continuation (final text) into a single fixture file. The continuation entry uses `match.hasToolResult=true` so aimock returns the final text answer instead of looping back to the first-call tool_call response. Without this, the parent re-emits `research` up to the langgraph recursion limit and the test takes ~3 minutes wall-clock.

Runner switched from `mock.onMessage()` to `mock.addFixturesFromJSON()` so the JSON fixture's richer match shapes (`toolName`, `hasToolResult`, `turnIndex`, etc.) survive the load path. Existing fixtures use just `userMessage` and keep working.
Summary
Adds an end-to-end Playwright scenario for the research subagent flow through the harness:
- `tool_call` to `research(topic="...")`
- content phrases (standalone components / NgModule) surface in the conversation body

The fixture is captured from real `gpt-5-mini` for the canonical demo prompt — three LLM calls in a single JSON file (parent first call, subagent, parent continuation). Sits on Phase 2c (#322).
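As a rough sketch, the three-entry fixture file might be shaped like this. Only the `match` field names (`userMessage`, `hasToolResult`, `toolName`, `turnIndex`) come from this PR's description; everything else (the `response` field, the type names as written here) is an illustrative assumption, not aimock's real schema:

```typescript
// Illustrative fixture shape only — aimock's actual FixtureFileEntry
// schema may differ. The `response` payload shape is assumed.
type FixtureMatch = {
  userMessage?: string;     // match on the user's message text
  hasToolResult?: boolean;  // match when a tool result is in the history
  toolName?: string;        // match on the tool being invoked
  turnIndex?: number;       // match on the LLM-call index
};

type FixtureFileEntry = {
  match: FixtureMatch;
  response: string;
};

// Order matters: the hasToolResult continuation entry is listed BEFORE
// the first-call entry so it wins once a tool result exists.
const entries: FixtureFileEntry[] = [
  { match: { hasToolResult: true }, response: "captured final-answer text" },
  { match: { userMessage: "canonical demo prompt" }, response: "tool_call: research(...)" },
  { match: { toolName: "research" }, response: "subagent summary" },
];
```

The runner would then hand these entries to `mock.addFixturesFromJSON(entries)` (the call named in the notes below) rather than registering each pair via `mock.onMessage()`.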
Test plan
Notes for reviewers
Multi-turn matching — the real change here
Without this PR, the parent LLM loops up to the langgraph recursion limit (we saw ~17 iterations, ~2.8 min wall-clock per suite run). The reason: aimock's default `userMessage`-only match returns the same response on every parent call, including the post-tool-round continuation that should emit text instead of another `tool_call`.

The fixture uses `match.hasToolResult: true` (aimock's `FixtureFileEntry.match` discriminator) on the continuation entry — order-sensitive, listed BEFORE the first-call entry — so aimock returns the captured final-answer text once the parent has a tool result in its history. The loop is broken; the test runs in ~1s.

The runner switched from `mock.onMessage(userMessage, response)` to `mock.addFixturesFromJSON(entries)` so the richer match shapes (`hasToolResult`, `toolName`, `toolCallId`, `turnIndex`) survive the load path. Existing single-`userMessage` fixtures keep working unchanged.

Single `<chat-subagents>`-panel assertion intentionally NOT included

The smoke checklist says the `<chat-subagents>` panel renders while the subagent is running. With aimock the subagent completes essentially instantly, so the panel exists for a sub-millisecond window — not catchable without artificial latency. The chosen assertions (tool-call chip + content phrases) catch the same regression class without timing fragility.
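To make the order-sensitive, multi-turn matching concrete, here is a minimal first-match-wins matcher that simulates the behavior described in these notes. The semantics are inferred from this PR's description, not taken from aimock's source; `Entry`, `HistoryItem`, and `pickResponse` are illustrative names:

```typescript
// Minimal simulation of the multi-turn matching described above.
// These names are hypothetical, not aimock's API.
type Entry = {
  match: { userMessage?: string; hasToolResult?: boolean };
  response: string;
};

type HistoryItem = { role: "user" | "assistant" | "tool"; text: string };

// First entry whose match conditions all hold wins.
function pickResponse(entries: Entry[], history: HistoryItem[]): string | undefined {
  const hasToolResult = history.some((m) => m.role === "tool");
  const userMessage = history.find((m) => m.role === "user")?.text;
  for (const e of entries) {
    if (e.match.hasToolResult !== undefined && e.match.hasToolResult !== hasToolResult) continue;
    if (e.match.userMessage !== undefined && e.match.userMessage !== userMessage) continue;
    return e.response;
  }
  return undefined;
}

// Continuation entry listed BEFORE the first-call entry, as the notes require.
const entries: Entry[] = [
  { match: { hasToolResult: true }, response: "final answer" },
  { match: { userMessage: "demo prompt" }, response: "tool_call" },
];

// First parent call: no tool result yet, so the tool_call response fires.
pickResponse(entries, [{ role: "user", text: "demo prompt" }]); // → "tool_call"

// Continuation: a tool result is in the history, so the final answer
// fires instead of re-emitting the tool_call — the loop is broken.
pickResponse(entries, [
  { role: "user", text: "demo prompt" },
  { role: "tool", text: "subagent summary" },
]); // → "final answer"
```

With `userMessage`-only matching, both calls above would hit the same entry, which is exactly the recursion-limit loop this PR eliminates.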