
[Improve] Add deterministic xAI provider e2e coverage#149

Merged
edelauna merged 5 commits into main from feature/add-xai-provider-e2e-3p85k8m13gigu on May 18, 2026

Conversation

@roomote roomote Bot commented May 16, 2026

Contributor

Opened on behalf of Elliott de Launay. View the task or mention @roomote for follow-up asks.

Related GitHub Issue

Not currently linked in GitHub.

Description

This PR adds deterministic mocked VS Code e2e coverage for the xAI Responses API provider and lands the follow-up fixes that were required to make that coverage trustworthy in CI.

The main implementation changes are:

  • add a dedicated xAI provider e2e suite that intercepts POST https://api.x.ai/v1/responses and verifies the read_file -> attempt_completion tool-use loop plus the expected request shape for grok-4.20
  • support optional local recording/replay of real xAI Responses API SSE events through the gitignored apps/vscode-e2e/fixtures/xai.json fixture file
  • fix the xAI mocked e2e request-capture leak by scoping probe diagnostics and assertions to the current probe tag, so delayed retries from earlier probes cannot contaminate the fast-model checks
  • document the xAI recording/replay workflow and fetch-interceptor hermetic-state guidance in apps/vscode-e2e/AGENTS.md
  • fix the Responses API stream fallback so a call_id-only delta does not incorrectly suppress the later response.output_item.done event that carries the first usable tool name
  • fix the latent Z.ai mocked e2e state leak by switching model/provider changes through profile activation so stale provider settings are cleared between cases
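
The recording/replay item above can be pictured with a small sketch. It assumes the fixture stores each SSE event as a name plus a parsed JSON payload; the `RecordedSseEvent` shape and both helper names are hypothetical, since the actual layout of apps/vscode-e2e/fixtures/xai.json is not shown in this PR.

```typescript
// Hypothetical fixture entry: one recorded SSE event (not the real xai.json schema).
interface RecordedSseEvent {
	event: string
	data: unknown
}

// Parse a raw SSE body into recorded events.
// Only the `event:` and `data:` fields are handled; other SSE fields are ignored.
function parseSse(raw: string): RecordedSseEvent[] {
	const events: RecordedSseEvent[] = []
	for (const block of raw.split(/\r?\n\r?\n/)) {
		let event = "message" // SSE default when no `event:` line is present
		const dataLines: string[] = []
		for (const line of block.split(/\r?\n/)) {
			if (line.startsWith("event:")) event = line.slice(6).trim()
			else if (line.startsWith("data:")) dataLines.push(line.slice(5).trim())
		}
		if (dataLines.length === 0) continue // skip empty blocks and comments
		const payload = dataLines.join("\n")
		if (payload === "[DONE]") continue // assumption: skip a terminator sentinel if one is sent
		events.push({ event, data: JSON.parse(payload) })
	}
	return events
}

// Serialize recorded events back into an SSE body that a mocked response can replay.
function serializeSse(events: RecordedSseEvent[]): string {
	return events.map((e) => `event: ${e.event}\ndata: ${JSON.stringify(e.data)}\n\n`).join("")
}
```

Storing parsed events rather than raw bytes keeps such a fixture diffable, at the cost of not preserving the original chunk boundaries.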

Reviewer focus:

  • the current CI fix is the xAI probe-isolation change in apps/vscode-e2e/src/suite/providers/xai.test.ts
  • the responses-api-stream.ts fix is still grounded in the current xAI provider path, which is the only caller of that helper today
  • xAI’s documented Responses API behavior says streamed function calls are returned whole in a single chunk, which supports the stricter fallback logic here
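
To illustrate the fallback behavior described above, here is a minimal sketch rather than the actual contents of responses-api-stream.ts. The event shapes are simplified assumptions (a delta that may carry `call_id` and `name`, and a `response.output_item.done` event carrying the completed function call), and `collectToolCalls` is a hypothetical name. The point it shows: a call ID is only marked as emitted once a usable tool name has been seen, so a call_id-only delta cannot suppress the later done event.

```typescript
// Simplified, assumed event shapes; the real stream carries more fields.
type StreamEvent =
	| { type: "response.function_call_arguments.delta"; call_id?: string; name?: string; delta: string }
	| {
			type: "response.output_item.done"
			item: { type: "function_call"; call_id: string; name: string; arguments: string }
	  }

interface ToolCall {
	id: string
	name: string
	arguments: string
}

function collectToolCalls(events: StreamEvent[]): ToolCall[] {
	const emitted = new Set<string>() // call IDs that already produced a tool call
	const calls: ToolCall[] = []
	for (const ev of events) {
		if (ev.type === "response.function_call_arguments.delta") {
			// Emit from a delta only when both the ID and the tool name are present.
			// A call_id-only delta is ignored and, crucially, NOT recorded in `emitted`.
			if (ev.call_id && ev.name && !emitted.has(ev.call_id)) {
				emitted.add(ev.call_id)
				calls.push({ id: ev.call_id, name: ev.name, arguments: ev.delta })
			}
			continue
		}
		// Fallback: the done event carries the complete call; skip it only if already emitted.
		const { call_id, name, arguments: args } = ev.item
		if (!emitted.has(call_id)) {
			emitted.add(call_id)
			calls.push({ id: call_id, name, arguments: args })
		}
	}
	return calls
}
```

Treating a single delta as a whole call leans on the whole-chunk behavior cited above; a provider that splits arguments across several deltas would need accumulation instead.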

Test Procedure

I validated the current branch with:

  • pnpm --filter @roo-code/vscode-e2e test:run -- --file xai.test
  • pnpm --filter @roo-code/vscode-e2e test:ci:mock
  • pnpm --dir apps/web-roo-code test
  • curl -I http://127.0.0.1:3000

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue (see "Related GitHub Issue" above).
  • Scope: My changes are focused on the linked issue (one major feature/fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Testing: New and/or updated tests have been added to cover my changes (if applicable).
  • Documentation Impact: I have considered if my changes require documentation updates (see "Documentation Updates" section below).
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Screenshots / Videos

Not applicable. This PR changes mocked e2e coverage, stream-processing logic, and contributor guidance, but it does not ship a rendered UI change.

Documentation Updates

  • No documentation updates are required for user-facing docs.
  • Yes, documentation updates are required.

Contributor-facing e2e guidance was updated in apps/vscode-e2e/AGENTS.md for xAI recording/replay and hermetic fetch-interceptor suites.

Additional Notes

The latest CI failure on this PR was not a broad provider outage. It was an order-dependent xAI test-harness bug where delayed grok-4.20 follow-up requests could leak into later fast-model assertions. The fix here narrows the captured requests to the current probe instead of accepting every follow-up call globally.

I do not have a live xAI API key in this sandbox, so the provider grounding in this PR remains based on the checked-in xAI handler/test shape plus xAI’s current docs rather than a fresh live capture from this task.

Get in Touch

Discord username not provided in this task context.

roomote Bot added the roomote:auto-resolve-conflicts label (Allow Roomote to auto-resolve merge conflicts for this PR) on May 16, 2026

roomote Bot commented May 16, 2026

Contributor Author

2 checks still pending after the review wait. See task

  • Assert that the xAI probe returns the exact marker from the workspace file instead of only checking for any completion_result event.
  • Responses API dedupe now drops tool calls when argument deltas include a call_id but omit the tool name.
  • e2e-mock is failing: the Z.ai GLM provider test still expects glm-5-turbo max_tokens 40,551, but the run reports 40,000.
  • CI still pending after the 10-minute wait: platform-unit-test (windows-latest). — stale: this sync review stopped once e2e-mock failed, before the pending-check timeout path applied.
  • The xAI request capture still leaks earlier probe traffic into the fast-model assertions, so e2e-mock fails when a stray grok-4.20 request is collected before the current fast-model probe.
  • CI still pending after the review wait: platform-unit-test (windows-latest).
  • CI still pending after the review wait: platform-unit-test (ubuntu-latest).

codecov Bot commented May 16, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.

Comment thread apps/vscode-e2e/src/suite/providers/xai.test.ts Outdated
edelauna changed the title from "[Chore] Add xAI provider e2e coverage" to "Fix Responses API duplicate streamed tool calls" on May 17, 2026
edelauna force-pushed the feature/add-xai-provider-e2e-3p85k8m13gigu branch from aa2085a to 39fb2c6 on May 17, 2026 19:42
edelauna marked this pull request as ready for review on May 17, 2026 19:42
Comment thread src/api/transform/responses-api-stream.ts
roomote Bot changed the title from "Fix Responses API duplicate streamed tool calls" to "[Improve] Add deterministic xAI provider e2e coverage" on May 17, 2026

roomote Bot commented May 17, 2026

Contributor Author

Fixed on the PR branch in 94f5535.

Latest shipped changes:

  • isolated the xAI probe request assertions to the current probe tag so delayed earlier retries cannot contaminate the fast-model checks
  • reset xAI request capture in setup() and documented hermetic-state guidance for fetch-interceptor suites in apps/vscode-e2e/AGENTS.md

Validation:

  • pnpm --filter @roo-code/vscode-e2e test:run -- --file xai.test
  • pnpm --filter @roo-code/vscode-e2e test:ci:mock
  • pnpm --dir apps/web-roo-code test
  • curl -I http://127.0.0.1:3000

Browser proof was not applicable because the shipped diff is mocked e2e coverage plus contributor guidance, not rendered UI.

edelauna previously approved these changes May 18, 2026

@edelauna edelauna left a comment

Contributor

Wanted to increase e2e provider coverage and noticed a small bug in the Responses API stream along the way. The work also surfaced an issue with Z.ai test-harness state management.

edelauna force-pushed the feature/add-xai-provider-e2e-3p85k8m13gigu branch from 9dd2fff to 1edbebe on May 18, 2026 14:13
Comment thread apps/vscode-e2e/src/suite/providers/xai.test.ts Outdated
roomote Bot commented May 18, 2026

Contributor Author

I validated a narrow fix locally for the failing e2e-mock job.

Root cause:
runXAIToolProbe() in apps/vscode-e2e/src/suite/providers/xai.test.ts was including any request with functionCallOutputIds, even if it belonged to an earlier probe. That lets delayed grok-4.20 follow-up traffic leak into later fast-model assertions.

Patch I validated locally:

requests: requests.filter(
  (request) =>
    request.probeTag === probeTag ||
    (request.model === modelId && request.functionCallOutputIds.length > 0),
),
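
As a worked illustration of the filter above, the following sketch runs the same predicate over hand-written sample data. The `CapturedRequest` shape, the `requestsForProbe` wrapper, the probe tags, and the model name `example-fast-model` are all hypothetical; it also assumes follow-up requests can arrive without a probe tag, which is presumably why the model clause exists.

```typescript
// Hypothetical capture shape, reduced to the three fields the filter reads.
interface CapturedRequest {
	probeTag?: string
	model: string
	functionCallOutputIds: string[]
}

// Same predicate as the patch: keep requests tagged for this probe,
// plus tool-output follow-ups sent for this probe's model.
function requestsForProbe(requests: CapturedRequest[], probeTag: string, modelId: string): CapturedRequest[] {
	return requests.filter(
		(request) =>
			request.probeTag === probeTag ||
			(request.model === modelId && request.functionCallOutputIds.length > 0),
	)
}

const captured: CapturedRequest[] = [
	{ probeTag: "probe-a", model: "grok-4.20", functionCallOutputIds: [] },
	// Delayed follow-up from the earlier grok-4.20 probe, arriving late and untagged.
	{ model: "grok-4.20", functionCallOutputIds: ["call_1"] },
	{ probeTag: "probe-b", model: "example-fast-model", functionCallOutputIds: [] },
	{ model: "example-fast-model", functionCallOutputIds: ["call_2"] },
]

// Scoped to the fast-model probe: only the last two entries survive,
// so the stray grok-4.20 follow-up no longer reaches the assertions.
const scoped = requestsForProbe(captured, "probe-b", "example-fast-model")
```

Note that the model clause still admits untagged follow-ups from an earlier probe that used the same model, so this scoping relies on consecutive probes targeting different models or on capture being reset between them.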

Validation run in a disposable PR worktree:

  • TEST_FILE=xai.test pnpm --filter @roo-code/vscode-e2e test:ci:mock
  • pnpm --filter @roo-code/vscode-e2e test:ci:mock

Results:

  • xAI-only suite: 3 passing
  • full mocked e2e package: 50 passing, 7 pending

So this looks like the right fix for the CI failure rather than just masking the assertion.

edelauna force-pushed the feature/add-xai-provider-e2e-3p85k8m13gigu branch from 94f5535 to 55bef60 on May 18, 2026 18:21
edelauna merged commit ec204f9 into main on May 18, 2026 (9 checks passed)
edelauna deleted the feature/add-xai-provider-e2e-3p85k8m13gigu branch on May 18, 2026 19:05
Labels

roomote:auto-resolve-conflicts Allow Roomote to auto-resolve merge conflicts for this PR

3 participants