test: add mock-LLM e2e coverage for recent PRs (2026-06-10) by malhotra5 · Pull Request #1291 · OpenHands/agent-canvas

malhotra5 · 2026-06-10T09:08:26Z

Summary

This PR adds mock-LLM E2E test coverage for two recently merged PRs that introduced user-facing changes without corresponding E2E tests.

Covered PRs

PR	Title	New spec file
#1288	UI polish: drawer tabs, empty states, and browser chrome	`mock-llm-drawer-and-empty-states.spec.ts`
#1246	feat(chat): per-tool visualizers for tool calls in the conversation UI	`mock-llm-tool-visualizers.spec.ts`

New test coverage

mock-llm-drawer-and-empty-states.spec.ts (PR #1288):

Browser chrome bar renders with URL placeholder in empty state
Terminal tab shows empty state message when no output
Tab switching between browser, terminal, and files tabs
VS Code drawer link is visible in the tab bar

mock-llm-tool-visualizers.spec.ts (PR #1246):

Bash/terminal tool visualizer renders command text and output
File editor tool visualizer renders file path chip and diff content
Agent reply renders correctly after tool call events

Implementation

Both specs use the page.route() mock-conversation pattern established in mock-llm-ui-regressions.spec.ts, injecting synthetic conversation events to test UI rendering without requiring a real LLM conversation.
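As a rough illustration of that pattern, the sketch below builds a small list of synthetic events and a route handler that serves them as JSON. This is not the code from the spec files: the `MockEvent` shape, the `items` response envelope, and the helper names `buildMockEvents` and `makeEventsHandler` are assumptions made for the example. `RouteLike` is a minimal stand-in for the `fulfill()` method of Playwright's `Route`, so the snippet runs without the library installed.

```typescript
// Minimal structural stand-in for the part of Playwright's Route used here,
// so the sketch can run without importing @playwright/test.
interface RouteLike {
  fulfill(response: { status: number; contentType: string; body: string }): Promise<void>;
}

// Hypothetical event shape; the real event schema is not shown in this PR.
interface MockEvent {
  id: string;
  kind: "message" | "action" | "observation";
  source: "user" | "agent" | "environment";
  text: string;
}

// Build a fixed, scripted conversation: user prompt, tool call, tool output, agent reply.
function buildMockEvents(command: string, output: string): MockEvent[] {
  return [
    { id: "evt-1", kind: "message", source: "user", text: "run a command" },
    { id: "evt-2", kind: "action", source: "agent", text: command },
    { id: "evt-3", kind: "observation", source: "environment", text: output },
    { id: "evt-4", kind: "message", source: "agent", text: "Done." },
  ];
}

// Return a handler that answers an intercepted request with the synthetic events.
// The { items: [...] } envelope is an assumption, not the app's confirmed response format.
function makeEventsHandler(events: MockEvent[]) {
  return async (route: RouteLike): Promise<void> => {
    await route.fulfill({
      status: 200,
      contentType: "application/json",
      body: JSON.stringify({ items: events }),
    });
  };
}
```

In a real spec the handler would be registered with something like `await page.route(urlPattern, makeEventsHandler(events))` before navigating, where `urlPattern` is whatever endpoint the UI fetches events from (not specified here).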

Verification

npm run typecheck passes ✅
npm run build passes ✅

This PR was created by an AI agent (OpenHands) on behalf of the user.

HUMAN:

A human has tested these changes.

🐳 Docker images for this PR

• GHCR package: https://github.com/OpenHands/agent-canvas/pkgs/container/agent-canvas

Component	Value
Image	`ghcr.io/openhands/agent-canvas`
Architectures	amd64, arm64
Agent Server	`ghcr.io/openhands/agent-server:1.28.1-python`
Automation	`openhands-automation==1.0.0a9`
Commit	`4f4181a16f7c72cf15c3876c87630f2d0eda3829`

Pull (multi-arch manifest)

# Multi-arch manifest — Docker automatically pulls the correct architecture
docker pull ghcr.io/openhands/agent-canvas:sha-4f4181a

Run

docker run -it --rm \
  -p 8000:8000 \
  ghcr.io/openhands/agent-canvas:sha-4f4181a

All tags pushed for this build

ghcr.io/openhands/agent-canvas:sha-4f4181a-amd64
ghcr.io/openhands/agent-canvas:auto-e2e-coverage-2026-06-10-amd64
ghcr.io/openhands/agent-canvas:pr-1291-amd64
ghcr.io/openhands/agent-canvas:sha-4f4181a-arm64
ghcr.io/openhands/agent-canvas:auto-e2e-coverage-2026-06-10-arm64
ghcr.io/openhands/agent-canvas:pr-1291-arm64
ghcr.io/openhands/agent-canvas:sha-4f4181a
ghcr.io/openhands/agent-canvas:auto-e2e-coverage-2026-06-10
ghcr.io/openhands/agent-canvas:pr-1291

About Multi-Architecture Support

Each tag (e.g., sha-4f4181a) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., sha-4f4181a-amd64) are also available if needed

Add two new mock-LLM E2E spec files covering features merged in the last 24 hours that lacked end-to-end test coverage:

- mock-llm-drawer-and-empty-states.spec.ts (PR #1288):
  - Browser chrome bar renders with URL placeholder in empty state
  - Terminal tab shows empty state message
  - Tab switching between browser, terminal, and files tabs
  - VS Code drawer link visibility in tab bar
- mock-llm-tool-visualizers.spec.ts (PR #1246):
  - Bash/terminal tool visualizer renders command and output
  - File editor tool visualizer renders file path and diff content
  - Agent reply renders correctly after tool call events

Both specs use the page.route() mock-conversation pattern established in mock-llm-ui-regressions.spec.ts, matching existing test conventions.

Co-authored-by: openhands <openhands@all-hands.dev>

vercel · 2026-06-10T09:08:32Z

Project	Deployment	Actions	Updated (UTC)
agent-canvas	Ready	Preview, Comment	Jun 13, 2026 1:48am

github-actions · 2026-06-10T09:19:08Z

❌ Mock-LLM E2E Tests

55/61 passed · 2 failed · 4 skipped · 🆕 7 new

Commit: 0c5278a6 · Workflow run · Test artifacts

🟢 7 new tests added in this PR

✅ mock-llm-drawer-and-empty-states.spec.ts › browser chrome bar shows URL placeholder in empty state

❌ mock-llm-drawer-and-empty-states.spec.ts › terminal tab shows empty state message

⏭️ mock-llm-drawer-and-empty-states.spec.ts › tab switching between browser, terminal, and files tabs

⏭️ mock-llm-drawer-and-empty-states.spec.ts › VS Code drawer link is visible in the tab bar

❌ mock-llm-tool-visualizers.spec.ts › bash tool visualizer renders command and output

⏭️ mock-llm-tool-visualizers.spec.ts › file editor visualizer renders file path and diff content

⏭️ mock-llm-tool-visualizers.spec.ts › agent reply renders after tool call events

Status	Test	Duration
✅	mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 1: configure ACP agent via Settings → Agent UI	13.8s
✅	mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 2: reload and verify ACP settings are persisted in UI	5.6s
✅	mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 3: start ACP conversation and verify agent reply	6.4s
✅	mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 4: resume ACP conversation from sidebar after navigating away	5.8s
✅	mock-llm-auth-modes.spec.ts › auth mode: fresh install with runtime-injected key › reaches the onboarding modal without pre-seeded localStorage	1.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	5.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.2s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.7s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › skips auth screen for returning user with valid stored key	777ms
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	7.4s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	28.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	6.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	6.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	10.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	7.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	5.9s
✅	mock-llm-cross-connect.spec.ts › cross-connect: frontend-only → backend-only › frontend-only connects to a separate backend-only instance	15.9s
✅	mock-llm-cross-connect.spec.ts › cross-connect: frontend-only → multiple backends › connects to two separate backends and switches between them	20.7s
✅	mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › browser chrome bar shows URL placeholder in empty state	6.4s
❌	mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › terminal tab shows empty state message	21.1s
⏭️	mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › tab switching between browser, terminal, and files tabs	0ms
⏭️	mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › VS Code drawer link is visible in the tab bar	0ms
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 1: ensure mock LLM profile is configured	221ms
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 2: start conversation and attach workspace metadata	12.5s
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 3: git control bar shows workspace pill and git actions	25.3s
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 4: files tab defaults to diff view for attached workspace	5.9s
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 5: browser tab shows empty state	6.3s
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 6: files tab defaults to file-tree view without attached workspace	7.5s
✅	mock-llm-folder-workspace.spec.ts › mock-LLM folder browser → workspace → conversation › step 1: browse to a folder, add it as a workspace, and launch a conversation with the correct working_dir	7.6s
✅	mock-llm-image-upload.spec.ts › mock-LLM image upload › attaching an image embeds it as base64 in the LLM completion call	13.4s
✅	mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 1: GitHub card is visible on the MCP marketplace page	5.6s
✅	mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 2: clicking GitHub card opens the install modal with correct fields	5.7s
✅	mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 3: full install flow — fill PAT, submit, verify installed	12.6s
✅	mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 4: installed GitHub server can be deleted	5.8s
✅	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory	12.9s
✅	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 2: start conversation, switch profile via /model, verify switch	6.8s
✅	mock-llm-onboarding-happy-path.spec.ts › onboarding happy path › completes the full onboarding flow and launches a conversation	4.4s
✅	mock-llm-onboarding-regressions.spec.ts › onboarding recent regressions › keeps the modal open on backdrop click and Escape	1.3s
✅	mock-llm-onboarding-regressions.spec.ts › onboarding recent regressions › defaults the LLM setup step to OpenAI GPT-5.5	1.6s
✅	mock-llm-partial-stack.spec.ts › partial stack: --frontend-only › serves the frontend but returns 503 for backend routes	7.4s
✅	mock-llm-partial-stack.spec.ts › partial stack: --backend-only › serves backend APIs but returns 503 for the frontend root	13.1s
✅	mock-llm-partial-stack.spec.ts › partial stack: port conflict › fails with a clear error when the ingress port is occupied	108ms
✅	mock-llm-partial-stack.spec.ts › partial stack: port conflict › starts successfully on a free port after a conflict	6.0s
✅	mock-llm-preset-automation.spec.ts › preset automation → slash command conversation › automation card sends the correct slash command to a conversation	15.9s
✅	mock-llm-preset-automation.spec.ts › preset automation → slash command conversation › direct slash command from home page triggers skill activation	13.5s
✅	mock-llm-profile-management.spec.ts › active profile deletion + reconciliation › active profile is deletable and reconciliation activates another profile	8.5s
✅	mock-llm-profile-management.spec.ts › same-model profile identity › chat header shows the correct profile when two profiles share the same model	15.9s
✅	mock-llm-profile-management.spec.ts › litellm_proxy proxy base_url preservation › re-saving a litellm_proxy profile from Basic view preserves the proxy base_url	7.8s
✅	mock-llm-skills.spec.ts › skill loading: project, user, and deletion › project skill in workspace/.agents/skills/ triggers on matching keyword	14.4s
✅	mock-llm-skills.spec.ts › skill loading: project, user, and deletion › user skill in ~/.openhands/skills/ triggers on matching keyword	13.4s
✅	mock-llm-skills.spec.ts › skill loading: project, user, and deletion › deleting a user skill removes it from subsequent conversations	13.4s
❌	mock-llm-tool-visualizers.spec.ts › tool visualizers › bash tool visualizer renders command and output	15.3s
⏭️	mock-llm-tool-visualizers.spec.ts › tool visualizers › file editor visualizer renders file path and diff content	0ms
⏭️	mock-llm-tool-visualizers.spec.ts › tool visualizers › agent reply renders after tool call events	0ms
✅	mock-llm-ui-regressions.spec.ts › UI regressions › scopes standalone styles to the agent-server-ui shell	1.5s
✅	mock-llm-ui-regressions.spec.ts › UI regressions › renders critic results on agent messages and finish actions	1.4s
✅	mock-llm-ui-regressions.spec.ts › UI regressions › loads older events when scrolling up	1.6s
✅	mock-llm-ui-regressions.spec.ts › UI regressions › selected workspace persists after navigating away and returning	2.0s
✅	mock-llm-ui-regressions.spec.ts › UI regressions › cleared sessionStorage yields empty workspace selection	929ms

🔍 Failure details (2)

❌ mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › terminal tab shows empty state message

Error: expect(locator).toBeVisible() failed

Locator: getByText(/No terminal output|No output/i).first()
Expected: visible
Timeout: 15000ms
Error: element(s) not found

Call log:
  - Expect "toBeVisible" with timeout 15000ms
  - waiting for getByText(/No terminal output|No output/i).first()

❌ mock-llm-tool-visualizers.spec.ts › tool visualizers › bash tool visualizer renders command and output

Error: expect(locator).toBeVisible() failed

Locator: getByText('echo \'hello world\'').first()
Expected: visible
Timeout: 10000ms
Error: element(s) not found

Call log:
  - Expect "toBeVisible" with timeout 10000ms
  - waiting for getByText('echo \'hello world\'').first()

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

github-actions · 2026-06-10T09:23:52Z

🛑 Mock-LLM Docker E2E Test Results

46/60 passed · 3 failed · 11 skipped · ⚠️ 1 not run (process killed at 60/61)

Commit: 0c5278a6 · Workflow run · Test artifacts

Status	Test	Duration
✅	chromium › mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 1: configure ACP agent via Settings → Agent UI	13.9s
✅	chromium › mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 2: reload and verify ACP settings are persisted in UI	5.6s
✅	chromium › mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 3: start ACP conversation and verify agent reply	6.7s
✅	chromium › mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 4: resume ACP conversation from sidebar after navigating away	5.8s
✅	chromium › mock-llm-auth-modes.spec.ts › auth mode: fresh install with runtime-injected key › reaches the onboarding modal without pre-seeded localStorage	1.3s
✅	chromium › mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	5.3s
✅	chromium › mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.2s
✅	chromium › mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.4s
✅	chromium › mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.8s
✅	chromium › mock-llm-auth-modes.spec.ts › auth mode: public gate › skips auth screen for returning user with valid stored key	805ms
✅	chromium › mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.5s
✅	chromium › mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	7.5s
✅	chromium › mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	33.5s
✅	chromium › mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	6.1s
✅	chromium › mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	6.3s
✅	chromium › mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	6.2s
✅	chromium › mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	6.3s
✅	chromium › mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	5.9s
⏭️	chromium › mock-llm-cross-connect.spec.ts › cross-connect: frontend-only → backend-only › frontend-only connects to a separate backend-only instance	189ms
⏭️	chromium › mock-llm-cross-connect.spec.ts › cross-connect: frontend-only → multiple backends › connects to two separate backends and switches between them	184ms
✅	chromium › mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › browser chrome bar shows URL placeholder in empty state	6.4s
❌	chromium › mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › terminal tab shows empty state message	21.1s
⏭️	chromium › mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › tab switching between browser, terminal, and files tabs	0ms
⏭️	chromium › mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › VS Code drawer link is visible in the tab bar	0ms
✅	chromium › mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › browser chrome bar shows URL placeholder in empty state	6.8s
❌	chromium › mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › terminal tab shows empty state message	21.5s
⏭️	chromium › mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › tab switching between browser, terminal, and files tabs	0ms
⏭️	chromium › mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › VS Code drawer link is visible in the tab bar	0ms
✅	chromium › mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 1: ensure mock LLM profile is configured	238ms
✅	chromium › mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 2: start conversation and attach workspace metadata	11.5s
✅	chromium › mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 3: git control bar shows workspace pill and git actions	25.3s
✅	chromium › mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 4: files tab defaults to diff view for attached workspace	5.9s
✅	chromium › mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 5: browser tab shows empty state	6.4s
✅	chromium › mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 6: files tab defaults to file-tree view without attached workspace	7.2s
✅	chromium › mock-llm-folder-workspace.spec.ts › mock-LLM folder browser → workspace → conversation › step 1: browse to a folder, add it as a workspace, and launch a conversation with the correct working_dir	7.3s
✅	chromium › mock-llm-image-upload.spec.ts › mock-LLM image upload › attaching an image embeds it as base64 in the LLM completion call	13.3s
✅	chromium › mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 1: GitHub card is visible on the MCP marketplace page	5.5s
✅	chromium › mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 2: clicking GitHub card opens the install modal with correct fields	5.7s
✅	chromium › mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 3: full install flow — fill PAT, submit, verify installed	13.0s
✅	chromium › mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 4: installed GitHub server can be deleted	5.8s
✅	chromium › mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory	13.1s
✅	chromium › mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 2: start conversation, switch profile via /model, verify switch	6.7s
✅	chromium › mock-llm-onboarding-happy-path.spec.ts › onboarding happy path › completes the full onboarding flow and launches a conversation	3.4s
✅	chromium › mock-llm-onboarding-regressions.spec.ts › onboarding recent regressions › keeps the modal open on backdrop click and Escape	1.4s
✅	chromium › mock-llm-onboarding-regressions.spec.ts › onboarding recent regressions › defaults the LLM setup step to OpenAI GPT-5.5	1.6s
⏭️	chromium › mock-llm-partial-stack.spec.ts › partial stack: --frontend-only › serves the frontend but returns 503 for backend routes	183ms
✅	chromium › mock-llm-partial-stack.spec.ts › partial stack: --backend-only › serves backend APIs but returns 503 for the frontend root	25.1s
⏭️	chromium › mock-llm-partial-stack.spec.ts › partial stack: port conflict › fails with a clear error when the ingress port is occupied	0ms
⏭️	chromium › mock-llm-partial-stack.spec.ts › partial stack: port conflict › starts successfully on a free port after a conflict	1ms
✅	chromium › mock-llm-preset-automation.spec.ts › preset automation → slash command conversation › automation card sends the correct slash command to a conversation	16.4s
✅	chromium › mock-llm-preset-automation.spec.ts › preset automation → slash command conversation › direct slash command from home page triggers skill activation	13.3s
✅	chromium › mock-llm-profile-management.spec.ts › active profile deletion + reconciliation › active profile is deletable and reconciliation activates another profile	8.6s
✅	chromium › mock-llm-profile-management.spec.ts › same-model profile identity › chat header shows the correct profile when two profiles share the same model	14.8s
✅	chromium › mock-llm-profile-management.spec.ts › litellm_proxy proxy base_url preservation › re-saving a litellm_proxy profile from Basic view preserves the proxy base_url	8.3s
✅	chromium › mock-llm-skills.spec.ts › skill loading: project, user, and deletion › project skill in workspace/.agents/skills/ triggers on matching keyword	13.5s
✅	chromium › mock-llm-skills.spec.ts › skill loading: project, user, and deletion › user skill in ~/.openhands/skills/ triggers on matching keyword	13.3s
✅	chromium › mock-llm-skills.spec.ts › skill loading: project, user, and deletion › deleting a user skill removes it from subsequent conversations	13.2s
❌	chromium › mock-llm-tool-visualizers.spec.ts › tool visualizers › bash tool visualizer renders command and output	15.4s
⏭️	chromium › mock-llm-tool-visualizers.spec.ts › tool visualizers › file editor visualizer renders file path and diff content	0ms
⏭️	chromium › mock-llm-tool-visualizers.spec.ts › tool visualizers › agent reply renders after tool call events	0ms

🔍 Failure details (3)

❌ chromium › mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › terminal tab shows empty state message

Error: expect(locator).toBeVisible() failed

Locator: getByText(/No terminal output|No output/i).first()
Expected: visible
Timeout: 15000ms
Error: element(s) not found

Call log:
  - Expect "toBeVisible" with timeout 15000ms
  - waiting for getByText(/No terminal output|No output/i).first()

❌ chromium › mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › terminal tab shows empty state message

Error: expect(locator).toBeVisible() failed

Locator: getByText(/No terminal output|No output/i).first()
Expected: visible
Timeout: 15000ms
Error: element(s) not found

Call log:
  - Expect "toBeVisible" with timeout 15000ms
  - waiting for getByText(/No terminal output|No output/i).first()

❌ chromium › mock-llm-tool-visualizers.spec.ts › tool visualizers › bash tool visualizer renders command and output

Error: expect(locator).toBeVisible() failed

Locator: getByText('echo \'hello world\'').first()
Expected: visible
Timeout: 10000ms
Error: element(s) not found

Call log:
  - Expect "toBeVisible" with timeout 10000ms
  - waiting for getByText('echo \'hello world\'').first()

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

all-hands-bot · 2026-06-12T01:15:21Z

🤖 OpenHands is reviewing this PR.

Trigger label: canvas-review
Label event: 26653003564 at 2026-06-12T01:13:07Z
Head commit: 0c5278a6ef0cde753847b74c0333b0d0ec87665c
View the conversation: https://nestable-nonremittably-sha.ngrok-free.dev/conversations/967c655d-2f53-4cf3-a401-bbda4d7b5128

This comment was posted by an AI agent (OpenHands).

all-hands-bot · 2026-06-12T01:20:20Z

Thanks for adding targeted E2E coverage. I found a couple of material issues that need to be fixed before this can merge:

The newly added tests are currently failing in CI. The PR’s mock-LLM report shows mock-llm-drawer-and-empty-states.spec.ts › terminal tab shows empty state message and mock-llm-tool-visualizers.spec.ts › bash tool visualizer renders command and output failing, with the remaining new tests skipped because the specs run serially. Docker E2E reports the same failures.
mock-llm-drawer-and-empty-states.spec.ts uses an invalid execution_status. In buildMockConversation() (tests/e2e/mock-llm/mock-llm-drawer-and-empty-states.spec.ts:32-41), execution_status is set to "stopped", but ExecutionStatus only maps values like "idle", "running", "paused", "finished", etc. Unknown values fall through to AgentState.LOADING in useAgentState, which makes Terminal render the runtime/waiting state instead of EmptyTerminalMessage. That explains why the assertion at lines 217-219 never finds “No terminal output”. Please use a valid status such as "idle"/"finished" if the intent is to exercise the terminal empty state, or update the assertion if the intended state is runtime-inactive.
The tool visualizer assertions don’t account for collapsed event grouping. In mock-llm-tool-visualizers.spec.ts, the mocked action/observation events are converted to UI observations and then folded into a collapsed EventGroup; individual GenericEventMessage details are also collapsed by default. As a result, assertions like page.getByText(BASH_COMMAND) at lines 263-268 and the output assertion at 271-276 look for hidden content and fail. The tests should expand the event group (and the relevant event details where needed) before asserting on command/output/diff content.
The visualizer coverage is not specific enough yet. Even after expansion, assertions such as “command text is visible”, “file path is visible”, and “old/new content is visible” (mock-llm-tool-visualizers.spec.ts:263-319) could also pass through the legacy markdown fallback path. Since the PR’s purpose is to cover per-tool visualizers, please add assertions that distinguish the React visualizer path from generic markdown rendering—ideally via stable data-testids on the visualizer primitives/cards, or another deterministic visualizer-specific DOM signal.
Non-code CI blocker: the PR description validation check is failing because the required template/HUMAN sections are not in the expected form and the human-tested checkbox is unchecked. Per repo guidance, that needs a human update rather than an agent edit.
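The `execution_status` issue above can be sketched as a lookup with a default. This is a simplified model rather than the actual `useAgentState` implementation: only the status strings quoted in the review ("idle", "running", "paused", "finished", "stopped") and the `LOADING` fallback come from the review, while the other state names and the `toAgentState` and `showsEmptyTerminalMessage` helpers are hypothetical.

```typescript
// Hypothetical UI states; only LOADING is named in the review above.
type AgentState = "LOADING" | "AWAITING_INPUT" | "RUNNING" | "PAUSED" | "FINISHED";

// Known execution_status values mapped to UI states (the pairing is an assumption).
const STATUS_TO_STATE = new Map<string, AgentState>([
  ["idle", "AWAITING_INPUT"],
  ["running", "RUNNING"],
  ["paused", "PAUSED"],
  ["finished", "FINISHED"],
]);

// Any status missing from the map falls through to LOADING,
// which is the behaviour the review describes for unknown values.
function toAgentState(executionStatus: string): AgentState {
  return STATUS_TO_STATE.get(executionStatus) ?? "LOADING";
}

// Simplified stand-in for the terminal's rendering decision: the empty-state
// message is only shown once the agent is out of LOADING and there is no output.
function showsEmptyTerminalMessage(state: AgentState, hasOutput: boolean): boolean {
  return state !== "LOADING" && !hasOutput;
}

const stoppedState = toAgentState("stopped"); // "LOADING": "stopped" is not a mapped value
const idleState = toAgentState("idle"); // "AWAITING_INPUT"
```

Under this model, a mock conversation with `execution_status: "stopped"` never reaches the empty-state branch, so a locator for "No terminal output" would time out, whereas "idle" or "finished" would reach it.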

I did not see security concerns in the added test code, but the failing/inadequate tests mean this coverage is not merge-ready yet.

This review comment was generated by an AI agent (OpenHands) on behalf of the user.

🔄 CHANGES REQUESTED

This comment was posted by an AI agent (OpenHands).

github-actions · 2026-06-12T01:43:29Z

📸 Snapshot Test Report

Warning

Snapshot comparison step crashed (timeout, OOM, or runner error) — diff results below may be incomplete or absent.
Check the CI logs for the full error output (look for the "Run snapshot comparison" step).

❌ 25 snapshots differ from the main branch baselines. Add the update-snapshots label to acknowledge intentional changes.

Category	Count
🔴 Changed	25
🆕 New	0
✅ Unchanged	49
Total	74

How to resolve:

Unintentional diffs — the baselines on main may have moved since this branch was created. Merge the latest main into this branch and re-run CI.

Intentional changes — add the update-snapshots label. CI will pass and the new screenshots become the baseline when this PR merges.

🔴 Changed snapshots (25)

Each snapshot below has Expected (main), Actual (PR), and Diff images in the original report.

`archived-conversation`

conversation-view-archived

`automations` (3 snapshots)

automations-list-active-inactive
automations-no-automations
automations-search-no-results

`backends-extended` (3 snapshots)

backend-add-cloud-no-key-disabled
backend-add-invalid-url-disabled
backend-dropdown-two-backends

`backends` (3 snapshots)

backend-add-modal
backend-manage-modal
backend-selector-open

`changes-tab`

changes-empty

`mcp-page` (4 snapshots)

mcp-custom-server-1-editor-open
mcp-custom-server-editor
mcp-search-filtered
mcp-slack-install-2-modal

`onboarding`

onboarding-step-0-check-backend

`settings-page` (3 snapshots)

analytics-consent-modal
home-screen
settings-page

`settings-secrets`

secrets-list

`settings-verification`

condenser-settings

`skills-page` (4 snapshots)

skills-empty
skills-loaded
skills-no-match
skills-search-filtered

✅ Unchanged snapshots (49)

archived-conversation

conversation-panel-with-archived-badges
conversation-view-sandbox-error

automations

automations-delete-modal

backends-extended

backend-add-blank-disabled
backend-add-cloud-advanced-open
backend-add-cloud-with-key-enabled
backend-add-form-partially-filled
backend-add-local-ready
backend-add-name-only-disabled
backend-add-two-column-layout
backend-add-whitespace-host-disabled
backend-after-switch
backend-cancel-nothing-saved
backend-edit-prefilled
backend-manage-after-removal
backend-manage-two-listed
backend-remove-cancelled
backend-remove-confirmation
backend-switch-overlay

changes-tab

changes-deleted-file
changes-diff-viewer

collapsible-thinking

reasoning-content-collapsed
reasoning-content-expanded
think-action-collapsed
think-action-expanded

mcp-page

mcp-custom-server-2-url-filled
mcp-custom-server-3-all-filled
mcp-custom-server-4-installed
mcp-empty-installed
mcp-slack-install-1-marketplace
mcp-slack-install-3-filled
mcp-slack-install-4-installed

onboarding

onboarding-step-1-choose-agent
onboarding-step-2-setup-llm
onboarding-step-3-say-hello

projects-workspace-browser

projects-workspace-browser

settings-page

add-backend-modal
settings-app-page

settings-secrets

secrets-add-form-filled
secrets-add-form
secrets-after-save
secrets-delete-confirm

settings-verification

verification-settings-critic-enabled
verification-settings-off
verification-settings-on

sidebar

sidebar-collapsed
sidebar-conversation-panel
sidebar-filter-menu

skills-page

skills-type-filter

Generated by the Snapshot Tests workflow. This comment was created by an AI agent (OpenHands) on behalf of the repo maintainers.

github-actions · 2026-06-13T01:57:50Z

❌ Mock-LLM E2E Tests

54/60 passed · 2 failed · 4 skipped · 🆕 7 new

Commit: 4f4181a1 · Workflow run · Test artifacts

🟢 7 new tests added in this PR

✅ mock-llm-drawer-and-empty-states.spec.ts › browser chrome bar shows URL placeholder in empty state

❌ mock-llm-drawer-and-empty-states.spec.ts › terminal tab shows empty state message

⏭️ mock-llm-drawer-and-empty-states.spec.ts › tab switching between browser, terminal, and files tabs

⏭️ mock-llm-drawer-and-empty-states.spec.ts › VS Code drawer link is visible in the tab bar

❌ mock-llm-tool-visualizers.spec.ts › bash tool visualizer renders command and output

⏭️ mock-llm-tool-visualizers.spec.ts › file editor visualizer renders file path and diff content

⏭️ mock-llm-tool-visualizers.spec.ts › agent reply renders after tool call events

Status	Test	Duration
✅	mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 1: configure ACP agent via Settings → Agent UI	13.6s
✅	mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 2: reload and verify ACP settings are persisted in UI	5.5s
✅	mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 3: start ACP conversation and verify agent reply	6.3s
✅	mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 4: resume ACP conversation from sidebar after navigating away	5.7s
✅	mock-llm-auth-modes.spec.ts › auth mode: fresh install with runtime-injected key › reaches the onboarding modal without pre-seeded localStorage	1.3s
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	5.3s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.2s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › skips auth screen for returning user with valid stored key	706ms
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.4s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	7.1s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	31.4s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	6.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	6.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	6.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	6.5s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	5.9s
✅	mock-llm-cross-connect.spec.ts › cross-connect: frontend-only → backend-only › frontend-only connects to a separate backend-only instance	15.8s
✅	mock-llm-cross-connect.spec.ts › cross-connect: frontend-only → multiple backends › connects to two separate backends and switches between them	19.6s
✅	mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › browser chrome bar shows URL placeholder in empty state	6.3s
❌	mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › terminal tab shows empty state message	21.0s
⏭️	mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › tab switching between browser, terminal, and files tabs	0ms
⏭️	mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › VS Code drawer link is visible in the tab bar	0ms
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 1: ensure mock LLM profile is configured	213ms
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 2: start conversation and attach workspace metadata	11.7s
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 3: git control bar shows workspace pill and git actions	25.3s
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 4: files tab defaults to diff view for attached workspace	5.9s
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 5: browser tab shows empty state	6.3s
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 6: files tab defaults to file-tree view without attached workspace	7.4s
✅	mock-llm-folder-workspace.spec.ts › mock-LLM folder browser → workspace → conversation › step 1: browse to a folder, add it as a workspace, and launch a conversation with the correct working_dir	7.7s
✅	mock-llm-image-upload.spec.ts › mock-LLM image upload › attaching an image embeds it as base64 in the LLM completion call	13.4s
✅	mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 1: GitHub card is visible on the MCP marketplace page	5.5s
✅	mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 2: clicking GitHub card opens the install modal with correct fields	5.7s
✅	mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 3: full install flow — fill PAT, submit, verify installed	12.6s
✅	mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 4: installed GitHub server can be deleted	5.8s
✅	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory	13.0s
✅	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 2: start conversation, switch profile via /model, verify switch	7.7s
✅	mock-llm-onboarding-happy-path.spec.ts › onboarding happy path › completes the full onboarding flow and launches a conversation	3.8s
✅	mock-llm-onboarding-regressions.spec.ts › onboarding recent regressions › keeps the modal open on backdrop click and Escape	1.3s
✅	mock-llm-onboarding-regressions.spec.ts › onboarding recent regressions › defaults the LLM setup step to OpenAI GPT-5.5	1.5s
✅	mock-llm-partial-stack.spec.ts › partial stack: --frontend-only › serves the frontend but returns 503 for backend routes	7.5s
✅	mock-llm-partial-stack.spec.ts › partial stack: --backend-only › serves backend APIs but returns 503 for the frontend root	13.1s
✅	mock-llm-partial-stack.spec.ts › partial stack: port conflict › fails with a clear error when the ingress port is occupied	100ms
✅	mock-llm-partial-stack.spec.ts › partial stack: port conflict › starts successfully on a free port after a conflict	6.0s
✅	mock-llm-preset-automation.spec.ts › preset automation → slash command conversation › automation card sends the correct slash command to a conversation	16.3s
✅	mock-llm-preset-automation.spec.ts › preset automation → slash command conversation › direct slash command from home page triggers skill activation	13.4s
✅	mock-llm-profile-management.spec.ts › active profile deletion + reconciliation › active profile is deletable and reconciliation activates another profile	8.4s
✅	mock-llm-profile-management.spec.ts › same-model profile identity › chat header shows the correct profile when two profiles share the same model	15.1s
✅	mock-llm-skills.spec.ts › skill loading: project, user, and deletion › project skill in workspace/.agents/skills/ triggers on matching keyword	13.6s
✅	mock-llm-skills.spec.ts › skill loading: project, user, and deletion › user skill in ~/.openhands/skills/ triggers on matching keyword	13.4s
✅	mock-llm-skills.spec.ts › skill loading: project, user, and deletion › deleting a user skill removes it from subsequent conversations	13.3s
❌	mock-llm-tool-visualizers.spec.ts › tool visualizers › bash tool visualizer renders command and output	15.3s
⏭️	mock-llm-tool-visualizers.spec.ts › tool visualizers › file editor visualizer renders file path and diff content	0ms
⏭️	mock-llm-tool-visualizers.spec.ts › tool visualizers › agent reply renders after tool call events	0ms
✅	mock-llm-ui-regressions.spec.ts › UI regressions › scopes standalone styles to the agent-server-ui shell	1.5s
✅	mock-llm-ui-regressions.spec.ts › UI regressions › renders critic results on agent messages and finish actions	1.5s
✅	mock-llm-ui-regressions.spec.ts › UI regressions › loads older events when scrolling up	1.6s
✅	mock-llm-ui-regressions.spec.ts › UI regressions › selected workspace persists after navigating away and returning	1.9s
✅	mock-llm-ui-regressions.spec.ts › UI regressions › cleared sessionStorage yields empty workspace selection	904ms
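The summary line at the top of this comment can be cross-checked against the status column of the table. The sketch below is only an illustration of that arithmetic: the function names are hypothetical and it is not the workflow's actual reporting code.

```typescript
type Status = "passed" | "failed" | "skipped";

// Maps the symbol in the table's first column to a status.
// Anything that is not a pass or fail mark is treated as skipped.
function parseStatus(symbol: string): Status {
  if (symbol === "✅") return "passed";
  if (symbol === "❌") return "failed";
  return "skipped";
}

// Counts how many rows fall into each status.
function tally(symbols: string[]): { passed: number; failed: number; skipped: number } {
  const counts = { passed: 0, failed: 0, skipped: 0 };
  for (const s of symbols) {
    counts[parseStatus(s)] += 1;
  }
  return counts;
}

// Formats the counts in the same shape as the summary line.
function summarize(symbols: string[]): string {
  const c = tally(symbols);
  return `${c.passed}/${symbols.length} passed · ${c.failed} failed · ${c.skipped} skipped`;
}

// Helper to build n copies of a symbol without relying on newer array APIs.
function repeatSymbol(symbol: string, n: number): string[] {
  const out: string[] = [];
  for (let i = 0; i < n; i++) out.push(symbol);
  return out;
}

// Status column of the table above: 54 passes, 2 failures, 4 skips.
const statusColumn = repeatSymbol("✅", 54)
  .concat(repeatSymbol("❌", 2))
  .concat(repeatSymbol("⏭️", 4));

console.log(summarize(statusColumn));
```

The four skipped rows are the tests that follow each failing test in the two new spec files.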

🔍 Failure details (2)

❌ mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › terminal tab shows empty state message

Error: expect(locator).toBeVisible() failed

Locator: getByText(/No terminal output|No output/i).first()
Expected: visible
Timeout: 15000ms
Error: element(s) not found

Call log:
  - Expect "toBeVisible" with timeout 15000ms
  - waiting for getByText(/No terminal output|No output/i).first()

❌ mock-llm-tool-visualizers.spec.ts › tool visualizers › bash tool visualizer renders command and output

Error: expect(locator).toBeVisible() failed

Locator: getByText('echo \'hello world\'').first()
Expected: visible
Timeout: 10000ms
Error: element(s) not found

Call log:
  - Expect "toBeVisible" with timeout 10000ms
  - waiting for getByText('echo \'hello world\'').first()

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

github-actions · 2026-06-13T02:02:57Z

❌ Mock-LLM Docker E2E Test Results

49/60 passed · 2 failed · 9 skipped · 🆕 7 new

Commit: 4f4181a1 · Workflow run · Test artifacts

🟢 7 new tests added in this PR

✅ mock-llm-drawer-and-empty-states.spec.ts › browser chrome bar shows URL placeholder in empty state

❌ mock-llm-drawer-and-empty-states.spec.ts › terminal tab shows empty state message

⏭️ mock-llm-drawer-and-empty-states.spec.ts › tab switching between browser, terminal, and files tabs

⏭️ mock-llm-drawer-and-empty-states.spec.ts › VS Code drawer link is visible in the tab bar

❌ mock-llm-tool-visualizers.spec.ts › bash tool visualizer renders command and output

⏭️ mock-llm-tool-visualizers.spec.ts › file editor visualizer renders file path and diff content

⏭️ mock-llm-tool-visualizers.spec.ts › agent reply renders after tool call events

Status	Test	Duration
✅	mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 1: configure ACP agent via Settings → Agent UI	13.9s
✅	mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 2: reload and verify ACP settings are persisted in UI	5.5s
✅	mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 3: start ACP conversation and verify agent reply	6.7s
✅	mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 4: resume ACP conversation from sidebar after navigating away	5.7s
✅	mock-llm-auth-modes.spec.ts › auth mode: fresh install with runtime-injected key › reaches the onboarding modal without pre-seeded localStorage	1.3s
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	5.3s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.2s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.3s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.7s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › skips auth screen for returning user with valid stored key	753ms
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	7.1s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	32.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	6.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	6.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	6.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	6.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	5.7s
⏭️	mock-llm-cross-connect.spec.ts › cross-connect: frontend-only → backend-only › frontend-only connects to a separate backend-only instance	141ms
⏭️	mock-llm-cross-connect.spec.ts › cross-connect: frontend-only → multiple backends › connects to two separate backends and switches between them	176ms
✅	mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › browser chrome bar shows URL placeholder in empty state (1 retries)	13.2s
❌	mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › terminal tab shows empty state message (1 retries)	42.5s
⏭️	mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › tab switching between browser, terminal, and files tabs (1 retries)	0ms
⏭️	mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › VS Code drawer link is visible in the tab bar (1 retries)	0ms
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 1: ensure mock LLM profile is configured	245ms
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 2: start conversation and attach workspace metadata	11.5s
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 3: git control bar shows workspace pill and git actions	25.4s
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 4: files tab defaults to diff view for attached workspace	5.9s
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 5: browser tab shows empty state	6.4s
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 6: files tab defaults to file-tree view without attached workspace	7.3s
✅	mock-llm-folder-workspace.spec.ts › mock-LLM folder browser → workspace → conversation › step 1: browse to a folder, add it as a workspace, and launch a conversation with the correct working_dir	7.3s
✅	mock-llm-image-upload.spec.ts › mock-LLM image upload › attaching an image embeds it as base64 in the LLM completion call	13.5s
✅	mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 1: GitHub card is visible on the MCP marketplace page	5.5s
✅	mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 2: clicking GitHub card opens the install modal with correct fields	5.7s
✅	mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 3: full install flow — fill PAT, submit, verify installed	12.9s
✅	mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 4: installed GitHub server can be deleted	5.8s
✅	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory	13.2s
✅	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 2: start conversation, switch profile via /model, verify switch	6.6s
✅	mock-llm-onboarding-happy-path.spec.ts › onboarding happy path › completes the full onboarding flow and launches a conversation	3.8s
✅	mock-llm-onboarding-regressions.spec.ts › onboarding recent regressions › keeps the modal open on backdrop click and Escape	1.5s
✅	mock-llm-onboarding-regressions.spec.ts › onboarding recent regressions › defaults the LLM setup step to OpenAI GPT-5.5	1.8s
⏭️	mock-llm-partial-stack.spec.ts › partial stack: --frontend-only › serves the frontend but returns 503 for backend routes	173ms
✅	mock-llm-partial-stack.spec.ts › partial stack: --backend-only › serves backend APIs but returns 503 for the frontend root	24.1s
⏭️	mock-llm-partial-stack.spec.ts › partial stack: port conflict › fails with a clear error when the ingress port is occupied	0ms
⏭️	mock-llm-partial-stack.spec.ts › partial stack: port conflict › starts successfully on a free port after a conflict	1ms
✅	mock-llm-preset-automation.spec.ts › preset automation → slash command conversation › automation card sends the correct slash command to a conversation	16.0s
✅	mock-llm-preset-automation.spec.ts › preset automation → slash command conversation › direct slash command from home page triggers skill activation	13.4s
✅	mock-llm-profile-management.spec.ts › active profile deletion + reconciliation › active profile is deletable and reconciliation activates another profile	8.7s
✅	mock-llm-profile-management.spec.ts › same-model profile identity › chat header shows the correct profile when two profiles share the same model	15.0s
✅	mock-llm-skills.spec.ts › skill loading: project, user, and deletion › project skill in workspace/.agents/skills/ triggers on matching keyword	13.6s
✅	mock-llm-skills.spec.ts › skill loading: project, user, and deletion › user skill in ~/.openhands/skills/ triggers on matching keyword	13.4s
✅	mock-llm-skills.spec.ts › skill loading: project, user, and deletion › deleting a user skill removes it from subsequent conversations	13.3s
❌	mock-llm-tool-visualizers.spec.ts › tool visualizers › bash tool visualizer renders command and output (1 retries)	31.1s
⏭️	mock-llm-tool-visualizers.spec.ts › tool visualizers › file editor visualizer renders file path and diff content (1 retries)	0ms
⏭️	mock-llm-tool-visualizers.spec.ts › tool visualizers › agent reply renders after tool call events (1 retries)	0ms
✅	mock-llm-ui-regressions.spec.ts › UI regressions › scopes standalone styles to the agent-server-ui shell	1.5s
✅	mock-llm-ui-regressions.spec.ts › UI regressions › renders critic results on agent messages and finish actions	1.5s
✅	mock-llm-ui-regressions.spec.ts › UI regressions › loads older events when scrolling up	1.7s
✅	mock-llm-ui-regressions.spec.ts › UI regressions › selected workspace persists after navigating away and returning	2.5s
✅	mock-llm-ui-regressions.spec.ts › UI regressions › cleared sessionStorage yields empty workspace selection	994ms

🔍 Failure details (2)

❌ mock-llm-drawer-and-empty-states.spec.ts › drawer tabs and empty states › terminal tab shows empty state message

Error: expect(locator).toBeVisible() failed

Locator: getByText(/No terminal output|No output/i).first()
Expected: visible
Timeout: 15000ms
Error: element(s) not found

Call log:
  - Expect "toBeVisible" with timeout 15000ms
  - waiting for getByText(/No terminal output|No output/i).first()

❌ mock-llm-tool-visualizers.spec.ts › tool visualizers › bash tool visualizer renders command and output

Error: expect(locator).toBeVisible() failed

Locator: getByText('echo \'hello world\'').first()
Expected: visible
Timeout: 10000ms
Error: element(s) not found

Call log:
  - Expect "toBeVisible" with timeout 10000ms
  - waiting for getByText('echo \'hello world\'').first()

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

all-hands-bot · 2026-06-17T22:06:04Z

✅ Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

all-hands-bot

🟡 Acceptable — but the test suite has a few weak assertions and the PR description is missing the required Evidence section.

The new specs cover useful real-user paths (drawer tabs, empty states, bash + file-editor visualizers), and the mock event payloads are well-structured. I verified the referenced i18n keys (BROWSER$NO_PAGE_LOADED, TERMINAL$NO_OUTPUT, BROWSER$URL_PLACEHOLDER), the targeted data-testids (chat-interface, browser-chrome-bar, browser-chrome-url, conversation-tab-*, drawer-vscode-link, files-tab-diff-toggle), and the visualizer schemas (ExecuteBashAction/TerminalAction handling observation.command + observation.content; FileEditorAction/StrReplaceEditorAction handling observation.path/old_content/new_content). The mock payloads match what those components consume, so the wiring is correct.

However, a handful of the assertions are weaker than they should be — they would silently pass through a regression instead of catching it. The PR is also missing the Evidence section required by the repo's review policy (commands + output proving the real test paths ran in CI, or a link to the agent conversation).

See inline comments for specific fixes.

Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:

Add a .agents/skills/custom-codereview-guide.md file to your branch (or edit it if one already exists) with the /codereview trigger and the context the reviewer is missing (e.g., "Security concerns about X do not apply here because Y"). See the customization docs for the required frontmatter format.
Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
When your PR is merged, the guideline file goes through normal code review by repository maintainers.

Resolve with AI? Install the iterate skill in your agent and run /iterate to automatically drive this PR through CI, review, and QA until it's merge-ready.

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟢 LOW
Test-only change (two new spec files, no production code modified). Worst case: tests are flaky and need follow-up. No user-facing impact.

VERDICT:
❌ Needs rework — add the Evidence section required by the PR template, and tighten the weak assertions flagged inline.

KEY INSIGHT:
The test wiring against real production components is correct, but two assertions (the .or(files) fallback and the toBeTruthy() URL check) are written to never fail by design — they prove the test ran, not that the code is correct.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

all-hands-bot · 2026-06-17T22:07:05Z

+      // No external link should be active when there's no page loaded
+      const openExternal = page.getByTestId("browser-chrome-open-external");
+      await expect(openExternal).toHaveCount(0);
+    });


🟠 Important: Weak assertion — toBeTruthy() on the URL field proves nothing about the empty state. The comment says it should be the i18n placeholder, but the test only checks the field is non-empty. A regression that renders the last-seen URL (or "https://…") in empty state would pass. Assert on the actual placeholder text:

Suggested change

    });
    const text = (await urlField.textContent()) ?? "";
    expect(text).toContain("No URL loaded");

(The BROWSER$URL_PLACEHOLDER i18n value is No URL loaded, verified in src/i18n/translation.json.)
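To make the difference concrete, here is a minimal sketch in plain TypeScript, without Playwright, comparing the two checks. The placeholder string is the one quoted above; the stale URL and the function names are hypothetical and exist only for illustration.

```typescript
// Placeholder text quoted in the review comment.
const PLACEHOLDER = "No URL loaded";

// The weak check: passes for any non-empty string.
function weakCheck(fieldText: string | null): boolean {
  return Boolean(fieldText);
}

// The suggested check: passes only when the placeholder text is present.
function strictCheck(fieldText: string | null): boolean {
  return (fieldText ?? "").indexOf(PLACEHOLDER) !== -1;
}

// Hypothetical regression: the field shows a stale URL in the empty state.
const staleUrl = "https://example.com/last-page";

console.log("placeholder:", weakCheck(PLACEHOLDER), strictCheck(PLACEHOLDER));
console.log("stale URL:  ", weakCheck(staleUrl), strictCheck(staleUrl));
```

Both checks accept the placeholder, but only the strict check rejects the stale URL, which is the regression the test is meant to catch.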

all-hands-bot · 2026-06-17T22:07:06Z

+    });
+  });
+
+  // ── Terminal tab: empty state ──────────────────────────────────────


🟡 Suggestion: Brittle i18n string match. getByText("No page loaded yet", { exact: false }) will pass on the current translation but break for any future copy edit, or silently no-op in any non-English locale the test runner picks up. Add a data-testid to the empty-state message component and assert on that instead — it's the same pattern you already use everywhere else in this file.

all-hands-bot · 2026-06-17T22:07:06Z

+    });
+  });
+
+  // ── Tab switching ──────────────────────────────────────────────────


🟡 Suggestion: Brittle i18n string match (regex). Same issue as the browser empty-state check: getByText(/No terminal output|No output/i) couples the test to the English copy. Add a data-testid to the empty-terminal component and use that.
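The coupling can be sketched without a browser. In the TypeScript below, the rendered empty state is modeled as a plain object; the copy strings and the `terminal-empty-state` test id are assumptions made for illustration, not values taken from the repository. Only the regex comes from the spec.

```typescript
// A tiny stand-in for a rendered element; not a real DOM node.
interface RenderedNode {
  testId: string;
  text: string;
}

// Hypothetical renderings of the same empty-state component.
// The test id "terminal-empty-state" is an assumed name.
const variants: { label: string; node: RenderedNode }[] = [
  { label: "current English", node: { testId: "terminal-empty-state", text: "No terminal output" } },
  { label: "edited English", node: { testId: "terminal-empty-state", text: "Nothing to show yet" } },
  { label: "French", node: { testId: "terminal-empty-state", text: "Aucune sortie" } },
];

// Text-based lookup, using the same regex as the spec.
function foundByText(node: RenderedNode): boolean {
  return /No terminal output|No output/i.test(node.text);
}

// Test-id lookup, independent of the displayed copy.
function foundByTestId(node: RenderedNode): boolean {
  return node.testId === "terminal-empty-state";
}

for (const v of variants) {
  console.log(v.label, "| by text:", foundByText(v.node), "| by test id:", foundByTestId(v.node));
}
```

In this model the text lookup finds only the first variant, while the test-id lookup finds all three, so a copy edit or a locale change would not break the test.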

all-hands-bot · 2026-06-17T22:07:06Z

+          page.locator('[class*="file"]').first(),
+        ),
+      ).toBeVisible({ timeout: 10_000 });
+    });


🔴 Critical: .or() fallback makes the assertion meaningless. page.getByTestId("files-tab-diff-toggle").or(page.locator('[class*="file"]').first()) passes if any element in the DOM has the substring file in its class — the body container, a dropdown, an unrelated file-tree node. If the files tab is completely broken, this test will still pass. Replace with a real assertion on the files tab's own testid (e.g. assert a container that lives inside the files tab panel is visible):

Suggested change

    });
    await expect(
      page.getByTestId("files-tab-diff-toggle"),
    ).toBeVisible({ timeout: 10_000 });

If files-tab-diff-toggle doesn't always exist, that's a separate bug to fix in the component, not a reason to weaken the test.
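As a rough illustration of how broad the fallback is, the sketch below models the CSS attribute selector `[class*="file"]`, which matches whenever the class attribute contains the substring anywhere. The class names are hypothetical examples, not classes taken from the app.

```typescript
// Models [class*="file"]: true when the class attribute contains the needle.
function matchesClassSubstring(classAttr: string, needle: string): boolean {
  return classAttr.indexOf(needle) !== -1;
}

// Hypothetical classes present on a page whose files tab failed to render.
const classesOnPage: string[] = [
  "profile-menu",       // contains "file" inside "profile"
  "file-upload-button", // related to files, but not to the files tab
  "sidebar",
];

// The fallback locator succeeds if at least one element matches.
const fallbackMatches = classesOnPage.some((c) => matchesClassSubstring(c, "file"));

console.log("fallback finds an element:", fallbackMatches);
```

Because `.or()` is satisfied by either side, a single unrelated match like `profile-menu` would be enough for the original assertion to pass.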

all-hands-bot · 2026-06-17T22:07:06Z

+      await expect(page.getByTestId("browser-chrome-bar")).not.toBeVisible();
+    });
+  });
+


🟡 Suggestion: waitForTimeout(500) is a magic number. The comment says "wait for tab switch" but a fixed sleep is exactly what Playwright's auto-waiting was designed to avoid. Either drop the timeout (the next assertion already has its own timeout) or wait for a specific selector that only appears in the terminal tab.
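The trade-off can be shown with a simplified timing model. This is not Playwright's actual retry schedule: the 100 ms re-check interval is an assumption, while the 500 ms sleep and the 10 000 ms timeout are the values used in the spec.

```typescript
interface WaitOutcome {
  passed: boolean;  // did the check see the element?
  waitedMs: number; // how long the test spent waiting
}

// Fixed sleep: always waits sleepMs, then checks once.
function fixedSleep(readyAtMs: number, sleepMs: number): WaitOutcome {
  return { passed: readyAtMs <= sleepMs, waitedMs: sleepMs };
}

// Retrying check: re-checks every intervalMs until ready or timeoutMs elapses.
function retryingCheck(readyAtMs: number, timeoutMs: number, intervalMs: number): WaitOutcome {
  for (let t = 0; t <= timeoutMs; t += intervalMs) {
    if (t >= readyAtMs) return { passed: true, waitedMs: t };
  }
  return { passed: false, waitedMs: timeoutMs };
}

// Fast machine: the tab is ready after 50 ms.
console.log(fixedSleep(50, 500), retryingCheck(50, 10000, 100));
// Slow CI runner: the tab is ready after 800 ms.
console.log(fixedSleep(800, 500), retryingCheck(800, 10000, 100));
```

In this model the fixed sleep wastes time on the fast machine and fails on the slow one, whereas the retrying check passes in both cases and waits only as long as needed.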

all-hands-bot · 2026-06-17T22:07:06Z

+    });
+  });
+
+  // ── VS Code drawer link ────────────────────────────────────────────


🟡 Suggestion: Negative assertion is too weak. "Verify the browser chrome bar is not visible" doesn't actually prove the terminal tab is the active one — the chrome bar could simply be off-screen, or the panel could be in a transient state during the click. A positive assertion (e.g. that the terminal empty-state testid from line 223 is visible) is much stronger evidence the tab switch succeeded.
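One way to see the gap is to enumerate which drawer states each assertion accepts. The state model below is hypothetical (in particular the `transitioning` state and the visibility rules are assumptions), but it captures the argument made above.

```typescript
// Hypothetical drawer states; "transitioning" stands for a mid-click state.
type DrawerState = "browser" | "terminal" | "files" | "transitioning";

// In this model the chrome bar is shown only on the browser tab.
function chromeBarVisible(state: DrawerState): boolean {
  return state === "browser";
}

// In this model the terminal empty state is shown only on the terminal tab.
function terminalEmptyStateVisible(state: DrawerState): boolean {
  return state === "terminal";
}

const allStates: DrawerState[] = ["browser", "terminal", "files", "transitioning"];

// States accepted by the negative assertion (chrome bar not visible).
const acceptedByNegative = allStates.filter((s) => !chromeBarVisible(s));

// States accepted by the positive assertion (terminal empty state visible).
const acceptedByPositive = allStates.filter((s) => terminalEmptyStateVisible(s));

console.log("negative assertion accepts:", acceptedByNegative);
console.log("positive assertion accepts:", acceptedByPositive);
```

The negative assertion accepts three of the four states, including the wrong tab and the transient state; the positive assertion accepts only the terminal tab.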

vercel Bot deployed to Preview June 10, 2026 09:09 View deployment

malhotra5 added the canvas-review Experimental code review from a shared agent canvas instance label Jun 12, 2026

github-actions Bot added a commit that referenced this pull request Jun 12, 2026

snapshot images for PR #1291 run 27387982325

677070b

Merge branch 'main' into auto/e2e-coverage-2026-06-10

4f4181a

vercel Bot deployed to Preview June 13, 2026 01:48 View deployment

malhotra5 added the review-test experimental cloud code review bot label Jun 17, 2026

all-hands-bot requested changes Jun 17, 2026

