WIP Test quality audit reports using Dave Farley's framework by aivong-openhands · Pull Request #1335 · OpenHands/agent-canvas

aivong-openhands · 2026-06-12T18:30:35Z

HUMAN:

A human has tested these changes.

AGENT:

This is an informational/WIP PR (not intended to merge as-is). It mirrors
OpenHands/integrations-hub#108 but for this repo: a point-in-time test-design
audit of the @openhands/agent-canvas test suite, produced with the
test-design-reviewer skill, which scores tests against Dave Farley's
8 Properties of Good Tests.

It adds two Markdown documents only, with no product code or test code changes:

- TEST_QUALITY_REPORT.md: executive summary with the aggregate Farley score.
- TEST_QUALITY_PER_FILE_REPORT.md: per-category scores, detailed audits of notable files, and a measured lines / tests / duration appendix for all 405 __tests__/ files.

Evidence (commands run):

npm ci
npm run test:coverage   # vitest run --coverage

Result: Test Files 413 passed | 1 skipped (414), Tests 3144 passed | 5 skipped | 9 todo.
Coverage: Statements 77.56%, Branches 67.82%, Functions 75.23%, Lines 78.59%.
The per-file durations, test counts, and coverage figures in the reports are
extracted directly from this run and from wc -l of each spec — no fabricated
numbers.
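As a quick consistency check on the figures above, the summary counts can be parsed and summed. This is a minimal TypeScript sketch and not part of this PR: `parseSummaryCounts` and `totalCount` are hypothetical helpers, and the input strings are copied by hand from the Result line rather than read from Vitest's reporter output.

```typescript
// Hypothetical helper: parses a summary fragment such as
// "3144 passed | 5 skipped | 9 todo" into a map of status -> count.
// Trailing text after the status word (for example "(414)") is ignored.
function parseSummaryCounts(summary: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const part of summary.split("|")) {
    const match = part.trim().match(/^(\d+)\s+(\w+)/);
    const count = match?.[1];
    const status = match?.[2];
    if (count !== undefined && status !== undefined) {
      counts[status] = Number(count);
    }
  }
  return counts;
}

// Hypothetical helper: sums every status count in the map.
function totalCount(counts: Record<string, number>): number {
  return Object.values(counts).reduce((sum, n) => sum + n, 0);
}

// Strings copied from the Result line in this description.
const fileCounts = parseSummaryCounts("413 passed | 1 skipped (414)");
const testCounts = parseSummaryCounts("3144 passed | 5 skipped | 9 todo");

console.log(totalCount(fileCounts)); // 414, matching the "(414)" total
console.log(totalCount(testCounts)); // 3158
```

The file total matching the parenthesised 414 is only a sanity check on the transcription; it says nothing about test quality.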

Why

We wanted a structured, point-in-time read on test-suite health (beyond raw
coverage %) to spotlight where tests serve as good living documentation and where
they could be faster / less brittle. This reproduces integrations-hub#108's audit
format for agent-canvas.

Summary

Add TEST_QUALITY_REPORT.md (executive summary; overall Farley score 7.9/10 — Excellent).
Add TEST_QUALITY_PER_FILE_REPORT.md (category scores + detailed file audits + measured-metrics appendix for all 405 __tests__ files).

Issue Number

N/A

How to Test

Docs-only change. To regenerate the underlying numbers:

npm ci
npm run test:coverage

Then read TEST_QUALITY_REPORT.md and TEST_QUALITY_PER_FILE_REPORT.md at the
repo root. There is no runtime behavior to exercise.

Video/Screenshots

N/A — documentation only (no UI change).

Type

Notes

Informational/WIP, like the reference PR; not necessarily meant to merge.
The generated coverage/ directory is intentionally not committed.
Per-file Farley sub-scores are only assigned to files that were read closely;
the remaining files are presented with measured metrics (no invented scores).

This PR description was created by an AI agent (OpenHands) on behalf of the user.


Add a point-in-time test-design audit of the agent-canvas test suite using Dave Farley's 8 Properties of Good Tests, via the test-design-reviewer skill.

- TEST_QUALITY_REPORT.md: executive summary with aggregate Farley score (7.9/10)
- TEST_QUALITY_PER_FILE_REPORT.md: category scores, detailed audits of notable files, and a measured lines/tests/duration appendix for all 405 __tests__ files

Informational only; no product or test code is changed.

Co-authored-by: openhands <openhands@all-hands.dev>

vercel · 2026-06-12T18:30:42Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agent-canvas	Ready	Preview, Comment	Jun 12, 2026 6:31pm

github-actions · 2026-06-12T18:34:53Z

⚠️ Mock-LLM Docker E2E Test Results

0/0 passed

Commit: 92e73e5d · Workflow run

Status	Test	Duration

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

github-actions · 2026-06-12T18:40:15Z

✅ Mock-LLM E2E Tests

53/53 passed

Commit: 92e73e5d · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 1: configure ACP agent via Settings → Agent UI	13.9s
✅	mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 2: reload and verify ACP settings are persisted in UI	5.6s
✅	mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 3: start ACP conversation and verify agent reply	6.7s
✅	mock-llm-acp-agent.spec.ts › mock-LLM ACP agent conversation › step 4: resume ACP conversation from sidebar after navigating away	5.8s
✅	mock-llm-auth-modes.spec.ts › auth mode: fresh install with runtime-injected key › reaches the onboarding modal without pre-seeded localStorage	1.3s
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	5.3s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.2s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.7s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › skips auth screen for returning user with valid stored key	803ms
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	7.1s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	29.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	6.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	6.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	6.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	6.6s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	5.9s
✅	mock-llm-cross-connect.spec.ts › cross-connect: frontend-only → backend-only › frontend-only connects to a separate backend-only instance	15.8s
✅	mock-llm-cross-connect.spec.ts › cross-connect: frontend-only → multiple backends › connects to two separate backends and switches between them	21.6s
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 1: ensure mock LLM profile is configured	170ms
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 2: start conversation and attach workspace metadata	11.7s
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 3: git control bar shows workspace pill and git actions	25.3s
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 4: files tab defaults to diff view for attached workspace	5.9s
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 5: browser tab shows empty state	6.4s
✅	mock-llm-files-and-git.spec.ts › files tab, git control bar, and browser tab › step 6: files tab defaults to file-tree view without attached workspace	7.5s
✅	mock-llm-folder-workspace.spec.ts › mock-LLM folder browser → workspace → conversation › step 1: browse to a folder, add it as a workspace, and launch a conversation with the correct working_dir	7.8s
✅	mock-llm-image-upload.spec.ts › mock-LLM image upload › attaching an image embeds it as base64 in the LLM completion call	13.5s
✅	mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 1: GitHub card is visible on the MCP marketplace page	5.5s
✅	mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 2: clicking GitHub card opens the install modal with correct fields	5.8s
✅	mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 3: full install flow — fill PAT, submit, verify installed	12.9s
✅	mock-llm-mcp-github.spec.ts › MCP GitHub server install flow › step 4: installed GitHub server can be deleted	5.8s
✅	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory	13.1s
✅	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 2: start conversation, switch profile via /model, verify switch	7.0s
✅	mock-llm-onboarding-happy-path.spec.ts › onboarding happy path › completes the full onboarding flow and launches a conversation	3.7s
✅	mock-llm-onboarding-regressions.spec.ts › onboarding recent regressions › keeps the modal open on backdrop click and Escape	1.4s
✅	mock-llm-onboarding-regressions.spec.ts › onboarding recent regressions › defaults the LLM setup step to OpenAI GPT-5.5	1.6s
✅	mock-llm-partial-stack.spec.ts › partial stack: --frontend-only › serves the frontend but returns 503 for backend routes	7.4s
✅	mock-llm-partial-stack.spec.ts › partial stack: --backend-only › serves backend APIs but returns 503 for the frontend root	14.1s
✅	mock-llm-partial-stack.spec.ts › partial stack: port conflict › fails with a clear error when the ingress port is occupied	113ms
✅	mock-llm-partial-stack.spec.ts › partial stack: port conflict › starts successfully on a free port after a conflict	6.0s
✅	mock-llm-preset-automation.spec.ts › preset automation → slash command conversation › automation card sends the correct slash command to a conversation	16.0s
✅	mock-llm-preset-automation.spec.ts › preset automation → slash command conversation › direct slash command from home page triggers skill activation	13.6s
✅	mock-llm-profile-management.spec.ts › active profile deletion + reconciliation › active profile is deletable and reconciliation activates another profile	8.6s
✅	mock-llm-profile-management.spec.ts › same-model profile identity › chat header shows the correct profile when two profiles share the same model	15.3s
✅	mock-llm-skills.spec.ts › skill loading: project, user, and deletion › project skill in workspace/.agents/skills/ triggers on matching keyword	13.6s
✅	mock-llm-skills.spec.ts › skill loading: project, user, and deletion › user skill in ~/.openhands/skills/ triggers on matching keyword	13.5s
✅	mock-llm-skills.spec.ts › skill loading: project, user, and deletion › deleting a user skill removes it from subsequent conversations	13.4s
✅	mock-llm-ui-regressions.spec.ts › UI regressions › scopes standalone styles to the agent-server-ui shell	1.4s
✅	mock-llm-ui-regressions.spec.ts › UI regressions › renders critic results on agent messages and finish actions	1.5s
✅	mock-llm-ui-regressions.spec.ts › UI regressions › loads older events when scrolling up	1.7s
✅	mock-llm-ui-regressions.spec.ts › UI regressions › selected workspace persists after navigating away and returning	2.1s
✅	mock-llm-ui-regressions.spec.ts › UI regressions › cleared sessionStorage yields empty workspace selection	989ms

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)
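The Duration column mixes seconds and milliseconds, so ranking rows by speed needs a common unit first. The sketch below is illustrative only: `durationToMs` is a hypothetical helper, and the sample rows are copied by hand from the table above rather than read from the workflow's artifacts.

```typescript
// Hypothetical helper: converts a duration label such as "13.9s" or "803ms"
// into milliseconds. Returns NaN for labels it does not recognise.
function durationToMs(label: string): number {
  const match = label.trim().match(/^(\d+(?:\.\d+)?)(ms|s)$/);
  const value = match?.[1];
  const unit = match?.[2];
  if (value === undefined || unit === undefined) {
    return Number.NaN;
  }
  const n = Number(value);
  return unit === "ms" ? n : Math.round(n * 1000);
}

// A few [test title, duration] rows copied from the results table.
const sampleRows: Array<[string, string]> = [
  ["step 2: create automation and dispatch run via the UI", "29.5s"],
  ["step 3: git control bar shows workspace pill and git actions", "25.3s"],
  ["skips auth screen for returning user with valid stored key", "803ms"],
  ["fails with a clear error when the ingress port is occupied", "113ms"],
];

// Sort a copy of the rows from slowest to fastest.
const slowestFirst = [...sampleRows].sort(
  (a, b) => durationToMs(b[1]) - durationToMs(a[1]),
);

const slowestRow = slowestFirst[0];
if (slowestRow !== undefined) {
  console.log(`${slowestRow[0]}: ${durationToMs(slowestRow[1])} ms`);
}
```

Returning NaN for unrecognised labels keeps the helper from silently treating a malformed cell as zero; a caller would need to filter those rows before sorting.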

vercel Bot deployed to Preview June 12, 2026 18:31

aivong-openhands wants to merge 1 commit into main from test-quality-audit-farley
