Phase 2: real e2e harness — spawn build/index.js + nock + isolated HOME#37
Adds the infrastructure for spawning the real MCP server binary as a
child process, attaching the SDK's StdioClientTransport, and serving
LeetCode HTTP from a JSON fixture via a nock-activating preload script.
- nock devDep (^14.0.15)
- tests/e2e/harness/preload.mjs activates nock + replays fixture
- tests/e2e/harness/spawn-server.ts mkdtemp HOME + StdioClientTransport
- tests/e2e/harness/global-setup.ts rebuilds build/index.js if stale
- tests/e2e/harness/types.ts shared fixture types (preload + tests)
- vitest.e2e.config.ts dedicated config + globalSetup wiring
- package.json test:e2e uses the new config; the default 'test' script now excludes tests/e2e so unit/integration runs stay fast
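
As a rough illustration of the dedicated config listed above, a vitest config along these lines would restrict the run to the e2e directory and wire the global setup hook. This is a sketch, not the file from the PR: the `*.test.ts` suffix in the include glob is an assumption, while the 30s timeout and the `global-setup.ts` path are taken from the PR description.

```typescript
// vitest.e2e.config.ts (sketch; the actual file contents are not shown in this thread)
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    // Only pick up e2e specs; the "*.test.ts" suffix is an assumption.
    include: ["tests/e2e/**/*.test.ts"],
    // Spawning a real child process is slow, so allow 30 seconds per test.
    testTimeout: 30_000,
    // Rebuilds build/index.js when it is stale, before any spec runs.
    globalSetup: ["tests/e2e/harness/global-setup.ts"],
  },
});
```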
Locks in the wire-level surface area: server name/version are non-empty after the MCP handshake, and the registered tools / prompts / resource templates match the expected set. Any drift in tool names or registry shape now fails CI before clients do.
Spawns a real server with a pre-seeded ~/.leetcode-mcp/credentials.json and a mocked userStatus GraphQL response, then calls check_auth_status over stdio. Fails if the Phase 1 fix regresses (i.e., the on-disk creds are read but never pushed into the in-memory Credential). Also asserts the negative path: with no credentials file, the tool reports authenticated=false.
Spawns the server, mocks the leetcode-query 'question(titleSlug:…)'
GraphQL operation, and asserts get_problem returns the expected
{ titleSlug, problem } envelope with a topicTags-as-string[] projection.
Also rewrites tests/e2e/README.md as the harness's actual user-facing
docs (how it mocks HTTP, how to author a fixture, why the default 'test'
script excludes tests/e2e), and removes the Phase 0 placeholder spec
now that the directory has real specs.
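
The envelope check described in this commit could be expressed as a small type guard, applied once the tool result has been parsed into a plain object. This is a minimal sketch under assumptions: `ProblemEnvelope` and `isProblemEnvelope` are hypothetical names, and only the fields named above (`titleSlug`, `problem`, `topicTags`) are checked.

```typescript
// Hypothetical shape of the get_problem envelope; fields other than
// titleSlug, problem, and topicTags are deliberately left unchecked.
interface ProblemEnvelope {
  titleSlug: string;
  problem: { topicTags: string[] } & Record<string, unknown>;
}

// Returns true only when topicTags is a flat array of strings,
// i.e. the projection has been applied rather than raw tag objects returned.
function isProblemEnvelope(value: unknown): value is ProblemEnvelope {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  if (typeof v.titleSlug !== "string") return false;
  const problem = v.problem;
  if (typeof problem !== "object" || problem === null) return false;
  const tags = (problem as Record<string, unknown>).topicTags;
  return Array.isArray(tags) && tags.every((t) => typeof t === "string");
}
```

A guard like this fails if the server starts returning tag objects instead of strings, which is the kind of wire-level drift the spec is meant to catch.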
Code review

Overall this is a thoughtful, well-documented harness. Two concrete bugs, both in the harness itself:

Bug 1: Stale-binary check defeats its own purpose

    const [binStat, srcStat] = await Promise.all([
      stat("build/index.js"),
      stat("src/index.ts")
    ]);
    return binStat.mtimeMs < srcStat.mtimeMs;

The file's docstring says "ensures …

Bug 2: …
Comparing build/index.js mtime against only src/index.ts let edits to any other module slip past the freshness check, exercising a stale binary against the new specs — the very class of bug the hook was supposed to prevent. Now recursively walks src/, takes the max mtime of every .ts file, and rebuilds when the binary lags behind any of them. Verified manually: `touch src/utils/logger.ts && npm run test:e2e` triggers a full rebuild (~3s) and bumps the binary mtime above the touched source file.
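
The recursive freshness check described here can be sketched as follows. This is not the code from the commit: `newestTsMtimeMs` and `isStale` are hypothetical names, and the synchronous `fs` calls are chosen only to keep the example short.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Newest mtime (in ms) of any .ts file under `dir`, recursing into subdirectories.
// Returns 0 when the tree contains no .ts files.
function newestTsMtimeMs(dir: string): number {
  let newest = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      newest = Math.max(newest, newestTsMtimeMs(full));
    } else if (entry.isFile() && entry.name.endsWith(".ts")) {
      newest = Math.max(newest, fs.statSync(full).mtimeMs);
    }
  }
  return newest;
}

// True when the binary is missing or older than the newest source file.
function isStale(binPath: string, srcDir: string): boolean {
  const bin = fs.statSync(binPath, { throwIfNoEntry: false });
  return bin === undefined || bin.mtimeMs < newestTsMtimeMs(srcDir);
}
```

A global setup hook would call something like `isStale("build/index.js", "src")` and run the build only when it returns true, so touching any module under `src/` forces a rebuild.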
…er HOME
Two fixes in spawn-server.ts surfaced by review:
1. `new URL(`file://${path}`)` does not percent-encode path segments,
so `--import file:///Users/Some Name/.../preload.mjs` would fail when
the harness lived under a path containing spaces or other URL-reserved
characters (common on macOS user dirs). Switched to
`pathToFileURL(PRELOAD).href` which is the documented primitive.
2. When a caller supplied `options.home`, the harness still wrote
`fixture.json` into that directory but cleanup only removed homes
the harness created. Specs that pre-seed credentials silently got an
extra `fixture.json` byproduct. Now the fixture lives in its own
`mkdtemp` directory regardless of who owns HOME, and the harness
always cleans it up.
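
Both fixes can be illustrated with a small helper that builds the child's environment. This is a sketch under assumptions, not the harness code: `prepareChildEnv`, `ChildEnvOptions`, and the `tmpRoot` parameter are hypothetical, while `E2E_FIXTURE_PATH` and the `--import` preload come from the PR description.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";
import { pathToFileURL } from "node:url";

interface ChildEnvOptions {
  preloadPath: string; // absolute path to the preload script
  fixture: unknown;    // JSON-serializable fixture contents
  home?: string;       // caller-owned HOME; never written to or removed here
  tmpRoot: string;     // parent for temp dirs, normally os.tmpdir()
}

// Hypothetical helper: returns the env for the child plus a cleanup callback.
function prepareChildEnv(opts: ChildEnvOptions) {
  const ownsHome = opts.home === undefined;
  const home = opts.home ?? fs.mkdtempSync(path.join(opts.tmpRoot, "e2e-home-"));

  // The fixture always gets its own directory, so a caller-owned HOME stays untouched.
  const fixtureDir = fs.mkdtempSync(path.join(opts.tmpRoot, "e2e-fixture-"));
  const fixturePath = path.join(fixtureDir, "fixture.json");
  fs.writeFileSync(fixturePath, JSON.stringify(opts.fixture));

  const env = {
    HOME: home,
    E2E_FIXTURE_PATH: fixturePath,
    // pathToFileURL percent-encodes characters such as "#", "?" and spaces.
    NODE_OPTIONS: `--import ${pathToFileURL(opts.preloadPath).href}`,
  };

  const cleanup = (): void => {
    fs.rmSync(fixtureDir, { recursive: true, force: true });
    if (ownsHome) fs.rmSync(home, { recursive: true, force: true });
  };
  return { env, cleanup };
}
```

For contrast, `new URL("file:///tmp/dir#1/preload.mjs").pathname` is `/tmp/dir`, because the URL parser treats `#` as the start of a fragment; `pathToFileURL` encodes it as `%23` instead.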
Both bugs and the smaller drawback are real; fixed in bb37ace (stale-binary) and 1f5eafd (preload URL + fixture isolation). CI green; e2e 7/7.
- Bug 1: stale-binary check (…
- Bug 2: …
- Smaller drawback: fixture leaks into caller-owned HOME (…
f66913d into devin/1778098342-phase-0-hygiene
Summary
Phase 2 of the redesign plan: replace the Phase 0 placeholder spec with a real end-to-end suite that exercises the MCP server as a black box. Each spec spawns the built binary (`build/index.js`) as a child process over stdio, attaches the SDK's `StdioClientTransport`, and drives the server exactly like a real client would. Stacked on top of #36; base auto-rebases to `main` once #36 reaches main.

Harness pieces (one commit):
- `tests/e2e/harness/preload.mjs`: runs inside the spawned child via `NODE_OPTIONS=--import …` before any user code. Activates `nock` with `disableNetConnect()` so the child can never accidentally reach the real `leetcode.com`, then reads a JSON fixture from `process.env.E2E_FIXTURE_PATH` and installs interceptors that replay canned GraphQL / REST responses.
- `tests/e2e/harness/spawn-server.ts`: `spawnServer({ fixture?, home?, env? })`. mkdtemps a fresh `HOME` per call (so `~/.leetcode-mcp/credentials.json` is per-test), writes the fixture to a temp file, spawns `node build/index.js` with the preload + isolated env, and returns a connected MCP `Client`. Specs that need to pre-seed credentials can pass their own `home`.
- `tests/e2e/harness/global-setup.ts`: vitest `globalSetup` hook that rebuilds `build/index.js` if `src/` is newer. Without this, editing source and running `test:e2e` would silently exercise a stale binary.
- `tests/e2e/harness/types.ts`: shared `E2EFixture` type, dependency-free so both vitest specs and the lightweight preload script can import it.
- `vitest.e2e.config.ts`: dedicated config (only includes `tests/e2e/**`, 30s test timeout, wires `globalSetup`).
- `package.json`: `test:e2e` uses the new config; default `test` script now excludes `tests/e2e/**` so unit/integration runs stay fast (~13s vs ~17s with e2e). `test:all` chains `test` + `test:e2e`.
- `nock@^14.0.15` (devDep).

Specs (three commits, 7 tests total):
- `lifecycle.test.ts` (4 tests): server name/version are non-empty after handshake; the registered tools / prompts / resource templates match the expected set. Locks the wire-level surface area: any drift in tool names or registry shape now fails CI before clients do.
- `auth-restore.test.ts` (2 tests): silent-logout-on-restart regression. Spawns a real server with a pre-seeded `~/.leetcode-mcp/credentials.json` and a mocked `userStatus` GraphQL response, then calls `check_auth_status` over stdio and asserts `authenticated: true, username: "alice"`. Also asserts the negative path (no credentials file → `authenticated: false`). If the Phase 1 fix regresses, this fails.
- `problem-flow.test.ts` (1 test): happy path. Mocks the `leetcode-query` `question(titleSlug:…)` GraphQL operation, calls `get_problem`, asserts the wire-level envelope shape (`{ titleSlug, problem }` with `topicTags` projected to `string[]`).

Drops: the Phase 0 `placeholder.test.ts` (no longer needed now that the directory has real specs).

Tests: unit/integration 152/152 (was 153 with placeholder); e2e 7/7. `npm run build` clean; `npm run test:types` clean; `npm run format` clean.

Why this design vs. the alternatives:
- `nock` runs inside the child via the preload script rather than the parent test process: the parent can't intercept HTTP from a separately-spawned process, so this is the only way to exercise the real binary while still serving canned responses.
- `globalSetup` runs `npm run build` (only when `src/` is newer) rather than relying on the developer to remember. ~1s up front beats a silent stale-binary class of bug.
- `vitest.e2e.config.ts` keeps the slow spawn-the-binary specs out of the default `test` run.

Review & Testing Checklist for Human
- `npm run test:e2e` passes locally (it builds the binary if needed, then spawns three child specs)
- `tests/e2e/harness/preload.mjs`, particularly the `times === undefined → scope.persist()` branch, since nock 14's `.persist()` lives on the Scope (not the Interceptor) and that's a subtle one to get right
- `--import` ESM preload path works on your platform (tested on Linux Node 22; macOS / Windows untested in CI)

Notes
- …`main`, the base will auto-rebase and this PR's diff will shrink to just the four Phase 2 commits.
- …`submit_solution`, `save_leetcode_credentials`, resource reads, etc.; adding a spec is now ~10 lines of fixture + test.
- …`@hono/node-server`/`hono` chain) per `npm audit`; not blocking and unrelated to runtime code.
- …`cwd/${slug}.${ext}`, `instructions` field with `get_started` kept as deprecated stub. Each is reversible with a single env var or follow-up PR.

Link to Devin session: https://app.devin.ai/sessions/d003a60939484686b2953ae32fe2794d
Requested by: @SPerekrestova