Devin/1778098342 phase 0 hygiene by SPerekrestova · Pull Request #44 · SPerekrestova/interactive-leetcode-mcp

SPerekrestova · 2026-06-01T17:03:17Z

No description provided.

- problem.ts: Problem, SimplifiedProblem, ProblemSummary, ProblemSearchResult, DailyChallenge, CodeSnippet, TopicTag, SimilarQuestion - user.ts: UserStatus, UserProfile, UserContestInfo, UserAllSubmissions, UserRecentSubmissions, UserRecentACSubmissions, UserSubmissionDetail, UserProgressQuestionList, SubmissionRow - solution.ts: SolutionArticleSummary, SolutionArticleList, SolutionArticleDetail - errors.ts: ErrorCode discriminated union, LeetCodeError class, isLeetCodeError type guard - index.ts: re-exports Types describe the shapes returned by LeetcodeServiceInterface methods — the projected envelopes the service emits, not the raw upstream GraphQL payloads. Existing src/types/credentials.ts and src/types/submission.ts unchanged (their shapes already match the interface). No behavior change; no consumers wired up yet.

src/leetcode/schemas.ts exports zod validators for the four LeetCode response shapes the server actually parses today: - SubmitResponseSchema: response from POST /problems/<slug>/submit/ - CheckResponseSchema: response from GET /submissions/detail/<id>/check/ - QuestionIdResponseSchema: response from the questionId GraphQL query - ValidateCredentialsResponseSchema: response from the userStatus GraphQL query Each schema uses .passthrough() so unexpected fields don't fail the parse — only missing required fields do. CheckResponse is intentionally loose on status_msg, code_answer, and expected_answer because LeetCode omits / changes their shape between PENDING and SUCCESS states. z.infer<> types are exported alongside each schema for use in the service impl. No behavior change; schemas not yet consumed.

LeetcodeServiceInterface: - Replace every Promise<any> with concrete typed returns drawn from src/types/{problem,user,solution,submission} - Add SolutionArticlesQueryOptions for fetchQuestionSolutionArticles - Method signatures unchanged in name/arity; only return types tightened LeetCodeGlobalService: - Implement all methods against the new return types - Replace stringly-typed throws with LeetCodeError(ErrorCode.AUTH_REQUIRED, …) on every authenticated path; LeetCodeError(ErrorCode.PROBLEM_NOT_FOUND, …) on missing problems - Validate the four LeetCode response shapes via zod at the wire boundary: submitSolution() now safeParses both SubmitResponse and CheckResponse and throws LeetCodeError(ErrorCode.UPSTREAM_PAYLOAD_INVALID, …) on schema failure with the zod issue list attached as .cause - getQuestionId() validates QuestionIdResponseSchema and surfaces a typed error instead of returning a stringified shape - validateCredentials() validates ValidateCredentialsResponseSchema; on schema failure logs a warning and returns null (preserves the previous null-on-failure contract for the auth path) - Project nullable upstream fields (runtime_percentile, total_correct, etc.) into the SubmissionResult shape via ?? undefined so consumers see number | undefined, not number | null | undefined This commit removes 15+ Promise<any> from the interface and ~20 'as any' casts from the implementation. fetchSolutionArticleDetail() now returns SolutionArticleDetail | null (LeetCode actually returns null for unknown topicIds; previous 'as ... | undefined' was a lie).

…n-memory Fixes the silent-logout-on-restart bug from the assessment: the server loaded ~/.leetcode-mcp/credentials.json from disk, validated them, and told the user they were 'authenticated' — but never actually pushed the cookies into the in-memory Credential the LeetCode client reads from. Result: every authenticated tool call after a restart failed with 'Authentication required' until the user re-pasted their cookies. Three pieces: - src/auth/auth-flow.ts (new): restoreCredentials(service, storage) loads the persisted credentials, calls service.validateCredentials() to confirm they're still good, and on success calls service.updateCredentials() to push them into the running service. Returns a typed RestoreOutcome ('no_credentials' | 'invalid' | 'restored') for logging. Never throws. - src/index.ts: invoke restoreCredentials(leetcodeService) at startup before tools/prompts are registered. Best-effort: any failure leaves the server unauthenticated as before. - src/mcp/tools/auth-tools.ts: in check_auth_status, when validation succeeds, also call leetcodeService.updateCredentials() so the very next authenticated tool call works without forcing a server restart.

- tests/auth/auth-flow.test.ts (new, 5 tests): exercises restoreCredentials with a mocked service+storage across the four outcome paths (no_credentials / load_failed / expired / restored) plus the validate-throws path - tests/mcp/tools/auth-tools.test.ts: extend the existing check_auth_status happy-path test to assert service.updateCredentials() is called (regression guard for the silent-logout bug) - tests/services/{problem,solution}-services.test.ts: tighten access patterns now that return types are concrete (optional chaining where upstream fields can legitimately be missing). solution test asserts toBeNull() for the invalid-topicId case to match fetchSolutionArticleDetail's actual return. Test count: 146 → 151 (5 new auth-flow tests). All 151 pass; test:types passes; build clean; npm audit reports 0.

Drive-by formatting fix from running 'npm run format' during Phase 1. No behavior change.

Address review comments on PR #36: - fetchUserAllSubmissions: drop `| null` from interface lang/status so it matches the impl. No callers ever pass null. - fetchUserProgressQuestionList: tighten impl to require `filters: { offset: number; limit: number; ... }` matching the interface. The single caller (get_user_progress_questions tool) zod-defaults offset/limit, so they're always passed. - fetchProblem: now throws LeetCodeError(PROBLEM_NOT_FOUND) on missing upstream rather than returning a typed-but-actually-null Problem. Removes the contradictory shape where the impl returned null but the signature claimed `Promise<Problem>`. fetchProblemSimplified no longer needs its own null-check. - UserStatus.username and .avatar are now `string | null` (was `string` defaulted to ""). With `isSignedIn: false` and `username: ""` consumers couldn't distinguish 'signed out' from 'signed in, no display name'. No behavior change for happy paths; error paths now use the structured LeetCodeError instead of a stringified misshape.

Addresses review on PR #36: redeclaring 'public readonly cause?: unknown' on LeetCodeError shadowed the native ES2022 Error.cause field via the class-field installer (under useDefineForClassFields). Anything walking the standard chain via `err.cause` saw the right thing only by accident. Use `super(message, { cause })` and drop the field — same external API, but err.cause now refers to the actual native chain so loggers and debuggers that expect ES2022 semantics work consistently.

…ry/catch Addresses review on PR #36: - Extract validate->updateCredentials sequence into a small reusable helper applyValidatedCredentials(service, csrf, session) so both restoreCredentials() and check_auth_status share one implementation. - Drop the try/catch in restoreCredentials that mapped any throw to {status: 'invalid', reason: 'expired'}. The interface contract for validateCredentials is Promise<string | null>; the catch was dead code against any conforming impl. If a future impl violates the contract by throwing, surfacing the error is more useful than silently swallowing it. - Update auth-flow tests: replace the dead 'swallows the error when validate throws' test with one asserting that exceptions propagate; add unit tests for the new applyValidatedCredentials helper. Test count: 153 (was 151). All passing.

Phase 1: types, zod runtime validation, structured errors, auth restore bugfix

Adds the infrastructure for spawning the real MCP server binary as a child process, attaching the SDK's StdioClientTransport, and serving LeetCode HTTP from a JSON fixture via a nock-activating preload script. - nock devDep (^14.0.15) - tests/e2e/harness/preload.mjs activates nock + replays fixture - tests/e2e/harness/spawn-server.ts mkdtemp HOME + StdioClientTransport - tests/e2e/harness/global-setup.ts rebuilds build/index.js if stale - tests/e2e/harness/types.ts shared fixture types (preload + tests) - vitest.e2e.config.ts dedicated config + globalSetup wiring - package.json test:e2e uses the new config; the default 'test' script now excludes tests/e2e so unit/integration runs stay fast

Locks in the wire-level surface area: server name/version are non-empty after the MCP handshake, and the registered tools / prompts / resource templates match the expected set. Any drift in tool names or registry shape now fails CI before clients do.

Spawns a real server with a pre-seeded ~/.leetcode-mcp/credentials.json and a mocked userStatus GraphQL response, then calls check_auth_status over stdio. Fails if the Phase 1 fix regresses (i.e., the on-disk creds are read but never pushed into the in-memory Credential). Also asserts the negative path: with no credentials file, the tool reports authenticated=false.

Spawns the server, mocks the leetcode-query 'question(titleSlug:…)' GraphQL operation, and asserts get_problem returns the expected { titleSlug, problem } envelope with a topicTags-as-string[] projection. Also rewrites tests/e2e/README.md as the harness's actual user-facing docs (how it mocks HTTP, how to author a fixture, why the default 'test' script excludes tests/e2e), and removes the Phase 0 placeholder spec now that the directory has real specs.

Comparing build/index.js mtime against only src/index.ts let edits to any other module slip past the freshness check, exercising a stale binary against the new specs — the very class of bug the hook was supposed to prevent. Now recursively walks src/, takes the max mtime of every .ts file, and rebuilds when the binary lags behind any of them. Verified manually: `touch src/utils/logger.ts && npm run test:e2e` triggers a full rebuild (~3s) and bumps the binary mtime above the touched source file.

…er HOME Two fixes in spawn-server.ts surfaced by review: 1. `new URL(`file://${path}`)` does not percent-encode path segments, so `--import file:///Users/Some Name/.../preload.mjs` would fail when the harness lived under a path containing spaces or other URL-reserved characters (common on macOS user dirs). Switched to `pathToFileURL(PRELOAD).href` which is the documented primitive. 2. When a caller supplied `options.home`, the harness still wrote `fixture.json` into that directory but cleanup only removed homes the harness created. Specs that pre-seed credentials silently got an extra `fixture.json` byproduct. Now the fixture lives in its own `mkdtemp` directory regardless of who owns HOME, and the harness always cleans it up.

…e-harness Phase 2: real e2e harness — spawn build/index.js + nock + isolated HOME

Pure-logic foundation for the server-enforced tutoring contract — no server wiring yet, no behaviour change for existing tools. - src/types/session.ts: SessionState + HintLevel + SessionStatus - src/types/errors.ts: add HINT_LEVEL_TOO_LOW + SESSION_NOT_FOUND - src/domain/hint-state-machine.ts: advanceHint / resetSession / assertSolutionUnlocked (the gate that solution-returning tools will call) - src/domain/pedagogy.ts: generateHint(problem, level, userCode?) projecting problem-derived hint text per level (1..4). userCode parameter is reserved for Phase 5 (workspace awareness) so the contract is stable. - src/domain/session-store.ts: FileSessionStore — one JSON per slug under ~/.leetcode-mcp/sessions, mode 0o600, with malformed-file recovery and slug validation. 23 unit tests cover the level transitions, the gate, the hint projections, and the session round-trip / path-traversal / malformed-file paths.

…ate machine + hint generator Tools should depend on SessionService, not on FileSessionStore / hint-state-machine / pedagogy directly — it's the seam that makes the gate uniform across solution-returning tools and gives the wire layer a single object to wire up. - startOrResume(slug, language?): idempotent — re-running on a slug the user is already mid-way through preserves hint progress. - get(slug): null when no start_problem call. - advance(slug, problem): bumps level + persists + returns generated hint text; throws SESSION_NOT_FOUND when called without start_problem. - reset(slug): zeroes level / attempts / lastLocalRunPassed. - assertSolutionUnlocked(slug): the gate that solution tools call. Pure domain types (no IO) move into FileSessionStore via constructor injection so tests can pass an in-memory or fixture-backed store.

…s with MCP instructions field MCP prompts are opt-in; "agent must remember to call leetcode_learning_mode" is precisely the kind of instruction-following LLMs reliably fail at. The MCP `instructions` field, supported on McpServer's ServerOptions since the SDK shipped MCP 2025-06-18 support, is delivered to clients at handshake regardless of whether the agent ever asks for prompts. The single source of truth for the pedagogy contract now lives there and is read once per session. Kept as a plain exported constant so it's easy to unit-test independently and to evolve as later phases land (workspace awareness, runners, strict-mode submission gating).

…tate / reset_session Four new MCP tools that drive the pedagogy state machine. - start_problem(titleSlug, language?): idempotent — opens (or resumes) a tutoring session. Must be called before request_hint, list_problem_solutions, or get_problem_solution. Re-running on a slug the user is already mid-way through preserves their hint progress. - request_hint(titleSlug): advances the hint level by 1 and returns generated text. Levels: 1 clarification → 2 approach → 3 implementation sketch → 4 solution unlock. - get_session_state(titleSlug): returns the persisted session for a problem, or null if start_problem was never called for it. Useful for restoring context after a restart. - reset_session(titleSlug): clears the session back to level 0 with attempts/lastLocalRunPassed zeroed. Lifecycle status reset to 'started'. Errors flow through a shared errorEnvelope that surfaces the structured `code` field (HINT_LEVEL_TOO_LOW / SESSION_NOT_FOUND / etc) alongside the human message, so clients can dispatch on it. Re-exported so the solution tools can render the same shape when their gate trips.

…nt level Both community-solution tools now reject with HINT_LEVEL_TOO_LOW until the active session for the slug has reached the maximum hint level. - list_problem_solutions: gates on the questionSlug parameter that is already required. - get_problem_solution: adds a required `titleSlug` parameter (the topicId alone doesn't tell us which session to gate on). New parameter, additive: existing clients calling without titleSlug see a clear "required" error. Tool descriptions explicitly call out the GATED status so any agent reading them knows the contract. Errors render through the shared errorEnvelope so a missing or underleveled session surfaces as the structured code, not a free-form text `error: ...`.

…strap - McpServer constructor now receives ServerOptions with `instructions` set to the canonical pedagogy contract — delivered at handshake. - SessionService is constructed once per server lifetime and shared between the new session tools and the gated solution tools. - registerSessionTools and registerSolutionTools take the service via constructor injection, so unit/integration tests can supply a store backed by a per-test temp dir.

- tests/integration/solution-tools-integration.test.ts: rewritten to inject a SessionService over a per-test temp dir. Adds three gate cases (no session, session below unlock, session at unlock) for both list_problem_solutions and get_problem_solution. 8 tests, all pass. - tests/e2e/lifecycle.test.ts: extends the expected-tools list to include start_problem, request_hint, get_session_state, reset_session. Catches drift between the registered tool set and the wire surface. - tests/e2e/pedagogy-gate.test.ts: end-to-end regression for the gate. Drives a real server through start_problem → request_hint × 4 with a mocked GraphQL fixture, asserts list_problem_solutions rejects with HINT_LEVEL_TOO_LOW at every pre-unlock level and only opens up at level 4. Second case: reset_session clamps back to 0 and re-engages the gate. Locks the contract end-to-end so a future refactor that drops the gate fails CI before clients do. 178/178 unit+integration; 11/11 e2e (was 9; +2 gate cases).

…dagogy Phase 3: server-enforced pedagogy state machine + new session/hint tools

…ict-mode gate Merge PR #39 after review fixes and local verification.

SPerekrestova

Reviewed for correctness/security. Three concrete findings worth addressing — none are blockers, but #1 is a real (latent) state-machine bug and #2 is a security hardening gap.

1. recordLocalRun keeps solved across failing runs — src/domain/session-service.ts:164-165

status:
    session.status === "started" ? "attempting" : session.status,

Once a session reaches solved, a subsequent failing run_local_tests keeps status === "solved" while lastLocalRunPassed flips to false — the docstring claims the lifecycle stays "accurate", but a failed run after solving silently keeps the solved label. Today nothing in the codebase sets "solved" (only "started" → "attempting" happens), so this is latent — but the moment something writes "solved", this branch will lie. Suggest: when passed === false, demote from "solved" back to "attempting".

2. sandbox-exec profile escaping is incomplete — src/runner/sandbox.ts:161

(allow file-write* (subpath "${cwdAllowed.replace(/"/g, '\\"')}"))

Only " is escaped; a \ in the path will produce a broken escape or terminate the string early, breaking the profile (and theoretically allowing injection). Today cwdAllowed only ever comes from mkdtemp(join(tmpdir(), "leetcode-mcp-run-")), so the risk is theoretical — but it's a foot-gun if any future caller passes a less-controlled path. Suggest: cwdAllowed.replace(/\\/g, "\\\\").replace(/"/g, '\\"'), and reject paths containing newlines.

3. spawnAndCapture finalizes twice on error then close — src/runner/subprocess-runner.ts:322-350
The error handler resolve(...)s and the close handler later runs finalize() which also resolve(...)s. Promise semantics make the second resolve a no-op, so no observable bug — but Buffer.concat(stdout) + clampOutput rerun on potentially large buffers, and the listeners aren't torn down. Suggest a settled flag (or detach the other listener) so the second path short-circuits.

Other things I checked and they look correct: assertSafeSlug regex, applyValidatedCredentials reference semantics, advanceHint clamping at MAX_HINT_LEVEL, env scrubbing in spawnAndCapture (only PATH/HOME/LANG forwarded), the 1MB output cap belt-and-braces, and the zod passthrough() usage.

Generated by Claude Code

SPerekrestova and others added 26 commits May 6, 2026 23:56

chore: prettier sweep on scripts/sync-version.cjs

a345a83

Drive-by formatting fix from running 'npm run format' during Phase 1. No behavior change.

Merge pull request #36 from SPerekrestova/devin/1778111135-phase-1-types

a7b80ea

Phase 1: types, zod runtime validation, structured errors, auth restore bugfix

Merge pull request #37 from SPerekrestova/devin/1778177573-phase-2-e2…

f66913d

…e-harness Phase 2: real e2e harness — spawn build/index.js + nock + isolated HOME

Merge pull request #38 from SPerekrestova/devin/1778178452-phase-3-pe…

6b5f02b

…dagogy Phase 3: server-enforced pedagogy state machine + new session/hint tools

Phase 4a: Python local runner — run_local_tests + runner_doctor + str…

4a125de

…ict-mode gate Merge PR #39 after review fixes and local verification.

SPerekrestova self-assigned this Jun 1, 2026

SPerekrestova commented Jun 1, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Devin/1778098342 phase 0 hygiene#44

Devin/1778098342 phase 0 hygiene#44
SPerekrestova wants to merge 26 commits into
mainfrom
devin/1778098342-phase-0-hygiene

SPerekrestova commented Jun 1, 2026

Uh oh!

SPerekrestova left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

SPerekrestova commented Jun 1, 2026

Uh oh!

SPerekrestova left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant