Skip to content

fix(ws): harden auth error handling#2946

Merged
Bohan-J merged 1 commit into
multica-ai:mainfrom
YOMXXX:codex/ws-auth-error-handling
May 21, 2026
Merged

fix(ws): harden auth error handling#2946
Bohan-J merged 1 commit into
multica-ai:mainfrom
YOMXXX:codex/ws-auth-error-handling

Conversation

@YOMXXX
Copy link
Copy Markdown
Contributor

@YOMXXX YOMXXX commented May 20, 2026

What does this PR do?

Hardens WebSocket auth/error handling on both sides of the connection.

On the server, first-message auth and auth_ack writes now check WriteMessage errors and log write failures with frame/user/workspace context. On the client, malformed WebSocket frames are logged and skipped instead of throwing out of the onmessage callback and preventing later frames from being processed.

Related Issue

Closes #2933
Closes #2934

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Refactor / code improvement (no behavior change)
  • Documentation update
  • Tests (adding or improving test coverage)
  • CI / infrastructure

Changes Made

  • Added guarded JSON parsing in packages/core/api/ws-client.ts; unparseable frames now call logger.warn and return.
  • Added a WS client regression test proving malformed frames do not throw and later valid frames still dispatch.
  • Added writeWSAuthFrame / writeWSAuthErrorAndClose helpers in server/internal/realtime/hub.go.
  • Updated first-message auth error paths and auth_ack to log failed writes instead of discarding errors.
  • Added a backend regression test that captures slog output when auth frame writes fail.

How to Test

  1. Send a malformed frame to WSClient.onmessage; it should log and skip without throwing.
  2. Send a valid frame afterward; the registered handler should still run.
  3. Force an auth-frame write failure in the backend helper; the failure should be logged with frame and workspace context.

Verification

  • pnpm --filter @multica/core exec vitest run api/ws-client.test.ts
  • pnpm --filter @multica/core typecheck
  • pnpm --filter @multica/core test
  • pnpm --filter @multica/core lint (exited 0 with one pre-existing hook warning in packages/core/platform/auth-initializer.tsx)
  • cd server && go test ./internal/realtime -run TestWriteWSAuthFrameLogsWriteErrors
  • cd server && go test ./internal/realtime
  • cd server && go test ./...
  • pnpm typecheck
  • git diff --check

Checklist

  • I have included a thinking path that traces from project context to this change
  • I have run tests locally and they pass
  • I have added or updated tests where applicable
  • If this change affects the UI, I have included before/after screenshots
  • I have updated relevant documentation to reflect my changes
  • If I added a new runtime / coding tool / UI tab, I synced the change to landing copy (apps/web/features/landing/i18n/) and relevant docs (apps/docs/content/docs/)
  • If this PR touches Chinese product copy, I checked it against apps/docs/content/docs/developers/conventions.zh.mdx (terminology, mixed-rule for task / issue / skill)
  • I have considered and documented any risks above
  • I will address all reviewer comments before requesting merge

AI Disclosure

AI tool used: Codex

Prompt / approach: Investigated #2933/#2934 as one WebSocket reliability issue. Traced the server first-message auth write paths and the client onmessage dispatch loop, wrote failing regression tests first, then made the narrowest changes to log failed auth frame writes and skip malformed client frames without disrupting later messages.

Screenshots (optional)

N/A - WebSocket reliability/logging behavior only.

@vercel
Copy link
Copy Markdown

vercel Bot commented May 20, 2026

@YOMXXX is attempting to deploy a commit to the IndexLabs Team on Vercel.

A member of the Team first needs to authorize it.

@Bohan-J
Copy link
Copy Markdown
Collaborator

Bohan-J commented May 21, 2026

Thanks for the thorough fix, @YOMXXX! Reviewed by two agents — both recommended merging this PR as a superset of #2945 (closes #2933 + #2934 together, stronger regression tests, and the auth_ack write-failure handling avoids the server/client split-brain state).

Merging now. A small follow-up will be opened to truncate the raw event.data in the client logger.warn to avoid unbounded log growth on garbage frames.

(MUL-2490)

@Bohan-J Bohan-J merged commit 83e90c9 into multica-ai:main May 21, 2026
4 checks passed
Bohan-J added a commit that referenced this pull request May 21, 2026
The post-#2946 onmessage guard logs the raw event.data alongside the
warning. A malformed or rogue server can stream arbitrarily large
garbage and bloat the renderer / desktop main-process log buffers, so
cap the logged payload to the first 200 chars and append a
"(truncated, N chars total)" suffix when truncation occurs.

MUL-2490

Co-authored-by: multica-agent <github@multica.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Unguarded JSON.parse in WSClient.onmessage can silently swallow auth errors [Bug]: WebSocket auth error frame write silently ignored in hub.go

2 participants