fix: SSE streaming reliability — heartbeat race, idle watchdog, fast-path, e2e tests#79
… spiral Go provider models (minimax, qwen) now go through the OpenAI Chat Completions transform path instead of the raw Anthropic endpoint, preventing "Unknown server-tool shorthand" 400 errors from Claude Code's MCP tool format.
Changes:
- Add AnthropicToolsDisabled config flag for models that don't support Anthropic tool format
- Add sanitizeAnthropicBody() to strip tool type fields in raw Anthropic path (also protects Zen Claude models)
- Add 4xx fail-fast in fallback handler: non-retryable errors skip the circuit breaker to avoid opening it for format mismatches
- Add temperature constraint for kimi-k2.7-code (only accepts 1.0)
- Remove Go provider models from IsAnthropicModel(): all Go models now use the Chat Completions transform path
- Comprehensive tests for sanitization, retryable errors, circuit breaker behavior, and temperature constraints
- E2E test script validating all models with tool-format requests
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
qwen3.7-max on OpenCode Go provider doesn't support the OpenAI Chat
Completions format ("oa-compat"). It must stay on the raw Anthropic
endpoint, but will be protected by the body sanitization added in the
previous commit.
Other qwen models (qwen3.5/3.6/3.7-plus) work fine through the
transform path.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ne only
Removes http.Client.Timeout, which was killing streams after 5 min regardless of activity. Server WriteTimeout set to 0. Each upstream read uses a per-Read deadline via http.ResponseController.SetReadDeadline that is renewed on every successful byte. Only an idle gap exceeding stream_timeout_ms (defaults to timeout_ms) treats the connection as stuck and routes to the next fallback model.
Also demotes "client disconnected during stream" logs from Info to Debug; this is normal during Claude Code tool execution, not a failure signal.
Adds StreamTimeoutMs config field to both OpenCodeGo and OpenCodeZen.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…sclassification, fast-path edge case, and add e2e streaming tests
Fixes three bugs in the streaming SSE proxy path:
1. **CRITICAL: ResponseWriter race condition** (messages.go): The heartbeat goroutine (every 3s) and the stream goroutine wrote concurrently to the same http.ResponseWriter with no synchronization. Interleaved writes corrupted SSE frames, causing the Anthropic SDK to report InvalidHTTPResponse. Fixed by adding a sync.Mutex to the responseWriter wrapper and serializing all Write/WriteHeader/Flush calls.
2. **MEDIUM: Idle watchdog produces wrong error type** (stream.go): After switching from SetReadDeadline to context.Cancel-based idle detection, the cancel() call produced context.Canceled on Read(), which was not caught by isIdleTimeoutErr(). Added explicit check: errors.Is(err, context.Canceled) && clientCtx.Err() == nil → ErrStreamIdle. Same fix applied to ProxyStream, ProxyResponsesStream, ProxyGeminiStream, and handleAnthropicStreaming (added clientCtx parameter).
3. **MINOR: SSE fast-path content truncation** (stream.go): Used strings.Index on raw JSON content to find the terminating quote, which stopped at escaped \" inside content. Replaced with a byte walk that skips backslash-escaped characters.
4. Added internal/transformer/idle.go with StartIdleWatchdog for context-based stream idle detection (extracted from stream.go).
5. **e2e tests**: Added streaming SSE verification tests and a long-stream test that exercises the heartbeat path (max_tokens: 500, 120s timeout).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
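The ResponseWriter race in item 1 can be illustrated with a minimal sketch. This is not the PR's actual wrapper: the lockedWriter type and the framesIntact helper are hypothetical names, and the sketch assumes each SSE frame is emitted with a single Write call so that one lock acquisition covers a whole frame.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// lockedWriter is a hypothetical stand-in for the PR's responseWriter wrapper.
// Every Write, WriteHeader, and Flush takes the same mutex.
type lockedWriter struct {
	mu sync.Mutex
	w  http.ResponseWriter
}

func (l *lockedWriter) Header() http.Header { return l.w.Header() }

func (l *lockedWriter) WriteHeader(code int) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.w.WriteHeader(code)
}

func (l *lockedWriter) Write(p []byte) (int, error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.w.Write(p)
}

func (l *lockedWriter) Flush() {
	l.mu.Lock()
	defer l.mu.Unlock()
	if f, ok := l.w.(http.Flusher); ok {
		f.Flush()
	}
}

// framesIntact runs a heartbeat goroutine and a stream goroutine against the
// same wrapped writer, then checks that no frame was split by the other writer.
func framesIntact(n int) bool {
	rec := httptest.NewRecorder()
	lw := &lockedWriter{w: rec}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { // heartbeat path
		defer wg.Done()
		for i := 0; i < n; i++ {
			lw.Write([]byte(":keepalive\n\n"))
			lw.Flush()
		}
	}()
	go func() { // stream path; Fprintf issues one Write per call
		defer wg.Done()
		for i := 0; i < n; i++ {
			fmt.Fprintf(lw, "data: {\"i\":%d}\n\n", i)
			lw.Flush()
		}
	}()
	wg.Wait()
	frames := strings.Split(strings.TrimSuffix(rec.Body.String(), "\n\n"), "\n\n")
	if len(frames) != 2*n {
		return false
	}
	for _, f := range frames {
		isData := strings.HasPrefix(f, "data: {") && strings.HasSuffix(f, "}")
		if f != ":keepalive" && !isData {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println("frames intact:", framesIntact(200))
}
```

Running the same two goroutines against the bare recorder under go test -race would be expected to report a data race, which is the failure mode the commit describes.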
…lback-death-spiral fix: SSE streaming reliability — heartbeat race, idle watchdog, fast-path, e2e tests
Code Review Roast 🔥
Verdict: 1 Critical + 2 Warnings | Recommendation: Address before merge
🏆 Best part: The previous
💀 Worst part:
📊 Overall: A PR that swings from "decent bug fixes" to "broke the only thing that worked" in six lines. Like finding out your dishwasher also filters your air supply: the improvement is… questionable.
Files Reviewed (4 files)
Previous Review Summaries (8 snapshots, latest commit e112ea1)
Current summary above is authoritative. Previous snapshots are kept for context only.
Previous review (commit 868504a)
Verdict: 2 Issues Found | Recommendation: Address before merge
🏆 Best part: The
💀 Worst part: The heartbeat still lies to
📊 Overall: Like finding a horror movie sequel with fewer monsters, but the ones left know where you live.
Files Reviewed (4 files)
Previous review (commit 656e458)
Verdict: 4 Issues Found | Recommendation: Address before merge
🏆 Best part: The model-router fix is actually clean: returning an error for a missing streaming model is the kind of boring correctness that prevents three layers of downstream panic theater.
💀 Worst part: The heartbeat keepalive still lies to
📊 Overall: This PR cleaned up several real gremlins, but the remaining critical SSE-state issues still need attention before merge. Like a horror movie sequel: fewer monsters, but the ones left know where you live.
Files Reviewed (6 files)
Previous review (commit 3ce7feb)
Verdict: 4 Issues Found | Recommendation: Address before merge
🏆 Best part: The model-router fix is actually clean: returning an error for a missing streaming model is the kind of boring correctness that prevents three layers of downstream panic theater.
💀 Worst part: The heartbeat keepalive still lies to
📊 Overall: This PR cleaned up several real gremlins, but the remaining critical SSE-state issues still need attention before merge. Like a horror movie sequel: fewer monsters, but the ones left know where you live.
Files Reviewed (6 files)
Previous review (commit 8cf917d)
Verdict: 8 Issues Found (1 Fixed, 7 Remaining) | Recommendation: Address before merge
🏆 Best part: The Flush mutex fix is actually correct. Holding the lock across
💀 Worst part: The e2e test harness still insists on starting the proxy in the foreground, making the entire test suite a hostage situation. It's been how many reviews?
📊 Overall: One gremlin caught and fixed, seven still roaming free. Progress, I suppose, like watching paint dry, but at least the color is changing.
Files Reviewed (11 files, 1 changed)
Previous review (commit eb8e122)
Verdict: 7 Issues Found | Recommendation: Address before merge
🏆 Best part: The nil dedup guard, narrowed sanitizer, and idle-watchdog classification fixes actually landed. Annoyingly, the PR learned some manners and still brought three more gremlins.
💀 Worst part: The e2e harness still blocks before running tests. A streaming reliability PR whose test script is a server-shaped statue is comedy with production consequences.
📊 Overall: Resolved since last review:
Files Reviewed (11 files)
Previous review (commit 0d9ca7f)
Verdict: 5 Issues Found | Recommendation: Address before merge
🏆 Best part: The mutex-protected
💀 Worst part:
📊 Overall: The streaming fixes are real, but this PR also smuggled in a runtime panic and a broken e2e harness. Like bringing a fire extinguisher to a fire and also pouring gasoline on the couch.
Files Reviewed (20 files)
Previous review (commit 71235e3)
Verdict: 5 Issues Found | Recommendation: Address before merge
🏆 Best part: The mutex-protected
💀 Worst part:
📊 Overall: The streaming fixes are real, [Snapshot truncated.]
Additional previous summary content was truncated to keep this comment within platform limits.
Reviewed by nex-n2-pro:free · 416,354 tokens
requestDedup is intentionally nil when NewMessagesHandler creates the handler without a deduplicator. Guard the TryAcquire call with a nil check. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ctx.Done, e2e backgrounding
Five fixes from code review:
1. ssePayloadWritten guards on ErrStreamIdle continue paths: ProxyStream emits message_start before knowing the upstream is healthy. If ErrStreamIdle fires and we continue to the next model, the client would see duplicate message_start events. Guard all 4 streaming fallback paths (Zen Anthropic, Responses, Gemini, OpenAI) with rw.ssePayloadWritten checks: if SSE output has already started, send a stream error and return instead of continuing.
2. Narrow sanitizeAnthropicBody to delete only "type":"custom": The sanitizer was deleting every type field from every tool, not just the Claude Code "type":"custom" server-tool shorthand. Legitimate Anthropic tools that use type (e.g. "type":"function") were being stripped. Now only deletes type when the value is "custom".
3. handleAnthropicStreaming ctx.Done() distinguishes watchdog vs client: Both the idle watchdog and client disconnect cancel the same context. The select case returned ErrClientDisconnected unconditionally. Now checks clientCtx.Err(): if the client is still connected, the cancellation came from the watchdog → return ErrStreamIdle.
4. e2e-test.sh: start proxy with explicit & backgrounding and PID-based cleanup. Health check polls /health with 10s timeout instead of a blind sleep. Cleanup kills the captured PID for reliable teardown in CI.
5. nil guard on requestDedup (committed earlier at c558ae9).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
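The watchdog-versus-client distinction in item 3 can be sketched as follows. This is an illustration under assumptions, not the code in idle.go: startIdleWatchdog, classifyCancel, and simulate are hypothetical names and signatures, and only ErrStreamIdle and ErrClientDisconnected are taken from the commit message. The sketch assumes the upstream context is a child of the client context and that the watchdog cancels only the upstream one.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

var (
	ErrStreamIdle         = errors.New("stream idle")
	ErrClientDisconnected = errors.New("client disconnected")
)

// startIdleWatchdog (hypothetical signature) cancels the upstream context
// unless touch is called at least once per idle interval.
func startIdleWatchdog(cancel context.CancelFunc, idle time.Duration) (touch, stop func()) {
	t := time.AfterFunc(idle, cancel)
	return func() { t.Reset(idle) }, func() { t.Stop() }
}

// classifyCancel maps a context.Canceled error to its cause: if the client
// context is still alive, only the watchdog can have canceled the upstream.
func classifyCancel(err error, clientCtx context.Context) error {
	if !errors.Is(err, context.Canceled) {
		return err
	}
	if clientCtx.Err() == nil {
		return ErrStreamIdle
	}
	return ErrClientDisconnected
}

// simulate models a stalled upstream; clientGone selects which side cancels.
func simulate(clientGone bool) error {
	clientCtx, clientCancel := context.WithCancel(context.Background())
	defer clientCancel()
	upstreamCtx, upstreamCancel := context.WithCancel(clientCtx)
	defer upstreamCancel()

	touch, stop := startIdleWatchdog(upstreamCancel, 20*time.Millisecond)
	defer stop()
	touch() // one chunk arrives, then the upstream goes silent

	if clientGone {
		clientCancel() // cancellation propagates to the upstream context
	}
	<-upstreamCtx.Done()
	return classifyCancel(upstreamCtx.Err(), clientCtx)
}

func main() {
	fmt.Println(simulate(false)) // watchdog fired, client still connected
	fmt.Println(simulate(true))  // client went away first
}
```

In both cases the upstream context reports context.Canceled, which is why the error value alone cannot separate the two causes and the client context has to be consulted.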
…ming error handling
…andling in streaming
…s and adjust related tests
Co-authored-by: kilo-code-bot[bot] <240665456+kilo-code-bot[bot]@users.noreply.github.com> Signed-off-by: TUYIZERE Samuel <tuyizeres0@gmail.com>
Summary
Fixes three streaming SSE proxy reliability bugs and adds comprehensive streaming e2e tests. The primary symptom was API Error: InvalidHTTPResponse from the Anthropic SDK during large file writes (long-running streams where the heartbeat had time to race with SSE writes).
Changes
1. CRITICAL: ResponseWriter race condition (messages.go)
Root cause: The heartbeat goroutine (every 3s writing :keepalive\n\n + Flush) and the stream goroutine (writing SSE events + Flush) both wrote directly to the same http.ResponseWriter without synchronization. On long-running streams (e.g., large file writes), the heartbeat fires 20-40+ times, making the race almost certain.
Fix: Added sync.Mutex to the responseWriter wrapper. All Write, WriteHeader, and Flush calls are serialized. Both the heartbeat and stream path now flush through the mutex-protected wrapper.
Impact: Clean SSE output on concurrent writes. Zero data races confirmed via go test -race.
2. MEDIUM: Idle watchdog misclassification (stream.go + messages.go)
Root cause: After switching from http.ResponseController.SetReadDeadline to context-based idle detection, the idle watchdog calls cancel(), which produces context.Canceled on Read(). But isIdleTimeoutErr() only caught context.DeadlineExceeded and net.Error.Timeout(). Idle timeouts were misclassified as generic failures and logged misleadingly as "stream proxy failed".
Fix: Added explicit check: errors.Is(err, context.Canceled) && clientCtx.Err() == nil. The upstream was canceled but the client is still connected, which can only come from the watchdog. Returns ErrStreamIdle correctly.
Affected paths: ProxyStream, ProxyResponsesStream, ProxyGeminiStream, and handleAnthropicStreaming (gained clientCtx parameter).
3. MINOR: SSE fast-path content truncation (stream.go)
Root cause: The fast-path content extractor used strings.Index(data[start:], '"') to find the end of a JSON string value. This stopped at escaped \" inside the content, silently dropping everything after it.
Fix: Replaced with a byte-by-byte walk that skips \ plus the escaped character, correctly bypassing escaped quotes.
4. Refactor: idle watchdog module (idle.go, new file)
Extracted StartIdleWatchdog from stream.go into its own file. No behavior change.
5. e2e streaming tests (scripts/e2e-test.sh)
Added two new test functions: test_streaming_model (SSE with tools, verifies message_start + message_stop framing) and test_streaming_long (500-token output to exercise multiple heartbeat ticks). Five new test cases covering Go provider streaming, Zen streaming, Anthropic endpoint streaming, and long-stream heartbeat path.
Risk Analysis
clientCtx.Err() == nil check cleanly distinguishes watchdog from client disconnect. The upstream cancel() function is still called once, then becomes a no-op on subsequent calls.
Verification
- go build ./...: success
- go test -race ./...: 302 passed in 12 packages, 0 races
- bash -n scripts/e2e-test.sh: syntax OK
- go vet ./...: no issues
Known Limitations
\uXXXX); these are rare in LLM output and would be handled by the JSON fallback path instead.
🤖 Generated with Claude Code
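The escaped-quote fix described under change 3 can be sketched as below. The function names closingQuote and rawContent, and the "content" key layout, are assumptions made for illustration; the real extractor in stream.go may be structured differently. As in the limitation noted above, the sketch returns the raw, still-escaped bytes and does not decode \uXXXX sequences.

```go
package main

import (
	"bytes"
	"fmt"
)

// closingQuote returns the index of the first unescaped '"' at or after
// start, or -1 if the string value is not terminated.
func closingQuote(data []byte, start int) int {
	for i := start; i < len(data); i++ {
		switch data[i] {
		case '\\':
			i++ // skip the character that the backslash escapes
		case '"':
			return i
		}
	}
	return -1
}

// rawContent (hypothetical helper) returns the raw bytes of the "content"
// string value without unescaping them.
func rawContent(data []byte) (string, bool) {
	key := []byte(`"content":"`)
	k := bytes.Index(data, key)
	if k < 0 {
		return "", false
	}
	start := k + len(key)
	end := closingQuote(data, start)
	if end < 0 {
		return "", false
	}
	return string(data[start:end]), true
}

func main() {
	data := []byte(`{"content":"say \"hi\" now"}`)

	// Naive search stops at the first quote byte, which is the escaped one.
	start := bytes.Index(data, []byte(`"content":"`)) + len(`"content":"`)
	naiveEnd := start + bytes.IndexByte(data[start:], '"')
	fmt.Printf("naive: %s\n", data[start:naiveEnd])

	got, ok := rawContent(data)
	fmt.Printf("walk:  %s (ok=%v)\n", got, ok)
}
```

A caller that gets ok == false, for example on an unterminated value, would presumably fall back to full JSON parsing, matching the fallback path the description mentions.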