[BUG] Reasoning Content Leaks into Text Blocks Across Providers #750

@awschmeder

Reasoning Content Streaming Leak in Multiple Providers

Issue Description

When streaming responses from OpenAI-compatible providers, a single chunk delta may contain both reasoning_content (or whitespace-only content separators) and content. In several API provider handlers, the current logic eagerly yields content as text before checking for and yielding reasoningText.

This causes the "thinking" block in the UI to close prematurely. Reasoning content or intermediate formatting then leaks into the standard text block, which breaks the dedicated "thinking" rendering state and desyncs the UI.
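To illustrate the ordering problem, here is a minimal sketch of the eager pattern. It is not copied from any handler: the `Delta` and `StreamChunk` shapes are assumptions, and `extractReasoning` is a hypothetical stand-in for extractReasoningFromDelta, whose real signature is not shown in this issue.

```typescript
// Hypothetical shapes; the real handler types may differ.
interface Delta {
	content?: string | null
	reasoning_content?: string | null
	reasoning?: string | null
}

type StreamChunk = { type: "text"; text: string } | { type: "reasoning"; text: string }

// Stand-in for extractReasoningFromDelta (assumed behavior: return the
// reasoning string if one is present on the delta, otherwise undefined).
function extractReasoning(delta: Delta): string | undefined {
	const value = delta.reasoning_content ?? delta.reasoning
	return typeof value === "string" && value.length > 0 ? value : undefined
}

// Eager ordering as described above: content is emitted first, reasoning second.
function eagerChunks(delta: Delta): StreamChunk[] {
	const chunks: StreamChunk[] = []
	if (delta.content) {
		chunks.push({ type: "text", text: delta.content })
	}
	const reasoningText = extractReasoning(delta)
	if (reasoningText) {
		chunks.push({ type: "reasoning", text: reasoningText })
	}
	return chunks
}

// A single delta that carries both reasoning and a whitespace-only separator.
const mixedDelta: Delta = { reasoning_content: "Check the edge case.", content: "\n\n" }
const eagerOrder = eagerChunks(mixedDelta).map((chunk) => chunk.type)
console.log(eagerOrder)
```

With this ordering the mixed delta produces a text chunk before the reasoning chunk, which is the sequence that would close the "thinking" block early.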

This bug was recently identified in the LiteLLMHandler (src/api/providers/lite-llm.ts). However, the same problematic logic pattern persists across several other provider handlers.

Affected Providers

The following providers share the same bug: content is yielded independently of, or before, the extractReasoningFromDelta(delta) check:

  1. BaseOpenAiCompatibleProvider (src/api/providers/base-openai-compatible-provider.ts)
  2. OpenAiHandler (src/api/providers/openai.ts)
  3. DeepSeekHandler (src/api/providers/deepseek.ts)
  4. RequestyHandler (src/api/providers/requesty.ts)
  5. MimoHandler (src/api/providers/mimo.ts)
  6. QwenCodeHandler (src/api/providers/qwen-code.ts)
  7. OpenCodeGoHandler (src/api/providers/opencode-go.ts)
  8. UnboundHandler (src/api/providers/unbound.ts)

Steps to Reproduce

  1. Use an OpenAI-compatible provider (or one of the listed handlers) with a reasoning-enabled model (e.g., DeepSeek R1, a Gemini proxy, etc.).
  2. Observe the streaming behavior when the proxy/provider sends a single chunk containing both reasoning_content (or reasoning / whitespace) and content.
  3. The UI fails to render the collapsible "Thinking" block and instead dumps reasoning text or separators into the response as standard text.
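For step 2, a small fixture can stand in for a live provider. The sketch below is an assumption-laden illustration: the `ReproDelta` shape, the sample strings, and the `isMixedDelta` helper are all hypothetical and only model the kind of delta sequence described in the steps.

```typescript
// Hypothetical delta shape for an OpenAI-compatible streaming chunk.
interface ReproDelta {
	content?: string
	reasoning_content?: string
	reasoning?: string
}

// Illustrative stream: reasoning only, then a mixed delta, then plain content.
const reproDeltas: ReproDelta[] = [
	{ reasoning_content: "First, compare the two inputs." },
	{ reasoning_content: " They differ in length.", content: "\n\n" },
	{ content: "The inputs differ in length." },
]

// True when a delta carries both a reasoning field and non-empty content.
function isMixedDelta(delta: ReproDelta): boolean {
	const hasReasoning = Boolean(delta.reasoning_content || delta.reasoning)
	const hasContent = typeof delta.content === "string" && delta.content.length > 0
	return hasReasoning && hasContent
}

// Indexes of the deltas that would trigger the leak described above.
const mixedIndexes = reproDeltas
	.map((delta, index) => (isMixedDelta(delta) ? index : -1))
	.filter((index) => index >= 0)
console.log(mixedIndexes)
```

Feeding a sequence like this through a handler's streaming loop should be enough to observe whether the text chunk is emitted ahead of the reasoning chunk.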

Expected Behavior

The extractReasoningFromDelta check should be prioritized. Standard text (delta.content) should only be yielded if no reasoning content is present on the delta, ensuring that the stream chunk is treated as either reasoning or content, but not both in an order that breaks the UI state.
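One way to read this expectation is sketched below. It is a sketch under assumptions, not the project's fix: `FixDelta`, `FixChunk`, and `readReasoning` are hypothetical names, and `readReasoning` only approximates what extractReasoningFromDelta is described as doing.

```typescript
// Hypothetical shapes; the real handler types may differ.
interface FixDelta {
	content?: string | null
	reasoning_content?: string | null
	reasoning?: string | null
}

type FixChunk = { type: "text"; text: string } | { type: "reasoning"; text: string }

// Stand-in for extractReasoningFromDelta (assumed to return the reasoning
// string when the delta has one, otherwise undefined).
function readReasoning(delta: FixDelta): string | undefined {
	const value = delta.reasoning_content ?? delta.reasoning
	return typeof value === "string" && value.length > 0 ? value : undefined
}

// Reasoning is checked first; content is emitted only when no reasoning is present,
// so each delta maps to at most one chunk.
function prioritizedChunks(delta: FixDelta): FixChunk[] {
	const reasoningText = readReasoning(delta)
	if (reasoningText) {
		return [{ type: "reasoning", text: reasoningText }]
	}
	if (delta.content) {
		return [{ type: "text", text: delta.content }]
	}
	return []
}

const mixed = prioritizedChunks({ reasoning_content: "Check the edge case.", content: "\n\n" })
const plain = prioritizedChunks({ content: "Final answer." })
console.log(mixed.map((chunk) => chunk.type), plain.map((chunk) => chunk.type))
```

Under this reading, the content carried by a mixed delta is dropped rather than reordered. That matches the "either reasoning or content" wording above, but whether the separator should instead be deferred is a design choice for the actual fix.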
