Reasoning Content Streaming Leak in Multiple Providers
Issue Description
When streaming responses from OpenAI-compatible providers, a single chunk delta may contain both reasoning_content (or whitespace-only content separators) and content. In several API provider handlers, the current logic eagerly yields content as text before checking for and yielding reasoningText.
As a result, the "thinking" block in the UI closes prematurely, and reasoning content or intermediate formatting leaks into the standard text block. This breaks the dedicated "thinking" rendering state and desyncs the UI.
This bug was recently identified in the LiteLLMHandler (src/api/providers/lite-llm.ts). However, the same problematic logic pattern persists across several other provider handlers.
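The ordering problem described above can be illustrated with a minimal sketch. It is not copied from any of the handlers: the `MixedDelta` and `EmittedChunk` types and the `eagerChunksForDelta` function are hypothetical names made up for this illustration, and the real handlers yield from a stream rather than returning an array.

```typescript
// Hypothetical shapes for illustration only; the real handlers use their own types.
type MixedDelta = {
	content?: string | null
	reasoning_content?: string | null
	reasoning?: string | null
}
type EmittedChunk = { type: "text" | "reasoning"; text: string }

// Sketch of the problematic pattern: content is emitted first,
// without checking whether the same delta also carries reasoning.
function eagerChunksForDelta(delta: MixedDelta): EmittedChunk[] {
	const out: EmittedChunk[] = []
	if (delta.content) {
		out.push({ type: "text", text: delta.content })
	}
	const reasoning = delta.reasoning_content ?? delta.reasoning
	if (reasoning) {
		out.push({ type: "reasoning", text: reasoning })
	}
	return out
}

// A single delta that carries reasoning plus a whitespace-only separator.
const leakedOrder = eagerChunksForDelta({ reasoning_content: "Let me think", content: "\n\n" })
// leakedOrder[0] is the text chunk "\n\n", leakedOrder[1] is the reasoning chunk
```

In this sketch the text chunk comes out before the reasoning chunk, which is the order that would make a consumer close its "thinking" block early.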
Affected Providers
The following providers share the same bug, in which content is yielded independently of, or before, the extractReasoningFromDelta(delta) check:
- BaseOpenAiCompatibleProvider (src/api/providers/base-openai-compatible-provider.ts)
- OpenAiHandler (src/api/providers/openai.ts)
- DeepSeekHandler (src/api/providers/deepseek.ts)
- RequestyHandler (src/api/providers/requesty.ts)
- MimoHandler (src/api/providers/mimo.ts)
- QwenCodeHandler (src/api/providers/qwen-code.ts)
- OpenCodeGoHandler (src/api/providers/opencode-go.ts)
- UnboundHandler (src/api/providers/unbound.ts)
Steps to Reproduce
- Use an OpenAI-compatible provider (or one of the listed handlers) with a reasoning-enabled model (e.g., DeepSeek R1, a Gemini proxy, etc.).
- Observe the streaming behavior when the proxy/provider sends a single chunk containing both reasoning_content (or reasoning / whitespace) and content.
- The UI fails to render the collapsible "Thinking" block and instead dumps reasoning text or separators into the response as standard text.
Expected Behavior
The extractReasoningFromDelta check should be prioritized. Standard text (delta.content) should only be yielded if no reasoning content is present on the delta, ensuring that the stream chunk is treated as either reasoning or content, but not both in an order that breaks the UI state.
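One way to read the expected behavior is sketched below. This is an assumption-laden illustration, not the actual patch: the `Delta` and `StreamChunk` types and the `chunksForDelta` function are hypothetical, and the `extractReasoningFromDelta` shown here is a stand-in written for this example, since the real helper's signature is not shown in this issue.

```typescript
// Hypothetical shapes for illustration only.
type Delta = {
	content?: string | null
	reasoning_content?: string | null
	reasoning?: string | null
}
type StreamChunk = { type: "text" | "reasoning"; text: string }

// Stand-in for the helper named in this issue (assumed behavior):
// return the reasoning string if the delta carries one, otherwise undefined.
function extractReasoningFromDelta(delta: Delta): string | undefined {
	const reasoning = delta.reasoning_content ?? delta.reasoning
	return typeof reasoning === "string" && reasoning.length > 0 ? reasoning : undefined
}

// Reasoning is checked first; content is emitted only when no reasoning is present,
// so each delta produces either a reasoning chunk or a text chunk, never both.
function chunksForDelta(delta: Delta): StreamChunk[] {
	const reasoningText = extractReasoningFromDelta(delta)
	if (reasoningText) {
		return [{ type: "reasoning", text: reasoningText }]
	}
	if (delta.content) {
		return [{ type: "text", text: delta.content }]
	}
	return []
}

// The mixed delta now yields a single reasoning chunk.
const prioritized = chunksForDelta({ reasoning_content: "Let me think", content: "\n\n" })
```

Note that this sketch skips content on a mixed delta entirely, following the wording above. Whether dropping non-whitespace content in that case is acceptable is a decision for the actual fix.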
Related Issues
c56f1ff5564cfeb016043c8eec145c1108cc7431 (branch: fix/gemini-thought-signature)