fix(tongyi): handle edge case where reasoning_content and content coexist in same streaming chunk#3282
fix(tongyi): handle edge case where reasoning_content and content coexist in same streaming chunk#3282ZhXZhao wants to merge 2 commits into
Conversation
…xist in same streaming chunk DashScope/Bailian API occasionally returns both reasoning_content and content as non-empty in the same streaming chunk at the transition boundary. The previous implementation overwrites the content variable with reasoning_content when reasoning_content is truthy, silently discarding the content value. This fix refactors _wrap_thinking_by_reasoning_content to use an accumulator pattern (similar to the fix in openai_api_compatible plugin PR langgenius#3031), ensuring both fields are properly handled when they coexist. Closes langgenius#3277
There was a problem hiding this comment.
Code Review
This pull request refactors the streaming reasoning and content wrapping logic in the Tongyi plugin to better handle edge cases, such as when both reasoning and content are received in the same chunk, and adds corresponding unit tests. The review feedback suggests using Pythonic implicit truthiness checks instead of explicit length comparisons (e.g., len(seq) > 0), which adheres to PEP 8 and prevents a potential TypeError if content is None.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| output = "" | ||
| if len(reasoning_content) > 0: | ||
| if not is_reasoning: | ||
| # Open a think block on first reasoning token | ||
| output += f"<think>\n{reasoning_content}" | ||
| is_reasoning = True | ||
| else: | ||
| # Continue streaming inside the think block | ||
| output += reasoning_content | ||
|
|
||
| if is_reasoning: | ||
| if len(reasoning_content) == 0 and len(content) == 0: | ||
| # No reasoning or content token, close the think block | ||
| is_reasoning = False | ||
| output += "\n</think>" | ||
| # Handle edge case: both reasoning_content and content are non-empty | ||
| # in the same chunk (DashScope/Bailian API occasionally does this at | ||
| # the transition boundary between reasoning and content phases) | ||
| if len(content) > 0: | ||
| is_reasoning = False | ||
| output += f"\n</think>{content}" | ||
| elif len(content) > 0: | ||
| # No reasoning token and not in a reasoning block | ||
| output += content |
There was a problem hiding this comment.
According to the PEP 8 style guide, we should use the implicit truthiness of empty sequences (like strings) instead of checking their length (e.g., len(seq) > 0 or len(seq) == 0).
Additionally, using len(content) is risky here. If content is parsed as None (which can happen if content is a list of dicts where the first dict lacks a "text" key, resulting in content being set to None on line 831), calling len(content) will raise a TypeError: object of type 'NoneType' has no len(). Using the idiomatic if content: check is robust against None values because None is falsy in Python, avoiding any runtime exceptions.
output = ""
if reasoning_content:
if not is_reasoning:
# Open a think block on first reasoning token
output += f"<think>\n{reasoning_content}"
is_reasoning = True
else:
# Continue streaming inside the think block
output += reasoning_content
if is_reasoning:
if not reasoning_content and not content:
# No reasoning or content token, close the think block
is_reasoning = False
output += "\n</think>"
# Handle edge case: both reasoning_content and content are non-empty
# in the same chunk (DashScope/Bailian API occasionally does this at
# the transition boundary between reasoning and content phases)
if content:
is_reasoning = False
output += f"\n</think>{content}"
elif content:
# No reasoning token and not in a reasoning block
output += contentReferences
- PEP 8 recommends using the implicit truthiness of empty sequences (e.g.,
if seq:orif not seq:) instead of comparing their length to 0. (link)
| output = "" | ||
| if len(reasoning_content) > 0: | ||
| if not is_reasoning: | ||
| # Open a think block on first reasoning token | ||
| output += f"<think>\n{reasoning_content}" | ||
| is_reasoning = True | ||
| else: | ||
| # Continue streaming inside the think block | ||
| output += reasoning_content | ||
|
|
||
| if is_reasoning: | ||
| if len(reasoning_content) == 0 and len(content) == 0: | ||
| # No reasoning or content token, close the think block | ||
| is_reasoning = False | ||
| output += "\n</think>" | ||
| # Handle edge case: both reasoning_content and content are non-empty | ||
| # in the same chunk (DashScope/Bailian API occasionally does this at | ||
| # the transition boundary between reasoning and content phases) | ||
| if len(content) > 0: | ||
| is_reasoning = False | ||
| output += f"\n</think>{content}" | ||
| elif len(content) > 0: | ||
| # No reasoning token and not in a reasoning block | ||
| output += content |
There was a problem hiding this comment.
To keep the test helper function in sync with the production code in llm.py, we should apply the same PEP 8 improvement here. Using implicit truthiness (e.g., if reasoning_content: and if content:) is more Pythonic and robust against potential None values.
output = ""
if reasoning_content:
if not is_reasoning:
# Open a think block on first reasoning token
output += f"<think>\n{reasoning_content}"
is_reasoning = True
else:
# Continue streaming inside the think block
output += reasoning_content
if is_reasoning:
if not reasoning_content and not content:
# No reasoning or content token, close the think block
is_reasoning = False
output += "\n</think>"
# Handle edge case: both reasoning_content and content are non-empty
# in the same chunk (DashScope/Bailian API occasionally does this at
# the transition boundary between reasoning and content phases)
if content:
is_reasoning = False
output += f"\n</think>{content}"
elif content:
# No reasoning token and not in a reasoning block
output += contentReferences
- PEP 8 recommends using the implicit truthiness of empty sequences (e.g.,
if seq:orif not seq:) instead of comparing their length to 0. (link)
Summary
Fixes a bug in the tongyi plugin where streaming content gets silently dropped when a single SSE chunk contains both
reasoning_content(non-empty) andcontent(non-empty) at the same time. This happens at the reasoning-to-answer transition for some DashScope/Bailian models (e.g. Qwen3 thinking models).Closes #3277
Root Cause
In
_wrap_thinking_by_reasoning_content(models/tongyi/models/llm/llm.py), whenreasoning_contentis truthy, the original code overwrites thecontentvariable entirely:The fallback branch
elif is_reasoning and content:only runs whenreasoning_contentis empty, so on a transitional chunk where both fields are non-empty, thecontentpart is lost.Fix
Adopt the same accumulator pattern that PR #3031 introduced for the
openai_api_compatibleplugin, adapted to tongyi's implementation. The new logic uses anoutputaccumulator and+=, and explicitly handles the case wherereasoning_contentandcontentare both present by appending the reasoning piece, closing</think>, then appending the content piece.Key behavior:
<think>block (existing behavior preserved).reasoning_contentnon-empty: open / continue the<think>block (existing behavior preserved).contentnon-empty (after reasoning ended): plain pass-through (existing behavior preserved).</think>, then append content (NEW — this is the fix).Changes
models/tongyi/models/llm/llm.py— rewrote_wrap_thinking_by_reasoning_contentusing accumulator pattern; simplified exception handling.models/tongyi/tests/test_reason_wrapper.py— new test file with 9 cases covering normal reasoning flow, plain content flow, the edge case fix, andreasoning_contentas a list.models/tongyi/manifest.yaml— bumped version0.2.0→0.2.1.Test Plan
reasoning_content="Z",content="Hello") now produces"Z\n</think>Hello"withis_reasoning=False, instead of dropping"Hello"<think>, continue reasoning, close on empty delta, plain content) preservedReference
This fix mirrors the approach in PR #3031 (which fixed the same class of bug in
openai_api_compatible). The two plugins have independent implementations of_wrap_thinking_by_reasoning_content, so each needs its own fix.