fix(tongyi): handle edge case where reasoning_content and content coexist in same streaming chunk #3282

Open
ZhXZhao wants to merge 2 commits into langgenius:main from ZhXZhao:fix/tongyi-reasoning-content-loss

@ZhXZhao ZhXZhao commented Jun 12, 2026

Summary

Fixes a bug in the tongyi plugin where streaming content gets silently dropped when a single SSE chunk contains both reasoning_content (non-empty) and content (non-empty) at the same time. This happens at the reasoning-to-answer transition for some DashScope/Bailian models (e.g. Qwen3 thinking models).

Closes #3277

Root Cause

In _wrap_thinking_by_reasoning_content (models/tongyi/models/llm/llm.py), when reasoning_content is truthy, the original code overwrites the content variable entirely:

if reasoning_content:
    if not is_reasoning:
        content = "<think>\n" + reasoning_content   # original content discarded
    else:
        content = reasoning_content                  # original content discarded

The fallback branch elif is_reasoning and content: only runs when reasoning_content is empty, so on a transitional chunk where both fields are non-empty, the content part is lost.
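The drop can be reproduced in isolation with a reduced reconstruction of the old logic. This is a sketch, not the plugin's actual method: `legacy_wrap` is a hypothetical standalone function, and both the `is_reasoning = True` assignment and the body of the `elif` branch are assumptions, since the snippet above does not show them.

```python
from typing import Tuple


def legacy_wrap(reasoning_content: str, content: str, is_reasoning: bool) -> Tuple[str, bool]:
    """Reduced reconstruction of the pre-fix logic (hypothetical helper)."""
    if reasoning_content:
        if not is_reasoning:
            content = "<think>\n" + reasoning_content  # original content discarded
            is_reasoning = True  # assumed: state flips when the block opens
        else:
            content = reasoning_content  # original content discarded
    elif is_reasoning and content:
        # Assumed body: close the think block and pass the content through
        content = "\n</think>" + content
        is_reasoning = False
    return content, is_reasoning


# Transitional chunk: both fields are non-empty while already reasoning
text, state = legacy_wrap("Z", "Hello", True)
print(repr(text), state)  # 'Z' True
```

"Hello" never reaches the output, and the state stays in reasoning mode, so the `</think>` tag is not emitted for this chunk either.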

Fix

Adopt the same accumulator pattern that PR #3031 introduced for the openai_api_compatible plugin, adapted to tongyi's implementation. The new logic uses an output accumulator and +=, and explicitly handles the case where reasoning_content and content are both present by appending the reasoning piece, closing </think>, then appending the content piece.

Key behavior:

  • Both fields empty in the same chunk while inside a <think> block: close the block (existing behavior preserved).
  • Only reasoning_content non-empty: open / continue the <think> block (existing behavior preserved).
  • Only content non-empty (after reasoning ended): plain pass-through (existing behavior preserved).
  • Both fields non-empty in the same chunk: append reasoning, close </think>, then append content (new; this is the fix).
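The four behaviors above can be sketched as a standalone function. This follows the accumulator logic quoted in the review below, using truthiness checks. `wrap_delta` and its tuple return value are hypothetical names chosen for illustration; the real method lives on the plugin's model class and may differ in signature.

```python
from typing import Optional, Tuple


def wrap_delta(
    reasoning_content: Optional[str], content: Optional[str], is_reasoning: bool
) -> Tuple[str, bool]:
    """Return (text to emit, updated is_reasoning) for one streaming delta."""
    output = ""
    if reasoning_content:
        if not is_reasoning:
            # Open a think block on the first reasoning token
            output += f"<think>\n{reasoning_content}"
            is_reasoning = True
        else:
            # Continue streaming inside the think block
            output += reasoning_content

    if is_reasoning:
        if not reasoning_content and not content:
            # Empty delta while reasoning: close the think block
            is_reasoning = False
            output += "\n</think>"
        if content:
            # Content arrived while reasoning (possibly alongside reasoning):
            # close the block first, then append the content
            is_reasoning = False
            output += f"\n</think>{content}"
    elif content:
        # Plain content outside a think block
        output += content
    return output, is_reasoning


print(wrap_delta("", "", True))        # ('\n</think>', False)
print(wrap_delta("A", "", False))      # ('<think>\nA', True)
print(wrap_delta("", "Hi", False))     # ('Hi', False)
print(wrap_delta("Z", "Hello", True))  # ('Z\n</think>Hello', False)
```

Because every branch appends to `output` instead of reassigning `content`, no branch can overwrite what an earlier branch produced.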

Changes

  • models/tongyi/models/llm/llm.py: rewrote _wrap_thinking_by_reasoning_content using an accumulator pattern; simplified exception handling.
  • models/tongyi/tests/test_reason_wrapper.py: new test file with 9 cases covering normal reasoning flow, plain content flow, the edge case fix, and reasoning_content as a list.
  • models/tongyi/manifest.yaml: bumped version 0.2.0 → 0.2.1.

Test Plan

  • All 9 unit tests pass locally
  • Edge case (reasoning_content="Z", content="Hello") now produces "Z\n</think>Hello" with is_reasoning=False, instead of dropping "Hello"
  • Existing flows (open <think>, continue reasoning, close on empty delta, plain content) preserved
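For an end-to-end view, the same rules can be run over a whole simulated stream. This is a sketch under stated assumptions: `wrap_stream` is a hypothetical generator, the deltas are invented sample data rather than captured DashScope output, and the branch order is condensed but intended to be equivalent to the logic under review.

```python
from typing import Iterable, Iterator, Tuple


def wrap_stream(deltas: Iterable[Tuple[str, str]]) -> Iterator[str]:
    """Yield wrapped text for each (reasoning_content, content) delta."""
    is_reasoning = False  # state carried across chunks
    for reasoning_content, content in deltas:
        output = ""
        if reasoning_content:
            # Open the think block on the first reasoning token, else continue it
            output += reasoning_content if is_reasoning else f"<think>\n{reasoning_content}"
            is_reasoning = True
        if is_reasoning:
            if content:
                # Transition chunk: close the block, then emit the content
                output += f"\n</think>{content}"
                is_reasoning = False
            elif not reasoning_content:
                # Empty delta while reasoning: close the block
                output += "\n</think>"
                is_reasoning = False
        elif content:
            output += content
        yield output


# Invented sample stream; the third delta carries both fields at once
sample = [("Let me think", ""), (" more.", ""), ("Z", "Hello"), ("", " world")]
print(repr("".join(wrap_stream(sample))))
# '<think>\nLet me think more.Z\n</think>Hello world'
```

The joined text contains exactly one opening and one closing tag, with "Hello" preserved right after the close.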

Reference

This fix mirrors the approach in PR #3031 (which fixed the same class of bug in openai_api_compatible). The two plugins have independent implementations of _wrap_thinking_by_reasoning_content, so each needs its own fix.

…xist in same streaming chunk

DashScope/Bailian API occasionally returns both reasoning_content and content
as non-empty in the same streaming chunk at the transition boundary. The previous
implementation overwrites the content variable with reasoning_content when
reasoning_content is truthy, silently discarding the content value.

This fix refactors _wrap_thinking_by_reasoning_content to use an accumulator
pattern (similar to the fix in openai_api_compatible plugin PR langgenius#3031), ensuring
both fields are properly handled when they coexist.

Closes langgenius#3277
@dosubot dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. bug Something isn't working labels Jun 12, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request refactors the streaming reasoning and content wrapping logic in the Tongyi plugin to better handle edge cases, such as when both reasoning and content are received in the same chunk, and adds corresponding unit tests. The review feedback suggests using Pythonic implicit truthiness checks instead of explicit length comparisons (e.g., len(seq) > 0), which adheres to PEP 8 and prevents a potential TypeError if content is None.

Comment on lines +841 to +864
output = ""
if len(reasoning_content) > 0:
    if not is_reasoning:
        # Open a think block on first reasoning token
        output += f"<think>\n{reasoning_content}"
        is_reasoning = True
    else:
        # Continue streaming inside the think block
        output += reasoning_content

if is_reasoning:
    if len(reasoning_content) == 0 and len(content) == 0:
        # No reasoning or content token, close the think block
        is_reasoning = False
        output += "\n</think>"
    # Handle edge case: both reasoning_content and content are non-empty
    # in the same chunk (DashScope/Bailian API occasionally does this at
    # the transition boundary between reasoning and content phases)
    if len(content) > 0:
        is_reasoning = False
        output += f"\n</think>{content}"
elif len(content) > 0:
    # No reasoning token and not in a reasoning block
    output += content

medium

According to the PEP 8 style guide, we should use the implicit truthiness of empty sequences (like strings) instead of checking their length (e.g., len(seq) > 0 or len(seq) == 0).

Additionally, using len(content) is risky here. If content is parsed as None (which can happen if content is a list of dicts where the first dict lacks a "text" key, resulting in content being set to None on line 831), calling len(content) will raise a TypeError: object of type 'NoneType' has no len(). Using the idiomatic if content: check is robust against None values because None is falsy in Python, avoiding any runtime exceptions.

            output = ""
            if reasoning_content:
                if not is_reasoning:
                    # Open a think block on first reasoning token
                    output += f"<think>\n{reasoning_content}"
                    is_reasoning = True
                else:
                    # Continue streaming inside the think block
                    output += reasoning_content

            if is_reasoning:
                if not reasoning_content and not content:
                    # No reasoning or content token, close the think block
                    is_reasoning = False
                    output += "\n</think>"
                # Handle edge case: both reasoning_content and content are non-empty
                # in the same chunk (DashScope/Bailian API occasionally does this at
                # the transition boundary between reasoning and content phases)
                if content:
                    is_reasoning = False
                    output += f"\n</think>{content}"
            elif content:
                # No reasoning token and not in a reasoning block
                output += content
References
  1. PEP 8 recommends using the implicit truthiness of empty sequences (e.g., if seq: or if not seq:) instead of comparing their length to 0.
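The None hazard described in this comment can be checked in isolation. A minimal sketch: `has_text_len` and `has_text_truthy` are hypothetical helpers, and the None value stands in for the parsing outcome the reviewer describes.

```python
from typing import Optional


def has_text_len(value: Optional[str]) -> bool:
    # Length-based check: raises TypeError when value is None
    return len(value) > 0


def has_text_truthy(value: Optional[str]) -> bool:
    # Truthiness check: None and "" are both falsy
    return bool(value)


for sample in ("Hello", "", None):
    try:
        by_len = has_text_len(sample)
    except TypeError as exc:
        by_len = f"TypeError: {exc}"
    print(repr(sample), "| len:", by_len, "| truthy:", has_text_truthy(sample))
```

For str inputs the two checks agree; they differ only for None, where the length-based version raises.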

Comment on lines +26 to +49
output = ""
if len(reasoning_content) > 0:
    if not is_reasoning:
        # Open a think block on first reasoning token
        output += f"<think>\n{reasoning_content}"
        is_reasoning = True
    else:
        # Continue streaming inside the think block
        output += reasoning_content

if is_reasoning:
    if len(reasoning_content) == 0 and len(content) == 0:
        # No reasoning or content token, close the think block
        is_reasoning = False
        output += "\n</think>"
    # Handle edge case: both reasoning_content and content are non-empty
    # in the same chunk (DashScope/Bailian API occasionally does this at
    # the transition boundary between reasoning and content phases)
    if len(content) > 0:
        is_reasoning = False
        output += f"\n</think>{content}"
elif len(content) > 0:
    # No reasoning token and not in a reasoning block
    output += content

medium

To keep the test helper function in sync with the production code in llm.py, we should apply the same PEP 8 improvement here. Using implicit truthiness (e.g., if reasoning_content: and if content:) is more Pythonic and robust against potential None values.

        output = ""
        if reasoning_content:
            if not is_reasoning:
                # Open a think block on first reasoning token
                output += f"<think>\n{reasoning_content}"
                is_reasoning = True
            else:
                # Continue streaming inside the think block
                output += reasoning_content

        if is_reasoning:
            if not reasoning_content and not content:
                # No reasoning or content token, close the think block
                is_reasoning = False
                output += "\n</think>"
            # Handle edge case: both reasoning_content and content are non-empty
            # in the same chunk (DashScope/Bailian API occasionally does this at
            # the transition boundary between reasoning and content phases)
            if content:
                is_reasoning = False
                output += f"\n</think>{content}"
        elif content:
            # No reasoning token and not in a reasoning block
            output += content
References
  1. PEP 8 recommends using the implicit truthiness of empty sequences (e.g., if seq: or if not seq:) instead of comparing their length to 0.

@ZhXZhao ZhXZhao deployed to models/tongyi June 12, 2026 10:28 — with GitHub Actions Active

Development

Successfully merging this pull request may close these issues.

[Bug] Tongyi plugin loses content when reasoning_content and content coexist in the same streaming chunk