feat(chat): streaming tool-call deltas + Qwen-friendly arg normalisation by dusterbloom · Pull Request #164 · panbanda/higgs

dusterbloom · 2026-05-21T14:12:02Z

Summary

Completes the streaming tool-call story that was scoped in closed PR #63
but never landed. Previously the streaming /v1/chat/completions route
stripped tools from the prompt and warned "tool_calls deltas are
unsupported"; the model, deprived of tool context, would hallucinate
fake tool calls as plain text, and voice-agent clients would speak them
out loud. After this PR the streaming route extracts
<tool_call>…</tool_call> blocks from the model output on the fly and
emits proper ToolCallDelta SSE events.

Three pieces (~580 LOC)

`tool_parser::StreamingToolCallTracker`

New state machine next to the existing parse_tool_calls. Buffers
streamed chunks, extracts complete <tool_call>…</tool_call> blocks
on the fly. When active=false (no tools in request) collapses to a
single-allocation passthrough with zero parsing cost.

8 new unit tests verify the invariants:

- active=false → pure passthrough
- complete tag in one chunk → tool call emitted, no visible text
- tag split across N chunks → tracker reassembles
- text before AND after → both visible, tool extracted
- invalid JSON inside tag → preserved as visible (no silent loss)
- unclosed tag at flush → buffered prefix emitted as visible
- UTF-8 char-boundary safety on tail flush
- multiple calls with text between → indices tracked correctly
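
The buffering behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the PR's implementation: `SketchTracker`, `SketchOutput`, and `partial_open_suffix` are hypothetical names, tool-call bodies are returned as raw strings rather than parsed JSON, and the invalid-JSON and buffer-cap cases are left out.

```rust
/// Result of one `process` call (hypothetical shape, loosely modelled on the
/// `StreamingToolOutput` mentioned in the review).
#[derive(Debug, Default, PartialEq)]
pub struct SketchOutput {
    /// Text that is safe to forward to the client as a content delta.
    pub visible: String,
    /// Raw bodies of the tool-call blocks completed by this chunk.
    pub tool_call_bodies: Vec<String>,
}

const OPEN: &str = "<tool_call>";
const CLOSE: &str = "</tool_call>";

pub struct SketchTracker {
    active: bool,
    buffer: String,
}

impl SketchTracker {
    pub fn new(active: bool) -> Self {
        Self { active, buffer: String::new() }
    }

    pub fn process(&mut self, chunk: &str) -> SketchOutput {
        if !self.active {
            // No tools in the request: forward the chunk untouched.
            return SketchOutput { visible: chunk.to_owned(), tool_call_bodies: Vec::new() };
        }
        self.buffer.push_str(chunk);
        let mut out = SketchOutput::default();
        loop {
            let Some(start) = self.buffer.find(OPEN) else {
                // No open tag: emit everything except a tail that could still
                // turn out to be the beginning of `<tool_call>`.
                let keep = partial_open_suffix(&self.buffer);
                let cut = self.buffer.len() - keep;
                out.visible.push_str(&self.buffer[..cut]);
                self.buffer.drain(..cut);
                break;
            };
            // Text before the open tag is always visible.
            out.visible.push_str(&self.buffer[..start]);
            let body_start = start + OPEN.len();
            match self.buffer[body_start..].find(CLOSE) {
                Some(rel_end) => {
                    let body_end = body_start + rel_end;
                    out.tool_call_bodies.push(self.buffer[body_start..body_end].trim().to_owned());
                    self.buffer.drain(..body_end + CLOSE.len());
                }
                None => {
                    // Open tag without a close yet: keep it buffered.
                    self.buffer.drain(..start);
                    break;
                }
            }
        }
        out
    }

    /// At end of stream, whatever is still buffered is surfaced as visible text.
    pub fn flush(&mut self) -> String {
        std::mem::take(&mut self.buffer)
    }
}

/// Length of the longest proper prefix of `OPEN` that `text` ends with.
/// The prefix is ASCII, so cutting in front of it lands on a char boundary.
fn partial_open_suffix(text: &str) -> usize {
    (1..OPEN.len()).rev().find(|&n| text.ends_with(&OPEN[..n])).unwrap_or(0)
}
```

Feeding `"Hi <tool_"` and then the rest of the tag yields `"Hi "` as visible text first and the call body only once the closing tag arrives; a stream that ends inside an unclosed tag gets its buffered prefix back from `flush`.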

`chat::chat_completions_stream` wiring

- Pass req.tools.as_deref() to prepare_chat_prompt_with_thinking so
  the chat template renders the tool spec the model recognises (was
  always None before).
- Wrap the reasoning-tracker's visible text through
  StreamingToolCallTracker::process and emit a ToolCallDelta SSE
  event per completed call.
- Defer finish_reason until after the tracker has drained, so
  "tool_calls" is reported when the response carried any tool calls.
- Final flush drains both trackers, so no tokens vanish silently.
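
Why the deferral matters: a tool call can complete during the final flush, so the finish reason can only be decided after the tracker has drained. A minimal sketch of that decision, assuming a hypothetical helper `final_finish_reason` and a running count of completed calls (neither name is taken from the PR):

```rust
/// Pick the `finish_reason` to report once the tool tracker has been flushed.
/// `model_reason` is whatever the generator reported (for example "stop").
pub fn final_finish_reason(model_reason: &str, completed_tool_calls: usize) -> &str {
    if completed_tool_calls > 0 {
        // At least one structured tool call was emitted during the stream.
        "tool_calls"
    } else {
        model_reason
    }
}
```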

`chat_template::normalize_tool_call_for_template`

New helper that walks a tool-call JSON value and:

- Hoists function.{name,arguments} to the top level: Qwen's
  chat_template.jinja references tool_call.name and
  tool_call.arguments directly, not the OpenAI-nested shape.
- Parses string-encoded arguments to a JSON value, which fixes the
  cannot convert value into pairs (in chat:120) minijinja crash that
  killed multi-turn conversations carrying assistant tool_calls in
  their history.

Called from convert_messages so both streaming and non-streaming paths
get the fix. 4 new unit tests cover the OpenAI shape, Qwen-flat shape
(no-op), unparseable-string arguments (kept as string), and non-object
inputs (no-op).
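
The two normalisation steps can be illustrated without any external crate. The sketch below is an assumption-laden stand-in: `Json` is a minimal substitute for a real JSON value type, `normalize_sketch` is a hypothetical name, and the JSON parser is injected as a function argument instead of being called directly.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a JSON value, just enough for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Str(String),
    Object(BTreeMap<String, Json>),
}

/// Hoist `function.name` / `function.arguments` to the top level, then decode
/// string-encoded arguments with the caller-supplied `parse` function.
pub fn normalize_sketch(call: &mut Json, parse: fn(&str) -> Option<Json>) {
    // Non-object input: nothing to do.
    let Json::Object(map) = call else { return };

    // Copy the nested fields out first so the map can be mutated afterwards.
    let nested = match map.get("function") {
        Some(Json::Object(f)) => Some((f.get("name").cloned(), f.get("arguments").cloned())),
        _ => None,
    };
    if let Some((name, args)) = nested {
        // An already-flat (Qwen-style) call keeps its existing top-level fields.
        if let Some(name) = name {
            map.entry("name".to_owned()).or_insert(name);
        }
        if let Some(args) = args {
            map.entry("arguments".to_owned()).or_insert(args);
        }
    }

    // Decode string-encoded arguments; an unparseable string stays a string.
    let decoded = match map.get("arguments") {
        Some(Json::Str(raw)) => parse(raw),
        _ => None,
    };
    if let Some(value) = decoded {
        map.insert("arguments".to_owned(), value);
    }
}
```

Passing the parser in keeps the sketch self-contained; in a real code base the same shape would most likely be written directly against the project's JSON library.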

Test plan

cargo test -p higgs-models --lib — 242 passed
cargo test -p higgs-engine --lib — 338 passed (+27 tracker tests, +4 normalize tests, 25 ignored)
cargo test -p higgs --release --lib — 457 passed (no regression on existing convert_messages tests)
cargo clippy -p higgs -p higgs-engine -p higgs-models --tests --all-features -- -Dwarnings — clean
Release binary built and the new log strings shipped (Streaming with tool-calls enabled; will emit tool_calls deltas via StreamingToolCallTracker)
CI live integration smoke

Closes/replaces

Supersedes closed #63 ("feat: streaming tool call support + Hermes XML
parser") with a surgical port targeting current main — the original
PR's diff has shrunk from 40 files / 2K+ lines to 3 files / 580 lines
now that the Qwen3.5/3.6 + TurboQuant + paged-cache base stack has
landed.

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Streaming chat now emits structured tool-call events (SSE) instead of raw tags and defers final reason when tool-calls occurred.
- Parsing accepts both JSON and XML tool-call forms and applies per-request schema-based coercion.
- Tool-call data is normalized for template compatibility (hoists nested fields, parses string-encoded arguments).
Bug Fixes / Reliability
- Robust streaming extraction across chunk splits, unclosed-tag flushes, UTF‑8 boundary safety, and capped buffering to avoid unbounded growth.
Tests
- Expanded test coverage for parsing, streaming, normalization, and edge cases.

The streaming `/v1/chat/completions` route previously stripped tools from the prompt (`prepare_chat_prompt_with_thinking(..., None, ...)`) and warned that "tool_calls deltas are unsupported". The model, deprived of tool context, would hallucinate fake tool calls as plain text, and the client's TTS pipeline would happily speak `<nanobot>read_skill newsreader</nanobot>` out loud.

This commit completes the streaming tool-call story that was documented but never finished (see closed upstream PR panbanda#63). Three pieces, ~580 net new lines:

1. `tool_parser::StreamingToolCallTracker`

New `pub struct` next to the existing `parse_tool_calls`. A small state machine that buffers streamed text chunks and extracts complete `<tool_call>{json}</tool_call>` blocks on the fly. When `active=false` (no tools in the request) it collapses to a single-allocation passthrough with zero parsing cost.

Invariants verified by 8 new unit tests:
- active=false → pure passthrough
- single complete tag in one chunk
- tag split across N chunks → reassembled
- text before AND after tags → both visible
- invalid JSON inside tag → preserved as visible (no silent loss)
- unclosed tag at flush → buffered prefix emitted as visible
- UTF-8 char-boundary safety on tail flush
- multiple calls with text between → indices tracked correctly

2. `chat::chat_completions_stream` wiring

- Pass `req.tools.as_deref()` (not None) to `prepare_chat_prompt_with_thinking` so the chat template renders the tool spec the model recognises.
- Construct a `StreamingToolCallTracker` keyed off `stream_includes_tools`.
- On every chunk: route the reasoning-tracker's visible text through the tool tracker; emit `ToolCallDelta` SSE events for each completed call; emit content delta for the surviving visible text.
- Defer `finish_reason` until after the tool tracker has drained, so we report `"tool_calls"` when the response actually contained any.
- Final flush drains both trackers so no tokens vanish silently.

3. `chat_template::normalize_tool_call_for_template`

New `pub fn` that walks a tool-call JSON value and (a) hoists `function.{name,arguments}` to the top level, because Qwen's `chat_template.jinja` references `tool_call.name` and `tool_call.arguments` directly, not the OpenAI-nested shape; (b) parses string-encoded `arguments` to a JSON value, fixing the `cannot convert value into pairs (in chat:120)` minijinja error that crashed multi-turn conversations carrying assistant tool_calls in their history. Verified by 4 new unit tests covering OpenAI shape, Qwen-flat shape (no-op), unparseable-string arguments (kept as string), and non-object inputs (no-op).

`convert_messages` in the chat route calls the normaliser per tool call so both the streaming and non-streaming paths get the fix.

Test impact:
- higgs-models: 242 passed (no change)
- higgs-engine: 338 passed (+27 tracker tests, +4 normalize tests, 25 ignored)
- higgs: 457 passed (no regression on existing convert_messages tests)
- cargo clippy -Dwarnings: clean across all three crates

coderabbitai · 2026-05-21T14:12:16Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5dd580ee-7b4d-44db-8618-1ce261069969

📥 Commits

Reviewing files that changed from the base of the PR and between ab8e8fe and c793a89.

📒 Files selected for processing (3)

crates/higgs-engine/src/chat_template.rs
crates/higgs-engine/src/tool_parser.rs
crates/higgs/src/routes/chat.rs

🚧 Files skipped from review as they are similar to previous changes (3)

crates/higgs-engine/src/chat_template.rs
crates/higgs/src/routes/chat.rs
crates/higgs-engine/src/tool_parser.rs

📝 Walkthrough

Walkthrough

Adds Hermes XML and JSON tool-call parsing, a StreamingToolCallTracker to buffer and extract <tool_call> blocks across chunks, template normalization for tool-call JSON, and integration into the chat streaming pipeline to emit structured ToolCallDelta SSE events and defer finish_reason when tool calls occur.

Changes

Streaming Tool-Call Support

Layer / File(s)	Summary
Tool-call template normalization `crates/higgs-engine/src/chat_template.rs`, `crates/higgs/src/routes/chat.rs`	`normalize_tool_call_for_template` flattens OpenAI nested `function.{name,arguments}` to top-level `{name,arguments}`, parses string-encoded JSON arguments into structured values, coerces non-mapping arguments to `{}`, and adds unit tests and lint adjustments.
Tool parser XML/JSON and schema coercion `crates/higgs-engine/src/tool_parser.rs`	Add Hermes XML parsing and `ToolSchema` to coerce XML `<parameter>` raw strings into typed JSON (int/bool/object/array) when schema provided; update `parse_tool_calls(text, schema)` signature and module documentation.
StreamingToolCallTracker and streaming parsing `crates/higgs-engine/src/tool_parser.rs`	Implement `StreamingToolOutput` and `StreamingToolCallTracker` to buffer chunked text, detect/extract `<tool_call>` boundaries, preserve invalid blocks as visible text, cap unclosed-tag buffering, provide UTF-8-safe flush, and add comprehensive tests for chunking and edge cases.
Chat completions streaming integration `crates/higgs/src/routes/chat.rs`	Include tools in prompt rendering, construct tracker with optional `ToolSchema`, route visible output through the tracker, emit `tool_calls` SSE deltas and `content` deltas per chunk, drain/flush tracker at stream end, and defer/override `finish_reason` to `"tool_calls"` when calls were detected. Logging updated to debug.

Sequence Diagram

sequenceDiagram
  participant ChatAPI
  participant StreamGenerator
  participant ReasoningTracker
  participant ToolTracker
  participant SSEOutput
  ChatAPI->>StreamGenerator: start streaming with tools
  StreamGenerator->>ReasoningTracker: produce chunk (reasoning, visible)
  ReasoningTracker->>ToolTracker: visible text
  ToolTracker->>ToolTracker: extract <tool_call> blocks across chunks
  ToolTracker->>SSEOutput: emit ToolCallDelta for each completed call
  ToolTracker->>SSEOutput: emit content with remaining visible text
  StreamGenerator->>ToolTracker: flush() on stream end
  ToolTracker->>SSEOutput: emit buffered unclosed content and final finish_reason

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰
I hop through bytes and buffered streams,
I chase the tags in broken dreams,
I stitch the JSON, XML too,
Emit the calls so outputs chew —
A rabbit’s patch to parse for you. 🥕

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main changes: streaming tool-call deltas and tool-call argument normalization for Qwen compatibility, which are the core objectives of the PR.
Linked Issues check	✅ Passed	The PR fulfills all coding objectives from issue `#63`: implements StreamingToolCallTracker for extracting tool calls from streams, adds normalize_tool_call_for_template for template compatibility, supports Qwen XML parsing, and enables end-to-end streaming tool-call support.
Out of Scope Changes check	✅ Passed	All changes are directly related to the linked objectives: three modified files (chat_template.rs, tool_parser.rs, chat.rs) implement streaming tool calls, Qwen XML parsing, normalization, and template support without introducing unrelated functionality.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

crates/higgs-engine/src/tool_parser.rs (1)
101-104: ⚡ Quick win

Document StreamingToolOutput public fields.

StreamingToolOutput.visible and StreamingToolOutput.new_tool_calls are public API fields and should have field-level rustdoc comments for discoverability.

As per coding guidelines, **/*.rs: Add doc comments on public structs/fields in Rust when changing user-facing behavior.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/higgs-engine/src/tool_parser.rs` around lines 101 - 104, Add rustdoc
comments for the public struct StreamingToolOutput and each of its public
fields: document what StreamingToolOutput represents, and add brief field-level
comments for visible (what the string contains and visibility/format
expectations) and new_tool_calls (what ParsedToolCall entries represent and when
they are populated), referencing StreamingToolOutput, visible, and
new_tool_calls so consumers can discover and understand the API.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/higgs/src/routes/chat.rs`:
- Around line 456-459: The code passes Some(&[]) for an empty tools array which
still marks `tools` as defined in templates; change how `prompt_tools` is
derived so empty slices become None: replace `let prompt_tools =
req.tools.as_deref();` with logic that converts empty slices to None (e.g. `let
prompt_tools = req.tools.as_deref().and_then(|t| if t.is_empty() { None } else {
Some(t) });`) so that the call to
`engine.prepare_chat_prompt_with_thinking(&messages, prompt_tools,
thinking_enabled_stream)` receives None for absent/empty tools.



ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7a10d527-c5d2-481a-9a66-5870a5f99e83

📥 Commits

Reviewing files that changed from the base of the PR and between ae6f8b0 and 5aea6d0.

📒 Files selected for processing (3)

crates/higgs-engine/src/chat_template.rs
crates/higgs-engine/src/tool_parser.rs
crates/higgs/src/routes/chat.rs

…bind

Qwen's chat_template.jinja:107-108 rebinds the loop variable:

    {%- for tool_call in message.tool_calls %}
    {%- if tool_call.function is defined %}
    {%- set tool_call = tool_call.function %}
    {%- endif %}

After this rebind, line 120's `tool_call.arguments|items` walks into the ORIGINAL `function.arguments` JSON-encoded string, bypassing the top-level `arguments` we just normalised. minijinja raises "cannot convert value into pairs" and the request 500s.

Two-part fix:

1. `normalize_tool_call_for_template` now normalises BOTH the hoisted top-level `arguments` AND the still-nested `function.arguments`. Pulled the parse-and-coerce logic into a private helper `normalize_arguments_value` so it's applied identically at both sites (DRY).
2. The coercion now defends against every non-mapping shape (null, bool, number, array, string-that-fails-to-parse) by replacing the value with an empty `{}`. A warn is logged so pathological shapes stay visible. Previously only string-parsing was attempted and any other shape leaked through to the template.

New invariant test (`normalize_handles_qwen_rebind_to_function`) pins both `tool_call.arguments` AND `tool_call.function.arguments` to be mappings after normalize, so future edits can't regress the rebind path silently. Five additional coercion tests cover null / array / number / unparseable-string / array-as-string cases.

Verified by manual smoke: `nanobot agent -l -m "Use your tools to list files in /tmp"` previously 500'd with the template error; now returns a real directory listing streamed back through `StreamingToolCallTracker`.

Test count: chat_template 36 (was 31, +5 net for the new shapes), all green. Clippy `-Dwarnings` clean across higgs / higgs-engine / higgs-models.

Also: cargo fmt sweep on the new code that landed unformatted in the prior commit; pure layout, no behaviour change.
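
The coerce-to-mapping rule from this commit can be sketched as follows. Everything here is illustrative: `Json` is a minimal stand-in for a real JSON value type, the function names are hypothetical, the parser is injected by the caller, and the warn log mentioned above is omitted.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a JSON value.
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Number(f64),
    Str(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

/// Return a mapping no matter what shape `value` had.
pub fn normalize_arguments_sketch(value: Json, parse: fn(&str) -> Option<Json>) -> Json {
    // A string is given one chance to decode into something structured.
    let candidate = match value {
        Json::Str(raw) => parse(&raw).unwrap_or(Json::Null),
        other => other,
    };
    match candidate {
        Json::Object(_) => candidate,
        // null, bool, number, array, unparseable string, array-as-string
        _ => Json::Object(BTreeMap::new()),
    }
}

/// Apply the same coercion to the top-level `arguments` and to the nested
/// `function.arguments`, since the template may read either one.
pub fn normalize_both_sites(call: &mut BTreeMap<String, Json>, parse: fn(&str) -> Option<Json>) {
    if let Some(args) = call.remove("arguments") {
        call.insert("arguments".to_owned(), normalize_arguments_sketch(args, parse));
    }
    if let Some(Json::Object(function)) = call.get_mut("function") {
        if let Some(args) = function.remove("arguments") {
            function.insert("arguments".to_owned(), normalize_arguments_sketch(args, parse));
        }
    }
}
```

Routing both sites through one helper is what keeps the rebind path and the hoisted path from drifting apart.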

`req.tools.as_deref()` returns `Some(&[])` when the request carries an empty `tools` array. The chat-template renderer treats that as `tools` being *defined* in the Jinja context (just empty), versus `None` which omits the key entirely. Templates branching on `{% if tools is defined %}` vs `{% if tools %}` see different state. Convert empty slices to `None` before handing them to `prepare_chat_prompt_with_thinking`. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Addresses CRITICAL finding from the closed upstream PR panbanda#63 review: a model that emits `<tool_call>` and never produces a matching `</tool_call>` would grow `StreamingToolCallTracker::buffer` until OOM, a model-controlled DoS vector.

On overflow we abandon the parse, surface `<tool_call>` plus the buffered bytes as visible content (preserving the "never silently drop tokens" invariant), and reset state so a subsequent well-formed tool call in the same stream still parses.

New test `streaming_unbounded_buffer_capped_and_recovers` pushes MAX_INSIDE_TOOL_CALL_BYTES + 1 inside an unclosed tag followed by a valid call, and asserts both that the overflow is flushed verbatim and that the post-overflow call still parses (completed_count == 1).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
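
A rough sketch of the overflow-and-recover behaviour, under loud assumptions: `CappedTracker` is a hypothetical type, the cap value of 64 bytes is invented for the example (the source names `MAX_INSIDE_TOOL_CALL_BYTES` but does not give its value), and partial open tags split across chunks are not handled here.

```rust
const OPEN: &str = "<tool_call>";
const CLOSE: &str = "</tool_call>";
/// Illustrative cap only; the real constant's value is not stated in the PR text.
const MAX_INSIDE_BYTES: usize = 64;

#[derive(Default)]
pub struct CappedTracker {
    buffer: String,
    /// True while an open tag has been consumed and no close tag was seen yet.
    inside: bool,
    pub completed_count: usize,
}

impl CappedTracker {
    /// Feed one chunk and return the text that should be shown to the client.
    pub fn process(&mut self, chunk: &str) -> String {
        self.buffer.push_str(chunk);
        let mut visible = String::new();
        loop {
            if !self.inside {
                match self.buffer.find(OPEN) {
                    Some(start) => {
                        visible.push_str(&self.buffer[..start]);
                        self.buffer.drain(..start + OPEN.len());
                        self.inside = true;
                    }
                    None => {
                        visible.push_str(&self.buffer);
                        self.buffer.clear();
                        return visible;
                    }
                }
            } else if let Some(end) = self.buffer.find(CLOSE) {
                // A complete block: count it and keep scanning the remainder.
                self.completed_count += 1;
                self.buffer.drain(..end + CLOSE.len());
                self.inside = false;
            } else if self.buffer.len() > MAX_INSIDE_BYTES {
                // Overflow: give up on this block, surface it verbatim, and
                // reset so a later well-formed call can still be parsed.
                visible.push_str(OPEN);
                visible.push_str(&self.buffer);
                self.buffer.clear();
                self.inside = false;
                return visible;
            } else {
                // Still waiting for the close tag, and still under the cap.
                return visible;
            }
        }
    }
}
```

The cap bounds memory per open tag; nothing is dropped, because the abandoned block is replayed as ordinary visible text.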

funkymonkeymonk · 2026-06-01T22:53:42Z

This PR fixes the exact issue where streaming API drops tool definitions and warns 'tool-calls are unsupported'. Running Higgs 1.3.0 with Qwen3-Coder-Next and OpenCode — every streaming request gets tools stripped, forcing a text-based fallback that's unreliable for complex tool calls. The non-streaming path does parse the model's XML output but doesn't convert it to OpenAI's structured tool_calls format either.

All 6 CI checks are green, the implementation is surgical (3 files, +710 -17), and this unblocks local MLX models for use with agentic tools like OpenCode, Claude Code, and Aider. Would love to see this merged for the next release. 🙏

… coercion

Qwen3.5/3.6 models' chat_template.jinja instructs them to emit `<tool_call><function=NAME><parameter=KEY>value</parameter></function></tool_call>`, but the existing parser only decodes the older JSON-in-`<tool_call>` shape, so tool calls leaked into `content` with `finish_reason: stop` and the OpenAI streaming/non-streaming clients saw nothing.

Add a second parse path that dispatches on the block's first token (`<function=` → XML, else → JSON), reusing the existing `StreamingToolCallTracker` boundary detection unchanged. XML values are raw strings, so coerce them through a `ToolSchema` built from the request's `tools` array (integer/number/boolean/object/array → typed JSON, string-typed `"123"` stays a string), falling back to best-effort JSON parsing when no schema is declared.

- `parse_tool_calls` and `StreamingToolCallTracker::new` now take an `Option<&ToolSchema>` / `Option<ToolSchema>`; the streaming path captures it from `req.tools` before the `async_stream!` block, the non-streaming path threads it at the call site
- Backward-compatible: pre-existing 28 JSON tests still pass; one new test exercises a mixed JSON+XML stream

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
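
The first-token dispatch and the XML shape can be sketched like this. `BlockKind`, `classify_block`, and `parse_xml_call` are hypothetical names, the parser is deliberately lenient (it does not insist on a closing `</function>`), and trimming whitespace around values is an assumption of this sketch rather than documented behaviour.

```rust
#[derive(Debug, PartialEq)]
pub enum BlockKind {
    Xml,
    Json,
}

/// Decide which parse path a `<tool_call>` body should take.
pub fn classify_block(body: &str) -> BlockKind {
    if body.trim_start().starts_with("<function=") {
        BlockKind::Xml
    } else {
        BlockKind::Json
    }
}

/// Parse `<function=NAME><parameter=KEY>value</parameter>...</function>` into
/// the function name and its raw, still-untyped string parameters.
pub fn parse_xml_call(body: &str) -> Option<(String, Vec<(String, String)>)> {
    let rest = body.trim().strip_prefix("<function=")?;
    let (name, mut rest) = rest.split_once('>')?;
    let mut params = Vec::new();
    while let Some(start) = rest.find("<parameter=") {
        let after = &rest[start + "<parameter=".len()..];
        let (key, after) = after.split_once('>')?;
        let (value, after) = after.split_once("</parameter>")?;
        params.push((key.trim().to_owned(), value.trim().to_owned()));
        rest = after;
    }
    Some((name.trim().to_owned(), params))
}
```

Every value comes out as a string at this stage, which is why a schema-driven coercion step is needed afterwards.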

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/higgs-engine/src/tool_parser.rs (1)
340-344: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Document the public StreamingToolOutput fields.

This struct is new public API, but visible and new_tool_calls still rely on the struct-level summary for their contract. Add field-level rustdoc so the generated API docs stay explicit.
📚 Suggested doc shape
 pub struct StreamingToolOutput {
+    /// Text that should be forwarded to the client as a normal content delta.
     pub visible: String,
+    /// Tool calls that became complete while processing this chunk.
     pub new_tool_calls: Vec<ParsedToolCall>,
 }
As per coding guidelines, **/*.rs: Add doc comments on public structs/fields in Rust when changing user-facing behavior.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/higgs-engine/src/tool_parser.rs` around lines 340 - 344, Add
field-level rustdoc comments for the public struct StreamingToolOutput: document
the visible field to explain it contains the partial/accumulated human-visible
output text (what callers should render or append) and document new_tool_calls
to state it holds ParsedToolCall items produced since the last update (e.g.,
newly detected tool invocations for the caller to process). Update the struct
definition around StreamingToolOutput to include these doc comments for both
visible and new_tool_calls so the generated API docs explicitly describe their
contracts.

♻️ Duplicate comments (1)

crates/higgs/src/routes/chat.rs (1)
269-279: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Treat empty tool arrays as absent in the non-streaming path too.

The streaming path now normalizes [] to None, but the non-streaming path still treats Some(&[]) as tool-enabled. That means templates can still see tools as defined here, and has_tools can still drive parse_tool_calls / "tool_calls" responses even though the caller provided no usable tools.
💡 Suggested fix
-    let tools = req.tools.as_deref();
+    let tools = req.tools.as_deref().filter(|tools| !tools.is_empty());
@@
-    let has_tools = req.tools.is_some();
+    let has_tools = tools.is_some();
Also applies to: 318-320
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/higgs/src/routes/chat.rs` around lines 269 - 279, The non-streaming
path still passes Some(&[]) into template generation which treats empty tool
arrays as present; update the code around convert_messages / let tools =
req.tools.as_deref() and the analogous block near
prepare_chat_prompt_with_thinking (and the other occurrence around lines
318-320) to normalize empty slices to None (e.g., set tools =
req.tools.as_deref().filter(|s| !s.is_empty()) or equivalent) before calling
engine.prepare_chat_prompt_with_thinking so templates and has_tools logic see
absent tools when the caller provided an empty array.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/higgs-engine/src/chat_template.rs`:
- Around line 217-230: The public rustdoc for normalize_tool_call_for_template
is out of date: the implementation now normalizes nested function.arguments (via
normalize_arguments_value) and coerces any non-object or unparseable string into
an empty object {} rather than leaving failed parses untouched; update the
rustdoc text for normalize_tool_call_for_template (and any doc comments on
normalize_arguments_value if present) to state that both top-level arguments and
nested function.arguments are normalized and that non-object results are
replaced with {}.

In `@crates/higgs-engine/src/tool_parser.rs`:
- Around line 231-240: coerce_param_value currently treats ParamType::Integer
like Number (using parsed_if(Value::is_number)), allowing fractional inputs like
42.5; update the ParamType::Integer branch in coerce_param_value to only accept
true integers (e.g., use parsed_if with a predicate that returns true for
Value::is_i64 or Value::is_u64 or for Value::is_f64 only when the fractional
part is zero) so fractional numbers are rejected, and keep ParamType::Number
using Value::is_number; also add Rustdoc comments (/// ...) to the public fields
visible and new_tool_calls on the StreamingToolOutput struct to document their
purpose.



ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a8dba805-542c-4233-9792-d4f16d930066

📥 Commits

Reviewing files that changed from the base of the PR and between 5aea6d0 and ab8e8fe.

📒 Files selected for processing (3)

crates/higgs-engine/src/chat_template.rs
crates/higgs-engine/src/tool_parser.rs
crates/higgs/src/routes/chat.rs

dusterbloom · 2026-06-02T12:47:24Z

📋 Motivation & root-cause writeup (why Qwen tool calls were silently dropped): #177

- coerce_param_value: split `integer` from `number`. `is_number` accepts floats, so an `integer` param wrongly coerced `3.14`. `integer` now only accepts i64/u64 values, falling back to the raw string otherwise.
- chat (non-streaming): treat an empty `tools: []` as absent, mirroring the streaming path. It no longer defines `tools` in the template context or triggers tool parsing.
- docs: refresh `normalize_tool_call_for_template` (non-object arguments are coerced to `{}`, not left untouched; dropped a stray leading line) and document the public `StreamingToolOutput` fields.

Adds a test pinning the integer/fractional coercion contract.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
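
The integer/number split can be illustrated with a small stand-alone sketch. `ParamType`, `Coerced`, and `coerce_sketch` are hypothetical stand-ins for the PR's types; the sketch covers only scalar types, accepts `i64` but not `u64`, and rejects non-finite floats because JSON cannot represent them.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ParamType {
    Integer,
    Number,
    Boolean,
    String,
}

#[derive(Debug, PartialEq)]
pub enum Coerced {
    Int(i64),
    Float(f64),
    Bool(bool),
    Text(String),
}

/// Coerce a raw XML parameter string according to its declared schema type,
/// falling back to the raw string when the text does not fit that type.
pub fn coerce_sketch(raw: &str, ty: ParamType) -> Coerced {
    let trimmed = raw.trim();
    let typed = match ty {
        // Integers must parse as integers: "3.14" is rejected here.
        ParamType::Integer => trimmed.parse::<i64>().ok().map(Coerced::Int),
        ParamType::Number => trimmed
            .parse::<f64>()
            .ok()
            .filter(|n| n.is_finite())
            .map(Coerced::Float),
        ParamType::Boolean => trimmed.parse::<bool>().ok().map(Coerced::Bool),
        // A string-typed "123" stays a string.
        ParamType::String => None,
    };
    typed.unwrap_or_else(|| Coerced::Text(raw.to_owned()))
}
```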

Qwen3.6 reasons first: in thinking mode the chat template opens `<think>`, so generation starts inside the think block and the tool call is emitted AFTER `</think>`. The chat route prepends `<think>`, splits reasoning via `parse_reasoning`, then runs `parse_tool_calls` on the remainder. A parser that scanned the whole output (or only the reasoning) would silently drop the call. That is the most common thinking+tools failure mode, and the one dimension the existing suite did not cover.

Adds `xml_tool_call_after_think_block_is_extracted`, replicating that composition with the Qwen3.6 XML tool-call format. The 35+ existing parser/streaming tests already cover XML parse + schema coercion, streaming chunk-splits, and unbounded-buffer recovery.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
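
The composition described in this commit (split off the reasoning, then look for tool calls only in what follows) might look roughly like this. `split_reasoning` is a hypothetical stand-in for the route's `parse_reasoning`, and treating output without a `</think>` as entirely visible is an assumption of the sketch.

```rust
/// Split model output into (reasoning, remainder) at the first `</think>`.
/// Tool-call parsing is then meant to run on the remainder only.
pub fn split_reasoning(output: &str) -> (&str, &str) {
    match output.split_once("</think>") {
        Some((reasoning, rest)) => {
            let reasoning = reasoning.trim();
            // Drop the opening tag if it is present.
            let reasoning = reasoning.strip_prefix("<think>").unwrap_or(reasoning).trim();
            (reasoning, rest.trim_start())
        }
        // Assumption: with no closing tag, nothing is treated as reasoning.
        None => ("", output),
    }
}
```

With this split, a `<tool_call>` mentioned inside the reasoning is never handed to the tool parser, while the real call after `</think>` is.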

coderabbitai Bot reviewed May 21, 2026

Comment thread crates/higgs/src/routes/chat.rs Outdated

dusterbloom and others added 3 commits May 21, 2026 17:08

coderabbitai Bot requested changes Jun 2, 2026

Comment thread crates/higgs-engine/src/chat_template.rs

Comment thread crates/higgs-engine/src/tool_parser.rs

dusterbloom and others added 2 commits June 2, 2026 15:37

coderabbitai Bot approved these changes Jun 9, 2026

dusterbloom wants to merge 7 commits into panbanda:main from dusterbloom:dusterbloom/streaming-tools-port

2 participants
