fix: preserve reasoning_tokens and cached_tokens in usage conversion#174
Conversation
The strict mode adapter was hardcoding `reasoning_tokens: 0` and `cached_tokens: 0` in every chat completions → Responses API usage conversion, silently dropping the reasoning token counts reported by thinking models. The tool-loop accumulator also discarded `completion_tokens_details` and `prompt_tokens_details` entirely when summing usage across iterations.

Adds a `chat_usage_to_response_usage()` helper that extracts the real values from the raw JSON detail fields, and uses it in all five call sites (non-streaming adapter, streaming state, accumulation, unhandled tools path, final response path).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pull request overview
Fixes loss of token-usage detail when converting Chat Completions responses into Responses API responses in strict mode, ensuring `reasoning_tokens` (thinking models) and `cached_tokens` (prompt caching) are preserved end-to-end.
Changes:
- Added a shared `chat_usage_to_response_usage` helper to extract `cached_tokens`/`reasoning_tokens` from chat usage detail JSON into a typed `ResponseUsage`.
- Replaced hardcoded-zero usage conversions in the adapter, streaming, and handler paths with the shared helper.
- Updated tool-loop usage accumulation to preserve the latest `*_tokens_details` JSON instead of dropping it entirely.
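For illustration, such a helper could look roughly like this — a minimal sketch that uses a plain `BTreeMap` as a stand-in for the raw `serde_json::Value` detail blobs; the `ResponseUsage` field names and the function signature here are assumptions, not the crate's actual types:

```rust
use std::collections::BTreeMap;

/// Stand-in for a raw `*_tokens_details` JSON object
/// (`serde_json::Value` in the real code).
type Details = BTreeMap<String, u64>;

/// Hypothetical mirror of the typed Responses API usage struct;
/// field names are illustrative.
#[derive(Debug, Default, PartialEq)]
struct ResponseUsage {
    input_tokens: u32,
    output_tokens: u32,
    cached_tokens: u32,
    reasoning_tokens: u32,
}

/// Read one count out of an optional detail blob: a missing blob or
/// field becomes 0, and oversized values clamp to `u32::MAX` instead
/// of wrapping.
fn detail_count(details: Option<&Details>, key: &str) -> u32 {
    details
        .and_then(|d| d.get(key))
        .map(|&n| u32::try_from(n).unwrap_or(u32::MAX))
        .unwrap_or(0)
}

/// Convert chat-completions usage into the typed Responses shape,
/// carrying the real cached/reasoning counts through instead of zeros.
fn chat_usage_to_response_usage(
    prompt_tokens: u32,
    completion_tokens: u32,
    prompt_details: Option<&Details>,
    completion_details: Option<&Details>,
) -> ResponseUsage {
    ResponseUsage {
        input_tokens: prompt_tokens,
        output_tokens: completion_tokens,
        cached_tokens: detail_count(prompt_details, "cached_tokens"),
        reasoning_tokens: detail_count(completion_details, "reasoning_tokens"),
    }
}
```

The key point is that absence and malformed values both degrade to 0 rather than poisoning the conversion.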
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `src/strict/mod.rs` | Introduces shared chat→responses usage conversion helper that extracts `cached_tokens` and `reasoning_tokens`. |
| `src/strict/adapter.rs` | Uses shared helper when mapping `ChatCompletionResponse.usage` into `ResponsesResponse.usage`. |
| `src/strict/handlers.rs` | Uses shared helper for aggregate usage; preserves `*_tokens_details` JSON during accumulation. |
| `src/strict/streaming.rs` | Uses shared helper to capture usage from final streaming chunk. |
- Clamp `cached_tokens` / `reasoning_tokens` to `u32::MAX` when parsing from untrusted JSON so oversized values don't silently wrap.
- Sum `prompt_tokens_details` / `completion_tokens_details` across tool-loop iterations via a new `merge_usage_details()` helper, so the aggregate detail counts stay consistent with the summed totals instead of keeping only one iteration's values.
- Add unit tests for `chat_usage_to_response_usage` (extraction, missing details, u32 clamping) and `merge_usage_details` (summing, key union, None handling).
- Add an adapter test verifying `to_responses_response` extracts cached/reasoning tokens from the chat response's detail JSON.
- Add a streaming test verifying a chunk with populated usage details flows through `StreamingState` to `ResponseUsage`.
- Extend `test_token_accumulation_across_iterations` to exercise the new summing behavior for detail counts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
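A sketch of what that key-union summing could look like, using a plain `BTreeMap` as a stand-in for the `serde_json` detail objects — the name `merge_usage_details` comes from the commit message, but this signature is an assumption:

```rust
use std::collections::BTreeMap;

/// Stand-in for a `*_tokens_details` JSON object
/// (`serde_json::Value` in the real code).
type Details = BTreeMap<String, u64>;

/// Sum two optional detail maps key-by-key: the result covers the
/// union of keys, and counts add with saturation so oversized values
/// don't wrap.
fn merge_usage_details(a: Option<Details>, b: Option<Details>) -> Option<Details> {
    match (a, b) {
        (None, None) => None,
        // Only one side has details: keep it as-is.
        (Some(m), None) | (None, Some(m)) => Some(m),
        // Both sides present: fold the second into the first.
        (Some(mut acc), Some(other)) => {
            for (key, count) in other {
                let slot = acc.entry(key).or_insert(0);
                *slot = slot.saturating_add(count);
            }
            Some(acc)
        }
    }
}
```

Summing per key (rather than keeping one iteration's blob) is what keeps the detail counts consistent with the already-summed `prompt_tokens`/`completion_tokens` totals.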
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
…g-tokens-in-usage # Conflicts: # src/strict/handlers.rs
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.
## 🤖 New release

* `onwards`: 0.24.1 -> 0.24.2 (✓ API compatible changes)

<details><summary><i><b>Changelog</b></i></summary><p>
<blockquote>

## [0.24.2](v0.24.1...v0.24.2) - 2026-04-15

### Fixed

- preserve reasoning_tokens and cached_tokens in usage conversion ([#174](#174))

</blockquote>
</p></details>

---

This PR was generated with [release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Problem
When the Responses API adapter converts a Chat Completions response to a Responses API response, it constructs a `ResponseUsage` from the upstream `Usage`. The previous code hardcoded both detail fields to zero.

This silently dropped the `reasoning_tokens` count reported by thinking models (DeepSeek-R1, Qwen3, etc.) and the `cached_tokens` count reported by providers that support prompt caching, even when the upstream had correctly populated `completion_tokens_details.reasoning_tokens` and `prompt_tokens_details.cached_tokens`.

The bug was duplicated across five call sites:

- `adapter.rs` — non-streaming `to_responses_response_with_usage` default usage
- `streaming.rs` — `StreamingState` usage capture
- `handlers.rs` × 3 — usage accumulation across tool-loop iterations, the unhandled-tools response path, and the final response path

The accumulation site was additionally dropping the entire `completion_tokens_details` and `prompt_tokens_details` JSON blobs (= `None`) when summing across iterations, so even the chat-completions passthrough for tool-loop responses lost the detail.

Fix
Adds a single shared helper in `src/strict/mod.rs`.

It extracts `reasoning_tokens` from `completion_tokens_details.reasoning_tokens` and `cached_tokens` from `prompt_tokens_details.cached_tokens` (both stored as raw `serde_json::Value` on the chat completions side), and threads them through into the typed Responses API fields. All five call sites now use this helper.

The accumulation site in `handlers.rs` now preserves the latest detail JSON across iterations using `.or()` instead of dropping it.

Test plan
- `cargo build` — clean, no warnings
- `cargo test` — all 317 tests pass
- `cargo clippy` — no new warnings introduced

🤖 Generated with Claude Code
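A footnote on the `.or()`-based preservation described in the fix: `Option::or` returns `self` when it is `Some`, so keeping the *latest* detail blob means calling it on the new value, not on the accumulator. A hypothetical sketch (names are illustrative, not the actual `handlers.rs` code):

```rust
/// Prefer the latest iteration's detail blob, falling back to the
/// previously accumulated one when the latest chunk has no details.
fn keep_latest<T>(acc: Option<T>, latest: Option<T>) -> Option<T> {
    // `latest.or(acc)` returns `latest` when it is `Some`;
    // `acc.or(latest)` would keep the *first* blob seen instead.
    latest.or(acc)
}
```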