fix(usage): inflated input_tokens in Claude stream parsing and Zhipu quota issues #259

@thedavidweng

Description

Problem

1. Inflated input_tokens in Claude stream parsing

Some Anthropic-compatible SSE providers (e.g. Qwen, MiniMax) report the full context (fresh + cached) as input_tokens in message_start, double-counting the cached portion that is also reported in cache_read_input_tokens. This inflates the cacheable-input denominator and pushes the displayed cache hit rate artificially low.
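
To make the effect concrete, here is a minimal sketch of the arithmetic. It assumes the hit rate is computed as cache_read_input_tokens / (input_tokens + cache_read_input_tokens); that formula and the helper name `cache_hit_rate` are illustrative assumptions, not taken from the code base.

```rust
/// Cache hit rate under the ASSUMED formula
/// cache_read / (input + cache_read). Hypothetical helper, not from parser.rs.
/// Returns 0.0 when there is no cacheable input at all.
pub fn cache_hit_rate(input_tokens: u64, cache_read_input_tokens: u64) -> f64 {
    let denominator = input_tokens + cache_read_input_tokens;
    if denominator == 0 {
        return 0.0;
    }
    cache_read_input_tokens as f64 / denominator as f64
}
```

Under that assumption, a request with 200 fresh and 800 cached tokens yields 800 / 1000 = 80% when input_tokens is reported as 200, but only 800 / 1800 (about 44%) when the provider reports the full 1000 as input_tokens.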

In src-tauri/src/proxy/usage/parser.rs, the from_claude_stream_events function (line ~288) sets input_tokens unconditionally from message_start. The message_delta handler (line ~336) only reads input_tokens when the existing value is exactly zero (if usage.input_tokens == 0), so the delta value is silently discarded when message_start already provided a (larger) value.

Fix: When message_delta carries a smaller positive input_tokens, prefer it over the message_start value and adopt the cache counts from the same usage block.
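
The proposed rule could look roughly like the sketch below. The `Usage` and `DeltaUsage` structs and the `merge_delta_usage` function are hypothetical stand-ins for whatever from_claude_stream_events actually uses; falling back to the message_start cache counts when the delta omits them is also an assumption.

```rust
/// Hypothetical, simplified usage record accumulated from `message_start`.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct Usage {
    pub input_tokens: u64,
    pub cache_read_input_tokens: u64,
    pub cache_creation_input_tokens: u64,
}

/// Hypothetical usage block carried by `message_delta`; every field may be absent.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct DeltaUsage {
    pub input_tokens: Option<u64>,
    pub cache_read_input_tokens: Option<u64>,
    pub cache_creation_input_tokens: Option<u64>,
}

/// Merge the `message_delta` usage into the `message_start` usage.
pub fn merge_delta_usage(start: Usage, delta: &DeltaUsage) -> Usage {
    // Ignore a missing or zero delta value: it carries no information.
    let delta_input = match delta.input_tokens {
        Some(n) if n > 0 => n,
        _ => return start,
    };
    // Keep the existing behaviour (adopt delta when start is zero) and
    // additionally prefer a strictly smaller positive delta value.
    let start_missing = start.input_tokens == 0;
    if !start_missing && delta_input >= start.input_tokens {
        return start;
    }
    Usage {
        input_tokens: delta_input,
        // Take cache counts from the same delta block when present
        // (assumption: otherwise keep the message_start values).
        cache_read_input_tokens: delta
            .cache_read_input_tokens
            .unwrap_or(start.cache_read_input_tokens),
        cache_creation_input_tokens: delta
            .cache_creation_input_tokens
            .unwrap_or(start.cache_creation_input_tokens),
    }
}
```

A delta value that is equal to or larger than the start value leaves the accumulated usage untouched, so providers that already report fresh-only input_tokens in message_start should be unaffected.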

2. Zhipu quota query hardcoded to international endpoint

query_zhipu() in src-tauri/src/services/coding_plan.rs (line ~184) is hardcoded to https://api.z.ai. Users who configured the mainland China preset (open.bigmodel.cn) may be unable to retrieve usage because the international endpoint can be unreachable from their network.

The detect_provider() function correctly distinguishes ZhipuCn from ZhipuEn, but the dispatch (line ~445) collapses both variants into the same query_zhipu(api_key) call that ignores the base URL.

Fix: Accept base_url in query_zhipu and route to the correct endpoint based on the configured provider URL.
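
One possible shape for the routing, as a sketch only: the `ZhipuRegion` enum and both helper names are hypothetical, and only the two hosts named in this issue are used. The quota path that query_zhipu appends is intentionally left out.

```rust
/// Hypothetical region tag mirroring the ZhipuCn / ZhipuEn distinction.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ZhipuRegion {
    Cn,
    En,
}

/// Classify the configured provider URL by host name.
/// Anything that is not the mainland host falls back to the international one.
pub fn zhipu_region(base_url: &str) -> ZhipuRegion {
    if base_url.to_ascii_lowercase().contains("open.bigmodel.cn") {
        ZhipuRegion::Cn
    } else {
        ZhipuRegion::En
    }
}

/// Origin to use for the quota query, chosen from the configured base URL.
pub fn zhipu_quota_origin(base_url: &str) -> &'static str {
    match zhipu_region(base_url) {
        ZhipuRegion::Cn => "https://open.bigmodel.cn",
        ZhipuRegion::En => "https://api.z.ai",
    }
}
```

Defaulting to the international origin keeps the current behaviour for every configuration that does not point at open.bigmodel.cn.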

3. Zhipu tier sorting when nextResetTime is absent

When the 5-hour bucket is at 0% utilization, the Zhipu API omits nextResetTime. The tiers are not sorted after construction, and a missing nextResetTime is not mapped to the five-hour bucket, so the weekly bucket can incorrectly claim the five-hour slot in the tray and in the usage quota display.

Location: src-tauri/src/services/coding_plan.rs, query_zhipu function (lines ~240-284).
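
A sketch of one way to get a stable ordering, assuming the bucket that resets soonest is the five-hour one and that a missing nextResetTime belongs to it. `RawLimit` and `order_limits` are hypothetical names, and the reset time is assumed to be an integer timestamp.

```rust
/// Hypothetical, simplified view of one limit entry returned by the quota API.
#[derive(Debug, Clone, PartialEq)]
pub struct RawLimit {
    pub percentage: f64,
    /// Assumed to be an integer timestamp; `None` when the API omits it.
    pub next_reset_time: Option<i64>,
}

/// Order limits so that index 0 is the five-hour bucket.
/// `Option` orders `None` before any `Some(_)`, so an entry without a
/// reset time sorts first; the remaining entries sort by ascending reset time.
pub fn order_limits(mut limits: Vec<RawLimit>) -> Vec<RawLimit> {
    // `sort_by_key` is stable, so entries with equal keys keep their order.
    limits.sort_by_key(|limit| limit.next_reset_time);
    limits
}
```

With this ordering the weekly bucket can no longer land in the first slot merely because the five-hour entry arrived without a reset time.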

Expected Behavior

  • Claude stream parsing should prefer the delta input_tokens when it is smaller than the start value, avoiding double-counting
  • Zhipu quota queries should route to the endpoint matching the user-configured base_url
  • Zhipu tiers should be sorted correctly, with missing nextResetTime mapping to the five-hour bucket