Problem
1. Inflated input_tokens in Claude stream parsing
Some Anthropic-compatible SSE providers (e.g. Qwen, MiniMax) report the full context (fresh + cached) as input_tokens in message_start, double-counting the cached portion that is also reported in cache_read_input_tokens. This inflates the cacheable-input denominator and pushes the displayed cache hit rate artificially low.
In src-tauri/src/proxy/usage/parser.rs, the from_claude_stream_events function (line ~288) sets input_tokens unconditionally from message_start. The message_delta handler (line ~336) only reads input_tokens when the existing value is exactly zero (if usage.input_tokens == 0), so the delta value is silently discarded when message_start already provided a (larger) value.
Fix: When message_delta carries a smaller positive input_tokens, prefer it over the message_start value and adopt the cache counts from the same usage block.
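The proposed merge rule could be sketched roughly as follows. This is a minimal illustration, not the actual parser code: the `Usage` and `DeltaUsage` structs and the `apply_delta_usage` helper are hypothetical stand-ins, and the real `from_claude_stream_events` works on parsed SSE JSON rather than typed structs.

```rust
/// Hypothetical stand-in for the usage counters accumulated by the parser.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Usage {
    pub input_tokens: u64,
    pub cache_read_input_tokens: u64,
    pub cache_creation_input_tokens: u64,
}

/// Hypothetical view of the `usage` block carried by a `message_delta` event.
/// Every field is optional because providers may omit any of them.
#[derive(Debug, Default, Clone)]
pub struct DeltaUsage {
    pub input_tokens: Option<u64>,
    pub cache_read_input_tokens: Option<u64>,
    pub cache_creation_input_tokens: Option<u64>,
}

/// Merge a `message_delta` usage block into the counters set by `message_start`.
///
/// The delta `input_tokens` is adopted when it is positive and either no value
/// was recorded yet (the existing behavior) or it is smaller than the recorded
/// value (the proposed fix). Cache counts are taken from the same block so all
/// three numbers describe the same accounting.
pub fn apply_delta_usage(usage: &mut Usage, delta: &DeltaUsage) {
    let delta_input = match delta.input_tokens {
        Some(v) if v > 0 => v,
        _ => return, // nothing usable in the delta
    };

    let adopt = usage.input_tokens == 0 || delta_input < usage.input_tokens;
    if !adopt {
        return; // keep the message_start values
    }

    usage.input_tokens = delta_input;
    if let Some(v) = delta.cache_read_input_tokens {
        usage.cache_read_input_tokens = v;
    }
    if let Some(v) = delta.cache_creation_input_tokens {
        usage.cache_creation_input_tokens = v;
    }
}
```

With a `message_start` value of 1000 and a `message_delta` reporting `input_tokens: 200` and `cache_read_input_tokens: 800`, this sketch ends with 200 fresh and 800 cached tokens, so the cached portion is counted once. A delta that reports a larger or zero `input_tokens` leaves the start values untouched.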
2. Zhipu quota query hardcoded to international endpoint
query_zhipu() in src-tauri/src/services/coding_plan.rs (line ~184) is hardcoded to https://api.z.ai. Users who configured the mainland China preset (open.bigmodel.cn) cannot retrieve usage because the international endpoint may be unreachable from their network.
The detect_provider() function correctly distinguishes ZhipuCn from ZhipuEn, but the dispatch (line ~445) collapses both variants into the same query_zhipu(api_key) call that ignores the base URL.
Fix: Accept base_url in query_zhipu and route to the correct endpoint based on the configured provider URL.
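One way the routing might look is sketched below. The `ZhipuRegion` enum and both helper functions are hypothetical names for illustration; the real code already has `ZhipuCn`/`ZhipuEn` variants from `detect_provider()`. Only the two hosts named in this issue are used, and the quota path itself is deliberately left out because it is not stated here.

```rust
/// Hypothetical region marker, mirroring the ZhipuCn / ZhipuEn distinction.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ZhipuRegion {
    Cn,
    En,
}

/// Guess the region from the user-configured base URL.
/// Assumption: the mainland preset is recognizable by the `open.bigmodel.cn` host.
pub fn zhipu_region(base_url: &str) -> ZhipuRegion {
    if base_url.to_ascii_lowercase().contains("open.bigmodel.cn") {
        ZhipuRegion::Cn
    } else {
        ZhipuRegion::En
    }
}

/// Origin that the quota query should target for a given region.
pub fn zhipu_origin(region: ZhipuRegion) -> &'static str {
    match region {
        ZhipuRegion::Cn => "https://open.bigmodel.cn",
        ZhipuRegion::En => "https://api.z.ai",
    }
}
```

Under this sketch, `query_zhipu(api_key, base_url)` would build its request URL from `zhipu_origin(zhipu_region(base_url))` instead of a hardcoded constant. Falling back to the international origin for unrecognized URLs keeps the current behavior for everyone not on the mainland preset.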
3. Zhipu tier sorting when nextResetTime is absent
When the 5-hour bucket is at 0% utilization, the Zhipu API omits nextResetTime. The tiers are not sorted after construction, and a missing nextResetTime is not mapped to the correct time bucket, so the weekly bucket can incorrectly claim the five-hour slot in the tray and in the usage quota display.
Location: src-tauri/src/services/coding_plan.rs — query_zhipu function (lines ~240-284).
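The intended bucket mapping and ordering might be sketched as below. All types here (`RawLimit`, `Tier`, `Bucket`) are hypothetical, and so is the classification rule for limits that do carry a reset time: this sketch assumes the time until reset has already been derived from nextResetTime and treats anything within five hours as the five-hour bucket. The real `query_zhipu` may distinguish buckets by other fields.

```rust
/// Hypothetical bucket kinds; the derive order makes FiveHour sort before Weekly.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Bucket {
    FiveHour,
    Weekly,
}

/// Hypothetical raw limit entry. `reset_in_secs` is `None` when the API
/// omitted nextResetTime (observed when the 5-hour bucket is at 0%).
#[derive(Debug, Clone)]
pub struct RawLimit {
    pub utilization: f64,
    pub reset_in_secs: Option<u64>,
}

/// Hypothetical tier as shown in the tray and usage quota display.
#[derive(Debug, Clone, PartialEq)]
pub struct Tier {
    pub bucket: Bucket,
    pub utilization: f64,
}

const FIVE_HOURS_SECS: u64 = 5 * 60 * 60;

/// Map a raw limit to a bucket; a missing reset time means the five-hour bucket.
pub fn classify(raw: &RawLimit) -> Bucket {
    match raw.reset_in_secs {
        None => Bucket::FiveHour,
        Some(secs) if secs <= FIVE_HOURS_SECS => Bucket::FiveHour,
        Some(_) => Bucket::Weekly,
    }
}

/// Build tiers and sort them so the five-hour bucket always comes first,
/// regardless of the order in which the API returned the limits.
pub fn build_tiers(raw: &[RawLimit]) -> Vec<Tier> {
    let mut tiers: Vec<Tier> = raw
        .iter()
        .map(|r| Tier { bucket: classify(r), utilization: r.utilization })
        .collect();
    tiers.sort_by_key(|t| t.bucket); // stable sort keeps API order within a bucket
    tiers
}
```

If the API returns the weekly limit first and the idle five-hour limit second with no reset time, `build_tiers` still places the five-hour tier in the first slot. Note that the duration-based rule would misclassify a weekly limit that is less than five hours from its reset, so a real fix would probably prefer an explicit bucket identifier if the response provides one.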
Expected Behavior
- Claude stream parsing should prefer the delta input_tokens when it is smaller than the start value, avoiding double-counting
- Zhipu quota queries should route to the endpoint matching the user-configured base_url
- Zhipu tiers should be sorted correctly, with a missing nextResetTime mapping to the five-hour bucket