Clarify token usage billing labels by lewtun · Pull Request #299 · huggingface/ml-intern

lewtun · 2026-06-07T20:07:58Z

No description provided.

Co-authored-by: OpenAI Codex <codex@openai.com>

github-actions · 2026-06-07T20:08:14Z

Claude finished @lewtun's task in 1m 55s

Review: Clarify token usage billing labels

No blocking issues — 1 P1.

The split from a single Tokens row into Input tokens / Output tokens is a clarity improvement. One accuracy concern with how "Input tokens" is sourced.

P1 — `Input tokens` undercounts when prompt caching is active

frontend/src/components/UsageMeter.tsx:117-120 sources the new row from telemetry?.prompt_tokens alone:

<UsageRow label="Input tokens" value={formatCount(telemetry?.prompt_tokens)} />

But in this codebase prompt_tokens holds only the non-cached input. Cache reads and cache-creation tokens are tracked as separate, disjoint buckets — both the backend and the frontend store reconstruct the total by summing all four:

backend/usage.py:202-207 — total_tokens = total or (prompt + completion + cache_read + cache_creation)
frontend/src/store/usageStore.ts:115 — intValue(data.total_tokens) || prompt + completion + cacheRead + cacheCreation
agent/core/telemetry.py:49-69 — emits prompt_tokens and cache_read_tokens/cache_creation_tokens as independent keys (for Anthropic via LiteLLM, prompt_tokens is the uncached input_tokens; cache hits land in the cache keys).

This agent leans on prompt caching, so on multi-turn loops cache_read_tokens is frequently the majority of real input. The new row will therefore display a small fraction of the input the user actually consumed/was billed for — arguably more misleading than the old Tokens row, which used total_tokens and at least included the cached portion.

Consider summing the input components so the label matches its meaning:

value={formatCount(
  (telemetry?.prompt_tokens ?? 0) +
  (telemetry?.cache_read_tokens ?? 0) +
  (telemetry?.cache_creation_tokens ?? 0)
)}

Output tokens = completion_tokens is correct as-is.
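To make the size of the gap concrete, here is a minimal sketch of the disjoint-bucket accounting described above. The `UsageTelemetry` interface, the helper names, and the sample numbers are assumptions for illustration; only the field names come from the telemetry keys quoted in this review, and the real types in the repository may differ.

```typescript
// Hypothetical telemetry shape; field names follow the keys quoted in the review.
interface UsageTelemetry {
  prompt_tokens: number;         // uncached input only
  completion_tokens: number;     // output
  cache_read_tokens: number;     // input served from the prompt cache
  cache_creation_tokens: number; // input written to the prompt cache
  total_tokens: number;          // 0 when no total was reported
}

// Same idea as the reconstruction cited above: prefer a reported total,
// otherwise sum the four disjoint buckets.
function reconstructTotal(t: UsageTelemetry): number {
  return (
    t.total_tokens ||
    t.prompt_tokens + t.completion_tokens + t.cache_read_tokens + t.cache_creation_tokens
  );
}

// Full input is the sum of the three input-side buckets.
function fullInputTokens(t: UsageTelemetry): number {
  return t.prompt_tokens + t.cache_read_tokens + t.cache_creation_tokens;
}

// Illustrative numbers only, not measured from the agent.
const sample: UsageTelemetry = {
  prompt_tokens: 1200,
  completion_tokens: 800,
  cache_read_tokens: 18000,
  cache_creation_tokens: 2000,
  total_tokens: 0,
};

const shownByNewRow = sample.prompt_tokens; // what the flagged row would display
const actualInput = fullInputTokens(sample); // what the label implies
const total = reconstructTotal(sample);

console.log({ shownByNewRow, actualInput, total });
```

With these sample values the row would display 1,200 while the full input is 21,200, which is the kind of cache-heavy turn the P1 is concerned about.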

Notes (non-blocking)

P2 — No test covers the label/value change. The component has no existing test, so this is consistent with the current pattern, not a regression.
The diff vs. origin/main looks larger than +6/-2 because this branch forked before #298 ("Remove inference usage link from usage popover"), which removed the "Inference usage" link. A three-way merge keeps that link removed (this PR doesn't touch those lines), so nothing is re-introduced; flagging only for awareness in case you rebase.

Co-authored-by: OpenAI Codex <codex@openai.com>

github-actions · 2026-06-07T20:11:34Z

Claude finished @lewtun's task in 43s

Re-review: Clarify token usage billing labels

No blocking issues.

This is a re-review after the synchronize push (merge f33a900). The earlier P1 is resolved.

Resolved — prior P1 (cache undercount) no longer applies

The original review flagged that Input tokens sourced telemetry?.prompt_tokens alone, excluding cache reads/creation. After merging main, the row no longer does that — the diff vs origin/main is now just the label rename, and the value comes from contextTokenCount(telemetry) (frontend/src/components/UsageMeter.tsx:130-132), which already accounts for the full input:

function contextTokenCount(telemetry) {
  if (telemetry.total_tokens > 0) {
    return Math.max(0, telemetry.total_tokens - telemetry.completion_tokens);
  }
  return telemetry.prompt_tokens + telemetry.cache_read_tokens + telemetry.cache_creation_tokens;
}

Both branches yield the full input bucket: total_tokens - completion_tokens equals prompt + cache_read + cache_creation given usageStore's total reconstruction (frontend/src/store/usageStore.ts:115), and the fallback sums the same three. So Input tokens now reflects cached + uncached input, matching the label. Output tokens = completion_tokens (UsageMeter.tsx:135) is correct.
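The equivalence of the two branches can be checked with a small self-contained sketch. The `Telemetry` type and the sample values are assumptions for illustration; the function body follows the snippet quoted above. The last case is a hypothetical one where a reported total does not equal the four-bucket sum, in which case the first branch follows the reported total rather than the bucket sum.

```typescript
// Hypothetical type; the real telemetry type in the repository may differ.
type Telemetry = {
  prompt_tokens: number;
  completion_tokens: number;
  cache_read_tokens: number;
  cache_creation_tokens: number;
  total_tokens: number;
};

// Follows the logic of the quoted contextTokenCount.
function contextTokenCount(t: Telemetry): number {
  if (t.total_tokens > 0) {
    // Everything that is not output counts as input.
    return Math.max(0, t.total_tokens - t.completion_tokens);
  }
  // No total available: sum the three input-side buckets directly.
  return t.prompt_tokens + t.cache_read_tokens + t.cache_creation_tokens;
}

// Illustrative bucket values only.
const buckets = {
  prompt_tokens: 500,
  completion_tokens: 300,
  cache_read_tokens: 9000,
  cache_creation_tokens: 200,
};

// Case 1: no total, so the fallback branch runs.
const viaFallback = contextTokenCount({ ...buckets, total_tokens: 0 });

// Case 2: total reconstructed as the sum of all four buckets.
const reconstructed =
  buckets.prompt_tokens +
  buckets.completion_tokens +
  buckets.cache_read_tokens +
  buckets.cache_creation_tokens;
const viaTotal = contextTokenCount({ ...buckets, total_tokens: reconstructed });

// Case 3 (hypothetical): a reported total that differs from the bucket sum.
const viaReportedTotal = contextTokenCount({ ...buckets, total_tokens: 12000 });

console.log({ viaFallback, viaTotal, viaReportedTotal });
```

Cases 1 and 2 agree because subtracting `completion_tokens` from a four-bucket sum leaves exactly the three input buckets.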

The net change is a pure label rename (Context tokens → Input tokens), which reads more naturally as a billing line and is accurate.

Note (non-blocking)

P2 — Still no test on the component, consistent with the existing pattern (pre-existing, not a regression).

Clarify token usage billing labels

c2e2608

Co-authored-by: OpenAI Codex <codex@openai.com>

Merge main into input token labels PR

f33a900

Co-authored-by: OpenAI Codex <codex@openai.com>

lewtun merged commit 6283446 into main Jun 7, 2026
3 checks passed

lewtun deleted the codex/input-token-labels branch June 7, 2026 20:15

lewtun merged 2 commits into main from codex/input-token-labels