feat: track per-generation token usage in telemetry #325

suryaiyer95 wants to merge 2 commits into main from
Conversation
…he/reasoning)

- Emit `generation` telemetry event on every LLM step-finish with model_id, provider_id, agent, finish_reason, cost, duration_ms, and token breakdown
- Token fields are flat (no nested objects) to comply with Azure App Insights custom measurements schema: `tokens_input`, `tokens_output`, and optionally `tokens_reasoning`, `tokens_cache_read`, `tokens_cache_write`
- Optional token fields are only included when the provider actually returns them — reasoning only for reasoning models, cache_read/write only when prompt caching is active — never defaulted to 0
- Step duration tracked from `start-step` to `finish-step` events
- Adds `altimate_change` markers in `processor.ts` (upstream file)
- Updates telemetry.md docs with accurate generation event description

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
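The "only included when the provider actually returns them" rule above can be sketched with conditional object spreads. This is an illustrative helper, not the PR's actual code; the `usage` field names follow AI SDK-style shapes mentioned later in this PR.

```typescript
// Illustrative sketch of the "never defaulted to 0" rule. The `usage` field
// names mirror AI SDK-style shapes; the helper itself is hypothetical.
interface Usage {
  inputTokens: number
  outputTokens: number
  reasoningTokens?: number
  cachedInputTokens?: number
  cacheWriteTokens?: number
}

function tokenMeasurements(usage: Usage): Record<string, number> {
  return {
    tokens_input: usage.inputTokens,
    tokens_output: usage.outputTokens,
    // Optional metrics are spread in only when the provider reported them,
    // so a missing value stays missing instead of becoming a misleading 0.
    ...(usage.reasoningTokens !== undefined && { tokens_reasoning: usage.reasoningTokens }),
    ...(usage.cachedInputTokens !== undefined && { tokens_cache_read: usage.cachedInputTokens }),
    ...(usage.cacheWriteTokens !== undefined && usage.cacheWriteTokens > 0 && { tokens_cache_write: usage.cacheWriteTokens }),
  }
}
```

With this shape, a downstream consumer can distinguish "provider did not report reasoning tokens" from "zero reasoning tokens", which is the contract the tests later in this PR lock in.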
📝 Walkthrough

Generation telemetry was changed to emit flattened top-level token fields, and step timing telemetry was added. The session processor now emits a `generation` event on each step finish.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant SessionProcessor as SessionProcessor
    participant Telemetry as TelemetryModule
    participant AppInsights as AppInsightsExporter
    SessionProcessor->>SessionProcessor: start-step (record stepStartTime)
    SessionProcessor->>SessionProcessor: run step / receive response (usage, tokens, cost, finish_reason)
    SessionProcessor->>Telemetry: track(generation event with sessionId, messageId, model, provider, agent, finish_reason, cost, duration_ms, tokens_input, tokens_output, tokens_reasoning?, tokens_cache_read?, tokens_cache_write?)
    Telemetry->>AppInsights: toAppInsightsEnvelopes(flat measurements & properties)
    AppInsights-->>Telemetry: accept/enqueue
    Telemetry-->>SessionProcessor: ack (telemetry emitted)
```
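The `toAppInsightsEnvelopes(flat measurements & properties)` step in the diagram can be sketched as a type-based split: App Insights custom events carry string `properties` and numeric `measurements`, so a flat event partitions cleanly by value type. The function name and shape here are illustrative assumptions, not the exporter's actual code.

```typescript
// Hypothetical sketch of the flattening step from the diagram: strings become
// App Insights `properties`, numbers become `measurements`. Because the event
// is already flat, no recursive traversal is needed.
function splitEvent(event: Record<string, string | number>) {
  const properties: Record<string, string> = {}
  const measurements: Record<string, number> = {}
  for (const [key, value] of Object.entries(event)) {
    if (typeof value === "number") {
      measurements[key] = value // e.g. tokens_input, cost, duration_ms
    } else {
      properties[key] = value // e.g. model_id, provider_id, finish_reason
    }
  }
  return { properties, measurements }
}
```

This is why the schema forbids nested objects: a nested `tokens` object would have no slot in either map without an extra flattening pass.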
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
🧹 Nitpick comments (1)

packages/opencode/src/altimate/telemetry/index.ts (1)

17-23: Remove the unused `TokensPayload` type.

The `TokensPayload` type (lines 17-23) is no longer referenced anywhere in the codebase after the schema was changed to use flat `tokens_*` fields. Removing it eliminates dead code and prevents confusion about the expected payload structure.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/opencode/src/altimate/telemetry/index.ts` around lines 17-23: Remove the unused TokensPayload type declaration (export type TokensPayload) from the altimate telemetry module; delete the block defining input/output/reasoning/cache_read/cache_write and any exports of that symbol, then run the TypeScript type-check to confirm no remaining references, and update/remove any imports that referenced TokensPayload elsewhere (if found) so the code compiles with the schema using flat tokens_* fields.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: a9acd731-145c-4d0c-b8a5-039dc6c5ae3d
📒 Files selected for processing (3)
- docs/docs/reference/telemetry.md
- packages/opencode/src/altimate/telemetry/index.ts
- packages/opencode/src/session/processor.ts
Pull request overview
Adds per-generation (generation) telemetry emission on each LLM step completion, including a detailed token breakdown and step duration, and updates the public telemetry reference docs accordingly.
Changes:
- Emit the `generation` telemetry event on every `finish-step`, including cost, finish reason, duration, and token measurements.
- Update the `Telemetry.Event` schema to use flat `tokens_*` numeric fields (instead of a nested `tokens` object) to satisfy Azure App Insights measurement constraints.
- Update `docs/docs/reference/telemetry.md` to describe the `generation` event fields accurately.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| packages/opencode/src/session/processor.ts | Emits generation telemetry on finish-step and tracks duration_ms via a step start timestamp. |
| packages/opencode/src/altimate/telemetry/index.ts | Updates the generation event type to use flat tokens_* fields. |
| docs/docs/reference/telemetry.md | Updates documentation for the generation event’s fields/token breakdown. |
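The `duration_ms` tracking described for `processor.ts` above (a timestamp recorded at `start-step`, consumed at `finish-step`) can be sketched as a small state holder. The class and method names here are illustrative, not the processor's actual code.

```typescript
// Minimal sketch of the step timing described above: record a timestamp on
// start-step, compute duration_ms on finish-step, then reset so a stray
// finish-step cannot reuse a stale start time.
class StepTimer {
  private stepStartTime?: number

  onStartStep(now: number = Date.now()): void {
    this.stepStartTime = now
  }

  // Returns undefined when finish-step arrives without a matching start-step,
  // mirroring the "missing, not zero" convention used for token fields.
  onFinishStep(now: number = Date.now()): number | undefined {
    if (this.stepStartTime === undefined) return undefined
    const durationMs = now - this.stepStartTime
    this.stepStartTime = undefined
    return durationMs
  }
}
```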
```ts
// Flat token fields — only present when data is available from the provider.
// No nested objects: Azure App Insights custom measures must be top-level numbers.
tokens_input: number
tokens_output: number
tokens_reasoning?: number // only for reasoning models
tokens_cache_read?: number // only when a cached prompt was reused
tokens_cache_write?: number // only when a new cache entry was written
```
Changing the generation telemetry event from a nested `tokens` object to flat `tokens_*` fields will break existing type-checked tests that still construct `Telemetry.Event` with `tokens: { ... }` (e.g. `packages/opencode/test/session/processor.test.ts` and `packages/opencode/test/telemetry/telemetry.test.ts`). Please update those tests to use `tokens_input`, `tokens_output`, etc., and adjust the envelope-flattening assertions accordingly (since `tokens` will no longer be present).
```diff
+ // Flat token fields — only present when data is available from the provider.
+ // No nested objects: Azure App Insights custom measures must be top-level numbers.
+ tokens_input: number
+ tokens_output: number
+ tokens_reasoning?: number // only for reasoning models
+ tokens_cache_read?: number // only when a cached prompt was reused
+ tokens_cache_write?: number // only when a new cache entry was written
- tokens: TokensPayload
```
… fields

- Remove `TokensPayload` export (dead code since generation event now uses flat fields)
- Update processor.test.ts: construct generation event with flat tokens_* fields
- Update telemetry.test.ts: use flat tokens_* fields, rename test to reflect new shape

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
🧹 Nitpick comments (2)
packages/opencode/test/session/processor.test.ts (1)
461-485: Assert the rest of the flattened token fields in this case.

Right now this only checks `tokens_input` and `tokens_cache_read`, so `tokens_output`, `tokens_reasoning`, and `tokens_cache_write` can regress without failing this processor-mapping coverage.

➕ Suggested assertions
```diff
  expect(event.model_id).toBe("claude-opus-4-6")
  expect(event.tokens_input).toBe(1000)
+ expect(event.tokens_output).toBe(500)
+ expect(event.tokens_reasoning).toBe(200)
  expect(event.tokens_cache_read).toBe(800)
+ expect(event.tokens_cache_write).toBe(100)
  expect(event.cost).toBe(0.05)
  expect(event.finish_reason).toBe("end_turn")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opencode/test/session/processor.test.ts` around lines 461-485: Update the "generation event contains all required fields" test to assert the remaining flattened token fields so regressions are caught: add expect checks for event.tokens_output, event.tokens_reasoning, and event.tokens_cache_write (and keep existing assertions) inside the same test block that defines the event variable; reference the test name and the event object when adding these expect statements.

packages/opencode/test/telemetry/telemetry.test.ts (1)
627-660: Add the omit-when-unavailable case for optional token metrics.

This only exercises the all-fields-present path. A companion case where the event omits `tokens_reasoning`, `tokens_cache_read`, and `tokens_cache_write` would protect the new "missing, not zero" contract.

➕ Suggested test
```diff
  test("flat token fields appear in measurements", async () => {
    const { fetchCalls, cleanup } = await initWithMockedFetch()
    try {
      Telemetry.track({
        type: "generation",
@@
    } finally {
      cleanup()
    }
  })
+
+ test("optional token metrics stay omitted when not present on the event", async () => {
+   const { fetchCalls, cleanup } = await initWithMockedFetch()
+   try {
+     Telemetry.track({
+       type: "generation",
+       timestamp: 1700000000000,
+       session_id: "sess-1",
+       message_id: "msg-1",
+       model_id: "claude-3",
+       provider_id: "anthropic",
+       agent: "builder",
+       finish_reason: "end_turn",
+       tokens_input: 100,
+       tokens_output: 200,
+       cost: 0.01,
+       duration_ms: 2000,
+     })
+
+     await Telemetry.flush()
+
+     const measurements = JSON.parse(fetchCalls[0].body)[0].data.baseData.measurements
+     expect(measurements.tokens_reasoning).toBeUndefined()
+     expect(measurements.tokens_cache_read).toBeUndefined()
+     expect(measurements.tokens_cache_write).toBeUndefined()
+   } finally {
+     cleanup()
+   }
+ })
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opencode/test/telemetry/telemetry.test.ts` around lines 627 - 660, Add a companion test for Telemetry.track that omits the optional token fields tokens_reasoning, tokens_cache_read, and tokens_cache_write to verify the "omit-when-unavailable" behavior: call Telemetry.track with the same required fields but leave out those three, call await Telemetry.flush(), parse fetchCalls[0].body to get envelopes[0].data.baseData.measurements, and assert that measurements.tokens_reasoning, measurements.tokens_cache_read, and measurements.tokens_cache_write are not present (or are undefined) while tokens_input/tokens_output still appear; use the same initWithMockedFetch/cleanup pattern as the existing flat token fields test to scope the mocked fetch.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 9df61593-d829-4149-8b44-dcd10eef53d5
📒 Files selected for processing (3)
- packages/opencode/src/altimate/telemetry/index.ts
- packages/opencode/test/session/processor.test.ts
- packages/opencode/test/telemetry/telemetry.test.ts
✅ Tests — All Passed
TypeScript — passed

cc @suryaiyer95
Summary
- Emit the `generation` telemetry event (previously defined but never fired) on every LLM step-finish
- Optional token fields (`tokens_reasoning`, `tokens_cache_read`, `tokens_cache_write`) are only included when the provider actually returns them — never defaulted to 0
- Updates `telemetry.md` with accurate description of the generation event fields

Design Decisions
Flat fields, not nested objects — Azure App Insights custom measurements must be top-level numbers. The `generation` event type now uses `tokens_input`, `tokens_output`, etc. directly instead of a nested `tokens: TokensPayload` object.

"Not available" vs "zero" — Each optional field uses the raw AI SDK usage values to determine availability:

- `tokens_reasoning`: only when `value.usage.reasoningTokens !== undefined` (reasoning models)
- `tokens_cache_read`: only when `value.usage.cachedInputTokens !== undefined` (cache hit)
- `tokens_cache_write`: only when `usage.tokens.cache.write > 0` (Anthropic/Bedrock metadata)

Step duration — Tracked via `stepStartTime` set at `start-step`, computed at `finish-step`.

Note on "1 input token" in traces
The existing `tokens_input` correctly reflects Anthropic's semantics: after the first step, all previous context is cached, so `inputTokens` from Anthropic is only the new tokens added since the last step. The large cached portion appears in `tokens_cache_read`. This is correct behavior — `total = input + output + cache_read + cache_write`.

Checklist
- (`@opencode-ai/util` missing in worktree)
- `processor.ts`
- `docs/docs/reference/telemetry.md`

🤖 Generated with Claude Code
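The accounting from the note above (`total = input + output + cache_read + cache_write`) can be sanity-checked with a small helper. The interface and the numbers in the usage example are made up for illustration.

```typescript
// Sanity check for the accounting in the note above: with prompt caching,
// tokens_input covers only new uncached tokens, so the grand total must add
// the cache components back in. Missing optional fields count as 0 here only
// because this is a sum, not because the telemetry event defaults them.
interface GenerationTokens {
  tokens_input: number
  tokens_output: number
  tokens_reasoning?: number
  tokens_cache_read?: number
  tokens_cache_write?: number
}

function totalTokens(t: GenerationTokens): number {
  return t.tokens_input + t.tokens_output + (t.tokens_cache_read ?? 0) + (t.tokens_cache_write ?? 0)
}
```

For example, a step with 1 new input token, 500 output tokens, and 8000 cached tokens reused totals 8501 — which is why a trace showing "1 input token" is correct rather than a bug.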
Summary by CodeRabbit
Documentation
Chores