Per-session usage/cost metrics (tokens, cache, cost, context %) #2

@alexanderkreidich

Description

Consumers migrating from bespoke chat UIs (e.g. the appx canvas template) already show per-session usage in their chat settings: input/output tokens, cache read/write, monetary cost, and context-window utilization. agent-client currently has no notion of usage at all, neither in the reducer/SessionState nor as a client helper.

The data is already on the wire: AssistantMessage.usage is part of the contract (REST history via getSessionMessages and SSE WireEvents). What's missing:

  1. Aggregation — accumulate usage from assistant messages into SessionState (or expose a pure helper that folds a UiMessage[]/history into totals: input, output, cacheRead, cacheWrite, cost, context % of the active model's contextWindow).
  2. Cost recalculation when usage.cost is 0 — custom LiteLLM-routed models often report zero cost; the canvas template recalculates from per-million-token rates. AgentModelRow from GET /v1/sessions/models doesn't expose cost rates, so either the rates get added to the agent-server models contract (cross-repo dependency) or the consumer passes a rates map into the helper.
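
A minimal sketch of the pure helper from item 1. The issue names AssistantMessage.usage but not its fields, so the `Usage` shape (including `cost` as a single number), the `HistoryMessage` type, and the `foldUsage` name are all assumptions for illustration. Context % is approximated from the most recent assistant turn, which is also an assumption rather than something the contract specifies.

```typescript
// Assumed shape of AssistantMessage.usage; field names are hypothetical.
interface Usage {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  cost: number; // assumed to be a single total in the contract's currency
}

// Hypothetical stand-in for UiMessage / history entries.
interface HistoryMessage {
  role: "user" | "assistant";
  usage?: Usage;
}

interface UsageTotals extends Usage {
  // null when no (positive) context window is known for the active model
  contextPercent: number | null;
}

// Folds a message history into session totals. Pure: no I/O, no mutation of input.
function foldUsage(
  messages: readonly HistoryMessage[],
  contextWindow?: number,
): UsageTotals {
  const totals: Usage = { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, cost: 0 };
  let last: Usage | undefined;

  for (const m of messages) {
    // Only assistant messages carry usage; skip everything else.
    if (m.role !== "assistant" || !m.usage) continue;
    totals.input += m.usage.input;
    totals.output += m.usage.output;
    totals.cacheRead += m.usage.cacheRead;
    totals.cacheWrite += m.usage.cacheWrite;
    totals.cost += m.usage.cost;
    last = m.usage;
  }

  // Assumption: the latest turn's tokens approximate current context occupancy.
  const contextTokens = last
    ? last.input + last.cacheRead + last.cacheWrite + last.output
    : 0;
  const contextPercent =
    contextWindow !== undefined && contextWindow > 0
      ? (contextTokens / contextWindow) * 100
      : null;

  return { ...totals, contextPercent };
}
```

The same function could back either option in item 1: called from the reducer on each assistant message to keep SessionState current, or called directly by a consumer on the result of getSessionMessages.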

Previously, usage was computed by reading Pi session JSONL files from disk. That is impossible once sessions live inside the agent-server container under .pi-global/sessions/{projectId}/, so the client/contract path is the only one that survives the containerized architecture.

Suggested split: do (1) in this package now with an optional consumer-supplied rates map for (2); file the rates-in-models-endpoint question against agent-server.
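
A rough sketch of what the optional consumer-supplied rates map could look like. Everything here is hypothetical: the `RatesPerMillion` and `RatesMap` types, keying the map by model id, and the `resolveCost` name are assumptions, not part of the current contract. The reported cost is trusted when it is non-zero; otherwise the cost is recalculated from per-million-token rates, as the canvas template is described as doing.

```typescript
interface TokenCounts {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

// Hypothetical: price per one million tokens, per token category.
interface RatesPerMillion {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

// Hypothetical: rates keyed by model id, supplied by the consumer.
type RatesMap = Record<string, RatesPerMillion>;

// Returns the reported cost when it is positive; otherwise falls back to the
// consumer's rates. With no rates for the model, the reported value is kept.
function resolveCost(
  reportedCost: number,
  tokens: TokenCounts,
  modelId: string,
  rates?: RatesMap,
): number {
  if (reportedCost > 0) return reportedCost;

  const r = rates?.[modelId];
  if (!r) return reportedCost;

  const weighted =
    tokens.input * r.input +
    tokens.output * r.output +
    tokens.cacheRead * r.cacheRead +
    tokens.cacheWrite * r.cacheWrite;
  return weighted / 1e6;
}
```

If the rates later move into the agent-server models contract, the same fallback could read them from AgentModelRow instead, leaving the consumer-supplied map as an override.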
