feat(cost): native budget + cost on the SDK transports; consolidate pricing #24
Merged
…ricing

The native-vs-hand-rolled audit found one genuine catch-up: budget enforcement and cost tracking only worked on the managed-agents transport. This closes that gap with native SDK primitives and unifies the pricing tables. The budget POLICY stays fab's; only enforcement + cost now ride native knobs where the transport exposes them.

─── Native budget + cost on sdk / sdk-k8s / claude-cli (Rank 1) ───

These transports had NO budget enforcement and NO cost tracking: the tracker in streamSessionWithAdvisor only accumulates `span.model_request_end` events, which only managed-agents emits.

- sdk.ts passes `maxBudgetUsd` (from getBudgetLimit) into the Agent SDK query(); the SDK stops the run with an `error_max_budget_usd` result when the USD cap is exceeded, which sdk-events already maps to session.error. Native per-run budget enforcement on the SDK transports.
- sdk-events attaches the result's native `total_cost_usd` onto session.status_idle; streamSessionWithAdvisor reads it and reports the actual run cost on those transports (previously $0 / unknown). claude-cli rides the same translator, so it's covered too.

─── Pricing consolidation (Rank 2) ───

- New src/pricing.ts is the single source: MODEL_RATES (Opus $5/$25, Sonnet $3/$15, Haiku $1/$5) + cache multipliers (read 0.1x, write 1.25x) + rateFor + estimateCost. Replaces four divergent copies: usage.ts's (good), workflows.ts MODEL_PRICING (speed-keyed Sonnet-flat), and perf.ts's per-role + totals (Sonnet-flat inline).
- workflows.ts budget tracker now prices spans model+cache-aware via the role's model instead of a flat Sonnet rate.
- usage.ts + perf.ts consume pricing.ts; perf per-role and totals are now model-aware, so Opus roles are no longer under-priced.

─── Tests ───

- pricing.test.ts: rateFor tiers + sonnet fallback; estimateCost input/output + cache-read (0.1x) / cache-write (1.25x) + null cache fields.
- sdk-events.test.ts: total_cost_usd attaches to status_idle; omitted on the managed-agents result shape.

Depends on the SDK 0.3.x bump + model-aware usage.ts from #21 (this branch is stacked on it).

Verification: npm run lint / build / format:check clean; 274 tests.

Co-authored-by: stxkxsbot <275011021+stxkxsbot@users.noreply.github.com>
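For reviewers who want the pricing arithmetic spelled out, here is a minimal sketch of what a consolidated pricing module along these lines could look like. The names `MODEL_RATES`, `rateFor`, and `estimateCost` and the numeric rates and cache multipliers come from the commit message. Everything else is an assumption made for illustration, not the contents of `src/pricing.ts`: the `Rate` and `Usage` shapes, the reading of the rates as USD per million tokens, the substring-based tier lookup, and the idea that cached tokens are billed at a multiple of the input rate and counted separately from `inputTokens`.

```typescript
// Hypothetical sketch only; the real src/pricing.ts may differ in shape and detail.
type Tier = "opus" | "sonnet" | "haiku";

interface Rate {
  inputPerMTok: number; // assumed unit: USD per million input tokens
  outputPerMTok: number; // assumed unit: USD per million output tokens
}

// Rates as listed in the commit message (input / output).
const MODEL_RATES: Record<Tier, Rate> = {
  opus: { inputPerMTok: 5, outputPerMTok: 25 },
  sonnet: { inputPerMTok: 3, outputPerMTok: 15 },
  haiku: { inputPerMTok: 1, outputPerMTok: 5 },
};

// Cache multipliers from the commit message, applied to the input rate (assumption).
const CACHE_READ_MULTIPLIER = 0.1;
const CACHE_WRITE_MULTIPLIER = 1.25;

// Assumed usage shape; cache fields may be missing or null.
interface Usage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens?: number | null;
  cacheWriteTokens?: number | null;
}

// Resolve a model id to a rate tier, falling back to Sonnet for unknown ids.
// Substring matching is an assumption about how tiers are detected.
function rateFor(model: string): Rate {
  const id = model.toLowerCase();
  if (id.includes("opus")) return MODEL_RATES.opus;
  if (id.includes("haiku")) return MODEL_RATES.haiku;
  return MODEL_RATES.sonnet;
}

// Estimate the USD cost of one request, treating null/missing cache fields as 0.
function estimateCost(model: string, usage: Usage): number {
  const rate = rateFor(model);
  const cacheRead = usage.cacheReadTokens ?? 0;
  const cacheWrite = usage.cacheWriteTokens ?? 0;
  const weightedInputTokens =
    usage.inputTokens +
    cacheRead * CACHE_READ_MULTIPLIER +
    cacheWrite * CACHE_WRITE_MULTIPLIER;
  const inputUsd = (weightedInputTokens * rate.inputPerMTok) / 1_000_000;
  const outputUsd = (usage.outputTokens * rate.outputPerMTok) / 1_000_000;
  return inputUsd + outputUsd;
}
```

Under these assumptions, pricing a span by the role's model rather than a flat Sonnet rate only requires passing the role's model id to `estimateCost`; an Opus span then costs roughly 1.67x what the Sonnet-flat tables would have reported.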
0251c7b to c2d7389
The one genuine catch-up from the native-vs-hand-rolled audit. See the commit message for full detail.
Stacked on #21 (base = `feat/claude-mid-2026-currency`): it needs the SDK 0.3.x bump (`maxBudgetUsd` exists only there) and the model-aware `usage.ts` from #21. Retarget to `main` once #21 merges.

Summary

- Budget enforcement and cost tracking previously worked only on `managed-agents`. Now the `sdk` / `sdk-k8s` / `claude-cli` transports get native budget (`maxBudgetUsd` → `error_max_budget_usd`) and native cost (`total_cost_usd` from the result, surfaced on `session.status_idle`); previously neither existed there.
- Pricing is consolidated into `src/pricing.ts` (model + cache aware), replacing 4 divergent copies (2 of them Sonnet-flat). The live budget tracker and the perf report are now model-aware, so Opus roles aren't under-priced.

Theory preserved

The budget policy (per-run limit from `getBudgetLimit`) stays fab's; only enforcement + cost now use native primitives where the transport exposes them.

Verification

`npm run lint` / `build` / `format:check` clean · `npm test` 274/274 (+pricing + sdk-events cost tests).
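To make the budget and cost flow on the SDK transports concrete, here is a rough sketch of the translation step: a final SDK result either becomes `session.error` (for example when the run stopped with `error_max_budget_usd`) or `session.status_idle` carrying the native `total_cost_usd`. The subtype and field names `error_max_budget_usd` and `total_cost_usd` and the two event names come from this PR. The rest is hypothetical and is not the actual sdk-events code: the `SdkResult` and event shapes, the `translateResult` name, the `totalCostUsd` property, and the rule that every `error_`-prefixed subtype maps to `session.error`.

```typescript
// Hypothetical sketch only; not the actual sdk-events translator.

// Assumed, simplified shape of the SDK's final result message.
interface SdkResult {
  type: "result";
  subtype: string; // e.g. "success" or "error_max_budget_usd"
  total_cost_usd?: number; // absent on the managed-agents result shape
}

interface SessionErrorEvent {
  type: "session.error";
  reason: string;
}

interface StatusIdleEvent {
  type: "session.status_idle";
  totalCostUsd?: number; // hypothetical property name for the attached cost
}

type FabEvent = SessionErrorEvent | StatusIdleEvent;

// Translate a final SDK result into a fab session event.
function translateResult(result: SdkResult): FabEvent {
  // Assumption: any "error_"-prefixed subtype, including the budget cap,
  // surfaces as session.error.
  if (result.subtype.startsWith("error_")) {
    return { type: "session.error", reason: result.subtype };
  }
  const idle: StatusIdleEvent = { type: "session.status_idle" };
  // Attach the native cost only when the result actually reports one.
  if (typeof result.total_cost_usd === "number") {
    idle.totalCostUsd = result.total_cost_usd;
  }
  return idle;
}
```

A consumer such as the stream loop can then prefer the attached cost when it is present and fall back to span-based accumulation when it is not, which keeps the managed-agents path unchanged.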