feat: refresh to mid-2026 Claude/Anthropic capabilities #21
Merged
Six months of platform drift left a few concrete staleness points. This brings the model references, the Agent SDK pin, and the cost/docs surfaces current, verified against primary Anthropic/AWS docs.

─── Models & inference ───
- Bump every Opus escalation path to claude-opus-4-8 (current flagship, GA 2026-05-28): the advisor model (src/advisor.ts), the external-reviewer and prompt-optimizer lab roles (src/team/lab.ts), and the LLM_POLICY escalation tier in src/standards.ts that FACTORY_PREAMBLE injects into every factory role and produced app. Sonnet 4.6 / Haiku 4.5 remain the current Sonnet/Haiku tiers and are unchanged.
- Add claude-opus-4-8 -> anthropic.claude-opus-4-8 to BEDROCK_MODEL_IDS (no -v1/date suffix, per the Bedrock model card); keep the 4.6/4.7 entries for back-compat. New resolveModelId assertion in inference.test.ts.

─── Agent SDK ───
- Unstick @anthropic-ai/claude-agent-sdk from ^0.1.0 (which cannot resolve 0.3.x) to ^0.3.161; lockfile regenerated to 0.3.161. The sdk / sdk-k8s runtimes were silently held ~2 minors back. Validated against the installed 0.3.161: lint clean, 267 tests pass, and the query() export resolves.
- sdk-events.ts gains a documented no-op branch for non-init `system` messages, so any status/compaction subtype the 0.3.x SDK introduces is an explicit ignore rather than a silent drop.

─── Cost reporting ───
- src/usage.ts cost estimation is now model-aware (Opus $5/$25, Sonnet $3/$15, Haiku $1/$5, resolved per session from session.agent.model.id) and cache-token-aware (read 0.1x, 5-minute write 1.25x). Previously every role was priced at a flat Sonnet rate and cache tokens were ignored. fab sums API-returned token counts (there is no client-side tokenizer), so only the rate card is maintained here.

─── Docs & currency sweep ───
- docs/transports.md documents the 2026-06-15 Agent SDK + `claude -p` subscription billing: a monthly Agent SDK credit on Pro/Max/Team/Enterprise separate from interactive chat limits; API-key accounts stay pay-as-you-go.
- Illustrative opus-4-6 references swept to 4-8: the fab.schema.json model example, the state.test override fixture, and the kagent / eks-agent-platform baseline skill docs.
- Vendored src/standards/llm-policy.json escalation tier bumped for consistency; its canonical source is the nanohype repo, so the durable fix lands there on the next standards sync.

Verification: npm run lint / build / format:check clean; npm test 267/267; npm audit 0 vulnerabilities.

Co-authored-by: stxkxsbot <275011021+stxkxsbot@users.noreply.github.com>
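The Cost reporting change above can be illustrated with a minimal sketch. It is not the code in src/usage.ts: the names `rateForModel`, `estimateSessionCost`, and `SessionUsage` are hypothetical, the rates are assumed to be USD per one million tokens (the commit message gives only the input/output pairs), the cache multipliers are assumed to scale the input rate, and the Sonnet fallback for unrecognized model ids is also an assumption.

```typescript
// Assumption: each pair is (input, output) in USD per 1,000,000 tokens.
interface Rate {
  inputPerMTok: number;
  outputPerMTok: number;
}

const MODEL_RATES: Record<"opus" | "sonnet" | "haiku", Rate> = {
  opus: { inputPerMTok: 5, outputPerMTok: 25 },
  sonnet: { inputPerMTok: 3, outputPerMTok: 15 },
  haiku: { inputPerMTok: 1, outputPerMTok: 5 },
};

// Cache multipliers from the commit message, assumed to apply to the input rate.
const CACHE_READ_MULTIPLIER = 0.1;
const CACHE_WRITE_MULTIPLIER = 1.25; // 5-minute cache write

// Hypothetical usage shape: token counts as returned by the API, summed per session.
interface SessionUsage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens?: number | null;
  cacheWriteTokens?: number | null;
}

// Pick a rate by model family; unknown ids fall back to Sonnet (an assumption).
function rateForModel(modelId: string): Rate {
  const id = modelId.toLowerCase();
  if (id.includes("opus")) return MODEL_RATES.opus;
  if (id.includes("haiku")) return MODEL_RATES.haiku;
  return MODEL_RATES.sonnet;
}

// Estimate the USD cost of one session from its summed token counts.
function estimateSessionCost(modelId: string, usage: SessionUsage): number {
  const rate = rateForModel(modelId);
  const inputEquivalentTokens =
    usage.inputTokens +
    (usage.cacheReadTokens ?? 0) * CACHE_READ_MULTIPLIER +
    (usage.cacheWriteTokens ?? 0) * CACHE_WRITE_MULTIPLIER;
  return (
    (inputEquivalentTokens * rate.inputPerMTok +
      usage.outputTokens * rate.outputPerMTok) /
    1_000_000
  );
}

// An Opus session: 1M input, 100k output, 2M cache-read tokens.
const opusCost = estimateSessionCost("claude-opus-4-8", {
  inputTokens: 1_000_000,
  outputTokens: 100_000,
  cacheReadTokens: 2_000_000,
  cacheWriteTokens: null,
});
console.log(opusCost.toFixed(2)); // "8.50"
```

Under these assumptions the same session priced at a flat Sonnet rate with cache tokens ignored would come to $4.50, which is the kind of under-pricing the commit describes.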
This was referenced Jun 3, 2026
stxkxs added a commit that referenced this pull request Jun 4, 2026
…ricing

The native-vs-hand-rolled audit found one genuine catch-up: budget enforcement and cost tracking only worked on the managed-agents transport. This closes that gap with native SDK primitives and unifies the pricing tables. The budget POLICY stays fab's; only enforcement + cost now ride native knobs where the transport exposes them.

─── Native budget + cost on sdk / sdk-k8s / claude-cli (Rank 1) ───
These transports had NO budget enforcement and NO cost tracking: the tracker in streamSessionWithAdvisor only accumulates `span.model_request_end` events, which only managed-agents emits.
- sdk.ts passes `maxBudgetUsd` (from getBudgetLimit) into the Agent SDK query(); the SDK stops the run with an `error_max_budget_usd` result when the USD cap is exceeded, which sdk-events already maps to session.error. Native per-run budget enforcement on the SDK transports.
- sdk-events attaches the result's native `total_cost_usd` onto session.status_idle; streamSessionWithAdvisor reads it and reports the actual run cost on those transports (previously $0 / unknown). claude-cli rides the same translator, so it's covered too.

─── Pricing consolidation (Rank 2) ───
- New src/pricing.ts is the single source: MODEL_RATES (Opus $5/$25, Sonnet $3/$15, Haiku $1/$5) + cache multipliers (read 0.1x, write 1.25x) + rateFor + estimateCost. Replaces four divergent copies: usage.ts's (good), workflows.ts MODEL_PRICING (speed-keyed Sonnet-flat), and perf.ts's per-role + totals (Sonnet-flat inline).
- workflows.ts budget tracker now prices spans model+cache-aware via the role's model instead of a flat Sonnet rate.
- usage.ts + perf.ts consume pricing.ts; perf per-role and totals are now model-aware, so Opus roles are no longer under-priced.

─── Tests ───
- pricing.test.ts: rateFor tiers + sonnet fallback; estimateCost input/output + cache-read (0.1x) / cache-write (1.25x) + null cache fields.
- sdk-events.test.ts: total_cost_usd attaches to status_idle; omitted on the managed-agents result shape.

Depends on the SDK 0.3.x bump + model-aware usage.ts from #21 (this branch is stacked on it).

Verification: npm run lint / build / format:check clean; 274 tests.

Co-authored-by: stxkxsbot <275011021+stxkxsbot@users.noreply.github.com>
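The Rank 1 behavior described in this commit can be sketched as a small translator. This is an illustration only, not the code in sdk-events: the `SdkResultMessage` and `SessionEvent` shapes are hypothetical and heavily trimmed, `translateResult` and `reportedRunCost` are made-up names, and treating every non-`success` subtype as an error is an assumption. Only the `error_max_budget_usd` subtype, the `total_cost_usd` field, and the `session.status_idle` / `session.error` event names come from the commit message.

```typescript
// Hypothetical, trimmed-down result message; the real SDK result carries more fields.
interface SdkResultMessage {
  type: "result";
  subtype: string; // assumed values include "success" and "error_max_budget_usd"
  total_cost_usd?: number; // absent on result shapes that do not report cost
}

// Hypothetical session events as seen by the stream consumer.
interface IdleEvent {
  type: "session.status_idle";
  total_cost_usd?: number;
}
interface ErrorEvent {
  type: "session.error";
  message: string;
}
type SessionEvent = IdleEvent | ErrorEvent;

// Translate one result message into one session event.
function translateResult(msg: SdkResultMessage): SessionEvent {
  if (msg.subtype === "error_max_budget_usd") {
    return { type: "session.error", message: "run stopped: USD budget cap exceeded" };
  }
  if (msg.subtype !== "success") {
    // Assumption: any other non-success subtype is surfaced as an error too.
    return { type: "session.error", message: `run failed: ${msg.subtype}` };
  }
  const idle: IdleEvent = { type: "session.status_idle" };
  // Attach the native cost only when the transport actually reported one.
  if (typeof msg.total_cost_usd === "number") {
    idle.total_cost_usd = msg.total_cost_usd;
  }
  return idle;
}

// Consumer side: return the reported run cost, or undefined when none was attached.
function reportedRunCost(events: SessionEvent[]): number | undefined {
  for (const event of events) {
    if (event.type === "session.status_idle" && event.total_cost_usd !== undefined) {
      return event.total_cost_usd;
    }
  }
  return undefined;
}

const withCost = translateResult({ type: "result", subtype: "success", total_cost_usd: 0.42 });
const overBudget = translateResult({ type: "result", subtype: "error_max_budget_usd" });
console.log(reportedRunCost([withCost])); // 0.42
console.log(overBudget.type); // "session.error"
```

Returning `undefined` rather than `0` when no cost is attached keeps "cost unknown" distinguishable from "run was free", which matters for transports whose result shape omits the field.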
See the commit message for full details.
Summary
- LLM_POLICY, docs) `claude-opus-4-6` → `claude-opus-4-8`, and adds it to the Bedrock map (4.6/4.7 kept for back-compat) with a new `resolveModelId` test.
- `@anthropic-ai/claude-agent-sdk` `^0.1.0` → `^0.3.161` (lockfile regenerated; validated against the installed 0.3.161: lint, 267 tests, `query()` export).
- `usage.ts`; documents the 2026-06-15 Agent SDK + `claude -p` subscription-credit billing in `docs/transports.md`.

Verification
`npm run lint` / `build` / `format:check` clean · `npm test` 267/267 · `npm audit` 0 vulnerabilities.

Follow-ups (not in this PR)
- `llm-policy.json` escalation tier lives upstream in the `nanohype` repo (vendored copy reverts on next sync).
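
The Agent SDK bump in the Summary turns on how npm caret ranges treat 0.x versions: for a 0.y.z base with y > 0, a caret range only admits versions with the same minor, so `^0.1.0` can never pick up 0.3.x. The sketch below re-implements that rule for plain `x.y.z` strings as an illustration; `satisfiesCaret` and its helpers are hypothetical names, prerelease tags are not handled, and real tooling should use the `semver` package instead.

```typescript
type Triple = [number, number, number];

// Parse a plain "x.y.z" version; prerelease and build metadata are out of scope here.
function parseVersion(version: string): Triple {
  const parts = version.split(".").map(Number);
  if (parts.length !== 3 || parts.some((n) => !Number.isInteger(n) || n < 0)) {
    throw new Error(`not a plain x.y.z version: ${version}`);
  }
  return [parts[0], parts[1], parts[2]];
}

// Negative, zero, or positive, like a sort comparator.
function compareVersions(a: Triple, b: Triple): number {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Caret allows changes that do not modify the left-most non-zero component.
function caretUpperBound([major, minor, patch]: Triple): Triple {
  if (major > 0) return [major + 1, 0, 0];
  if (minor > 0) return [0, minor + 1, 0];
  return [0, 0, patch + 1];
}

// True when `version` falls inside the range `^base`.
function satisfiesCaret(version: string, base: string): boolean {
  const v = parseVersion(version);
  const b = parseVersion(base);
  return compareVersions(v, b) >= 0 && compareVersions(v, caretUpperBound(b)) < 0;
}

console.log(satisfiesCaret("0.3.161", "0.1.0")); // false: the old pin cannot resolve 0.3.x
console.log(satisfiesCaret("0.3.161", "0.3.161")); // true
console.log(satisfiesCaret("0.4.0", "0.3.161")); // false: the next minor needs another bump
```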