feat: refresh to mid-2026 Claude/Anthropic capabilities by stxkxs · Pull Request #21 · nanohype/fab

stxkxs · 2026-06-03T21:15:20Z

See the commit message for full details.

Summary

Bumps every Opus escalation path (advisor, lab roles, LLM_POLICY, docs) claude-opus-4-6 → claude-opus-4-8, and adds it to the Bedrock map (4.6/4.7 kept for back-compat) with a new resolveModelId test.
Unsticks @anthropic-ai/claude-agent-sdk ^0.1.0 → ^0.3.161 (lockfile regenerated; validated against the installed 0.3.161 — lint, 267 tests, query() export).
Model-aware + cache-aware cost reporting in usage.ts; documents the 2026-06-15 Agent SDK + claude -p subscription-credit billing in docs/transports.md.

Verification

npm run lint / build / format:check clean · npm test 267/267 · npm audit 0 vulnerabilities.

Follow-ups (not in this PR)

Durable fix for llm-policy.json escalation tier lives upstream in the nanohype repo (vendored copy reverts on next sync).
P2 evaluations are planned separately (gate citation-verification is next).

Six months of platform drift left a few concrete staleness points. This brings the model references, the Agent SDK pin, and the cost/docs surfaces current, verified against primary Anthropic/AWS docs. ─── Models & inference ─── - Bump every Opus escalation path to claude-opus-4-8 (current flagship, GA 2026-05-28): the advisor model (src/advisor.ts), the external-reviewer and prompt-optimizer lab roles (src/team/lab.ts), and the LLM_POLICY escalation tier in src/standards.ts that FACTORY_PREAMBLE injects into every factory role and produced app. Sonnet 4.6 / Haiku 4.5 remain the current Sonnet/Haiku tiers and are unchanged. - Add claude-opus-4-8 -> anthropic.claude-opus-4-8 to BEDROCK_MODEL_IDS (no -v1/date suffix, per the Bedrock model card); keep the 4.6/4.7 entries for back-compat. New resolveModelId assertion in inference.test.ts. ─── Agent SDK ─── - Unstick @anthropic-ai/claude-agent-sdk from ^0.1.0 (which cannot resolve 0.3.x) to ^0.3.161; lockfile regenerated to 0.3.161. The sdk / sdk-k8s runtimes were silently held ~2 minors back. Validated against the installed 0.3.161: lint clean, 267 tests pass, and the query() export resolves. - sdk-events.ts gains a documented no-op branch for non-init `system` messages, so any status/compaction subtype the 0.3.x SDK introduces is an explicit ignore rather than a silent drop. ─── Cost reporting ─── - src/usage.ts cost estimation is now model-aware (Opus $5/$25, Sonnet $3/$15, Haiku $1/$5, resolved per session from session.agent.model.id) and cache-token-aware (read 0.1x, 5-minute write 1.25x). Previously every role was priced at a flat Sonnet rate and cache tokens were ignored. fab sums API-returned token counts (there is no client-side tokenizer), so only the rate card is maintained here. ─── Docs & currency sweep ─── - docs/transports.md documents the 2026-06-15 Agent SDK + `claude -p` subscription billing: a monthly Agent SDK credit on Pro/Max/Team/Enterprise separate from interactive chat limits; API-key accounts stay pay-as-you-go. - Illustrative opus-4-6 references swept to 4-8: the fab.schema.json model example, the state.test override fixture, and the kagent / eks-agent-platform baseline skill docs. - Vendored src/standards/llm-policy.json escalation tier bumped for consistency; its canonical source is the nanohype repo, so the durable fix lands there on the next standards sync. Verification: npm run lint / build / format:check clean; npm test 267/267; npm audit 0 vulnerabilities. Co-authored-by: stxkxsbot <275011021+stxkxsbot@users.noreply.github.com>

…ricing The native-vs-hand-rolled audit found one genuine catch-up: budget enforcement and cost tracking only worked on the managed-agents transport. This closes that gap with native SDK primitives and unifies the pricing tables. The budget POLICY stays fab's; only enforcement + cost now ride native knobs where the transport exposes them. ─── Native budget + cost on sdk / sdk-k8s / claude-cli (Rank 1) ─── These transports had NO budget enforcement and NO cost tracking: the tracker in streamSessionWithAdvisor only accumulates `span.model_request_end` events, which only managed-agents emits. - sdk.ts passes `maxBudgetUsd` (from getBudgetLimit) into the Agent SDK query(); the SDK stops the run with an `error_max_budget_usd` result when the USD cap is exceeded, which sdk-events already maps to session.error. Native per-run budget enforcement on the SDK transports. - sdk-events attaches the result's native `total_cost_usd` onto session.status_idle; streamSessionWithAdvisor reads it and reports the actual run cost on those transports (previously $0 / unknown). claude-cli rides the same translator, so it's covered too. ─── Pricing consolidation (Rank 2) ─── - New src/pricing.ts is the single source: MODEL_RATES (Opus $5/$25, Sonnet $3/$15, Haiku $1/$5) + cache multipliers (read 0.1x, write 1.25x) + rateFor + estimateCost. Replaces four divergent copies — usage.ts's (good), workflows.ts MODEL_PRICING (speed-keyed Sonnet-flat), and perf.ts's per-role + totals (Sonnet-flat inline). - workflows.ts budget tracker now prices spans model+cache-aware via the role's model instead of a flat Sonnet rate. - usage.ts + perf.ts consume pricing.ts; perf per-role and totals are now model-aware, so Opus roles are no longer under-priced. ─── Tests ─── - pricing.test.ts — rateFor tiers + sonnet fallback; estimateCost input/output + cache-read (0.1x) / cache-write (1.25x) + null cache fields. - sdk-events.test.ts — total_cost_usd attaches to status_idle; omitted on the managed-agents result shape. Depends on the SDK 0.3.x bump + model-aware usage.ts from #21 (this branch is stacked on it). Verification: npm run lint / build / format:check clean; 274 tests. Co-authored-by: stxkxsbot <275011021+stxkxsbot@users.noreply.github.com>

stxkxs marked this pull request as ready for review June 4, 2026 02:06

stxkxs merged commit 716e0f7 into main Jun 4, 2026
2 checks passed

stxkxs deleted the feat/claude-mid-2026-currency branch June 4, 2026 02:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: refresh to mid-2026 Claude/Anthropic capabilities#21

feat: refresh to mid-2026 Claude/Anthropic capabilities#21
stxkxs merged 1 commit into
mainfrom
feat/claude-mid-2026-currency

stxkxs commented Jun 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

stxkxs commented Jun 3, 2026

Summary

Verification

Follow-ups (not in this PR)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant