[BUG] Anthropic prompt caching gated on hard-coded model-id switch instead of info.supportsPromptCache #717

Description

@sumleo

Problem (one or two sentences)

In the Anthropic provider, prompt-cache breakpoints (cache_control) are attached only inside a hard-coded switch (modelId) allowlist of literal model ids. Any cacheable Claude model whose exact id is not pasted into that list falls through to the default: branch, which sends the system prompt and messages with no cache_control, so prompt caching is silently disabled for it.
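To make the failure mode concrete, here is a simplified sketch of the pattern. It is not the actual provider source: the function name `buildSystem` and both model ids are hypothetical placeholders. Only the `cache_control: { type: "ephemeral" }` shape follows the Anthropic Messages API.

```typescript
// Simplified illustration of the anti-pattern; NOT the real provider code.
// "listed-model-a", "listed-model-b" and "new-cacheable-model" are made-up ids.
type SystemBlock = {
  type: "text"
  text: string
  cache_control?: { type: "ephemeral" }
}

function buildSystem(modelId: string, systemPrompt: string): SystemBlock[] {
  switch (modelId) {
    // Only ids pasted into this list ever get a cache breakpoint.
    case "listed-model-a":
    case "listed-model-b":
      return [{ type: "text", text: systemPrompt, cache_control: { type: "ephemeral" } }]
    // Everything else, including cacheable models, falls through here.
    default:
      return [{ type: "text", text: systemPrompt }]
  }
}

const listed = buildSystem("listed-model-a", "You are helpful.")
const unlisted = buildSystem("new-cacheable-model", "You are helpful.")

console.log(listed[0]?.cache_control) // { type: 'ephemeral' }
console.log(unlisted[0]?.cache_control) // undefined
```

The unlisted model produces a valid request, so nothing errors. The only symptom is the missing breakpoint.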

Context (who is affected and when)

Affects anyone using a Claude model on the Anthropic provider whose literal id is not in the allowlist — most relevantly, every newly added Anthropic model until someone hand-edits the switch. The model is fully cacheable (supportsPromptCache: true in the model registry) but receives full-priced uncached input on every turn because the breakpoint is never emitted. On long agentic loops the system prompt + tool schema + prior turns are re-sent uncached each turn.

Reproduction steps

  1. Environment: Zoo Code with the Anthropic provider, using any Claude model that is marked supportsPromptCache: true in packages/types/src/providers/anthropic.ts but whose id is absent from the two switch statements in src/api/providers/anthropic.ts.
  2. Open src/api/providers/anthropic.ts at createMessage. The outer switch (modelId) (~line 91) is the only place that builds the request with cache_control on the system block and the last/second-last user messages; the nested switch (modelId) (~line 161) is the only place that adds the prompt-caching beta header.
  3. Select a Claude model not listed in those two switches. The request is built by the default: branch (~line 198), which sends system: [{ text, type }] (no cache_control) and plain messages.
  4. Observe in the API usage that cache_creation_input_tokens / cache_read_input_tokens stay at 0 for that model — caching never engages.
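Step 4 can be checked mechanically. The sketch below assumes you can get at the `usage` object of an Anthropic Messages API response. The `UsageLike` interface and `cachingEngaged` helper are hypothetical names introduced for illustration, and the cache counters are treated as optional or null because they may be missing from some responses.

```typescript
// Hypothetical helper: reports whether a response shows any prompt-cache activity.
interface UsageLike {
  input_tokens: number
  output_tokens: number
  cache_creation_input_tokens?: number | null
  cache_read_input_tokens?: number | null
}

function cachingEngaged(usage: UsageLike): boolean {
  // Treat missing or null counters as zero.
  const written = usage.cache_creation_input_tokens ?? 0
  const read = usage.cache_read_input_tokens ?? 0
  return written > 0 || read > 0
}

// Example values are made up for illustration.
const affectedTurn: UsageLike = {
  input_tokens: 18000,
  output_tokens: 250,
  cache_creation_input_tokens: 0,
  cache_read_input_tokens: 0,
}
console.log(cachingEngaged(affectedTurn)) // false
```

For an affected model this stays `false` on every turn, including the second and later turns where a cache read would normally appear.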

This was first flagged by static analysis (CacheLint, a small linter we use for prompt-cache anti-patterns), and I then confirmed it by reading the live source on main at commit 0084cc8.

Expected result

Whether caching is applied should be driven by the model's own capability flag, ModelInfo.supportsPromptCache, which is already true for all entries in anthropicModels. Every cacheable Anthropic model should get the cache_control breakpoints and the prompt-caching beta header automatically — no hand-maintained id list.

Actual result

Caching is gated on two parallel hard-coded id allowlists. New/unlisted models fall to the default: branch and get zero caching until both switches are manually updated (as PR #555 "Add Fable 5" and the Opus 4.8 PR had to do).

Suggested fix

Replace the two switch (modelId) statements with a single capability check, e.g.:

if (info.supportsPromptCache) {
  // attach cache_control to the system block + last/second-last user message
  // and push the "prompt-caching-2024-07-31" beta header
} else {
  // existing default: plain system + messages, no cache_control
}

info is already destructured from this.getModel(), so no new plumbing is needed, and the change keeps behavior identical for all currently-listed models while letting future models inherit caching automatically. Happy to verify against a specific model id if helpful.
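A slightly fuller, self-contained sketch of that capability check follows. It is an illustration under assumptions, not a drop-in patch: `ModelInfoLike`, `Message`, `BuiltRequest` and `buildRequest` are hypothetical, trimmed-down stand-ins for the real types and for the body of createMessage, and message content is simplified to text blocks only.

```typescript
// Hypothetical, trimmed-down shapes; the real ModelInfo and request types have more fields.
interface ModelInfoLike { supportsPromptCache: boolean }
type CacheControl = { type: "ephemeral" }
type TextBlock = { type: "text"; text: string; cache_control?: CacheControl }
interface Message { role: "user" | "assistant"; content: TextBlock[] }
interface BuiltRequest { system: TextBlock[]; messages: Message[]; betas: string[] }

const EPHEMERAL: CacheControl = { type: "ephemeral" }

function buildRequest(info: ModelInfoLike, systemPrompt: string, messages: Message[]): BuiltRequest {
  if (!info.supportsPromptCache) {
    // Same as the existing default: branch, i.e. plain system + messages.
    return { system: [{ type: "text", text: systemPrompt }], messages, betas: [] }
  }

  // Find the last and second-last user messages.
  const userIndices = messages
    .map((m, i) => (m.role === "user" ? i : -1))
    .filter((i) => i >= 0)
  const targets = new Set(userIndices.slice(-2))

  // Mark the final content block of each target message, without mutating the input.
  const cachedMessages = messages.map((m, i): Message => {
    if (!targets.has(i) || m.content.length === 0) return m
    const lastBlock = m.content.length - 1
    return {
      ...m,
      content: m.content.map((b, j): TextBlock =>
        j === lastBlock ? { ...b, cache_control: EPHEMERAL } : b,
      ),
    }
  })

  return {
    system: [{ type: "text", text: systemPrompt, cache_control: EPHEMERAL }],
    messages: cachedMessages,
    betas: ["prompt-caching-2024-07-31"],
  }
}
```

Because the branch depends only on `info.supportsPromptCache`, a newly registered model with the flag set would take the caching path without any edit to the provider.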

App Version

main @ 0084cc8 (latest at time of writing)

API Provider (optional)

Anthropic
