Cache local image summaries #244

@alfozan

Problem

Repeated local image summary runs do not appear to use the summary cache. A second run with the same local image, prompt options, and model path can invoke the model again instead of returning the cached summary.

This is separate from #240 / #241. That issue/PR is about transcript cache keying for local audio/video extraction. This one is about summary caching for binary image attachments.

Repro

Use any local JPG/PNG and a CLI model path that makes repeated calls observable:

CODEX_PATH=/path/to/codex summarize ./fixture.jpg --length short --plain --verbose
CODEX_PATH=/path/to/codex summarize ./fixture.jpg --length short --plain --verbose

Expected:

  • First run writes a summary cache entry.
  • Second run reports or behaves as a summary cache hit.
  • The model/provider is not invoked again for the identical local image and options.

Actual:

  • The repeated run does not hit the summary cache.
  • The image is summarized again.

Likely cause

The asset summary path gates summary cache reads/writes on both contentHash and promptHash:

  • src/run/flows/asset/summary.ts computes contentHash = buildPromptContentHash({ prompt: promptText }).
  • It only reads the summary cache when cacheStore && contentHash && promptHash.
  • It only writes the summary cache under the same condition.

For image attachments, prepareAssetPrompt() builds a normal file summary prompt and attaches the image bytes separately. The prompt's <content> block is empty because the content is binary, not inline text. That makes buildPromptContentHash() return null, so the summary cache block is skipped entirely.
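
To make the failure mode concrete, here is a minimal sketch of the gate described above. It is not the project's code: `contentHashFromPrompt` is a hypothetical stand-in for `buildPromptContentHash()`, and it assumes the hash is derived from the text inside the prompt's `<content>` block and is `null` when that block is empty. `canUseSummaryCache` is likewise an illustrative name for the `cacheStore && contentHash && promptHash` condition.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in for buildPromptContentHash(). The real function lives
// in src/cache-keys.ts and may differ; only the "null for empty content"
// behavior described in this issue is modeled here.
function contentHashFromPrompt(prompt: string): string | null {
  const match = /<content>([\s\S]*?)<\/content>/.exec(prompt);
  const content = match?.[1]?.trim() ?? "";
  if (content.length === 0) return null;
  return createHash("sha256").update(content).digest("hex");
}

// Mirrors the gate described above: all three values must be truthy before
// the summary cache is read or written.
function canUseSummaryCache(
  cacheStore: object | null,
  contentHash: string | null,
  promptHash: string | null,
): boolean {
  return Boolean(cacheStore && contentHash && promptHash);
}

const textPrompt = "Summarize this file.\n<content>hello world</content>";
// For an image, the bytes are attached separately, so <content> stays empty.
const imagePrompt = "Summarize this file.\n<content></content>";

const store = {};
console.log(canUseSummaryCache(store, contentHashFromPrompt(textPrompt), "prompt-hash")); // true
console.log(canUseSummaryCache(store, contentHashFromPrompt(imagePrompt), "prompt-hash")); // false
```

Under these assumptions, the image prompt can never pass the gate, whatever the cache store contains.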

Relevant code paths:

  • src/run/flows/asset/preprocess.ts: image attachments are passed as binary attachments.
  • packages/core/src/prompts/file.ts: buildFileSummaryPrompt() uses content: "".
  • src/cache-keys.ts: buildPromptContentHash() returns null for empty content.
  • src/run/flows/asset/summary.ts: summary cache lookup/write require contentHash.

Suggested direction

For binary attachments, the summary cache needs a stable content identity that is not derived from inline prompt content. For local images, that could include a hash of the image bytes, or a file identity that includes the local file URL/path plus size/mtime. Byte hashing is likely more robust across path changes, while file metadata is cheaper but can miss same-content moves/copies.
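
The trade-off between the two identity options can be sketched as follows. Both helpers, `bytesIdentity` and `metadataIdentity`, are hypothetical names introduced for illustration, and SHA-256 is an assumed choice of hash, not something the codebase is known to use here.

```typescript
import { createHash } from "node:crypto";
import { copyFileSync, mkdtempSync, readFileSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Option A (hypothetical helper): identity from the file's bytes.
// Survives moves and copies, but reads the whole file.
function bytesIdentity(path: string): string {
  return createHash("sha256").update(readFileSync(path)).digest("hex");
}

// Option B (hypothetical helper): identity from path + size + mtime.
// Only needs a stat call, but a copy at another path gets a different identity.
function metadataIdentity(path: string): string {
  const { size, mtimeMs } = statSync(path);
  return createHash("sha256").update(`${path}\n${size}\n${mtimeMs}`).digest("hex");
}

// Demo: the same bytes stored at two different paths.
const dir = mkdtempSync(join(tmpdir(), "summary-cache-demo-"));
const original = join(dir, "fixture.jpg");
const copy = join(dir, "fixture-copy.jpg");
writeFileSync(original, Buffer.from([0xff, 0xd8, 0xff, 0xe0]));
copyFileSync(original, copy);

console.log(bytesIdentity(original) === bytesIdentity(copy));       // true
console.log(metadataIdentity(original) === metadataIdentity(copy)); // false
```

With byte hashing, the copy would be a cache hit; with path-based metadata, it would be summarized again.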

The cache key should still include the existing prompt hash, model, length, language, and cache format version so prompt/options/model changes invalidate correctly.
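
One way to combine these inputs is sketched below. The `SummaryCacheKeyParts` interface, its field names, and `buildSummaryCacheKey` are assumptions made for illustration; the project's existing key builder may be shaped differently.

```typescript
import { createHash } from "node:crypto";

// Hypothetical key shape; field names are illustrative, not the project's types.
interface SummaryCacheKeyParts {
  contentHash: string; // for binary attachments, e.g. a hash of the image bytes
  promptHash: string;
  model: string;
  length: string;
  language: string;
  formatVersion: number;
}

// Serializes the parts as a JSON array in a fixed order, so the key is stable
// and adjacent fields cannot run together, then hashes the result.
function buildSummaryCacheKey(parts: SummaryCacheKeyParts): string {
  const payload = JSON.stringify([
    parts.formatVersion,
    parts.contentHash,
    parts.promptHash,
    parts.model,
    parts.length,
    parts.language,
  ]);
  return createHash("sha256").update(payload).digest("hex");
}

const baseParts: SummaryCacheKeyParts = {
  contentHash: "image-bytes-hash",
  promptHash: "prompt-hash",
  model: "example-model",
  length: "short",
  language: "en",
  formatVersion: 1,
};

const sameAgain = buildSummaryCacheKey({ ...baseParts });
const otherLength = buildSummaryCacheKey({ ...baseParts, length: "long" });

console.log(buildSummaryCacheKey(baseParts) === sameAgain);   // true
console.log(buildSummaryCacheKey(baseParts) === otherLength); // false
```

Because every field feeds the hash, changing any one of them (including the format version) produces a different key, so stale entries are never returned for new options.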

Metadata

    Labels

      • P2 - Normal priority bug or improvement with limited blast radius.
      • clawsweeper:fix-shape-clear - ClawSweeper found a clear likely implementation shape for this issue.
      • clawsweeper:queueable-fix - ClawSweeper marked this issue as an existing queue_fix_pr work candidate.
      • clawsweeper:source-repro - ClawSweeper found a high-confidence source-level issue reproduction.
      • impact:other - This issue has meaningful maintainer-visible impact outside the owned taxonomy.
      • issue-rating: 🦞 diamond lobster - Very strong issue quality with high-confidence source-level or clear reproduction.
