Skip to content

Fix Codex image arg order#243

Open
alfozan wants to merge 1 commit into
steipete:mainfrom
alfozan:alfozan/fix-codex-image-arg-order
Open

Fix Codex image arg order#243
alfozan wants to merge 1 commit into
steipete:mainfrom
alfozan:alfozan/fix-codex-image-arg-order

Conversation

@alfozan
Copy link
Copy Markdown

@alfozan alfozan commented Jun 8, 2026

Summary

  • put the Codex exec subcommand before provider-specific extra args
  • keep image args like -i <image> attached to codex exec instead of the top-level Codex command
  • add regression coverage for the -i ordering case

Fixes #242.

Why

Local image summaries can currently build this argv when the Codex backend is selected:

codex -i <image.jpg> exec --output-last-message ...

Codex's -i/--image <FILE>... option is variadic, so exec can be consumed as another image argument. The remaining --output-last-message flag is then parsed at the wrong command level and Codex exits before the summary starts.

This is adjacent to the Codex argument construction discussed in #215, but that issue fixed isolation/context loading and left the codexExtraArgs/exec order unchanged.

Repo example

This came up from a repo workspace with a local WhatsApp media image. Any local JPEG should reproduce the same command shape when the Codex backend is selected:

summarize /path/to/WhatsApp-Assistant/profiles/global-equities-testing/media/3AFF789EBAE19B8737AB/0-example.jpg --length short --plain

Before this PR, summarize can invoke:

codex -i /path/to/image.jpg exec --output-last-message /tmp/.../last-message.txt --skip-git-repo-check --json -m gpt-5.5

and Codex reports:

error: unexpected argument '--output-last-message' found

After this PR, the command is assembled as:

codex exec -i /path/to/image.jpg --output-last-message /tmp/.../last-message.txt --skip-git-repo-check --json -m gpt-5.5

Validation

  • pnpm test tests/llm.cli.regressions.test.ts tests/llm.cli.test.ts
  • pnpm exec oxfmt --check src/llm/cli.ts tests/llm.cli.regressions.test.ts
  • git diff --check

- Put the Codex exec subcommand before provider extra args so image flags belong to codex exec.\n- Add regression coverage for Codex image args so -i cannot consume exec.\n\nFixes steipete#242.
@clawsweeper
Copy link
Copy Markdown

clawsweeper Bot commented Jun 8, 2026

Codex review: needs real behavior proof before merge. Reviewed June 8, 2026, 1:34 AM ET / 05:34 UTC.

Summary
The PR moves Codex provider extra args to follow codex exec and adds a regression test for image -i argument ordering.

Reproducibility: yes. from source inspection: current main passes image -i args before exec, producing the failing command shape described in the linked report. I did not run a live Codex image summary in this read-only review.

Review metrics: 1 noteworthy metric.

  • Patch surface: 2 files affected; 28 additions and 1 deletion. The diff is small enough that review can focus on Codex argv ordering and the new regression coverage.

Merge readiness
Overall: 🦪 silver shellfish
Proof: 🦪 silver shellfish
Patch quality: 🐚 platinum hermit
Result: blocked until real behavior proof from a real setup is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P1] Add redacted terminal output or a terminal screenshot from a real local image summary using the Codex backend after the fix.

Proof guidance:

  • [P1] Needs real behavior proof before merge: The PR has mocked/unit regression validation but no after-fix live terminal output, logs, screenshot, recording, or linked artifact from a real Codex image summary. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Risk before merge

  • [P1] The PR body does not yet include after-fix proof from a real Codex image-summary run; it only describes validation commands and adds mocked regression coverage.

Maintainer options:

  1. Decide the mitigation before merge
    Land the narrow argv reorder with regression coverage once a redacted real Codex image-summary run demonstrates the fixed command succeeds.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

  • [P1] Contributor or maintainer follow-up is needed for real behavior proof; no ClawSweeper code repair is identified.

Security
Cleared: The diff only reorders local Codex argv construction and adds a test; it does not touch dependency, CI, secrets, publishing, or code-download paths.

Review details

Best possible solution:

Land the narrow argv reorder with regression coverage once a redacted real Codex image-summary run demonstrates the fixed command succeeds.

Do we have a high-confidence way to reproduce the issue?

Yes from source inspection: current main passes image -i args before exec, producing the failing command shape described in the linked report. I did not run a live Codex image summary in this read-only review.

Is this the best way to solve the issue?

Yes, the proposed reorder is the narrow maintainable fix and matches the codex exec help surface. The remaining merge gate is real behavior proof from an after-fix run.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 769d01f0898f.

Label changes

Label changes:

  • add P2: This is a normal-priority CLI bug fix for Codex local image summaries with limited blast radius.
  • add rating: 🦪 silver shellfish: Overall readiness is 🦪 silver shellfish; proof is 🦪 silver shellfish and patch quality is 🐚 platinum hermit.
  • add status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR has mocked/unit regression validation but no after-fix live terminal output, logs, screenshot, recording, or linked artifact from a real Codex image summary. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Label justifications:

  • P2: This is a normal-priority CLI bug fix for Codex local image summaries with limited blast radius.
  • rating: 🦪 silver shellfish: Overall readiness is 🦪 silver shellfish; proof is 🦪 silver shellfish and patch quality is 🐚 platinum hermit.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR has mocked/unit regression validation but no after-fix live terminal output, logs, screenshot, recording, or linked artifact from a real Codex image summary. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.
Evidence reviewed

What I checked:

  • Repository policy read: AGENTS.md was read fully and confirms this summarize checkout is the target source tree, with read-only/vendor and monorepo guidance relevant to review scope. (AGENTS.md:1, 769d01f0898f)
  • Current main still builds the old Codex order: On current main, the Codex path pushes codexExtraArgs before pushing the exec subcommand. (src/llm/cli.ts:307, 769d01f0898f)
  • Image summaries provide Codex -i extra args: The asset CLI context supplies codex: ["-i", filePath] for image attachments, so current main can assemble codex -i <image> exec .... (src/run/flows/asset/summary-attempts.ts:134, 769d01f0898f)
  • Docs describe Codex image handling under exec: The CLI docs describe Codex attachments as codex exec --output-last-message ... plus -i <image> for images. (docs/cli.md:101, 769d01f0898f)
  • Local Codex help supports the PR direction: codex exec --help lists both -i/--image and --output-last-message, while top-level codex --help does not list --output-last-message.
  • PR diff is focused: The supplied PR context shows one source ordering change in src/llm/cli.ts and one regression test covering exec before -i. (src/llm/cli.ts:307, e5f24a4ae624)

Likely related people:

  • Peter Steinberger: Local blame attributes the current Codex CLI provider block and image attachment context to the v0.16.3 release commit; the checkout history is grafted, so this is current-main provenance rather than a deep authorship trail. (role: current-main feature provenance; confidence: medium; commits: fcf8c8e5e98d; files: src/llm/cli.ts, src/run/flows/asset/summary-attempts.ts, docs/cli.md)
  • Mykhailo: Recently touched the shared CLI provider implementation and CLI docs while adding another provider, so they may have useful context for provider argument ordering even though the commit is not Codex-specific. (role: recent adjacent CLI contributor; confidence: low; commits: 821e76613ded; files: src/llm/cli.ts, docs/cli.md)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal priority bug or improvement with limited blast radius. labels Jun 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

P2 Normal priority bug or improvement with limited blast radius. rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Codex image summaries fail because -i consumes exec

1 participant