v1.5: claude-build (Codex orchestrates, Claude executes) + Opus default + Windows status-reader fix by alanshurafa · Pull Request #37 · alanshurafa/co-evolution

alanshurafa · 2026-06-13T18:42:07Z

Three changes stack here, smallest blast radius first.

What's in it

Status reader works under native Windows jq (733972a)
dev-review-status.sh --json failed only on Windows: native jq.exe can't open Git Bash process-substitution paths (/proc/<pid>/fd/N). Swapped --slurpfile <(jq ...) for --argjson "$(jq -c ...)". Every other --slurpfile in the repo reads a real file, so this was the only offender.
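
A minimal sketch of the swap described above, assuming `jq` is on the PATH. The state document and field names are hypothetical and are not taken from dev-review-status.sh; only the two jq options come from the PR.

```shell
# Illustrative input; the real script reads its own state.json.
state='{"status":"pending","phases":[{"name":"plan"},{"name":"execute"}]}'

# Before (breaks under native jq.exe in Git Bash, because <(...) expands to a
# /proc/<pid>/fd/N path that a native Windows binary cannot open):
#   jq -n --slurpfile phases <(printf '%s' "$state" | jq '.phases') '$phases[0] | length'

# After: run the inner jq first and hand its output over as a JSON argument.
phases_json=$(printf '%s' "$state" | jq -c '.phases')
summary=$(jq -cn --argjson phases "$phases_json" '{phase_count: ($phases | length)}')
printf '%s\n' "$summary"   # prints {"phase_count":2}
```

One difference to watch for when making this kind of swap: `--slurpfile` binds the variable to an array of the values read from the file, while `--argjson` binds it to the parsed value itself, so a filter written for the former may need a `[0]` index removed.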

codex-build defaults to Opus, not Fable (e3d7fdc)
Fable access was lost, so the preset's two Claude seats pointed at an unreachable model. They now default through a best/opus alias (→ claude-opus-4-8), so the next model bump is a one-line edit in resolve_claude_model_alias. The cross-agent leak guard learned to drop best|opus from codex seats, and the docs-sync + preset-expansion drift guards moved in lockstep. A session-follow opt-in is documented for callers who want the seats to match their session model.
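
To illustrate why the alias makes a model bump a one-line edit, here is a hedged sketch of what such a resolver could look like. The function name and the alias targets appear in this PR; the body is an assumption, and the real resolve_claude_model_alias may be structured differently.

```shell
# Hypothetical body for an alias resolver: friendly names map to concrete
# model IDs, and anything unrecognised passes through unchanged.
resolve_claude_model_alias() {
  case "$1" in
    best|opus) printf '%s\n' "claude-opus-4-8" ;;  # the single line to edit on a model bump
    fable)     printf '%s\n' "claude-fable-5" ;;
    *)         printf '%s\n' "$1" ;;               # passthrough for explicit model IDs
  esac
}

resolve_claude_model_alias best              # prints claude-opus-4-8
resolve_claude_model_alias claude-opus-4-6   # prints claude-opus-4-6
```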

claude-build — the mirror of codex-build (7934bcd)
A Codex-hosted orchestrator: Codex plans and reviews while Claude executes the build. The reverse ladder already worked via role flags; this names it as a preset (Codex plans/reviews at xhigh model-unpinned, Claude executes at best/high) and adds the Codex-facing protocol doc dev-review/codex/claude-build.md. Because Codex has no harness wake-on-exit, the orchestration is synchronous; a detached+polling variant is documented as future design only. The Claude-executor path hard-requires claude auth (no degrade — degrading the executor to Codex is just codex-build).

This feature was itself built through codex-build: Codex executed the plan in an isolated worktree, reviewed in-session.

Verification

Full suite green locally on Windows (25/25). This PR exists to run the same suite on the macOS (BSD / bash 3.2) and Ubuntu legs of the CI matrix — macOS is the GNU-ism canary. Static audit of the new code found no portability hazards.

🤖 Generated with Claude Code

…ier seat now real (v1.5 Phase 6) Found by live dogfood: (1) --preset codex-build --verifier codex leaked VERIFIER_MODEL=fable into CODEX_MODEL (ChatGPT plan rejects with HTTP 400); apply_seat_env now drops a cross-agent-kind model+effort pair as a unit. (2) review-verdict.json lacked additionalProperties:false on issues.items and complete required lists (OpenAI strict structured output) — the codex --output-schema verify path could never have produced a verdict. Proven E2E: real run exits 0 with verdict APPROVED@96, verify tokens captured. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
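
The first fix above, dropping a cross-agent-kind model and effort together, can be sketched roughly as follows. Both helper names are hypothetical, and the real apply_seat_env works on the runner's own seat variables rather than on arguments.

```shell
# Hypothetical classifier: succeeds when a model name belongs to Claude.
is_claude_model() {
  case "$1" in
    best|opus|fable|claude-*) return 0 ;;
    *) return 1 ;;
  esac
}

# Sketch of the guard for a Codex seat: if the seat inherited a Claude model,
# clear the model and its effort as a unit so Codex uses its own defaults.
sanitize_codex_seat() {
  seat_model=$1
  seat_effort=$2
  if is_claude_model "$seat_model"; then
    seat_model=""
    seat_effort=""
  fi
  printf 'model=%s effort=%s\n' "$seat_model" "$seat_effort"
}

sanitize_codex_seat fable high   # prints model= effort=
sanitize_codex_seat "" xhigh     # prints model= effort=xhigh
```

Clearing the pair together avoids sending an effort level that was chosen for a model the seat no longer uses.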

Native jq.exe cannot open Git Bash process-substitution paths (/proc/<pid>/fd/N), so dev-review-status.sh --json failed only on Windows. Replace --slurpfile <(jq ...) with --argjson "$(jq -c ...)". Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Fable access was lost, so the codex-build preset pointed its two Claude seats (plan + review) at an unreachable model. Default them through a best/opus alias (-> claude-opus-4-8) so a future model bump is one line in resolve_claude_model_alias. Extend the cross-agent leak guard to drop best|opus from codex seats, document a session-follow opt-in, and move the docs-sync + preset-expansion drift guards in lockstep. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Mirror of codex-build: a Codex-hosted orchestrator that plans and reviews while Claude executes the build. Adds the claude-build preset arm (Codex plans/reviews at xhigh, model unpinned; Claude executes at best/high), help + unknown-preset text, the Codex-facing protocol doc dev-review/codex/claude-build.md (synchronous gate; HARD-requires claude auth since Claude is the executor; detached+polling documented as future design only), routing + AGENTS contract, and drift-guard scenarios pinning both preset triples. Built via codex-build (Codex executed, reviewed in-session). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7934bcdb2b

chatgpt-codex-connector · 2026-06-13T18:48:24Z

  write_state_field "$STATE_JSON" ".completed_at" "string" "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
+  # v1.5 Phase 3: plan-only runs also reach a clean terminal — clear the in-flight
+  # phase so a status reader does not see a stale compose/bounce as "still running".
+  write_state_field "$STATE_JSON" ".current_phase" "null"


Mark plan-only runs terminal in state

In the --plan-only exit path, this records completed_at and clears current_phase but leaves the initial .status as pending. I verified --skip-plan --plan ... --plan-only exits 0 with state.status still "pending"; dev-review-status.sh --json then treats the finished run as non-terminal/dead and exits 4. Set .status based on PLAN_EXIT before cleanup so completed plan-only runs are reported as completed/partial instead of presumed dead.
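
A rough sketch of the suggested change, under stated assumptions: write_state_field is stubbed here so the snippet runs without a state file, plan_only_status is a hypothetical helper, and the mapping from exit code to status (0 is completed, anything else is partial) is a guess at what the reviewer intends.

```shell
# Stub standing in for the repo's write_state_field; it only prints the
# field, type, and value it was asked to write.
write_state_field() {
  printf '%s %s %s\n' "$2" "$3" "${4-}"
}

# Assumed mapping from the plan phase's exit code to a terminal status.
plan_only_status() {
  if [ "$1" -eq 0 ]; then
    printf 'completed\n'
  else
    printf 'partial\n'
  fi
}

STATE_JSON="state.json"   # placeholder path
PLAN_EXIT=0               # placeholder exit code from the plan phase

# Record a terminal status before clearing the in-flight phase.
write_state_field "$STATE_JSON" ".status" "string" "$(plan_only_status "$PLAN_EXIT")"
write_state_field "$STATE_JSON" ".current_phase" "null"
```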

chatgpt-codex-connector · 2026-06-13T18:48:24Z

+    --claude-model)
+      [[ $# -gt 1 ]] || die "--claude-model requires a value"
+      # v1.5: resolve friendly alias (fable -> claude-fable-5) else passthrough.
+      CLAUDE_MODEL=$(resolve_claude_model_alias "$2")


Let --claude-model override preset seats

When a preset has already set a Claude seat model such as COMPOSER_MODEL, EXECUTOR_MODEL, or VERIFIER_MODEL to best, this flag updates only the global CLAUDE_MODEL; apply_seat_env later prefers the seat-specific model, so --preset claude-build --claude-model claude-opus-4-6 still invokes Claude with --model claude-opus-4-8 (reproduced with stubs: seat_models.executor stayed opus:claude-opus-4-8@high). That contradicts the new last-wins preset help and prevents callers from pinning Claude via the documented flag; update or clear the active Claude seat overrides here.
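
One way to get the last-wins behaviour the comment asks for is sketched below. The `*_MODEL` variable names come from the review comment; the `*_AGENT` variables and the function name are hypothetical, and the real parser may track seat ownership differently.

```shell
# Seat state as a preset such as claude-build might leave it (illustrative).
COMPOSER_AGENT=codex;  COMPOSER_MODEL=""
EXECUTOR_AGENT=claude; EXECUTOR_MODEL="claude-opus-4-8"
VERIFIER_AGENT=codex;  VERIFIER_MODEL=""

# Sketch of a last-wins override: set the global model and repoint every seat
# currently assigned to Claude, leaving Codex seats untouched.
apply_claude_model_override() {
  CLAUDE_MODEL=$1
  if [ "$COMPOSER_AGENT" = claude ]; then COMPOSER_MODEL=$1; fi
  if [ "$EXECUTOR_AGENT" = claude ]; then EXECUTOR_MODEL=$1; fi
  if [ "$VERIFIER_AGENT" = claude ]; then VERIFIER_MODEL=$1; fi
}

apply_claude_model_override "claude-opus-4-6"
printf 'executor=%s composer=%s\n' "$EXECUTOR_MODEL" "${COMPOSER_MODEL:-unset}"
# prints executor=claude-opus-4-6 composer=unset
```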

The execute/verify auth gate scanned both the output and the full stderr log for auth substrings unconditionally, so any run whose working log echoed auth strings — plan text, or the auth-detection source itself — was misread as "authentication failed" and aborted with no verdict (observed building claude-build via codex-build). Mirror lib's robust validate_agent_artifact discriminator: an auth banner in the output counts only when the output is short (<50 words), and a banner in stderr counts only when the output is empty (a real work product means the CLI authenticated). Adds reliability sim S5a-S5d covering the regression. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
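
The discriminator described in this commit message can be sketched as follows. The banner pattern and the function name are hypothetical; the 50-word threshold and the empty-output rule are taken from the message, and the real validate_agent_artifact may differ in detail.

```shell
# Hypothetical banner pattern; the real detector matches its own strings.
AUTH_PATTERN='authentication failed|not logged in'

# Returns 0 when the run looks like an auth failure, 1 otherwise.
looks_like_auth_failure() {
  output=$1
  stderr_log=$2
  words=$(printf '%s' "$output" | wc -w | tr -d ' ')
  if [ "$words" -eq 0 ]; then
    # No work product at all: a banner in stderr is credible.
    if printf '%s' "$stderr_log" | grep -Eiq "$AUTH_PATTERN"; then return 0; fi
    return 1
  fi
  if [ "$words" -lt 50 ]; then
    # Short output: a banner here is probably the CLI refusing to run.
    if printf '%s' "$output" | grep -Eiq "$AUTH_PATTERN"; then return 0; fi
  fi
  # Long output, or non-empty output with no banner: treat as real work.
  return 1
}

if looks_like_auth_failure "" "Error: authentication failed"; then
  echo "auth failure detected"
fi
```

With this shape, a plan or log that merely quotes an auth string is not misread, because stderr is consulted only when the output is empty and the output is scanned only when it is short.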

alanshurafa and others added 11 commits June 12, 2026 12:40

docs(planning): register v1.5 'Build with Codex'; Phase 0 env + R1/R2 research notes (b672145)
feat: per-seat model/effort seats + B1/B2/B3 env-export fixes (v1.5 Phase 1) (fb6862a)
feat: claude-verifier verdict hardening + --preset codex-build (v1.5 Phase 2) (ffe765f)
feat: runner observability — current_phase, runner_pid, shas, lineage + status reader (v1.5 Phase 3) (13c2bee)
feat: gated per-phase token capture into state.json (v1.5 Phase 4) (fb965ad)
feat: /codex-build orchestration skill + routing docs, both transports (v1.5 Phase 5) (30a9a00)
docs(planning): v1.5 Phase 6 — mcp parity green, dogfood ACCEPT+ESCALATE live, token evidence note (2c4d229)

Each commit above carries the trailer Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>.

chatgpt-codex-connector Bot reviewed Jun 13, 2026

alanshurafa merged commit 14b684a into master Jun 13, 2026
6 checks passed

alanshurafa deleted the feat/v1.5-codex-build branch June 13, 2026 22:05

alanshurafa merged 12 commits into master from feat/v1.5-codex-build
