docs: codex-pivot spec rev 2 — dual-surface (App + Action), shadow-mode phase #278

Merged
cbeaulieu-gt merged 4 commits into main from docs/codex-pivot-spec-v2 on May 20, 2026

@cbeaulieu-gt
Member

Summary

Rev 2 of the Codex pivot spec, incorporating:

  • Spike #275 (verify Codex GitHub App coverage on org-owned repos under subscription) findings: the Codex GitHub App works on org-owned repos under a ChatGPT subscription; context-aware reading was verified end-to-end (chatgpt-codex-connector[bot] ran git ls-files | rg in its sandbox to validate a claim).
  • Project-reviewer feedback on v1 (4 BLOCKING + 5 CONCERN + 3 NIT): each finding is either addressed or explicitly carried forward as unverified: with a resolution path.

Architectural change v1 → v2

Dual-surface architecture (was: single-surface API-billed migration):

  • pr-review retires entirely. No replacement workflow file in repo. Codex GitHub App handles it cloud-side under subscription. Zero YAML, zero composite action, zero API billing for the review surface.
  • apply-fix, lint-failure, ci-failure, tag-respond migrate to openai/codex-action@v1.8 with OPENAI_API_KEY. API-billed (low-volume since failure-/comment-triggered).
  • Verb router (tag-claude/, claude-command-router/, check-auth/) collapses entirely — the Codex App handles @codex review and @codex address that feedback natively.

New phase: Shadow mode

Mandatory shadow-mode evaluation between App enablement and Claude pr-review retirement. Written kill criteria (≥7 days OR ≥30 PRs window; revert if Codex misses a BLOCKING finding Claude caught, or false-positive rate >2× Claude's, or latency >30 min on >20% of PRs). Addresses the BLOCKING-#3 transition-window finding without forcing a synthetic-stimulus decision.
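
As a rough illustration of how the three revert conditions above could be checked against logged shadow-mode data, here is a minimal Python sketch. The `PrObservation` record and the `revert_reasons` helper are hypothetical names, not part of the spec. Comparing false-positive totals over the same set of PRs is an assumed stand-in for "false-positive rate", and the minimum-window check (days or PR count) is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PrObservation:
    """One PR observed in shadow mode (hypothetical record shape)."""
    codex_missed_blocking: bool    # Claude caught a BLOCKING finding that Codex missed
    codex_false_positives: int     # false positives raised by Codex on this PR
    claude_false_positives: int    # false positives raised by Claude on this PR
    codex_latency_minutes: float   # time until the Codex review arrived


def revert_reasons(observations: List[PrObservation]) -> List[str]:
    """Return the kill criteria that fired; an empty list means no revert."""
    reasons: List[str] = []
    if not observations:
        return reasons

    # Criterion 1: any BLOCKING finding that Claude caught and Codex missed.
    if any(o.codex_missed_blocking for o in observations):
        reasons.append("missed-blocking")

    # Criterion 2: Codex false positives exceed 2x Claude's.
    # Assumption: both reviewers saw the same PRs, so totals compare like rates.
    codex_fp = sum(o.codex_false_positives for o in observations)
    claude_fp = sum(o.claude_false_positives for o in observations)
    if codex_fp > 2 * claude_fp:
        reasons.append("false-positive-rate")

    # Criterion 3: latency above 30 minutes on more than 20% of PRs.
    slow = sum(1 for o in observations if o.codex_latency_minutes > 30)
    if slow / len(observations) > 0.20:
        reasons.append("latency")

    return reasons
```

Returning the list of fired criteria rather than a bare boolean keeps the revert decision auditable, which fits the "written kill criteria" framing.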

Quality gate replacement

New codex-gate.yml filters on chatgpt-codex-connector[bot] review state != CHANGES_REQUESTED. Replaces both claude-pr-review/quality-gate and its shadow variant. No severity-regex parsing — escapes the #271 trap.
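
The gate decision described above can be sketched in Python. This is not the spec's codex-gate.yml draft, which is not reproduced in this conversation. The input is assumed to be a list of review objects shaped like the GitHub REST API's "list reviews for a pull request" response (`user.login`, `state`, `submitted_at`). Using only the bot's most recent review, and the `pass_when_no_review` default, are assumptions made for illustration.

```python
from typing import Any, Dict, List

BOT_LOGIN = "chatgpt-codex-connector[bot]"


def gate_passes(reviews: List[Dict[str, Any]], pass_when_no_review: bool = False) -> bool:
    """Pass unless the Codex bot's latest review state is CHANGES_REQUESTED."""
    # Keep only submitted reviews authored by the Codex bot.
    bot_reviews = [
        r for r in reviews
        if (r.get("user") or {}).get("login") == BOT_LOGIN and r.get("submitted_at")
    ]
    if not bot_reviews:
        # Assumption: behavior without any bot review is a policy choice.
        return pass_when_no_review

    # ISO 8601 UTC timestamps sort correctly as strings.
    latest = max(bot_reviews, key=lambda r: r["submitted_at"])
    return latest["state"] != "CHANGES_REQUESTED"
```

Because the check reads the review state directly, it needs no parsing of severity labels out of comment text.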

Open items

Spec carries 5 unverified: claims to be resolved during shadow mode or post-cutover measurement (monthly API spend, severity-to-state mapping, synchronize-event behavior, finding-density delta, external-consumer audit). 5 new gating unknowns (§13) flagged for user decision before sub-issues are filed.

Review path

Per harness rule (project-planner spec output → project-reviewer auto-fires), I'll dispatch the reviewer after this PR is up. If there are significant findings, I'll raise the option of an inquisitor pass with the user.

Refs

Closes — (none yet; spec lands first, sub-issues file after review)
Refs #273 (epic), #275 (spike findings incorporated), #277 (research doc — companion deliverable)

🤖 Generated by Claude Code on behalf of @cbeaulieu-gt

Rev 2 incorporates spike #275 findings and the project-reviewer feedback
on the conceptual v1. Architectural change: pr-review retires entirely
(handled by the Codex GitHub App under subscription, zero in-repo YAML);
the four write-side workflows migrate to openai/codex-action with API
billing; the verb router collapses entirely since the App handles
@codex review / @codex address feedback natively. Adds a mandatory
shadow-mode phase with written kill criteria between App enablement
and Claude pr-review retirement. Quality gate becomes a thin workflow
filtered on the Codex bot's review state, escaping the severity-regex
trap from #271.

Refs #273, #275, #277
cbeaulieu-gt added this to the codex-pivot milestone May 20, 2026
@chatgpt-codex-connector

To use Codex here, create an environment for this repo.

@claude-action-runner
Contributor

claude-action-runner Bot commented May 20, 2026

Claude finished @cbeaulieu-gt's task in 2m 13s


PR Review Complete

Reviewed the incremental changes in this push (640e20a..243a4ee).

Analysis

This push refines the timing model for sub-issue #M (external consumer audit) and resolves Question 3 from §13. The changes address a critical sequencing concern: decoupling audit completion from the OAuth deadline while still guaranteeing consumers receive 30 days' notice before GHCR image deletion.

Changes in this push:

  1. Sub-issue #M expansion (lines 587-593): Clarifies that audit completion gates the start of the 30-day GHCR deletion grace window
  2. Question 3 resolution (lines 643-650): Marks the timing decision as resolved with explicit semantics

Findings

🟢 Medium - Potential ambiguity in "in parallel with #L" phrasing

Location: docs/superpowers/specs/2026-05-20-codex-pivot.md:588-589

Issue: The phrase "Survey post-release in parallel with #L (GHCR-deletion clock prep)" may be ambiguous. Since the spec now requires #M to complete before the 30-day window starts, and #L happens after that window, these two sub-issues aren't truly "in parallel" — #M precedes #L by at least 30 days.

Current text:

- **#M — Audit external consumers pinned to `v2`.** Survey post-release in
  parallel with `#L` (GHCR-deletion clock prep). Notify each consumer...

Suggested refinement:

- **#M — Audit external consumers pinned to `v2`.** Survey post-release,
  preceding #L's 30-day grace window. Notify each consumer...

Or more explicitly:

- **#M — Audit external consumers pinned to `v2`.** Survey post-release to
  prepare for #L (GHCR deletion begins 30 days after audit completes). Notify...

Why it matters: Clear sequencing prevents misunderstandings during execution. The current phrasing could be read as "both happen around the same time," when the actual model is: #M completes → 30-day wait → #L executes.
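
The sequencing (#M completes → 30-day wait → #L executes) reduces to simple date arithmetic. A minimal Python sketch follows; the function names are hypothetical, and treating the 30-day window as calendar days counted from the audit completion date is an assumption.

```python
from datetime import date, timedelta
from typing import Optional

GRACE_DAYS = 30  # consumer notice period before GHCR image deletion


def earliest_ghcr_deletion(audit_completed_on: date) -> date:
    """First day #L may run: 30 days after the #M audit completes."""
    return audit_completed_on + timedelta(days=GRACE_DAYS)


def can_delete_ghcr_images(audit_completed_on: Optional[date], today: date) -> bool:
    """#L is blocked until #M is done and the grace window has elapsed."""
    if audit_completed_on is None:
        # The clock has not started: the audit (#M) is still open.
        return False
    return today >= earliest_ghcr_deletion(audit_completed_on)
```

Under these assumptions, an audit completed on 2026-06-01 would allow deletion from 2026-07-01 onward, regardless of when the release itself shipped.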


✅ Positive: Cross-references are accurate

  • The reference to "§9 row 12" from sub-issue #M body is correct (verified by counting rows in the sequencing table)
  • The resolution in §13 Q3 correctly points to both §9 row 12 and the #M body
  • The timing model is consistently stated across all four locations (§6 GHCR section, §9 row 12, sub-issue #M, §13 Q3)

✅ Positive: Decoupling from OAuth deadline is sound

  • Making #M non-blocking on cutover timing while still guaranteeing consumers 30 days' notice is the right architectural trade-off
  • This resolves the BLOCKING concern without creating artificial urgency on the audit
  • The model correctly prioritizes the hard deadline (OAuth EOL ~2026-06-20) over the softer constraint (external consumer notification)

Summary

The incremental changes are coherent and well-cross-referenced. The timing resolution addresses a legitimate sequencing concern raised by project-reviewer. One minor ambiguity in phrasing could be clarified, but the underlying model is sound.

Verdict: APPROVE

Claude Auto-Fix and others added 2 commits May 20, 2026 14:20
…NCERN + 2 NIT)

BLOCKING:
- examples/** and docs/consumer-onboarding.md DO exist on main (false negative
  in planner's verification step); add to touches, correct false claim,
  expand sub-issue #J's scope to cover them.
- Shadow-mode duration logic rewritten: 7 days AND >=7 PRs binding floor;
  >=30 PRs is advisory, not blocking -- closes the deadline trap.
- codex-gate.yml YAML draft hardened: empty-SHA guard + description field +
  workflow-level statuses:write permission + filter to chatgpt-codex-connector[bot].

CONCERN:
- openai/codex-action @v1 pinning policy rescoped (precondition was unsatisfiable).
- AGENTS.md sub-issue #A gains concrete acceptance test.
- New sub-issue #O gates synchronize-event verification before Claude retirement.

NIT:
- S6 GHCR deletion sequencing note added.
- S9 verb-router-retirement atomicity note added.

Refs #273, #278 (this PR), #275 (spike findings unchanged).

MAJOR:
- Shadow mode duration logic clarified (7d AND 7 PRs binding; 14d max; 30 PRs aspirational)
- #M sequencing contradiction resolved — 30d GHCR clock gated on #M completion, not v3 release

Medium:
- #O sub-issue gains explicit timing (run during #A/#B, before shadow window)

Nits:
- Empty-SHA error message includes recovery hint
- Redundant if-condition retained with explanatory comment
- §8 observation methodology specifies structured PR-by-PR logging schema
- #J acceptance test method specifies throwaway consumer-test repo
- BLOCKING-#4 verification step now shows the git ls-tree command + 2026-05-20 date

Findings from claude-action-runner[bot] review of e062774..07a08f0.

Refs #273, #278

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions

⚠️ Automated PR review incomplete

Claude hit the turn limit before finishing this review. The PR may be too large or complex for the current settings.

What happened: The review was cut off at 30 turns (PR has 1 file changed).

Options:

  • Re-run the workflow with a higher max_turns input
  • Break the PR into smaller, focused pieces
  • Trigger a manual review with @claude review this PR

The fix-up in 640e20a applied Option B to §9 row 12 (conditional
30-day clock) but left two references with Option-A "before-release"
framing: #M's body and §13 Q3. This commit propagates Option B to
both — #M runs post-release in parallel with #L; the GHCR-deletion
clock is gated on #M completion; §13 Q3 marked resolved.

Refs #273, #278
cbeaulieu-gt merged commit 59d6ba8 into main May 20, 2026
7 checks passed
cbeaulieu-gt deleted the docs/codex-pivot-spec-v2 branch May 20, 2026 21:19