A self-evolving Software Development Life Cycle (SDLC) enforcement system for AI coding agents, adapted for OpenAI's Codex CLI. It makes Codex plan before coding, test before shipping, state confidence, and self-review with repo-local guardrails instead of relying on memory.
This adapter brings the SDLC Wizard discipline into Codex projects with Codex-native skills, .codex/ hooks, AGENTS.md, adaptive setup/update, and proof-aware git gates.
# Setup a new repo or sync an already-initialized clone
npx codex-sdlc-wizard@latest
# Start coding with SDLC enforcement and an explicit model profile
codex -m gpt-5.5 -c 'model_reasoning_effort="xhigh"'
codex -m gpt-5.5 -c 'model_reasoning_effort="xhigh"' is the safest explicit start once this wizard is installed. Use plain codex instead if you want to rely on trusted repo-local config. If a handoff is interrupted and Codex prints a resume id, continue with codex resume -m gpt-5.5 -c 'model_reasoning_effort="xhigh"' <session-id> so resume does not fall back to an older model.
If you normally use yolo-style sessions, use Codex's canonical full-trust flag: --dangerously-bypass-approvals-and-sandbox. Current Codex may accept --yolo as shorthand, but this wizard prints the canonical flag. Full-auto is not full-trust: full-trust bypasses sandbox and approval prompts. Only use that variant in repos you fully trust.
Bare npx codex-sdlc-wizard is the adaptive setup/sync path. In an already-initialized repo clone, it runs the update/check-repair path automatically so a fresh Mac/Windows/Linux checkout can sync hooks, config, and helper skills without remembering separate commands. In a new repo, it bootstraps the repo-local guardrails first, then hands off into a live plain Codex setup session so the unresolved setup questions happen inside Codex instead of inside a shell checklist. At that first-run handoff prompt, press Enter for plain codex or type full-trust if you explicitly want codex --dangerously-bypass-approvals-and-sandbox. setup --yes still exists for automation, but it is not the normal human path.
Generic npm entrypoint examples: npx codex-sdlc-wizard, npx codex-sdlc-wizard check, and npx codex-sdlc-wizard update.
update repairs repo artifacts using the package version you invoked; it does not self-update the npm package. To consume the newest release and apply its repo-side updates in one command, run npx codex-sdlc-wizard@latest from an initialized repo, or use the explicit form npx codex-sdlc-wizard@latest update.
Package upgrade vs repo repair:
- Package upgrade: run npx codex-sdlc-wizard@latest update to consume the newest published package.
- Repo repair/sync inside Codex: run $update-wizard to inspect and repair local SDLC artifacts using the skill/package already loaded in the active Codex session.
After either path changes skills, hooks, hook config, or helper scripts, restart/reopen Codex so the active session reloads them.
Useful follow-ups after install:
npx codex-sdlc-wizard@0.7.31 check
npx codex-sdlc-wizard@0.7.31 update
If you want pinned release examples instead of @latest, see Releases.
For long-running work, add an active-scope contract with:
npx codex-sdlc-wizard@latest setup --yes --goals
GOALS.md is separate from ROADMAP.md: keep ROADMAP.md as backlog/history, and use GOALS.md for the current active run. Its operating phrase is: complete everything in GOALS.md until the user says stop.
The template includes active goals, deferred work, definition of done, runtime boundary, evidence contract, and a paste-ready $sdlc prompt. This prevents “active goal complete” from being confused with “whole roadmap complete.”
Use Codex /goal for long-running roadmap work only after this repo has a real $sdlc setup. Treat the goal as an SDLC-backed active task, not as a substitute for planning, tests, review, proof, or repo-local instructions.
A good goal should include:
- the current GOALS.md, roadmap slice, or milestone
- $sdlc as the mandatory delivery contract
- any additional repo-local skills that are already installed and relevant
- the 95% confidence rule, including stopping to research or hand back when confidence drops
- RED/GREEN tests, focused checks, full tests/lint when code or config changed, and native review/self-review
- a clean break requirement: docs updated, evidence recorded, and changes committed locally before claiming the active task is done
Suggested manual /goal text:
Get as far as possible through GOALS.md in small, shippable slices. Follow $sdlc for every code/doc change and include any relevant repo-local skills already installed here. Keep confidence >=95%; if confidence drops, research or stop at a clean handoff point. Use RED/GREEN tests, focused checks, full tests/lint when code or config changed, and native review/self-review before shipping. Stop only at a clean break with evidence recorded and changes committed locally.
This repo is now a Codex skill plus adaptive installer-style adapter for Codex projects.
- It ships a repo-root SKILL.md for the normal Codex skill install flow.
- It is not a Codex plugin today.
- It still ships install.sh / setup.sh when you want direct repo mutation from GitHub or npm.
| Need | Use | Why |
|---|---|---|
| Install a reusable Codex skill from this repo | SKILL.md | The repo root is now a Codex skill package for normal GitHub skill-install flow |
| Add SDLC enforcement to an existing Codex project now | npx codex-sdlc-wizard or setup.sh | The npm package bootstraps then hands off into live Codex setup; direct scripts still exist for advanced/manual shell paths |
| Install a Codex plugin from this repo | Not supported | There is no .codex-plugin/plugin.json package here |
Current recommended install/discovery path: npx codex-sdlc-wizard@latest for repo setup, plus the repo-root SKILL.md / agents/openai.yaml package for skill installation.
Official Codex docs now clarify the packaging boundary:
- Agent Skills: skills are the authoring format for reusable workflows.
- Plugins and Build plugins: plugins are the installable distribution unit for reusable skills, app integrations, and MCP servers in Codex.
What that means for this repo:
- Today, keep npm/npx as the supported consumer path and keep this README honest that the repo is not a plugin yet.
- Next packaging path, when worth doing, is a real Codex plugin with .codex-plugin/plugin.json, bundled skills/, optional .mcp.json, optional .app.json, and presentation assets.
- Local or team testing should use a plugin marketplace file at .agents/plugins/marketplace.json, then verify with codex plugin marketplace add and the CLI /plugins browser.
- Official public plugin listing is not self-serve yet; self-serve plugin publishing is coming soon. Until a listing actually exists, no approval or listing is implied.
You want Codex to follow engineering discipline automatically:
- Plan before coding instead of jumping straight to edits
- Write tests first and keep TDD visible in the repo contract
- State confidence so low-confidence work triggers research instead of guessing
- Self-review before presenting using Codex-native review where appropriate
- Prove the work is shippable with fresh test/review evidence before commit or push
- Preserve repo truth in AGENTS.md, setup docs, hooks, and skills instead of one-off chat memory
The wizard auto-detects your stack, generates repo-specific SDLC docs, installs Codex hook enforcement, and gives you check / update paths so the setup can be repaired without flattening local customizations.
This adapter brings the SDLC Wizard discipline into Codex today with hard guardrails, repo-local guidance, and adaptive setup/update flows that work in existing projects.
What works today:
- Hard enforcement hooks that block bad habits (git commit without proof, git push without review)
- AGENTS.md guidance for planning, confidence tracking, TDD, and review
- Non-destructive installer that merges into your existing Codex config
- Adaptive setup that bootstraps first and then continues inside Codex when you use the default npm entrypoint
- Task-routing guidance that sends auth-heavy browser, tenant, MFA, and admin-portal work to Desktop/computer-use before unsafe CLI/browser instructions
- check / update flows for drift detection and selective repair
What's still coming from upstream:
- richer scoring mechanisms and self-improvement from E2E evaluation
- more domain-adaptive guidance refinements beyond the current templates
This adapter tracks the upstream SDLC Wizard. A weekly sync workflow checks for upstream releases and opens follow-up issues here when translation work is needed.
Five layers working together:
Layer 5: SELF-IMPROVEMENT
Upstream sync checks, roadmap issues, release proof, and pilot feedback
keep the wizard improving without silently changing consumer repos.
Layer 4: RELEASE VALIDATION
Packaging, npm, release, roadmap, benchmark, setup/update, and E2E
tests verify the shipped adapter surface before tags are published.
Layer 3: ADAPTIVE SETUP / UPDATE
Deterministic scan plus live Codex refinement generates repo-specific
docs and repairs drift while preserving intentional customizations.
Layer 2: ENFORCEMENT
Codex hooks block commit/push until fresh reviewed proof exists, while
compact lifecycle hooks preserve SDLC state across context compaction and
repo-scoped skills carry the explicit workflow contract.
Layer 1: LOCAL TRUTH
AGENTS.md, START-SDLC.md, SDLC-LOOP.md, PROVE-IT.md, TESTING.md, and
ARCHITECTURE.md keep the SDLC rules durable inside the target repo.
| SDLC Goal | Enforcement | Level |
|---|---|---|
| TDD workflow | AGENTS.md guidance | Soft (Codex has no file-edit hooks) |
| git commit gate | PreToolUse blocks git commit | Hard |
| git push gate | PreToolUse blocks git push | Hard |
| Compact lifecycle | PreCompact/PostCompact compact guard | Warns with SDLC carry-forward context |
| SDLC baseline | repo docs + installed skills | Hard/Soft mix |
| Session init | SessionStart hook | Warns if AGENTS.md is missing |
| Capability | Codex-specific shape |
|---|---|
| Proof-aware git gates | git commit and git push stay blocked until a fresh reviewed SDLC proof stamp is tied to the current repo content |
| Codex-native review | Uses codex review --uncommitted, --base, or --commit; review_model = "gpt-5.5" provides the intended review pass |
| Adaptive setup/update | Default npx setup bootstraps first, then hands off into Codex for unresolved questions; update repairs drift without blind overwrites |
| Honest skill model | $sdlc is the public repo-scoped workflow; helper skills stay support tooling instead of pretending Codex has slash commands |
| Cross-platform hook shape | Universal Node hook entrypoints avoid Bash/PowerShell hook-config churn across macOS, Linux, Windows, and type: module repos |
| Auth-aware routing | Setup docs route browser sign-in, WAM, MFA, tenant, and admin-portal boundaries to Desktop/computer-use or human-owned proof instead of unsafe CLI guesses |
The git gate is proof-aware: git commit and git push are still hard manual
checkpoints, but they can proceed when a fresh SDLC proof stamp exists.
After running the required checks and self-review, stamp proof:
node .codex/hooks/git-guard.cjs prove --reviewed
If the repo has no detected commands in .codex-sdlc/manifest.json, provide the
proof command explicitly:
node .codex/hooks/git-guard.cjs prove --reviewed --check "npm test"
The stamp lives under .git/codex-sdlc/proof.json, expires after four hours,
and is tied to the current repo content, so stale proof blocks again instead of
dirtying the worktree.
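The freshness rule above (four-hour expiry plus a tie to current repo content) can be sketched as a small predicate. This is only an illustration: the real proof.json schema is not shown in this README, so the `createdAt` and `contentHash` fields and the `isProofFresh` helper are assumptions, not the actual git-guard.cjs implementation.

```javascript
// Hypothetical stamp shape: { createdAt: ISO-8601 string, contentHash: string }.
// These field names are assumptions; the real proof.json may differ.
const FOUR_HOURS_MS = 4 * 60 * 60 * 1000;

// Returns true only when the stamp is younger than four hours AND was
// created for the same repo content that is present now.
function isProofFresh(stamp, currentContentHash, now = Date.now()) {
  if (!stamp || typeof stamp.createdAt !== "string") return false;
  const createdAt = Date.parse(stamp.createdAt);
  if (Number.isNaN(createdAt)) return false;
  const notExpired = now - createdAt < FOUR_HOURS_MS;
  const sameContent = stamp.contentHash === currentContentHash;
  return notExpired && sameContent;
}

// Usage with fixed timestamps so the result does not depend on the clock.
const stamp = { createdAt: "2025-01-01T00:00:00Z", contentHash: "abc123" };
const oneHourLater = Date.parse("2025-01-01T01:00:00Z");
const fiveHoursLater = Date.parse("2025-01-01T05:00:00Z");

console.log(isProofFresh(stamp, "abc123", oneHourLater)); // true
console.log(isProofFresh(stamp, "abc123", fiveHoursLater)); // false (expired)
console.log(isProofFresh(stamp, "def456", oneHourLater)); // false (content changed)
```

Because both conditions must hold, editing any tracked content after stamping invalidates the proof even if the four-hour window is still open.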
For safety, guarded git commit / git push commands that change repo context
with cd, git -C, --git-dir, --work-tree, GIT_DIR, or GIT_WORK_TREE
must be run from the target repo root and stamped there.
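As a rough illustration of the rule above, a guard could flag commands that try to change repo context before they reach git. The detector below is a hypothetical sketch: `changesRepoContext` is not part of the shipped hooks, and it uses naive whitespace tokenization that does not understand shell quoting.

```javascript
// Flags and environment variables named in the rule above.
const CONTEXT_FLAGS = ["-C", "--git-dir", "--work-tree"];
const CONTEXT_ENV_VARS = ["GIT_DIR", "GIT_WORK_TREE"];

// Hypothetical detector, deliberately conservative: it may also flag
// harmless uses such as `git commit -C <commit>` (reuse a commit message).
function changesRepoContext(command) {
  const tokens = command.trim().split(/\s+/);
  return tokens.some((token) => {
    if (token === "cd") return true;
    if (CONTEXT_FLAGS.some((flag) => token === flag || token.startsWith(`${flag}=`))) {
      return true;
    }
    return CONTEXT_ENV_VARS.some((name) => token.startsWith(`${name}=`));
  });
}

console.log(changesRepoContext("git commit -m msg")); // false
console.log(changesRepoContext("cd ../other && git push")); // true
console.log(changesRepoContext("GIT_DIR=/tmp/x/.git git commit -m msg")); // true
```

Erring toward false positives fits the stated intent: a flagged command only needs to be rerun from the target repo root, while a missed one could let proof from one repo cover another.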
The wizard supports two wizard-owned model profiles:
- mixed: gpt-5.4-mini for the main pass plus gpt-5.5 at xhigh for review. Tradeoff: better speed, lower latency, and lower token usage on routine work after bootstrap.
- maximum: gpt-5.5 at xhigh throughout. Tradeoff: higher latency and token usage in exchange for the most stable and thorough "ultimate mode."
How to choose:
# recommended interactive bootstrap path
npx codex-sdlc-wizard@0.7.31 --model-profile maximum
# interactive bootstrap with the efficiency-first profile if you already know you want it
npx codex-sdlc-wizard@0.7.31 --model-profile mixed
# floating latest release with the same bootstrap recommendation
npx codex-sdlc-wizard@latest --model-profile maximum
Interactive setup should ask which profile you want when you do not pass --model-profile, and it should recommend maximum as the safer bootstrap default.
Low-confidence rule:
- Default to xhigh in this repo when the work is meta, setup-heavy, or otherwise high-blast-radius.
- if confidence is below 95%, research more first
- if it still stays below 95%, escalate review to xhigh
- prefer maximum for abstract, complex, or high-blast-radius work
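The escalation order in the low-confidence rule can be written as a tiny decision function. This is a sketch of the policy as worded above, not code the wizard ships; `nextStep` and its `alreadyResearched` flag are hypothetical names.

```javascript
const CONFIDENCE_THRESHOLD = 95; // percent, from the 95% rule above

// Decide the next action for a task.
// `alreadyResearched` (hypothetical flag) means one research pass has
// already happened without lifting confidence to the threshold.
function nextStep(confidencePercent, alreadyResearched) {
  if (confidencePercent >= CONFIDENCE_THRESHOLD) return "proceed";
  return alreadyResearched ? "escalate-review-to-xhigh" : "research";
}

console.log(nextStep(97, false)); // proceed
console.log(nextStep(80, false)); // research
console.log(nextStep(80, true)); // escalate-review-to-xhigh
```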
The wizard stores the selected profile in .codex-sdlc/model-profile.json so the repo can keep that choice explicit.
It also writes the matching repo-local Codex config to .codex/config.toml so trusted Codex sessions use the selected profile instead of silently inheriting stronger user-level defaults.
mixed is wizard policy, not a native Codex mode. The wizard maps it to:
model = "gpt-5.4-mini"
model_reasoning_effort = "xhigh"
review_model = "gpt-5.5"
[features]
hooks = true
maximum maps to:
model = "gpt-5.5"
model_reasoning_effort = "xhigh"
[features]
hooks = true
Codex only loads project-local .codex/config.toml for trusted projects. Once trusted, project config overrides user config in ~/.codex/config.toml; the wizard does not edit your global config. Current Codex CLI builds warn that [features].codex_hooks is deprecated, so setup/update write [features].hooks = true and migrate active codex_hooks entries when repairing config.
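The two profile mappings above can be expressed as a lookup table plus a renderer. The sketch below only reproduces the keys listed in this README; the `PROFILES` table and `renderConfigToml` helper are illustrative names, and the wizard's own generator may differ (for example, it merges into an existing file rather than writing a fresh one).

```javascript
// Profile-to-config mapping, copied from the keys documented above.
const PROFILES = {
  mixed: {
    model: "gpt-5.4-mini",
    model_reasoning_effort: "xhigh",
    review_model: "gpt-5.5",
  },
  maximum: {
    model: "gpt-5.5",
    model_reasoning_effort: "xhigh",
  },
};

// Render the minimal repo-local config for a profile as TOML text.
function renderConfigToml(profileName) {
  if (!Object.hasOwn(PROFILES, profileName)) {
    throw new Error(`Unknown model profile: ${profileName}`);
  }
  const lines = Object.entries(PROFILES[profileName]).map(
    ([key, value]) => `${key} = "${value}"`
  );
  lines.push("[features]", "hooks = true");
  return lines.join("\n");
}

console.log(renderConfigToml("mixed"));
```

Keeping the mapping in one table makes the policy explicit: mixed differs from maximum only in the main model and the added review_model key.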
Bootstrap recommendation:
- setup/update should use maximum; routine work after bootstrap should use mixed
- use maximum for setup/update because bootstrap work has higher blast radius
- switch back to mixed for routine day-to-day work after the repo is stable
Repo-specific maintainer rule:
- consumer repos can choose mixed or maximum
- this repo always stays on maximum (gpt-5.5 at xhigh throughout); do not switch codex-sdlc-wizard maintenance to mixed, mini-only, or lower-reasoning profiles because it is unusually meta and high-blast-radius
Review behavior is required by the SDLC contract. The portable Codex-native review path is codex review:
# Review staged, unstaged, and untracked local changes before commit
codex review --uncommitted
# Review a branch or PR-sized diff against a base branch
codex review --base main
# Review one already-created commit
codex review --commit <sha>
When review_model = "gpt-5.5" is present, native Codex review uses that model for the review pass. In mixed, this gives the intended cross-model shape: faster main work, gpt-5.5 review.
Do not treat /autoreview as a required SDLC command. auto_review is a Codex approval-review setting for eligible tool approval prompts; it is not the code-diff review path. In yolo/full-bypass sessions, approval review usually does not apply because approvals are already bypassed.
install.sh and setup.sh scaffold repo-local Codex skills under .agents/skills.
Repo-scoped skill coverage is still a work in progress:
- $sdlc is the supported public workflow skill today
- additional repo-scoped workflows stay unnamed until their public contracts are ready
Canonical entrypoint: $sdlc. /sdlc is historical shorthand for the missing slash-command idea, not an invocation command. Adapter-specific SDLC aliases are legacy migration debris and should not appear as second user-facing workflows.
Codex treats same-name skills from different scopes as distinct choices. To avoid duplicate $sdlc workflow rows, normal setup installs global helper skills only (feedback, setup-wizard, and update-wizard) and keeps .agents/skills/sdlc as the canonical repo-scoped workflow.
These are Codex-native skill folders, so a fresh Codex session can discover them directly from repo scope. After install or setup, restart Codex so repo-scoped skills are loaded cleanly.
The bridge here is explicit, not magical: this adapter ships the Codex-native skill copies that target repos consume. It does not depend on local .claude/skills/* paths being present in the target repo.
The current recommended Codex-native architecture is explicit:
- skills = explicit workflow layer
- hooks = silent event enforcement
- repo docs = source of local truth
Current Codex CLI 0.130.0 supports eight hook events: PreToolUse, PermissionRequest, PostToolUse, PreCompact, PostCompact, SessionStart, UserPromptSubmit, and Stop.
This wizard actively installs SessionStart, PreToolUse, PreCompact, and PostCompact. The remaining hook events are intentionally left unused until there is a proven SDLC need: PermissionRequest can change approval behavior, PostToolUse can create noisy post-command gates, UserPromptSubmit can over-police prompts, and Stop can interfere with normal session shutdown.
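The split between supported and installed hook events can be checked mechanically. The sketch below uses only the event names listed above; the variable names are illustrative, and the list reflects the Codex CLI version named in this README rather than any later release.

```javascript
// Hook events listed above for Codex CLI 0.130.0.
const SUPPORTED_EVENTS = [
  "PreToolUse",
  "PermissionRequest",
  "PostToolUse",
  "PreCompact",
  "PostCompact",
  "SessionStart",
  "UserPromptSubmit",
  "Stop",
];

// Events this wizard installs hooks for.
const INSTALLED_EVENTS = new Set([
  "SessionStart",
  "PreToolUse",
  "PreCompact",
  "PostCompact",
]);

// Everything supported but intentionally left unused.
const unusedEvents = SUPPORTED_EVENTS.filter((event) => !INSTALLED_EVENTS.has(event));

console.log(unusedEvents);
// [ 'PermissionRequest', 'PostToolUse', 'UserPromptSubmit', 'Stop' ]
```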
That means:
- use repo-scoped or installed skills for the user-facing workflow contract
- use hooks to block or warn silently at the right events
- keep AGENTS.md, ARCHITECTURE.md, TESTING.md, and related repo docs as the local source of truth
What not to do:
- do not pretend Codex has native slash commands when it does not
- do not overload hooks to act as the user-facing workflow layer
When you dogfood this wizard in a product repo, keep the active session focused on that product repo.
- if you discover a proven reusable wizard lesson, prefer filing a direct GitHub issue in codex-sdlc-wizard right away
- if you are reporting a consumer-facing failure, use the repo's Consumer bug report template so command, repo shape, failed step, and auth context are captured consistently
- keep building the product repo in the current session
- only switch into live wizard work when the product repo is actually blocked
This keeps dogfooding useful without turning every implementation session into wizard meta-work.
Versioned releases for this adapter live at:
https://github.com/BaseInfinity/codex-sdlc-wizard/releases
If you are consuming this repo in a real project, prefer a tagged release over main.
# npm / npx pinned to the current release
npx codex-sdlc-wizard@0.7.31
# npm / npx floating on the newest published release
npx codex-sdlc-wizard@latest
# Codex skill install
# Install this repository through the normal GitHub skill-install flow
# so $codex-sdlc-wizard is available inside Codex
# git-based install
git clone --branch v0.7.31 --depth 1 https://github.com/BaseInfinity/codex-sdlc-wizard.git /tmp/codex-sdlc-wizard
This adapter should follow the same semver-tag plus GitHub Release rhythm as the upstream wizard.
Use RELEASE.md as the mandatory pre-tag checklist: sync to latest origin/main, run the full proof suite, and only then tag.
# After tests pass on main
git tag vX.Y.Z
git push origin vX.Y.Z
Pushing a vX.Y.Z tag triggers this repo's release workflow, publishes the npm package, and publishes GitHub Release notes automatically. workflow_dispatch exists as a retry path for an existing tag if a release job needs to be rerun.
To enable npm publish from GitHub Actions, configure npm trusted publishing for this package instead of storing a long-lived token:
- Open the npm package settings for codex-sdlc-wizard
- Go to Trusted publishing
- Choose GitHub Actions
- Configure:
  - Organization or user: BaseInfinity
  - Repository: codex-sdlc-wizard
  - Workflow filename: release.yml
  - Environment name: leave blank unless you later add a protected GitHub environment
The workflow uses GitHub OIDC trusted publishing, validates that the tag matches package.json, and skips npm publish on reruns when that exact version already exists on npm. No NPM_TOKEN GitHub secret is required.
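The two release checks described above (tag must match package.json, and reruns skip publishing an existing version) amount to a three-way decision. The function below is a hedged sketch of that logic: `releaseAction` and its return labels are hypothetical, and it only handles plain vX.Y.Z tags, not pre-release suffixes.

```javascript
// Plain semver tag of the form vX.Y.Z; pre-release tags are out of scope here.
const SEMVER_TAG = /^v(\d+\.\d+\.\d+)$/;

// Decide what a release run should do.
// - "fail": tag is malformed or does not match the package.json version
// - "skip-publish": that exact version is already on npm (rerun case)
// - "publish": safe to publish
function releaseAction(tag, packageVersion, publishedVersions) {
  const match = SEMVER_TAG.exec(tag);
  if (!match || match[1] !== packageVersion) return "fail";
  return publishedVersions.includes(packageVersion) ? "skip-publish" : "publish";
}

console.log(releaseAction("v0.7.31", "0.7.31", ["0.7.30"])); // publish
console.log(releaseAction("v0.7.31", "0.7.31", ["0.7.30", "0.7.31"])); // skip-publish
console.log(releaseAction("v0.7.32", "0.7.31", [])); // fail
```

Treating an already-published version as a skip rather than an error is what makes the workflow_dispatch retry path safe to run more than once.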
- Copies AGENTS.md (skips if exists, so your customizations are safe)
- Copies SDLC-LOOP.md, START-SDLC.md, and PROVE-IT.md if missing
- Creates or merges .codex/config.toml with [features].hooks = true
- Installs .codex/hooks.json (backs up existing)
- Copies universal Node hook entrypoints plus legacy shell/PowerShell helpers to .codex/hooks/
- Installs the repo-scoped SDLC skill at .agents/skills/sdlc/SKILL.md
- Installs global helper skills under ~/.codex/skills without installing a global sdlc duplicate
In other words, install.sh mutates the target repo by adding or updating AGENTS.md, .codex/config.toml, .codex/hooks.json, .codex/hooks/*, and the repo-scoped SDLC skill. It also writes .codex-sdlc/model-profile.json so the chosen profile is explicit. Existing .codex/config.toml files are merged: model keys and [features].hooks are patched, active deprecated [features].codex_hooks entries are migrated away, and MCP, sandbox, approval, and other custom settings are preserved. If an older wizard-managed global sdlc skill is detected, update/setup backs it up and removes it; user-owned global sdlc skills are preserved.
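The merge behavior described above (patch model keys, set [features].hooks, drop deprecated codex_hooks, keep everything else) can be sketched on an already-parsed config object. TOML parsing is out of scope here, and `mergeCodexConfig` plus the sample keys in the usage are illustrative assumptions rather than the installer's actual code.

```javascript
// Merge wizard-owned keys into an existing parsed config without
// mutating the input. Requires Node 17+ for structuredClone.
function mergeCodexConfig(existing, profileKeys) {
  const merged = structuredClone(existing);
  // Patch only the wizard-owned model keys.
  Object.assign(merged, profileKeys);
  // Migrate the deprecated feature flag to the current one.
  const features = { ...(merged.features || {}) };
  delete features.codex_hooks;
  features.hooks = true;
  merged.features = features;
  return merged;
}

// Usage: sample values are placeholders, not recommended settings.
const before = {
  model: "some-older-model",
  approval_policy: "on-request",
  features: { codex_hooks: true, other_flag: true },
  mcp_servers: { docs: { command: "docs-server" } },
};
const after = mergeCodexConfig(before, {
  model: "gpt-5.5",
  model_reasoning_effort: "xhigh",
});

console.log(after.features); // { other_flag: true, hooks: true }
console.log(after.mcp_servers); // { docs: { command: 'docs-server' } }
```

Cloning before patching keeps the original object intact, which mirrors the non-destructive intent: a failed or aborted merge leaves the prior settings recoverable.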
Installed hook entrypoints are quiet and current-Codex aware: git-guard.cjs handles PreToolUse commit/push gates, session-start.cjs handles SessionStart baseline warnings, and compact-guard.cjs handles PreCompact/PostCompact SDLC carry-forward reminders.
After restart, hook install is not complete until Codex trusts the repo and any pending repo hooks are reviewed. If Codex reports hooks need review, open /hooks and review the pending hooks before relying on SDLC enforcement.
- Codex CLI (npm i -g @openai/codex)
- bash (3.x+ macOS, 4.x+ Linux, Git Bash on Windows for the shell path)
- Node.js 18+; active Codex hooks use Node entrypoints so the same checked-in hook config works across macOS, Linux, and Windows
All hooks are verified in real Codex CLI sessions, not just unit tested in isolation.
# Top-level maintainer proof runner (parallel by default, serial for debugging)
node scripts/run-proof-suite.cjs
node scripts/run-proof-suite.cjs --serial
# Release contract tests (workflow + docs)
bash tests/test-release.sh
# Packaging smoke test (clean temp project, validates install path)
bash tests/test-packaging.sh
# Codex skill package smoke test
bash tests/test-skill.sh
# npm / npx packaging smoke test, including the packed-tarball scratch smoke
bash tests/test-npm.sh
# Unit tests (no API calls, fast)
bash tests/test-adapter.sh
bash tests/test-setup.sh
bash tests/test-update.sh
# E2E tests (opt-in: requires codex CLI + auth, consumes tokens)
CODEX_E2E=1 bash tests/test-e2e.sh
- node scripts/run-proof-suite.cjs runs the maintainer proof suite with bounded parallel jobs and per-check logs; use --serial when debugging ordering-sensitive failures.
- Release contract tests for semver tags, GitHub Releases, and README release docs
- Packaging smoke tests for the documented installer path and README packaging contract
- Skill packaging tests for SKILL.md, agents/openai.yaml, and dual-distribution docs
- npm packaging smoke tests for package metadata, packed contents, and npm exec
- Adapter, setup, and update tests for the Codex-specific behavior surface
- E2E integration tests are token-consuming and opt-in; use CODEX_E2E=1 bash tests/test-e2e.sh when you explicitly want real Codex sessions proving hooks fire
| Document | What It Covers |
|---|---|
| AGENTS.md | Repo contract for planning, confidence, TDD, review, and model profile policy |
| START-SDLC.md | Quick operator entrypoint for starting SDLC work in an installed repo |
| SDLC-LOOP.md | Repeatable plan -> test -> implement -> review -> prove loop |
| PROVE-IT.md | Proof-stamp gate for commit/push and examples for explicit check commands |
| GOALS.md | Optional active-scope contract for long-running work; generated with setup --goals |
| RELEASE.md | Maintainer release checklist before semver tags and npm/GitHub release publish |
| ROADMAP.md | Current shipped state, next release cycle, and backlog ordering |
Based on agentic-ai-sdlc-wizard. Same SDLC philosophy, translated to Codex's current tool model with Codex-native skills, repo hooks, and adaptive setup/update flows.
Three ways to report bugs, request features, or ask questions:
- In-session: run $feedback when installed; it is privacy-first and redacts sensitive context before preparing a report.
- Issues: open a normal GitHub issue for feature requests, docs gaps, or proven reusable wizard findings.
Come join Automation Station — a community Discord packed with software engineers bringing 40+ years of combined experience across every area of the stack (frontend, backend, infra, embedded, data, QA, DevOps, you name it). Share patterns, ask questions, compare notes on AI agents, automation, and SDLC tooling.
MIT