Codex/ecc si2 p4 implementation by EsatanGW · Pull Request #27 · EsatanGW/agent-protocol

EsatanGW · 2026-05-31T17:25:03Z

Summary

Scope classification

Lean — single surface, ≤ 1 consumer, no public behavior change.
Full — multiple surfaces, public behavior change, new contract, or any breaking change.
Trivial — typo / comment / single-file doc fix / chore.

Surfaces touched

User surface (UI, copy, routes, a11y, i18n)
System-interface surface (schemas, APIs, event shapes, public contracts, hook contracts)
Information surface (config, flags, enums, validation rules)
Operational surface (CI, release process, runbook, observability)

Rationale for the surfaces checked (one sentence each is enough):

…

Source of Truth impact

SoT file(s): docs/… / schemas/… / skills/…
Consumer updates in this PR: docs/…, skills/engineering-workflow/templates/…, reference-implementations/…
Known consumers NOT updated and why:

Evidence

Ran python3 .github/scripts/validate-schema-syntax.py
Ran python3 .github/scripts/validate-templates.py
Ran sh .github/scripts/check-version-consistency.sh
Ran sh reference-implementations/hooks-claude-code/selftests/selftest.sh (if hook bundle touched)
Ran pytest in reference-implementations/validator-python/ (if validator touched)
Visual diagrams render in GitHub preview (if docs/diagrams.md touched)

Evidence artifacts (paste relevant output or link to CI run):

…

Merge readiness and branch protection

Feature branch:
CI run URL / check summary:
Reviewer approval link:
Intended merge strategy: squash / merge commit / rebase
Branch-protection verification:
- not run (state why)
- Offline verifier selftest passed: python3 .github/scripts/verify-branch-protection.py --selftest
- Live verifier passed: python3 .github/scripts/verify-branch-protection.py --live --repo <owner/name> --branch main

Breaking-change level

L0 — additive, no consumer action required.
L1 — additive + deprecation, opt-in migration.
L2 — structural, consumers must adapt.
L3 — removal / renaming with migration path.
L4 — removal / renaming without migration path (avoid — requires explicit justification).

If ≥ L1, link the migration guidance (usually CHANGELOG.md under Changed / Removed or a dedicated doc section).

Rollback mode

Mode 1 — reversible by revert (most doc / code changes here).
Mode 2 — forward-fix (e.g. schema version bump that already shipped).
Mode 3 — compensation required (rare in this repo; call it out).

Cross-cutting concerns

Only check items that changed. Leave unchecked if this PR does not touch them.

Security — touches reference-implementations/hooks-*, CI workflow, or anything in .github/. Confirm no untrusted event inputs, no side effects, no network-in-hook.
Observability — touches logging / telemetry / metrics surfaces that adopters rely on.
Testability — adds or removes CI jobs, selftests, or validator coverage.
Performance — changes anything with a declared latency budget (runtime hooks < 500 ms / 2 s per contract).

Change Manifest (Full mode)

ROADMAP initiative:
Or attached manifest:
Or N/A (trivial / single-surface Lean):

Checklist before requesting review

Scope matches the ROADMAP row (or this is a trivial change with no row).
No drive-by refactors; no opportunistic cleanup outside the declared scope.
No vendor / model / framework names introduced into normative content (bridges excepted).
No CHANGELOG rewrites of past entries — the new entry appended, old entries left as historical fact.
CLAUDE.md §5 observed: if a cross-cutting term changed, all consumers updated.

By submitting this PR you confirm: the changes are your own, MIT-compatible, and free of secrets or proprietary content.

Implements the b1 fail-safe-empty Category E reader at scripts/orient.sh. The core emits the §7.1 JSON shape, filters the seven whitelisted orientation fields, validates hook_state_flags, reads optional state without writing it, degrades missing/corrupt state to an empty payload, and forces Codex turn-stop hook output to JSON. This keeps the G4/S25 boundary intact: no state-file producer is added. P4 is opened on codex/ecc-si2-p4-implementation with Implementer identity ecc-pr2-implementer-identity-1. The manifest phase flips to implement, the Implementer author is recorded, b1 evidence is collected, and the F1 sot.source_file_missing waiver is removed because scripts/orient.sh now exists. Makefile and validate.yml get the explicit scripts/orient.sh shellcheck row, and verify-local now runs orient.sh --selftest so the b1 gate validates the actual script. Verification (subprocess return codes): SI2_SCHEMA=0; ROOT_MANIFESTS=0; INTERNAL_LINKS=0; SECTION_INDEX=0; CHANGELOG_JSON=0; VERIFY_DOCS=0; VERIFY_LOCAL=0 (with existing advisory that [Unreleased] is empty); FULL_VALIDATOR=0; B1_GATE make verify-hooks-shellcheck=0 and bash scripts/orient.sh --selftest=0.

Adds docs/runtime-hook-threat-model.md §1.9.7 for orientation-payload context injection with the required distinctness clause against §1.9.2, §1.9.4, and §1.9.5. This is additive to the protected threat-model bodies and keeps the §7.1 contract plus bridge item 6 untouched. Adds .github/scripts/check-threat-model-additive.sh so the b2 gate can compare §1.1-§1.8 and §1.9.1-§1.9.6 byte-for-byte against origin/main. The visible Category-E failure-mode lists and docs section index are updated, and the manifest records b2 evidence under Implementer identity ecc-pr2-implementer-identity-1. Verification (subprocess return codes): SI2_SCHEMA=0; ROOT_MANIFESTS=0; INTERNAL_LINKS=0; SECTION_INDEX=0; CHANGELOG_JSON=0; VERIFY_DOCS=0; VERIFY_LOCAL=0 (with existing advisory that [Unreleased] is empty); FULL_VALIDATOR=0; DIFF_CHECK=0; B2_GATE check-internal-links=0, generate-section-index --check=0, check-threat-model-additive=0, §1.9.7 grep=0.

Resolves the S26 runtime-event ABI fixture slice without fabricating fixtures. No real runtime hook-input payloads or maintainer-donated captures were available in this P4 workspace, so tests/fixtures/runtime-events/deferrals.json records every P2 candidate pair as deferred with reason and README.md now states captured=0/deferred=12. Adds validate-runtime-event-fixtures.py for hook-input fixture validation and check-source-verification-flipped.sh for S26/S27 ledger consistency. docs/source-verification-notes.md flips S26/S27 to captured-or-deferred reality, including codex session-start and full gemini S18 deferral, while never claiming repo-smoke-confirmed for uncaptured pairs. Verification (subprocess return codes): SI2_SCHEMA=0; ROOT_MANIFESTS=0; INTERNAL_LINKS=0; SECTION_INDEX=0; CHANGELOG_JSON=0; VERIFY_DOCS=0; VERIFY_LOCAL=0 (with existing advisory that [Unreleased] is empty); FULL_VALIDATOR=0; PY_COMPILE=0; DIFF_CHECK=0; B3_GATE check-paths-exist=0, validate-runtime-event-fixtures=0, check-source-verification-flipped=0.

Completes the operational b4 slice. The explicit scripts/orient.sh shellcheck row and verify-local selftest wiring were landed in b1 so the new reader was linted and exercised immediately; this commit records that evidence in the manifest and adds [Unreleased] CHANGELOG entries for b1, b2, and b3. Regenerates CHANGELOG.json from CHANGELOG.md. The b4 gate confirms Makefile and validate.yml contain exact scripts/orient.sh rows rather than a glob, and make verify-local now reports [Unreleased] has content yes instead of the prior advisory. Verification (subprocess return codes): SI2_SCHEMA=0; ROOT_MANIFESTS=0; INTERNAL_LINKS=0; SECTION_INDEX=0; CHANGELOG_JSON=0; VERIFY_DOCS=0; VERIFY_LOCAL=0; FULL_VALIDATOR=0; DIFF_CHECK=0; B1_SELFTEST=0; B3_GATE check-paths-exist=0, validate-runtime-event-fixtures=0, check-source-verification-flipped=0; B4_GATE make verify-hooks-shellcheck=0, Makefile grep=0, validate.yml grep=0, generate-changelog-json --check=0.

Closes Phase 4 only and advances the SI-2 manifest to phase=review with status still in_progress. The handoff narrative records the actual P4 delivery: b1 orient.sh reader and F1 waiver removal, b2 §1.9.7 plus additive checker, b3 captured=0/deferred=12 runtime-event fixture reality, and b4 explicit orient.sh CI wiring plus CHANGELOG artifacts. Updates ROADMAP P4 from in_progress to passed with the concrete b1-b4 commit chain and the repeatable final gate. P5/P6 are intentionally not attempted here; the next Reviewer identity must differ from both ecc-pr2-planner-identity-1 and ecc-pr2-implementer-identity-1. Verification (subprocess return codes): SI2_SCHEMA=0; ROOT_MANIFESTS=0; INTERNAL_LINKS=0; SECTION_INDEX=0; CHANGELOG_JSON=0; VERIFY_DOCS=0; VERIFY_LOCAL=0; FULL_VALIDATOR=0; DIFF_CHECK=0; B1_GATE=0; B2_GATE=0; B3_GATE=0; B4_GATE=0.

Persists the P5 read-only audit by distinct Reviewer identity ecc-pr2-reviewer-identity-1 (Claude Opus 4.8 1M context) into the SI-2 manifest: six review_notes (five pass + one pass_with_followup) and one AI approval, plus the reviewer added to authors[]. Verdict PASS_WITH_FOLLOWUP. The audit confirmed O4 reader-only + the three-site producer tension preserved (threat-model §1.9.2 byte-identical to origin/main), O6 captured=0/deferred=12 no-fabrication, O8 additive §1.9.7 with the distinctness clause, O9 punt (4 task_slices, no b5), and the §7.1 contract surface invariants; all 17 gates reproduced exit 0. Cross-verified independently by an external Codex review. ROADMAP SI-2 P5 row flips to passed with the review_notes/approvals artifact, and the Status + Identity-sequence lines record the now-assigned distinct Reviewer identity. No implementation files are edited in this persistence step: orient.sh, the threat model, the fixtures, and the check-paths-exist.sh:127 placeholder label are all untouched. The single non-blocking pass_with_followup (that stale label) is recorded, not fixed. Manifest phase stays review / status in_progress (pass_with_followup does not advance to signoff). P6 human sign-off and any version bump remain pending. No push. Verification (subprocess return codes): SCOPED_SCHEMA=0; ROOT_RESOLVER=0; FULL_VALIDATOR=0 (findings: []); INTERNAL_LINKS=0; DIFF_CHECK=0; changed files = manifest + ROADMAP only.

Manifest-only fix-up for the P5 persistence commit (259d69b), which updated review_notes / approvals / authors[] / last_updated and ROADMAP P5 but left the handoff_narrative still claiming "ready for Phase 5 review", "Reviewer remains unassigned", and "P5/P6 are NOT attempted" — contradicting the same file's recorded P5 audit and the ROADMAP P5=passed row. External Codex review flagged the inconsistency. Updates the narrative to state P5 PASSED (PASS_WITH_FOLLOWUP) by distinct Reviewer identity ecc-pr2-reviewer-identity-1, names the one non-blocking follow-up (the unfixed check-paths-exist.sh:127 placeholder label), and records that P6 human sign-off + version bump remain pending. No state field changed: phase stays review / status in_progress. The b1-b4 description and O4/O6/O8/O9 dispositions are preserved verbatim. Scope: manifest handoff_narrative only. No implementation files, no ROADMAP, no check-paths-exist.sh label edit, no P6, no push. Verification (subprocess return codes): SCOPED_SCHEMA=0; ROOT_RESOLVER=0; FULL_VALIDATOR=0 (findings: []); DIFF_CHECK=0; stale-phrase grep = 0 matches; changed files = manifest only.

The closed/accepted RFC-026 root manifest (shipped v1.51.0) left all three residual_risks[].accepted_by at the "unaccepted" sentinel even though it was signed off with a recorded Reviewer approval whose conditions enumerate the residual risk. The completion-audit stop hook binds to the only root-level change-manifest*.yaml, so it blocked every session's stop with completion.unaccepted-risk on this stale, internally-inconsistent record. Flips the three accepted_by fields to the P5 Reviewer of record (rfc-026-reviewer-identity-3, Reviewer Meridian); the value points to approvals[0].conditions, so it records the acceptance that already happened at P5 rather than backdating a new one. Bumped last_updated. No state change: phase stays signoff / status stays accepted; residual_risks descriptions, severities, and mitigations are unchanged. Scope: residual_risks[].accepted_by + last_updated only. Unrelated working-tree changes (the SI-2 check-paths-exist.sh label edit and the untracked docs/proposals/* drafts) are deliberately excluded. Verification (subprocess return codes): completion-audit.sh = 0 (was 1); unaccepted-risk count 3 -> 0; validate-change-manifests.py = 0; direct Draft 2020-12 schema validation = 0 errors. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 89a3223d74

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-31T17:30:35Z

+    rule = ALLOWED_FLAGS.get(name)
+    if rule is None:
+        return sep == ""


Drop unknown agent-protocol flags explicitly

When the state file contains an undeclared no-value flag such as agent-protocol/not-allowed, ALLOWED_FLAGS.get(name) returns None, the same value used for known no-value flags, so valid_flag() returns true and propagates arbitrary methodology-namespace tags into the Category E payload. This violates the closed-set rule in docs/runtime-hook-contract.md for hook_state_flags, and it matters because this payload can enter model context from a poisoned or stale state file.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-31T17:30:35Z

+        return
+    try:
+        repo = Path.cwd().resolve()
+        resolved = (repo / candidate).resolve(strict=False)


Reject unresolved manifest paths

For a poisoned or stale state file that names a repo-relative path that does not exist, resolve(strict=False) succeeds and the hook emits that path as active_manifest_path instead of null plus agent-protocol/manifest-rejected:unresolved. The contract says unknown/unavailable active manifests must not be guessed, and the threat model lists unresolved as a rejected path reason, so this lets an attacker or stale producer steer the agent toward a fabricated manifest path.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-31T17:30:35Z

+    for key in ("current_phase", "current_mode", "roadmap_row_id", "roadmap_row_state"):
+        value = source.get(key)
+        orientation[key] = value if isinstance(value, str) else None


Bound phase and roadmap fields before emitting

If the runtime-owned state file is poisoned or stale, these fields are copied as arbitrary strings, including newlines, ANSI/control characters, or instruction-shaped text, and then emitted into the Category E payload that may enter model context. This undercuts the new §1.9.7 mitigation that the core keep orientation values free of free-text instruction channels; at minimum these phase/mode/state/id fields need closed enums or the same kind of length/charset filtering used for branches and flags.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-31T17:30:35Z

+    raw_flags = source.get("hook_state_flags", [])
+    if isinstance(raw_flags, list):
+        for item in raw_flags:
+            if valid_flag(item):
+                add_flag(flags, item)
+    add_flag(flags, state_mtime_flag(state_path))


Do not trust state-provided mtime flags

When a poisoned or stale state file already contains a syntactically valid agent-protocol/state-mtime:<timestamp>, the loop accepts it before appending the real file mtime, so the payload can contain a forged future freshness signal alongside the actual one. Since §1.9.1 relies on the core stamping state-mtime as the stale-orientation mitigation, state-sourced state-mtime should be ignored or replaced rather than propagated from untrusted input.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-31T17:30:35Z

        # explicitly does not equal full CI parity).
        run: |
          shellcheck -s sh \
+            scripts/orient.sh \


Run the runtime-event fixture verifier in CI

I checked .github/workflows/validate.yml and Makefile: this change only adds scripts/orient.sh to shellcheck/selftest, while the new validate-runtime-event-fixtures.py and check-source-verification-flipped.sh gates are only referenced from documentation/manifest text. That means future invalid or hand-written runtime-event fixtures, or stale S26/S27 deferrals, can still pass the repository's normal CI/local gates unless a reviewer remembers to run the README commands manually.

Useful? React with 👍 / 👎.

Abel and others added 8 commits May 31, 2026 23:01

EsatanGW closed this May 31, 2026

chatgpt-codex-connector Bot reviewed May 31, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Codex/ecc si2 p4 implementation#27

Codex/ecc si2 p4 implementation#27
EsatanGW wants to merge 8 commits into
mainfrom
codex/ecc-si2-p4-implementation

EsatanGW commented May 31, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 31, 2026

Uh oh!

chatgpt-codex-connector Bot May 31, 2026

Uh oh!

chatgpt-codex-connector Bot May 31, 2026

Uh oh!

chatgpt-codex-connector Bot May 31, 2026

Uh oh!

chatgpt-codex-connector Bot May 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

EsatanGW commented May 31, 2026

Summary

Scope classification

Surfaces touched

Source of Truth impact

Evidence

Merge readiness and branch protection

Breaking-change level

Rollback mode

Cross-cutting concerns

Change Manifest (Full mode)

Checklist before requesting review

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 31, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 31, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 31, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 31, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 31, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant