Add performance audit and remediation plan#31
Conversation
Evidence-based audit covering portability, dependency coupling, artifact drift, test/CI gaps, and the latency/cost/convergence profile. Top finding: hard undetected dependency on mikefarah/yq disables the policy tier and the scoring harness under the Python yq variant. https://claude.ai/code/session_01XdPesF5fex8d3axheSegS8
Companion to PERFORMANCE-AUDIT.md. Converged via one co-evolution critique->resolve pass. Staged for independent Codex review via the dev-review methodology run from an authenticated local workstation. https://claude.ai/code/session_01XdPesF5fex8d3axheSegS8
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c14e7e3efa
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| `yq --version 2>&1 | grep -qi mikefarah || die "<actionable message>"`. Keep the | ||
| snippet byte-identical across all three sites and add a grep-pinned test asserting |
There was a problem hiding this comment.
Don't use a byte-identical yq guard across these sites
This remediation step tells the implementer to add the same ... || die snippet to all three locations, but the inspected targets do not share that contract: lab/pel/proposer/policy/proposer.sh is intentionally self-contained and has no die helper, while lab/pel/pr-emitter/pr-emitter.sh currently exits dependency failures with code 2 via die "..." 2. Following the plan literally would either hit die: command not found in the policy proposer or regress the emitter's documented exit-code taxonomy when the wrong yq is installed.
Useful? React with 👍 / 👎.
| - **Why it matters:** `lib/co-evolution.sh:validate_review_verdict` enforces the *stricter* | ||
| rules at runtime while the skill ships the *looser* schema — a verifier can accept a | ||
| verdict the runtime contract rejects (or vice-versa). This is exactly the drift the repo's | ||
| own `CONCERNS.md §5` warns about, now realized. |
There was a problem hiding this comment.
Correct the runtime/schema drift diagnosis
The audit says the runtime enforces the stricter root schema, but lib/co-evolution.sh:533-581 only requires verdict, confidence, summary, and issues; issue entries only need severity and description, and scope_creep_detected/iteration_notes are optional if present. That means the skill schema is not accepting verdicts that the runtime rejects in the way described here, so the remediation is likely to tighten or canonicalize the schemas for the wrong compatibility reason.
Useful? React with 👍 / 👎.
Fold the union of two independent critique passes into the v2 plan: Claude (empirical) and Codex (structural), run as separate local co-evolution reviews. Resolve every marker into concrete actions and drop the self-referential "converged" status. Add the two must-fix hazards Codex surfaced: a zero-live-markers precondition on the 3.3 early stop (a small-delta stop could otherwise halt with inherited markers still open), and a test-runner-scoped hermetic git env for 2.1 that keeps production signing intact. Fold in Claude's verified catches: die() ignores its exit-code arg (real bug at pr-emitter.sh:570, not just cosmetic at co-evolve-bouncer.sh:166), plus function-name references in place of drifted line numbers and a rollback plan for Phases 2-3. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
What this adds
PERFORMANCE-AUDIT.md— an evidence-based audit of co-evolution covering portability, dependency coupling, artifact drift, test/CI gaps, and the latency/cost/convergence profile. Top finding: an undetected hard dependency on mikefarah/yq disables the policy tier and the scoring harness whenever the Pythonyqvariant is on PATH.REMEDIATION-PLAN.md— the fix plan, sequenced into four PRs (Phase 1 correctness → Phase 2 test hermeticity + CI gate → Phase 3 performance → Phase 4 hygiene), each with verification steps and a rollback note.How the plan was hardened
The plan ran through two independent adversarial reviews using the repo's own co-evolution tooling — Claude and Codex each as a separate critique pass — followed by a synthesis that merged the union of unique findings. The methodology dogfoods the tool the audit covers.
Two must-fixes surfaced and are now in the plan:
Smaller folds: a functional
yqprobe instead of version-string grepping; validator-based positive/negative fixtures behind the schema drift guard; an explicit CI-safe suite declaration with PATH stubs forclaude/codex; and a corrected F-6 finding —die()ignores its exit-code argument, so the real bug ispr-emitter.sh:570(die "..." 2silently exits 1), not the cosmeticco-evolve-bouncer.sh:166. Drifted line-number citations were replaced with function-name references.Scope
Docs only. The implementation lands as the four sequenced PRs the plan describes; an independent dev-review pass gates each one before merge.