feat(v5): experimental Haiku WARN cascade tier (opt-in; F1 honestly caveated) by waitdeadai · Pull Request #26 · waitdeadai/llm-dark-patterns

waitdeadai · 2026-05-23T19:46:26Z

Adds an opt-in, experimental WARN tier that runs a cheap LLM judge above the deterministic regex floor. This is the v5 cascade that the v3/v4 negative result and the deepresearch notes (.taste/research/cascade-llm-judge-tier.md) pointed to. Companion data/scorers PR: waitdeadai/agent-closeout-bench, feature/v5-cascade-haiku-tier.

What this adds

- hooks/no-sycophancy-warn.sh: reference WARN tier. Opt-in (LDP_SYCOPHANCY_WARN_JUDGE=1); fires only on regex-negative closeouts; uses a cross-model Haiku judge; never exits 2 (WARN-only; the deterministic regex floor owns BLOCK). Checks: bash -n passes; the smoke test WARNs on a positive, stays silent on a clean negative, and exits 0.
- evaluation/v5/SPEC.md + RESULTS.md.
- no-sycophancy.sh BLOCK floor unchanged (git diff main empty).
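
The gating rules listed above (opt-in flag, regex-negative only, WARN-only exit status) can be sketched roughly as follows. This is a minimal illustration, not the contents of hooks/no-sycophancy-warn.sh: the function names, the two grep patterns, and the WARN message are hypothetical stand-ins, and the judge is stubbed with a local pattern match so the sketch runs offline. Only the LDP_SYCOPHANCY_WARN_JUDGE flag and the "never exit 2" rule come from the PR description.

```bash
#!/usr/bin/env bash
# Sketch only: all names except LDP_SYCOPHANCY_WARN_JUDGE are assumptions.

# Hypothetical stand-in for the deterministic regex floor (returns 0 on a match).
regex_floor_matches() {
  printf '%s' "$1" | grep -Eqi "you'?re absolutely right"
}

# Hypothetical stand-in for the cross-model judge. A real tier would call a
# cheap LLM here; this stub flags one phrase so the example needs no network.
judge_flags_sycophancy() {
  printf '%s' "$1" | grep -qi "brilliant idea"
}

warn_tier() {
  local closeout="$1"
  # Opt-in: do nothing unless the flag is set to 1.
  [ "${LDP_SYCOPHANCY_WARN_JUDGE:-0}" = "1" ] || return 0
  # Regex-positive closeouts belong to the BLOCK floor, so stay silent.
  if regex_floor_matches "$closeout"; then
    return 0
  fi
  # Regex-negative: ask the judge and warn on stderr if it flags the text.
  if judge_flags_sycophancy "$closeout"; then
    echo "WARN: possible sycophancy (judge tier)" >&2
  fi
  return 0  # WARN-only: this tier never returns 2
}
```

Keeping the return status at 0 on every path means the tier can only add a warning; blocking stays with the regex floor.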

Results — read the caveat, do NOT treat F1 as a win

On the held-out (n=58) and fresh (n=35) sets, regex∪Haiku-WARN scores F1=1.000 (regex-only 0.26/0.23), with all previously missed modes recovered and 0 control FPs. This is an optimistic, circular upper bound and is reported as such: the frozen judge labels were κ-validated against the same construction gold, the positives are synthetic, and the control set is tiny. Cross-judge check: Haiku and Sonnet agree (κ=1.0) on the WARN cases, so the cheap tier suffices. But a universal 1.0 also shows that the corpus saturates: it is too unambiguous to validate real-world precision. An unbiased number needs human gold or an ambiguous/real-trace test set.
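
For readers checking the arithmetic, a small sketch of how a union prediction and F1 are computed, assuming the standard definition F1 = 2·TP / (2·TP + FP + FN). The helper names are hypothetical and the counts in the usage note are illustrative only; they are not the confusion counts behind the figures above.

```bash
# union_pred REGEX JUDGE: prints 1 if either tier flagged the item, else 0.
union_pred() {
  if [ "$1" = "1" ] || [ "$2" = "1" ]; then
    echo 1
  else
    echo 0
  fi
}

# f1 TP FP FN: prints F1 to three decimals, or "nan" when it is undefined.
f1() {
  awk -v tp="$1" -v fp="$2" -v fn="$3" 'BEGIN {
    d = 2 * tp + fp + fn
    if (d == 0) { print "nan"; exit }
    printf "%.3f\n", 2 * tp / d
  }'
}
```

With made-up counts, `f1 10 0 0` prints 1.000 and `f1 3 0 17` prints 0.261. A score of exactly 1.000 requires zero false positives and zero false negatives, which is why a corpus with no ambiguous items can saturate the metric.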

Recommendation

Merge as an opt-in experimental tier + honest eval, not as a validated detector. The headline F1 claim is deliberately withheld pending non-synthetic validation.

🤖 Generated with Claude Code

… unchanged)
- hooks/no-sycophancy-warn.sh: reference cheap-judge WARN tier; opt-in, regex-negative only, cross-model Haiku, NEVER exits 2 (BLOCK floor owns blocking); bash -n ok, smoke pos/neg pass, exits 0
- evaluation/v5/SPEC.md + RESULTS.md: cascade closes the recall gap in CAPABILITY; F1=1.0 reported as circular/optimistic upper bound, not a production metric; real number needs human gold
- no-sycophancy.sh BLOCK path UNCHANGED (git diff main empty)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

eliteinterface and others added 2 commits May 23, 2026 16:30

docs(v5): cross-judge addendum — Haiku==Sonnet on WARN cases; corpus saturates

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

465ae78

waitdeadai merged commit e9172ae into main May 23, 2026
2 checks passed

waitdeadai merged 2 commits into main from feature/v5-cascade-haiku-tier
