Replace Step 5 CI sampling claim with canonical + fix label conflation by PunchTheDev · Pull Request #299 · PunchTheDev/forge-dashboard

PunchTheDev · 2026-06-05T16:40:33Z

Summary

Step 5 of the Quickstart Guide made a factually wrong CI sampling claim and conflated two distinct PR labels. This PR aligns the section with the canonical sources (eval.yml, score.yml) and surfaces both labels accurately.

Motivation

Two correctness issues and one ambiguity, found by a canonical-source check:

CI sampling: the copy said CI runs a quick check (1 easy problem per category). Per .github/workflows/eval.yml L150 + scripts/run_eval_pool.py, CI actually runs 3 specs: one random spec per round (any tier), with 2× determinism on the first. This is the same shape as the anti-gaming sampling already explained in the Anti-gaming section.
PR label conflation: the copy said the optimization label is applied if your agent passes all three categories. Per eval.yml L338–354, that is the passed label's behavior. The optimization label is actually applied when the PR beats SOTA in at least one category. The two were collapsed into one, and the passed label was missing entirely.
"Full scoring runs automatically" timing: the copy was ambiguous about whether full scoring runs per PR or post-merge. Per .github/workflows/score.yml L3, the all-45-spec eval runs only after the PR merges to main.
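
For reviewers who want the corrected behavior in one place, here is a minimal Python sketch of the sampling shape and the two-label rule described above. It is not the code in scripts/run_eval_pool.py or eval.yml. The names sample_ci_pool, pr_labels, and SampledRun, and the category keys in the usage note, are hypothetical. The sketch also assumes the three rounds never pick the same spec twice and that the two labels are applied independently; neither is confirmed by the sources cited here.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SampledRun:
    spec_id: str
    repeats: int  # 2 means the spec is run twice so the outputs can be compared


def sample_ci_pool(spec_ids, rounds=3, rng=None):
    """Pick one random spec per round from the full pool (any tier).

    The first pick is marked for a 2x determinism run.
    Assumption: picks are drawn without replacement.
    """
    rng = rng or random.Random()
    picks = rng.sample(list(spec_ids), k=rounds)
    return [
        SampledRun(spec_id, 2 if index == 0 else 1)
        for index, spec_id in enumerate(picks)
    ]


def pr_labels(passed_by_category, beats_sota_by_category):
    """Return the labels a PR would get under the rule stated in this PR.

    passed: the agent passes every category.
    optimization: the PR beats SOTA in at least one category.
    Assumption: the two labels are evaluated independently.
    """
    labels = []
    if passed_by_category and all(passed_by_category.values()):
        labels.append("passed")
    if any(beats_sota_by_category.values()):
        labels.append("optimization")
    return labels
```

For example, pr_labels({"a": True, "b": True, "c": True}, {"a": False, "b": True, "c": False}) returns both labels, while a PR that passes everything but beats SOTA nowhere gets only passed. That distinction is the one the old copy collapsed.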

Changes

QuickstartGuide.tsx L721–757: lead rewritten around canonical CI sampling shape; routed #anti-gaming ↓ link added; 199-char CI tooltip + 216-char 3-spec pool sample tooltip.
Two-label <ul> replaces the single misleading sentence: passed (194-char tooltip) + optimization (186-char tooltip) — each accurate per eval.yml L338–339.
Closing paragraph adds the score.yml post-merge framing (221-char tooltip) with routed /rankings → overall_score link.
BACKLOG.md: bundled Step 1 → Step 5 row split into 5 individual rows, matching the split shape of prior PRs #295 (Frame Guide Step 2 with CLI vs API roles and filter hints), #296 (Frame Guide Step 1 by canonical Docker vs Native paths), #297 (Surface fork CTA and sandbox limits in Guide Step 3), and #298 (Replace Step 4 generic copy with canonical 4-stage pipeline); Step 5 flipped ○ ○ ○ → ● ● ● with rationale.

Screenshots

Puppeteer-verified at 1440×900 on /guide#submit: all 5 dotted-underline tooltips render (199/216/194/186/221 chars), 3 <code> chips are present (passed, optimization, score.yml), the routed #anti-gaming and /rankings anchors are present, and the old phrases "1 easy problem per category" and "passes all three categories" are gone. 0 new console errors.

Replace Step 5 generic CI claim with canonical sampling + labels

2ddd9e0

PunchTheDev wants to merge 1 commit into main from punch/guide-step5-submit
