Reframe Guide Whitelisted models around harness-picks-model #302

Open
PunchTheDev wants to merge 1 commit into main from punch/guide-whitelisted-models-lead

@PunchTheDev
Owner

Summary

Guide Whitelisted models (L604-637) was a list of 18 chips fronted by a generic "LLM agents receive a harness-injected LLMClient" lead, with the most important fact (the harness picks which model runs; agents cannot override it) relegated to a footnote. This PR rebuilds the section around that fact, removes a third llm.chat() code block already covered in Step 3 and Patterns, and adds tooltips for the two canonical sources that first-timers would otherwise need to grep the repo for.

Motivation

A first-timer reading the prior version sees 18 chips and reasonably concludes "let me pick claude-opus." That's wrong: agents call llm.chat() once, and the harness sets FORGE_MODEL per eval run (default anthropic/claude-haiku-4-5, overridable by secrets.FORGE_MODEL in CI). The "model is fixed by harness" footnote buried the only fact that prevents the misconception.
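To make the harness-picks-model flow concrete, here is a minimal Python sketch. It is not the code in forge/sdk/llm.py: `resolve_model`, `SketchLLMClient`, `ModelNotWhitelistedError`, and the one-entry `WHITELIST` are hypothetical names used only for illustration. What it takes from this PR is the FORGE_MODEL env var, the anthropic/claude-haiku-4-5 default, the max_tokens=4096 default, and the idea that the whitelist check fails before chat() can run.

```python
import os

DEFAULT_MODEL = "anthropic/claude-haiku-4-5"  # default named in this PR

# Hypothetical stand-in for config/model-whitelist.txt (the real file has 18 entries).
WHITELIST = {"anthropic/claude-haiku-4-5"}


class ModelNotWhitelistedError(ValueError):
    """Hypothetical error type; the real SDK's exception is not shown in this PR."""


def resolve_model(env=None):
    """Return the harness-selected model, rejecting anything off the whitelist."""
    env = os.environ if env is None else env
    model = env.get("FORGE_MODEL") or DEFAULT_MODEL
    if model not in WHITELIST:
        raise ModelNotWhitelistedError(f"{model!r} is not whitelisted")
    return model


class SketchLLMClient:
    """Illustrative client: the model is fixed at construction, not per call."""

    def __init__(self, env=None):
        # The whitelist check runs here, so a bad FORGE_MODEL fails before chat().
        self.model = resolve_model(env)

    def chat(self, messages, max_tokens=4096):
        # There is no `model` parameter, so agent code has nothing to override.
        # A real client would call the provider; this stub just echoes the request.
        return {"model": self.model, "max_tokens": max_tokens, "messages": messages}
```

The design point is that the model choice lives in the environment the harness controls, so the 18 chips describe what the harness may select, not what an agent may request.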

The section also re-documented the SDK call shape that Step 3 and Patterns already cover after PRs #297 and #300, making it a third duplicate. Per feedback_link_to_canonical_explainer.md, route to the canonical site instead of duplicating.

Changes

  • QuickstartGuide.tsx L604-637: lead rewritten to foreground that the harness picks which model runs; FORGE_MODEL env var tooltip (326 chars) cites the exact CI workflow lines (eval.yml L157, score.yml L104/151) and SDK enforcement (forge/sdk/llm.py L23, L32-35: the whitelist check raises before chat() ever runs)
  • config/model-whitelist.txt link gets a 200-char tooltip citing the file's own "CI reads it directly. Update this file to add or remove models" header — names it as the canonical authority, not just a reference
  • Removed the 3rd duplicate llm.chat([...]) code block; replaced with a 1-sentence routed <a href="#write">Step 3 - Write your agent ↑</a> pointer carrying the canonical max_tokens=4096 SDK signature from forge/sdk/llm.py L41
  • Anti-gaming rationale moved from buried footnote to a cross-link to #anti-gaming ↓ (carries the step 281/372/373/375 cross-link pattern — surfaces the enforcement chain instead of restating it inline)
  • Chip count auto-derived from WHITELISTED_MODELS.length so the lead and grid never drift if the list is edited
  • BACKLOG.md L111 flipped ● ● ● with rationale

Verification

Puppeteer 1440x900 on /guide#models:

  • hasHarnessPicks=true hasForgeModelInline=true hasNoApiKey=true
  • oldLlmChatBlockGone=true oldFootnoteGone=true (no "The model is fixed by the harness via" phrase)
  • sdkRefersStep3=true maxTokens4096=true linkToWrite=true linkToAntiGaming=true linkToWhitelistFile=true
  • chipCount=18 (matches canonical config/model-whitelist.txt 18 entries)
  • 2 tooltips: FORGE_MODEL env var (326 chars), config/model-whitelist.txt (200 chars)
  • 0 new console errors (1 pre-existing 404 on a tile asset unchanged)
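
The chipCount check above can be cross-checked against the whitelist file with a few lines of Python. This is a sketch under an assumption: the format of config/model-whitelist.txt is not shown in this PR, so the parser assumes one model ID per line with `#` starting a comment. `parse_whitelist` and `chip_count_matches` are hypothetical helper names.

```python
def parse_whitelist(text):
    """Return model IDs from whitelist text.

    Assumed format (not confirmed by this PR): one ID per line,
    with '#' starting a comment and blank lines ignored.
    """
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line:
            entries.append(line)
    return entries


def chip_count_matches(chip_count, whitelist_text):
    """True when the rendered chip count equals the number of whitelist entries."""
    return chip_count == len(parse_whitelist(whitelist_text))
```

In practice `whitelist_text` would be read from config/model-whitelist.txt and `chip_count` taken from the rendered page, so a mismatch points at drift between the grid and the canonical file.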
