docs: rewrite README, add LICENSE, prune internal docs by claygeo · Pull Request #2 · claygeo/eivra

claygeo · 2026-06-14T20:52:58Z

This PR follows a deep investigation and domain-research pass on the repo with a full documentation sync. The old README was accurate for the project as it stood weeks ago, but the codebase has moved on since and the numbers drifted by ~10x.

What changed

Removed internal docs (private paths / launch drafts / stale plan — git history keeps them):

HANDOFF.md, LAUNCH-X.md, PLAN.md

Added LICENSE (MIT) — the README already pointed at it but the file didn't exist.

Genericized scripts/VPS-SETUP.md — dropped a personal local filesystem path + the VPS IP, fixed the dead crucible-ai.netlify.app domain.

Rewrote README.md to match what actually shipped:

Corrected scoring math. Log-loss clamp is [0.01, 0.99] (README said 1e-4). The leaderboard P&L is a flat-$25 directional bet, not the Kelly formula that was shown — that belongs to the trading lab. Eivra Score formula confirmed; clarified ELO does not feed it.
Documented the paper-trading proof lab (/trading), the /live page, and the ~20 /api/trading-* endpoints — an entire shipped subsystem the old README omitted.
Fixed the claude -p "$0 API cost" details — real flags (--max-budget-usd 0.30, --max-turns 5, Task in --disallowedTools), the actual cache key, %TEMP% vs /tmp, and the synthetic backfill created_at.
Refreshed structure tree + schema (added paper_trading_snapshots, evidence_events; dropped per-table row counts that rot on every cron run).
Added honest positioning vs ForecastBench, Prophet Arena, Metaculus FutureEval, and Halawi et al., and softened the overstated "first live stats-honest scoreboard" claim.
Strengthened the honesty section: flagged that ~0.02 Brier / ~97% win across all six agents is backfill inflation, not superhuman skill — superforecasters land ~0.15-0.20, the best published LLM forecaster ~0.24.
Fixed the author link (@deforestpeg was pointing at the suspended /claygdev handle), removed the dead HANDOFF.md link and the references to the deleted files.
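For reviewers who want to sanity-check the scoring corrections listed above, here is a minimal sketch of the three quantities involved. The clamp bounds and the $25 stake come from this PR. The function names and the payoff convention in `flat_bet_pnl` (buy the side the forecast favors at the market price, each share pays $1 if it wins) are assumptions made for illustration and may not match the repo's actual code.

```python
import math

CLAMP_LO, CLAMP_HI = 0.01, 0.99   # clamp bounds stated in this PR
FLAT_STAKE_USD = 25.0             # flat stake stated in this PR


def clamped_log_loss(p_yes: float, outcome: int) -> float:
    """Log loss for one binary forecast, with p clamped to [0.01, 0.99]."""
    p = min(max(p_yes, CLAMP_LO), CLAMP_HI)
    return -math.log(p if outcome == 1 else 1.0 - p)


def brier(p_yes: float, outcome: int) -> float:
    """Brier score for one binary forecast: squared error vs the 0/1 outcome."""
    return (p_yes - outcome) ** 2


def flat_bet_pnl(p_yes: float, market_price: float, outcome: int,
                 stake: float = FLAT_STAKE_USD) -> float:
    """P&L of a flat directional bet (ASSUMED payoff convention, see lead-in).

    Bets YES when the forecast is above the market price, NO when below,
    and skips the bet when they are equal. market_price must be in (0, 1).
    """
    if p_yes == market_price:
        return 0.0
    bet_yes = p_yes > market_price
    price = market_price if bet_yes else 1.0 - market_price
    won = (outcome == 1) == bet_yes
    # A winning bet buys stake/price shares that each pay $1.
    return stake / price - stake if won else -stake
```

With the [0.01, 0.99] clamp, a fully confident wrong forecast costs about 4.61 nats (`-ln 0.01`), where a 1e-4 clamp would cost about 9.21. The stake here never depends on the size of the edge, which is what separates a flat bet from Kelly-style sizing.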

How it was checked

A multi-agent verification pass queried the live Supabase prod DB for ground-truth counts and cross-checked every numeric/path/mechanism claim against the code; the corrections above come from that. Web research grounded the competitive landscape and the superforecaster Brier baselines.

@deforestpeg

Rewrite README to match the shipped system:
- correct scoring math (log-loss clamp [0.01,0.99]; leaderboard P&L is a flat-$25 directional bet, distinct from the quarter-Kelly trading lab)
- document the paper-trading proof lab, the /live page, and the ~20 /api/trading-* endpoints (the biggest doc gap)
- fix the claude -p "$0 API cost" mechanism details (real flags, cache key, per-call budget, Task in disallowedTools, %TEMP% vs /tmp)
- refresh the project-structure tree and schema (add paper_trading_snapshots, evidence_events; drop row counts that rot on every cron run)
- honest positioning vs ForecastBench / Prophet Arena / Metaculus FutureEval / Halawi, and flag that ~0.02 Brier / ~97% win is backfill inflation, not superhuman skill (superforecasters land ~0.15-0.20)
- fix author link (@deforestpeg pointed at the suspended /claygdev handle)

Add MIT LICENSE (README already referenced it).

Remove internal docs that don't belong in a public repo: HANDOFF.md, LAUNCH-X.md, PLAN.md (private local paths, launch-thread/DM drafts, account-suspension notes, stale build plan).

Genericize scripts/VPS-SETUP.md: drop the personal local path and VPS IP, fix the dead crucible-ai.netlify.app domain.

Echo is currently rank #1 (not #2); the insight panel text was stale. Also wires rank_delta_24h arrows (↑/↓) to agent cards so rank movement is visible when standings shift ahead of Thursday launch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QGt65pxBC4WKvuLUukSRhL

claygeo wants to merge 1 commit into main from docs/readme-overhaul-and-cleanup
