docs: refresh README benchmarks to the 0.5.0 release run #92
The README bench snapshot was pinned to run 28413170218 / commit eef9b38, which was still at version 0.4.1: stale numbers for a 0.5.0 release. Re-pinned to run 28506238242, the truth-linux run on the actual v0.5.0 commit 7d45793 (version 0.5.0, passed, 15m42s). Regenerated the README bench block via docs:gen.

Real 0.5.0 numbers: all 7 hard gates green; `stream` parse+patch overhead moved from +0.67% to -1.54% (SSE overflow work added no measurable cost).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01KxU3Y8XueHqfteVGA4KdEh
📝 Walkthrough
This PR refreshes the benchmark snapshot artifacts: README.md's gauntlet benchmark section is updated with a new CI run reference, commit hash, date, and performance metrics, while benchmarks/readme-snapshot.json is updated with matching source provenance, duration, hard-gated pair medians, and diagnostic watch values.

Changes
Benchmark data refresh

Estimated code review effort: 1 (Trivial) | ~3 minutes
Pre-merge checks: ✅ 5 passed
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 675da755bb
| | `llm` text chunk parse | 823,421ns | 775,031ns | 6.20% | 15% | |
| | `worker` fallback eval | 2,982ns | 2,832ns | 5.33% | 15% | |
| | `llm-startup-shared` | 76,604ns | 75,313ns | 1.96% | 25% | |
| | `llm-promoted-startup-shared` | 244,095ns | 251,687ns | 2.58% | 25% | |
Make the promoted-startup overhead internally consistent
In this refreshed snapshot, the row reports a 244,095ns directive median against a 251,687ns baseline median, yet still shows 2.58% overhead. The displayed medians imply roughly -3.02%, so the public release benchmark table says this path is both faster and slower than baseline. If 2.58% is the intended median-of-per-replicate overhead, the row needs either enough context or medians taken from the same aggregate, so operators do not read the 0.5.0 snapshot as a contradictory result.
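The arithmetic behind this comment can be sketched as follows. The first computation uses the two medians shown in the `llm-promoted-startup-shared` row. The replicate pairs in the second half are hypothetical values chosen only to illustrate how a median of per-replicate overheads can have the opposite sign from the overhead of the medians; they are not taken from the benchmark run, and the helper names `overheadPct` and `median` are assumptions, not the repository's API.

```typescript
// Relative overhead of a directive timing against a baseline timing, in percent.
function overheadPct(directiveNs: number, baselineNs: number): number {
  return ((directiveNs - baselineNs) / baselineNs) * 100;
}

// Median of a list of numbers (mean of the two middle values for even lengths).
function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Overhead implied by the two medians displayed in the README row.
const fromMedians = overheadPct(244095, 251687);
console.log(fromMedians.toFixed(2)); // "-3.02"

// Hypothetical replicates (illustration only, not benchmark data).
const replicates = [
  { directiveNs: 100, baselineNs: 98 },
  { directiveNs: 110, baselineNs: 107 },
  { directiveNs: 105, baselineNs: 120 },
];

// Aggregate A: overhead of the medians (negative for this data).
const overheadOfMedians = overheadPct(
  median(replicates.map((r) => r.directiveNs)),
  median(replicates.map((r) => r.baselineNs)),
);

// Aggregate B: median of the per-replicate overheads (positive for this data).
const medianOfOverheads = median(
  replicates.map((r) => overheadPct(r.directiveNs, r.baselineNs)),
);

console.log(overheadOfMedians.toFixed(2), medianOfOverheads.toFixed(2));
```

If the table does mix the two aggregates, one option is to label the overhead column explicitly so that readers do not try to derive it from the adjacent medians.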
The README benchmark snapshot was stale (run `28413170218` / commit `eef9b38`, still version 0.4.1). Re-pinned to run `28506238242`, the truth-linux run on the actual v0.5.0 commit `7d457936` (passed, 15m42s).

Real 0.5.0 numbers now in the README: 7/7 hard gates green; `stream` parse+patch overhead moved from +0.67% to -1.54% (SSE overflow added no measurable cost).

Generated via `refresh-bench-snapshot.ts` + `docs:gen`: README bench block + `benchmarks/readme-snapshot.json` only.

🤖 Generated with Claude Code
Greptile Summary
This PR refreshes the README benchmark snapshot for the v0.5.0 CI run. The main changes are:
- The README benchmark block now references run `28506238242`.
- `benchmarks/readme-snapshot.json`, which feeds the generated README block.

Confidence Score: 5/5
The change is limited to generated benchmark documentation and its snapshot data, with no runtime code modified.
The diff scope is narrow and matches the described documentation refresh.
Comments Outside Diff (1)
General comment
The JSON snapshot has `source.commit: "7d457936"`, and the GitHub API confirms that run `28506238242` is for full SHA `7d4579366efceb256f8d3313d1edbb1d27aba372`. However, `README.md` line 248 renders commit `7d45793`, and the executed head extraction reported no README match for `7d457936`. `pnpm run docs:gen` exits successfully and leaves the README unchanged, so this mismatch is in the generated committed documentation block rather than a local dirty state.

… `7d457936`), then rerun `pnpm run docs:gen` and commit the resulting README if changed.

Reviews (1): Last reviewed commit: "docs: refresh README benchmarks to the 0..."
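As a minimal sketch of why the text search reported no match: both abbreviations quoted in this comment are prefixes of the same full SHA, so they identify the same commit, but they differ as literal strings. The helper `isAbbrevOf` is a hypothetical name used for illustration and is not part of the repository's tooling.

```typescript
// True when `abbrev` is a non-empty prefix of `fullSha` (case-insensitive).
function isAbbrevOf(abbrev: string, fullSha: string): boolean {
  return (
    abbrev.length > 0 &&
    fullSha.toLowerCase().startsWith(abbrev.toLowerCase())
  );
}

// Values quoted in the review comment.
const fullSha = "7d4579366efceb256f8d3313d1edbb1d27aba372";
const readmeCommit = "7d45793"; // 7 characters, as rendered in README.md
const snapshotCommit = "7d457936"; // 8 characters, as stored in the JSON snapshot

// Both abbreviations resolve to the same commit...
const sameCommit =
  isAbbrevOf(readmeCommit, fullSha) && isAbbrevOf(snapshotCommit, fullSha);

// ...but a literal comparison (or text search) treats them as different.
const sameText = readmeCommit === snapshotCommit;

console.log({ sameCommit, sameText });
```

A check that compares abbreviations by prefix against the full SHA would accept either length, whereas an exact string match requires the README and the snapshot to use the same abbreviation length.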