docs: refresh README benchmarks to the 0.5.0 release run by heyoub · Pull Request #92 · freebatteryfactory/LiteShip

heyoub · 2026-07-01T11:48:58Z

The README benchmark snapshot was stale (run 28413170218 / commit eef9b38, still version 0.4.1). Re-pinned to run 28506238242 — the truth-linux run on the actual v0.5.0 commit 7d457936 (passed, 15m42s).

Real 0.5.0 numbers now in the README: 7/7 hard gates green; stream parse+patch overhead moved from +0.67% → -1.54% (SSE overflow added no measurable cost).

Generated via refresh-bench-snapshot.ts + docs:gen — README bench block + benchmarks/readme-snapshot.json only.

🤖 Generated with Claude Code

Summary by CodeRabbit

Documentation
- Refreshed the benchmark snapshot in the README with the latest CI artifact reference, commit, date, and performance metrics.
- Updated the benchmark snapshot data to reflect newer runtime, hard-gated pair, and diagnostic watch results.

Greptile Summary

This PR refreshes the README benchmark snapshot for the v0.5.0 CI run. The main changes are:

- Updated the generated README benchmark block to CI run 28506238242.
- Refreshed the hard-gated benchmark values and diagnostic watch numbers.
- Updated benchmarks/readme-snapshot.json, which feeds the generated README block.

Confidence Score: 5/5

The change is limited to generated benchmark documentation and its snapshot data, with no runtime code modified.

The diff scope is narrow and matches the described documentation refresh.

T-Rex Logs

What T-Rex did

- Ran the base benchmark snapshot extraction and captured the initial results, including the stale CI run, duration, and stream overhead values.
- Ran the head snapshot extraction and performed docs generation with pnpm run docs:gen; JSON checks passed and docs were unchanged, but the README commit token extraction did not find 7d457936 (the README renders 7d45793).
- Fetched GitHub API metadata for run 28506238242, confirming the full head SHA begins with 7d45793.

_{Ran code and verified through T-Rex}

Comments Outside Diff (1)

General comment

README benchmark block renders the refreshed commit one character shorter than the snapshot contract specifies
- Bug
  - The requested head contract says the README and snapshot should contain commit 7d457936. The JSON snapshot has source.commit: "7d457936", and the GitHub API confirms run 28506238242 is for full SHA 7d4579366efceb256f8d3313d1edbb1d27aba372. However, README.md line 248 renders commit 7d45793, and the executed head extraction reported no README match for 7d457936. pnpm run docs:gen exits successfully and leaves the README unchanged, so this mismatch is in the generated committed documentation block rather than a local dirty state.
- Cause
  - The README generation path appears to shorten the snapshot commit to 7 characters, while the refreshed snapshot/validation contract expects the 8-character commit token 7d457936.
- Fix
  - Update the README generation logic or committed README benchmark block so the rendered commit matches the snapshot contract (7d457936), then rerun pnpm run docs:gen and commit the resulting README if changed.
_{Ran code and verified through T-Rex}
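The mismatch described above can be reproduced with a short check. This is a minimal sketch, not the repository's actual validation code: `isPrefixOfSha`, `readmeHasCommit`, and the sample README line are hypothetical names and text introduced for illustration. Only the commit tokens and the full SHA come from the review.

```typescript
// Full head SHA reported by the GitHub API for run 28506238242 (from the review).
const fullSha = "7d4579366efceb256f8d3313d1edbb1d27aba372";

// Returns true when `token` is a non-empty prefix of the full commit SHA.
function isPrefixOfSha(token: string, sha: string): boolean {
  return token.length > 0 && sha.startsWith(token);
}

// Returns true when the text contains `commit` as a whole hex token,
// i.e. not as part of a longer run of hex characters.
function readmeHasCommit(readmeText: string, commit: string): boolean {
  if (!/^[0-9a-f]+$/i.test(commit)) {
    throw new Error(`not a hex commit token: ${commit}`);
  }
  const pattern = new RegExp(`(^|[^0-9a-f])${commit}([^0-9a-f]|$)`, "i");
  return pattern.test(readmeText);
}

// Hypothetical README line; the real line's wording is an assumption.
const readmeLine = "Snapshot from run 28506238242, commit 7d45793";

console.log(isPrefixOfSha("7d45793", fullSha));      // true
console.log(isPrefixOfSha("7d457936", fullSha));     // true
console.log(readmeHasCommit(readmeLine, "7d45793")); // true
console.log(readmeHasCommit(readmeLine, "7d457936")); // false
```

Both tokens are valid prefixes of the same SHA, so neither is wrong about which commit was benchmarked. The failure is purely a length disagreement between the rendered README and the token the validation step searches for.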

The README bench snapshot was pinned to run 28413170218 / commit eef9b38, which was still at version 0.4.1, so it showed stale numbers for a 0.5.0 release. Re-pinned to run 28506238242, the truth-linux run on the actual v0.5.0 commit 7d45793 (version 0.5.0, passed, 15m42s). Regenerated the README bench block via docs:gen.

Real 0.5.0 numbers: all 7 hard gates green; `stream` parse+patch overhead moved from +0.67% to -1.54% (SSE overflow work added no measurable cost).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01KxU3Y8XueHqfteVGA4KdEh

coderabbitai · 2026-07-01T11:49:20Z

📝 Walkthrough

Walkthrough

This PR refreshes the benchmark snapshot artifacts: README.md's gauntlet benchmark section is updated with new CI run reference, commit hash, date, and performance metrics, while benchmarks/readme-snapshot.json is updated with matching source provenance, duration, hard-gated pair medians, and diagnostic watch values.

Changes

Benchmark data refresh

| Layer / File(s) | Summary |
|---|---|
| Snapshot JSON data update `benchmarks/readme-snapshot.json` | Updated `source` metadata (runId, commit, capturedAt), `gauntlet.durationFormatted`, `hardGatedPairs` median directive/baseline/overhead values, and `diagnosticWatch` fields for `llm-runtime-steady`. |
| README benchmark snapshot text update `README.md` | Updated snapshot attribution line, refreshed `gauntlet:full`/`bench:gate`/`package:smoke` results and hard-gated pair table, and updated the Diagnostic watch paragraph. |
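The `source` fields named in the walkthrough can be pictured as a small typed record. The sketch below is an assumption about shape only: the field names `runId`, `commit`, and `capturedAt` come from the summary above, but their types, the `validateSource` helper, and the 8-character commit rule are hypothetical and not taken from the repository.

```typescript
// Assumed shape of the `source` block in benchmarks/readme-snapshot.json.
// Field names are from the walkthrough; the types are guesses.
interface SnapshotSource {
  runId: number;
  commit: string;
  capturedAt: string;
}

// Hypothetical validator: returns a list of problems, empty when the record looks sane.
function validateSource(source: SnapshotSource, commitLength = 8): string[] {
  const problems: string[] = [];
  if (!Number.isInteger(source.runId) || source.runId <= 0) {
    problems.push("runId must be a positive integer");
  }
  const hex = new RegExp(`^[0-9a-f]{${commitLength}}$`);
  if (!hex.test(source.commit)) {
    problems.push(`commit must be ${commitLength} lowercase hex characters`);
  }
  if (Number.isNaN(Date.parse(source.capturedAt))) {
    problems.push("capturedAt must be a parseable date");
  }
  return problems;
}

const refreshed: SnapshotSource = {
  runId: 28506238242,  // run ID from the PR description
  commit: "7d457936",  // 8-character token from the snapshot
  capturedAt: "2026-07-01T00:00:00Z", // placeholder, not the real capture time
};

console.log(validateSource(refreshed)); // []
```

Pinning the commit length in one place like this is one way to keep the snapshot and any text rendered from it in agreement on how many characters a commit token has.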

Estimated code review effort: 1 (Trivial) | ~3 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly summarizes the main change: refreshing the README benchmark snapshot for the 0.5.0 release run. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 675da755bb

chatgpt-codex-connector · 2026-07-01T11:53:00Z

+| `llm` text chunk parse | 823,421ns | 775,031ns | 6.20% | 15% |
+| `worker` fallback eval | 2,982ns | 2,832ns | 5.33% | 15% |
+| `llm-startup-shared` | 76,604ns | 75,313ns | 1.96% | 25% |
+| `llm-promoted-startup-shared` | 244,095ns | 251,687ns | 2.58% | 25% |


Make the promoted-startup overhead internally consistent

For this refreshed snapshot, this row reports a 244,095ns median directive against a 251,687ns median baseline but still shows 2.58% overhead; the displayed medians imply roughly -3.02%, so the public release benchmark table now says this path is both faster and slower than baseline. If 2.58% is the intended median-of-per-replicate overhead, the row needs enough context or values from the same aggregate so operators do not read the 0.5.0 snapshot as a contradictory result.
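The arithmetic behind this comment is easy to check. The sketch below recomputes the overhead from the two displayed medians, then uses a set of hypothetical per-replicate timings (made up for illustration, not taken from the benchmark run) to show how a median of per-replicate overheads can have the opposite sign from the overhead of the medians. The helper names `overheadPercent` and `median` are assumptions.

```typescript
// Relative overhead of the directive path against the baseline, in percent.
function overheadPercent(directiveNs: number, baselineNs: number): number {
  return ((directiveNs - baselineNs) / baselineNs) * 100;
}

// Median of a non-empty list of numbers.
function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Medians displayed in the `llm-promoted-startup-shared` row.
const fromDisplayedMedians = overheadPercent(244095, 251687);
console.log(fromDisplayedMedians.toFixed(2)); // "-3.02"

// Hypothetical paired replicates, chosen only to illustrate the sign flip.
const replicates = [
  { directive: 100, baseline: 90 },
  { directive: 102, baseline: 101 },
  { directive: 95, baseline: 110 },
];

// Aggregate 1: overhead computed from the two medians.
const overheadOfMedians = overheadPercent(
  median(replicates.map((r) => r.directive)),
  median(replicates.map((r) => r.baseline)),
);

// Aggregate 2: median of the per-replicate overheads.
const medianOfOverheads = median(
  replicates.map((r) => overheadPercent(r.directive, r.baseline)),
);

console.log(overheadOfMedians.toFixed(2)); // "-0.99"
console.log(medianOfOverheads.toFixed(2)); // "0.99"
```

If the published 2.58% is a median of per-replicate overheads, then both it and the displayed medians can be correct at once, which is why the comment asks for the row to say which aggregate each column uses.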

chatgpt-codex-connector Bot reviewed Jul 1, 2026

heyoub merged commit 75bf50b into main Jul 1, 2026
11 checks passed

heyoub deleted the docs/refresh-bench-0.5.0 branch July 1, 2026 12:06
