fix: resolve LzSeqR parallel encode routing bug by ChrisLundquist · Pull Request #120 · ChrisLundquist/libpz

ChrisLundquist · 2026-03-11T00:33:08Z

Summary

Fix LzSeqR encode routing bug: The parallel scheduler routed LzSeqR entropy (stage 1) to stage_rans_encode_webgpu, which uses a chunked-payload wire format that is incompatible with the standard rANS decoder. The single-block and single-thread paths correctly used stage_rans_encode_with_options (CPU rANS). The mismatch caused InvalidInput errors when compressing with LzSeqR + WebGpu backend + threads > 1.
Remove dead stage_rans_encode_webgpu: No callers remain after the fix. GPU rANS entropy is known to be slower than CPU (0.54-0.77x), so re-enabling it is low priority.
Add 6-stream regression test: test_gpu_rans_interleaved_decode_lzseqr_6stream verifies that LzSeqR round-trips correctly with the GPU backend and multi-threading.
Update TODO docs: Mark the GPU rANS bug as resolved, with root cause analysis. Add Criterion benchmark data to the Lzfi vs LzssR comparison (Lzfi dominates: 543 vs 333 MB/s compress).
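
The routing change in the first item can be pictured with a small model. This is only a sketch under stated assumptions: the enums, the function names, and the `num_blocks` / `threads` parameters are hypothetical stand-ins invented for illustration, not libpz's actual scheduler code or signatures. It encodes just what the summary states: before the fix, the multi-block, multi-thread path with the WebGpu backend picked the chunked-format GPU encoder, and after the fix every path picks CPU rANS.

```rust
/// Hypothetical backend selector (illustration only, not libpz's type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Cpu,
    WebGpu,
}

/// Hypothetical label for which entropy encoder handles LzSeqR stage 1.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EntropyEncoder {
    /// Stands in for stage_rans_encode_with_options (standard wire format).
    CpuRans,
    /// Stands in for the removed stage_rans_encode_webgpu (chunked payload).
    GpuRansChunked,
}

/// Model of the routing described as buggy: only the parallel path
/// (more than one block and more than one thread) on WebGpu diverged.
pub fn route_lzseqr_entropy_before_fix(
    backend: Backend,
    num_blocks: usize,
    threads: usize,
) -> EntropyEncoder {
    let parallel = num_blocks > 1 && threads > 1;
    if parallel && backend == Backend::WebGpu {
        EntropyEncoder::GpuRansChunked
    } else {
        EntropyEncoder::CpuRans
    }
}

/// Model of the routing after the fix: every path uses CPU rANS,
/// so the output always matches what the standard decoder expects.
pub fn route_lzseqr_entropy_after_fix(
    _backend: Backend,
    _num_blocks: usize,
    _threads: usize,
) -> EntropyEncoder {
    EntropyEncoder::CpuRans
}
```

In this model the two functions differ only for the WebGpu, multi-block, multi-thread combination, which is the one configuration the summary reports as failing.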

Test plan

696 tests pass, clippy clean, zero warnings
New test test_gpu_rans_interleaved_decode_lzseqr_6stream passes
Existing test_gpu_rans_interleaved_decode_round_trip (LzssR) still passes
Criterion benchmarks show no regressions
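
Why a round-trip test catches this class of bug can be shown with a toy example. The two framings below are invented for illustration and are not libpz's real rANS wire formats; the function names and the `CodecError` type are likewise hypothetical. The sketch only demonstrates the general failure mode: a decoder that expects one framing rejects output written in another, which a round-trip assertion surfaces immediately.

```rust
/// Hypothetical error type mirroring the InvalidInput failure in the report.
#[derive(Debug, PartialEq, Eq)]
pub enum CodecError {
    InvalidInput,
}

/// Toy "standard" framing: one 4-byte little-endian length, then the payload.
pub fn encode_standard(data: &[u8]) -> Vec<u8> {
    let mut out = (data.len() as u32).to_le_bytes().to_vec();
    out.extend_from_slice(data);
    out
}

/// Toy "chunked" framing: every chunk carries its own 4-byte length prefix.
pub fn encode_chunked(data: &[u8], chunk_size: usize) -> Vec<u8> {
    let mut out = Vec::new();
    for chunk in data.chunks(chunk_size) {
        out.extend_from_slice(&(chunk.len() as u32).to_le_bytes());
        out.extend_from_slice(chunk);
    }
    out
}

/// Decoder that only understands the toy standard framing.
pub fn decode_standard(buf: &[u8]) -> Result<Vec<u8>, CodecError> {
    if buf.len() < 4 {
        return Err(CodecError::InvalidInput);
    }
    let len = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    let payload = &buf[4..];
    // The declared length must cover exactly the rest of the buffer.
    if payload.len() != len {
        return Err(CodecError::InvalidInput);
    }
    Ok(payload.to_vec())
}

/// Round-trip check: does decoding `encoded` reproduce `original`?
pub fn round_trips(encoded: &[u8], original: &[u8]) -> bool {
    match decode_standard(encoded) {
        Ok(decoded) => decoded.as_slice() == original,
        Err(_) => false,
    }
}
```

Feeding `encode_chunked` output with several chunks into `decode_standard` fails the length check, while `encode_standard` output round-trips. A test that exercises only one encode path would miss the mismatch, which is presumably why the new test targets the GPU backend with multi-threading specifically.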

🤖 Generated with Claude Code

The parallel scheduler routed LzSeqR stage 1 (entropy) to stage_rans_encode_webgpu, which uses a chunked-payload wire format that is incompatible with the standard rANS decoder. The single-block and single-thread paths correctly used stage_rans_encode_with_options (CPU rANS). Fixed by removing the GPU routing for LzSeqR entropy, so the parallel path now uses the same CPU path as the others. Removed the now-dead stage_rans_encode_webgpu function. Added test_gpu_rans_interleaved_decode_lzseqr_6stream to catch regressions. Updated TODO docs with investigation findings and Criterion benchmark data for the Lzfi vs LzssR comparison.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* bench: enable parallel, large, and webgpu benchmarks for Lzfi

  Lzfi was only benchmarked on the small Canterbury corpus with no parallel, large-file, or WebGPU variants. Enable all modes to match the LzSeqR and Lzf benchmark coverage.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: update CLAUDE.md with architecture overview and Silesia benchmarks

  Add architecture section documenting the unified token pipeline (PR #118), active/removed pipelines table, and Silesia corpus benchmark data. Update project layout to reflect lz_token.rs and removed modules. Update dead ends with streaming path bottleneck finding and LzSeqR routing bug (PR #120).

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

ChrisLundquist merged commit 7593373 into master Mar 11, 2026
4 checks passed

ChrisLundquist mentioned this pull request Mar 12, 2026

docs: update CLAUDE.md and enable Lzfi benchmarks #123

Merged

3 tasks

ChrisLundquist merged 1 commit into master from claude/investigate-todos