feat(shape): quantization-aware refinement + placeholder R-D study by Jannchie · Pull Request #1 · Jannchie/arthash

Jannchie · 2026-06-13T08:48:03Z

AS IS

arthash's shape modes fit primitives greedily and never revisit a placed shape. Separately, the placeholder formats this repo competes with (blurhash / thumbhash / sqip) have no academic evaluation, and the sub-300-byte regime they live in has no rate-distortion characterization — so "shape modes look better" rests on a couple of PSNR spot checks (README Benchmarks), not a study.

TO BE

One research line, two deliverables:

Feature — SearchOptions.refine_passes (default 0, byte-format-preserving). Optional quantization-aware joint refinement across all four primitive modes: remove each shape, re-search against the shape-removed canvas, keep the replacement only if it lowers exact total SSE under wire-quantized parameters (a continuous-domain win can flip sign after quantization). Shared via common::refine_shapes; each mode adds search_<shape> + quantize_<shape>.
Research scaffolding + write-up. scripts/paper/ (R-D benchmark, refinement ablation, entropy-coding headroom, weighted-objective PoC, encode-latency Pareto, a faithful Marwood ICIP'18 reimplementation, dataset fetch), the findings frozen in docs/RD_STUDY.md, and an ICIP draft in paper/main.tex.

Key results (Kodak + CLIC; PSNR / SSIM / LPIPS / DISTS): geometric primitives Pareto-dominate blurhash / thumbhash and a faithful Marwood reimpl on perceptual metrics — triangle-12 @77 B matches Marwood @187 B on LPIPS (2.4× smaller) and encodes 189× faster than SQIP. Three ablations (refinement, perceptual weighting, entropy coding) localize the bottleneck to primitive expressiveness, not the objective or the serialization.
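As a reading aid for "Pareto-dominate", here is a minimal sketch of a dominance check over (size, LPIPS) points. The `Point` type, the method names, and every number in it are placeholders invented for illustration; they are not measurements from the study, and this is not the code in scripts/paper/.

```python
from typing import NamedTuple


class Point(NamedTuple):
    name: str
    size_bytes: int
    lpips: float  # lower is better


def dominates(a: Point, b: Point) -> bool:
    """a dominates b if it is no worse on both axes and strictly better on one."""
    no_worse = a.size_bytes <= b.size_bytes and a.lpips <= b.lpips
    better = a.size_bytes < b.size_bytes or a.lpips < b.lpips
    return no_worse and better


def pareto_front(points):
    """Keep the points that no other point dominates, sorted by size."""
    front = [p for p in points if not any(dominates(q, p) for q in points)]
    return sorted(front, key=lambda p: p.size_bytes)


# Placeholder values only (hypothetical methods, hypothetical scores).
points = [
    Point("method-a", 20, 0.60),
    Point("method-b", 80, 0.50),
    Point("method-c", 160, 0.65),  # larger and worse than method-a
    Point("method-d", 190, 0.50),  # larger than method-b, no better
]
front = pareto_front(points)
print([p.name for p in front])
```

A method sits on the front only if nothing else is both smaller and perceptually closer; the claim in the key results is that the shape modes occupy that front below roughly 300 B.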

Verification

125 byte-compat regression tests green; refine_passes=0 keeps output byte-identical (RNG stream untouched).
cargo clippy clean.
Marwood reimpl validated to the paper's magnitude (221 px / ~200 B → ~24 dB on simple content).

Blast radius: the only production-code change is the opt-in refinement; default-path encode/decode bytes are unchanged. Everything else is research tooling + docs (datasets and figures git-ignored).

Opt-in SearchOptions.refine_passes (default 0) runs backfitting passes after the greedy fit across all four primitive modes: remove each shape, re-search against the shape-removed canvas, keep the replacement only when it lowers exact total SSE. The accept test renders wire-quantized params, so it judges decoder output -- a continuous-domain win can flip sign under 5-bit position / 4-bit radius / RGB565 quantization. Default 0 preserves byte-identical output (RNG stream untouched); the 125 byte-compat regression tests stay green. The refinement loop is shared via common::refine_shapes; each mode supplies search_<shape> + quantize_<shape>.
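The backfitting loop described above can be sketched on a toy 1-D problem. This is a Python illustration under simplifying assumptions, not the Rust code in common::refine_shapes: the "shapes" are 1-D spans `(center, radius, value)`, `search` is an exhaustive stand-in for search_<shape>, and `quantize` snaps only the value to four levels instead of the real 5-bit / 4-bit / RGB565 wire format.

```python
def render(shapes, width):
    """Paint 1-D shapes (center, radius, value) in order onto a zero canvas."""
    canvas = [0.0] * width
    for center, radius, value in shapes:
        for x in range(max(0, center - radius), min(width, center + radius + 1)):
            canvas[x] = value
    return canvas


def sse(target, canvas):
    return sum((t - c) ** 2 for t, c in zip(target, canvas))


def quantize(shape, levels=4):
    """Toy wire format: snap the value to `levels` evenly spaced steps in [0, 1]."""
    center, radius, value = shape
    step = 1.0 / (levels - 1)
    return (center, radius, round(value / step) * step)


def search(target, others, index, max_radius=3):
    """Best continuous-valued shape for slot `index`, given the other shapes."""
    width = len(target)
    best, best_err = None, float("inf")
    for center in range(width):
        for radius in range(max_radius + 1):
            lo, hi = max(0, center - radius), min(width, center + radius + 1)
            cand = (center, radius, sum(target[lo:hi]) / (hi - lo))
            trial = others[:index] + [cand] + others[index:]
            err = sse(target, render(trial, width))
            if err < best_err:
                best, best_err = cand, err
    return best


def greedy_fit(target, count):
    shapes = []
    for _ in range(count):
        shapes.append(search(target, shapes, len(shapes)))
    return shapes


def quantized_sse(target, shapes):
    """Error of what a decoder would draw: render the quantized parameters."""
    return sse(target, render([quantize(s) for s in shapes], len(target)))


def refine_shapes(target, shapes, passes):
    shapes = list(shapes)
    best = quantized_sse(target, shapes)
    for _ in range(passes):
        for i in range(len(shapes)):
            others = shapes[:i] + shapes[i + 1:]      # remove shape i
            cand = search(target, others, i)          # re-search its slot
            trial = others[:i] + [cand] + others[i:]
            err = quantized_sse(target, trial)        # judge decoder output
            if err < best:                            # accept only strict wins
                shapes, best = trial, err
    return shapes, best


target = [0.1, 0.1, 0.9, 0.9, 0.9, 0.2, 0.2, 0.6, 0.6, 0.6, 0.6, 0.1]
greedy = greedy_fit(target, 3)
unrefined, sse0 = refine_shapes(target, greedy, passes=0)
refined, sse2 = refine_shapes(target, greedy, passes=2)
print(sse0, sse2)
```

Because a replacement is kept only when the quantized total SSE strictly drops, the refined error can never exceed the greedy one, and `passes=0` returns the greedy shapes untouched, which mirrors the byte-identical default.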

Reproducible scaffolding under scripts/paper/ (R-D benchmark, refinement ablation, entropy-coding headroom, dataset fetch) and the write-up in docs/RD_STUDY.md. Key findings: shape modes are Pareto-dominant on LPIPS/DISTS below ~300 B (circle-4 at 20 B beats blurhash-9x9 at 166 B); joint refinement improves PSNR but not perception, motivating a perceptual objective next; entropy-coding headroom is <5%. Image corpora and regenerated figures are git-ignored (CSVs already were).
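One way to read "entropy-coding headroom" is the fraction of bytes an ideal entropy coder could save over fixed-width fields. The sketch below uses a zeroth-order Shannon bound on a symbol stream; this is an assumption about the measurement, since the PR does not spell out how the headroom script computes it, and the function names are hypothetical.

```python
import math
from collections import Counter


def ideal_entropy_bytes(symbols):
    """Zeroth-order Shannon bound, in bytes, for coding the symbols independently."""
    counts = Counter(symbols)
    total = len(symbols)
    bits = -sum(n * math.log2(n / total) for n in counts.values())
    return bits / 8


def headroom(symbols, bits_per_symbol):
    """Fraction of the fixed-width size an ideal coder could remove."""
    raw_bytes = len(symbols) * bits_per_symbol / 8
    return 1.0 - ideal_entropy_bytes(symbols) / raw_bytes


# Synthetic streams of 4-bit symbols, for illustration only.
uniform = list(range(16)) * 4          # every value equally likely
skewed = [0] * 60 + [1, 2, 3, 4]       # one value dominates
print(headroom(uniform, 4), headroom(skewed, 4))
```

A near-uniform parameter stream leaves almost nothing for an entropy coder to exploit, which is the situation a headroom below 5% points to. The bound ignores the cost of transmitting the symbol model, so it is optimistic for the entropy coder.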

Adds the weighted-objective PoC (scripts/paper/perceptual_poc.py) and its negative result to RD_STUDY.md: edge / center / saliency per-pixel weighting all fail to improve LPIPS on a uniform-RNG greedy circle fitter (best is strong saliency at -0.5% LPIPS for -0.65 dB PSNR, and unstable per image). With L2 refinement, perceptual weighting, and entropy coding all bounded, the study reframes around a single thesis -- sub-300-byte placeholders are limited by primitive expressiveness, not the objective or serialization -- and the paper takes a measurement positioning.
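For readers unfamiliar with per-pixel weighting, a minimal sketch of a weighted SSE objective with a center prior follows. The Gaussian form, the `sigma` value, and the function names are assumptions for illustration; the actual weights in perceptual_poc.py (edge, center, saliency) are not shown in this PR.

```python
import math


def center_weights(width, height, sigma=0.5):
    """Gaussian center prior over a row-major image, normalized to mean 1."""
    raw = []
    for y in range(height):
        for x in range(width):
            dx = (x + 0.5) / width - 0.5
            dy = (y + 0.5) / height - 0.5
            raw.append(math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma)))
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]


def weighted_sse(target, recon, weights):
    """Sum of squared errors with a per-pixel weight on each term."""
    return sum(w * (t - r) ** 2 for t, r, w in zip(target, recon, weights))


# 4x4 grayscale toy images stored row-major.
target = [0.5] * 16
corner_error = list(target)
corner_error[0] = 1.0      # mistake at the top-left corner
center_error = list(target)
center_error[5] = 1.0      # same-sized mistake near the center

weights = center_weights(4, 4)
print(weighted_sse(target, corner_error, weights),
      weighted_sse(target, center_error, weights))
```

Normalizing the weights to mean 1 keeps the weighted and unweighted objectives on the same scale, so a fitter driven by this objective trades error at the borders for error near the center rather than simply rescaling the loss.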

speed_benchmark.py measures pure encode latency per method on the Kodak thumbnails and joins each method's mean LPIPS from rd_results_kodak.csv. Key result (RD_STUDY §1.1): arthash shape modes own the fast-and-perceptual lower-left corner; arthash triangle-12 encodes 189x faster than SQIP (1.5 ms vs 284 ms) at ~20x smaller output — the integral-image hill-climb is what buys it. SQIP comes from the existing same-machine js_cross benchmark (its sharp dep is broken locally); blurhash's latency is its pure-Python reference impl and is not leaned on. Also ignore bench/*.png (reproducible figures) and bench/div2k/.
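A latency-then-join measurement of this kind can be sketched as below. This is not speed_benchmark.py: the median-of-repeats timing policy, the function names, and the stand-in encoder are assumptions, and the LPIPS value in the example is a placeholder rather than a figure from rd_results_kodak.csv.

```python
import statistics
import time


def encode_latency_ms(encode, images, repeats=5):
    """Median wall-clock encode time per image (ms), averaged over the images."""
    per_image = []
    for image in images:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            encode(image)  # time only the encode call, not loading or decoding
            samples.append((time.perf_counter() - start) * 1000.0)
        per_image.append(statistics.median(samples))
    return statistics.mean(per_image)


def join_lpips(latencies, mean_lpips):
    """Inner join: method -> (latency_ms, mean LPIPS) for methods present in both."""
    return {m: (latencies[m], mean_lpips[m]) for m in latencies if m in mean_lpips}


def toy_encode(image):
    """Stand-in encoder (hypothetical); a real run would call the codec under test."""
    return sum(image) % 256


images = [[i % 7 for i in range(1024)] for _ in range(3)]
latencies = {"toy": encode_latency_ms(toy_encode, images)}
table = join_lpips(latencies, {"toy": 0.5, "other": 0.4})  # placeholder LPIPS
print(table)
```

Taking the median over repeats damps scheduler noise on a single image, while the inner join drops any method that lacks either a latency or a perceptual score, so every plotted point has both coordinates.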

marwood_baseline.py reimplements Marwood et al. "Representing Images in 200 Bytes" (no official code exists): g×g grid vertices, implicit Delaunay connectivity, K-color palette vertex indices, Gouraud fill, error-driven greedy placement + palette coordinate descent, ideal-entropy byte model (generous to the baseline). Validated to the paper's magnitude at 221px/200B. Result (RD_STUDY §1.2): on Kodak, Marwood wins PSNR (MSE-optimal Gouraud mesh) but loses LPIPS decisively -- arthash triangle-12 matches Marwood's 187-byte LPIPS at 77 bytes (2.4x smaller) and wins at every rate. Gouraud smoothing discards the structure LPIPS rewards, same failure mode as blurhash. The split is the study's strongest evidence that PSNR is the wrong metric for sub-300-byte placeholders.

Complete first draft (paper/main.tex, IEEEtran conference) written from the RD_STUDY results: primitives Pareto-dominate industrial formats and a faithful Marwood reimpl on LPIPS/DISTS, the PSNR-vs-perception split, the 189x encode-latency edge over SQIP, and the three bounding ablations. Figures are staged from bench/ per paper/README.md (paper/figures/ git-ignored).

CITATION.cff (GitHub "Cite this repository") plus a Citation section in both READMEs pointing at docs/RD_STUDY.md, scripts/paper/, and the paper/ draft. main.tex gains a Limitations paragraph (single codec, two corpora, Marwood is our reimpl, SQIP's perceptual point inferred not measured, LPIPS/DISTS are proxies).

The TS wasm binding's parse_search built SearchOptions without the new refine_passes field, breaking `cargo build` for arthash-wasm (E0063). Mirror the PyO3 binding: read it from JS, default to the SearchOptions default.

Jannchie added 7 commits June 13, 2026 16:39

Jannchie had a problem deploying to pypi June 13, 2026 08:54 — with GitHub Actions Failure

fix(wasm): thread refine_passes through SearchOptions parse

c910a94

Jannchie had a problem deploying to pypi June 13, 2026 09:23 — with GitHub Actions Failure

Jannchie merged commit f529288 into main Jun 13, 2026
19 of 20 checks passed

Jannchie deleted the feature/joint-refinement branch June 13, 2026 09:24

Jannchie merged 8 commits into main from feature/joint-refinement
