Merged
11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -21,6 +21,17 @@ JitPack continue to resolve through the existing coordinates.
`./mvnw -DskipTests -P japicmp verify -pl .`; HTML/MD/XML reports
land in `target/japicmp/`. JitPack repository is scoped to the
`japicmp` profile, so downstream consumers do not inherit it.
- **New `benchmarks/README.md`** (Track B1). Honest framing for the
manual benchmark layer ahead of the Maven Central debut: explicitly
positions the harness as a smoke / diff / endurance tool — not a
JMH-grade benchmark — and tells callers when *not* to use it
(publishable performance claims, architectural decisions,
cross-library comparisons that read too much into a single number).
Documents the file-by-file role of each runner / report tool, the
exact CI smoke invocation, and a "How to read a report" cheat sheet.
Cross-links the planned JMH chain (Track C, B3 → B6 in 1.7.0) so a
reader knows what's coming and how to identify "rigorous"
measurements when they arrive.
- **Class-level `@since 1.0.0` Javadoc on the public entry-point
surface** (Track H1). 26 public types in the canonical user-reached
packages (`com.demcha.compose.GraphCompose`, `com.demcha.compose.document.api.{DocumentSession, DocumentPageSize, PageBackgroundFill}`,
134 changes: 134 additions & 0 deletions benchmarks/README.md
@@ -0,0 +1,134 @@
# GraphCompose Benchmarks Module

> **What this is.** A **manual performance harness** for GraphCompose —
> a small set of Java programs that render representative documents
> repeatedly and report rough numbers (latency, throughput, byte size,
> peak memory) to a JSON / CSV / text report.
>
> **What this is _not_.** A JMH-grade benchmark. There is no warmup
> control, no forked JVM, no per-measurement reset, no GC profiling
> beyond what JFR / `-verbose:gc` can pick up out-of-band. Numbers
> produced here are **rough local comparisons** suitable for "did this
> change regress something obviously?" — not for public marketing
> claims, not for cross-machine performance comparisons, and not for
> answering "how does GraphCompose compare to iText / openHTMLToPDF /
> JasperReports?" with rigour.
>
> A separate JMH layer (sibling chain Track C: B3 → B4 → B5 → B6 in the
> 1.7.0 plan) will sit alongside this harness when it lands. Until
> then, treat these numbers as **smoke-test fidelity, not benchmark
> fidelity**.

## When to use the harness

- **Smoke check before a release** — `CurrentSpeedBenchmark -Dgraphcompose.benchmark.profile=smoke`
takes ~15 s, exercises the canonical render path through 5 fixture
scenarios, and prints a single-page latency / throughput table.
CI runs this on every PR (the `perf-smoke` job); the goal is "did
this PR make a representative render visibly slower?" — *not* "is
this number a publishable performance claim".

- **Pre/post comparison on a single machine** — render a fixture
before and after a layout change, run `BenchmarkDiffTool` against
the two JSON reports, and eyeball the delta. Run-to-run variance is in
the single-digit percent range, so treat deltas inside ±5 % as noise on
a typical development machine and tighten the threshold only when
comparing on a quiescent system with a fixed CPU frequency.

- **Stress / endurance check** — `GraphComposeStressTest` and
`EnduranceTest` drive higher-cardinality fixtures over longer
windows to catch GC pressure spikes or memory leaks that a single
smoke run wouldn't surface. Run by hand; not on CI by default.
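
The ±5 % noise band from the pre/post bullet can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: `DeltaNoiseCheck`, its method names, and the sample latencies are hypothetical and are not part of `BenchmarkDiffTool`, whose actual output format and thresholds may differ.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not part of the benchmarks module. */
final class DeltaNoiseCheck {

    enum Verdict { FASTER, NOISE, SLOWER }

    /** Relative change of afterMs against beforeMs, in percent. */
    static double deltaPercent(double beforeMs, double afterMs) {
        if (beforeMs <= 0.0) {
            throw new IllegalArgumentException("beforeMs must be positive");
        }
        return (afterMs - beforeMs) / beforeMs * 100.0;
    }

    /** Deltas whose magnitude is within noiseBandPercent are treated as noise. */
    static Verdict classify(double beforeMs, double afterMs, double noiseBandPercent) {
        double delta = deltaPercent(beforeMs, afterMs);
        if (Math.abs(delta) <= noiseBandPercent) {
            return Verdict.NOISE;
        }
        // Latency went up -> slower; latency went down -> faster.
        return delta > 0 ? Verdict.SLOWER : Verdict.FASTER;
    }

    public static void main(String[] args) {
        double beforeMs = 40.0; // made-up average latency before the change
        double afterMs = 41.2;  // made-up average latency after the change
        System.out.printf(Locale.ROOT, "%+.1f%% -> %s%n",
                deltaPercent(beforeMs, afterMs),
                classify(beforeMs, afterMs, 5.0));
    }
}
```

On a quiescent machine with a fixed CPU frequency, the same check could be run with a narrower band passed as `noiseBandPercent`.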

## When **not** to use the harness

- For a **published "X% faster than Y" claim** of any kind — the
numbers are not statistically rigorous and the comparison setup is
not reproducible across machines / JDKs.
- For **deciding between two architecturally different approaches** —
pick the right invariant (allocation count, big-O of the algorithm,
layout-pass count) and reason about it; the harness is a sanity
check after you've already chosen, not a decision tool before.
- For **comparing GraphCompose to another PDF library** —
`ComparativeBenchmark` does render the same fixture through iText /
openHTMLToPDF / JasperReports for rough sizing, but the comparison
is a manual smoke test: each library has different defaults
(compression, font embedding, image resampling) and reading too much
into a single number is the wrong call.

## Files in this module

| File | Role |
|---|---|
| `CurrentSpeedBenchmark` | Default scenario runner — what CI's `perf-smoke` job exercises. Takes a `-Dgraphcompose.benchmark.profile=smoke\|full\|stress` switch. |
| `ComparativeBenchmark` | Renders the same fixtures through GraphCompose, iText, openHTMLToPDF, JasperReports. **Rough local comparison only** — see "When not to use" above. |
| `FullCvBenchmark`, `ScalabilityBenchmark` | Fixture-specific runners for CV and table-heavy scenarios. |
| `CanonicalBenchmarkSupport`, `BenchmarkSupport` | Shared fixture builders + measurement helpers. |
| `BenchmarkReportWriter` | Writes JSON / CSV / text reports under `benchmarks/target/benchmarks/`. |
| `BenchmarkDiffTool` | Compares two JSON reports and prints a delta table. Useful for pre/post comparisons. |
| `BenchmarkMedianTool` | Median + dispersion across N runs of the same scenario. |
| `GraphComposeStressTest`, `EnduranceTest` | Long-running stress / endurance harnesses. |
| `GraphComposeBenchmark` | Legacy entry point preserved for one downstream caller. New work should target `CurrentSpeedBenchmark`. |
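
To give a feel for what "median + dispersion across N runs" means in practice, here is a minimal sketch. It assumes the median absolute deviation (MAD) as the dispersion measure; the README does not say which measure `BenchmarkMedianTool` uses, and `MedianSketch` is a hypothetical name chosen for this example.

```java
import java.util.Arrays;

/** Hypothetical illustration; BenchmarkMedianTool may compute dispersion differently. */
final class MedianSketch {

    /** Median of the values; averages the two middle elements for even counts. */
    static double median(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("need at least one value");
        }
        double[] sorted = values.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /** Median absolute deviation: the median distance from the median. */
    static double medianAbsoluteDeviation(double[] values) {
        double center = median(values);
        double[] deviations = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            deviations[i] = Math.abs(values[i] - center);
        }
        return median(deviations);
    }

    public static void main(String[] args) {
        // Made-up avgMs values from five runs; one run hit a GC pause.
        double[] runsMs = {10.0, 12.0, 11.0, 50.0, 13.0};
        System.out.println("median = " + median(runsMs));
        System.out.println("MAD    = " + medianAbsoluteDeviation(runsMs));
    }
}
```

The median and MAD are both robust to a single outlier run, which is why they suit a harness without warmup or fork control better than a mean and standard deviation would.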

## Running

From the repo root:

```bash
# Smoke profile (~15s) — what CI runs on every PR
./mvnw -B -ntp -f benchmarks/pom.xml -DskipTests \
exec:java \
-Dexec.mainClass=com.demcha.compose.CurrentSpeedBenchmark \
-Dgraphcompose.benchmark.profile=smoke

# Diff two existing report runs under the same scenario
./mvnw -B -ntp -f benchmarks/pom.xml -DskipTests \
exec:java \
-Dexec.mainClass=com.demcha.compose.BenchmarkDiffTool \
-Dexec.args="current-speed"
```

Reports land in `benchmarks/target/benchmarks/<scenario>/`. The CI
`perf-smoke` job uploads the smoke directory as an artifact for every
PR run, so a regression can be diffed against the previous PR's run
without reproducing locally.

## How to read a report

The JSON shape is intentionally simple — a top-level run record with
per-scenario sub-records. Each sub-record carries:

- `avgMs`, `p50Ms`, `p95Ms`, `maxMs` — latency distribution across
iterations within the run.
- `docsPerSec` — rough throughput; **not statistically rigorous**,
intended only as a relative number against a sibling scenario or a
previous run on the same machine.
- `avgKB` — average output byte size. Stable across runs on the same
fixture; useful for catching content corruption (size shifts of more
than a few hundred bytes are usually a bug, not a benchmark fluctuation).
- `peakMB` — peak heap as observed by `MemoryMXBean`; coarse, do not
use for memory-budget enforcement.
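
As a rough guide to how the latency fields relate to the raw per-iteration timings, the sketch below derives them from an array of samples. It rests on assumptions not stated in this README: nearest-rank percentiles, and `docsPerSec` computed as 1000 divided by the average latency of sequential renders. `LatencySummarySketch` is a hypothetical name, and the harness may compute these values differently.

```java
import java.util.Arrays;

/** Hypothetical illustration of the report fields; not the harness's own code. */
final class LatencySummarySketch {

    /** Nearest-rank percentile; percent must be in 1..100. */
    static double percentile(double[] samplesMs, int percent) {
        if (samplesMs.length == 0 || percent < 1 || percent > 100) {
            throw new IllegalArgumentException("need samples and 1 <= percent <= 100");
        }
        double[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        // Integer ceiling of percent * n / 100 avoids floating-point rounding.
        int rank = (percent * sorted.length + 99) / 100;
        return sorted[rank - 1];
    }

    /** Arithmetic mean of the samples (the avgMs analogue). */
    static double average(double[] samplesMs) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("need at least one sample");
        }
        double sum = 0.0;
        for (double sample : samplesMs) {
            sum += sample;
        }
        return sum / samplesMs.length;
    }

    /** Assumed throughput definition: documents per second for sequential renders. */
    static double docsPerSec(double[] samplesMs) {
        return 1000.0 / average(samplesMs);
    }

    public static void main(String[] args) {
        double[] samplesMs = {10.0, 20.0, 30.0, 40.0}; // made-up iteration timings
        System.out.println("avgMs      = " + average(samplesMs));
        System.out.println("p50Ms      = " + percentile(samplesMs, 50));
        System.out.println("p95Ms      = " + percentile(samplesMs, 95));
        System.out.println("maxMs      = " + percentile(samplesMs, 100));
        System.out.println("docsPerSec = " + docsPerSec(samplesMs));
    }
}
```

With only a handful of iterations, `p95Ms` and `maxMs` collapse to the same sample, which is one more reason to read the tail figures from a smoke run as indicative only.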

## Roadmap

The 1.7.0 plan (Track C, B3 → B4 → B5 → B6) introduces a sibling JMH
layer:

- **B3** — pull fixtures into a `fixtures/` package with deterministic
seeds so the JMH layer can reuse them.
- **B4** — JMH infrastructure (`jmh-core`, `jmh-generator-annprocess`,
shade plugin) + first benchmark (`SimpleDocumentJmhBenchmark`).
- **B5** — Invoice / CV / LargeTable / PdfRender JMH benchmarks.
- **B6** — CI job that runs the JMH layer on a `workflow_dispatch` /
weekly cadence and uploads `*.json` reports as artifacts.

Once that chain is in place, any *public* performance claim should
quote the JMH layer's numbers, with explicit warmup / measurement /
fork configuration in the source. This manual harness will stay for
the smoke / diff / endurance roles described above.

---

*This page is the source of truth for what the manual benchmark layer
is and is not. When in doubt — and especially before quoting a number
in a public communication — re-read the "When not to use" section.*