perf(engine): paragraph operator dedup + table cell sanitize-once (F5+F6) #144

Merged: DemchaAV merged 3 commits into develop from perf/render-span-and-table-cell on Jun 8, 2026

DemchaAV (Owner) commented Jun 8, 2026

Two render-path efficiency optimizations. F6 produces byte-identical output; F5 produces visually identical output (fewer content-stream operators, same rendering) and is guarded by the visual-regression suite plus a content-stream operator-count test.

F5 — paragraph render writes font/colour operators only on change

The paragraph render handler emitted a setFont (Tf) and a setNonStrokingColor (rg) operator for every text span, even across the spans of a single-style paragraph: a long body wraps into many spans, producing as many redundant operator pairs. It now tracks the last-written (font, size) and colour across the paragraph's q…Q block and re-emits an operator only when the value actually changes. The tracked state is invalidated after inline images/shapes, which run their own graphics-state ops.

  • Rendered output is unchanged — the skipped operators were redundant no-ops. Guarded by the visual-regression suite plus ParagraphTextStateDedupTest, which renders a one-page single-style paragraph and asserts exactly one Tf across multiple drawn spans (Tj/TJ).
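The emit-on-change idea can be illustrated with a minimal, self-contained sketch. It is not the engine's handler: `TextStateDedupSketch`, `Span`, `drawSpan`, `invalidate` and `count` are hypothetical names, operators are collected as plain strings rather than written through a PDF library, and string escaping for `Tj` is ignored.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of emit-on-change text state; not the engine's real handler. */
final class TextStateDedupSketch {

    /** Illustrative span model: text plus the style it should be drawn with. */
    record Span(String text, String font, float size, String rgb) {}

    private final List<String> ops = new ArrayList<>();
    private String lastFont;   // null means "no font written yet in this q...Q block"
    private float lastSize;
    private String lastRgb;    // null means "no fill colour written yet"

    /** Forget the cached state, e.g. after an inline image or shape ran its own state ops. */
    void invalidate() {
        lastFont = null;
        lastRgb = null;
    }

    /** Writes Tf and rg only when they differ from the last values written, then the text. */
    void drawSpan(Span s) {
        if (!s.font().equals(lastFont) || s.size() != lastSize) {
            ops.add("/" + s.font() + " " + s.size() + " Tf");
            lastFont = s.font();
            lastSize = s.size();
        }
        if (!s.rgb().equals(lastRgb)) {
            ops.add(s.rgb() + " rg");
            lastRgb = s.rgb();
        }
        ops.add("(" + s.text() + ") Tj"); // assumes text needs no escaping
    }

    /** Read-only view of the operators emitted so far. */
    List<String> ops() {
        return Collections.unmodifiableList(ops);
    }

    /** Counts emitted operators of one kind, similar in spirit to an operator-count test. */
    static long count(List<String> ops, String operator) {
        return ops.stream().filter(op -> op.endsWith(" " + operator)).count();
    }
}
```

Under these assumptions, drawing several spans with the same font, size and colour yields one `Tf` and one `rg` followed by one `Tj` per span, while calling `invalidate()` forces the next span to write both operators again.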

F6 — table cell text sanitized once per cell instead of three times

Resolving a table ran each cell's lines through sanitizeCellLines separately in the natural-width, natural-height and resolve passes — rebuilding the list and its per-line control-character cleanup up to three times per cell. The sanitized lines are now computed once when the logical grid is built (LogicalCell.sanitizedLines) and reused by all three passes.

  • Byte-identical (sanitization is deterministic) — covered by the existing table snapshot/visual tests. Removes the dominant per-cell layout allocation on large tables.
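A rough sketch of the compute-once shape, under stated assumptions: only `LogicalCell.sanitizedLines` and `sanitizeCellLines` are names from this PR. The cleanup rule (stripping `\p{Cntrl}` characters), the character-count width, and the three pass methods are stand-ins invented for illustration, and the call counter exists only to make the effect observable.

```java
import java.util.List;

/** Hypothetical sketch of sanitize-once; the real engine's types and rules may differ. */
final class SanitizeOnceSketch {

    /** Counts sanitize calls so the single invocation per cell is visible. */
    static int sanitizeCalls = 0;

    /** Assumed cleanup rule: drop ASCII control characters from every line. */
    static List<String> sanitizeCellLines(List<String> rawLines) {
        sanitizeCalls++;
        return rawLines.stream()
                .map(line -> line.replaceAll("\\p{Cntrl}", ""))
                .toList();
    }

    /** The cell carries its sanitized lines, computed once when the grid is built. */
    record LogicalCell(List<String> rawLines, List<String> sanitizedLines) {
        static LogicalCell of(List<String> rawLines) {
            return new LogicalCell(List.copyOf(rawLines), sanitizeCellLines(rawLines));
        }
    }

    // The three passes below read the cached lines and never re-sanitize.

    /** Stand-in for the natural-width pass: longest line, measured in characters. */
    static int naturalWidth(LogicalCell cell) {
        return cell.sanitizedLines().stream().mapToInt(String::length).max().orElse(0);
    }

    /** Stand-in for the natural-height pass: one unit per line. */
    static int naturalHeight(LogicalCell cell) {
        return cell.sanitizedLines().size();
    }

    /** Stand-in for the resolve pass: the final text that would be laid out. */
    static String resolve(LogicalCell cell) {
        return String.join("\n", cell.sanitizedLines());
    }
}
```

Because the sanitized list is immutable and the cleanup is deterministic, each pass sees exactly what it would have computed on its own, which is why the output stays byte-identical.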

Verification

  • ./mvnw verify -pl . — BUILD SUCCESS, 1146 tests, 0 failures (checkstyle + SpotBugs + javadoc).

Deferred (follow-up)

  • F6 render side: carry the resolved per-line cell width forward so PdfTableRowFragmentRenderHandler stops re-measuring at render — needs a TableResolvedCell model change, tracked separately (the render-side sanitizeLines also trims, so it is not a pure duplicate of the layout sanitize).

DemchaAV added 2 commits June 9, 2026 00:18
The paragraph render handler wrote a setFont (Tf) and setNonStrokingColor (rg)
operator for every text span, even across the spans of a single-style paragraph.
Track the last-written (font, size) and colour across the paragraph's q...Q block
and re-emit only on a real change, invalidating after inline images/shapes; a
multi-span single-style paragraph now carries one Tf + one rg instead of one pair
per span.

Rendered output is unchanged (the skipped operators were redundant). Guarded by
the visual-regression suite plus ParagraphTextStateDedupTest, which asserts a
single-style paragraph emits one Tf across many drawn spans and that a multi-style
paragraph re-emits on each style change.

Finding 5.

Resolving a table ran each cell's lines through sanitizeCellLines separately in
the natural-width, natural-height and resolve passes, rebuilding the list and its
per-line control-character cleanup up to three times per cell. Compute the
sanitized lines once when the logical grid is built (LogicalCell.sanitizedLines)
and reuse them across all three passes.

Output is byte-identical (sanitization is deterministic); on a large table this
removes the dominant per-cell layout allocation. Covered by the existing table
snapshot/visual tests.

Finding 6.
DemchaAV force-pushed the perf/render-span-and-table-cell branch from 156d97a to 6f2f7c9 on June 8, 2026 23:18
DemchaAV merged commit 031db67 into develop on Jun 8, 2026
11 checks passed
DemchaAV deleted the perf/render-span-and-table-cell branch on June 8, 2026 23:32