perf(engine): paragraph operator dedup + table cell sanitize-once (F5+F6) #144
Merged
The paragraph render handler wrote a setFont (Tf) and setNonStrokingColor (rg) operator for every text span, even across the spans of a single-style paragraph. Track the last-written (font, size) and colour across the paragraph's q...Q block and re-emit only on a real change, invalidating after inline images/shapes; a multi-span single-style paragraph now carries one Tf + one rg instead of one pair per span. Rendered output is unchanged (the skipped operators were redundant). Guarded by the visual-regression suite plus ParagraphTextStateDedupTest, which asserts a single-style paragraph emits one Tf across many drawn spans and that a multi-style paragraph re-emits on each style change. Finding 5.
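The change-tracking idea described above can be sketched roughly as follows. This is a minimal illustration, not the handler's actual code: the class `TextStateDedupSketch`, its methods, and the string-based operator list are all hypothetical stand-ins for the real content-stream writer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: emit Tf / rg only when the tracked text state changes. */
final class TextStateDedupSketch {
    private final List<String> ops = new ArrayList<>();
    private String lastFont;  // null means "unknown, must re-emit"
    private float lastSize;
    private String lastRgb;   // null means "unknown, must re-emit"

    /** Forget the cached state so the next span re-emits both operators. */
    void invalidate() {
        lastFont = null;
        lastRgb = null;
    }

    /** Draws one span, writing Tf and rg only if they differ from the last ones written. */
    void drawSpan(String font, float size, String rgb, String text) {
        if (!font.equals(lastFont) || size != lastSize) {
            ops.add("/" + font + " " + size + " Tf");
            lastFont = font;
            lastSize = size;
        }
        if (!rgb.equals(lastRgb)) {
            ops.add(rgb + " rg");
            lastRgb = rgb;
        }
        ops.add("(" + text + ") Tj");
    }

    /** Inline content runs its own graphics-state ops, so the cache is dropped afterwards. */
    void drawInlineImage(String name) {
        ops.add("/" + name + " Do");
        invalidate();
    }

    List<String> ops() {
        return ops;
    }

    /** Counts emitted entries whose trailing operator matches, e.g. "Tf". */
    static long count(List<String> ops, String operator) {
        return ops.stream().filter(op -> op.endsWith(" " + operator)).count();
    }
}
```

Under these assumptions, three spans drawn with the same font, size and colour yield one Tf, one rg and three Tj entries; a span drawn after `drawInlineImage` re-emits both operators even if the style did not change.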
Resolving a table ran each cell's lines through sanitizeCellLines separately in the natural-width, natural-height and resolve passes, rebuilding the list and its per-line control-character cleanup up to three times per cell. Compute the sanitized lines once when the logical grid is built (LogicalCell.sanitizedLines) and reuse them across all three passes. Output is byte-identical (sanitization is deterministic); on a large table this removes the dominant per-cell layout allocation. Covered by the existing table snapshot/visual tests. Finding 6.
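As a rough illustration of the compute-once approach, the sketch below sanitizes a cell's lines in the constructor and lets three stand-in passes read the cached list. Only the names `sanitizeCellLines` and `LogicalCell.sanitizedLines` come from the description above; the control-character rule, the pass bodies and the call counter are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of sanitizing cell text once and reusing it in every layout pass. */
final class SanitizeOnceSketch {
    /** Illustration only: counts how often sanitization runs. */
    static int sanitizeCalls = 0;

    /** Stand-in for sanitizeCellLines; the real cleanup rules are not shown in the PR. */
    static List<String> sanitizeCellLines(List<String> rawLines) {
        sanitizeCalls++;
        List<String> out = new ArrayList<>(rawLines.size());
        for (String line : rawLines) {
            out.add(line.replaceAll("\\p{Cntrl}", "")); // drop ASCII control characters
        }
        return Collections.unmodifiableList(out);
    }

    /** Sanitized lines are computed when the cell is built and never recomputed. */
    static final class LogicalCell {
        final List<String> sanitizedLines;

        LogicalCell(List<String> rawLines) {
            this.sanitizedLines = sanitizeCellLines(rawLines);
        }
    }

    /** Assumed natural-width pass: the longest line, measured in characters. */
    static int naturalWidth(LogicalCell cell) {
        int max = 0;
        for (String line : cell.sanitizedLines) {
            max = Math.max(max, line.length());
        }
        return max;
    }

    /** Assumed natural-height pass: one unit per line. */
    static int naturalHeight(LogicalCell cell) {
        return cell.sanitizedLines.size();
    }

    /** Assumed resolve pass: the final text of the cell. */
    static String resolve(LogicalCell cell) {
        return String.join("\n", cell.sanitizedLines);
    }
}
```

Because the list is immutable and the cleanup is deterministic, each pass sees exactly what it would have computed itself, which is why the output can stay byte-identical.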
156d97a to 6f2f7c9
…nd-table-cell # Conflicts: # CHANGELOG.md
Two render-path efficiency optimizations. F6 is byte-identical; F5 is visually identical (emits fewer content-stream operators, same rendering) — guarded by the visual-regression suite plus a content-stream operator-count test.
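An operator-count check of the kind mentioned above could look roughly like this. It is a simplified sketch: `OperatorCountSketch` is a hypothetical helper, and it assumes the content stream is already decoded to text and that no string literal contains a standalone operator token. A real test would more likely use a proper content-stream parser.

```java
/** Hypothetical helper: counts operators in a decoded content stream by naive tokenizing. */
final class OperatorCountSketch {
    /**
     * Splits on whitespace and counts tokens equal to the operator.
     * Assumption: text inside (...) literals never contains the operator as its own token.
     */
    static int countOperator(String contentStream, String operator) {
        int count = 0;
        for (String token : contentStream.trim().split("\\s+")) {
            if (token.equals(operator)) {
                count++;
            }
        }
        return count;
    }
}
```

For a stream such as `q BT /F1 11 Tf 0 0 0 rg (Hello) Tj (world) Tj ET Q`, this counts one Tf, one rg and two Tj, which is the shape a single-style paragraph should have after the change.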
F5 — paragraph render writes font/colour operators only on change
The paragraph render handler emitted a setFont (Tf) and setNonStrokingColor (rg) operator for every text span, even across the spans of a single-style paragraph (a long body wraps into many spans, producing as many redundant operator pairs). It now tracks the last-written (font, size) and colour across the paragraph's q…Q block and re-emits only on a real change, invalidating after inline images/shapes (which run their own graphics-state ops).
Guarded by ParagraphTextStateDedupTest, which renders a one-page single-style paragraph and asserts exactly one Tf across multiple drawn spans (Tj/TJ).
F6 — table cell text sanitized once per cell instead of three times
Resolving a table ran each cell's lines through sanitizeCellLines separately in the natural-width, natural-height and resolve passes, rebuilding the list and its per-line control-character cleanup up to three times per cell. The sanitized lines are now computed once when the logical grid is built (LogicalCell.sanitizedLines) and reused by all three passes.
Verification
./mvnw verify -pl . gives BUILD SUCCESS, 1146 tests, 0 failures (checkstyle + SpotBugs + javadoc).
Deferred (follow-up)
PdfTableRowFragmentRenderHandler stops re-measuring at render: needs a TableResolvedCell model change, tracked separately (the render-side sanitizeLines also trims, so it is not a pure duplicate of the layout sanitize).