A PDF reader and renderer written in Rust. safe-pdf provides a robust, memory-safe, and extensible foundation for working with PDF files, ideal for both end-users and developers.
Status: this project is under active, heavy development (pre-alpha; APIs may change without notice). Contributions, feedback, and issue reports are very welcome.
safe-pdf is organized as a modular Rust monorepo, with each core PDF concept implemented as a separate crate.
- pdf-tokenizer: Lexical analysis of PDF byte streams into tokens.
- pdf-parser: Syntactic parsing of tokens into PDF objects and structures.
- pdf-object: In-memory representation of all PDF object types (dictionaries, arrays, streams, etc.).
- pdf-document: High-level API for loading, validating, and traversing PDF documents.
- pdf-page: Page tree, page objects, and resource management.
- pdf-content-stream: Parsing and dispatching PDF drawing/text operators.
- pdf-canvas: Abstracts 2D drawing operations, delegating to a backend.
- pdf-graphics: Common graphics types (color, transform, etc.) shared by the backends.
- pdf-graphics-skia, pdf-graphics-femtovg: Rendering backends for the Skia and FemtoVG graphics engines.
- pdf-renderer: Handles rendering of PDF pages using a chosen backend.
- pdf-font: Font parsing, encoding, glyph access (Type1/TrueType/Type3).
- Input: PDF file bytes
- Tokenization: pdf-tokenizer → tokens
- Parsing: pdf-parser → PDF objects
- Object Model: pdf-object → in-memory structure
- Document API: pdf-document → high-level access
- Page Handling: pdf-page → page tree, resources
- Content Stream: pdf-content-stream → operator dispatch
- Canvas Abstraction: pdf-canvas → drawing commands
- Rendering: pdf-graphics-* → pixels on screen or image
flowchart TD
A[PDF File] --> B[pdf-tokenizer]
B --> C[pdf-parser]
C --> D[pdf-object]
D --> E[pdf-document]
E --> F[pdf-page]
F --> G[pdf-content-stream]
G --> H[pdf-canvas]
H --> I[pdf-graphics-skia/femtovg]
I --> J[Display/Output]
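To make the first stage of the pipeline concrete, here is a minimal sketch of what turning PDF bytes into tokens can look like. The `Token` enum and `tokenize` function are hypothetical names invented for this illustration and are not the pdf-tokenizer API; the sketch only covers numbers, names, array brackets, and bare keywords, and rejects everything else.

```rust
/// Hypothetical token type for illustration; pdf-tokenizer's real types may differ.
#[derive(Debug, PartialEq)]
pub enum Token {
    Number(f64),
    Name(String),
    Keyword(String),
    ArrayStart,
    ArrayEnd,
}

/// PDF whitespace bytes: NUL, TAB, LF, FF, CR, SPACE.
fn is_whitespace(b: u8) -> bool {
    matches!(b, 0x00 | 0x09 | 0x0A | 0x0C | 0x0D | 0x20)
}

/// PDF delimiter bytes.
fn is_delimiter(b: u8) -> bool {
    matches!(b, b'(' | b')' | b'<' | b'>' | b'[' | b']' | b'{' | b'}' | b'/' | b'%')
}

/// Returns the index one past the run of regular bytes starting at `start`.
fn end_of_regular_run(input: &[u8], start: usize) -> usize {
    input[start..]
        .iter()
        .position(|&b| is_whitespace(b) || is_delimiter(b))
        .map_or(input.len(), |offset| start + offset)
}

/// Splits a byte slice into tokens (illustrative subset of PDF syntax only).
pub fn tokenize(input: &[u8]) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < input.len() {
        let b = input[i];
        if is_whitespace(b) {
            i += 1;
        } else if b == b'[' {
            tokens.push(Token::ArrayStart);
            i += 1;
        } else if b == b']' {
            tokens.push(Token::ArrayEnd);
            i += 1;
        } else if b == b'/' {
            // A name is the run of regular bytes following the slash.
            let end = end_of_regular_run(input, i + 1);
            tokens.push(Token::Name(String::from_utf8_lossy(&input[i + 1..end]).into_owned()));
            i = end;
        } else if is_delimiter(b) {
            // Strings, dictionaries, and comments are out of scope for this sketch.
            return Err(format!("unsupported delimiter {:?} at byte {i}", char::from(b)));
        } else {
            let end = end_of_regular_run(input, i);
            let text = String::from_utf8_lossy(&input[i..end]).into_owned();
            let numeric = text
                .bytes()
                .all(|c| c.is_ascii_digit() || matches!(c, b'+' | b'-' | b'.'));
            if numeric {
                let value = text
                    .parse::<f64>()
                    .map_err(|e| format!("bad number {text:?}: {e}"))?;
                tokens.push(Token::Number(value));
            } else {
                tokens.push(Token::Keyword(text));
            }
            i = end;
        }
    }
    Ok(tokens)
}
```

Calling `tokenize(b"[0 0 612 792] /Type obj")` yields an array start, four numbers, an array end, the name `Type`, and the keyword `obj`. A later stage (pdf-parser in this project) would assemble such tokens into objects.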
Project directory layout:
safe-pdf/
├── AGENTS.md
├── Cargo.toml
├── README.md
├── crates/
│ ├── pdf-canvas/ # 2D drawing abstraction and stateful canvas API
│ ├── pdf-content-stream/ # PDF content stream operator parsing and dispatch
│ ├── pdf-document/ # High-level PDF document API
│ ├── pdf-font/ # Font parsing and management
│ ├── pdf-graphics/ # Common graphics types (color, transform, etc.)
│ ├── pdf-graphics-femtovg/ # FemtoVG rendering backend
│ ├── pdf-graphics-skia/ # Skia rendering backend
│ ├── pdf-object/ # PDF object model (dictionaries, arrays, etc.)
│ ├── pdf-page/ # Page tree, page objects, resources
│ ├── pdf-parser/ # PDF syntax parser
│ ├── pdf-postscript/ # (Optional) PostScript support
│ ├── pdf-renderer/ # High-level rendering orchestration
│ └── pdf-tokenizer/ # Tokenizer for PDF byte streams
├── examples/ # Example applications (Skia, FemtoVG)
│ ├── skia.rs
│ ├── femtovg.rs
│ └── Cargo.toml
└── target/ # Build output
- Layered, Decomposed Architecture: Clear separation between tokenization, parsing, object modeling, document semantics, page/resource resolution, operator dispatch, and rendering keeps concerns orthogonal and testable.
- Monorepo of Focused Crates: Each conceptual layer lives in its own crate (e.g. pdf-tokenizer, pdf-parser, pdf-object), enabling incremental compilation, targeted benchmarks, and reuse in non‑rendering contexts (indexers, validators, analyzers).
- Pluggable Operator Handling: Content stream operators are dispatched via traits, so you can substitute a renderer with: (a) a metrics collector, (b) a static analyzer, or (c) a custom export (SVG, canvas, etc.) without forking core logic.
- Backend Agnosticism via CanvasBackend: Rendering pipelines interact only with an abstract canvas; Skia / FemtoVG backends demonstrate how GPU / vector engines can be integrated with minimal glue.
- Error Handling Discipline: Rich domain errors (using thiserror inside crates) + Result everywhere; workspace Clippy configuration forbids unwrap/expect, reducing accidental panics.
- Safety First (unsafe_code forbidden): The workspace lints disallow unsafe by default. Any future exception must be narrowly scoped and justified in docs.
- Extensible Font System: pdf-font isolates font decoding (Type1 / TrueType / Type3 WIP) so shaping / caching strategies can evolve independently of rendering.
- Predictable Rendering Pipeline: PdfRenderer orchestrates: page resource resolution → content stream execution → backend drawing; easy insertion points for caching or preflight stages.
- Testing Strategy: Unit tests live close to logic in each crate; cross‑crate integration & rendering behavioral tests will accumulate in a higher-level test harness (planned) to diff raster/command outputs.
- Instrumentation Friendly: Because operator visitation is trait-based, adding logging / tracing / telemetry layers does not require modifying PDF interpretation logic.
- Minimal Global State: State (graphics, text, resources) is threaded explicitly through contexts to simplify future concurrency and parallel page rendering.
- Clarity Over Cleverness: Prefer small, explicit functions and well-named types over macro indirection; easier for contributors new to PDF internals.
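The trait-based operator dispatch and the Result-everywhere error style described above can be sketched as follows. Every name here (`Operator`, `OperatorHandler`, `HandlerError`, `dispatch`, `OpCounter`) is hypothetical and chosen for illustration; the real traits in pdf-content-stream may look quite different. The error type implements `Display` by hand to keep the sketch dependency-free, where the crates themselves are described as using thiserror.

```rust
use std::collections::BTreeMap;
use std::fmt;

/// Hypothetical, heavily reduced operator set.
#[derive(Debug, Clone, PartialEq)]
pub enum Operator {
    MoveTo { x: f32, y: f32 },
    LineTo { x: f32, y: f32 },
    Stroke,
    ShowText(String),
}

/// Hypothetical domain error; hand-written instead of derived with thiserror.
#[derive(Debug, PartialEq)]
pub enum HandlerError {
    NoCurrentPoint,
}

impl fmt::Display for HandlerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            HandlerError::NoCurrentPoint => write!(f, "path operator used before a MoveTo"),
        }
    }
}

impl std::error::Error for HandlerError {}

/// Anything that wants to observe operators implements this trait.
pub trait OperatorHandler {
    fn handle(&mut self, op: &Operator) -> Result<(), HandlerError>;
}

/// Feeds each operator to the handler and stops at the first error.
pub fn dispatch<H: OperatorHandler>(ops: &[Operator], handler: &mut H) -> Result<(), HandlerError> {
    for op in ops {
        handler.handle(op)?;
    }
    Ok(())
}

/// A metrics collector used in place of a renderer: it counts operators
/// by their PDF mnemonic and draws nothing.
#[derive(Default)]
pub struct OpCounter {
    pub counts: BTreeMap<&'static str, usize>,
    has_point: bool,
}

impl OperatorHandler for OpCounter {
    fn handle(&mut self, op: &Operator) -> Result<(), HandlerError> {
        let name = match op {
            Operator::MoveTo { .. } => {
                self.has_point = true;
                "m"
            }
            Operator::LineTo { .. } => {
                if !self.has_point {
                    return Err(HandlerError::NoCurrentPoint);
                }
                "l"
            }
            Operator::Stroke => {
                self.has_point = false;
                "S"
            }
            Operator::ShowText(_) => "Tj",
        };
        *self.counts.entry(name).or_insert(0) += 1;
        Ok(())
    }
}
```

Because `dispatch` only depends on the trait, a renderer, a static analyzer, or a logging wrapper can be passed in without touching the dispatch loop, and failures surface as `Err` values rather than panics.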
Clone the repository and run tests:
git clone https://github.com/Velli20/safe-pdf.git
cd safe-pdf
cargo test
The examples/ workspace member contains runnable showcase applications. Currently two rendering backends are available behind feature flags: Skia (OpenGL) and FemtoVG (wgpu).
Sample PDFs used for experimentation live in examples/assets:
examples/assets/
Gradients.pdf
PlaygroundMDN.pdf
RadialGradientFills.pdf
W3Schools.pdf
is.pdf
test6.pdf
webgl.pdf
The Skia example launches an OpenGL + Skia window. Pass the path to a PDF (relative or absolute) as the final argument. Use the Up / Down arrow keys to change pages.
cargo run --example skia --features skia -- examples/assets/webgl.pdf
FemtoVG + wgpu prototype (may be less feature complete):
cargo run --example femtovg --features femtovg
The FemtoVG example embeds a small PDF internally (see femtovg.rs). You can adapt it to load external files similarly to skia.rs.
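One way to load an external file in an example is to take the final command-line argument as the path and read it with the standard library. The helper names `pdf_path_from_args` and `load_pdf_bytes` below are hypothetical and are not taken from skia.rs or femtovg.rs; this is a sketch of the general approach, not the code those examples use.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Returns the final argument after the program name, if any.
/// (Hypothetical helper; not part of the safe-pdf examples.)
pub fn pdf_path_from_args<I>(args: I) -> Option<PathBuf>
where
    I: IntoIterator<Item = String>,
{
    // The first item is the program name, so skip it.
    args.into_iter().skip(1).last().map(PathBuf::from)
}

/// Reads the PDF named by the final argument into memory.
pub fn load_pdf_bytes<I>(args: I) -> io::Result<Vec<u8>>
where
    I: IntoIterator<Item = String>,
{
    let path = pdf_path_from_args(args).ok_or_else(|| {
        io::Error::new(
            io::ErrorKind::InvalidInput,
            "expected a PDF path as the final argument",
        )
    })?;
    fs::read(path)
}
```

In an example's `main`, this would be called as `load_pdf_bytes(std::env::args())`, and the resulting bytes would be handed to whatever loading entry point the example already uses for its embedded PDF.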
To add a backend, create a new crate modeled on pdf-graphics-skia and implement the CanvasBackend trait (see pdf-canvas) in it. Then expose it behind a feature flag in examples/Cargo.toml so it is opt‑in at build time (Cargo features are selected at compile time, not at runtime).
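The shape of such a backend can be sketched with a simplified stand-in trait. `CanvasBackendSketch` and its single `fill_rect` method are assumptions made for this illustration; the actual `CanvasBackend` trait in pdf-canvas defines its own methods and types, so consult that crate before implementing a real backend. The toy `SvgBackend` emits SVG markup instead of drawing pixels.

```rust
/// Simplified stand-in for the `CanvasBackend` trait in pdf-canvas.
/// The method set is an assumption for illustration only.
pub trait CanvasBackendSketch {
    fn fill_rect(&mut self, x: f32, y: f32, w: f32, h: f32, rgba: [u8; 4]);
}

/// A toy backend that collects SVG elements instead of rasterizing.
#[derive(Default)]
pub struct SvgBackend {
    elements: Vec<String>,
}

impl SvgBackend {
    /// Wraps the collected elements in an `<svg>` root of the given size.
    pub fn finish(&self, width: f32, height: f32) -> String {
        format!(
            "<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"{width}\" height=\"{height}\">{}</svg>",
            self.elements.join("")
        )
    }
}

impl CanvasBackendSketch for SvgBackend {
    fn fill_rect(&mut self, x: f32, y: f32, w: f32, h: f32, rgba: [u8; 4]) {
        let [r, g, b, a] = rgba;
        // SVG expresses alpha as a 0..1 opacity rather than a 0..255 byte.
        let opacity = f32::from(a) / 255.0;
        self.elements.push(format!(
            "<rect x=\"{x}\" y=\"{y}\" width=\"{w}\" height=\"{h}\" fill=\"rgb({r},{g},{b})\" fill-opacity=\"{opacity}\"/>"
        ));
    }
}

/// Drawing code written against the trait works with any backend.
pub fn draw_background<B: CanvasBackendSketch>(backend: &mut B, w: f32, h: f32) {
    backend.fill_rect(0.0, 0.0, w, h, [255, 255, 255, 255]);
}
```

The design point is that `draw_background` never names a concrete engine: swapping `SvgBackend` for a Skia- or FemtoVG-backed type changes only the value passed in.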
Contributions are welcome! Please see CONTRIBUTING.md for guidelines, or open an issue to discuss your ideas.
This project is licensed under the MIT License. See LICENSE for details.
SPDX-License-Identifier: MIT