ChrisTorresLugo/algorithmica-hpc-pdf

Algorithmica HPC → Print PDF

Reproducibly turn the open book Algorithms for Modern Hardware by Sergey Slotin into a print-ready US-Letter PDF, suitable for a print shop.

It fetches the book's markdown source from the algorithmica GitHub repo at a pinned commit and renders it through Pandoc + xelatex.

What you get

A single PDF (algorithms-for-modern-hardware.pdf) with a table of contents, numbered chapters/sections, running headers, page numbers, native LaTeX math, and syntax-highlighted code with long lines wrapped to the page.

License & attribution (read before printing)

The source repository has no explicit open-source license; the site footer states "Copyright 2021–2022 Sergey Slotin." The generated PDF includes an attribution page. A personal print copy is generally fine, but redistribution or commercial printing rights are unclear — confirm with the author first.

Option A — Docker (recommended, no local LaTeX)

docker build -t algorithmica-pdf .
mkdir -p out
docker run --rm -v "$PWD/out:/book/out" algorithmica-pdf --output out/algorithms-for-modern-hardware.pdf
# PDF appears in ./out/

Option B — Local toolchain

Prerequisites:

  • Python 3.9+

  • Pandoc (3.x)

  • A TeX distribution (TeX Live / MacTeX) with xelatex and the packages framed, fvextra, fancyhdr, xurl, and microtype. The framed package is needed by Pandoc's highlighted-code environment, and fvextra wraps long code lines:

    • macOS: brew install pandoc && brew install --cask mactex-no-gui. Full MacTeX includes these packages; with the smaller BasicTeX, install them via sudo tlmgr install framed fvextra xurl microtype.
    • Debian/Ubuntu: apt-get install pandoc texlive-xetex texlive-latex-extra. The packages above ship in texlive-latex-extra.
  • rsvg-convert (from librsvg2-bin) for SVG figures.

  • ImageMagick (magick / convert) or macOS sips for converting GIF/WEBP/PPM figures.

  • A monospace font with box-drawing glyphs (DejaVu Sans Mono or macOS Menlo) so that code-block diagrams render; pass --monofont to choose one explicitly.

The Docker image bundles all of these.

If a render fails with File '<name>.sty' not found, install that package (tlmgr install <name>) or just use the Docker option above, which bundles everything.
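A missing package can also be caught before a long render starts. The sketch below is not part of build_book.py; it assumes a TeX distribution that provides the kpsewhich lookup tool, and the names REQUIRED_PACKAGES, kpsewhich_finds, and missing_packages are hypothetical.

```python
import shutil
import subprocess

# Packages named in the prerequisites above.
REQUIRED_PACKAGES = ["framed", "fvextra", "fancyhdr", "xurl", "microtype"]


def kpsewhich_finds(filename):
    """Return True if kpsewhich can locate `filename` in the TeX tree."""
    if shutil.which("kpsewhich") is None:
        return False  # no TeX distribution on PATH
    result = subprocess.run(
        ["kpsewhich", filename],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode == 0 and bool(result.stdout.strip())


def missing_packages(packages, finder=kpsewhich_finds):
    """List the packages whose .sty file the finder cannot locate."""
    return [name for name in packages if not finder(name + ".sty")]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED_PACKAGES)
    if missing:
        print("Missing LaTeX packages: " + " ".join(missing))
        print("Try: tlmgr install " + " ".join(missing))
    else:
        print("All required LaTeX packages were found.")
```

The finder argument exists only so the check can be exercised without a TeX installation; by default it shells out to kpsewhich.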

Known limitation: Cyrillic body text

The default body font (Latin Modern) does not include Cyrillic, so the book's few Russian phrases (e.g. a "Алгоритм Шора" link label) render blank. To include them, pass a Cyrillic-capable body font, e.g. --mainfont "DejaVu Serif" (bundled in the Docker image). This is left off by default to preserve the classic Latin Modern typography.
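To see which phrases would render blank, the assembled markdown (kept with --keep-intermediate) can be scanned for Cyrillic characters. This is a minimal sketch, not code from the repository; cyrillic_runs is a hypothetical helper, the sample string is made up, and only the basic Cyrillic Unicode block is checked.

```python
import re

# Basic Cyrillic Unicode block (U+0400 to U+04FF); covers Russian.
CYRILLIC = re.compile(r"[\u0400-\u04FF]+")


def cyrillic_runs(text):
    """Return each run of Cyrillic letters with its 1-based line number."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in CYRILLIC.finditer(line):
            hits.append((lineno, match.group()))
    return hits


# Illustrative input only; a real run would read the assembled master.md.
sample = "See [Алгоритм Шора](https://example.org) for details.\nPlain ASCII line."
for lineno, word in cyrillic_runs(sample):
    print(lineno, word)
```

If the scan reports any hits, passing --mainfont with a Cyrillic-capable font should make them visible in the PDF.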

Once the prerequisites are installed, build the book:

pip install -r requirements.txt
python build_book.py

Usage

python build_book.py [options]

--ref REF                Source commit SHA or branch (default: pinned commit)
--output FILE            Output PDF path (default: algorithms-for-modern-hardware.pdf)
--papersize {letter,a4}  Page size (default: letter)
--exclude NAME ...       Directory names / file basenames to skip (default: slides)
--cache-dir DIR          Download/intermediate dir (default: .cache)
--keep-intermediate      Keep the assembled master.md
--dry-run                Assemble markdown only; skip PDF rendering (no LaTeX needed)
--monofont NAME          Monospace font for code blocks (default: auto-detect a box-drawing-capable font like DejaVu Sans Mono / Menlo)
--mainfont NAME          Body font (default: Latin Modern); pass a Cyrillic-capable font for Russian text

How it works

  1. fetch — downloads the repo tarball at --ref and extracts content/english/hpc/ (markdown and images) into the cache.
  2. discover — parses each file's YAML front matter into an ordered part → chapter → section model (sorted by weight).
  3. preprocess — strips Hugo shortcodes and raw <style>/<script>, rewrites image paths to absolute local paths, rewrites relative links to absolute site URLs, and shifts heading levels — all while never touching fenced code blocks.
  4. assemble — concatenates into one master markdown with a metadata block and \part{} dividers.
  5. render — runs pandoc --pdf-engine=xelatex --top-level-division=chapter with assets/header.tex.
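The key constraint in the preprocess step is that transformations skip fenced code blocks. The following is a minimal sketch of that idea, not the actual code in build_book.py: the preprocess function, its shift parameter, and the regular expressions are assumptions, and it covers only shortcode stripping and heading shifts (not image or link rewriting).

```python
import re

FENCE_MARK = "`" * 3  # built at runtime so this listing stays easy to embed
FENCE = re.compile(r"^(`{3,}|~{3,})")
# Hugo shortcodes look like {{< name args >}} or {{% name args %}}.
SHORTCODE = re.compile(r"\{\{[<%].*?[%>]\}\}")
HEADING = re.compile(r"^(#{1,6})(\s)")


def shift_heading(match, shift):
    """Demote a heading by `shift` levels, capped at level 6."""
    level = min(6, len(match.group(1)) + shift)
    return "#" * level + match.group(2)


def preprocess(markdown, shift=1):
    """Strip shortcodes and demote headings, leaving fenced code untouched."""
    out = []
    fence = None  # the opening fence marker while inside a code block
    for line in markdown.splitlines():
        stripped = line.strip()
        marker = FENCE.match(stripped)
        if fence is None:
            if marker:
                fence = marker.group(1)  # entering a code block
            else:
                line = SHORTCODE.sub("", line)
                line = HEADING.sub(lambda m: shift_heading(m, shift), line)
        elif (
            marker
            and marker.group(1)[0] == fence[0]
            and len(marker.group(1)) >= len(fence)
            and stripped == marker.group(1)
        ):
            fence = None  # closing fence: same character, no info string
        out.append(line)
    return "\n".join(out)


sample = "\n".join([
    "# Title",
    "{{< note >}}Text",
    FENCE_MARK + "python",
    "# not a heading",
    "{{< kept >}}",
    FENCE_MARK,
    "## Section",
])
print(preprocess(sample))
```

In the sample, the two headings outside the fence are demoted by one level and the shortcode before "Text" is removed, while the comment line and the shortcode-like text inside the fence pass through unchanged.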

Development

pip install -r requirements.txt pytest
python -m pytest tests/ -v

Reproducibility

The source commit (--ref) and the Docker toolchain are pinned, so the same inputs produce the same PDF. Use --ref master to build from the latest source.
