SlideSmith Vision

Bridge visual slide reconstruction into SlideSmith.

SlideSmith turns HTML slides into editable .pptx files. SlideSmith Vision sits one step upstream: it converts a simple visual reconstruction spec into SlideSmith-compatible HTML, so screenshots, AI-generated slide mockups, or vision/OCR outputs can become editable PowerPoint/WPS decks through the existing SlideSmith engine.

How It Fits With SlideSmith

flowchart LR
  A["slide image / screenshot / AI mockup"] --> B["vision + OCR + layout reconstruction"]
  B --> C["SlideSmith Vision spec"]
  C --> D["SlideSmith HTML"]
  D --> E["SlideSmith"]
  E --> F["editable .pptx for WPS / PowerPoint / Keynote"]

SlideSmith should stay focused on one job: HTML to editable PPTX.

SlideSmith Vision owns the upstream reconstruction layer:

image/OCR/layout output normalization
hybrid editability decisions
chart or figure raster fallbacks
generating SlideSmith-compatible HTML
example specs for screenshot-to-PPTX workflows

Usage

slidesmith-vision <spec.json> -o <output.html>

Converts a reconstruction spec into SlideSmith-compatible HTML, which SlideSmith can then turn into an editable PPTX. See examples/ for sample specs.

Current Scope

This repository currently provides a small spec -> HTML bridge. It does not try to solve full automatic OCR or layout inference yet.

Supported spec elements:

editable text boxes
rectangles, rounded rectangles, ovals, and simple triangles
straight lines
raster image fallbacks: local image paths (relative to the spec file) are inlined as base64 data URIs, so the generated HTML is self-contained and can be converted on any machine (see examples/with-image/)
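
To illustrate the inlining idea behind raster fallbacks, here is a minimal sketch of turning image bytes into a base64 data URI. The helper names `toDataUri` and `mimeFromPath` are hypothetical and are not taken from this repository; the actual bridge may handle MIME detection and file reading differently.

```javascript
// Hypothetical helpers for illustration only; not the repository's code.

// Encodes raw image bytes as a base64 data URI usable in an <img src>.
function toDataUri(bytes, mimeType) {
  const base64 = Buffer.from(bytes).toString("base64");
  return `data:${mimeType};base64,${base64}`;
}

// Guesses a MIME type from the file extension (small illustrative table).
function mimeFromPath(path) {
  const ext = path.slice(path.lastIndexOf(".") + 1).toLowerCase();
  const table = {
    png: "image/png",
    jpg: "image/jpeg",
    jpeg: "image/jpeg",
    gif: "image/gif",
    svg: "image/svg+xml",
  };
  return table[ext] || "application/octet-stream";
}

// In-memory bytes stand in for a file read relative to the spec.
const uri = toDataUri(Buffer.from("hi"), mimeFromPath("chart.PNG"));
console.log(uri); // data:image/png;base64,aGk=
```

Because the image travels inside the HTML, the conversion step does not need access to the original image path.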

Quick Start

Generate SlideSmith HTML:

npm test
# writes /tmp/slidesmith-vision-basic.html

Or run directly:

node bin/cli.mjs examples/basic/spec.json -o /tmp/basic.html

Then convert with SlideSmith:

node ../slidesmith/bin/cli.mjs /tmp/basic.html -o /tmp/basic.pptx --no-fonts

Spec Format

{
  "canvas_width": 1920,
  "canvas_height": 1080,
  "slides": [
    {
      "background": "#ffffff",
      "elements": [
        {
          "type": "text",
          "x": 120,
          "y": 90,
          "w": 900,
          "h": 80,
          "text": "Editable title",
          "font_size": 42,
          "font_face": "Arial",
          "color": "#111111",
          "bold": true
        }
      ]
    }
  ]
}

Coordinates are source-canvas pixels. The generated HTML uses the same pixel canvas, which lets the browser layout engine and SlideSmith preserve positions.
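
As a rough illustration of the pixel-for-pixel mapping, the sketch below turns the text element from the spec above into an absolutely positioned block. `textElementToHtml` and `escapeHtml` are hypothetical names, and treating `font_size` as CSS pixels is an assumption; the markup this repository actually emits may differ.

```javascript
// Hypothetical sketch; the real generated markup may differ.

// Escapes characters that are unsafe in HTML text and attribute values.
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Maps one "text" element to a <div> placed at the same canvas pixels.
function textElementToHtml(el) {
  const style = [
    "position:absolute",
    `left:${el.x}px`,
    `top:${el.y}px`,
    `width:${el.w}px`,
    `height:${el.h}px`,
    `font-size:${el.font_size}px`, // assumption: font_size is in CSS pixels
    `font-family:${el.font_face}`,
    `color:${el.color}`,
    `font-weight:${el.bold ? "bold" : "normal"}`,
  ].join(";");
  return `<div style="${escapeHtml(style)}">${escapeHtml(el.text)}</div>`;
}

const title = {
  type: "text", x: 120, y: 90, w: 900, h: 80,
  text: "Editable title", font_size: 42, font_face: "Arial",
  color: "#111111", bold: true,
};
console.log(textElementToHtml(title));
```

Keeping the spec and the HTML on the same pixel canvas means no scaling step is needed between reconstruction and conversion.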

Reconstruction Policy

Do not force everything into vector objects. Use a hybrid strategy:

keep readable text editable
rebuild simple cards, labels, pills, and diagram nodes as shapes
preserve dense charts, photos, screenshots, heatmaps, and complex illustrations as raster crops
record source image paths for traceability

This matches the practical WPS workflow: key text can be edited, while complex visuals stay visually faithful.
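
The hybrid policy above could be sketched as a small decision function. Everything here is an assumption for illustration: the `kind` and `readable` fields, the strategy names, and the choice to fall back to a raster crop for unknown content are not part of the SlideSmith Vision spec.

```javascript
// Hypothetical region classifier; field and strategy names are illustrative.
const SHAPE_KINDS = new Set(["card", "label", "pill", "diagram_node"]);
const RASTER_KINDS = new Set([
  "chart", "photo", "screenshot", "heatmap", "illustration",
]);

// Picks how a detected region should be rebuilt in the slide.
function chooseStrategy(region) {
  // Readable text stays editable.
  if (region.kind === "text" && region.readable) return "editable_text";
  // Simple containers are rebuilt as vector shapes.
  if (SHAPE_KINDS.has(region.kind)) return "shape";
  // Dense or complex visuals are kept as raster crops.
  if (RASTER_KINDS.has(region.kind)) return "raster_crop";
  // Assumption: anything unrecognized is kept as a crop to stay faithful.
  return "raster_crop";
}

console.log(chooseStrategy({ kind: "text", readable: true })); // editable_text
console.log(chooseStrategy({ kind: "pill" }));                 // shape
console.log(chooseStrategy({ kind: "heatmap" }));              // raster_crop
```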

License

MIT
