Releases: scribeocr/scribe.js

v0.9.3

15 Nov 07:28

What's Changed

  • Fixed a bug that broke the text layer in PDF exports (#58)
    • This issue affects all PDFs created with the two patch releases from the past week or so (0.9.1 and 0.9.2). Anyone using those versions should update as soon as possible.
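
As a rough illustration of checking for the affected releases, the sketch below tests a version string against the two versions named above. The helper name `isAffectedVersion` is hypothetical and not part of scribe.js; the version string is assumed to come from wherever the installed version is recorded (for example, the `version` field of the installed package's package.json).

```javascript
// Versions named in the v0.9.3 notes as producing PDFs with a broken text layer.
const AFFECTED_VERSIONS = new Set(['0.9.1', '0.9.2']);

// Hypothetical helper (not part of scribe.js).
// Returns true if the given version string is one of the affected releases.
function isAffectedVersion(version) {
  // Accept an optional leading "v", as used in the release tags (e.g. "v0.9.2").
  const normalized = String(version).trim().replace(/^v/, '');
  return AFFECTED_VERSIONS.has(normalized);
}

// Example: warn when an affected version is detected.
const installed = 'v0.9.2'; // assumed input, for illustration only
if (isAffectedVersion(installed)) {
  console.log(`${installed} is affected; PDFs exported with it may need to be regenerated after updating.`);
}
```

PDFs already exported with an affected version are not repaired by updating, so a check like this is mainly useful for deciding which outputs to regenerate.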

Full Changelog: v0.9.2...v0.9.3

v0.9.2

14 Nov 04:42

What's Changed

  • Fixed bug causing crash on single-core systems (#56)
  • Updated scribe.opt.workerN option to cap workers created for PDF rendering
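
The sketch below shows one way a caller might pick a value for `scribe.opt.workerN` that stays valid on single-core systems. The helper `chooseWorkerCount` is hypothetical, and how scribe.js chooses its own default is not described in these notes; only the option name comes from the release text.

```javascript
// Hypothetical helper (not part of scribe.js).
// Picks a worker count that is at least 1 and never exceeds the requested cap.
function chooseWorkerCount(availableCores, maxWorkers) {
  // Fall back to 1 core when the core count is missing or invalid.
  const cores = Number.isInteger(availableCores) && availableCores > 0 ? availableCores : 1;
  // With no valid cap, use every available core.
  const cap = Number.isInteger(maxWorkers) && maxWorkers > 0 ? maxWorkers : cores;
  return Math.max(1, Math.min(cores, cap));
}

// Assumed usage, with `scribe` being the object exported by scribe.js:
//   scribe.opt.workerN = chooseWorkerCount(navigator.hardwareConcurrency, 2); // browser
//   scribe.opt.workerN = chooseWorkerCount(require('os').cpus().length, 2);   // Node.js
console.log(chooseWorkerCount(1, 4));
console.log(chooseWorkerCount(8, 2));
```

Clamping to a minimum of 1 is the point of the sketch: a single-core machine still gets one worker instead of zero.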

Full Changelog: v0.9.1...v0.9.2

v0.9.1

07 Nov 07:28

What's Changed

  • Various updates to experimental and debugging-related features.
    • None of the documented features should change with this release.

Full Changelog: v0.9.0...v0.9.1

v0.9.0

08 Sep 08:15

What's Changed

  • Added URW Gothic font
  • Added Deno support
  • Updated .html export format
    • This format contains a .html file that should closely resemble the original document.
    • This should be useful for converting .pdf files to a format that can be displayed natively in the browser.
  • Added experimental .txt import format
    • Importing .txt files will not work with most operations.
    • This mode is currently useful only for development and debugging, and for making basic .pdf files from .txt files.
  • Performance improvements to PDF exports
  • Various refactoring and minor updates.

Full Changelog: v0.8.0...v0.9.0

v0.8.0

09 Mar 09:39

What's Changed

  • Added scribe CLI command
    • If scribe.js is installed globally (npm i -g scribe.js-ocr), the scribe command can be used to process documents from the command line.
      • For example, scribe recognize analyst_report.png runs OCR on an image and saves the result as a PDF.
    • This feature is still experimental and command/argument names and features may change without warning.
  • Added new intermediate data format .scribe for storing and loading document data.
    • Because OCR is computationally expensive, it is often desirable to save results for later use without losing data.
    • By saving results to .scribe files, results can be re-loaded later (e.g. to export with slightly different settings).
      • While several other output formats can be re-loaded later (notably .hocr and .pdf), only .scribe can be re-loaded without any data being lost in the export/import process.
      • .scribe files only contain the text layer; they do not contain embedded images or PDF files.
        • .scribe files can be loaded alongside image/PDF files to restore both image and text data.

Full Changelog: v0.7.4...v0.8.0

v0.7.4

03 Mar 08:08

What's Changed

  • Fixed bug causing crash for certain PDF input documents.
  • Added support for combined bold + italic style (previously, text could be bold or italic, but not both)
  • Added support for underline style.
    • Underlined text is currently detected automatically when importing a text-native PDF or Abbyy XML file.
  • Disabled ligatures by default.
    • To re-enable, set scribe.opt.ligatures to true.

Full Changelog: v0.7.3...v0.7.4

v0.7.3

03 Mar 08:02

What's Changed

  • Updated HTML export to support Node.js

Full Changelog: v0.7.2...v0.7.3

v0.7.2

20 Feb 04:25

What's Changed

  • Added HTML output format (browser only).
    • This implementation is still preliminary and may change substantially in future versions.
  • Standardized fonts and font names

Full Changelog: v0.7.1...v0.7.2

v0.7.1

09 Feb 19:46

What's Changed

  • Standardized fonts and font names

Full Changelog: v0.7.0...v0.7.1

v0.7.0

07 Jan 08:38

What's Changed

  • Major rework of PDF export implementation.
    • Writing to PDF is faster and uses less memory.
      • Documents that used to crash due to memory errors now run almost instantly.
    • For many inputs, output PDF file sizes are now much smaller.
  • Fixed memory leaks within OCR module.
  • Misc bug fixes.

Full Changelog: v0.6.1...v0.7.0