Releases: scribeocr/scribe.js
v0.9.3
What's Changed
- Fixed bug causing text layer in PDF exports to be broken (#58)
  - This issue impacts all PDFs created with two patch releases from the last ~week (`0.9.1` and `0.9.2`). Anybody using those versions should update ASAP.
Full Changelog: v0.9.2...v0.9.3
v0.9.2
v0.9.1
What's Changed
- Various updates to experimental and debugging-related features.
- None of the documented features should change with this release.
Full Changelog: v0.9.0...v0.9.1
v0.9.0
What's Changed
- Added URW Gothic font
- Added Deno support
- Updated `.html` export format
  - This format contains a `.html` file that should closely resemble the original document.
  - This should be useful for converting `.pdf` files to a format that can be displayed natively in the browser.
- Added experimental `.txt` import format
  - For obvious reasons, importing `.txt` files will not work with most operations.
  - This mode is currently exclusively useful for development/debugging purposes and making basic `.pdf` files from `.txt` files.
- Performance improvements to PDF exports
- Various refactoring and minor updates.
Full Changelog: v0.8.0...v0.9.0
v0.8.0
What's Changed
- Added `scribe` CLI command
  - If `scribe.js` is installed globally (`npm i -g scribe.js-ocr`), the `scribe` command can be used to process documents from the command line.
    - For example, `scribe recognize analyst_report.png` runs OCR on an image and saves the result as a PDF.
  - This feature is still experimental and command/argument names and features may change without warning.
- Added new intermediate data format `.scribe` for storing and loading document data.
  - Given OCR is computationally expensive, it is often desirable to save results for later use without losing data.
  - By saving results to `.scribe` files, results can be re-loaded later (e.g. to export with slightly different settings).
    - While several other output formats can be re-loaded later (notably `.hocr` and `.pdf`), only `.scribe` can be re-loaded without any data being lost in the export/import process.
    - `.scribe` files only contain the text layer; they do not contain embedded images or PDF files.
    - `.scribe` files can be loaded alongside image/PDF files to restore both image and text data.
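For scripted use, the `scribe recognize` example from the CLI notes above could be wrapped from Node.js roughly as follows. This is a minimal sketch and not part of scribe.js: `recognizeArgs` and `runRecognize` are hypothetical helper names, and the code assumes the `scribe` executable is on `PATH` after the global install. Since the CLI is described as experimental, the subcommand name may change.

```javascript
// Hypothetical helper: builds the argument list for `scribe recognize <file>`.
function recognizeArgs(inputPath) {
  return ['recognize', inputPath];
}

// Hypothetical helper: runs the CLI and returns its exit status.
// Assumes `scribe` is on PATH (e.g. after `npm i -g scribe.js-ocr`).
async function runRecognize(inputPath) {
  // Dynamic import works from both CommonJS and ES module code.
  const { spawnSync } = await import('node:child_process');
  const result = spawnSync('scribe', recognizeArgs(inputPath), { stdio: 'inherit' });
  // `result.error` is set when the process could not be started,
  // for example when the CLI is not installed.
  if (result.error) throw result.error;
  return result.status;
}
```

Calling `runRecognize('analyst_report.png')` would then correspond to the command shown in the notes above. Keeping the argument list in its own function makes it easy to adjust if the command or argument names change.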
Full Changelog: v0.7.4...v0.8.0
v0.7.4
What's Changed
- Fixed bug causing crash for certain PDF input documents.
- Added support for bold + italic style (previously only bold or italic style)
- Added support for underline style.
  - Underlined text is currently detected automatically when importing a text-native PDF or Abbyy XML file.
- Disabled ligatures by default.
  - To re-enable, set `scribe.opt.ligatures` to `true`.
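The ligature setting above can be sketched as a one-line assignment. In this sketch, `enableLigatures` is a hypothetical helper name and is not part of scribe.js; it takes the `scribe` object as a parameter so the snippet runs without the library installed. Only the `scribe.opt.ligatures` property comes from the release notes.

```javascript
// Hypothetical helper: re-enables ligatures on a scribe-like object.
// `scribe` is expected to expose an `opt` settings object, as in the notes above.
function enableLigatures(scribe) {
  scribe.opt.ligatures = true;
  return scribe.opt.ligatures;
}

// Stand-in object used only so this sketch runs on its own;
// real code would pass the object imported from the scribe.js package.
const scribeStandIn = { opt: { ligatures: false } };
enableLigatures(scribeStandIn);
```

In real code the object passed in would be the one imported from the package (assumed here to be the default export of `scribe.js-ocr`).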
Full Changelog: v0.7.3...v0.7.4
v0.7.3
v0.7.2
What's Changed
- Added HTML output format (browser only).
  - This implementation is still preliminary and may change substantially in future versions.
- Standardized fonts and font names
Full Changelog: v0.7.1...v0.7.2
v0.7.1
v0.7.0
What's Changed
- Major rework of PDF export implementation.
  - Writing to PDF is faster and uses less memory.
  - Documents that used to crash due to memory errors now run almost instantly.
  - For many inputs, output PDF file sizes are now much smaller.
- Fixed memory leaks within OCR module.
- Misc bug fixes.
Full Changelog: v0.6.1...v0.7.0