DigitalCarleton/AI_transcription

AI Approaches to Handwriting Transcription

Project Overview

Handwritten document transcription remains labor-intensive, but new AI systems may reduce that burden. This project aims to assess how accurately and usefully current AI tools can transcribe handwritten documents across different genres, levels of legibility, and contextual density. More importantly, we want to identify a stable default workflow for manual correction and build a reusable evaluation framework for a larger-scale study.

Experiment Results

This repository includes several transcription experiments we performed on historical documents and recent handwritten notes. The results are attached as PDF files in the ZIP folder. Using a one-shot prompt with Gemini 3.0, we obtained fairly accurate results, but peculiarities such as indentation, columns, and symbols require more refined prompting. Thus, although the LLM could transcribe even the most illegible handwriting, the exact output format needs further prompting and specification. We have therefore designed a relatively simple script to convert archival images and metadata into reviewable output.

Scale-up Workflow

The process begins by exporting item metadata and image URLs from the Omeka API, splitting the exported data by collection, and dividing each collection into smaller JSONL batches for Gemini transcription. Gemini returns the transcription results as JSONL, which can then be converted into readable HTML files.
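The local steps of this workflow (grouping by collection, batching into JSONL, and rendering results as HTML) could look roughly like the sketch below. It is not the repository's script: the function names, the `collection`, `id`, and `transcription` field names, and the batch size are all assumptions made for illustration. The Omeka export and the Gemini call are left out.

```python
import html
import json
from collections import defaultdict
from pathlib import Path


def group_by_collection(items, key="collection"):
    """Group exported item records by collection (the field name is an assumption)."""
    groups = defaultdict(list)
    for item in items:
        groups[item.get(key, "uncategorized")].append(item)
    return dict(groups)


def write_jsonl_batches(items, out_dir, prefix, batch_size=50):
    """Write items to numbered JSONL files with at most batch_size records each."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for start in range(0, len(items), batch_size):
        path = out_dir / f"{prefix}_{start // batch_size:03d}.jsonl"
        with path.open("w", encoding="utf-8") as fh:
            for item in items[start:start + batch_size]:
                fh.write(json.dumps(item, ensure_ascii=False) + "\n")
        paths.append(path)
    return paths


def results_to_html(jsonl_path, title="Transcriptions"):
    """Render one JSONL results file as a simple HTML page for manual review."""
    sections = []
    with Path(jsonl_path).open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # tolerate blank lines between records
            rec = json.loads(line)
            # Hypothetical field names; escape everything before embedding it.
            item_id = html.escape(str(rec.get("id", "")))
            text = html.escape(str(rec.get("transcription") or ""))
            sections.append(f"<h2>{item_id}</h2>\n<pre>{text}</pre>")
    body = "\n".join(sections)
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{html.escape(title)}</title></head>\n'
        f"<body>\n{body}\n</body></html>"
    )
```

Using `<pre>` keeps line breaks and indentation from the transcription visible to the reviewer, which matters because layout features were the main source of errors in the experiments.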

Run script

python accuracy.py --ai ai.txt --truth truth.txt
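The contents of accuracy.py are not shown here, so the following is only a sketch of one common way to score a transcription against a ground truth: character error rate (CER) and word error rate (WER), both based on Levenshtein edit distance. The `--ai` and `--truth` flags mirror the command above; everything else, including the choice of metric and the function names, is an assumption.

```python
import argparse
from pathlib import Path


def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (x != y),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1]


def error_rate(hypothesis, reference):
    """Edit distance normalized by the reference length (0.0 means identical)."""
    if len(reference) == 0:
        return 0.0 if len(hypothesis) == 0 else 1.0
    return edit_distance(hypothesis, reference) / len(reference)


def main(argv=None):
    parser = argparse.ArgumentParser(description="Score an AI transcription.")
    parser.add_argument("--ai", required=True, help="AI transcription text file")
    parser.add_argument("--truth", required=True, help="ground-truth text file")
    args = parser.parse_args(argv)

    ai = Path(args.ai).read_text(encoding="utf-8")
    truth = Path(args.truth).read_text(encoding="utf-8")

    cer = error_rate(ai, truth)                  # compared character by character
    wer = error_rate(ai.split(), truth.split())  # compared word by word
    print(f"CER: {cer:.3f}")
    print(f"WER: {wer:.3f}")
    return cer, wer
```

Calling `main(["--ai", "ai.txt", "--truth", "truth.txt"])` scores the two files. Reporting both rates is useful because CER is sensitive to small spelling slips while WER reflects how many words a human corrector would need to touch.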
