Right now we claim Turkish-focused accuracy in the README, but there are no numbers behind it. We should add a small, reproducible benchmark.
What to build
- Run OpenCR over a fixed set of public-domain Turkish PDFs (~50 pages total)
- Measure Word Error Rate and Character Error Rate against gold-standard transcripts
- Run the same fixtures through Tesseract, Surya, PaddleOCR, and Marker for comparison
- Publish the resulting table at benchmarks/RESULTS.md and link it from the README
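To make the two metrics concrete, here is a minimal sketch of how WER and CER could be computed with no dependencies beyond the standard library. The function names (`edit_distance`, `wer`, `cer`) and the choice to NFC-normalize and split on whitespace are assumptions for illustration, not something the project has settled on. The runner could just as well use an existing metrics package.

```python
import unicodedata
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: insertions, deletions and substitutions all cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def _normalize(text: str) -> str:
    # Assumption: compare in NFC so that precomposed and decomposed forms
    # of letters such as "ş" or "ğ" count as the same character.
    return unicodedata.normalize("NFC", text)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character edits divided by reference length."""
    ref, hyp = _normalize(reference), _normalize(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word edits divided by reference word count."""
    ref = _normalize(reference).split()
    hyp = _normalize(hypothesis).split()
    if not ref:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref, hyp) / len(ref)
```

Both rates are normalized by the length of the gold transcript, so they can exceed 1.0 when an engine hallucinates a lot of extra text. Whatever normalization we pick (case, whitespace, punctuation) should be applied identically to every engine and documented next to the results table.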
Where things live
Fixtures and gold transcripts go under benchmarks/fixtures/, the runner script lives at benchmarks/run.py, and comparison tooling goes under benchmarks/compare/.
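As a rough idea of how the runner might discover its inputs, the sketch below pairs each PDF with a gold transcript. The naming convention (a `page.pdf` next to a `page.txt` with the same stem) and the helper name `find_fixture_pairs` are assumptions made for this example. The actual layout inside benchmarks/fixtures/ is still open.

```python
from pathlib import Path
from typing import List, Tuple


def find_fixture_pairs(fixtures_dir: str) -> List[Tuple[Path, Path]]:
    """Return (pdf, gold_transcript) pairs, sorted by file name.

    Assumed convention: every `<name>.pdf` has a UTF-8 `<name>.txt`
    gold transcript sitting next to it.
    """
    root = Path(fixtures_dir)
    pairs = []
    for pdf in sorted(root.glob("*.pdf")):
        gold = pdf.with_suffix(".txt")
        if not gold.is_file():
            # Fail loudly so a missing transcript cannot silently shrink the benchmark.
            raise FileNotFoundError(f"no gold transcript for {pdf.name}")
        pairs.append((pdf, gold))
    return pairs
```

Sorting the fixtures keeps the run order stable across machines, which helps with the "reproducible" part of the goal.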
Why
Even informal numbers are more useful than the silence we have now. This is also a great way for new contributors to help — no model code needed, mostly careful PDF curation and a bit of scripting.
Good first issue.