feat: implement OCR regression harness with golden dataset and CI integration by bytebinders · Pull Request #521 · Pulsefy/Soter

bytebinders · 2026-05-29T17:19:15Z

✅ PR Description:

What was done:

Modular Harness: Developed a full-featured OCR regression harness in app/ai-service/regression_harness/.
Golden Dataset: Created a structured repository for "golden" documents and ground truth values in regression_harness/dataset/.
Automated Evaluation: Implemented evaluator.py to compare actual OCR output against expected fields, supporting text normalization, error classification, and confidence tracking.
CLI & Reporting: Added cli.py to run suites locally, providing human-readable console summaries and machine-readable JSON reports for CI artifacts.
CI/CD Integration: Integrated a new GitHub Actions workflow .github/workflows/ocr-regression.yml to trigger automatically on OCR-related changes.
Documentation: Provided a comprehensive README.md for tool usage, adding new samples, and maintenance.
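
As an illustration of the evaluation step described above, here is a minimal sketch of field comparison with text normalization and error classification. It is not the code in evaluator.py: the names `normalize`, `FieldResult`, and `compare_fields`, the three status labels, and the shape of the OCR output (a dict of `{"value": ..., "confidence": ...}` per field) are all assumptions made for this example.

```python
import re
import unicodedata
from dataclasses import dataclass
from typing import Dict, List, Optional


def normalize(text: str) -> str:
    """Apply Unicode NFKC, collapse whitespace runs, trim, and lowercase."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip().lower()


@dataclass
class FieldResult:
    name: str
    status: str  # hypothetical labels: "match", "mismatch", or "missing"
    confidence: Optional[float]


def compare_fields(expected: Dict[str, str], actual: Dict[str, dict]) -> List[FieldResult]:
    """Compare each expected field against the (assumed) OCR output structure."""
    results = []
    for name, want in expected.items():
        got = actual.get(name)
        # A field the OCR step did not return, or returned empty, is "missing".
        if got is None or not got.get("value"):
            results.append(FieldResult(name, "missing", None))
            continue
        same = normalize(got["value"]) == normalize(want)
        results.append(FieldResult(name, "match" if same else "mismatch", got.get("confidence")))
    return results
```

Normalizing both sides before comparing keeps cosmetic differences (casing, extra spaces) from being reported as regressions, while the per-field status makes it possible to separate dropped fields from wrong values in a report.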

Why it was done:
To establish a reliable, low-maintenance testing infrastructure that ensures OCR extraction accuracy is preserved as the AI models, prompts, or preprocessing steps evolve.

How it was verified:

Verified the integrity of the data models and evaluation logic.
Validated the CLI's reporting capabilities (JSON and Console).
Verified the GitHub Actions configuration for environment dependencies (Tesseract-OCR).
Local sanity check performed on the directory structure and file system operations.
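
A directory-structure sanity check of the kind mentioned above could look roughly like the sketch below. It assumes, hypothetically, that ground_truth.json is a JSON list whose entries carry a `"file"` key naming a document relative to the dataset directory; the real schema in this PR may differ, and `find_missing_documents` is an illustrative name only.

```python
import json
from pathlib import Path
from typing import List


def find_missing_documents(dataset_dir: Path) -> List[str]:
    """Return documents listed in ground_truth.json that are absent on disk.

    Assumes each entry has a "file" key (hypothetical schema).
    """
    ground_truth = dataset_dir / "ground_truth.json"
    entries = json.loads(ground_truth.read_text(encoding="utf-8"))
    return [e["file"] for e in entries if not (dataset_dir / e["file"]).is_file()]
```

Running a check like this before the OCR suite lets a missing sample image fail fast with a clear message instead of surfacing as an opaque workflow error.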

Summary of Work:

Models: models.py defines the schema for samples and results.
Evaluator: evaluator.py implements the logic for field comparison and IoU calculation.
CLI: cli.py provides the interface to run tests and export results.
Dataset: Established ground_truth.json with sample documents.
CI/CD: Added ocr-regression.yml for automated regression testing.
Documentation: Created README.md.
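
For readers unfamiliar with the IoU calculation mentioned for evaluator.py, the sketch below shows the standard intersection-over-union formula. It assumes axis-aligned bounding boxes given as `(x_min, y_min, x_max, y_max)`; the PR does not state which box format or function name the harness uses, so `Box` and `iou` are placeholders.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes, in [0.0, 1.0]."""
    # Overlap rectangle; width and height clamp to zero when boxes are disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Guard against zero-area inputs to avoid division by zero.
    return inter / union if union > 0 else 0.0
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` has an overlap of area 1 and a union of area 7, giving 1/7.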

Required line:
Closes #464


drips-wave · 2026-05-29T17:19:25Z

@bytebinders Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits.

You can now apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀

Learn more about application limits

bytebinders · 2026-05-29T17:45:52Z

@Cedarich, could you drop a few more GitHub issues for me to work on? I’m ready for more 🔧🙂

Cedarich · 2026-05-29T18:03:41Z

Please fix the workflow.

bytebinders · 2026-05-30T09:11:06Z

@Cedarich the issue was a missing file, sample_001.png, but I have fixed it.

Cedarich · 2026-05-30T10:07:30Z

Fix workflow


d2ee65b · feat: implement OCR regression testing harness with CLI, evaluator, and automated workflow support

36e1326 · feat: add sample_001.png to regression harness document dataset

3394ea5 · feat: implement OCRService with Tesseract integration and regex-based field detection


bytebinders wants to merge 3 commits into Pulsefy:main from bytebinders:main


2 participants
