Local evaluation harness for comparing ASR workflows across:
- `worker-faster_whisper`
- `Qwen3-ASR`
- `FireRedASR2S`

It contains:
- benchmark manifests
- a durable curated sample catalog in `sample_catalog.json`
- local runner scripts
- report generation scripts
- the mobile-friendly HTML report template in `report/index.html`

Generated artifacts are intentionally excluded from git:
- downloaded audio in `data/`
- benchmark outputs in `out/`
- report runtime cache in `report/data.json` and `report/runs.json`
- report source-of-truth DB in `report/report.db`

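
As a rough illustration, a cleanup or packaging script could use the paths listed above to tell generated artifacts apart from tracked files. The helper name `is_generated_artifact` is hypothetical and not part of this repo; only the paths come from the list above.

```python
from pathlib import PurePosixPath

# Paths copied from the list of generated artifacts above.
GENERATED_DIRS = ("data", "out")
GENERATED_FILES = ("report/data.json", "report/runs.json", "report/report.db")


def is_generated_artifact(rel_path: str) -> bool:
    """Return True if a repo-relative path is one of the generated artifacts.

    Hypothetical helper for illustration only.
    """
    path = PurePosixPath(rel_path)
    if str(path) in GENERATED_FILES:
        return True
    # Anything under data/ or out/ counts as generated.
    return len(path.parts) > 0 and path.parts[0] in GENERATED_DIRS
```

Under this sketch, `report/index.html` and `sample_catalog.json` are treated as tracked files, while anything under `data/` or `out/` is treated as generated.
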
The local report server now keeps source registrations in SQLite and exposes simple write APIs:
- `GET /api/report/sources`
- `GET /api/report/data`
- `POST /api/report/rebuild`
- `POST /api/report/register-source`
- `POST /api/report/register-batch`

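
The routes above could be called from a script with a thin wrapper like the following. This is a minimal sketch using only the Python standard library. The helper names `build_request` and `call` are hypothetical, the base URL assumes the local host and port used elsewhere in this README, and the assumption that every route replies with JSON is not confirmed by the source.

```python
import json
import urllib.request
from typing import Any, Optional

# Assumed base URL for the local report server; adjust to your setup.
BASE_URL = "http://127.0.0.1:18745"


def build_request(method: str, path: str,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request for one of the report API routes."""
    data = None
    headers = {}
    if payload is not None:
        # Keep non-ASCII labels readable and encode the body as UTF-8 JSON.
        data = json.dumps(payload, ensure_ascii=False).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


def call(method: str, path: str, payload: Optional[dict] = None,
         timeout: float = 30.0) -> Any:
    """Send the request and decode the reply.

    Assumption: the server answers with a JSON body.
    """
    request = build_request(method, path, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `call("GET", "/api/report/sources")` would fetch the registered sources from a running server. The request bodies expected by the `POST` routes are not documented here beyond the batch example, so they are left to the caller.
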
Example batch registration:

    curl -X POST http://127.0.0.1:18745/api/report/register-batch \
      -H 'Content-Type: application/json' \
      -d '{
        "batch_key": "eval-20260318-fr",
        "batch_label": "法语对比",
        "sources": [
          {"engine": "worker", "result_file": "/home/kevinzhow/github/asr-bench/out/a.json"},
          {"engine": "qwen", "result_file": "/home/kevinzhow/github/asr-bench/out/b.json"}
        ]
      }'

Repository layout:
- `scripts/` - report build / registration / serving
- `report/` - static report frontend
- `manifest*.json` - benchmark sample lists
- `sample_catalog.json` - curated user-provided sample inventory grouped by issue / language / scenario
- `run_*` - local benchmark entrypoints

This repo depends on neighboring local repos for actual model execution:
- `/home/kevinzhow/worker-faster_whisper`
- `/home/kevinzhow/github/Qwen3-ASR`
- `/home/kevinzhow/github/FireRedASR2S`

The benchmark repo itself stores orchestration and reporting only.
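
Because model execution lives in those neighboring checkouts, a preflight check before a benchmark run can catch a missing repo early. The following is a minimal sketch. `SIBLING_REPOS` and `missing_repos` are hypothetical names, and the paths are copied verbatim from the list above, so they only apply to a machine laid out the same way.

```python
from pathlib import Path

# Paths copied verbatim from this README; they are machine-specific.
SIBLING_REPOS = {
    "worker-faster_whisper": Path("/home/kevinzhow/worker-faster_whisper"),
    "Qwen3-ASR": Path("/home/kevinzhow/github/Qwen3-ASR"),
    "FireRedASR2S": Path("/home/kevinzhow/github/FireRedASR2S"),
}


def missing_repos(repos=SIBLING_REPOS):
    """Return the names of sibling repos whose directory does not exist."""
    return [name for name, path in repos.items() if not path.is_dir()]
```

A runner script could call `missing_repos()` first and stop with a clear message if the returned list is not empty.
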