Code and data for our EvaLatin 2026 submission (first on poetry).
Updated Apr 4, 2026 - Python
Official code for "Fine-Tashkeel at KSAA-2026" — Systematic evaluation of 18 Seq2Seq, token classification, decoder LLM, and ASR models for automatic Arabic text diacritization. 5th place at KSAA-2026 Shared Task (OSACT7 @ LREC 2026).
Official code for "Ketaba-OCR at AR-MS NakbaNLP 2026" — QLoRA fine-tuning of a specialized HTR model with Linear+Boost ensemble for Arabic manuscript recognition. 1st place per-line (CER 0.082) and 3rd place official leaderboard at NakbaNLP 2026 (LREC 2026).
Additional experimental model for NakbaNLP 2026 Shared Task (AR-MS) — LoRA/DoRA fine-tuning of Qari-OCR (Qwen2-VL-2B) for Arabic handwritten manuscript recognition on the Omar Al-Saleh Memoir Collection (1951-1965).
Main repository of the Ecole nationale des chartes - PSL team for EvaHan 2026 Challenge @LREC2026
OasisSimp: An Open-source Asian-English Sentence Simplification Dataset
TantaArabNLP at KSAA-2026: Adapting CATT-Whisper for Arabic Speech Dictation with Automatic Diacritization.
Evaluation of Arabic speech dictation and diacritization models, with KSAA 2026 results, code, and benchmarks for seq2seq and multimodal approaches.