[NeurIPS 2024] 🕸 GlotCC Dataset and Pipeline
Updated Apr 6, 2025 - Jupyter Notebook
Multilingual dataset for principal parts detection in inflectional morphology (CoNLL 2025)
Parallel Literary Corpora: Fiction and Poetry Translations
Multilingual emotional speech datasets for TTS training
Multilingual dataset of world cities with English and Arabic names, population, and country info, provided in JSON, CSV, SQL, and Excel formats. It is intended to provide enriched information on countries, states, and their capitals, translated into Arabic, along with the population of each city.