HATEDEMICS Models and Data

This repository contains resources developed within the HATEDEMICS project, including models and annotated datasets for hate speech detection in English, Spanish, Polish, and Italian.

Languages

The repository currently includes resources for:

English
Spanish
Polish
Italian

Tasks

Hate Speech Detection

Models and annotations for identifying hate speech in online content.

Models

The released models were developed with MaChAmp. We provide one final hate speech detection model for each language: English, Spanish, Polish, and Italian.

Each released model is a fine-tuned version of an existing pre-trained model, trained on the corresponding annotated data in the data/ directory. The MaChAmp configuration files used for training are provided in the configs/ directory.

Released Models

| Task | Language | Base model | Training data |
| --- | --- | --- | --- |
| Hate speech | Italian | `MilaNLProc/hate-ita` | Human-annotated Italian Telegram data |
| Hate speech | Polish | `ptaszynski/bert-base-polish-cyberbullying` | Human-annotated Polish Telegram data |
| Hate speech | English | `facebook/roberta-hate-speech-dynabench-r4-target` | LLM-annotated English Telegram data |
| Hate speech | Spanish | `dccuchile/bert-base-spanish-wwm-uncased` | LLM-annotated Spanish Telegram data |
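
For scripting, the table above can be mirrored as a small lookup. The following is a minimal sketch, not part of the repository: the names `ReleasedModel`, `RELEASED_MODELS`, `base_model_for`, and `load_base_tokenizer` are hypothetical, and the loader assumes the base model identifiers resolve on the Hugging Face Hub and that the `transformers` package is installed. It loads the base checkpoints only, not the fine-tuned HATEDEMICS models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleasedModel:
    language: str
    base_model: str  # pre-trained checkpoint identifier, copied from the table
    annotation: str  # "human" or "llm", from the "Training data" column


# Hypothetical registry that mirrors the Released Models table.
RELEASED_MODELS = {
    "italian": ReleasedModel("Italian", "MilaNLProc/hate-ita", "human"),
    "polish": ReleasedModel(
        "Polish", "ptaszynski/bert-base-polish-cyberbullying", "human"
    ),
    "english": ReleasedModel(
        "English", "facebook/roberta-hate-speech-dynabench-r4-target", "llm"
    ),
    "spanish": ReleasedModel(
        "Spanish", "dccuchile/bert-base-spanish-wwm-uncased", "llm"
    ),
}


def base_model_for(language: str) -> str:
    """Return the base model identifier for a language name (case-insensitive)."""
    key = language.strip().lower()
    if key not in RELEASED_MODELS:
        raise KeyError(f"No released model for language: {language!r}")
    return RELEASED_MODELS[key].base_model


def load_base_tokenizer(language: str):
    """Load the tokenizer of the base checkpoint (assumption: Hugging Face Hub).

    The import is deferred so the lookup above works without transformers.
    """
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(base_model_for(language))
```

For example, `base_model_for("Polish")` returns `ptaszynski/bert-base-polish-cyberbullying`. The `annotation` field makes it easy to separate models trained on human-annotated data from those trained on LLM-annotated data.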

Repository Organization

The repository is organized around the three main types of released resources: annotated data, fine-tuned models, and MaChAmp configuration files.

.
├── data/
│   ├── human_annotated/
│   └── llm_annotated/
├── models/
│   └── hate_speech/
├── configs/
│   ├── datasets/
│   └── parameters/
├── docs/
└── README.md
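
After cloning, the layout above can be checked with a short script. This is a sketch under the assumption that a local checkout follows the tree exactly as drawn; `EXPECTED_DIRS` and `missing_directories` are hypothetical names, not part of the repository.

```python
from pathlib import Path

# Directories taken from the tree above, relative to the repository root.
EXPECTED_DIRS = [
    "data/human_annotated",
    "data/llm_annotated",
    "models/hate_speech",
    "configs/datasets",
    "configs/parameters",
    "docs",
]


def missing_directories(root) -> list:
    """Return the expected directories that are absent under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_directories(".")
    if missing:
        print("Missing directories:")
        for name in missing:
            print(f"  {name}")
    else:
        print("All expected directories are present.")
```

Run it from the repository root. Any directory it reports as missing may simply not be part of the current release.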

Related Resources

This repository is connected to the MuLTa-Telegram resource, a publicly available Italian and Polish Telegram dataset for hate speech and target detection.

The MuLTa-Telegram dataset is described in the following paper:

MuLTa-Telegram: A Fine-Grained Italian and Polish Dataset for Hate Speech and Target Detection

Acknowledgements

This work was supported by the European Union’s CERV fund under Grant Agreement No. 101143249 (HATEDEMICS).
