Hi,
many thanks for releasing this new resource for Italian NER!
I would like to integrate this dataset into our Flair library and into Hugging Face Datasets, but I have a question about the labeling scheme. At the moment the dataset has this format:
In O
fiamme O
l' O
Istituto LOC
Lama LOC
Tzong LOC
Khapa LOC
, O
monastero O
buddhista O
One could simply write a script to convert the labels into IOB2 format, so that the data has the following form:
In O
fiamme O
l' O
Istituto B-LOC
Lama I-LOC
Tzong I-LOC
Khapa I-LOC
, O
monastero O
buddhista O
However, could you provide a reference implementation or the IOB2-converted dataset, so that I can compare the conversion results? 🤔
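For reference, here is a minimal sketch of the conversion I have in mind. The function name `to_iob2` is just my own choice, and it assumes that consecutive tokens carrying the same label belong to a single entity. Under that assumption, two directly adjacent entities of the same type would be merged into one, which is exactly the kind of case where a reference conversion would help:

```python
from typing import List, Tuple


def to_iob2(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Convert (token, label) pairs of ONE sentence from plain labels to IOB2.

    Assumption: a run of consecutive tokens with the same non-O label
    is treated as a single entity.
    """
    converted = []
    prev = "O"  # label of the previous token; "O" at the sentence start
    for token, label in tagged:
        if label == "O":
            converted.append((token, "O"))
        elif label == prev:
            # Same label as the previous token: continue the entity.
            converted.append((token, "I-" + label))
        else:
            # Label differs from the previous token: start a new entity.
            converted.append((token, "B-" + label))
        prev = label
    return converted


sentence = [
    ("In", "O"), ("fiamme", "O"), ("l'", "O"),
    ("Istituto", "LOC"), ("Lama", "LOC"), ("Tzong", "LOC"), ("Khapa", "LOC"),
    (",", "O"), ("monastero", "O"), ("buddhista", "O"),
]

for token, label in to_iob2(sentence):
    print(token, label)
```

The function should be called once per sentence, so that an entity is never continued across a sentence boundary.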
Many thanks!