IOB2 reference implementation #1

@stefan-it

Hi,

many thanks for releasing this new resource for Italian NER!

I would like to integrate this dataset into our Flair library and into Hugging Face Datasets, but I have a question about the labeling scheme. At the moment the dataset has this format:

In	O
fiamme	O
l'	O
Istituto	LOC
Lama	LOC
Tzong	LOC
Khapa	LOC
,	O
monastero	O
buddhista	O

One could just write a conversion script to convert the labels into IOB2 format, so that the output has the following form:

In	O
fiamme	O
l'	O
Istituto	B-LOC
Lama	I-LOC
Tzong	I-LOC
Khapa	I-LOC
,	O
monastero	O
buddhista	O
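
For illustration, a minimal sketch of such a conversion script could look like the following. The function name `to_iob2` is hypothetical, and the sketch assumes that consecutive tokens with the same label always belong to a single entity. The plain labels cannot distinguish two directly adjacent entities of the same type, so this is only a guess at the intended conversion, not the dataset authors' method:

```python
def to_iob2(labels):
    """Convert plain entity labels (e.g. LOC) into IOB2 tags (B-LOC / I-LOC).

    Assumption: a run of identical non-O labels is one entity.
    """
    converted = []
    previous = "O"
    for label in labels:
        if label == "O":
            converted.append("O")
        elif label == previous:
            # Same label as the previous token: continue the current entity.
            converted.append("I-" + label)
        else:
            # Label differs from the previous token: start a new entity.
            converted.append("B-" + label)
        previous = label
    return converted


tokens = ["In", "fiamme", "l'", "Istituto", "Lama", "Tzong", "Khapa",
          ",", "monastero", "buddhista"]
labels = ["O", "O", "O", "LOC", "LOC", "LOC", "LOC", "O", "O", "O"]

# Print the sentence in the same tab-separated, one-token-per-line format.
for token, tag in zip(tokens, to_iob2(labels)):
    print(f"{token}\t{tag}")
```

Because of the adjacency assumption, the output of a script like this may differ from the original annotation wherever two entities of the same type follow each other without an `O` token in between.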

However, I would like to know whether you could provide the reference implementation or the IOB2-converted dataset, so that the conversion results can be compared 🤔

Many thanks!
