Original vs annotated alignment

Hello! Thank you for making this really cool dataset publicly available :) 

I'm trying to align the annotations with the original text. Could you please specify which tokenizer was used to produce the dataset? So far I can't get the alignment quite right. Or is there perhaps an easier way to align the original texts and annotations that I'm missing? Thanks in advance!