Hi,
I have successfully created a model for another language pair, Tamil and English. But when I try to run the alignment with `python yalign-align -a en -b ta en-ta en.txt ta.txt > aligned.txt`, I get a KeyError:
Traceback (most recent call last):
  File "yalign-align", line 64, in <module>
    document_b = read_document(args['<document_b>'], lang_b)
  File "yalign-align", line 44, in read_document
    return text_to_document(text, language)
  File "/home/sanjana/Documents/Python_pgms/yalign/yalign/input_conversion.py", line 65, in text_to_document
    splitter = _sentence_splitters[language]
  File "/home/sanjana/Documents/Python_pgms/yalign/yalign/utils.py", line 82, in __missing__
    x = self.default_factory(key)
  File "/home/sanjana/Documents/Python_pgms/yalign/yalign/input_conversion.py", line 51, in <lambda>
    _sentence_splitters = Memoized(lambda lang: nltkload("tokenizers/punkt/%s.pickle" % CODES_TO_LANGUAGE[lang]))
KeyError: 'ta'
I would appreciate a prompt reply.
PS: NLTK does not support the Tamil language.