VOSK STT Engine #280
Closed
Labels
Good First Issue!, Hacktoberfest (Small or non-core issues that could be worked on by Hacktoberfest participants), Priority: Medium, Status: In Progress, Type: Enhancement
Detailed Description
VOSK (https://alphacephei.com/vosk/) is a new open-source STT toolkit/engine built on Kaldi and optimized to run on the Raspberry Pi. Building a language model is described at https://alphacephei.com/vosk/adaptation.html
Context
Learning to train and adapt the acoustic model, language model, and dictionary is enormously helpful in speech recognition. The more you can narrow the range of word sequences the recognizer has to choose between, the better the recognition becomes. Naomi has an advantage here: we already have a list of phrases from which a language model can be built directly.
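To make the idea concrete, here is a minimal sketch of turning intent templates into a flat list of phrases that could serve as language model training text. The `{Slot}` template syntax, the `keywords` dictionary, and the function names are assumptions made for illustration; they are not taken from Naomi's actual plugin API.

```python
import itertools
import re


def expand_template(template, keywords):
    """Yield every phrase produced by filling each {Slot} with its keywords.

    `template` and `keywords` use a hypothetical format chosen for this sketch.
    """
    slots = re.findall(r"\{(\w+)\}", template)
    choices = [keywords[slot] for slot in slots]
    for combo in itertools.product(*choices):
        phrase = template
        for slot, value in zip(slots, combo):
            # Replace one occurrence at a time so repeated slots are filled in order.
            phrase = phrase.replace("{" + slot + "}", value, 1)
        yield phrase.lower()


def build_corpus(templates, keywords):
    """Return the sorted, de-duplicated phrases for all templates."""
    phrases = set()
    for template in templates:
        phrases.update(expand_template(template, keywords))
    return sorted(phrases)


if __name__ == "__main__":
    # Hypothetical intent data, for illustration only.
    templates = ["turn {State} the {Device}", "what time is it"]
    keywords = {"State": ["on", "off"], "Device": ["lamp", "fan"]}
    for line in build_corpus(templates, keywords):
        print(line)
```

Each printed line is one sentence the recognizer should expect, which is the kind of plain-text input a language model adaptation step typically starts from.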
Possible Implementation
VOSK can be installed with a simple `pip3 install vosk`. The training tools are basically Kaldi, but it is not necessary to install Kaldi to use VOSK. The adaptation page linked above is a good starting point for developing a language model from the intent templates.
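As a lighter-weight first step than full language model adaptation, VOSK's Python bindings accept a JSON list of phrases as a third argument to `KaldiRecognizer`, which restricts recognition to those phrases. The sketch below assumes a model that supports this kind of runtime vocabulary (the small models generally do), a mono 16-bit PCM WAV file, and phrases whose words are all in the model's vocabulary. The helper names `build_grammar` and `transcribe`, and the paths passed to them, are hypothetical.

```python
import json


def build_grammar(phrases):
    """Return a JSON array of unique lowercase phrases plus the "[unk]" token."""
    unique = sorted({p.strip().lower() for p in phrases if p.strip()})
    return json.dumps(unique + ["[unk]"])


def transcribe(wav_path, model_dir, phrases):
    """Recognize a WAV file using only the given phrases (requires vosk)."""
    import wave

    from vosk import KaldiRecognizer, Model  # installed with: pip3 install vosk

    with wave.open(wav_path, "rb") as wav:
        recognizer = KaldiRecognizer(
            Model(model_dir), wav.getframerate(), build_grammar(phrases)
        )
        while True:
            data = wav.readframes(4000)
            if not data:
                break
            recognizer.AcceptWaveform(data)
    return json.loads(recognizer.FinalResult())["text"]
```

A call might look like `transcribe("command.wav", "model", ["turn on the lamp", "what time is it"])`, where both paths are placeholders. The `"[unk]"` entry gives the recognizer somewhere to put speech that matches none of the phrases.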