pauldaywork/batch-parakeet-onnx-asr-local

Locally Run Parakeet Onnx-asr Multi-GPU Batch Transcriptions

About

This project transcribes audio files with the onnx_asr library, using multiple GPUs to batch-process transcriptions into .srt files. I wanted to simplify the process of running this locally so anyone can experiment with these libraries. Tested on 2x Nvidia 24GB GPUs, but you can specify how many GPUs to use and how many files to process on each GPU at a time. Everything runs inside a Conda virtual environment, so the correct CUDA environment is easy to load and switch out of. This keeps your system-level environment clean and lets you use different versions of CUDA and other Nvidia libraries in other projects on the same machine.

Requirements

Pre-process audio

Put the audio files you want to use in the /files folder and run this script from inside /files. It currently converts .mp3 files; modify as needed.

for f in *.mp3; do ffmpeg -i "$f" -ar 16000 -ac 1 -c:a pcm_s16le "${f%.*}.wav"; done
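For larger batches, the same conversion can be driven from Python. A minimal sketch with the same ffmpeg settings as the one-liner above (the ffmpeg_cmd/convert_all helpers are illustrative, not part of this repo):

```python
import subprocess
from pathlib import Path

def ffmpeg_cmd(src: Path) -> list[str]:
    # Same settings as the shell one-liner: 16 kHz, mono, 16-bit PCM
    return ["ffmpeg", "-i", str(src), "-ar", "16000", "-ac", "1",
            "-c:a", "pcm_s16le", str(src.with_suffix(".wav"))]

def convert_all(folder: str = "files") -> None:
    for src in sorted(Path(folder).glob("*.mp3")):
        subprocess.run(ffmpeg_cmd(src), check=True)
```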

Set up virtual environment

conda create -n onnx_env python=3.12
conda activate onnx_env
conda install cuda=12.8 -c nvidia/label/cuda-12.8.1 
conda install nvidia::cudnn cuda-version=12.8
pip install onnx-asr onnxruntime-gpu huggingface_hub soundfile
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH

Load virtual environment

conda activate onnx_env
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH

Run script

python onnx.py
python onnx-single.py
python onnx-max.py

Files and folders

  • onnx.py - runs transcriptions on multiple GPUs
  • onnx-single.py - runs transcriptions on a single GPU
  • onnx-max.py - runs multiple transcriptions at once on multiple GPUs
  • files/ - folder with audio files to process
  • models/ - folder with onnx models
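
At their core, all three scripts do the same per-file work. A minimal single-file sketch using onnx_asr (the model name and the srt_timestamp helper are illustrative assumptions; check the scripts for the exact model in models/):

```python
def srt_timestamp(seconds: float) -> str:
    # 3.5 -> "00:00:03,500", the HH:MM:SS,mmm form that .srt files require
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def transcribe(wav_path: str) -> str:
    import onnx_asr  # imported lazily so the helper above works without it
    # Model name is an assumption; substitute whatever you have in models/
    model = onnx_asr.load_model("nemo-parakeet-tdt-0.6b-v2")
    return model.recognize(wav_path)
```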

Other options

  • Run each transcription as a subprocess so GPU memory is fully released between files:

    import subprocess

    def transcribe_with_subprocess(audio_file):
        # transcribe_script.py contains only the loading, transcribing, and saving code
        subprocess.run(["python", "transcribe_script.py", "--audio", audio_file], check=True)
        # GPU memory is freed when the subprocess exits

Benchmark

Tested on 118 .wav files totaling 9.5 GB.

Script          Processing Time   Single File Average   Failures
onnx-max.py     10m 41s           5.4s                  5 (from maxing out memory)
onnx.py         15m 42s           7.9s                  0
onnx-single.py  25m 61s           13s                   0

Future improvements

  • Split out files that take more than 50% of VRAM so onnx-max can run without any errors and process those edge case files differently
