This project transcribes audio files with the onnx_asr library, using multiple GPUs to batch-process audio into .srt subtitle files. The goal is to make it simple to run locally so anyone can experiment with these libraries. It was tested on 2x NVIDIA 24GB GPUs, but you can specify how many GPUs to use and how many files to process on each GPU at a time. Everything runs inside a Conda virtual environment, so the correct CUDA environment is easy to load and switch out of. This keeps your system-level environment clean and lets you use different versions of CUDA and other NVIDIA libraries in other projects on the same machine.
- Linux
- NVIDIA container toolkit: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
- Miniconda
- FFmpeg
Put the audio files you want to transcribe in the files/ folder and run this script from inside files/. It currently converts .mp3 files; modify as needed.
for f in *.mp3; do ffmpeg -i "$f" -ar 16000 -ac 1 -c:a pcm_s16le "${f%.*}.wav"; done
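If you prefer to drive the conversion from Python, here is a minimal sketch of the same step. It assumes `ffmpeg` is on your PATH; the `ffmpeg_cmd` helper is illustrative, not part of this repo.

```python
import subprocess
from pathlib import Path

def ffmpeg_cmd(src: Path) -> list[str]:
    """Build the same ffmpeg command as the shell loop above:
    16 kHz, mono, 16-bit PCM WAV written next to the source file."""
    return [
        "ffmpeg", "-i", str(src),
        "-ar", "16000",       # resample to 16 kHz
        "-ac", "1",           # downmix to mono
        "-c:a", "pcm_s16le",  # 16-bit PCM audio codec
        str(src.with_suffix(".wav")),
    ]

if __name__ == "__main__":
    # Convert every .mp3 in the current directory
    for mp3 in sorted(Path(".").glob("*.mp3")):
        subprocess.run(ffmpeg_cmd(mp3), check=True)
```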
conda create -n onnx_env python=3.12
conda activate onnx_env
conda install cuda=12.8 -c nvidia/label/cuda-12.8.1
conda install nvidia::cudnn cuda-version=12.8
pip install onnx-asr onnxruntime-gpu huggingface_hub soundfile
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
conda activate onnx_env
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
Run one of the following scripts:
python onnx.py
python onnx-single.py
python onnx-max.py
- onnx.py - runs transcriptions on multiple GPUs
- onnx-single.py - runs transcriptions on a single GPU
- onnx-max.py - runs multiple transcriptions at once on multiple GPUs
- files/ - folder with audio files to process
- models/ - folder with onnx models
- Run each transcription as a subprocess so GPU memory is fully released when it exits:

import subprocess

def transcribe_with_subprocess(audio_file):
    # transcribe_script.py contains only the loading, transcribing, and saving code
    subprocess.run(["python", "transcribe_script.py", "--audio", audio_file])
    # GPU memory is freed when the subprocess finishes
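Per-GPU batching can be wired on top of that subprocess pattern. The sketch below is my assumption about the approach, not code from onnx.py: the worker-count math and the round-robin `CUDA_VISIBLE_DEVICES` assignment are placeholders you would tune.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

NUM_GPUS = 2        # how many GPUs to use
FILES_PER_GPU = 2   # concurrent transcriptions per GPU

def assign_gpu(index: int) -> int:
    """Round-robin a file index onto a GPU id."""
    return index % NUM_GPUS

def transcribe(index: int, audio_file: str) -> None:
    # Pin the subprocess to one GPU via CUDA_VISIBLE_DEVICES.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(assign_gpu(index)))
    # Each file runs in its own process, so GPU memory is released
    # as soon as that process exits.
    subprocess.run(
        ["python", "transcribe_script.py", "--audio", audio_file],
        env=env, check=True,
    )

def run_all(files: list[str]) -> None:
    # Threads are enough here: the heavy work happens in the
    # child processes, the threads just wait on them.
    with ThreadPoolExecutor(max_workers=NUM_GPUS * FILES_PER_GPU) as pool:
        for i, f in enumerate(files):
            pool.submit(transcribe, i, f)
```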
Tested by processing 118 .wav files totaling 9.5 GB:
| Script | Processing Time | Single File Average | Failures |
|---|---|---|---|
| onnx-max.py | 10m 41s | 5.4s | 5 (out-of-memory errors) |
| onnx.py | 15m 42s | 7.9s | 0 |
| onnx-single.py | 26m 1s | 13s | 0 |
- Split out files that would take more than 50% of VRAM so onnx-max.py can run without errors, and process those edge-case files differently
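That TODO could start from a size-based split like the one below. The `split_by_size` helper and the idea of using on-disk size as a proxy for VRAM use are my assumptions; actual VRAM consumption depends on the model and audio length, so the threshold would need tuning.

```python
from pathlib import Path

def split_by_size(files: list[Path], max_bytes: int) -> tuple[list[Path], list[Path]]:
    """Partition files into (safe, oversized) by on-disk size,
    a rough proxy for peak VRAM during transcription."""
    safe: list[Path] = []
    oversized: list[Path] = []
    for f in files:
        (oversized if f.stat().st_size > max_bytes else safe).append(f)
    return safe, oversized
```

The `safe` list could then go through the onnx-max.py-style concurrent path, while `oversized` files are processed one at a time.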