This project uses the Kokoro TTS model to convert .txt files into natural-sounding speech. It processes every file in input/ as a batch, saves the .wav outputs to output/, and moves each processed .txt file to processed/.
.
├── audio.py # Main batch processing script
├── launch.sh # Environment setup and runner
├── requirements.txt # Python dependencies
├── input/ # Put your .txt files here
├── output/ # Generated .wav files appear here
└── processed/ # Processed .txt files are moved here
- Add input files
  Place .txt files into the input/ directory.
- Run the pipeline
  Make sure launch.sh is executable, then run it:
      chmod +x launch.sh
      ./launch.sh
  The script will:
  - Set up a Python virtual environment
  - Install dependencies
  - Detect and use GPU if available
  - Run audio.py to process all .txt files
- Results
  - Audio is saved as .wav in output/
  - Input files are moved to processed/
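The batch flow described above (read each .txt in input/, write a .wav to output/, move the source file to processed/) can be sketched roughly as follows. This is not the actual contents of audio.py: `process_batch`, `write_wav`, and the `synthesize` callback are hypothetical names, and the sketch assumes the synthesizer returns mono float samples in the range [-1.0, 1.0]. The Kokoro call itself is left out so the example runs with the standard library alone.

```python
import shutil
import struct
import wave
from pathlib import Path
from typing import Callable, List, Sequence

SAMPLE_RATE = 24_000  # 24 kHz, the output rate stated in this README


def write_wav(path: Path, samples: Sequence[float], sample_rate: int = SAMPLE_RATE) -> None:
    """Write mono float samples (assumed to lie in [-1.0, 1.0]) as 16-bit PCM."""
    clipped = (max(-1.0, min(1.0, s)) for s in samples)
    frames = b"".join(struct.pack("<h", int(s * 32767)) for s in clipped)
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(frames)


def process_batch(
    input_dir: Path,
    output_dir: Path,
    processed_dir: Path,
    synthesize: Callable[[str], Sequence[float]],  # hypothetical stand-in for the TTS call
) -> List[Path]:
    """Convert every .txt file in input_dir and return the .wav paths written."""
    input_dir, output_dir, processed_dir = Path(input_dir), Path(output_dir), Path(processed_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    processed_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for txt in sorted(input_dir.glob("*.txt")):
        text = txt.read_text(encoding="utf-8")
        wav_path = output_dir / (txt.stem + ".wav")
        write_wav(wav_path, synthesize(text))
        # Move the source only after its audio has been written successfully.
        shutil.move(str(txt), str(processed_dir / txt.name))
        written.append(wav_path)
    return written
```

In the real script, `synthesize` would wrap the Kokoro model with the af_heart voice; that wiring is not shown here. Moving the .txt file only after the .wav is written means a failed run leaves the unprocessed files in input/ for the next attempt.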
All dependencies are listed in requirements.txt. GPU support is handled automatically in launch.sh.
- Output sample rate: 24 kHz
- Voice used: af_heart
- If no GPU is available, falls back to CPU.
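The GPU-to-CPU fallback noted above is handled by launch.sh, whose contents are not shown here. As an illustration only, a Python-side equivalent might look like the sketch below. It assumes PyTorch is the backend in use, and `pick_device` is a hypothetical helper name, not something taken from this project.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu"."""
    try:
        import torch  # assumption: PyTorch is the installed backend
    except ImportError:
        # Without PyTorch there is nothing to query, so default to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Running on: {pick_device()}")
```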