Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)
Updated Oct 13, 2023 - JavaScript
Implementation of "SpEx: Multi-Scale Time Domain Speaker Extraction Network".
Make the sound you hear pure and clean with deep learning.
Knowledge Boosting During Low Latency Inference introduces a model collaboration technique that lets small, on-device audio models accept hints from a larger, jointly trained model at inference time to enhance their responses in real time.
Master Project at University of Cambridge
The official repo of SUDx for speaker diarization
Batch extract specific speakers from mixed audio using reference samples. Powered by TitanNet & Silero VAD with lossless export.
Neural speech quality assessment — score audio with non-intrusive/intrusive metrics and compare enhancement algorithms in minutes. Backed by Uni-VERSA-Ext.