First, thank you for your great work!
I tried your pretrained model, but it can't handle the problem in the title: it just outputs the original audio.
In the non-causal case, maybe I could use spk_model to classify the whole audio and output a zero tensor.
But what should I do in the causal case?
I found that ecapa_tdnn can output frame-level embeddings, but I'm not sure whether they are discriminative.
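
To make the non-causal idea concrete, here is a rough sketch of what I have in mind. It is only an illustration: `gate_output`, `cosine_similarity`, and the `0.5` threshold are hypothetical names and values I made up, and the two embeddings are assumed to come from spk_model (one for the enrollment audio, one for the whole input audio). NumPy arrays stand in for the model tensors.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D embeddings (0.0 if either is all zeros)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0
    return float(np.dot(a, b) / denom)

def gate_output(extracted, enroll_emb, input_emb, threshold=0.5):
    """Hypothetical utterance-level gate for the non-causal case.

    extracted  : audio produced by the pretrained model
    enroll_emb : speaker embedding of the enrollment audio (assumed from spk_model)
    input_emb  : speaker embedding of the whole input audio (assumed from spk_model)
    threshold  : placeholder value, would need tuning on real data
    """
    score = cosine_similarity(enroll_emb, input_emb)
    if score < threshold:
        # Speakers do not match: output silence instead of the model output.
        return np.zeros_like(extracted)
    return extracted
```

This needs the embedding of the whole audio before any output is produced, which is exactly why it does not carry over to the causal case.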