What if there is only one speaker in the mixed audio, and he/she is not the enrolled speaker? #22

@SherryYu33

First, thank you for your great work!
I've tried your pretrained model, but it can't handle the case in the title: it just outputs the original audio.

In the non-causal case, maybe I can use spk_model to classify the whole audio and output a zero tensor.
But in the causal case, what should I do?
I've found that ecapa_tdnn can output frame-level embeddings, but I'm not sure whether they are discriminative.
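
To make the idea concrete, here is a rough sketch of the gating I have in mind. It is not anything from this repo. It is written in plain Python so it stays framework-agnostic, and every name in it (`gate_utterance`, `gate_chunks`, `threshold`, `alpha`) is hypothetical. It assumes the embeddings have already been computed by some speaker model, and that chunk-level embeddings are discriminative enough to compare, which is exactly the part I'm unsure about. The threshold value is a placeholder that would need tuning.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def gate_utterance(
    extracted: Sequence[float],
    enroll_emb: Sequence[float],
    mix_emb: Sequence[float],
    threshold: float = 0.5,  # assumed value, must be tuned on real data
) -> List[float]:
    """Non-causal gate: one decision for the whole utterance.

    Returns silence (zeros) if the embedding of the whole input audio does
    not match the enrollment embedding, otherwise the extracted audio.
    """
    if cosine_similarity(enroll_emb, mix_emb) < threshold:
        return [0.0] * len(extracted)
    return list(extracted)


def gate_chunks(
    chunks: Sequence[Sequence[float]],
    chunk_embs: Sequence[Sequence[float]],
    enroll_emb: Sequence[float],
    threshold: float = 0.5,  # assumed value, must be tuned on real data
    alpha: float = 0.8,      # smoothing factor; 0.0 disables smoothing
) -> List[List[float]]:
    """Causal gate: one decision per chunk, using only past and current chunks.

    The per-chunk similarity is exponentially smoothed so that a single
    noisy chunk embedding does not flip the gate on its own.
    """
    smoothed = None
    out = []
    for chunk, emb in zip(chunks, chunk_embs):
        sim = cosine_similarity(enroll_emb, emb)
        smoothed = sim if smoothed is None else alpha * smoothed + (1.0 - alpha) * sim
        out.append(list(chunk) if smoothed >= threshold else [0.0] * len(chunk))
    return out
```

With a larger `alpha` the causal gate reacts more slowly but is less sensitive to a single unreliable embedding. With `alpha=0.0` each chunk is judged on its own.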
