Hypothetical proteins wrongly grouped despite unique names in clinker #120

@GeybyTatiana

Hi Clinker team,

I'm using clinker to compare phage genomes, and I’m classifying genes based on their functional categories. All genes annotated as "hypothetical protein" are meant to belong to the same group "Unknown function".
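To make the intended grouping concrete, here is a minimal sketch of the rule I'm applying. The locus tags, the `categorize` helper, and the choice to leave other products under their own annotation are assumptions for illustration only. None of this is clinker's API.

```python
from collections import defaultdict

UNKNOWN = "Unknown function"


def categorize(product: str) -> str:
    """Map a product annotation to a functional category (illustrative rule)."""
    name = product.strip()
    # Any "hypothetical protein" label, including numbered variants such as
    # "hypothetical protein 2", falls into the single unknown-function group.
    if not name or name.lower().startswith("hypothetical protein"):
        return UNKNOWN
    # Assumption: every other product is its own category in this sketch.
    return name


def group_by_function(genes):
    """Group (locus_tag, product) pairs by functional category."""
    groups = defaultdict(list)
    for locus_tag, product in genes:
        groups[categorize(product)].append(locus_tag)
    return dict(groups)


# Made-up example genes from two phage genomes.
genes = [
    ("phageA_001", "hypothetical protein"),
    ("phageA_002", "major capsid protein"),
    ("phageB_001", "Hypothetical protein"),
    ("phageB_002", "hypothetical protein 2"),
]
groups = group_by_function(genes)
```

With this rule, all three hypothetical proteins end up in one "Unknown function" group regardless of genome or sequence, which is the behaviour I would like to see in the visualization.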

However, I’ve noticed that in the clinker visualization, these genes are often split across multiple groups, even though their function is the same and they are labeled identically in the input. I've double-checked the gene names and confirmed they are consistent across genomes.

I also tried appending numbers or making other minor changes to the gene names to unify them, but clinker still places them in separate clusters in the visualization. This makes it hard to track and compare proteins of unknown function consistently.

Is there a way to force clinker to group genes based on function instead of relying solely on sequence similarity? Or is there a workaround for this case?

Thanks in advance for your help and insights.

Best regards,
