The ploidy estimation results obtained by GenomeScope and Smudgeplot are inconsistent.

We are working on a species of *Mentha* (Lamiaceae). Cytological and taxonomic studies of *Mentha* have reported extensive polyploidy in this genus, and the species we are studying has also been reported to include polyploid cytotypes. Therefore, before running Smudgeplot, our biological expectation was that this genome is unlikely to be a simple diploid.

We first estimated genome characteristics using GenomeScope with a k-mer size of 19. The GenomeScope profiles suggested that the genome may be highly heterozygous and potentially polyploid, most likely hexaploid or octoploid. Based on these results, and on previous reports of polyploidy in this taxon/genus, we expected Smudgeplot to support a polyploid interpretation.

At present, we do not yet have our own direct cytological evidence, such as chromosome counting, karyotyping, or in situ hybridization, for this accession. Our current evidence for polyploidy is therefore based on:

1. the GenomeScope k-mer profile, which appears more consistent with a high-ploidy genome than with a simple diploid genome;
2. previous literature reporting polyploidy in *Mentha* and in the species/cytotype related to our material;
3. the known evolutionary history of *Mentha*, where hybridization and polyploidization are common.

<img width="931" height="940" alt="Image" src="https://github.com/user-attachments/assets/3da676a6-4559-4b37-85b6-a3ef63a42d9b" />


We then used Smudgeplot for ploidy inference. The k-mer size used for Smudgeplot was 31, and the lower coverage cutoff, L, was set to 12. The Smudgeplot result was unexpected because it suggested a diploid genome.

This result seems suspicious to us for several reasons. First, it conflicts with the GenomeScope result, which suggested a possible hexaploid or octoploid genome. Second, it is inconsistent with the prior biological expectation for this *Mentha* species, given that polyploidy has been reported in the literature. Third, we are concerned that the choice of parameters may have affected the Smudgeplot inference. In particular, GenomeScope was run with k = 19, whereas Smudgeplot was run with k = 31. We are also unsure whether L = 12 is appropriate for our sequencing depth and k-mer coverage distribution.
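To make the concern about L concrete, here is a minimal sketch of one common heuristic: take the lower cutoff near the first valley of the k-mer histogram, between the error peak and the first genomic peak. This is an illustration only, not how Smudgeplot itself chooses a cutoff. The function name `first_valley` and the in-memory `(coverage, count)` layout are our own assumptions.

```python
def first_valley(hist):
    """Return the coverage at the first local minimum of a k-mer histogram.

    hist: list of (coverage, count) pairs sorted by increasing coverage.
    Returns None if the counts never rise again (no valley found).
    Hypothetical helper for illustration; not part of Smudgeplot.
    """
    for i in range(1, len(hist) - 1):
        prev_count = hist[i - 1][1]
        count = hist[i][1]
        next_count = hist[i + 1][1]
        # A valley: counts stop falling here and start rising afterwards.
        if count <= prev_count and count < next_count:
            return hist[i][0]
    return None


# Synthetic histogram (made-up numbers): error k-mers at low coverage,
# then a genomic peak around coverage 7.
example = [(1, 1000), (2, 400), (3, 150), (4, 60),
           (5, 80), (6, 200), (7, 500), (8, 300)]
valley = first_valley(example)
```

If the valley of our real histogram sits well away from 12, that would suggest L = 12 either keeps many error k-mers or discards genuine low-coverage heterozygous k-mers.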

Our question is therefore whether the Smudgeplot result should be considered reliable under the current settings, or whether the diploid inference may be caused by parameter choice, insufficient filtering, sequencing coverage, high heterozygosity, collapsed homeologous variation, or the complex polyploid history of *Mentha*.
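One scenario we would like to rule out is that the haploid (1n) coverage was overestimated, which would shift every k-mer pair to a lower apparent copy number. The sketch below is a simplified model of that effect, assuming a pair is labelled from its total coverage and minor-coverage fraction. `classify_pair` is a hypothetical helper and the coverage values are made up; this is not Smudgeplot's actual implementation.

```python
def classify_pair(cov_a, cov_b, haploid_cov):
    """Label a heterozygous k-mer pair, e.g. 'AB', 'AAB', 'AABB'.

    cov_a, cov_b: coverages of the two k-mers in the pair.
    haploid_cov:  assumed 1n coverage (the quantity that may be misestimated).
    Simplified illustration only; not Smudgeplot's algorithm.
    """
    minor, major = sorted((cov_a, cov_b))
    total = minor + major
    # Total copy number of the pair; a heterozygous pair has at least 2 copies.
    copies = max(2, round(total / haploid_cov))
    # Copies carrying the minor k-mer; at least 1 because the pair is heterozygous.
    minor_copies = max(1, round(minor / haploid_cov))
    return "A" * (copies - minor_copies) + "B" * minor_copies


# Same pair (coverages 40 and 40), two different assumptions about 1n coverage:
with_true_cov = classify_pair(40, 40, haploid_cov=20)     # tetraploid-like pattern
with_doubled_cov = classify_pair(40, 40, haploid_cov=40)  # looks diploid
```

Under this toy model, doubling the assumed 1n coverage turns an `AABB` pair into an apparent `AB` pair, which is the kind of artefact that could make a polyploid genome look diploid.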

Given the GenomeScope result and previous reports of polyploidy in this group, we still think that our accession is more likely polyploid than diploid. We would appreciate your advice on whether we should rerun Smudgeplot with different k-mer sizes and L values, and on how to choose appropriate parameters for a potentially hexaploid or octoploid genome.

<img width="2000" height="2000" alt="Image" src="https://github.com/user-attachments/assets/dcf02bc7-0010-4e35-ad7f-516f7f0661e3" />

The ploidy estimation results obtained by GenomeScope and Smudgeplot are inconsistent. #259
