The ploidy estimation results obtained by GenomeScope and Smudgeplot are inconsistent. #259

@majssssa

We are working on a species of Mentha (Lamiaceae). Cytological and taxonomic studies of Mentha have reported extensive polyploidy in this genus, and the species we are studying has also been reported to include polyploid cytotypes. Therefore, before running Smudgeplot, our biological expectation was that this genome is unlikely to be a simple diploid.

We first estimated genome characteristics using GenomeScope with a k-mer size of 19. The GenomeScope profiles suggested that the genome may be highly heterozygous and potentially polyploid, most likely hexaploid or octoploid. Based on these results, and on previous reports of polyploidy in this taxon/genus, we expected Smudgeplot to support a polyploid interpretation.

At present, we do not yet have our own direct cytological evidence, such as chromosome counting, karyotyping, or in situ hybridization, for this accession. Our current evidence for polyploidy is therefore based on:

  1. the GenomeScope k-mer profile, which appears more consistent with a high-ploidy genome than with a simple diploid genome;
  2. previous literature reporting polyploidy in Mentha and in the species/cytotype related to our material;
  3. the known evolutionary history of Mentha, where hybridization and polyploidization are common.
Image

We then used Smudgeplot for ploidy inference. The k-mer size used for Smudgeplot was 31, and the lower coverage cutoff, L, was set to 12. The Smudgeplot result was unexpected because it suggested a diploid genome.

This result seems suspicious to us for several reasons. First, it conflicts with the GenomeScope result, which suggested a possible hexaploid or octoploid genome. Second, it is inconsistent with the prior biological expectation for this Mentha species, given that polyploidy has been reported in the literature. Third, we are concerned that the choice of parameters may have affected the Smudgeplot inference. In particular, GenomeScope was run with k = 19, whereas Smudgeplot was run with k = 31. We are also unsure whether L = 12 is appropriate for our sequencing depth and k-mer coverage distribution.
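To make the concern about the differing k values concrete: for reads of a fixed length, the expected k-mer coverage is commonly approximated as base coverage times (read length - k + 1) / read length, so the same library yields lower k-mer coverage at k = 31 than at k = 19. The sketch below only illustrates that relationship. The base coverage (100x) and read length (150 bp) are hypothetical values, not measurements from this dataset, and `kmer_coverage` is an illustrative helper name.

```python
def kmer_coverage(base_coverage: float, read_length: int, k: int) -> float:
    """Expected k-mer coverage for error-free reads of a fixed length.

    Uses the common approximation C_k = C * (read_length - k + 1) / read_length.
    """
    if k > read_length:
        raise ValueError("k must not exceed the read length")
    return base_coverage * (read_length - k + 1) / read_length


# Hypothetical inputs (assumptions, not values from this issue).
for k in (19, 31):
    print(k, kmer_coverage(100.0, 150, k))
```

Under these assumed inputs the k-mer coverage drops from 88.0 at k = 19 to 80.0 at k = 31, so a fixed lower cutoff such as L = 12 sits at a slightly different relative position in each k-mer spectrum.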

Our question is therefore whether the Smudgeplot result should be considered reliable under the current settings, or whether the diploid inference may be caused by parameter choice, insufficient filtering, sequencing coverage, high heterozygosity, collapsed homeologous variation, or the complex polyploid history of Mentha.
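One way to reason about whether the cutoff could matter: if k-mer peaks are assumed to sit near integer multiples of the haploid (1n) k-mer coverage, then for a high-ploidy genome the lowest peaks may fall below L and be filtered out before pairing. The sketch below checks this under that assumption. The haploid coverage values are hypothetical and `peaks_below_cutoff` is an illustrative helper, not part of GenomeScope or Smudgeplot.

```python
from typing import List


def peaks_below_cutoff(haploid_cov: float, ploidy: int, lower_cutoff: int) -> List[int]:
    """Return the copy-number multiples whose expected peak centre is below the cutoff.

    Assumes peaks are centred at n * haploid_cov for n = 1..ploidy.
    """
    return [n for n in range(1, ploidy + 1) if n * haploid_cov < lower_cutoff]


# Hypothetical 1n coverages (assumptions), tested against L = 12 for an octoploid.
print(peaks_below_cutoff(5.0, 8, 12))   # 1n and 2n peaks fall below the cutoff
print(peaks_below_cutoff(15.0, 8, 12))  # no peak centre falls below the cutoff
```

If the 1n coverage reported by GenomeScope were as low as in the first hypothetical case, L = 12 would discard the k-mers that distinguish individual haplotypes, which might be one route to a misleading inference.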

Given the GenomeScope result and previous reports of polyploidy in this group, we still think that our accession is more likely polyploid than diploid. We would appreciate your advice on whether we should rerun Smudgeplot with different k-mer sizes and L values, and on how to choose appropriate parameters for a potentially hexaploid or octoploid genome.
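As a starting point for choosing L, a common heuristic is to place the lower cutoff near the first valley of the k-mer histogram, between the low-coverage error k-mers and the first genomic peak. The following is a minimal sketch of that heuristic on a toy histogram. The counts are invented for illustration, `first_valley` is a hypothetical helper, and real histograms are noisy enough that smoothing or visual inspection would likely still be needed.

```python
from typing import List, Optional, Tuple


def first_valley(histogram: List[Tuple[int, int]]) -> Optional[int]:
    """Return the coverage at the first local minimum of a k-mer histogram.

    `histogram` is a list of (coverage, count) pairs sorted by coverage.
    Returns None if there is no interior local minimum.
    """
    for i in range(1, len(histogram) - 1):
        prev_count = histogram[i - 1][1]
        count = histogram[i][1]
        next_count = histogram[i + 1][1]
        if count < prev_count and count <= next_count:
            return histogram[i][0]
    return None


# Toy histogram (invented counts): error k-mers decay, then a genomic peak rises.
toy = [(1, 900), (2, 400), (3, 150), (4, 60), (5, 80), (6, 140), (7, 200), (8, 160)]
print(first_valley(toy))  # 4
```

Comparing the valley found this way at the k used for Smudgeplot against the current L = 12 would show whether the cutoff sits in the error valley or already cuts into the first genomic peak.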

Image
