Using your own modkit-generated pileups with plot_enrichment_profile.by_modification() instead of BAM input to parse_bam.pileup() #24

@K1999ban

Hi,
I generated pileup files using modkit, for both 5mC (CG,0) and 6mA (A,0), from BAM files that contain MM and ML tags.

I then reformatted the modkit-generated pileups to match the output of dimelo_v2.parse_bam.pileup() — including removing extra columns, bgzip compression, and tabix indexing.

However, when I try to use them as input to plot_enrichment_profile.by_modification(), like this:

    plot_enrichment_profile.by_modification(
        mod_file_name='modkit_pileup.tsv.gz',
        regions=CTCF_BED,
        window_size=1000,
        motifs=['CG,0', 'A,0'],
        smooth_window=50,
    )

I get a ValueError: Modification positions out of expected window range

Can the plotting functions in dimelo_v2 work with pileups generated by a different modkit version, as long as the format matches? Or does the plot function need something specific to DiMeLo's own pileup output? I ask because I use modkit v0.2.4 with DiMeLo v2 when I generate pileups from a BAM file the normal way.

Also, is there a way to filter by coverage within DiMeLo v2 when generating pileups?

For example, when using modkit, the pileup output (BEDmethyl .bed) has 18 columns, and column 10 corresponds to n_valid_cov. I'd be interested in filtering for positions where n_valid_cov >= 5 to retain only high-confidence base calls.

Is there currently a way to apply such a filter in DiMeLo v2 before or during plotting?
Or should this filtering be done externally on the pileup file before passing it to plot_enrichment_profile.by_modification()?
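
For the external option, here is a minimal sketch of the kind of filter described above, applied to the raw modkit bedMethyl output before any reformatting. It assumes that column 10 (index 9 when counting from zero) holds n_valid_cov, as stated above, and that fields are separated by tabs or spaces. The function names and file paths are hypothetical and are not part of dimelo_v2 or modkit.

```python
from typing import Iterable, Iterator

# Column 10 (1-based) of the modkit bedMethyl output, per the description above.
N_VALID_COV_INDEX = 9


def filter_by_valid_coverage(lines: Iterable[str], min_cov: int = 5) -> Iterator[str]:
    """Yield only bedMethyl records whose n_valid_cov is at least min_cov."""
    for line in lines:
        if not line.strip():
            continue  # drop blank lines
        if line.startswith("#"):
            yield line  # pass header/comment lines through unchanged
            continue
        # split() with no argument handles both tab- and space-delimited fields
        fields = line.split()
        if int(fields[N_VALID_COV_INDEX]) >= min_cov:
            yield line


def filter_bedmethyl_file(in_path: str, out_path: str, min_cov: int = 5) -> None:
    """Write a coverage-filtered copy of an uncompressed bedMethyl file."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(filter_by_valid_coverage(src, min_cov))


# Hypothetical usage; the paths are placeholders:
# filter_bedmethyl_file("modkit_pileup.bed", "modkit_pileup.cov5.bed", min_cov=5)
```

The filtered file would still need the same column trimming, bgzip compression, and tabix indexing as before, and whether by_modification() then accepts it is exactly the open question in this issue.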

Thanks a lot!
