Hello!
I’m using deeprm call prep + deeprm call run to generate site-level outputs, and I’d like to confirm whether my workflow for extracting per-site m6A fraction from the output pileup.bed is correct.
stepA:
deeprm call prep
-p XX/pod5_split_wdx/pod5_dir
-b XX/rep3_final_data/deeprm_baseline/reads_with_moves.bam
-o XX/rep3_final_data/deeprm_baseline/prep_dir
stepB:
deeprm call run
-b XX/rep3_final_data/deeprm_baseline/reads_with_moves.bam
-i XX/rep3_final_data/deeprm_baseline/prep_dir
-o XX/rep3_final_data/deeprm_baseline/call_dir
-s 4000
This generates:
XX/rep3_final_data/deeprm_baseline/call_dir/prep_dir/site-level/pileup.bed
I treat pileup.bed as bedMethyl (18 columns). According to the bedMethyl spec:
• column 10 = valid coverage
• column 11 = percent modified
• column 12 = number of modified calls
• column 13 = number of canonical calls
So I export per-site fraction as frac = col11 / 100 and filter by cov >= 10:
awk 'BEGIN{
OFS=",";
print "chrom","pos1","cov","frac_m6A","N_m6A","canonical_col13"
}
$4=="a" && $10>=10 {
frac=$11; if(frac>1) frac/=100; # 100.00 -> 1.0
print $1, $2+1, $10, frac, $12, $13
}' XX/rep3_final_data/deeprm_baseline/call_dir/prep_dir/site-level/pileup.bed \
XX/rep3_final_data/deeprm_baseline/call_dir/prep_dir/site-level/m6A_sites.csv
Hello!
I’m using deeprm call prep + deeprm call run to generate site-level outputs, and I’d like to confirm whether my workflow for extracting per-site m6A fraction from the output pileup.bed is correct.
stepA:
deeprm call prep
-p XX/pod5_split_wdx/pod5_dir
-b XX/rep3_final_data/deeprm_baseline/reads_with_moves.bam
-o XX/rep3_final_data/deeprm_baseline/prep_dir
stepB:
deeprm call run
-b XX/rep3_final_data/deeprm_baseline/reads_with_moves.bam
-i XX/rep3_final_data/deeprm_baseline/prep_dir
-o XX/rep3_final_data/deeprm_baseline/call_dir
-s 4000
This generates:
XX/rep3_final_data/deeprm_baseline/call_dir/prep_dir/site-level/pileup.bed
I treat pileup.bed as bedMethyl (18 columns). According to the bedMethyl spec:
• column 10 = valid coverage
• column 11 = percent modified
• column 12 = number of modified calls
• column 13 = number of canonical calls
So I export per-site fraction as frac = col11 / 100 and filter by cov >= 10:
awk 'BEGIN{
OFS=",";
print "chrom","pos1","cov","frac_m6A","N_m6A","canonical_col13"
}
$4=="a" && $10>=10 {
frac=$11; if(frac>1) frac/=100; # 100.00 -> 1.0
print $1, $2+1, $10, frac, $12, $13
}' XX/rep3_final_data/deeprm_baseline/call_dir/prep_dir/site-level/pileup.bed \