starch verification error: observed and expected signatures do not match for chromosome [null] #281

@mattmaurano

Some starch files contain an extra entry for a chromosome named null, with zero lines and a null signature.

Example:

$ unstarch --list /gpfs/data/cegs/mapped//FCHWMG7BGXK/capture/Bl6_Cast_Igf2_Igf2_inverteddEnhdCTCFinICR_G12-BS15522A/Bl6_Cast_Igf2_Igf2_inverteddEnhdCTCFinICR_G12-BS15522A.cegsvectors_LP300.reads.starch

chr                      |filename                                                         |compressedSize |uncompressedLineCount    |uncompressedLineMaxStrLength  |totalNonUniqueBases |totalUniqueBases    |duplicateElementExists   |nestedElementExists      |signature                
LPv5_backbone            |LPv5_backbone.pid80866.isgnodepdc005.bz2                         |195            |72                       |38                            |72                  |41                  |true                     |false                    |hALcQvsm9fo392PoqCobW1oHyZM=
null                     |null.pid80866.isgnodepdc005.bz2                                  |14             |0                        |0                             |0                   |0                   |false                    |false                    |null                     
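A minimal sketch for finding other affected archives, assuming the pipe-delimited layout shown in the listing above. The function name has_null_chromosome is hypothetical and not part of BEDOPS. It only checks the chr column, so it would also flag a real chromosome literally named null.

```shell
# Sketch: read `unstarch --list` output on stdin and return success (exit 0)
# if any row has the chromosome name "null".
# has_null_chromosome is a hypothetical helper, not a BEDOPS command.
has_null_chromosome() {
    awk -F'|' '
        {
            gsub(/[ \t]/, "", $1)        # strip column padding from the chr field
            if ($1 == "null") found = 1  # the header row ("chr") never matches
        }
        END { exit(found ? 0 : 1) }
    '
}

# Usage (assumes unstarch is on PATH):
#   find . -name '*.starch' | while IFS= read -r f; do
#       unstarch --list "$f" | has_null_chromosome && printf '%s\n' "$f"
#   done
```

The loop prints the path of every archive whose listing contains a null row.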

This can manifest as a hash validation error for that chromosome:

$ unstarch --verify-signature /gpfs/data/cegs/mapped//FCHWMG7BGXK/capture/Bl6_Cast_Igf2_Igf2_inverteddEnhdCTCFinICR_G12-BS15522A/Bl6_Cast_Igf2_Igf2_inverteddEnhdCTCFinICR_G12-BS15522A.cegsvectors_LP300.reads.starch
Expected and observed data integrity signatures match for chromosome [LPv5_backbone]
ERROR: Specified chromosome record may be corrupt -- observed and expected signatures do not match for chromosome [null]
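To summarize which chromosomes fail across many files, the ERROR line quoted above can be parsed. This is a rough sketch: failed_chromosomes is a hypothetical helper, and the pattern is taken only from the message text shown here, so it may need adjusting for other unstarch versions.

```shell
# Sketch: given `unstarch --verify-signature` output on stdin, print the name
# of each chromosome whose signature check failed.
# failed_chromosomes is a hypothetical helper, not a BEDOPS command.
failed_chromosomes() {
    # Lines reporting a successful match contain "signatures match", not
    # "signatures do not match", so they are skipped.
    sed -n 's/.*signatures do not match for chromosome \[\([^]]*\)\].*/\1/p'
}

# Usage (assumes the ERROR line may be written to stderr, hence 2>&1):
#   unstarch --verify-signature file.starch 2>&1 | failed_chromosomes
```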


Most frequently this affects the .genotypes.starch files, but I also see it in the following example files, covering a variety of mapped genomes and outputs:
./FCHWMG7BGXK/dna/Water_Control-BS15460A/Water_Control-BS15460A.mm10.genotypes.starch
./FCHWMG7BGXK/dna/Water_Control-BS15460A/hotspot2/Water_Control-BS15460A.mm10.hotspots.fdr0.005.starch
./FCHWMG7BGXK/capture/Bl6_Cast_LP300[Igf2_1_4]_D5_dLsp1_C3-BS15534A/Bl6_Cast_LP300[Igf2_1_4]_D5_dLsp1_C3-BS15534A.cegsvectors_LP300.reads.starch
./FCHWMG7BGXK/capture/Bl6_Cast_LP300[Igf2_1_4]_D5_dHIDAD_C7-BS15529A/Bl6_Cast_LP300[Igf2_1_4]_D5_dHIDAD_C7-BS15529A.cegsvectors_pSpCas9GFP.coverage.allreads.starch
./FCHWMG7BGXK/capture/Bl6_Cast_LP300[Igf2_1_4]_D5_dSyt8_H5-BS15531A/Bl6_Cast_LP300[Igf2_1_4]_D5_dSyt8_H5-BS15531A.cegsvectors_pSpCas9GFP.coverage.starch

With the exception of the hotspot file, these are all created in callsnpsMerge.sh.

The .genotypes.starch file is created differently (directly by "starch -" rather than by starchcat).
