jakob-schuster/matchbox

matchbox

A flexible read processor, capable of performing powerful transformations on your FASTA/FASTQ/BAM files.

You could use matchbox for:

  • Quick, error-tolerant search for primers and known sequences
  • Demultiplexing even the most complex barcoding schemes
  • Investigating and filtering out sequencing artefacts

Read the documentation site for more examples, or check out the preprint.

Installation

matchbox can be installed using cargo:

cargo install matchbox-cli

Usage

Write your matchbox script in a .mb file and pass it to matchbox with --script (-s for short), followed by the reads file.

matchbox -s my_script.mb my_reads.fq

Other options:
  • To allow for edit distance when searching for sequences, use --error
  • To process data on multiple threads for improved speed, use --threads
  • To handle paired reads, use --paired-with
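
As background on what "edit distance" means for the --error option, here is a minimal Python sketch of the classic Levenshtein distance: the number of single-base substitutions, insertions, and deletions needed to turn one sequence into another. It illustrates the concept only. The function name `edit_distance` is hypothetical, and nothing here is taken from matchbox's own search algorithm or from how --error interprets its argument.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two sequences (illustrative only)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, base_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, base_b in enumerate(b, start=1):
            cost = 0 if base_a == base_b else 1
            curr.append(min(
                prev[j] + 1,         # delete base_a
                curr[j - 1] + 1,     # insert base_b
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


primer = "TATTGCTGGG"
print(edit_distance(primer, "TATTGCTGGG"))  # 0: exact match
print(edit_distance(primer, "TATTGCAGGG"))  # 1: one substitution
print(edit_distance(primer, "TATTGCGGG"))   # 1: one deletion
```

Under this definition, a primer that has picked up a single sequencing error is still within distance 1 of the reference sequence.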

Example scripts

Trim off the first 10 bases of each read:

if read matches [|10| rest:_] => rest.out!('trimmed.fq')
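
For readers more used to general-purpose code, the effect of this pattern can be sketched in plain Python. This is an illustration of the intended behaviour, not matchbox's implementation: the function name `trim_prefix` and the tuple-based record layout are hypothetical, and treating reads shorter than 10 bases as non-matching is an assumption inferred from the pattern.

```python
from typing import Optional, Tuple

# Hypothetical record layout: (name, sequence, quality string).
Record = Tuple[str, str, str]


def trim_prefix(record: Record, n: int = 10) -> Optional[Record]:
    """Drop the first n bases (and their quality scores) from a read.

    Returns None when the read is shorter than n bases, on the assumption
    that such a read would not match the pattern at all.
    """
    name, seq, qual = record
    if len(seq) < n:
        return None
    return (name, seq[n:], qual[n:])


read = ("read1", "ACGTACGTACGTTT", "IIIIIIIIIIFFFF")
print(trim_prefix(read))  # ('read1', 'GTTT', 'FFFF')
```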

Extract the region between two primer sequences:

left = TATTGCTGGG
right = ACTTGCCTGTC

if read matches {
    [_ left mid:_ right _] => mid.out!('trimmed.fq')
    
    # also output the reads which did not contain the primers
    [_] => read.out!('unmatched.fq')
}
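
The logic of this script can be sketched in plain Python for comparison. This is a simplified illustration under stated assumptions, not how matchbox works internally: it uses exact string matching only (no error tolerance), takes the first occurrence of the left primer and the first occurrence of the right primer after it, and the function name `between_primers` is hypothetical.

```python
from typing import Optional

LEFT = "TATTGCTGGG"
RIGHT = "ACTTGCCTGTC"


def between_primers(seq: str, left: str = LEFT, right: str = RIGHT) -> Optional[str]:
    """Return the region between the two primers, or None if either is missing.

    Assumes exact matches and the first occurrence of each primer.
    """
    start = seq.find(left)
    if start == -1:
        return None
    mid_start = start + len(left)
    end = seq.find(right, mid_start)  # the right primer must follow the left one
    if end == -1:
        return None
    return seq[mid_start:end]


read = "GG" + LEFT + "AAAACCCC" + RIGHT + "TT"
print(between_primers(read))        # AAAACCCC
print(between_primers("ACGTACGT"))  # None
```

A read for which the function returns None corresponds to the second branch of the script, where the whole read is written to unmatched.fq.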

For more examples and a full scripting language reference, read the documentation!

Citation

If you use matchbox, cite the preprint at DOI:10.1101/2025.11.09.685711
