Human rhinovirus genome annotation


How to annotate HRV genomes with VADR

Steps for using VADR for HRV annotation:

  1. Download and install the latest version of VADR, following the instructions on this page. Alternatively, you can use the StaPH-B VADR 1.6.4 Docker image created by Curtis Kapsak (image names: staphb/vadr:1.6.4 and staphb/vadr:latest), available on Docker Hub and Quay. A brief README for the Docker image is here.

  2. Clone the latest HRV VADR model library from this repository:

git clone git@github.com:greninger-lab/vadr-models-hrv.git

    Note the path to the cloned directory plus the "hrvA", "hrvB" or "hrvC" subdirectory (e.g. /path/to/vadr-models-hrv/hrvA). This is the <hrv-models-dir-path> used in step 4.

  3. Remove terminal ambiguous nucleotides from your input fasta sequence file using the fasta-trim-terminal-ambigs.pl script in $VADRSCRIPTSDIR/miniscripts/.

    To trim terminal ambiguous nucleotides from your sequence file <input-fasta-file>, drop sequences shorter than 50 nt or longer than 8000 nt, and write the result to a new file <trimmed-fasta-file>, execute:

$VADRSCRIPTSDIR/miniscripts/fasta-trim-terminal-ambigs.pl --minlen 50 --maxlen 8000 <input-fasta-file> > <trimmed-fasta-file>

  4. Run the v-annotate.pl program on the trimmed fasta file of HRV sequences from step 3 using the recommended command below. You must also specify the HRV species (hrvA, hrvB or hrvC) as <hrv-key>.

    For hrvA, run:

v-annotate.pl -r --r_file <hrv-models-dir-path>/hrvA.rpn.fa --mkey <hrv-key> --mdir <hrv-models-dir-path> <fasta-file-to-annotate> <output-directory-to-create>

    For hrvB or hrvC, run:

v-annotate.pl -r --mkey <hrv-key> --mdir <hrv-models-dir-path> <fasta-file-to-annotate> <output-directory-to-create>

  5. The v-annotate.pl command in step 4 writes a number of files to <output-directory-to-create>. Among them are 5-column tab-delimited feature table files with the suffix .tbl: one for passing sequences (XXXXX.vadr.pass.tbl) and one for failing sequences (XXXXX.vadr.fail.tbl). The format of the .tbl files is described here: https://www.ncbi.nlm.nih.gov/genbank/feature_table/

    More information about understanding failures and error alerts can be found in the VADR documentation here: https://github.com/ncbi/vadr/blob/master/documentation/annotate.md
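
Steps 3 to 5 can be sketched as a small shell helper. This is a minimal dry-run sketch, not part of VADR or this repository: the function names hrv_print_commands and count_tbl_sequences and the .trimmed.fa output name are hypothetical. The first function only prints the step 3 and step 4 commands for a given species key so they can be reviewed before running. The second counts sequences in a .tbl file, assuming each sequence record begins with a ">Feature" line as in the GenBank feature table format.

```shell
#!/bin/sh
# Illustrative helpers for steps 3-5. Names here are hypothetical.

# Print the step 3 and step 4 commands for one HRV species.
#   $1 = species key (hrvA, hrvB or hrvC)
#   $2 = path to the cloned vadr-models-hrv directory
#   $3 = input fasta file
#   $4 = output directory for v-annotate.pl to create
hrv_print_commands() {
    key=$1
    repo=${2%/}          # drop one trailing slash, if present
    input=$3
    outdir=$4

    # Reject anything that is not one of the three species keys.
    case "$key" in
        hrvA|hrvB|hrvC) ;;
        *)
            echo "unknown HRV key: $key (expected hrvA, hrvB or hrvC)" >&2
            return 1
            ;;
    esac

    mdir="$repo/$key"               # <hrv-models-dir-path> from step 2
    trimmed="$input.trimmed.fa"     # hypothetical name for the trimmed file

    # Step 3: trim terminal ambiguous nucleotides, filter by length.
    printf '%s\n' "\$VADRSCRIPTSDIR/miniscripts/fasta-trim-terminal-ambigs.pl --minlen 50 --maxlen 8000 $input > $trimmed"

    # Step 4: only hrvA takes the extra --r_file option.
    if [ "$key" = hrvA ]; then
        printf '%s\n' "v-annotate.pl -r --r_file $mdir/hrvA.rpn.fa --mkey $key --mdir $mdir $trimmed $outdir"
    else
        printf '%s\n' "v-annotate.pl -r --mkey $key --mdir $mdir $trimmed $outdir"
    fi
}

# Step 5: count sequence records in a .tbl file.
# Assumes each record starts with a line beginning ">Feature".
count_tbl_sequences() {
    awk '/^>Feature/ { n++ } END { print n + 0 }' "$1"
}
```

For example, hrv_print_commands hrvB /path/to/vadr-models-hrv seqs.fa out_dir prints the two commands for hrvB, and count_tbl_sequences out_dir/XXXXX.vadr.pass.tbl reports how many sequences passed. The printed commands do not quote their arguments, so paths containing spaces would need quoting by hand.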


HRV VADR model library

  • The VADR model libraries for HRV annotation were developed using representative RefSeq sequences NC_001617 (A), NC_001490 (B) and NC_009996 (C).

  • All three model genomes have been modified slightly at the 3' end so that their polyA tails have a consistent length, which facilitates consistent behavior across the models.


Reference

  • The recommended citation for using VADR is: Alejandro A Schäffer, Eneida L Hatcher, Linda Yankie, Lara Shonkwiler, J Rodney Brister, Ilene Karsch-Mizrachi, Eric P Nawrocki; VADR: validation and annotation of virus sequence submissions to GenBank. BMC Bioinformatics 21, 211 (2020). https://doi.org/10.1186/s12859-020-3537-3

  • This page was adapted for HRV from Mpox virus annotation

