Background
Inspecting sequence alignments is a core part of molecular biology. The human eye can pick up patterns on a small scale; we can then program a computer to find similar patterns on a much larger scale.
Aligning sequences assigns each nucleotide in a FASTA file an indexed position (a column in the alignment). This in turn enables downstream processing, such as computing the distance between two sequences.
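To make the "distance between two sequences" idea concrete, here is a minimal sketch of one simple measure, the p-distance (the fraction of compared alignment columns where the two sequences differ). The function name `p_distance` and the choice to skip columns containing a gap (`-`) are assumptions for illustration; this is not necessarily the score an aligner reports.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of differing sites between two ALIGNED sequences.

    Columns where either sequence has a gap ('-') are skipped
    (an assumption; other conventions exist).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            mismatches += 1

    # If no column could be compared, report 0.0 rather than dividing by zero.
    return mismatches / compared if compared else 0.0


if __name__ == "__main__":
    # Toy aligned sequences, made up for illustration only.
    print(p_distance("ACGT-A", "ACCTTA"))
```

Because the alignment gives every nucleotide a column index, the comparison is a simple column-by-column walk; without alignment, the positions would not be comparable.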
Aim
Align the 9 .fasta files in this folder:
mitolin/data/gen/nguyen_nc_2018/20190702-fastas-on-hpc/1739/20lines/
Related to issue 5
Related to issue 1 fluHA
Method
You can start with BLAST or MUSCLE. MUSCLE will generate sequence distance scores that can be used to build a lineage tree (a.k.a. a dendrogram, or hierarchical clustering).
Wikipedia maintains a list of alignment visualization software.
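As a rough illustration of how pairwise distances can become a tree, the sketch below performs average-linkage hierarchical clustering in plain Python. The function name `average_linkage_tree`, the input format (a dict keyed by `frozenset` pairs), and the toy distances are all assumptions for illustration; this does not reproduce MUSCLE's own output or file formats.

```python
from itertools import combinations


def average_linkage_tree(names, dist):
    """Build a nested-tuple tree by repeatedly merging the two closest clusters.

    names: list of sequence labels
    dist:  dict mapping frozenset({a, b}) -> pairwise distance
    """
    # Map each subtree (a label or a nested tuple) to its member labels.
    clusters = {name: [name] for name in names}

    def cluster_distance(members_a, members_b):
        # Average of all pairwise distances between the two clusters.
        total = sum(dist[frozenset((a, b))] for a in members_a for b in members_b)
        return total / (len(members_a) * len(members_b))

    while len(clusters) > 1:
        left, right = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        merged_members = clusters.pop(left) + clusters.pop(right)
        clusters[(left, right)] = merged_members

    return next(iter(clusters))


if __name__ == "__main__":
    # Hypothetical labels and distances, not taken from the real data.
    toy = {
        frozenset(("A", "B")): 0.1,
        frozenset(("A", "C")): 0.5,
        frozenset(("B", "C")): 0.6,
    }
    print(average_linkage_tree(["A", "B", "C"], toy))
```

The nested tuples mirror the dendrogram: sequences merged first sit deepest in the structure. For real data, a dedicated library or the aligner's own tree output is likely a better choice than this sketch.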
Document your work
Please fork & clone this repo. Check out a branch for your work, then push and make a PR for us to merge your note and files.
Add a note (can be .md or .ipynb) with your solution to mitolin/nb.
Your note should be named as follows:
- DATE-issue#-shortdescription.ext
e.g.:
- 20190701-i02-extract-chrM-fa.md
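If it helps, the naming pattern can be generated programmatically. This is a small sketch only: the helper name `note_filename` is hypothetical, and the `YYYYMMDD` date format and the two-digit, `i`-prefixed issue number are inferred from the example above rather than stated as rules.

```python
from datetime import date


def note_filename(day: date, issue: int, description: str, ext: str) -> str:
    """Build a note name like DATE-issue#-shortdescription.ext.

    Assumes DATE is YYYYMMDD and the issue number is zero-padded to
    two digits with an 'i' prefix, as in the example file name.
    """
    return f"{day:%Y%m%d}-i{issue:02d}-{description}.{ext.lstrip('.')}"


if __name__ == "__main__":
    print(note_filename(date(2019, 7, 1), 2, "extract-chrM-fa", "md"))
```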
Questions?
Please post questions about this issue in this issue thread. For a quick response, post a link to your comment in Slack #deepcelllineage or DM @deena. To join Slack, enter your email address here. For questions NOT specifically related to this issue, get in touch through any of the communication methods listed in DCL's overview README.