- Add a utils function to extract subsequence counts from a string. It will take a sequence string plus start, stop, window, and step (offset) options (all could be optional, or window could have a default of 15 or whatever). It will return a `Counter` (https://docs.python.org/2/library/collections.html).
- You know number 2, right? :-)
- Add a script in `bin` called `fasta-extract-subsequences.py` (or something like that). Model the script on `fasta-ids.py` (in terms of reading its input, allowing us to pipe FASTA into it on stdin). It can just print the subsequences and their counts for now. It could optionally sort them by count (in reverse), or not print the counts. But those things can also just be done by external tools like sort, cut, etc. Hint: `Counter`s can be added!
- Update the version number in `dark/__init__.py`, update the `CHANGELOG`, and add the script's name to `setup.py`.
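
To make the first and third items concrete, here is a rough sketch of what the counting function and the "add the Counters" idea might look like. All names (`subsequenceCounts`, `readFasta`, `countAll`) and the defaults other than the window of 15 are made up for illustration. `readFasta` is a minimal stand-in so the example runs on its own; the real script should read its input the way `fasta-ids.py` does.

```python
from collections import Counter


def subsequenceCounts(sequence, window=15, start=0, stop=None, step=1):
    """
    Count the subsequences of length `window` found in `sequence`.

    Hypothetical signature: windows begin at offset `start`, advance by
    `step`, and must end at or before `stop` (default: end of sequence).
    """
    if window < 1 or step < 1 or start < 0:
        raise ValueError('window and step must be >= 1 and start >= 0')
    stop = len(sequence) if stop is None else min(stop, len(sequence))
    # The last usable window begins at stop - window.
    return Counter(sequence[i:i + window]
                   for i in range(start, stop - window + 1, step))


def readFasta(fp):
    """
    Yield (id, sequence) pairs from an iterable of FASTA lines.

    Stand-in parser for this sketch only (an assumption, not the
    project's own FASTA reader).
    """
    seqId, parts = None, []
    for line in fp:
        line = line.strip()
        if line.startswith('>'):
            if seqId is not None:
                yield seqId, ''.join(parts)
            seqId, parts = line[1:], []
        elif line:
            parts.append(line)
    if seqId is not None:
        yield seqId, ''.join(parts)


def countAll(fp, **kwargs):
    """
    Sum the subsequence counts over every sequence read from `fp`.
    Keyword arguments are passed through to subsequenceCounts.
    """
    total = Counter()
    for _, sequence in readFasta(fp):
        # Counters can be added, so per-sequence counts just accumulate.
        total += subsequenceCounts(sequence, **kwargs)
    return total
```

A script could then call something like `countAll(sys.stdin)` and print each subsequence with its count. `Counter.most_common()` gives them sorted by descending count, if we want that built in rather than left to `sort`.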
This might be of use if we want to provide that woman who Victor brought by yesterday with a list of all peptides from all Wuhan sequences.