Subsequence counting #722

@terrycojones

Description

  1. Add a utils function to extract subsequence counts from a string. It will take a sequence string plus start, stop, window, and step (offset) options. All of the options could be optional, or window could have a default of 15 or whatever. It will return a Counter (https://docs.python.org/2/library/collections.html).
  2. You know number 2, right? :-)
  3. Add a script in bin called fasta-extract-subsequences.py (or something like that). Model the script on fasta-ids.py (in terms of reading its input, allowing us to pipe FASTA into it on stdin). It can just print the subsequences and their counts for now. It could optionally sort them (reversed) by count, or not print the counts. But those things can also just be done by external tools like sort, cut, etc. Hint: Counters can be added!
  4. Update the version number in dark/__init__.py, the CHANGELOG, and add the script's name to setup.py.
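
A rough sketch of what items 1 and 3 could look like, to make the windowing and the "Counters can be added" hint concrete. All names here (`subsequence_counts`, `read_fasta`, `count_fasta`) are hypothetical, the treatment of `stop` as an exclusive end index is an assumption, and `read_fasta` is only a stand-in for however fasta-ids.py actually reads its input:

```python
from collections import Counter


def subsequence_counts(sequence, window=15, step=1, start=0, stop=None):
    """Count the length-`window` subsequences of `sequence`.

    Windows begin at offset `start` and advance by `step`. Only windows
    that end at or before `stop` (exclusive index, default: the end of
    the sequence) are counted. These semantics are an assumption.
    """
    if window < 1 or step < 1 or start < 0:
        raise ValueError("window and step must be >= 1, start must be >= 0")
    stop = len(sequence) if stop is None else min(stop, len(sequence))
    return Counter(
        sequence[i:i + window]
        for i in range(start, stop - window + 1, step)
    )


def read_fasta(lines):
    """Yield (id, sequence) pairs from an iterable of FASTA lines.

    Stand-in parser for illustration only; the real script would reuse
    whatever reader fasta-ids.py uses.
    """
    seq_id, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            seq_id, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def count_fasta(lines, **kwargs):
    """Sum the per-sequence Counters over every sequence in the input."""
    total = Counter()
    for _, sequence in read_fasta(lines):
        # Counters support +=, so per-sequence counts merge directly.
        total += subsequence_counts(sequence, **kwargs)
    return total
```

A script could then pass `sys.stdin` to `count_fasta` and print each item of `.most_common()`, which already yields the subsequences sorted by descending count.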

This might be of use if we want to provide that woman who Victor brought by yesterday with a list of all peptides from all Wuhan sequences.
