Releases: Kalan-Lab/codoff
v1.2.3
- Major Update: The simulation now uses sequential sampling. Instead of creating hypothetical gene clusters of equivalent size to the focal region of interest by randomly grouping together genes from across the genome, it samples real genomic loci of similar size to the focal region. This makes the simulation more realistic, because it preserves the spatial structure of the genome, and makes inferences more robust to artifacts due to focal region size.
- Major Update: Instead of an "empirical p-value" to assess codon usage discordance to the background genome, we now report a "discordance percentile".
- Introduce: Added `run_simulations.py` to run codoff on random regions of a given length for any input genome, validating that the percentile/p-value distribution is uniform as expected.
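The difference between the two sampling schemes in the first item can be sketched as follows. This is a minimal illustration, not codoff's actual code: the function names, the use of summed gene lengths as the size measure, and the circular wrap-around at the end of the genome are all assumptions made for the example.

```python
import random


def sample_random_genes(gene_lengths, target_length, rng):
    """Random grouping: draw genes from anywhere in the genome until the
    summed length reaches target_length. Spatial structure is lost."""
    order = list(range(len(gene_lengths)))
    rng.shuffle(order)
    picked, total = [], 0
    for idx in order:
        if total >= target_length:
            break
        picked.append(idx)
        total += gene_lengths[idx]
    return picked


def sample_contiguous_locus(gene_lengths, target_length, rng):
    """Sequential sampling: pick a random start gene and extend along the
    genome until the summed length reaches target_length.
    Assumption: the gene order wraps around (circular genome)."""
    n = len(gene_lengths)
    start = rng.randrange(n)
    picked, total = [], 0
    for offset in range(n):
        if total >= target_length:
            break
        idx = (start + offset) % n
        picked.append(idx)
        total += gene_lengths[idx]
    return picked
```

Both functions return gene indices; only the second guarantees that the sampled genes are neighbours in genome order, which is the property the release note describes as preserving spatial structure.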
Full Changelog: v1.2.2...v1.2.3
v1.2.2
- Fixed: reporting of focal region codon frequencies. While major stats, such as P-value, Spearman correlation, and observed cosine distance were unaffected by this issue, the last set of simulated focal-region frequencies was previously being reported in place of the actual focal region frequencies.
- Switched to a significantly faster approach with caching of pre-computed CDS codon counts - can process ~90 fungal BGCs in ~5 mins using 1 thread.
- Added more testing.
- Options for both `codoff` and `antismash_codoff` have slightly changed.
- Improved coding style.
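The caching idea mentioned above can be illustrated with a short sketch: count the codons of each CDS once, then build the counts for any region by summing cached per-CDS counters rather than re-reading sequences. The function names and the dictionary-based cache are hypothetical and are not taken from codoff itself.

```python
from collections import Counter


def count_codons(cds_seq):
    """Count in-frame codons in one CDS; a trailing partial codon is ignored."""
    seq = cds_seq.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def build_codon_cache(cds_by_id):
    """Pre-compute codon counts once per CDS (hypothetical cache layout)."""
    return {cds_id: count_codons(seq) for cds_id, seq in cds_by_id.items()}


def region_counts(cache, cds_ids):
    """Sum cached per-CDS counts for any set of CDS identifiers."""
    total = Counter()
    for cds_id in cds_ids:
        total.update(cache[cds_id])
    return total
```

With such a cache, each simulated region costs only a few counter additions, which is one plausible reason repeated simulations become much cheaper.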
Full Changelog: v1.2.1...v1.2.2
v1.2.1
- Add support for gzipped input GenBank files.
- Replace usage of the pkg_resources API, which is being deprecated.
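One common way to accept both plain and gzipped input files is to check the gzip magic bytes before opening. The helper below is a hypothetical sketch of that pattern using only the standard library; it is not necessarily how codoff detects compressed GenBank files.

```python
import gzip


def open_maybe_gzipped(path):
    """Return a text-mode handle, decompressing transparently if the file
    starts with the gzip magic bytes (0x1f 0x8b)."""
    with open(path, "rb") as fh:
        is_gzip = fh.read(2) == b"\x1f\x8b"
    if is_gzip:
        return gzip.open(path, "rt")
    return open(path, "r")
```

Checking the file content rather than the `.gz` extension means a misnamed file is still handled correctly.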
Full Changelog: v1.2.0...v1.2.1
v1.2.0
- Restructure codoff to allow for usage as a Python library. API documented on wiki at: https://github.com/Kalan-Lab/codoff/wiki/API-and-usage-of-main-functions-in-Python-programs
- Introduce `antismash_codoff` for running codoff in batch for all BGC regions determined by antiSMASH for a single genome.
- Improve preliminary checking of input arguments/files.
Full Changelog: v1.1.8...v1.2.0
v1.1.8
v1.1.0
- Add support for additional input types including just a genome in FASTA format with a focal region defined using coordinates. pyrodigal is used in this case to perform gene calling.
- Improve output clarity
- Add argument for producing a plot of simulated cosine distances with a marker for the actual focal region cosine distance
- Make empirical P-value computation always on
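For readers unfamiliar with the two statistics named above, here is a minimal sketch of a cosine distance between codon frequency vectors and an empirical P-value over simulated distances. The function names are illustrative, and the add-one correction in the P-value is an assumption chosen for the example; codoff's exact formula may differ.

```python
import math


def cosine_distance(u, v):
    """1 minus the cosine similarity of two equal-length frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def empirical_p_value(observed, simulated):
    """Fraction of simulated distances at least as large as the observed one.
    Assumption: an add-one correction keeps the estimate above zero."""
    hits = sum(1 for d in simulated if d >= observed)
    return (hits + 1) / (len(simulated) + 1)
```

A focal region whose codon usage matches the background gives a distance near 0, and a small empirical P-value indicates that few simulated regions are as discordant as the focal one.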
Full Changelog: v1.0.0...v1.1.0