Hi,
I've been using yacht to reduce false positives in sourmash, and I wanted to ask whether the tool could be updated to incorporate the latest features from sourmash_plugin_branchwater. This would be helpful for a couple of reasons:
- The newest version of yacht only supports processing one sample at a time, which becomes time-consuming when working with many samples.
- As highlighted in the tutorial, training is time-consuming, especially with large databases. I've been training on GTDB-R220 (all genomes) for nearly a week with no results yet, whereas training on the genomic representatives version took me only about a morning. This performance gap is significant.
I believe improvements such as supporting the new rocksdb data format and using manysketch and/or fastmultigather could reduce processing times and allow multiple samples to be handled simultaneously.
Thanks for the great tool, and I'm looking forward to potential improvements in future releases!