@Taepper
Collaborator

@Taepper Taepper commented Nov 26, 2025

This adds a performance benchmark for the many short-read filters, which was used for profiling and for identifying performance-critical sections.

@Taepper Taepper force-pushed the short-read-filter-benchmark branch from bc530b1 to b6cd27b Compare November 26, 2025 13:35
@github-actions
Contributor

github-actions bot commented Nov 26, 2025

This is a preview of the changelog of the next release. If this branch is not up-to-date with the current main branch, the changelog may not be accurate. Rebase your branch on the main branch to get the most accurate changelog.

Note that this might contain changes that are on main, but not yet released.

Changelog:

0.9.2 (2025-11-26)

Features

  • benchmarking: add DEBUG custom variable for running debug builds (541afc8)
  • benchmarking: update api-query, enable CSV log file w/ checksum comparisons (3d135d0)
  • build: tag images with branch name again (670817c)
  • documentation: add documentation for sequence storage format (47e4081)
  • performance: add performance benchmark for the many short-read filters (f542989)
  • silo: better compression for sequences (9eb69c7)

Bug Fixes

  • build: fix Makefile to not repeatedly invoke conan install (346e6a9)

@Taepper Taepper force-pushed the short-read-filter-benchmark branch from b6cd27b to f542989 Compare November 26, 2025 13:57
@Taepper
Collaborator Author

Taepper commented Dec 4, 2025

I will create a report that uses this benchmark to show how performance improved over the commits of the 0.9.2 release.

@pflanze
Contributor

pflanze commented Jan 4, 2026

Sorry, I promised to look at this. Two thoughts:

  • IMHO it's not a good idea to include large files in Git source code repositories. The sorted.sample.ndjson.zst file increases the total compressed size of all files in this repository from 2.7 MB to 11.2 MB. If you never add any other large file, this one won't be a problem. But if you change it even semi-frequently, or if it means giving in to the temptation to include other large files, then the source code becomes secondary and working with the repository gets bogged down by the weight. And once those files are in the Git history, you can never get rid of them again (they will always be part of the .git dir).

    I suggest you make another repository for those kinds of files and then add it as a git submodule. Then you can easily drop the submodule again in the future and the file data won't be part of the .git repo.

  • Would it be a good idea to use the evobench tooling for those? Is there anything missing to integrate with the normal benchmarking?
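
To make the large-file point above concrete, here is a minimal sketch of a check that lists working-tree files above a size threshold. It is not part of this PR or of any existing tooling in the repository; the function name `find_large_files` and the threshold are hypothetical, and it only looks at files on disk, not at the Git history.

```python
import os
from pathlib import Path


def find_large_files(root, limit_bytes):
    """Return (relative_path, size) pairs for files larger than limit_bytes.

    Hypothetical helper: walks the working tree under `root` and skips the
    .git directory, so only checked-out files are considered.
    """
    root = Path(root)
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = Path(dirpath) / name
            # Skip broken symlinks and anything that is not a regular file.
            if not path.is_file():
                continue
            size = path.stat().st_size
            if size > limit_bytes:
                large.append((path.relative_to(root).as_posix(), size))
    # Largest first, so the heaviest files are listed at the top.
    return sorted(large, key=lambda item: item[1], reverse=True)
```

Calling `find_large_files(".", 1_000_000)` from the repository root would return every file over roughly 1 MB. A check like this could run before a commit, but whether that is worth adding here is a separate question from the submodule suggestion.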

3 participants