Conversation
This adds the unique spras_revision to every single parameter combination (before hashing) and the dataset label, to provide OSDF support on the level of deterministic algorithms.
agitter
left a comment
I finished another partial revision. I still haven't thought about the testing implications carefully.
whoops! accidentally feature-regressed
agitter
left a comment
A few more comments. I still haven't looked through all the test code.
Since both past approaches do not scale well, I've decided to focus only on the RECORD file. This fails specifically in the case where SPRAS is somehow run without being installed as a Python module, and I can't think of a plausible scenario where this happens.
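As a rough illustration of the RECORD-based approach described above, here is a minimal sketch that derives a short revision string from an installed distribution's RECORD file. It is not the code in this PR: the function names, the use of SHA-256, and the 8-character truncation are all assumptions made for the example. Only `importlib.metadata` and `hashlib` from the standard library are used.

```python
import hashlib
from importlib import metadata
from typing import Optional


def revision_from_record(dist: metadata.Distribution, length: int = 8) -> Optional[str]:
    """Hash a distribution's RECORD file into a short revision string.

    Hypothetical helper: returns None when the distribution has no RECORD file.
    """
    record = dist.read_text("RECORD")
    if record is None:
        return None
    return hashlib.sha256(record.encode("utf-8")).hexdigest()[:length]


def installed_revision(name: str = "spras") -> Optional[str]:
    """Look up the named distribution; None if it is not installed as a package."""
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        # This is the failure case mentioned above: running without installing.
        return None
    return revision_from_record(dist)
```

Under these assumptions, `installed_revision()` returns `None` whenever the package is run from a source checkout without being installed, so a caller can detect that case and report it instead of silently producing unversioned output.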
As a follow up to our meeting discussion, I'm wondering if this type of output file versioning should be optional. Then when running in CHTC and writing to OSDF (or running locally and opting in) it could be enabled. By making it opt in, we would have simpler filenames by default and ensure the user knows they have to install and run SPRAS a specific way for this feature to work.
That makes the most sense to me as well 👍
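To make the opt-in behaviour discussed above concrete, here is a small sketch of how a revision could be folded into the dataset label only when the flag is enabled. The `osdf_immutable` key comes from this PR; the function name, the `{dataset}-{algorithm}-params-{hash}` naming pattern, and the error message are assumptions for illustration only.

```python
from typing import Any, Mapping, Optional


def output_prefix(
    dataset: str,
    algorithm: str,
    params_hash: str,
    config: Mapping[str, Any],
    revision: Optional[str],
) -> str:
    """Build an output name, adding the revision only when the user opts in."""
    enabled = bool(config.get("osdf_immutable", False))
    if not enabled:
        # Default: simple names, no dependence on how the package was installed.
        return f"{dataset}-{algorithm}-params-{params_hash}"
    if revision is None:
        # Opting in requires a resolvable revision, so fail loudly.
        raise RuntimeError("osdf_immutable is set but no revision could be determined")
    return f"{dataset}-{revision}-{algorithm}-params-{params_hash}"
```

With the flag off, the revision argument is ignored entirely, which keeps file names short by default and means only users who opt in need an installed package.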
what is going on in ci???
okay - sysconfig.get_path("purelib") is correct
we need better typing :/
#
# By default, this is disabled, as it can make output file names confusing. Here, it's set to true since we use this
# configuration file for testing.
osdf_immutable: true
This is a little annoying. We use this config for testing, so it's nice to enable this, but this is also our documentation config. I can write some extra code to enable this during testing, but that seems strange as well.
For now, I'm okay with keeping this and then writing more documentation later (especially as we start focusing more on the COMBINE25 tutorial).
This change means that output files will not be reused whenever SPRAS is updated if `osdf_immutable` is true, furthering the immutability goal necessary to get OSDF integration working for SPRAS benchmarking. ('Updated' depends on the git commit hash or the actual SPRAS release version.)

This adds the unique `spras_revision` to every single parameter combination (before hashing) and to the dataset label, to provide OSDF support on the level of deterministic, non-seeded algorithms when datasets are immutable.

This has the added benefit of allowing SPRAS users to simply upgrade their SPRAS version without needing to clear `output`, which complements #380. The refactored test also partially covers #165 and #45. (This is also where the majority of the code comes from: the actual feature patch here is a 50-line change.)

See #321, implemented by #335, for handling nondeterministic / seeded algorithms.

To make this change, a significant test refactor in `test/analysis` was needed to remove hardcoded paths (which contained the hashes being modified per-commit in this PR). It turns out that whenever we make any change to the hash, this test breaks (the patch here fixes this). That's why this PR is depended on by so many other PRs.
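The idea of adding `spras_revision` to each parameter combination before hashing can be sketched as follows. This is an illustrative example rather than the implementation in this PR: the function name, the `_spras_rev` key, the JSON serialization, and the SHA-1 digest with 7-character truncation are all assumptions.

```python
import hashlib
import json
from typing import Any, Optional


def params_hash(params: dict[str, Any], spras_revision: Optional[str] = None, length: int = 7) -> str:
    """Hash a parameter combination, optionally salted with a revision string."""
    payload = dict(params)  # copy so the caller's dict is left untouched
    if spras_revision is not None:
        # Hypothetical reserved key; it changes the hash whenever the revision changes.
        payload["_spras_rev"] = spras_revision
    # sort_keys makes the hash independent of dict insertion order.
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha1(encoded).hexdigest()[:length]
```

Because the revision is part of the hashed payload, any test that hardcodes a hash-bearing path would break on every commit, which is why computing expected paths from the same function is the more robust choice for tests.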