Concatenate and merge info, preserving conflicts as lists. #691
joseppinilla wants to merge 1 commit into dwavesystems:main
Codecov Report
All modified and coverable lines are covered by tests.
Coverage diff, main vs. #691:
  Coverage: 91.85% -> 91.86% (+0.01%)
  Files: 60 -> 60
  Lines: 4222 -> 4229 (+7)
  Hits: 3878 -> 3885 (+7)
  Misses: 344 -> 344
Hmm, I am worried that the conflict resolution would lead to inconsistent/confusing results. I wonder if we could combine it with #643 by always making a list, even when the values are identical?
randomir left a comment:
This is a very nice feature, but I share @arcondello's concern about consistency.
I suggest adding an optional info_merge_strategy argument (or similar) that controls how info fields are combined:
- "squeeze" could be your current implementation,
- "list" would always form lists,
- "drop" would skip the merge altogether,
- "inplace" or "recursive" would recursively combine all info dicts,
- a custom callable would delegate the merge to a user-supplied function of n dict args.
An inplace/recursive merge might be overkill, so we can drop that one for now.
In addition, you'd probably want to copy the info dicts before merging them, since dicts are mutable.
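To make the proposal concrete, here is a minimal sketch of how such a strategy argument might dispatch. The helper names `merge_info` and `_collect` are hypothetical and not part of dimod; only the strategy names come from the suggestion above. The sketch assumes info values can be compared with `==` (array-valued entries would need special handling), and its "list" branch only collects values that are present.

```python
import copy
from typing import Any, Callable, Dict, List, Sequence, Union

Info = Dict[str, Any]
Strategy = Union[str, Callable[..., Info]]


def _collect(infos: Sequence[Info]) -> Dict[str, List[Any]]:
    """Group the values found under each key, in input order."""
    grouped: Dict[str, List[Any]] = {}
    for info in infos:
        for key, value in info.items():
            grouped.setdefault(key, []).append(value)
    return grouped


def merge_info(infos: Sequence[Info], strategy: Strategy = "squeeze") -> Info:
    """Hypothetical helper: combine info dicts according to `strategy`."""
    # Deep-copy first so the result never shares mutable state with the inputs.
    infos = [copy.deepcopy(info) for info in infos]

    if callable(strategy):
        # Delegate to a user function that takes n dicts as positional args.
        return strategy(*infos)
    if strategy == "drop":
        return {}
    grouped = _collect(infos)
    if strategy == "list":
        # Always a list, even when every value is identical.
        return grouped
    if strategy == "squeeze":
        # A single value when all occurrences agree, otherwise the full list.
        return {
            key: values[0] if all(v == values[0] for v in values[1:]) else values
            for key, values in grouped.items()
        }
    raise ValueError(f"unknown info merge strategy: {strategy!r}")


a = {"timing": {"qpu": 10}, "solver": "X"}
b = {"timing": {"qpu": 12}, "solver": "X"}
merged = merge_info([a, b])
# merged == {"timing": [{"qpu": 10}, {"qpu": 12}], "solver": "X"}
```

Because the inputs are deep-copied, mutating `merged["timing"][0]` afterwards leaves `a` untouched, which addresses the mutability point.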
I was considering one option flag, e.g. I see now how "drop" would allow compatibility with the current implementation, but I'm not sure if adding more flexibility is necessary.
I prefer a categorical option to a boolean because it allows future expansion. Accepting a callable as well is trivial and provides full flexibility.
One reason I didn't originally implement "list" is that I had to decide whether lists are always I think I'd go for the latter, even though it wouldn't allow tracking.
IMO, "list" should include fields from all samplesets. We might consider splitting "list" into "list-all" and "list-existing", although "squeeze" is conceptually already very similar to "list-existing". @arcondello, thoughts?
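One possible reading of the "list-all" / "list-existing" split, as a sketch only. It assumes "list-all" keeps one slot per sampleset and fills absent keys with `None`, while "list-existing" collects only the values that are present. Neither the `None` placeholder nor the `list_merge` helper and its `fill_missing` flag come from the discussion.

```python
from typing import Any, Dict, List, Sequence

Info = Dict[str, Any]


def list_merge(infos: Sequence[Info], fill_missing: bool) -> Dict[str, List[Any]]:
    """Hypothetical helper contrasting the two list variants."""
    # Preserve first-seen key order across all inputs.
    keys = list(dict.fromkeys(key for info in infos for key in info))
    if fill_missing:
        # "list-all" (assumed): one slot per sampleset, None where the key is absent,
        # so position i always refers to the i-th sampleset.
        return {key: [info.get(key) for info in infos] for key in keys}
    # "list-existing" (assumed): only values that are actually present,
    # which loses track of which sampleset each value came from.
    return {key: [info[key] for info in infos if key in info] for key in keys}


first = {"solver": "X", "timing": 10}
second = {"solver": "Y"}

list_all = list_merge([first, second], fill_missing=True)
# list_all == {"solver": ["X", "Y"], "timing": [10, None]}

list_existing = list_merge([first, second], fill_missing=False)
# list_existing == {"solver": ["X", "Y"], "timing": [10]}
```

Under this reading, only the fixed-length variant lets a caller map each value back to its sampleset by index. A `None` placeholder is ambiguous if `None` is itself a legitimate info value, so a dedicated sentinel may be preferable.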
I don't see how
Opening this in case there is interest in merging info when concatenating samplesets, since right now all info is ignored.
This preserves conflicting values by listing them, but squeezes values that agree across samplesets into a single entry.
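The behaviour described here can be illustrated with plain dicts. This is a sketch under assumptions, not the code in this PR: `squeeze_merge` is a hypothetical name, values are compared with `==`, and a conflict is assumed to list every value in input order.

```python
from typing import Any, Dict, List, Sequence


def squeeze_merge(infos: Sequence[Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical helper: list conflicting values, squeeze agreeing ones."""
    grouped: Dict[str, List[Any]] = {}
    for info in infos:
        for key, value in info.items():
            grouped.setdefault(key, []).append(value)
    return {
        key: values[0] if all(v == values[0] for v in values[1:]) else values
        for key, values in grouped.items()
    }


run1 = {"solver": "X", "num_reads": 100}
run2 = {"solver": "X", "num_reads": 50}

conflicting = squeeze_merge([run1, run2])
# conflicting == {"solver": "X", "num_reads": [100, 50]}

agreeing = squeeze_merge([run1, run1])
# agreeing == {"solver": "X", "num_reads": 100}
```

Note that `num_reads` comes back as a list in one call and as an int in the other, so the type of a merged field depends on the data being merged.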