Add multi-index support to apply_time_correction#103
Open
jon-proximafusion wants to merge 1 commit into
Open
Conversation
apply_time_correction now accepts an iterable of time indices and returns a list of derived tallies, one per index. The tally arrays are read and reshaped once and shared across all indices, so post-processing a D1S mesh tally at many shutdown times no longer re-reads and re-reshapes the data on every call. Scalar index input is unchanged and still returns a single Tally; each derived tally is bit-for-bit identical to the corresponding single-index call. The summed (derived) result no longer stores sum/sum_sq (the public accessors return None for derived tallies regardless). This avoids two full-size multiplies per index and fixes Tally.sparse on summed tallies, whose stored arrays were otherwise shaped inconsistently with the popped ParentNuclideFilter. Speeds up summed post-processing by 1.7-3.1x across tally sizes in benchmarks (larger meshes and more timesteps benefit most), with results bit-for-bit identical to the previous implementation.
5bc2208 to
2e0c553
Compare
5 tasks
5 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
apply_time_correctionnow accepts an iterable of time indices in addition to a single integer. When given an iterable it returns a list of derived tallies, one per index; scalar input is unchanged and still returns a singleTally.This makes it cheap to post-process a D1S mesh tally at many shutdown times. Previously each time of interest required a separate call, and every call re-read the tally results and re-reshaped the
sum/sum_sq/mean/std_devarrays. Now the arrays are read and reshaped once and shared across all requested indices.Behavior
index→ returns a singleTally(unchanged API).index→ returnslist[Tally], one per index, in the order requested (unordered/partial sequences are honored).sum_nuclides=True, the summed result no longer storessum/sum_sq. The public accessors already returnNonefor any derived tally, so this matchesdevelop's observable behavior. It avoids two full-size multiplies per index on data nothing reads, and fixesTally.sparseon summed tallies (whose stored arrays were otherwise shaped inconsistently with the poppedParentNuclideFilter).Performance
Reading and reshaping the tally arrays once (instead of once per call) avoids the repeated I/O and reshaping when post-processing at many times; larger meshes and more timesteps benefit most. The per-index arithmetic is unchanged, so this is a constant-factor saving on setup, not a faster algorithm.
Testing
Adds
test_apply_time_correction_multi_index, which checks that each element of a multi-index call matches the corresponding scalar call (mean, std_dev, filters, and sum/sum_sq), that unordered/partial index sequences are honored, that scalar input still returns a singleTally, and that the original tally is left unmodified.Relationship to #104
This PR is the bit-for-bit implementation. #104 is an alternative that computes the summed parent-nuclide reduction with an
einsumcontraction — faster still for large mesh tallies, but matchesdevelopto machine precision (~1e-15 relative) rather than bit-for-bit. The two are mutually exclusive; pick whichever trade-off is preferred.Fixes # (issue)
Checklist