Initialize only once + tracking initializations state#624
Initialize only once + tracking initializations state#624jordanflitter wants to merge 19 commits intomainfrom
Conversation
… call broadcast_input_struct only one time in the calculation. This forced me to use a generator for run_global_evolution since we use there input_one_cell, which is the real input struct we should pass. A similar logic was applied in the z-photoncons scenario. All tests pass now
…ast_params in cfuncs.py, so that the broadcasting will happen only once when calling functions from this module multiple times (for example, if regenerate=False, in order to check if the appropriate HaloCatalog is in the cache, there is a repeated call to get_halo_catalog_buffer_size). In addition, I removed the call to lib.Free_cosmo_tables_global() at the end of high-level functions, and in return I made sure that free_cosmo_tables is always called (via the decorator @high_level_func), whether the call succeeded or there was an error
…_by_higher_level. All tests have passed
… @broadcast_params more similar
… @broadcast_params extremely similar
…l, renamed broadcast_params->c_wrapper, and simplified the logic in generator_wrapper (since it is always called from a high_level_func)
…hat is decorated with it to state explicitly if it is_generator
…functions, especially single field and high level functions that require initializing the power spectrum. To avoid segfaults, calls to init_ps and free_ps were removed completely by the C backend. This means that power spectrum initialization and free is done ONLY ONCE, at the wrapper level
…of the functions, especially single field and high level functions that require initializing the sigma tables. To avoid segfaults, calls to initialiseSigmaMInterpTable and freeSigmaMInterpTable were removed completely from the C backend. This means that the sigma tables initialization and free is done ONLY ONCE, at the wrapper level
… of the functions, especially single field and high level functions that require initializing the heat tables. To avoid segfaults, calls to init_heat and destruct_heat were removed completely from the C backend. This means that the heat tables initialization and free is done ONLY ONCE, at the wrapper level
…ble and init_heat_tables under a super decorator called _make_lifecycle_decorator
…lization states. All tests have passed, namely no segfaults with this architecture
…init_manager across different levels of functions, thereby increasing its visibility along the calculation. In addition, it simplifies a lot the logic in wrapper, since it doesn't need to deal with a scenario where the called function doesn't accept init_manager as keyword argument. But most importantly, it allows a separation between init_heat_tables and init_sigma_table, since these two wrappers are completely independent (one doesn't need the other, only c_wrapper for broadcasting inputs)
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #624 +/- ##
==========================================
+ Coverage 88.16% 88.32% +0.15%
==========================================
Files 32 32
Lines 4741 4829 +88
Branches 800 822 +22
==========================================
+ Hits 4180 4265 +85
Misses 401 401
- Partials 160 163 +3 ☔ View full report in Codecov by Sentry. |
|
Thanks @jordanflitter, this is a really great effort towards something that is very important! While reading your code (impressive use of decorators, btw!) I think I realized that this can be more simple. I'm going to write psuedo-code below which is untested but that hopefully gives the idea. First, I would make a new (private) module and define a new singleton object (i.e. an object that should only ever have one instance). Let's say the module name is from ... import InputParameters
class _GlobalCInitManager:
def __init__(self):
self.current_inputs: InputParameters | None = None
self._ps_inited = False
self._sigma_inited = False
self._heat_inited = False
def broadcast_inputs(self, inputs: InputParameters):
if self.current_inputs is None or inputs != self.current_inputs:
self.free_all() # free all first, so that we don't mix and match inputs later.
lib.Broadcast_struct_global_all(<you know the drill>)
self.current_inputs = inputs
def init_ps(self, inputs: InputParameters):
if not self._ps_inited and (self.current_inputs is None or self.current_inputs != inputs):
# Always broadcast inputs first
self.broadcast_inputs(inputs)
lib.init_ps()
self._ps_inited = True
def init_heat(self, inputs):
<same thing>
...
def free_all(self):
if self._ps_inited:
lib.free_ps()
self._ps_inited = False
if self._sigma_inited:
lib.freeSigmaMInterpTable()
self._sigma_inited = False
...
def init(self, inputs: InputParameters, *to_init: list[str]):
if not to_init:
# be conservative and initialize everything if nothing explicitly given
to_init = ['ps', 'sigma', 'heat']
for thing in to_init:
match thing:
'ps': self.init_ps()
'sigma': self.init_sigma()
... etc
c_init_manager = GlobalCInitManager()Now, either in that module or maybe in Then, any function that might use the PS or sigma or whatever just needs to have the decorator applied: The downside to all of this (which I think is the same downside for you) is that we have to be careful to find and decorate EVERY function that uses the PS/heat/sigma stuff even implicitly, both now and into the future. It would be great if we could simply mark the C-side functions somehow with a special mark such that we can tell from Python whether they will use these, then we could make automatic judgments. Oh wait, that's called a parameter to the function :-P |
|
Thanks @steven-murray for your feedback and your suggestion! I went over your suggested pseudo-code. It currently lacks some important logics, like a special case for a generator function (that doesn't return an output), and free memory if exception has occurred, but I'm pretty sure they can be included as well. I am struggling to understand though in what sense this design is simpler compared to mine, at the end I see that in your design we have one decorator that could contain multiple arguments above the user-called functions, whereas in my design there could be multiple decorators of a single argument... Ultimately, I don't have a clear preference for your design or mine (again, unless I'm missing something, the main difference boils down to the number of decorators and their arguments). But I will say that this PR was supposed to be a quick patch for solving the bottleneck in We should discuss all of this in our next meeting :) |
|
Thanks for the your comments @jordanflitter. Yes, it might be easiest to discuss this next week directly. You may be right that there is no real discernible difference between our methods, but I certainly felt mine was much simpler as I was writing it haha. |
Fix #591.
This PR deletes all the calls of
init_ps,initialiseSigmaMInterpTableandinit_heatfrom the C backend. The initializations (as well as the broadcast to C) are done in the wrapper only ONCE, regardless if we run a high level function, a single field function, or something fromcfuncs.py.I moved the nested wrapper in
cfuncs.pyto_param_config.py. The update wrapper (whose every line was written with blood, sweat and mostly tears) contains a dynamic_InitManagerinstance that is passed across different levels of functions via theirkwargs, allowing the functions to inspect the state of the initializations and decide whether a given initialization must be performed or not. The decorators above the functions also give an idea for what must be initialized in order for the execution to complete.Besides the more elegant design (according to me!), this architecture reduces the runtime of
run_global_evolutionby 50%! (because the sigma interpolation tables are not initialized every time a single field function is called)Moreover, it cleans some memory leak bugs (
init_pswas very often not accompanied withfree_psat the end, sometimes there were even calls toinit_pswithout actually using the power spectrum).Other ideas that I had but still haven't implemented them:
performance.rst._InitManagerto raise an error (if something was not initialized, to prevent a segfault) or a warning if the computation was complete but something was left unfreed. While this is a nice feature in theory, the architecture of the updated wrapper will make it very difficult to test these scenarios (since the whole idea is that the wrapper uses_InitManagerto decide what to initialize and what not.