caching: centralize caching#688
Conversation
010d514 to
12bbc65
Compare
12bbc65 to
cc89d1b
Compare
cc89d1b to
1969b09
Compare
1969b09 to
45abbb5
Compare
12abfe2 to
bd9b23c
Compare
|
How about starting by adding the caching as an opt-in global config? This would allow us to test the caching in Lakeview. |
Good idea. I'll do some exploratory work on that tomorrow. |
77101ae to
71b6492
Compare
6a1d696 to
6113fa4
Compare
| data = ( | ||
| LazyCache.from_h5py_dset(dset, field="Value") | ||
| if caching.global_cache | ||
| else dset.fields("Value") | ||
| ) | ||
| timestamps = ( | ||
| LazyCache.from_h5py_dset(dset, field="Timestamp") | ||
| if caching.global_cache | ||
| else dset.fields("Timestamp") | ||
| ) |
There was a problem hiding this comment.
I think we should move the cache checking into functions in the caching package.
Those would be concerned about what caching to use while this code is only about loading the channel data.
I added a basic example below, but I think we can add other ones that cover the other usages from loading numpy arrays and such.
# in caching
def from_h5py(dset, field=None):
global global_cache
return (
LazyCache.from_h5py_dset(dset, field=field)
if global_cache
else dset.fields(field)
)
# here
@staticmethod
def from_dataset(dset, y_label="y", calibration=None) -> Slice:
data = caching.from_h5py(dset, field="Value")
timestamps = caching.from_h5py(dset, field="Timestamp")There was a problem hiding this comment.
Yeah, that's a good point. That keeps it all in one place. Easier to change.
I've added the method you suggested and put the retrieval in caching as well. Decided to go with a mixin here, since then we can keep the storage with the code that stores.
c23886f to
4b2f6e9
Compare
d619ceb to
44b5d4f
Compare
0. Why two caches?
There is a size disparity between channel data and processed data, which is why I think they should live in separate caches.
1. Allows us to monitor memory consumption of all the caches.
By centralizing the channel caching, we can monitor the memory consumption by the caches.
2. Less memory use in some cases.
Loading the same slice twice, doesn't cost you twice the memory anymore (since things are stored by location).
3. It fixes subtle sources of bugs
We can also enforce that the slices are read only. This prevents the following source of "bugs":
And instead prompts the user to explicitly make a copy if they want to change the data. We already do this for certain arrays coming from confocal objects.
Note that in-place operations on slices are still fine though (which is nice, because those are often used in de/recalibrating). For example::
4. It makes certain access patterns much more performant
Currently doing something like:
Is horrendously slow clocking in at
42 +- 5seconds for a small30second kymograph with50tracks. This reduces to1.6 +- 0.1seconds with the new cache. The following would have also worked to fix the problem without the cache:This would require additional awareness on the part of the user and doesn't have the benefit of allowing us to monitor how much data is going into these caches going forward.
Open questions
1. Small versus big slices
While doing this, I noticed that for
Continuousslices, we always read the whole slice even if we take only a small segment (despiteh5pysupporting lazily slicing). Note that this was also the case before the caching was introduced (bug?).At first, this seemed wasteful, so I set up the cache to allow slicing first and then grab the data (including start index and stop index in the location), but when you take many small slices, the performance is frequently far worse than reading the big slice as a whole at once. As a result, the case discussed above with the tracks still took 23 seconds.
One possible improvement to the cache system could be to include a start and stop index in the location again for continuous channels. Then keep track of how frequently a slice of a continuous channel with a particular path is accessed and if that exceeds some threshold assume that the user will be taking more chunks, dump the small chunks and load the whole slice instead. Then in the cache grabber, we could always check if the parent location (the one without additional indexing) is already present, and slice from that if it is.
Right now, from my testing so far, this additional complexity does not seem worth it though. Most
h5files contain one or two items and generally, you look at the whole thing before you start interacting with sub-slices. One iffy part would be if a single slice would exceed the memory capacity. But right now, you also have no way of loading those without going through the rawh5structure.2. What is a good default cache size?
I think making it proportional to a size in bytes makes more sense than number of items. Especially considering how much the items can vary in size.
I currently hardcoded the maximum at 1 GB, but I suspect this should depend on the user system somehow? Are there any guidelines for application memory usage?
3. Open file handles
It might be worth considering going one step further and only store locations in the
lk.Fileobject and handle opening/closing the resource in the function that looks up and caches these resources. Then we also don't need to keep theFileopen indefinitely. By handling file interactions in one place, it might also make it a lot easier to swap out file formats in the future.4.
sys.getsizeofThe way the property/method cache is set up is that it should only ever be used for immutable types whose size is reported correctly by
sys.getsizeof. This is true for anumpyarray (if we set writable to false) and basic types like integers and floats for instance, but not for tuples. This requires some diligence on us to make sure that we enforce this.Some final notes
When reading data from
TimeTagchannels, we actually read the data 3 times already even when we don't want to access it at all 4 by the time we actually read it. Note that we can fix this without having to resort to a global cache however.Considering that disallowing data modification is a breaking change, I would propose we iterate on this a bit now, and roll this out when
2.0rolls around.Thoughts?