Stash datafile numpy arrays and concatenate once #57
Open
zackcornelius wants to merge 1 commit into ECP-CANDLE:master
Avoid appending to xt_all and yt_all during datagen by stashing the xt and yt arrays in a Python list. Concatenate all the xt and yt arrays after all datagen frames have been processed, to trigger the memory copy only once.
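The change described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: `datagen_frames`, `collect_by_appending`, `collect_by_stashing`, and the array shapes are hypothetical names and values chosen for the example; only `xt`, `yt`, `xt_all`, and `yt_all` come from the description.

```python
import numpy as np


def datagen_frames(n_frames, rows=4, x_dim=3, y_dim=2):
    """Hypothetical stand-in for the datagen loop: yields one (xt, yt) pair per frame."""
    rng = np.random.default_rng(0)
    for _ in range(n_frames):
        yield rng.random((rows, x_dim)), rng.random((rows, y_dim))


def collect_by_appending(frames):
    """Pattern being avoided: every np.append allocates a new array and copies all prior data."""
    xt_all, yt_all = None, None
    for xt, yt in frames:
        xt_all = xt if xt_all is None else np.append(xt_all, xt, axis=0)
        yt_all = yt if yt_all is None else np.append(yt_all, yt, axis=0)
    return xt_all, yt_all


def collect_by_stashing(frames):
    """Pattern described in this PR: stash per-frame arrays in lists, concatenate once at the end."""
    xt_list, yt_list = [], []
    for xt, yt in frames:
        xt_list.append(xt)  # list append stores a reference; no array data is copied here
        yt_list.append(yt)
    # A single concatenate per output array copies each frame's data exactly once.
    return np.concatenate(xt_list, axis=0), np.concatenate(yt_list, axis=0)


appended_x, appended_y = collect_by_appending(datagen_frames(5))
stashed_x, stashed_y = collect_by_stashing(datagen_frames(5))
```

Both functions return the same arrays. The difference is that repeated appends copy an ever-growing array on each frame, while the stashed version copies each frame once. Note that `np.concatenate` raises a `ValueError` on an empty list, so a caller that may see zero frames would need to guard for that case.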
mseryn approved these changes on Apr 13, 2020
Changes look good. Performance speedup is useful.
Before this patch, p2b1_baseline_keras2.py on Haswell (Cooley at Argonne: 2x E5-2620v3, 384 GB RAM, K80 GPU) runs in 4590 seconds.
After this patch, it runs in 3555 seconds, a ~23% reduction in runtime.
In situations with limited memory bandwidth (such as when using Optane DC Memory, or external memory via the RAN project at Argonne), the patch would likely have a significantly larger impact.
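For reference, the ~23% figure can be reproduced from the two timings reported above. The variable names in this short check are illustrative only:

```python
before_s = 4590  # reported runtime before the patch, in seconds
after_s = 3555   # reported runtime after the patch, in seconds

# Fraction of the original runtime that was saved.
time_reduction = (before_s - after_s) / before_s

# Equivalent speedup factor (old runtime divided by new runtime).
speedup_factor = before_s / after_s

print(f"runtime reduction: {time_reduction:.1%}")
print(f"speedup factor: {speedup_factor:.2f}x")
```

The reduction comes out at about 22.5%, which rounds to the quoted ~23%. Expressed as a ratio, the patched run is about 1.29x faster.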