Stash datafile numpy arrays and concatenate once by zackcornelius · Pull Request #57 · ECP-CANDLE/Benchmarks

zackcornelius · 2020-01-28T22:00:18Z

Avoid appending to xt_all and yt_all during datagen by stashing the xt and yt arrays in Python lists.

Concatenate all the xt and yt arrays after all datagen frames have been processed, so that the memory copy is triggered only once.
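The two patterns can be sketched side by side. This is a minimal illustration, not the code from the patch: `make_frame`, the array shapes, and the use of `np.append` for the old behavior are assumptions made for the example. Only the names `xt`, `yt`, `xt_all`, and `yt_all` come from the description above.

```python
import numpy as np


def make_frame(i, rows=4, cols=3):
    """Hypothetical stand-in for one datagen frame; returns (xt, yt)."""
    rng = np.random.default_rng(i)
    xt = rng.random((rows, cols))
    yt = rng.random((rows, cols))
    return xt, yt


def collect_by_append(n_frames, cols=3):
    """Assumed 'before' pattern: grow xt_all and yt_all on every frame.

    np.append allocates a new array and copies everything accumulated
    so far each time it is called, so the total copying grows
    quadratically with the number of frames.
    """
    xt_all = np.empty((0, cols))
    yt_all = np.empty((0, cols))
    for i in range(n_frames):
        xt, yt = make_frame(i, cols=cols)
        xt_all = np.append(xt_all, xt, axis=0)
        yt_all = np.append(yt_all, yt, axis=0)
    return xt_all, yt_all


def collect_by_stash(n_frames, cols=3):
    """'After' pattern: stash per-frame arrays in lists, concatenate once.

    Appending to a Python list only stores a reference, so the array
    data is copied a single time, inside np.concatenate.
    """
    xt_list, yt_list = [], []
    for i in range(n_frames):
        xt, yt = make_frame(i, cols=cols)
        xt_list.append(xt)
        yt_list.append(yt)
    xt_all = np.concatenate(xt_list, axis=0)
    yt_all = np.concatenate(yt_list, axis=0)
    return xt_all, yt_all
```

Both functions return the same arrays; only the number of copies differs. Note that `np.concatenate` raises a `ValueError` on an empty list, so the stash variant assumes at least one frame was processed.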

Before this patch, p2b1_baseline_keras2.py on Haswell (Cooley at Argonne - E5-2620v3 x2, 384 GB RAM, K80 GPU) runs in 4590 seconds.

After this patch, it runs in 3555 seconds, a ~23% reduction in runtime (about 1.29x faster).

In situations with limited memory bandwidth (such as when using Optane DC Memory, or external memory via the RAN project at Argonne), this would have a significantly higher impact.


mseryn

Changes look good. Performance speedup is useful.

Stash datafile numpy arrays and concatenate once

cdd9c47

Avoid appending to xt_all and yt_all during datagen by stashing xt and yt arrays in a Python list. Concatenate all the xt and yt arrays after all datagen frames have been processed, to trigger memcopy only once.

mseryn approved these changes Apr 13, 2020

zackcornelius wants to merge 1 commit into ECP-CANDLE:master from zackcornelius:p2b1_datagen_concat
