OpenCL will allow overlap. Need to complete this for:
- hand-coded kernels (easy)
- ViennaCL (hard)
- CPU (use libnbc to get the MPI_Ialltoallv support introduced in MPI-3)