JL-Money/CUDA_conccurent_streams_example

CUDA Examples with Concurrent Streams

A short example showing how to take a dummy encryption/decryption program from a CPU-only baseline to a GPU version that overlaps memory copies with kernel execution (copy-compute overlap). The speedup depends on the hardware used.

Instructions

First, make sure that CUDA and the Nsight tools are properly installed.

Baseline

cd baseline_cipher

Run make baseline in a terminal to compile the code. The first time you run this command, it is slow because the CPU does all of the encoding and caches the encoded result to a file.

Type make clean to remove the results of compilation.

Streaming

cd all_cuda_streams

Run make streams to compile. This version uses the GPU for both encoding and decoding the data, so it never reads from the cached file.

Run make profile to generate an Nsight Systems report file that can be opened in Nsight Systems. The report's timeline should show the compute kernel calls executing in parallel.

Type make clean to remove the results of compilation.
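
The copy-compute overlap that the streaming build demonstrates can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the kernel name shift_cipher, the byte-shift "cipher", the stream count, and the chunk size are all hypothetical choices made for the example. The idea is to split the buffer into chunks and issue each chunk's copy-in, kernel, and copy-out on its own stream, so that one chunk's transfers can overlap another chunk's compute on hardware that supports it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for a cipher kernel: shifts every byte by `key`.
__global__ void shift_cipher(unsigned char *data, size_t n, unsigned char key)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] = (unsigned char)(data[i] + key);
}

// Print the CUDA error and bail out of main() on failure.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t e = (call);                                       \
        if (e != cudaSuccess) {                                       \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(e));                           \
            return 1;                                                 \
        }                                                             \
    } while (0)

int main()
{
    const int    kStreams = 4;        // assumed number of streams/chunks
    const size_t kChunk   = 1 << 20;  // assumed bytes per chunk
    const size_t kTotal   = kStreams * kChunk;
    const unsigned char key = 3;

    unsigned char *h_buf = nullptr, *d_buf = nullptr;
    // Pinned (page-locked) host memory lets async copies overlap with kernels.
    CHECK(cudaMallocHost(&h_buf, kTotal));
    CHECK(cudaMalloc(&d_buf, kTotal));
    for (size_t i = 0; i < kTotal; ++i) h_buf[i] = (unsigned char)(i & 0xFF);

    cudaStream_t streams[kStreams];
    for (int s = 0; s < kStreams; ++s) CHECK(cudaStreamCreate(&streams[s]));

    const int threads = 256;
    const int blocks  = (int)((kChunk + threads - 1) / threads);

    // Operations within one stream run in order; different streams may overlap.
    for (int s = 0; s < kStreams; ++s) {
        const size_t off = s * kChunk;
        CHECK(cudaMemcpyAsync(d_buf + off, h_buf + off, kChunk,
                              cudaMemcpyHostToDevice, streams[s]));
        shift_cipher<<<blocks, threads, 0, streams[s]>>>(d_buf + off, kChunk, key);
        CHECK(cudaGetLastError());
        CHECK(cudaMemcpyAsync(h_buf + off, d_buf + off, kChunk,
                              cudaMemcpyDeviceToHost, streams[s]));
    }
    CHECK(cudaDeviceSynchronize());   // wait for all streams to finish

    // Check the result on the host against the expected shifted bytes.
    size_t mismatches = 0;
    for (size_t i = 0; i < kTotal; ++i)
        if (h_buf[i] != (unsigned char)((i & 0xFF) + key)) ++mismatches;
    printf("mismatches: %zu\n", mismatches);

    for (int s = 0; s < kStreams; ++s) CHECK(cudaStreamDestroy(streams[s]));
    CHECK(cudaFree(d_buf));
    CHECK(cudaFreeHost(h_buf));
    return mismatches == 0 ? 0 : 1;
}
```

Assuming the sketch is saved as overlap_sketch.cu (a hypothetical file name), it can be compiled with nvcc -o overlap_sketch overlap_sketch.cu. How much overlap actually appears in a profile depends on the GPU, for example on how many copy engines it has.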
