CPU reference implementations for the CS 259 mini-project.
| File | Description |
|---|---|
| `attention.cpp` | Standard and flash attention (prefill + decode) |
| `conv.cpp` | Direct convolution (VGG Conv1 and Conv2) |
| `timing.h` | Timing helper used by both |
```sh
make          # builds ./attention and ./conv
./attention
./conv
```

Requires GCC with OpenMP (`-fopenmp`) and C++17.
Attention — head dimension D=64:
| Case | Sequence length S (prefill) / KV cache length C (decode) |
|---|---|
| Prefill | 4096, 65536 |
| Decode | 4096, 65536 |
`attention.cpp` provides three implementations:

- `standard_prefill` — two-pass softmax, allocates an O(S) scores buffer per query
- `flash_prefill` — online softmax, single pass, no scores buffer (see the sketch below)
- `standard_decode` — two-pass softmax for a single query vs the KV cache
For S=65536, `standard_prefill` is skipped (O(S²) is too slow).
Flash attention is provided as a reference; you may adapt it for your GPU kernel.
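For orientation, here is a minimal sketch of the online-softmax update that flash attention relies on, shown for a single query against a cache of length C with D=64. The function and variable names (`online_softmax_attention`, `m`, `l`, `acc`) and the row-major layouts are illustrative assumptions, not the identifiers used in `attention.cpp`.

```cpp
// Single-query online-softmax attention sketch (decode-style), D = 64.
// One pass over K/V, keeping a running max `m`, running normalizer `l`,
// and an unnormalized accumulator `acc` instead of an O(C) scores buffer.
#include <algorithm>
#include <cmath>
#include <vector>

constexpr int D = 64;

void online_softmax_attention(const float* q,   // [D]
                              const float* K,   // [C][D], row-major (assumed layout)
                              const float* V,   // [C][D], row-major (assumed layout)
                              int C,
                              float* out) {     // [D]
    const float scale = 1.0f / std::sqrt(static_cast<float>(D));
    float m = -INFINITY;              // running max of scores seen so far
    float l = 0.0f;                   // running sum of exp(score - m)
    std::vector<float> acc(D, 0.0f);  // running sum of exp(score - m) * V[j]

    for (int j = 0; j < C; ++j) {
        // score = q . K[j] / sqrt(D)
        float s = 0.0f;
        for (int d = 0; d < D; ++d) s += q[d] * K[j * D + d];
        s *= scale;

        float m_new = std::max(m, s);
        float correction = std::exp(m - m_new);  // rescale previously accumulated state
        float p = std::exp(s - m_new);

        l = l * correction + p;
        for (int d = 0; d < D; ++d)
            acc[d] = acc[d] * correction + p * V[j * D + d];
        m = m_new;
    }
    for (int d = 0; d < D; ++d) out[d] = acc[d] / l;
}
```

The two-pass standard version would instead compute and store all C scores, find the max and normalizer, and only then form the weighted sum of V; the online form folds that second pass into the running `correction` factor.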
Convolution — 3×3 kernel, stride 1:
| Layer | Feature map (Ny × Nx) | Ni (input channels) | Nn (output channels) | B (batch) |
|---|---|---|---|---|
| Conv1 | 224×224 | 64 | 64 | 1, 16 |
| Conv2 | 14×14 | 512 | 512 | 1, 16 |
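For reference, a minimal sketch of the direct-convolution loop nest these sizes describe, assuming NCHW-style layouts, unit stride, and no padding; the function name, array layouts, and loop order are illustrative and may differ from `conv.cpp`.

```cpp
// Direct 3x3 convolution, stride 1, no padding (output is (Ny-2) x (Nx-2)).
// in:  [B][Ni][Ny][Nx], w: [Nn][Ni][3][3], out: [B][Nn][Ny-2][Nx-2]
// Layouts and loop order are assumptions for this sketch.
#include <cstddef>

void conv3x3(const float* in, const float* w, float* out,
             int B, int Ni, int Nn, int Ny, int Nx) {
    const int Oy = Ny - 2, Ox = Nx - 2;
    #pragma omp parallel for collapse(2)
    for (int b = 0; b < B; ++b) {
        for (int n = 0; n < Nn; ++n) {
            for (int y = 0; y < Oy; ++y) {
                for (int x = 0; x < Ox; ++x) {
                    float sum = 0.0f;
                    // Accumulate over all input channels and the 3x3 window.
                    for (int i = 0; i < Ni; ++i) {
                        for (int ky = 0; ky < 3; ++ky) {
                            for (int kx = 0; kx < 3; ++kx) {
                                std::size_t in_idx =
                                    ((static_cast<std::size_t>(b) * Ni + i) * Ny + (y + ky)) * Nx + (x + kx);
                                std::size_t w_idx =
                                    ((static_cast<std::size_t>(n) * Ni + i) * 3 + ky) * 3 + kx;
                                sum += in[in_idx] * w[w_idx];
                            }
                        }
                    }
                    out[((static_cast<std::size_t>(b) * Nn + n) * Oy + y) * Ox + x] = sum;
                }
            }
        }
    }
}
```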