FP8 GEMM 8wave ping-pong kernel#525

Open
amd-cgilli wants to merge 9 commits into ROCm:main from amd-cgilli:fp8_8wave_ping_pong

Conversation

@amd-cgilli
Contributor

Adds an 8-wave ping-pong FP8 GEMM kernel. It outperforms the 4-wave kernel on the CI shapes.

Comment thread tests/kernels/test_fp8_gemm_rowscale.py
Comment thread tests/kernels/test_fp8_gemm_rowscale.py Outdated
Comment thread kernels/fp8_gemm_utils.py Outdated

2 participants