fix: add sm_89 (Ada Lovelace) gencode and fix Python/dep versions#871

Open
MasahiroOgawa wants to merge 1 commit into state-spaces:main from MasahiroOgawa:fix/sm89-pyproject

@MasahiroOgawa
Summary

  • Add -gencode arch=compute_89,code=sm_89 for RTX 40xx (Ada Lovelace) GPUs under CUDA >= 11.8
  • Bump requires-python to >=3.10 (quack-kernels requires it)
  • Sync quack-kernels to 0.3.4 to match setup.py

Context

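RTX 40xx GPUs (RTX 4060/4070/4080/4090) use the Ada Lovelace architecture, which has compute capability 8.9 (sm_89). Without this gencode flag the build contains no native sm_89 code, so these GPUs fall back to code compiled for an older 8.x target (sm_87 is the nearest one in the existing flag list) instead of running natively compiled sm_89 kernels.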

CUDA 11.8+ supports sm_89, so the flag is added in the same block as sm_90.
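As a rough illustration of the gating described above, the flag selection could be sketched as follows. This is not the repository's actual setup.py code: `BASE_ARCHS`, `CUDA_11_8_ARCHS`, and `gencode_flags` are hypothetical names, and the baseline list is taken from the targets named in the commit message.

```python
# Hypothetical sketch; names and the baseline list are assumptions,
# not the repository's actual setup.py code.
BASE_ARCHS = ["75", "80", "87"]    # targets the commit message lists as already present
CUDA_11_8_ARCHS = ["89", "90"]     # sm_89 (added by this PR) and the existing sm_90


def gencode_flags(cuda_version):
    """Return nvcc -gencode arguments for a (major, minor) CUDA toolkit version."""
    archs = list(BASE_ARCHS)
    # Tuples compare element by element, so (12, 8) >= (11, 8) and (11, 7) < (11, 8).
    if cuda_version >= (11, 8):
        archs += CUDA_11_8_ARCHS
    flags = []
    for arch in archs:
        flags += ["-gencode", f"arch=compute_{arch},code=sm_{arch}"]
    return flags
```

With a toolkit version below 11.8 the sketch emits only the baseline targets, so sm_89 and sm_90 appear or disappear together.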

Test plan

  • Built mamba-ssm from source on RTX 4080 Laptop GPU with CUDA 12.8
  • Verified selective_scan_cuda extension loads correctly with sm_89
  • Ran pretrained model inference (mamba-130m, mamba-370m) successfully
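A check along the lines of the second test-plan item might look like the sketch below. The helper names (`sm_name`, `extension_loads`, `report`) are hypothetical, and the sketch assumes PyTorch is installed with a visible CUDA device. It only confirms that the GPU reports capability 8.9 and that the extension imports; it does not prove that the sm_89 binary is the one being executed.

```python
import importlib


def sm_name(capability):
    """Format a (major, minor) compute capability as an sm_XY string."""
    major, minor = capability
    return f"sm_{major}{minor}"


def extension_loads(module_name):
    """Return True if the named module imports without raising ImportError."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


def report(module_name="selective_scan_cuda"):
    """Print the device's sm name and whether the extension imports (needs torch + a GPU)."""
    import torch  # imported first so its shared libraries are loaded before the extension

    capability = torch.cuda.get_device_capability()
    print(sm_name(capability), extension_loads(module_name))
```

On an Ada Lovelace GPU, `sm_name(torch.cuda.get_device_capability())` should give `sm_89`.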

🤖 Generated with Claude Code

- Add -gencode arch=compute_89,code=sm_89 for RTX 40xx series (Ada
  Lovelace) under CUDA >= 11.8, alongside existing sm_75/80/87/90
- Bump requires-python to >=3.10 (quack-kernels requires it)
- Sync quack-kernels to 0.3.4 to match setup.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>