Skip selective_scan_cuda build on ROCm/HIP #915
Open
ChrisLundquist wants to merge 1 commit into state-spaces:main from
The selective_scan_cuda CUDA extension (used by Mamba-1) does not compile on ROCm. Mamba-2 and Mamba-3 use Triton kernels and do not need this extension. Skip building it on HIP so that `pip install mamba-ssm` works on ROCm without requiring MAMBA_SKIP_CUDA_BUILD=TRUE.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Skip building the `selective_scan_cuda` CUDA extension on ROCm/HIP builds, since it does not compile on HIP.

Problem
Installing `mamba-ssm` on ROCm currently requires a workaround: setting `MAMBA_SKIP_CUDA_BUILD=TRUE`. Without this flag, `setup.py` tries to compile `csrc/selective_scan/*.cu` with hipcc, which fails because the CUDA kernels use APIs without HIP equivalents.

Fix
`setup.py` already detects HIP builds via `HIP_BUILD = bool(torch.version.hip)`. This PR wraps the `selective_scan_cuda` extension in `if not HIP_BUILD:`, so ROCm installs get a pure-Python + Triton wheel automatically.

The `selective_scan_interface.py` module already handles `selective_scan_cuda = None` gracefully via `try`/`except`.

Tested on: AMD Radeon RX 9070 XT (gfx1201, RDNA4), ROCm 7.2.1, PyTorch 2.12.0.dev+rocm7.2
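For reviewers who want the shape of the change without opening the diff, here is a minimal sketch of the two pieces described under Fix. It is not the actual `setup.py` or `selective_scan_interface.py` code. The function names `select_ext_modules` and `load_selective_scan_cuda` are hypothetical, the extension is represented by its name rather than a real build object, and `hip_version` stands in for `torch.version.hip` so the sketch runs without PyTorch installed. The `skip_cuda_build` parameter is assumed to correspond to the `MAMBA_SKIP_CUDA_BUILD=TRUE` workaround.

```python
from typing import List, Optional


def select_ext_modules(hip_version: Optional[str],
                       skip_cuda_build: bool = False) -> List[str]:
    """Return the names of the native extensions a build would compile.

    hip_version stands in for torch.version.hip (assumption for this sketch):
    None on CUDA builds of PyTorch, a version string on ROCm builds.
    """
    # Mirrors HIP_BUILD = bool(torch.version.hip) from the PR description.
    hip_build = bool(hip_version)

    ext_modules: List[str] = []
    # Only queue the CUDA-only extension when the build is not skipped
    # explicitly and is not targeting HIP.
    if not skip_cuda_build and not hip_build:
        ext_modules.append("selective_scan_cuda")
    return ext_modules


def load_selective_scan_cuda():
    """Import the compiled extension if it was built, otherwise return None."""
    try:
        import selective_scan_cuda  # present only when the extension was compiled
    except ImportError:
        selective_scan_cuda = None
    return selective_scan_cuda


if __name__ == "__main__":
    print(select_ext_modules(None))     # CUDA build of PyTorch
    print(select_ext_modules("7.2.1"))  # ROCm build of PyTorch
```

With this gating, a ROCm install ends up with an empty extension list, and callers that check for `None` after the guarded import can fall back to the Triton paths.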
Relates to #65 (ROCm support), #914 (Mamba-3 Triton kernel fix)
🤖 Generated with Claude Code