@mahdiehghazim
Contributor

This change makes MSCCL++ automatically select CUDA architectures based on the build environment. If an NVIDIA GPU is detected, the build targets the native GPU architecture for optimal performance; otherwise, it falls back to building for multiple architectures for portability. When building for the native architecture, FP8 support is automatically enabled for “a-series” GPUs (e.g., sm_100a), allowing the appropriate optimized code paths to be picked up.

Contributor

Copilot AI left a comment

Pull request overview

This PR implements automatic CUDA architecture selection for MSCCL++ builds. When an NVIDIA GPU is detected at build time, the build targets the native GPU architecture for optimal performance; otherwise, it falls back to building for multiple architectures (80, 90, 100, and 120, depending on the CUDA toolkit version) for portability. When building for the native architecture, FP8 support is enabled automatically for "a-series" GPUs.

Changes:

  • Modified architecture selection logic to use "native" when an NVIDIA GPU is detected
  • Preserved multi-architecture fallback when no GPU is present
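
For readers who want the decision logic spelled out, here is a minimal Python sketch of the selection described above. It is illustrative only and is not the build code in this PR. The function names and the CUDA version thresholds (11.8 for 90, 12.8 for 100 and 120) are assumptions made for this example, and the "a-series" check is a simple suffix test that stands in for whatever the build actually does.

```python
from typing import List, Tuple


def select_cuda_architectures(gpu_detected: bool,
                              cuda_version: Tuple[int, int]) -> List[str]:
    """Return the architecture list a build might target.

    Hypothetical helper: mirrors the behaviour described in the PR, with
    version thresholds that are assumptions, not values from the PR.
    """
    if gpu_detected:
        # A GPU is present, so target whatever the local device is.
        return ["native"]

    # No GPU: build for several architectures for portability.
    archs = ["80"]
    if cuda_version >= (11, 8):   # assumed threshold for 90
        archs.append("90")
    if cuda_version >= (12, 8):   # assumed threshold for 100 and 120
        archs += ["100", "120"]
    return archs


def is_a_series(arch: str) -> bool:
    """True for architecture names such as 'sm_100a' (hypothetical check)."""
    return arch.startswith("sm_") and arch.endswith("a")


if __name__ == "__main__":
    print(select_cuda_architectures(True, (12, 8)))
    print(select_cuda_architectures(False, (12, 8)))
    print(is_a_series("sm_100a"))
```

The sketch keeps the two concerns separate: choosing the target list, and deciding whether a detected native architecture is one for which FP8 code paths would be enabled.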
