ci: handle ROCm qwen3 30B A3B configs #1162

Open
sreerohi wants to merge 1 commit into radixark:main from sreerohi:rocm/qwen3-30b-a3b-configs

@sreerohi

  • Split CONFIGS in tests/e2e/megatron/test_qwen3_30B_A3B.py into ROCm vs CUDA branches via an IS_ROCM runtime check (torch.version.hip).
  • On ROCm, drop the DeepEP, FP8, and INT4 variants (NVIDIA-only dependencies, TP-shard alignment, and CUDA-only kernels, respectively) and keep one non-bridge config and one bridge config.
  • Set SGLANG_USE_AITER=0 on ROCm so SGLang falls back to Triton for MoE GEMM (aiter CK kernels lack instances for this model's per-rank expert dims).
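
For readers skimming the description, the three points above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the config entries, the `rocm_ok` field, and the helpers `select_configs` and `rollout_env` are hypothetical names chosen for the example. Only `torch.version.hip` and the `SGLANG_USE_AITER` variable come from the description.

```python
import os

try:
    import torch
    # torch.version.hip is a version string on ROCm builds and None otherwise.
    HIP_VERSION = getattr(torch.version, "hip", None)
except ImportError:  # lets the sketch run where PyTorch is not installed
    HIP_VERSION = None

IS_ROCM = HIP_VERSION is not None

# Hypothetical config entries; the real CONFIGS in the test file may differ.
ALL_CONFIGS = [
    {"name": "baseline", "bridge": False, "rocm_ok": True},
    {"name": "bridge", "bridge": True, "rocm_ok": True},
    {"name": "deepep", "bridge": False, "rocm_ok": False},  # NVIDIA-only dependency
    {"name": "fp8", "bridge": False, "rocm_ok": False},     # TP-shard alignment
    {"name": "int4", "bridge": False, "rocm_ok": False},    # CUDA-only kernels
]


def select_configs(is_rocm):
    """Return the configs to run: a reduced set on ROCm, everything otherwise."""
    if is_rocm:
        return [c for c in ALL_CONFIGS if c["rocm_ok"]]
    return list(ALL_CONFIGS)


def rollout_env(is_rocm, base_env=None):
    """Copy the environment and, on ROCm only, disable aiter for SGLang."""
    env = dict(os.environ if base_env is None else base_env)
    if is_rocm:
        env["SGLANG_USE_AITER"] = "0"
    return env


CONFIGS = select_configs(IS_ROCM)
```

Passing `is_rocm` as an argument instead of reading the module-level flag keeps both branches checkable on any machine, whichever backend PyTorch was built for.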

Relates to #1105

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces ROCm support for the qwen3-30B-A3B end-to-end Megatron test. It implements environment detection for ROCm and conditionally adjusts test configurations, disabling CUDA-specific features such as DeepEP, FP8, and INT4 rollouts when running on ROCm hardware. Additionally, it sets a fallback to Triton for MoE GEMM kernels on ROCm and temporarily disables a specific DeepEP test variant for NVIDIA. I have no feedback to provide.

@sreerohi sreerohi force-pushed the rocm/qwen3-30b-a3b-configs branch from 39c0ea3 to 9bcad88 Compare May 20, 2026 22:41