[Diffusion] Intel XPU: runtime integration for distributed and stream management#2

Open
xiangyuT wants to merge 3 commits into main from xpu_dev

xiangyuT (Collaborator) commented Apr 16, 2026

Motivation

This PR builds on the XPU platform foundation merged in sgl-project#17920, adding the runtime-level changes needed to actually run diffusion inference on Intel XPU (Arc Pro B-series, etc.) with tensor parallelism support.

sgl-project#17920 added the platform detection (XpuPlatform), attention backend (xpu_backend.py), platform plugin registration, and basic sgl_kernel integration. This PR addresses the remaining gaps discovered during end-to-end testing on Intel Arc Pro B60 GPUs.

Test Results

Tested on Intel Arc Pro B60 with Z-Image-Turbo (BF16, 9-step turbo schedule, prompt="A golden retriever in the snow"):

TP=1:
image

TP=2:
image

Notes

  • All changes are gated behind current_platform.is_xpu() checks, so the CUDA/ROCm/NPU paths are unaffected.
  • The XCCL workarounds (AVG, all-to-all, batch P2P) address known issues that are tracked upstream in PyTorch; they can be removed once those issues are fixed.
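
The PR description does not show the workaround code itself. As a rough illustration of the pattern the notes describe, the sketch below gates an AVG-reduction workaround behind a platform check and emulates AVG as SUM followed by a division by the world size. It is plain Python with ranks simulated as lists, so it runs without a GPU; `FakePlatform`, `all_reduce_sum`, and `all_reduce_avg` are hypothetical names, not the PR's or PyTorch's API, and only `is_xpu()` mirrors a name from the PR.

```python
from dataclasses import dataclass


@dataclass
class FakePlatform:
    """Hypothetical stand-in for current_platform; only is_xpu() mirrors the PR."""
    name: str

    def is_xpu(self) -> bool:
        return self.name == "xpu"


def all_reduce_sum(per_rank):
    """Simulated SUM all-reduce: every rank receives the element-wise sum."""
    total = [sum(vals) for vals in zip(*per_rank)]
    return [list(total) for _ in per_rank]


def all_reduce_avg(per_rank, platform):
    """Return the element-wise mean on every rank.

    Assumption: on XPU a native AVG op is unavailable, so it is emulated as
    SUM followed by a division by the world size. Other platforms would call
    their native AVG; here that path is simulated with the same arithmetic.
    """
    world_size = len(per_rank)
    if platform.is_xpu():
        summed = all_reduce_sum(per_rank)
        return [[v / world_size for v in rank] for rank in summed]
    # Non-XPU path (simulated): stands in for a native AVG reduction.
    mean = [sum(vals) / world_size for vals in zip(*per_rank)]
    return [list(mean) for _ in per_rank]


ranks = [[1.0, 2.0], [3.0, 6.0]]  # two simulated ranks (TP=2)
xpu_out = all_reduce_avg(ranks, FakePlatform("xpu"))
cuda_out = all_reduce_avg(ranks, FakePlatform("cuda"))
```

Keeping the emulation inside the `is_xpu()` branch means the workaround can be deleted later without touching the other platforms' code path.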

xiangyuT force-pushed the xpu_dev branch 2 times, most recently from c4dd99c to 370cf47 (May 14, 2026 00:57)
xiangyuT and others added 3 commits May 14, 2026 01:04
Upstream main already has the xpu_platform_plugin function and the
"xpu" entry in the builtin_platform_plugins dict since PR sgl-project#17920.

The all_to_all_4D method calls self._maybe_wait() on the output of
ft_c.all_to_all_single, but the method was never defined, causing an
AttributeError at runtime during multi-GPU inference.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
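
The commit message describes the failure but not the fix itself. A helper of this kind usually waits on a collective's result only when that result is asynchronous, and returns anything else unchanged. The sketch below is duck-typed plain Python: `FakeAsyncResult` and `SequenceParallelComm` are hypothetical stand-ins, and the actual `_maybe_wait` in the PR may instead check for a specific PyTorch async tensor type.

```python
class FakeAsyncResult:
    """Hypothetical stand-in for an async collective output with a wait() method."""

    def __init__(self, value):
        self._value = value
        self.waited = False

    def wait(self):
        self.waited = True
        return self._value


class SequenceParallelComm:
    """Hypothetical owner class; only _maybe_wait mirrors a name from the PR."""

    def _maybe_wait(self, result):
        # Wait only if the object exposes a callable wait(); otherwise pass through.
        wait = getattr(result, "wait", None)
        return wait() if callable(wait) else result


comm = SequenceParallelComm()
pending = FakeAsyncResult([1, 2, 3])
resolved = comm._maybe_wait(pending)        # async-style result: wait() is called
passthrough = comm._maybe_wait([4, 5, 6])   # plain result: returned unchanged
```

Defining the helper on the same class that calls it is what removes the AttributeError; whether waiting is needed at all depends on whether the collective returns an async object.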