Bump sglang to v0.5.12#1164

Merged: yueming-yuan merged 24 commits into main from bump_sglang_v0.5.12 on May 22, 2026

yueming-yuan (Collaborator) commented May 21, 2026

ci-sglang-pr: sglang-miles-v0.5.12
ci-image-tag: sglang-miles-v0.5.12

Summary

  • Point the miles Dockerfile at lmsysorg/sglang:v0.5.12-cu129 by default.
  • Point source checkout to sglang-miles-v0.5.12.
  • Use rebuilt CUDA 12.9 wheels from yueming-yuan/miles-wheels:cu129-x86_64-v0.5.12 so flash-attn 2 matches the torch 2.11/cu129 ABI.
  • Update CUDA 13 example image tags to v0.5.12-cu130.

SGLang migration

  • Pushed sgl-project/sglang:sglang-miles-v0.5.12 rebased on upstream v0.5.12.
  • Dropped custom changes already covered by upstream v0.5.12 where applicable.

Wheels

TODO

Local checks

  • pre-commit run --all-files on the rebased sglang branch
  • targeted python3 -m compileall on conflict-touched sglang files
  • git diff --check in miles
  • verified the new miles-wheels release has all 7 expected assets
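
The targeted byte-compilation check listed above could be scripted along these lines. This is a minimal sketch, not the command actually run for this PR: the helper name `compile_touched` is hypothetical, and it assumes the conflict-touched files have already been collected into a list of `.py` paths.

```python
import compileall


def compile_touched(paths):
    """Byte-compile each given .py file and return the paths that failed.

    Hypothetical helper: `paths` stands in for the conflict-touched
    sglang files, which the PR description does not enumerate.
    """
    failed = []
    for path in paths:
        # compile_file returns a truthy value on success and a falsy one
        # when the file has a syntax error; quiet=1 prints only errors.
        ok = compileall.compile_file(str(path), quiet=1, force=True)
        if not ok:
            failed.append(str(path))
    return failed
```

An empty return value means every listed file compiled. Note that `compileall.compile_file` silently skips paths that are not existing `.py` files, so a typo in a path would not be reported as a failure.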

Remote build check

  • rcli job new --dockerfile docker/Dockerfile --build-context /home/radixark/miles-bump-sglang-v0.5.12-context --server ion-user-7 --name bump-sglang-v0.5.12-test
  • Created running job yueming.yuan-bump-sglang-v0.5.12-test
  • Verified container imports sglang and torch; sglang checkout HEAD is a4202a8
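
The PR does not show how the import verification was performed. One possible shape for such a smoke check, as a hedged sketch, is below; the function name `smoke_import` is hypothetical, and reading `__version__` is an assumption about what the packages expose.

```python
import importlib


def smoke_import(module_names):
    """Try importing each module by name.

    Returns a dict mapping each name to its version string ("unknown" if
    the module has no __version__ attribute), or None if the import fails.
    Hypothetical helper, not taken from the PR.
    """
    results = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            results[name] = None
            continue
        results[name] = getattr(module, "__version__", "unknown")
    return results
```

Inside the built container, calling `smoke_import(["sglang", "torch"])` and checking that neither value is `None` would cover the import half of the check; confirming the checkout HEAD would still need a separate `git` query.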

gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request updates the SGLANG image tag and branch version from v0.5.10 to v0.5.12 within the Dockerfile. I have no feedback to provide as there are no review comments to evaluate.

yueming-yuan marked this pull request as ready for review May 22, 2026 21:07

guapisolo (Collaborator) left a comment

CI modifications and TITO test changes LGTM.

maocheng23 (Contributor) left a comment

Overall LGTM, only NITs.


# Setup Ring Flash Attention with CP group from mesh (only when cp_size > 1)
if cp_size > 1:
    from ring_flash_attn import substitute_hf_flash_attn

Why this change?

position_ids = batch["position_ids"]

if get_parallel_state().cp.size > 1:
    from ring_flash_attn import update_ring_flash_attn_params

Why this change?

yueming-yuan merged commit ae083c5 into main May 22, 2026
9 checks passed
yueming-yuan deleted the bump_sglang_v0.5.12 branch May 22, 2026 22:25

3 participants