Fix SM70 docker build, CUDA 13.0 compat conflict, and README docker args #43

Draft
titidatiti wants to merge 39 commits into 1CatAI:main from titidatiti:main
@titidatiti

What this PR does / why we need it

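While deploying the image on a Tesla V100 (SM70) with a newer host driver (for example, CUDA 13.0), I ran into a build failure, a runtime Error 803, and vLLM parameters that did not take effect. None of these occur with a local deployment; they appear only when deploying with Docker. This PR aims to fix all three: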

  1. [Bugfix] Fix the SM70 AWQ kernel build failure
    The csrc-build stage of docker/Dockerfile did not copy the lmdeploy sources, so CMake decided to skip building the SM70 kernels (issue: https://github.com/1CatAI/1Cat-vLLM/issues/42#issuecomment-4467796333). This commit adds COPY lmdeploy lmdeploy/ to that stage.

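  2. [Core] Remove the hardcoded cuda-compat library configuration
    Commented out the two commands in docker/Dockerfile that force-write /etc/ld.so.conf.d/cuda-compat.conf. When the host GPU driver is newer (for example, the 580 series), force-loading the older compat library overrides the dynamic selection done by nvidia-container-toolkit, and the container fails at startup with Error 803 (system has unsupported display driver / cuda driver combination). With these lines removed, the toolkit handles the compatibility mounts for old and new drivers itself, which is more robust.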

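  3. [Doc] Correct how docker launch parameters are passed in the README
    Fixed the Docker launch configuration in README.md. The original documentation passed core parameters such as model and tensor_parallel_size as environment variables (-e VLLM_MODEL=...), but the vllm serve entrypoint only recognizes CLI arguments. As a result, the container silently ignored the user's settings and loaded the source default, Qwen3-0.6B. These have been replaced with the correct CLI argument syntax.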
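
To illustrate item 3, here is a minimal sketch of passing the settings as CLI arguments after the image name instead of as `-e` environment variables. The image tag `1cat-vllm:local`, the model path `/models/example-awq`, the port mapping, and the helper name `build_run_cmd` are all hypothetical and not taken from this PR; `--model` and `--tensor-parallel-size` are standard vLLM options, but the exact form used in the corrected README may differ. The sketch assumes the image defines an exec-form ENTRYPOINT, so that trailing arguments are appended to it.

```shell
# Build the `docker run` command line as a string so it can be inspected
# before it is executed. Image and model names are placeholders.
build_run_cmd() {
  image="$1"   # hypothetical image tag
  model="$2"   # model ID or local path inside the container
  tp="$3"      # tensor parallel size
  # Everything after the image name is passed to the container entrypoint
  # as CLI arguments; no -e VLLM_MODEL=... style variables are used.
  printf 'docker run --rm --gpus all -p 8000:8000 %s --model %s --tensor-parallel-size %s\n' \
    "$image" "$model" "$tp"
}

RUN_CMD="$(build_run_cmd 1cat-vllm:local /models/example-awq 2)"
echo "$RUN_CMD"
```

Printing the command first makes it easy to confirm that the model and parallelism settings appear after the image name before actually starting the container.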

Test Record

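  • Verified on Ubuntu + Docker: with these changes the image builds successfully, a CUDA 13.0 host no longer reports Error 803, and the CLI arguments are read correctly so that the specified large language model is loaded.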
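
Before a full image build, the two Dockerfile changes described in items 1 and 2 could be sanity-checked with a small script. This is only a sketch: the function name `check_dockerfile` is hypothetical, it is a plain text check that does not verify which build stage the COPY line belongs to, and it treats any non-comment line mentioning `cuda-compat.conf` as an active configuration.

```shell
# check_dockerfile FILE
# Prints "ok" and returns 0 when FILE copies the lmdeploy sources and has no
# active (non-comment) line that writes cuda-compat.conf.
check_dockerfile() {
  file="$1"
  # Item 1: the build needs the lmdeploy sources to be copied in.
  grep -Eq '^[[:space:]]*COPY[[:space:]]+lmdeploy[[:space:]]+lmdeploy/' "$file" || {
    echo "missing: COPY lmdeploy lmdeploy/"
    return 1
  }
  # Item 2: drop comment lines, then look for any remaining mention of
  # cuda-compat.conf, which would mean the hardcoded config is still active.
  if grep -v '^[[:space:]]*#' "$file" | grep -q 'cuda-compat\.conf'; then
    echo "active cuda-compat.conf configuration found"
    return 1
  fi
  echo "ok"
}
```

It would be called as `check_dockerfile docker/Dockerfile` from the repository root; a non-zero exit status indicates that one of the two changes is missing.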

yangzhuxinyzx and others added 30 commits March 21, 2026 12:23
Signed-off-by: Pan-Shuhan-YMZX <263558224+Pan-Shuhan-YMZX@users.noreply.github.com>
(cherry picked from commit f8e4c58adad5561ab4cd006fdab6c9b1903eec1c)
Signed-off-by: Pan-Shuhan-YMZX <263558224+Pan-Shuhan-YMZX@users.noreply.github.com>
(cherry picked from commit 2fc562b8cfae2bb255baf097e0c71b498860c327)
Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>
Updated the WeChat group QR code image in the README.
Fixed the incorrect name
Add FLASH_ATTN_V100 runtime path, Qwen3.5/Qwen3.6 launch profiles, SM70 AWQ updates, vendored build dependencies, and public regression charts.
Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>
Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>
Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>
Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>
Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>
Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>
Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>
yangzhuxinyzx and others added 9 commits May 13, 2026 19:00
Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>
Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>
Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>
Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>
Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>
@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors.

You ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

5 participants