Fix SM70 docker build, CUDA 13.0 compat conflict, and README docker args by titidatiti · Pull Request #43 · 1CatAI/1Cat-vLLM

titidatiti · 2026-05-17T09:07:07Z

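What this PR does / why we need it

When deploying the image on a Tesla V100 (SM70) with a newer host driver (e.g. CUDA 13.0), I hit a compile failure, a runtime Error 803, and vLLM parameters that did not take effect. None of these problems occur with a local (non-Docker) deployment; they appear only when deploying with Docker. This PR aims to fix all three: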

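[Bugfix] Fix the SM70 AWQ kernel build failure:
The csrc-build stage of docker/Dockerfile did not copy the lmdeploy sources, so CMake decided to skip building the SM70 kernels (issue: https://github.com/1CatAI/1Cat-vLLM/issues/42#issuecomment-4467796333). This commit adds COPY lmdeploy lmdeploy/.
[Core] Remove the hard-coded cuda-compat library configuration:
Commented out the two commands in docker/Dockerfile that force-write /etc/ld.so.conf.d/cuda-compat.conf. When the host GPU driver is newer (e.g. the 580 series), force-loading the older compat libraries overrides the dynamic selection made by nvidia-container-toolkit, and the container fails at startup with Error 803 (system has unsupported display driver / cuda driver combination). With these lines removed, the toolkit itself handles the compatibility mounts for both old and new drivers, which is more robust.
[Doc] Correct how docker launch parameters are passed in the README:
Fixed the incorrect Docker launch configuration in README.md. The original documentation passed core parameters such as model and tensor_parallel_size as environment variables (-e VLLM_MODEL=...), but the vllm serve entrypoint natively recognizes only CLI arguments. As a result, the container silently ignored the user's settings and loaded the source default, Qwen3-0.6B. The examples now use the correct CLI argument syntax.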
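
The README change described above can be illustrated with a small sketch. It assumes, as the description states, that the image's entrypoint is `vllm serve`, so everything after the image name is appended to it as CLI arguments. The image tag `1cat-vllm:sm70`, the host path `/data/models`, and the model directory are hypothetical placeholders, not values taken from this PR. The script only prints the command; it does not run Docker.

```shell
#!/usr/bin/env bash
# Sketch: assemble a `docker run` command that passes vLLM settings as CLI
# arguments instead of `-e VLLM_*` environment variables.
# IMAGE, the volume path, and MODEL are hypothetical placeholders.
IMAGE="1cat-vllm:sm70"
MODEL="/models/your-awq-model"
TP_SIZE=2

build_docker_cmd() {
  # Arguments after the image name are appended to the image entrypoint,
  # which this sketch assumes is `vllm serve`.
  printf '%s ' docker run --rm --gpus all -p 8000:8000 \
    -v /data/models:/models \
    "$IMAGE" "$MODEL" --tensor-parallel-size "$TP_SIZE"
}

# Print the assembled command for inspection.
build_docker_cmd; echo
```

The model is given positionally and the tensor-parallel size as `--tensor-parallel-size`, both of which `vllm serve` accepts on its command line. No `-e VLLM_MODEL=...` style variables appear, since those are the settings the description reports as being silently ignored.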

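Test Record

Verified on Ubuntu + Docker: with these changes the image builds successfully, the CUDA 13.0 host no longer reports Error 803, and the CLI arguments are read correctly so that the specified large language model is loaded.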
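
As a complement to the manual verification, the two Dockerfile-side fixes can also be checked statically. The following is a minimal sketch, not part of the PR: the `check_dockerfile` helper is hypothetical, and the sample file is an illustrative stand-in rather than the real docker/Dockerfile. It treats any line starting with `#` as commented out and does not handle multi-line `RUN` continuations.

```shell
#!/usr/bin/env bash
# Sketch: static check that a Dockerfile contains the two build-side fixes.
# check_dockerfile is a hypothetical helper; the sample below is illustrative.

check_dockerfile() {
  local file="$1" status=0
  # Fix 1: the lmdeploy sources must be copied into the build stage.
  if grep -Eq '^[[:space:]]*COPY[[:space:]]+lmdeploy[[:space:]]' "$file"; then
    echo "ok: lmdeploy is copied"
  else
    echo "missing: COPY lmdeploy"; status=1
  fi
  # Fix 2: no active (uncommented) line may write cuda-compat.conf.
  if grep -v '^[[:space:]]*#' "$file" | grep -q 'cuda-compat\.conf'; then
    echo "active: cuda-compat.conf is still written"; status=1
  else
    echo "ok: cuda-compat.conf is not written"
  fi
  return "$status"
}

# Illustrative input: the COPY line is present, the compat lines are commented.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
COPY lmdeploy lmdeploy/
# RUN echo "/usr/local/cuda/compat/" > /etc/ld.so.conf.d/cuda-compat.conf
# RUN ldconfig
EOF
check_dockerfile "$sample"
rm -f "$sample"
```

A check like this only confirms that the lines are present or absent. It does not replace building the image and starting the container on a host with a newer driver, which is what the test record covers.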

github-actions · 2026-05-17T09:07:18Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run full CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

yangzhuxinyzx and others added 30 commits March 21, 2026 12:23

[Core] Import 1Cat-vLLM-0.0.2 runtime and build system

4683901

[CI/Build] Vendor lmdeploy source for standalone builds

92c6efb

[Kernel] Add validation, examples, and benchmark assets

5262499

[Doc] Publish 1Cat-vLLM-0.0.2 release snapshot

b3b1abd

[Doc] Update rebuilt wheel download links

6fd0f8d

Signed-off-by: Pan-Shuhan-YMZX <263558224+Pan-Shuhan-YMZX@users.noreply.github.com> (cherry picked from commit f8e4c58adad5561ab4cd006fdab6c9b1903eec1c)

[Bugfix] Vendor runtime Python packages for source builds

a8783b0

Signed-off-by: Pan-Shuhan-YMZX <263558224+Pan-Shuhan-YMZX@users.noreply.github.com> (cherry picked from commit 2fc562b8cfae2bb255baf097e0c71b498860c327)

[CI/Build][Doc] Add verified SM70 Docker runtime path

1e6c257

Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>

Add files via upload

f29bd45

Change WeChat group QR code image

d6c28dc

Updated the WeChat group QR code image in the README.

Update README.md

18e5223

Add files via upload

3c7a8a3

Update Dockerfile.sm70-wheel

f5d2e15

Fixed the incorrect name.

Add files via upload

feb8402

docs: update wechat group qr code

c1dce83

docs: update WeChat group QR code

82f59c8

Release 1Cat-vLLM 0.0.3

92a785c

Add FLASH_ATTN_V100 runtime path, Qwen3.5/Qwen3.6 launch profiles, SM70 AWQ updates, vendored build dependencies, and public regression charts.

Merge 1CatAI main history for 0.0.3

eea9d81

Update README.md

04bb4b7

Update README.md

7a7549c

Update README.md

6276450

Update README.md

a1bf487

docs: clarify wheel runtime directory

197f1cc

[Kernel] Add V100 FA2 fp8 KV cache audits

58ebaa6

Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>

[Core] Trim V100 startup memory defaults

3b539f9

Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>

QRcode-update

437b358

[Core] Prepare 1.0.0 V100 release

a4daad6

Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>

[Doc] Update 1.0.0 wheel install and MTP launch

761ae33

Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>

[Doc] Simplify public launch commands

0741a30

Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>

[Doc] Restore validated MTP launch profile

36536e5

Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>

[Doc] Add MTP throughput note

29b73ec

Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>

yangzhuxinyzx and others added 9 commits May 13, 2026 19:00

[Bugfix] Restore spec proposer compatibility

0ac0632

Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>

[Doc] Add TP2 MTP launch profile

05ac1a4

Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>

[Core] Archive FP8 MTP investigation state

8b536c1

Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>

docs: update WeChat group QR code

bf37452

[Kernel] Add SM70 FP8 MoE fast path

69749dd

Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>

[Doc] Credit flash-attention-v100

d18b16c

Signed-off-by: yangzhuxinyzx <153831768+yangzhuxinyzx@users.noreply.github.com>

[Bugfix] Fix SM70 kernel missing by copying lmdeploy source in Dockerfile

5c37552

[Core] Disable cuda-compat to support new host NVIDIA drivers

30f3240

[Doc] Correct docker run examples to use CLI arguments instead of env vars

d60703f

titidatiti force-pushed the main branch from 206b4a6 to d60703f Compare May 19, 2026 13:10

yangzhuxinyzx force-pushed the main branch from 63b05fc to 00323f2 Compare June 15, 2026 02:02

titidatiti wants to merge 39 commits into 1CatAI:main from titidatiti:main

5 participants