Hi,
Great work on the project!
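I’m requesting support for tensor parallelism. With tensor parallelism enabled, weight loading currently fails on worker TP1 with a shape mismatch in _rotation_weight_loader (plugin.py, line 47). Log: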
[multiproc_executor.py:800] File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/models/qwen3_5.py", line 513, in load_weights
(Worker pid=194) (Worker_TP1 pid=194) ERROR 03-20 07:48:17 [multiproc_executor.py:800] weight_loader(param, loaded_weight)
(Worker pid=194) (Worker_TP1 pid=194) ERROR 03-20 07:48:17 [multiproc_executor.py:800] File "/app/paroquant/inference/backends/vllm/plugin.py", line 47, in _rotation_weight_loader
(Worker pid=194) (Worker_TP1 pid=194) ERROR 03-20 07:48:17 [multiproc_executor.py:800] target.copy_(loaded_weight)
(Worker pid=194) (Worker_TP1 pid=194) ERROR 03-20 07:48:17 [multiproc_executor.py:800] RuntimeError: The size of tensor a (2048) must match the size of tensor b (4096) at non-singleton dimension 1
(Worker pid=193) (Worker_TP0 pid=193) INFO 03-20 07:48:17 [multiproc_executor.py:749] Parent process exited, terminating worker
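
The mismatch (2048 vs 4096 at dimension 1) looks consistent with the target parameter being split across two TP ranks while the checkpoint tensor is still full-size, though I have not confirmed that. Below is a minimal sketch of the slicing a TP-aware loader might do before the copy. The helper name `shard_bounds` and the `tp_size` / `tp_rank` variables are hypothetical, not part of the plugin or vLLM, and whether a plain contiguous slice is valid for the rotation parameters depends on how the rotations pair channels, which I have not checked.

```python
def shard_bounds(full_size: int, tp_size: int, tp_rank: int) -> tuple[int, int]:
    """Return (start, length) of one rank's contiguous slice along the sharded dim.

    Hypothetical helper: assumes the dimension is split evenly and contiguously
    across tensor-parallel ranks.
    """
    if tp_size <= 0 or not 0 <= tp_rank < tp_size:
        raise ValueError(f"invalid tp_rank={tp_rank} for tp_size={tp_size}")
    if full_size % tp_size != 0:
        raise ValueError(f"size {full_size} is not divisible by tp_size={tp_size}")
    length = full_size // tp_size
    return tp_rank * length, length


# Assumed use inside a loader, with torch.Tensor.narrow(dim, start, length):
#   start, length = shard_bounds(loaded_weight.shape[1], tp_size, tp_rank)
#   target.copy_(loaded_weight.narrow(1, start, length))

if __name__ == "__main__":
    # Sizes taken from the error above: full width 4096, two ranks.
    for rank in range(2):
        print(rank, shard_bounds(4096, 2, rank))
```

With the sizes from the log, each rank would copy a 2048-wide slice, which matches the shape the target expects.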