### Prerequisites

- [x] I am running the latest code. Mention the version if possible as well.
- [x] I carefully followed the [README.md](https://github.com/ggml-org/llama.cpp/blob/master/README.md).
- [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- [x] I reviewed the [Discussions](https://github.com/ggml-org/llama.cpp/discussions), and have a new and useful enhancement to share.

### Feature Description

I have a couple of questions:

1) Is running large vision models (VLMs) on the NPU currently supported?
2) There do not seem to be any performance benchmark results in the documentation or elsewhere, e.g. numbers for the Qwen 1B/2B models at prompt lengths of 64 or 128.

### Motivation

I would like VLM inference to be supported and accelerated.

### Possible Implementation

_No response_
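Regarding question 2, numbers like these can usually be collected locally with llama.cpp's bundled `llama-bench` tool; a minimal sketch (the model path below is a placeholder, substitute any local GGUF file such as a Qwen 1B/2B quantization):

```shell
# Sketch: measure prompt-processing throughput at prompt lengths 64 and 128.
#   -m   path to a local GGUF model (placeholder below)
#   -p   comma-separated prompt sizes to benchmark
#   -n 0 skip the token-generation benchmark, report prompt processing only
./llama-bench -m ./models/qwen-1_8b-q4_0.gguf -p 64,128 -n 0
```

The tool prints a table with tokens-per-second for each configuration, which is the kind of data the documentation could publish.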