Chat template from chat_template.jinja for all possible paths + custom chat template for Qwen3-VL thinking #4055
dkalinowski wants to merge 10 commits into main
Conversation
Pull request overview
This PR updates the LLM/VLM servable initialization flow to allow overriding the tokenizer chat template from a chat_template.jinja file located in the model path, making that override available across multiple pipeline initializers.
Changes:
- Add logic to detect and read `chat_template.jinja` from the model path and call `tokenizer.set_chat_template(...)`.
- Add the `<fstream>` include where needed to support reading the template file.
- Apply the same override behavior across the legacy LM, continuous batching LM, and legacy VLM initializers.
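The flow described above can be sketched as a small standalone helper. This is a hedged sketch, not the PR's actual code: `loadChatTemplateOverride` is a hypothetical name, and in the PR the returned content would be passed to `tokenizer.set_chat_template(...)` from OpenVINO GenAI.

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <optional>
#include <string>

// Hypothetical helper sketching the PR's flow: if chat_template.jinja exists
// in the model directory, read it whole so it can later be applied via
// tokenizer.set_chat_template(...). Names are illustrative, not the PR's code.
std::optional<std::string> loadChatTemplateOverride(const std::filesystem::path& modelDir) {
    std::filesystem::path templatePath = modelDir / "chat_template.jinja";
    if (!std::filesystem::exists(templatePath)) {
        return std::nullopt;  // no override file; keep the tokenizer's built-in template
    }
    std::ifstream file(templatePath);
    if (!file.is_open()) {
        return std::nullopt;  // present but unreadable; caller may log a warning
    }
    std::string content((std::istreambuf_iterator<char>(file)),
                        std::istreambuf_iterator<char>());
    if (content.empty()) {
        return std::nullopt;  // an empty file should not clobber the existing template
    }
    return content;
}
```

Returning `std::nullopt` in every failure case keeps the decision of whether to warn, error, or silently fall back in the caller, which matches the review discussion below about log levels.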
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/llm/visual_language_model/legacy/servable_initializer.cpp | Reads `chat_template.jinja` (if present) after creating the VLM pipeline/tokenizer and applies it to the tokenizer. |
| src/llm/language_model/legacy/servable_initializer.cpp | Reads `chat_template.jinja` (if present) after creating the legacy LLM pipeline/tokenizer and applies it to the tokenizer. |
| src/llm/language_model/continuous_batching/servable_initializer.cpp | Reads `chat_template.jinja` (if present) after creating the CB pipeline/tokenizer and applies it to the tokenizer. |
```cpp
properties->tokenizer = properties->pipeline->get_tokenizer();

// Override chat template from chat_template.jinja file if present in model directory
std::filesystem::path chatTemplateJinjaPath = std::filesystem::path(parsedModelsPath) / "chat_template.jinja";
```
chatTemplateJinjaPath is derived as parsedModelsPath / "chat_template.jinja". Since parseModelsPath() accepts .gguf files as a valid models_path, this won’t locate a template stored next to a GGUF file (it will incorrectly search under <file>.gguf/chat_template.jinja). Consider basing the search directory on parsedModelsPath if it’s a directory, otherwise use parent_path().
Suggested change:
```diff
- std::filesystem::path chatTemplateJinjaPath = std::filesystem::path(parsedModelsPath) / "chat_template.jinja";
+ std::filesystem::path modelsPathFs(parsedModelsPath);
+ std::filesystem::path chatTemplateDir = modelsPathFs;
+ if (!chatTemplateDir.empty() && !std::filesystem::is_directory(chatTemplateDir)) {
+     chatTemplateDir = chatTemplateDir.parent_path();
+ }
+ std::filesystem::path chatTemplateJinjaPath = chatTemplateDir / "chat_template.jinja";
```
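The suggested fallback can be exercised in isolation. The sketch below (with a hypothetical `resolveChatTemplatePath` name) mirrors the suggestion: use the path directly when it is a directory, otherwise fall back to its parent, so a template stored next to a `.gguf` file is still found.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper mirroring the reviewer's suggestion: derive the directory
// to search for chat_template.jinja from a models_path that may be either a
// directory or a file such as model.gguf.
std::filesystem::path resolveChatTemplatePath(const std::string& parsedModelsPath) {
    std::filesystem::path chatTemplateDir(parsedModelsPath);
    if (!chatTemplateDir.empty() && !std::filesystem::is_directory(chatTemplateDir)) {
        // models_path points at a file (e.g. model.gguf): look next to it
        chatTemplateDir = chatTemplateDir.parent_path();
    }
    return chatTemplateDir / "chat_template.jinja";
}
```

Note that `std::filesystem::is_directory` returns `false` for a nonexistent path as well, so a `.gguf` path resolves to its parent directory whether or not the file exists.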
@atobiszei do GGUF models have a chat_template.jinja file next to the model files, or is the chat template built in?
src/llm/language_model/continuous_batching/servable_initializer.cpp (outdated comment, resolved)
src/llm/servable_initializer.cpp (outdated)
```cpp
    std::istreambuf_iterator<char>());
if (!chatTemplateContent.empty()) {
    properties->tokenizer.set_chat_template(chatTemplateContent);
    SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "Overriding chat template from: {}", chatTemplateJinjaPath.string());
```
Suggested change:
```diff
- SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "Overriding chat template from: {}", chatTemplateJinjaPath.string());
+ SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "Using the chat template from: {}", chatTemplateJinjaPath.string());
```
src/llm/servable_initializer.cpp (outdated)
```cpp
        SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "Overriding chat template from: {}", chatTemplateJinjaPath.string());
    }
} else {
    SPDLOG_LOGGER_WARN(llm_calculator_logger, "Failed to open chat template file: {}", chatTemplateJinjaPath.string());
```
Suggested change:
```diff
- SPDLOG_LOGGER_WARN(llm_calculator_logger, "Failed to open chat template file: {}", chatTemplateJinjaPath.string());
+ SPDLOG_LOGGER_ERROR(llm_calculator_logger, "Failed to open chat template file: {}", chatTemplateJinjaPath.string());
```
```jinja
{%- endfor %}
{%- if add_generation_prompt %}
    {#- Originally '<|im_start|>assistant\n<think>\n' #}
    {{- '<|im_start|>assistant\n' }}
```
Isn't it possible to turn off thinking?
No, this chat template doesn't support that originally. It looks like that's not a common thing. Is it already part of the process that we add support for this whenever we introduce a new thinking model? @dtrawins
VLM pipelines still prioritize the chat template from `openvino_tokenizer.xml` over `chat_template.jinja`. This PR changes that priority order, which ensures that Qwen3-VL Thinking is supported.