
Fix ChatVLLM crash on models without a chat template#11

Merged
geoalgo merged 1 commit into OpenEuroLLM:main from ferreirafabio:fix/chat-template-fallback on Feb 20, 2026

Conversation

@ferreirafabio (Contributor)

What is the problem?

PR #8 introduced the ChatVLLM wrapper, which switched from vllm.LLM.generate() to vllm.LLM.chat() so that chat templates are applied correctly. This works well for instruct models that ship a chat template in their tokenizer config. However, base/pretrained models like swiss-ai/Apertus-8B-2509 don't define one. Since transformers >= v4.44 no longer provides a default chat template, calling vllm.LLM.chat() on these models raises a ValueError. As a result, we can no longer evaluate base models on fluency tasks, which the project requires.

How do we solve it?

We detect at model load time whether a chat template is available and pick the right vLLM method accordingly. Three paths:

  1. User passes --chat_template: we use llm.chat() with that explicit template. Useful when you know the right format for a model whose tokenizer doesn't include one.
  2. Tokenizer has a chat template: we use llm.chat() and let vLLM apply it automatically. This is the default path for instruct models.
  3. No chat template found: we fall back to llm.generate() (plain text completion, no chat formatting) and print a warning. This is the expected path for base models used in fluency evaluation.
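The three-path resolution above can be sketched as a small decision helper. This is an illustrative sketch, not the actual code in ChatVLLM.__init__(); the function name and return convention are assumptions made for the example:

```python
import warnings


def resolve_generation_mode(tokenizer_chat_template, cli_chat_template=None):
    """Decide which vLLM method to use and which template (if any) to pass.

    Returns a (method_name, template) tuple:
      - ("chat", template) -> call llm.chat(..., chat_template=template)
      - ("chat", None)     -> call llm.chat(), vLLM applies the tokenizer's template
      - ("generate", None) -> plain-text completion fallback, no chat formatting
    """
    if cli_chat_template is not None:
        # Path 1: explicit --chat_template override from the CLI
        return ("chat", cli_chat_template)
    if tokenizer_chat_template is not None:
        # Path 2: tokenizer config ships its own chat template
        return ("chat", None)
    # Path 3: base model without a template -> fall back to generate() and warn
    warnings.warn(
        "No chat template found; falling back to llm.generate() "
        "(plain text completion, no chat formatting)."
    )
    return ("generate", None)
```

Because the decision happens once at model load time, the per-request generation path stays branch-free.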

This way instruct models keep working as before, base models no longer crash, and users can still force a template via the CLI when needed.

Changes

  • openjury/utils.py: add warnings import, three-path template detection in ChatVLLM.__init__(), new _to_raw_text() method for the generate() fallback, pass chat_template through batch() and make_model()
  • openjury/generate.py: forward chat_template parameter in generate_instructions() and generate_base()
  • openjury/generate_and_evaluate.py: add --chat_template CLI argument, thread it through CliArgs, gen_fun partials, and make_model() calls
  • README.md: document chat template behavior under "Model Specification"
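For the generate() fallback, chat-style message lists have to be flattened into plain prompts before being passed to vllm.LLM.generate(). A minimal sketch of what a _to_raw_text()-style helper might look like (the actual method in openjury/utils.py may differ; the function below is hypothetical):

```python
def to_raw_text(messages):
    """Flatten a chat-style message list into one plain prompt string.

    Accepts either a raw string (returned unchanged) or a list of
    {"role": ..., "content": ...} dicts, whose contents are joined
    with newlines -- no chat markup is added.
    """
    if isinstance(messages, str):
        return messages  # already a raw prompt
    return "\n".join(m["content"] for m in messages)
```

This keeps the fallback path deliberately format-free: a base model without a template sees only the concatenated text.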

Testing

Tested on L40S GPU with vllm 0.10.2 using both Apertus 8B models:

  • swiss-ai/Apertus-8B-2509 (base, no chat template): correctly falls back to llm.generate(), warning emitted, produces valid completions
  • swiss-ai/Apertus-8B-Instruct-2509 (instruct, has chat template): correctly uses llm.chat() with the tokenizer's template
  • swiss-ai/Apertus-8B-2509 + explicit ChatML template: correctly uses llm.chat() with the provided override
  • make_model("VLLM/...") end-to-end: chat_template parameter correctly forwarded through the full pipeline

Models like swiss-ai/Apertus-8B-2509 (base models) don't define a
chat template in their tokenizer config. Since transformers >= v4.44
removed the default template, calling vllm.LLM.chat() on these models
raises ValueError.

Implement three-path resolution:
1. Explicit --chat_template override -> use llm.chat() with that template
2. Tokenizer has a chat template -> use llm.chat() (auto-detected)
3. No template found -> fall back to llm.generate() + warn

This ensures instruct models get chat() automatically, base models get
generate() automatically, and users can still force a template via CLI.
@geoalgo merged commit f3cb15e into OpenEuroLLM:main on Feb 20, 2026
1 check failed