chat : add EOS token to additional_stops for autoparser templates by jpohhhh · Pull Request #20805 · ggml-org/llama.cpp

jpohhhh · 2026-03-20T15:17:13Z

Some models emit the EOS token as text (e.g. `</s>`) rather than as the special EOS token ID. The PEG parser fails at end-of-input because the trailing EOS text isn't consumed.

Regression introduced in 566059a (Autoparser #18675, 2026-03-06).

Fix: add the template's EOS token to additional_stops so the server strips it before the output reaches the parser.
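To illustrate the failure mode and the idea behind the fix, here is a minimal sketch in Python. It is not llama.cpp's code: `truncate_at_stop` and `parse_strict` are hypothetical names, and `json.loads` merely stands in for a strict parser that must consume its whole input, as the PEG parser does. The assumption is that the model's output ends with the EOS text `</s>` as ordinary characters.

```python
import json


def truncate_at_stop(text: str, stops: list[str]) -> str:
    """Cut `text` at the earliest occurrence of any stop string (hypothetical helper)."""
    cut = len(text)
    for stop in stops:
        if not stop:
            continue  # ignore empty stop strings
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


def parse_strict(text: str) -> dict:
    """Stand-in for a strict parser: trailing unparsed text is an error."""
    return json.loads(text)


# Assumed model output: a tool call followed by the EOS token emitted as plain text.
raw = '{"name": "get_weather", "arguments": {"city": "Tokyo"}}</s>'

try:
    parse_strict(raw)
    parsed_raw = True
except json.JSONDecodeError:
    # The trailing "</s>" is extra data the strict parser cannot consume.
    parsed_raw = False

# With the EOS text registered as a stop string, the parser sees clean input.
cleaned = truncate_at_stop(raw, ["</s>"])
tool_call = parse_strict(cleaned)
```

In this sketch the stop list plays the role of `additional_stops`: the EOS text is cut off before the parser ever sees it, so the parser itself does not need to know about it.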

Unit test:

cmake -B build -DLLAMA_BUILD_TESTS=ON -DLLAMA_BUILD_TOOLS=OFF
cmake --build build --target test-chat
./build/bin/test-chat

Server repro (bartowski/mistralai_Mistral-Small-3.2-24B-Instruct-2506-GGUF, temp=0):

llama-server -m Mistral-Small-3.2-24B-Instruct-2506-IQ2_M.gguf --jinja

The request below returns HTTP 200 before `566059a` and HTTP 500 after:

curl http://localhost:8080/v1/chat/completions -d '{
"messages": [{"role": "user", "content": "Weather in Tokyo?"}],
"tools": [{"type": "function", "function": {
"name": "get_weather", "description": "Get weather",
"parameters": {"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]}
}}],
"temperature": 0, "max_tokens": 200
}'


pwilkin · 2026-03-20T15:59:17Z

Yet another bad change for a non-issue that masks a real issue.

If a model actually emits an EOS token "as a normal token", that's a tokenizer error that needs to be fixed, not masked in the parser.

jpohhhh · 2026-03-20T16:01:19Z

> Yet another bad change for a non-issue that masks a real issue.
>
> If a model actually emits an EOS token "as a normal token", that's a tokenizer error that needs to be fixed, not masked in the parser.

You're being abusive.

I am finding issues with the models I support, making sure they repro with server binary and my API, and then making sure they repro with a unit test. These are not non-issues :( Why are you sooooo mean, all the time?

jpohhhh · 2026-03-20T16:02:33Z

cc @bartowski1182, bartowski/mistralai_Mistral-Small-3.2-24B-Instruct-2506-GGUF can't be handled by the new parser. Any advice? I don't know how to square the circle here.

pwilkin · 2026-03-20T16:02:54Z

No, you're being abusive by spamming PRs without opening issues with actual real-life reproductions. Please stop it.

jpohhhh · 2026-03-20T16:07:04Z

> No, you're being abusive by spamming PRs without opening issues with actual real-life reproductions. Please stop it.

I don't know what you mean, I'm holding myself to a really strict standard - it has to repro with the llama-server binary, and I have to give the exact command and bisect, and have unit tests.

Please tell someone who can ban me from PRs your claim that I'm spamming PRs without repros and have them evaluate your claim. I'm happy to get banned if they agree.

ggerganov · 2026-03-20T16:22:13Z

@jpohhhh I'm sorry, but I decided to block you for 30 days. Would recommend to change your approach if you wish to contribute to the project.

jpohhhh requested review from a team and pwilkin as code owners March 20, 2026 15:17

github-actions Bot added the testing Everything test related label Mar 20, 2026

pwilkin closed this Mar 20, 2026

jpohhhh wants to merge 1 commit into ggml-org:master from jpohhhh:fix-autoparser-eos-stop
