server : unify mtmd image processing with post-decode callback by ggerganov · Pull Request #24520 · ggml-org/llama.cpp

ggerganov · 2026-06-12T14:32:33Z

Overview

Add `mtmd_helper_post_decode_callback` to `mtmd_helper_eval_chunk_single` and `mtmd_helper_decode_image_chunk`, invoked after each successful `llama_decode()`.

The server uses this callback to decode the batch on the draft context, keeping it in sync with the target context during multi-modal prompt processing. This eliminates the need for a second `process_chunk` call on `ctx_dft`.
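A minimal sketch of the post-decode callback pattern described above, using mock types only. Every name here (`mock_batch`, `mock_context`, `mock_decode`, `post_decode_cb`, `eval_chunk_sketch`, `process_chunk_sketch`) is a hypothetical stand-in and not the actual `llama.cpp` or `mtmd` API. Aborting when the callback returns non-zero is also an assumption made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-ins: none of these are real llama.cpp / mtmd types.
struct mock_batch {
    std::vector<int> tokens;
};

struct mock_context {
    int  n_decoded = 0;     // total number of tokens this context has decoded
    bool fail      = false; // force mock_decode() to report an error
};

// Mimics a C-style decode call: returns 0 on success, non-zero on failure.
int mock_decode(mock_context & ctx, const mock_batch & batch) {
    if (ctx.fail) {
        return 1;
    }
    ctx.n_decoded += (int) batch.tokens.size();
    return 0;
}

// Callback invoked after each successful decode on the target context.
using post_decode_cb = std::function<int(const mock_batch &)>;

// Splits a chunk into sub-batches of at most n_batch tokens, decodes each on
// the target context, then hands the same batch to the callback (if set).
int eval_chunk_sketch(mock_context & ctx_tgt, const std::vector<int> & chunk,
                      std::size_t n_batch, const post_decode_cb & cb) {
    for (std::size_t i = 0; i < chunk.size(); i += n_batch) {
        const std::size_t end = std::min(chunk.size(), i + n_batch);
        mock_batch batch{ std::vector<int>(chunk.begin() + i, chunk.begin() + end) };

        int ret = mock_decode(ctx_tgt, batch);
        if (ret != 0) {
            return ret; // the callback is skipped when the target decode fails
        }
        if (cb) {
            ret = cb(batch);
            if (ret != 0) {
                return ret; // assumption: a callback error aborts the chunk
            }
        }
    }
    return 0;
}

// Builds the callback internally so the caller only passes the two contexts.
// ctx_dft may be null when no draft context is in use.
int process_chunk_sketch(mock_context & ctx_tgt, mock_context * ctx_dft,
                         const std::vector<int> & chunk, std::size_t n_batch) {
    post_decode_cb cb;
    if (ctx_dft != nullptr) {
        cb = [ctx_dft](const mock_batch & b) { return mock_decode(*ctx_dft, b); };
    }
    return eval_chunk_sketch(ctx_tgt, chunk, n_batch, cb);
}
```

With this shape, the draft context sees exactly the batches that the target context decoded successfully, in the same order, so a separate second pass over the chunk is not needed.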

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: YES. pi:llama.cpp/Qwen3.6-27B

Add `mtmd_helper_post_decode_callback` to `mtmd_helper_eval_chunk_single` and `mtmd_helper_decode_image_chunk`, invoked after each successful `llama_decode()`.

The server uses this callback to run `common_speculative_process` on the batch, keeping the draft context in sync with the target context during multi-modal prompt processing. This eliminates the need for a second `process_chunk` call on `ctx_dft` and removes the `llama-ext.h` workaround include.

Assisted-by: pi:llama.cpp/Qwen3.6-27B

ngxson · 2026-06-12T14:38:26Z

FYI, I'm solving the same problem via #24384, so this probably doesn't need to be fixed.

Move the post-decode callback construction inside `server_tokens::process_chunk()` so that `server-common.h` no longer depends on `mtmd-helper.h`. The caller now passes `ctx_tgt` and `ctx_dft` directly.

Assisted-by: pi:llama.cpp/Qwen3.6-27B

ggerganov · 2026-06-12T14:44:19Z

> FYI, I'm solving the same problem via #24384, so this probably doesn't need to be fixed.

Ok, leaving this PR for reference if it can be helpful in any way.

ggerganov · 2026-06-15T09:35:22Z

Superseded by #24645

ggerganov added 2 commits June 12, 2026 17:28

mtmd : narrow-down batch scope

30d49c0

github-actions Bot added examples server labels Jun 12, 2026

ggerganov mentioned this pull request Jun 15, 2026

mtmd : add post-decode callback #24645

Merged

ggerganov closed this Jun 15, 2026

ggerganov deleted the gg/server-spec-mtmd branch June 15, 2026 09:35

ggerganov wants to merge 3 commits into `master` from `gg/server-spec-mtmd`
