Add --llm_image_context for context-aware image descriptions by vertgo · Pull Request #1037 · datalab-to/marker

vertgo · 2026-05-22T02:05:13Z

New opt-in processor (LLMImageContextDescriptionProcessor) that describes every Picture/Figure block with an LLM, supplying the page's extracted markdown and the page image as context. Returns a short caption (rendered as image alt text) and a long description (rendered as a hidden HTML comment), so a text-only LLM reading the markdown alone can understand the visual content without a multimodal pipeline.
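To make the data flow concrete, here is a minimal sketch of the two-field result and the context-carrying prompt. It is not the code from this PR: `ImageDescription`, `build_prompt`, `parse_response`, the prompt wording, and the JSON reply shape are all assumptions made for illustration. Only the field names `short_caption` and `long_description` come from the description above.

```python
import json
from dataclasses import dataclass


@dataclass
class ImageDescription:
    """Hypothetical container mirroring the two fields named in the PR."""
    short_caption: str     # intended for image alt text
    long_description: str  # intended for a hidden HTML comment


def build_prompt(page_markdown: str, block_type: str) -> str:
    """Assemble the text half of the request (assumed wording).

    The page image would be attached to the LLM call separately.
    """
    return (
        f"Describe the {block_type} on this page.\n"
        "Use the page text below as context.\n"
        'Reply with JSON: {"short_caption": ..., "long_description": ...}\n\n'
        f"--- PAGE MARKDOWN ---\n{page_markdown}"
    )


def parse_response(raw: str) -> ImageDescription:
    """Parse an assumed JSON reply, collapsing whitespace in both fields."""
    data = json.loads(raw)
    return ImageDescription(
        short_caption=" ".join(str(data["short_caption"]).split()),
        long_description=" ".join(str(data["long_description"]).split()),
    )
```

Collapsing whitespace keeps the caption on one line, which matters if it ends up inside an alt attribute.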

- New flag --llm_image_context, orthogonal to --use_llm; works whether or not image extraction is enabled.
- Picture/Figure blocks gain short_caption + long_description fields.
- HTML renderer emits alt text + a hidden <aside>; markdown renderer converts that aside into an HTML comment.
- Gemini service activates for this flag independently of --use_llm.
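The rendering step in the list above can be sketched roughly as follows. This is an illustrative approximation, not the renderer code from this PR: the function names, the `image-description` class, and the exact markup are assumptions. It shows the short caption going into `alt`, the long description going into a hidden `<aside>`, and a second pass that rewrites that aside as an HTML comment for markdown output.

```python
import html
import re

# Assumed markup for the hidden aside; the class name is hypothetical.
ASIDE_RE = re.compile(
    r'<aside class="image-description" hidden>(.*?)</aside>', re.DOTALL
)


def render_image_html(src: str, short_caption: str, long_description: str) -> str:
    """Emit an <img> with alt text, followed by a hidden <aside>."""
    alt = html.escape(short_caption, quote=True)
    body = html.escape(long_description)
    return (
        f'<img src="{html.escape(src, quote=True)}" alt="{alt}">'
        f'<aside class="image-description" hidden>{body}</aside>'
    )


def aside_to_comment(fragment: str) -> str:
    """Replace each hidden aside with an HTML comment holding its text."""

    def repl(match: re.Match) -> str:
        text = html.unescape(match.group(1))
        # Guard against the description closing the comment early.
        text = text.replace("-->", "--&gt;")
        return f"<!-- {text} -->"

    return ASIDE_RE.sub(repl, fragment)
```

A text-only model reading the markdown then sees the long description inline as a comment, while a browser rendering the same output shows only the image.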

Developed with assistance from Claude (Claude Code).


github-actions · 2026-05-22T02:05:28Z

CLA Assistant Lite bot:
Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a pull request comment in the format below.

I have read the CLA Document and I hereby sign the CLA

You can retrigger this bot by commenting recheck in this Pull Request.

vertgo wants to merge 1 commit into datalab-to:master from vertgo:feature/llm-image-context
