[Task] Added SiteBench multi-image variant and bug fix #996

oscarqjh · 2026-01-15T07:33:31Z

Added an option to feed each video as a sequence of images (multi-image input), to align with models like SenseNova-SI qwen series that are trained on multi-image input. See "VSI-Bench_32frame results are not reproducible" (EASI#20).
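For reference, a minimal sketch of how a video could be reduced to a fixed number of frames before being passed as multi-image input. The helper name `uniform_frame_indices` and the default of 32 frames are assumptions for illustration only, not the code in this PR; actual frame decoding is left out.

```python
def uniform_frame_indices(total_frames, num_frames=32):
    """Pick num_frames indices spread evenly over a video (hypothetical helper).

    Each index is the midpoint of one of num_frames equal-length segments,
    so the first and last frames are not over-represented.
    """
    if total_frames <= 0 or num_frames <= 0:
        return []
    # Short clip: use every available frame instead of repeating frames.
    if num_frames >= total_frames:
        return list(range(total_frames))
    step = total_frames / num_frames
    return [int((i + 0.5) * step) for i in range(num_frames)]


# Example: choose 5 frames from a 100-frame clip.
indices = uniform_frame_indices(100, num_frames=5)
```

The selected frames would then be decoded and handed to the model as a list of images rather than as a single video input.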

sample run result:

Fixed the issue in the official code where the pre-prompt and post-prompt were not extracted properly from lmms_eval_specific_kwargs.
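A minimal sketch of the intended behaviour, assuming `lmms_eval_specific_kwargs` arrives as a dict with optional `pre_prompt` and `post_prompt` keys. The `question` field is a hypothetical stand-in for the real dataset columns, and this is not the exact code from the PR.

```python
def doc_to_text(doc, lmms_eval_specific_kwargs=None):
    """Wrap the question with the configured pre- and post-prompt."""
    # Treat a missing kwargs argument as an empty dict so the lookups are safe.
    kwargs = lmms_eval_specific_kwargs or {}
    pre_prompt = kwargs.get("pre_prompt", "")
    post_prompt = kwargs.get("post_prompt", "")
    # "question" is an assumed field name used only for this illustration.
    return f"{pre_prompt}{doc['question']}{post_prompt}"


prompt = doc_to_text(
    {"question": "Which object is closer to the camera?"},
    {"pre_prompt": "", "post_prompt": "\nAnswer with the option letter."},
)
```

Reading both keys with `.get` and an empty-string default means a task config that omits either prompt still produces a valid prompt string.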

oscarqjh · 2026-01-15T07:33:58Z

@PeterWangyi @kcz358

oscarqjh · 2026-01-16T05:25:52Z

b792f6c adds an option to use interleave_visual. This is required to align evaluation results with VLMEvalKit.
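To illustrate what interleaving means here, a minimal sketch that places each image at the position of a placeholder in the question text. The `<image>` placeholder, the function name, and the `{"type": ..., ...}` content layout are assumptions for illustration; the actual option in b792f6c may be implemented differently.

```python
def build_interleaved_content(text, images, placeholder="<image>"):
    """Interleave text segments and images in reading order (hypothetical helper)."""
    parts = text.split(placeholder)
    if len(parts) - 1 != len(images):
        raise ValueError("number of placeholders must match number of images")
    content = []
    for i, part in enumerate(parts):
        segment = part.strip()
        if segment:
            content.append({"type": "text", "text": segment})
        # One image follows every text part except the last.
        if i < len(images):
            content.append({"type": "image", "url": images[i]})
    return content


content = build_interleaved_content(
    "Compare <image> with <image> and pick the closer object.",
    ["frame_0.png", "frame_1.png"],
)
```

Without interleaving, all images would typically be placed before the text, which changes the prompt the model sees and can shift scores relative to a toolkit that interleaves.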

PeterWangyi · 2026-01-16T09:52:44Z

It seems that for the InternVL series models, only the doc_to_visual function can be called, not the doc_to_message function.
This will cause the text-image interleaving to fail.
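A small sketch of why this matters, under the assumption that a model wrapper limited to doc_to_visual receives the images and the text as two separate values. The function name and content layout are hypothetical and only show the loss of ordering.

```python
def split_interleaved(content):
    """Collapse interleaved content into (visuals, text), as a visual-only path would."""
    visuals = [item["url"] for item in content if item["type"] == "image"]
    text = " ".join(item["text"] for item in content if item["type"] == "text")
    return visuals, text


interleaved = [
    {"type": "text", "text": "Compare"},
    {"type": "image", "url": "frame_0.png"},
    {"type": "text", "text": "with"},
    {"type": "image", "url": "frame_1.png"},
]
visuals, text = split_interleaved(interleaved)
```

After the split, nothing records where each image sat relative to the text, so the interleaved layout cannot be rebuilt on the model side.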

oscarqjh added 5 commits January 15, 2026 12:36

[Task] Add SiteBench Multi-image variant

945a8e9

fix: Fixed the issue where post and pre prompt are not added properly

820ff1f

fix lmms_eval_specific_kwargs bug

4be4cfa

pre-commit

2d9e159

fix: convert lmmseval_specific_kwargs to input as a dict to align with lmmseval convention

960c904

oscarqjh marked this pull request as draft January 16, 2026 02:36

oscarqjh added 2 commits January 16, 2026 13:23

feat: Added option to use interleave_visual to align with vlmevalkit's format

b792f6c

precommit

787711a

oscarqjh marked this pull request as ready for review January 16, 2026 05:24
