
fix(testing): Fix Kyutai Speech-To-Text, LLaVA-OneVision, and LongCatFlash test failures on main CI #44695

Open
harshaljanjani wants to merge 1 commit into huggingface:main from harshaljanjani:fix/kyutai-llava-longcat-test-failures

@harshaljanjani harshaljanjani commented Mar 14, 2026

What does this PR do?

The following failing tests were identified and fixed in this PR:

Kyutai Speech-To-Text: The PR "[processors] Unbloating simple processors" refactored ProcessorMixin.__call__ to use explicit parameters instead of accepting arbitrary positional arguments, but the KyutaiSTT integration tests were still calling processor(samples) positionally. As a result, the audio samples were being bound to the images parameter.
LLaVA-OneVision: The PR "Load a tiny video to make CI faster" introduced local video file path mappings, but LlavaOnevision's setUpClass was still building paths to Big_Buck_Bunny_720_10s_10MB.mp4 and sample_demo_1.mp4 in the repo root.
LongCatFlash: The PR "[V5] Return a BatchEncoding dict from apply_chat_template by default again" changed apply_chat_template to return a BatchEncoding dict instead of a tensor. The test was passing this dict directly to model.generate and then trying to access .shape on it; this PR fixes that :)
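
To illustrate the Kyutai Speech-To-Text item above, here is a minimal sketch of how a positional argument ends up in the wrong parameter. ToyProcessor is a hypothetical stand-in, not the real ProcessorMixin, and its parameter order (images first, audio last) is an assumption chosen only to reproduce the symptom described; passing audio= by keyword is shown as one plausible fix, not necessarily the exact change in this PR.

```python
class ToyProcessor:
    """Hypothetical stand-in for a processor; NOT the real ProcessorMixin."""

    def __call__(self, images=None, text=None, videos=None, audio=None):
        # Report which parameters actually received a value.
        received = {
            "images": images,
            "text": text,
            "videos": videos,
            "audio": audio,
        }
        return {name: value for name, value in received.items() if value is not None}


processor = ToyProcessor()
samples = [0.0, 0.1, -0.2]  # placeholder audio samples

# Old test style: the first positional argument binds to `images`.
positional = processor(samples)

# Passing the samples by keyword routes them to `audio` as intended.
by_keyword = processor(audio=samples)

print(sorted(positional))   # which parameters were filled positionally
print(sorted(by_keyword))   # which parameters were filled by keyword
```

Calling by keyword keeps the test independent of the parameter order, so a later signature change would not silently reroute the audio input.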

Note: The test still fails with an AssertionError. I'm not sure of the cause and it could be flaky, but the crash should be resolved :)
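
For the LongCatFlash crash, a minimal sketch of the pattern is below. Everything here is a toy: ToyTensor, toy_apply_chat_template, and toy_generate are hypothetical stand-ins for a tensor, apply_chat_template, and model.generate, and a plain dict stands in for BatchEncoding. It only shows why reading .shape on the returned mapping fails and how unpacking the mapping and indexing input_ids avoids it.

```python
from dataclasses import dataclass


@dataclass
class ToyTensor:
    """Hypothetical 2D tensor stand-in backed by nested lists."""
    rows: list

    @property
    def shape(self):
        return (len(self.rows), len(self.rows[0]) if self.rows else 0)


def toy_apply_chat_template():
    # Returns a mapping (like a BatchEncoding) rather than a bare tensor.
    return {
        "input_ids": ToyTensor([[1, 2, 3, 4]]),
        "attention_mask": ToyTensor([[1, 1, 1, 1]]),
    }


def toy_generate(input_ids, attention_mask=None, max_new_tokens=2):
    # Appends placeholder token 0 to each row; no real model involved.
    return ToyTensor([row + [0] * max_new_tokens for row in input_ids.rows])


inputs = toy_apply_chat_template()

# Old test style: treating the mapping as a tensor fails on `.shape`.
try:
    inputs.shape
    crashed = False
except AttributeError:
    crashed = True

# Unpack the mapping into generate and read the prompt length from input_ids.
output = toy_generate(**inputs, max_new_tokens=2)
prompt_len = inputs["input_ids"].shape[1]
new_tokens = output.rows[0][prompt_len:]
```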

cc: @Rocketknight1 @zucchini-nlp

CI Failures

Before the fix (feel free to cross-check; these errors are reproducible):

[image]

After the fix (feel free to cross-check):

[image]

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you fix any necessary existing tests?

@github-actions

[For maintainers] Suggested jobs to run (before merge)

run-slow: kyutai_speech_to_text, llava_onevision, longcat_flash

@harshaljanjani harshaljanjani marked this pull request as ready for review March 14, 2026 09:12
@github-actions github-actions bot requested a review from ydshieh March 14, 2026 09:12
