Feature Request from Community for mindone.transformers #1449

@wcrzlh

Description

Feature Request from Community:

Related PR: Processor adaptation

  1. Phi-4-multimodal-support:
    Processor: Phi4MultimodalFeatureExtractor, Phi4MultimodalImageProcessorFast, Phi4MultimodalProcessor
    Module: Phi4MultimodalAudioConvModule, Phi4MultimodalAudioNemoConvSubsampling, Phi4MultimodalAudioRelativeAttentionBias, adaptive_enc_mask, unfold_tensor
  2. InterVL-S1:
    InternVLProcessor, GotOcr2ImageProcessorFast, InternVLVideoProcessor
  3. Whisper:
    WhisperFeatureExtractor, WhisperProcessor
  4. Ultravox:
    WhisperEncoder, UltravoxProcessor
  5. InternVL 3.5:
    InternVLProcessor
  6. Qwen2Audio:
    Qwen2AudioEncoder, Qwen2AudioProcessor, WhisperFeatureExtractor
  7. MiniCPM-V4.5:
    MiniCPMVProcessor
  8. LLaVA-NeXT:
    LlavaNextImageProcessor, LlavaNextProcessor, LlavaNextForConditionalGeneration
  9. LLaVA-NeXT-Video:
    LlavaNextVideoProcessor

Metadata

Labels

rfc需求单ISSUE (RFC requirement ticket)
