feat(models): consider source-based multimodal projector loading API #133

@leehack

Goal

Evaluate and design a public API for loading multimodal projectors from ModelSource, matching LlamaEngine.loadModelSource(...).

Motivation

The current public API keeps the existing loadMultimodalProjector(pathOrUrl) behavior, while the new model-source APIs cover only model GGUF loading. Apps that want package-managed download and cache behavior for mmproj files may eventually need an explicit source-based projector API.

Candidate APIs

await engine.loadMultimodalProjectorSource(
  ModelSource.parse('hf://owner/repo/mmproj.gguf'),
  options: ModelLoadOptions(...),
);

or a higher-level bundle API:

await engine.loadModelBundle(
  model: ModelSource.parse('hf://owner/repo/model.gguf'),
  projector: ModelSource.parse('hf://owner/repo/mmproj.gguf'),
  options: ModelLoadOptions(...),
);

Design questions

  • Should projector cache options be independent from model cache options?
  • Should model/projector load be atomic from the caller's perspective?
  • How should web URL-capable backends handle authenticated/checksummed projector sources?
  • How should runtime capability validation be surfaced?
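
To make the atomicity question concrete, here is a minimal sketch of one possible answer: load the model, then the projector, and unload the model again if the projector fails. Everything in it is hypothetical. `BundleEngine`, `unloadModel`, `FakeEngine`, and the plain `String` sources are illustrative stand-ins, not the package's actual API, and the real engine may not expose an unload step in this form.

```dart
// Hypothetical minimal engine surface, for illustration only.
abstract class BundleEngine {
  Future<void> loadModelSource(String source);
  Future<void> loadMultimodalProjectorSource(String source);
  Future<void> unloadModel();
}

/// Loads the model, then the projector. If the projector fails, the model is
/// unloaded so the caller never observes a half-loaded bundle.
Future<void> loadModelBundle(
  BundleEngine engine, {
  required String model,
  required String projector,
}) async {
  await engine.loadModelSource(model);
  try {
    await engine.loadMultimodalProjectorSource(projector);
  } catch (_) {
    // Roll back so the engine returns to its pre-call state.
    await engine.unloadModel();
    rethrow;
  }
}

// Fake engine (assumption) whose projector load always fails.
class FakeEngine implements BundleEngine {
  bool modelLoaded = false;

  @override
  Future<void> loadModelSource(String source) async {
    modelLoaded = true;
  }

  @override
  Future<void> loadMultimodalProjectorSource(String source) async {
    throw StateError('simulated projector failure');
  }

  @override
  Future<void> unloadModel() async {
    modelLoaded = false;
  }
}

Future<void> main() async {
  final engine = FakeEngine();
  try {
    await loadModelBundle(engine,
        model: 'model.gguf', projector: 'mmproj.gguf');
  } on StateError {
    // Expected in this simulation.
  }
  print(engine.modelLoaded); // false: the model was rolled back
}
```

A non-atomic design would instead leave the model loaded and report the projector error separately, which is simpler but pushes cleanup onto the caller.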

Acceptance criteria

  • Decide whether this belongs in public API or example/app utility code first.
  • Add tests for local/remote model + local/remote projector combinations.
  • Preserve existing loadMultimodalProjector(...) compatibility.
  • Document unsupported platform combinations clearly.

Labels: enhancement (New feature or request), model-assets (Model, projector, cache, and download asset workflows)
