Design shared GGUF model management and platform background downloads #160

@leehack

Feature description

Add a production-grade mobile/local GGUF model management layer that can be shared by multiple apps using llamadart on the same device, instead of each app downloading and storing duplicate model files independently.

Motivation

Large GGUF and multimodal projector files are expensive to download and store. Apps built on llamadart should be able to opt into a shared device-level model store so users can download a model once and reuse it across compatible apps, while preserving app privacy, user consent, integrity checks, and OS lifecycle requirements.

A shared store also provides a natural home for platform-native download executors, which survive mobile lifecycle events (app backgrounding, device sleep) better than foreground Dart HTTP downloads.

Proposed solution

Design an optional model-management/downloader package or subsystem layered on top of the existing core abstractions:

  • Keep llamadart core focused on ModelSource, ModelDownloadManager, ModelDownloadController, cache metadata, resume, progress, cancellation, and validation.
  • Provide a platform model-store implementation that apps can inject as a ModelDownloadManager.
  • Use platform-native download execution where needed:
    • Android: foreground service and/or system DownloadManager, with persistent notification, pause/cancel, range resume, and checksum verification.
    • iOS: background URLSession download tasks with completion handoff into the model store.
    • Desktop: shared user-level model directory with optional sleep-prevention hooks controlled by the app.
    • Web: document browser storage/service-worker limitations separately; do not imply universal sharing.
  • Store model metadata separately from raw files: source identity, filename, bytes, SHA-256, ETag/Last-Modified, quantization hints, mmproj relationship, license/source URL, last-used time, owning/downloading app, and compatibility notes.
  • Expose safe APIs for listing, importing, validating, pruning, deleting, and selecting shared models.
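The "metadata separate from raw files" idea above could be sketched as a small sidecar record, written here in Python as a language-neutral illustration rather than as llamadart's Dart API. All names (`ModelEntry`, `AssetFile`, the `role` values, the field names) are hypothetical, and the field set is only a subset of the list above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class AssetFile:
    """One file on disk: the main GGUF or an mmproj projector."""
    role: str                      # "model" or "mmproj" (hypothetical role names)
    filename: str
    size_bytes: int
    sha256: str
    etag: Optional[str] = None


@dataclass
class ModelEntry:
    """One logical model: the GGUF plus any projector files it needs."""
    source_id: str                 # stable identity, must not contain credentials
    source_url: str                # assumed to be redacted before it is stored
    license: Optional[str] = None
    quantization: Optional[str] = None
    owner_app: Optional[str] = None
    last_used_epoch_s: Optional[int] = None
    files: List[AssetFile] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into the nested AssetFile dataclasses.
        return json.dumps(asdict(self), sort_keys=True)

    @staticmethod
    def from_json(text: str) -> "ModelEntry":
        raw = json.loads(text)
        files = [AssetFile(**f) for f in raw.pop("files")]
        return ModelEntry(files=files, **raw)

    def total_bytes(self) -> int:
        """Disk footprint of the whole logical entry, useful when pruning."""
        return sum(f.size_bytes for f in self.files)
```

Keeping the model and its mmproj file under one entry means that listing, pruning, and deleting can operate on the logical model instead of on individual files.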

Design questions

  • Should the shared store live in an optional package, e.g. llamadart_model_store, instead of the core package?
  • What should be the cross-app sharing boundary on Android/iOS given sandboxing and user-consent constraints?
  • Should apps share only files, or also higher-level metadata such as model cards, presets, mmproj links, and recommended runtime parameters?
  • How should private/gated model credentials be handled so tokens and signed URLs are never stored in shared metadata?
  • What permissions/UX are required for one app to delete or prune a model used by another app?
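For the credentials question, one possible starting point is to strip the parts of a URL and request where secrets usually live before anything reaches shared metadata. The sketch below is Python used for illustration only; `redact_url`, `redact_headers`, and the header deny-list are hypothetical names and choices, not part of llamadart. It does not inspect the URL path, so sources that embed tokens in the path would need extra handling.

```python
from typing import Dict
from urllib.parse import urlsplit, urlunsplit

# Assumed deny-list; a real design would decide this explicitly.
SENSITIVE_HEADERS = {"authorization", "proxy-authorization", "cookie"}


def redact_url(url: str) -> str:
    """Drop userinfo, query string, and fragment from a URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:                # IPv6 literal: restore the brackets
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, "", ""))


def redact_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Keep only headers that are not on the sensitive deny-list."""
    return {k: v for k, v in headers.items() if k.lower() not in SENSITIVE_HEADERS}
```

With this shape, a signed URL such as `https://user:tok@example.com/org/model.gguf?sig=abc` would be stored as `https://example.com/org/model.gguf`, while the full URL stays in the downloading app's private storage.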

Acceptance criteria

  • A design doc describes platform-specific storage/sharing constraints for Android, iOS, desktop, and web.
  • The proposed API composes with the existing ModelDownloadManager / ModelDownloadController abstractions.
  • The design covers model + mmproj/multimodal assets as one logical model entry.
  • Downloads are resumable and integrity-checked before a file becomes visible as complete.
  • Shared metadata redacts or avoids secrets from URLs, headers, and tokens.
  • Apps can opt in without changing existing core llamadart behavior.
  • Example app integration demonstrates using the shared manager when available and falling back to the current app-local manager otherwise.
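The "resumable and integrity-checked before a file becomes visible" criterion could be met by downloading into a partial file and renaming it into place only after the digest matches. The following is a minimal Python sketch of that pattern under stated assumptions: the `.part` naming, the function names, and the use of a known SHA-256 are all hypothetical, and the actual transfer is left to whichever platform executor is in use.

```python
import hashlib
import os
from typing import Optional


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte models are never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def resume_range_header(part_path: str) -> Optional[str]:
    """Return a Range header value for the missing bytes, or None to start over."""
    try:
        have = os.path.getsize(part_path)
    except FileNotFoundError:
        return None
    return f"bytes={have}-" if have > 0 else None


def publish_if_valid(part_path: str, final_path: str, expected_sha256: str) -> bool:
    """Rename the partial file into place only when its digest matches."""
    if sha256_of(part_path) != expected_sha256.lower():
        return False
    # On POSIX the rename is atomic when both paths are on the same filesystem,
    # so other apps never observe a half-written file under the final name.
    os.replace(part_path, final_path)
    return True
```

A resumed request should also be validated against the stored ETag or Last-Modified value (for example via `If-Range`), so that a changed upstream file is not appended to stale bytes.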

Non-goals for the first implementation

  • Making core llamadart automatically hold wakelocks or start platform services.
  • Forcing every app to use a shared model store.
  • Solving browser-wide model sharing beyond documenting web storage limits.

Notes

This proposal came from investigating Android downloads that fail when the phone sleeps. Short-term core/example improvements can document lifecycle limits, preserve resumable downloads, and improve foreground UX. True sleep/background-safe downloads and cross-app sharing should be designed as an opt-in platform model-management layer.

Labels: enhancement (New feature or request), model-assets (Model, projector, cache, and download asset workflows)
