Add support for Int8 embeddings #14

@pabvald

Description

It would be great to have support for embeddings compressed to Int8, as described in the HuggingFace post "Embedding Quantization".

Potential implementation would be to:

  • Define an embedder (<:AbstractEmbedder, for get_embeddings) and the corresponding finder (<:AbstractSimilarityFinder, for find_similar). Both would carry min_values and max_values vector fields holding the effective range of each embedding dimension (e.g., length(min_values) == length(max_values) == D)
  • Define the get_embeddings and find_similar methods for these types
  • The conversion to Int8 could be done post hoc (after build_index) via a utility function that also returns the resulting finder, which carries the range needed to convert new embeddings to Int8 (the finder is then provided to airag)
  • find_similar should implement the two-stage pass with rescore_multiplier=4: first retrieve candidates using Int8 x Int8 embeddings, then rescore those candidates with Float x Int8
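
To make the proposal concrete, here is a minimal sketch of the range-based quantization and the two-stage pass. The names embedding_ranges, quantize_int8 and find_similar_int8 are hypothetical, not existing API. The sketch assumes a D x N matrix with one document per column, dot-product similarity, a top_k keyword, and the linear mapping of [min, max] onto [-128, 127] used in the HuggingFace post. It is an illustration under those assumptions, not the intended implementation.

```julia
# Hypothetical helper names; none of these exist in the package today.

# Per-dimension range of a D x N embedding matrix (one column per document).
function embedding_ranges(emb::AbstractMatrix{<:Real})
    min_values = vec(minimum(emb; dims = 2))
    max_values = vec(maximum(emb; dims = 2))
    return min_values, max_values
end

# Linearly map each dimension from [min, max] onto the Int8 range [-128, 127].
# Works for a single embedding (vector) or a whole matrix.
function quantize_int8(emb::AbstractVecOrMat{<:Real}, min_values, max_values)
    scale = (max_values .- min_values) ./ 255
    # Constant dimensions would give scale == 0; use 1 to avoid dividing by zero.
    scale = ifelse.(scale .== 0, one(eltype(scale)), scale)
    q = round.((emb .- min_values) ./ scale) .- 128
    return Int8.(clamp.(q, -128, 127))
end

# Two-stage search: coarse Int8 x Int8 pass, then Float x Int8 rescoring.
function find_similar_int8(doc_q::AbstractMatrix{Int8}, query::AbstractVector{<:Real},
        min_values, max_values; top_k::Int = 5, rescore_multiplier::Int = 4)
    n = size(doc_q, 2)
    # Stage 1: quantize the query and score all documents in integer arithmetic.
    # Int32 accumulation avoids Int8 overflow in the dot products.
    query_q = quantize_int8(query, min_values, max_values)
    coarse = Int32.(doc_q)' * Int32.(query_q)
    n_candidates = min(n, top_k * rescore_multiplier)
    candidates = collect(partialsortperm(coarse, 1:n_candidates; rev = true))
    # Stage 2: rescore only the candidates with the full-precision query.
    fine = Float32.(doc_q[:, candidates])' * Float32.(query)
    order = sortperm(fine; rev = true)[1:min(top_k, n_candidates)]
    return candidates[order], fine[order]
end
```

In this sketch the ranges would be computed once from the Float embeddings of an existing index and stored alongside the Int8 matrix, so that query embeddings can be quantized with the same range at search time.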

Original issue from PromptingTools.jl: svilupp/PromptingTools.jl#118

Labels: enhancement (New feature or request)
