Add support for Int8 embeddings

> It would be great to support embeddings compressed to Int8, as described in [HuggingFace: Embedding Quantization](https://huggingface.co/blog/embedding-quantization#binary-quantization-in-vector-databases).
>
> A potential implementation would be to:
>
> - Define an embedder (`<:AbstractEmbedder` for `get_embeddings`) and the corresponding finder (`<:AbstractSimilarityFinder` for `find_similar`). Both would carry `min_values` and `max_values` vectors holding the effective range of each embedding dimension (e.g., `length(min_values) == length(max_values) == D`).
> - Define methods for these types.
> - Do the conversion to Int8 post hoc (after `build_index`) via a utility function. The resulting finder would hold the ranges needed for converting to Int8 and would be provided to `airag`.
> - Implement the two-stage pass with `rescore_multiplier=4` (first on Int8 embeddings, then rescoring with Float x Int8).
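
To make the proposal above concrete, here is a minimal, library-neutral sketch in plain Python of the per-dimension quantization and the two-stage search. It is not the PromptingTools.jl API: `Int8Index`, `quantize`, and `build_int8_index` are hypothetical names, `find_similar` only borrows its name from the issue, and the linear mapping of each dimension from `[min, max]` onto `[-128, 127]` is an assumption about how the ranges would be used.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Int8Index:
    """Hypothetical container: Int8 codes plus the per-dimension ranges behind them."""
    codes: List[List[int]]
    min_values: List[float]
    max_values: List[float]


def quantize(vec: Sequence[float], min_values: Sequence[float],
             max_values: Sequence[float]) -> List[int]:
    """Linearly map each dimension from [min, max] to [-128, 127], clamping outliers."""
    out = []
    for x, lo, hi in zip(vec, min_values, max_values):
        span = hi - lo
        if span == 0.0:
            out.append(0)  # constant dimension carries no information
            continue
        q = round((x - lo) / span * 255.0) - 128
        out.append(max(-128, min(127, q)))
    return out


def build_int8_index(embeddings: Sequence[Sequence[float]]) -> Int8Index:
    """Post hoc conversion: derive the ranges from the Float embeddings, then quantize."""
    min_values = [min(col) for col in zip(*embeddings)]
    max_values = [max(col) for col in zip(*embeddings)]
    codes = [quantize(e, min_values, max_values) for e in embeddings]
    return Int8Index(codes, min_values, max_values)


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def find_similar(index: Int8Index, query: Sequence[float], top_k: int = 5,
                 rescore_multiplier: int = 4) -> List[Tuple[int, float]]:
    """Two-stage search: Int8 x Int8 shortlist, then Float x Int8 rescoring."""
    # Stage 1: quantize the query with the stored ranges and shortlist candidates.
    q_int8 = quantize(query, index.min_values, index.max_values)
    n_candidates = min(len(index.codes), top_k * rescore_multiplier)
    coarse = sorted(range(len(index.codes)),
                    key=lambda i: dot(q_int8, index.codes[i]), reverse=True)
    candidates = coarse[:n_candidates]
    # Stage 2: rescore only the shortlist with the original Float query.
    rescored = [(i, dot(query, index.codes[i])) for i in candidates]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored[:top_k]


embeddings = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [-1.0, -1.0]]
index = build_int8_index(embeddings)
results = find_similar(index, [1.0, 0.1], top_k=2)
```

In this sketch only the Int8 codes and the two range vectors need to be kept after conversion, since the second stage scores the Float query directly against the Int8 codes. The scores are approximations of the Float similarities, which is why the first stage over-fetches `top_k * rescore_multiplier` candidates before rescoring.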

Original issue from PromptingTools.jl: https://github.com/svilupp/PromptingTools.jl/issues/118
