
[Feature Request] Support Uint8Array in EmbedResponse Schema for Native Binary Vector Quantization #5048

@CatoPaus

Description


Is your feature request related to a problem? Please describe. I am frustrated by the memory and bandwidth overhead of storing high-dimensional vector embeddings, specifically when using binary (sign-bit) quantization. Currently, Genkit's @genkit-ai/ai core types restrict the EmbedResponse schema to a number[] array.

This means that even if a developer builds a middleware adapter to collapse high-precision Float32 embeddings down to single bits (0 and 1), the Genkit schema forces us to transport and serialize those bits as standard JSON numbers (64-bit JavaScript floats). This negates the memory benefit of quantization: there is no compression in transit or in storage, even though the math itself is perfectly viable.

Describe the solution you'd like I would like the Genkit core architecture to officially support binary vector transport.

1. Schema Expansion: the EmbedResponse and Document embedding types should accept packed binary encodings (such as Uint8Array, or an Int8 representation) in addition to number[].

2. Retriever Adapters: update or introduce Retriever plugins for databases that natively support Hamming-distance indexing (e.g., Qdrant, Milvus, Pinecone Serverless) so they map and ingest these packed byte arrays out of the box.
This would let Genkit developers achieve a 32x reduction in vector database storage costs and network bandwidth (each 32-bit float dimension becomes a single bit) when running semantic retrieval.
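The distance metric such a retriever adapter would index is straightforward on packed bytes. A sketch (the function name is illustrative, not part of any Genkit plugin):

```typescript
// Hamming distance between two bit-packed vectors: XOR each byte pair
// and count the set bits that differ.
function hammingDistance(a: Uint8Array, b: Uint8Array): number {
  if (a.length !== b.length) throw new Error("length mismatch");
  let dist = 0;
  for (let i = 0; i < a.length; i++) {
    let x = a[i] ^ b[i];
    while (x) {      // Kernighan's trick: clear the lowest set bit
      x &= x - 1;
      dist++;
    }
  }
  return dist;
}

const a = Uint8Array.from([0b10110101, 0b00001111]);
const b = Uint8Array.from([0b10110100, 0b11110000]);
console.log(hammingDistance(a, b)); // differs in 1 + 8 bits → 9
```

Databases with native binary indexes compute this with hardware popcount, which is why keeping the vectors packed end to end matters.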

Describe alternatives you've considered I considered manually Base64-encoding the packed bytes before passing them into Genkit, but the TypeScript Zod validators reject anything that isn't a strict numeric array. As a workaround, I currently submit raw JavaScript arrays of zeros and ones ([0, 1, 0, 0, 1]...), which inflates the payload right back up to float-scale sizes inside the database.
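A rough back-of-envelope illustration of that inflation, comparing the JSON wire size of a 0/1 number array against the packed byte count for the same vector:

```typescript
// 1024 one-bit dimensions: JSON number array vs. packed bytes.
const bits = Array.from({ length: 1024 }, (_, i) => (i % 3 === 0 ? 1 : 0));
const jsonBytes = JSON.stringify(bits).length;  // brackets + digits + commas
const packedBytes = Math.ceil(bits.length / 8); // 128 bytes if packed
console.log(jsonBytes, packedBytes, (jsonBytes / packedBytes).toFixed(1));
```

Even in this best case (every element a single digit), the JSON array is ~16x larger on the wire, before accounting for how the database stores the elements internally.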

Additional context I have already put together an experimental RFC/proof-of-concept repository demonstrating how the evaluation logic for this behaves.

I built a custom node-addon-api C++ quantizer that intercepts Genkit's Float32 arrays and evaluates their sign bits in native code, avoiding Node.js V8 garbage-collection pressure. The math and the Genkit semantic retrieval logic work correctly end to end. You can view the implementation, including the Next.js integration workarounds, here:

Proof of Concept Repository: https://github.com/CatoPaus/genkit-turboquant-embedder

If the Genkit team exposes the necessary schema structure, this repository is ready to pipe the resulting bit-packed binaries straight to the cloud.
