iqdb-quantize compresses f32 vectors into smaller representations while preserving search quality. A million 768-dim vectors drop from ~3 GB to as little as ~96 MB, trading a controlled amount of recall for memory.
It implements the three standard schemes (scalar, product, binary) behind a common `Quantizer` trait, and reuses `iqdb-distance` for distance on quantized codes.
Requires Rust 1.85 or later (MSRV 1.85, Rust 2024 edition).
Status: pre-1.0, in active development. The public API is being designed across the 0.x series and will be frozen at 1.0.0. See `CHANGELOG.md`.
- Scalar quantization — SQ8: f32 to int8, ~4x compression
- Product quantization — PQ: subvector codebooks, 8x-16x compression
- Binary quantization — BQ: sign-based, 32x compression with Hamming distance
- Train / quantize / distance — compute distance directly on the compressed form where possible
- Asymmetric distance — quantize the database, keep the query in f32 for better recall
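
The compression ratios above follow from simple per-vector arithmetic. The sketch below reproduces them with plain functions; it is not part of the crate's API, the function names are made up for illustration, and it ignores the small fixed overhead of calibration tables and codebooks. The PQ code length `m` is an assumed parameter.

```rust
/// Bytes needed to store `n` vectors of dimension `dim` as raw f32.
fn f32_bytes(n: u64, dim: u64) -> u64 {
    n * dim * 4
}

/// SQ8: one u8 code per dimension.
fn sq8_bytes(n: u64, dim: u64) -> u64 {
    n * dim
}

/// BQ: one bit per dimension, packed into 64-bit words.
fn bq_bytes(n: u64, dim: u64) -> u64 {
    n * ((dim + 63) / 64) * 8
}

/// PQ: `m` one-byte codes per vector (`m` is an assumed parameter).
fn pq_bytes(n: u64, m: u64) -> u64 {
    n * m
}
```

For one million 768-dim vectors this gives 3,072,000,000 bytes as f32, 768,000,000 bytes as SQ8, and 96,000,000 bytes as BQ. Under this model, the quoted 8x-16x range for PQ would correspond to `m` between 384 and 192 bytes at this dimension.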
```toml
[dependencies]
iqdb-quantize = "0.1"
```

```rust
use iqdb_quantize::{Quantizer, ScalarQuantizer};
use iqdb_types::DistanceMetric;

// Train on vectors that are representative of the data you will index.
let training: Vec<Vec<f32>> = vec![
    vec![0.10, 0.20, 0.30],
    vec![0.15, 0.18, 0.32],
    vec![0.12, 0.22, 0.28],
];
let refs: Vec<&[f32]> = training.iter().map(Vec::as_slice).collect();

let mut sq = ScalarQuantizer::new();
sq.train(&refs).unwrap();

// Quantize a database vector into its compressed code.
let candidate = [0.11_f32, 0.21, 0.29];
let code = sq.quantize(&candidate).unwrap();

// Asymmetric distance: the query stays in f32, the candidate is a code.
let query = [0.10_f32, 0.20, 0.30];
let d = sq.distance(&query, &code, DistanceMetric::Cosine).unwrap();
assert!(d.is_finite());
```

Two rules for using quantization correctly: train on representative data, and search on quantized codes but rerank the top candidates with full f32 vectors. Skipping the rerank step is the most common cause of "quantization broke recall" reports.
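
The search-then-rerank rule can be sketched without the crate at all. The example below is a minimal illustration, assuming sign-bit codes with Hamming distance for the coarse stage and squared Euclidean distance for the exact stage; every function name here is hypothetical and none of it is the crate's implementation.

```rust
/// Pack the sign of each component into 64-bit words (bit set = non-negative).
fn sign_code(v: &[f32]) -> Vec<u64> {
    let mut code = vec![0u64; (v.len() + 63) / 64];
    for (i, &x) in v.iter().enumerate() {
        if x >= 0.0 {
            code[i / 64] |= 1u64 << (i % 64);
        }
    }
    code
}

/// Number of differing bits between two packed codes.
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Exact squared Euclidean distance on the original f32 vectors.
fn sq_euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Stage 1: shortlist `shortlist` ids by Hamming distance on the codes.
/// Stage 2: rerank the shortlist with exact f32 distance and keep `k`.
fn search_and_rerank(
    query: &[f32],
    database: &[Vec<f32>],
    codes: &[Vec<u64>],
    shortlist: usize,
    k: usize,
) -> Vec<usize> {
    let q_code = sign_code(query);
    let mut coarse: Vec<(u32, usize)> = codes
        .iter()
        .enumerate()
        .map(|(id, c)| (hamming(&q_code, c), id))
        .collect();
    coarse.sort_unstable();
    coarse.truncate(shortlist);

    let mut exact: Vec<(f32, usize)> = coarse
        .iter()
        .map(|&(_, id)| (sq_euclidean(query, &database[id]), id))
        .collect();
    exact.sort_by(|a, b| a.0.total_cmp(&b.0));
    exact.into_iter().take(k).map(|(_, id)| id).collect()
}
```

The coarse stage cannot tell apart vectors that share a sign pattern, so the shortlist should be larger than `k`; the exact pass then restores the true ordering within it.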
Every method of the `Quantizer` trait is fallible and returns `iqdb_types::Result`. The library never panics on bad input.
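
To illustrate what "fallible, never panics" means for a caller, here is a small standalone sketch. `SketchError` and `checked_sq_euclidean` are hypothetical names invented for this example; the crate's real error type and variants may differ.

```rust
/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
enum SketchError {
    EmptyVector,
    DimensionMismatch { expected: usize, found: usize },
}

/// Squared Euclidean distance that reports bad input as an error
/// instead of panicking or silently truncating.
fn checked_sq_euclidean(a: &[f32], b: &[f32]) -> Result<f32, SketchError> {
    if a.is_empty() {
        return Err(SketchError::EmptyVector);
    }
    if a.len() != b.len() {
        return Err(SketchError::DimensionMismatch {
            expected: a.len(),
            found: b.len(),
        });
    }
    Ok(a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum())
}
```

A caller can then `match` on the result or propagate it with `?`, rather than relying on `unwrap()` outside of examples and tests.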
- `ScalarQuantizer` (SQ8): per-dimension affine calibration; codes are `u8`. Supports every `DistanceMetric` via asymmetric distance through `iqdb-distance`.
- `BinaryQuantizer` (BQ): one bit per dimension, packed into `Vec<u64>`. Supports `DistanceMetric::Hamming` only; other metrics return `IqdbError::InvalidMetric`.
- `ProductQuantizer` (PQ): `M`-byte codes via deterministic k-means codebooks (`PqAdcTables` precomputes per-query lookup tables for batch ADC scoring). Supports `Euclidean`, `DotProduct`, `Manhattan`; `Cosine` (no global norm) and `Hamming` (wrong code space) return `IqdbError::InvalidMetric`.
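
As a rough picture of per-dimension affine calibration and asymmetric distance, the sketch below maps each dimension's observed range onto 0..=255. Using the training minimum and maximum as the calibration is an assumption made for this example; the crate may calibrate differently. `Sq8Sketch` is a hypothetical type, and it returns `Option` for brevity where the real API returns `iqdb_types::Result`.

```rust
/// Per-dimension affine calibration: value ≈ min + code * scale.
struct Sq8Sketch {
    min: Vec<f32>,
    scale: Vec<f32>,
}

impl Sq8Sketch {
    /// Learn each dimension's range; `None` for empty or ragged input.
    fn train(data: &[&[f32]]) -> Option<Self> {
        let dim = data.first()?.len();
        if dim == 0 || data.iter().any(|v| v.len() != dim) {
            return None;
        }
        let mut min = vec![f32::INFINITY; dim];
        let mut max = vec![f32::NEG_INFINITY; dim];
        for v in data {
            for (d, &x) in v.iter().enumerate() {
                min[d] = min[d].min(x);
                max[d] = max[d].max(x);
            }
        }
        // A constant dimension gets scale 1.0 so encoding never divides by zero.
        let scale = min
            .iter()
            .zip(&max)
            .map(|(lo, hi)| if hi > lo { (hi - lo) / 255.0 } else { 1.0 })
            .collect();
        Some(Self { min, scale })
    }

    /// Encode a vector as one u8 per dimension, clamping out-of-range values.
    fn quantize(&self, v: &[f32]) -> Option<Vec<u8>> {
        if v.len() != self.min.len() {
            return None;
        }
        let code = v
            .iter()
            .enumerate()
            .map(|(d, &x)| {
                let t = ((x - self.min[d]) / self.scale[d]).round();
                t.clamp(0.0, 255.0) as u8
            })
            .collect();
        Some(code)
    }

    /// Asymmetric squared Euclidean: f32 query against a decoded u8 code.
    fn asymmetric_sq_euclidean(&self, query: &[f32], code: &[u8]) -> Option<f32> {
        if query.len() != self.min.len() || code.len() != self.min.len() {
            return None;
        }
        let dist = query
            .iter()
            .zip(code)
            .enumerate()
            .map(|(d, (&q, &c))| {
                let x = self.min[d] + f32::from(c) * self.scale[d];
                (q - x) * (q - x)
            })
            .sum();
        Some(dist)
    }
}
```

Only the database side is rounded to 256 levels here; the query keeps full precision, which is why asymmetric distance tends to lose less recall than quantizing both sides.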
This is the v0.1.0 release: SQ8, BQ, and PQ quantization land behind a single `Quantizer` trait, with the `PqAdcTables` batch-ADC primitive, deterministic seeded k-means, property tests for round-trip and distance invariants, recall integration tests, and a criterion bench harness. The public API stabilises across the 0.x series and freezes at 1.0.0; see the ROADMAP and `docs/API.md`.
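
For readers unfamiliar with batch ADC (asymmetric distance computation), the idea behind per-query lookup tables can be sketched as follows. This is a generic illustration of the technique for squared Euclidean distance, not the `PqAdcTables` implementation; the `Codebooks` layout and both function names are assumptions, and unlike the crate this sketch indexes directly and would panic on malformed input.

```rust
/// Assumed layout: codebooks[subspace][centroid] is a vector of length `sub_dim`.
type Codebooks = Vec<Vec<Vec<f32>>>;

/// Build one table per subspace: table[m][c] is the squared distance between
/// the query's m-th subvector and centroid c of codebook m.
fn build_adc_tables(query: &[f32], codebooks: &Codebooks, sub_dim: usize) -> Vec<Vec<f32>> {
    codebooks
        .iter()
        .enumerate()
        .map(|(m, book)| {
            let q_sub = &query[m * sub_dim..(m + 1) * sub_dim];
            book.iter()
                .map(|centroid| {
                    q_sub
                        .iter()
                        .zip(centroid)
                        .map(|(q, c)| (q - c) * (q - c))
                        .sum::<f32>()
                })
                .collect()
        })
        .collect()
}

/// Score one M-byte code: one table lookup and one add per subspace.
fn adc_distance(tables: &[Vec<f32>], code: &[u8]) -> f32 {
    tables
        .iter()
        .zip(code)
        .map(|(table, &c)| table[usize::from(c)])
        .sum()
}
```

The tables are built once per query and reused for every code in the batch, so scoring a candidate costs `M` lookups regardless of the original dimension.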
iqdb-quantize is a Phase-2 crate, independent of the index layer. It works alongside these crates:
- `iqdb-types`: vector and metric types
- `iqdb-distance`: distance on quantized codes
- `iqdb-ivf`: IVF-PQ consumes this crate for in-cluster compression
It is an optimization, not a requirement: iQDB runs without it.
See `dev/DIRECTIVES.md` for engineering standards and the definition of done. Before a PR: `cargo fmt --all`, `cargo clippy --all-targets --all-features -- -D warnings`, and `cargo test --all-features` must be clean.
Licensed under either of
- Apache License, Version 2.0 — LICENSE-APACHE
- MIT License — LICENSE-MIT
at your option.