iqdb-quantize compresses f32 vectors into smaller representations while preserving search quality. A million 768-dim vectors drop from ~3 GB to as little as ~96 MB, trading a controlled amount of recall for memory.
It implements the three standard schemes — scalar (SQ8), product (PQ), and binary (BQ) — behind a single `Quantizer` trait, and reuses `iqdb-distance` for distance on the compressed codes.
MSRV is 1.87+ (Rust 2024 edition). Scalar, product, and binary quantization. One trait. A quality/space dial.
Status: stable (1.0). The public API is committed under SemVer for the 1.x series — no breaking changes until 2.0. See CHANGELOG.md.
- Scalar quantization (SQ8): f32 → `u8` per dimension, ~4× compression; asymmetric distance under every metric
- Product quantization (PQ): subvector k-means codebooks, up to ~192× compression, with batch-ADC scoring for IVF-PQ
- Binary quantization (BQ): one bit per dimension, 32× compression, Hamming distance on packed `u64` words
- One trait: `train` → `quantize` → `distance`, every method fallible and returning a typed error
- Asymmetric distance: compress the database, keep the query in f32 for better recall
- Deterministic: same seed + same data ⇒ byte-identical PQ codebooks on every platform
- Never panics on bad input: empty, non-finite, mismatched, untrained, and unsupported-metric inputs return a typed `IqdbError`
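
The compression figures above follow from simple byte counts. The sketch below is not part of the crate: `bytes_per_vector` and `ratio` are hypothetical helpers, and M = 16 is an assumption, chosen because it reproduces the ~192× figure at 768 dimensions. Codebook and calibration overhead is ignored.

```rust
/// Bytes needed to store one vector of `dim` components under each scheme.
/// `pq_subvectors` (M) is a free parameter; one byte per subvector assumes K <= 256.
fn bytes_per_vector(dim: usize, pq_subvectors: usize) -> [(&'static str, usize); 4] {
    [
        ("f32", dim * 4),             // 4 bytes per dimension
        ("SQ8", dim),                 // 1 byte per dimension
        ("BQ", dim.div_ceil(64) * 8), // 1 bit per dimension, padded to whole u64 words
        ("PQ", pq_subvectors),        // 1 byte per subvector
    ]
}

/// Compression ratio relative to the raw f32 encoding.
fn ratio(dim: usize, bytes: usize) -> f64 {
    (dim * 4) as f64 / bytes as f64
}
```

At 768 dimensions with M = 16 this gives 3072, 768, 96, and 16 bytes per vector, so a million BQ codes at 96 bytes each come to roughly 96 MB.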
```toml
[dependencies]
iqdb-quantize = "1.0"
```

Train on a representative sample, then quantize and score. Scalar (SQ8) supports every metric:
```rust
use iqdb_quantize::{Quantizer, ScalarQuantizer};
use iqdb_types::DistanceMetric;

let training: Vec<Vec<f32>> = vec![
    vec![0.10, 0.20, 0.30],
    vec![0.15, 0.18, 0.32],
    vec![0.12, 0.22, 0.28],
];
let refs: Vec<&[f32]> = training.iter().map(Vec::as_slice).collect();

let mut sq = ScalarQuantizer::new();
sq.train(&refs).unwrap();

let code = sq.quantize(&[0.11_f32, 0.21, 0.29]).unwrap(); // 3 bytes
let d = sq.distance(&[0.10_f32, 0.20, 0.30], &code, DistanceMetric::Cosine).unwrap();
assert!(d.is_finite());
```

Product (PQ) trades a little recall for big compression, and precomputes a query table for batch scoring:
```rust
use iqdb_quantize::{ProductQuantizer, Quantizer};
use iqdb_types::DistanceMetric;

let mut pq = ProductQuantizer::with_config(2, 4, 7); // M = 2 subvectors, K = 4, seed = 7
let training: Vec<Vec<f32>> = (0..16)
    .map(|i| {
        let f = i as f32;
        vec![f, f + 1.0, f + 2.0, f + 3.0]
    })
    .collect();
let refs: Vec<&[f32]> = training.iter().map(Vec::as_slice).collect();
pq.train(&refs).unwrap();

// Build the ADC table once per query, then score many codes against it.
let query = [1.0_f32, 2.0, 3.0, 4.0];
let tables = pq.build_query_tables(&query, DistanceMetric::Euclidean).unwrap();
let code = pq.quantize(&query).unwrap(); // 2 bytes
let d = tables.distance(&code).unwrap();
assert!(d.is_finite());
```

Binary (BQ) is the highest-compression scheme, scored with Hamming distance:
```rust
use iqdb_quantize::{BinaryQuantizer, Quantizer};
use iqdb_types::DistanceMetric;

let mut bq = BinaryQuantizer::new();
bq.train(&[&[0.0_f32, 1.0, 2.0][..], &[2.0_f32, 1.0, 0.0][..]]).unwrap();

let code = bq.quantize(&[0.5_f32, 1.5, 2.5]).unwrap(); // packed u64 words
let d = bq.distance(&[0.5_f32, 1.5, 2.5], &code, DistanceMetric::Hamming).unwrap();
assert_eq!(d, 0.0); // self-distance is zero
```

Two rules to use quantization well: train on representative data, and search quantized but rerank with full f32. Skipping the rerank is the most common cause of "quantization broke recall" reports.
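
The second rule can be illustrated without the crate. The sketch below is a minimal two-stage search under stated assumptions: `search_then_rerank` and `l2_sq` are hypothetical names, and the `approx` closure stands in for whatever quantized distance is in use. It is not the crate's API.

```rust
/// Squared Euclidean distance between two equal-length f32 slices.
fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Two-stage search: rank every id by a cheap approximate score, keep the best
/// `shortlist` ids, then rerank only those with exact f32 distance.
/// `approx(id)` is a stand-in for a quantized distance (hypothetical signature).
fn search_then_rerank(
    query: &[f32],
    originals: &[Vec<f32>],
    approx: impl Fn(usize) -> f32,
    shortlist: usize,
    k: usize,
) -> Vec<usize> {
    // Stage 1: coarse ranking over all ids using the approximate score.
    // (A real implementation would cache scores instead of recomputing them.)
    let mut ids: Vec<usize> = (0..originals.len()).collect();
    ids.sort_by(|&a, &b| approx(a).total_cmp(&approx(b)));
    ids.truncate(shortlist);

    // Stage 2: exact rerank of the shortlist against the full-precision vectors.
    ids.sort_by(|&a, &b| {
        l2_sq(query, &originals[a]).total_cmp(&l2_sq(query, &originals[b]))
    });
    ids.truncate(k);
    ids
}
```

The shortlist should be larger than `k`: the approximate ranking only has to get the true neighbours into the shortlist, and the exact pass then puts them in the right order.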
Every method of the `Quantizer` trait is fallible and returns `iqdb_types::Result`. The library never panics on bad input.

- `ScalarQuantizer` (SQ8): per-dimension affine calibration; codes are `u8`. Supports every `DistanceMetric` via asymmetric distance through `iqdb-distance`.
- `ProductQuantizer` (PQ): `M`-byte codes via deterministic k-means codebooks. `PqAdcTables` precomputes per-query lookup tables for batch scoring. Supports `Euclidean`, `DotProduct`, `Manhattan`; `Cosine` (no global norm; L2-normalize and use `DotProduct`) and `Hamming` (wrong code space) return `IqdbError::InvalidMetric`.
- `BinaryQuantizer` (BQ): one bit per dimension, packed into `Vec<u64>`. Supports `DistanceMetric::Hamming` only; other metrics return `IqdbError::InvalidMetric`.
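
To make the per-query lookup-table idea concrete, here is a minimal sketch of asymmetric distance computation (ADC) written from scratch. It does not show the internals of `PqAdcTables`: `build_tables` and `adc_distance` are hypothetical names, the codebooks are passed in already trained, and the score is squared Euclidean distance by assumption.

```rust
/// Per-query lookup tables: tables[m][k] is the squared Euclidean distance from
/// the query's m-th subvector to centroid k of codebook m.
/// Assumes the query length is divisible by the number of codebooks (M).
fn build_tables(query: &[f32], codebooks: &[Vec<Vec<f32>>]) -> Vec<Vec<f32>> {
    let dsub = query.len() / codebooks.len();
    codebooks
        .iter()
        .enumerate()
        .map(|(m, centroids)| {
            let sub = &query[m * dsub..(m + 1) * dsub];
            centroids
                .iter()
                .map(|c| sub.iter().zip(c).map(|(q, x)| (q - x) * (q - x)).sum::<f32>())
                .collect::<Vec<f32>>()
        })
        .collect()
}

/// Score one M-byte code: one table lookup per subvector, then a sum.
fn adc_distance(tables: &[Vec<f32>], code: &[u8]) -> f32 {
    tables.iter().zip(code).map(|(t, &k)| t[k as usize]).sum::<f32>()
}
```

The tables cost M × K subvector distances once per query; after that, each database code costs only M lookups and additions, which is what makes batch scoring cheap.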
The full per-item reference, including the metric-support matrix and the error variants, is in docs/API.md.
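
As a rough illustration of the BQ layout described above (one bit per dimension, packed into `u64` words), the following sketch packs a vector against a per-dimension threshold and scores two codes with XOR and popcount. The threshold rule is an assumption, since the source does not say what `train` learns for BQ, and `pack_bits` and `hamming` are hypothetical names rather than the crate's API.

```rust
/// Pack one bit per dimension into u64 words: bit i is set when v[i] > threshold[i].
/// The per-dimension threshold is an assumption made for this sketch.
fn pack_bits(v: &[f32], threshold: &[f32]) -> Vec<u64> {
    let mut words = vec![0u64; v.len().div_ceil(64)];
    for (i, (x, t)) in v.iter().zip(threshold).enumerate() {
        if x > t {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    words
}

/// Hamming distance: XOR the words and count the differing bits.
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}
```

Because the distance reduces to one XOR and one popcount per 64 dimensions, a 768-dimensional comparison touches only 12 words.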
v1.0.0: stable. SQ8, PQ, and BQ all ship behind a single `Quantizer` trait, along with the `PqAdcTables` batch-ADC primitive and deterministic seeded k-means. Testing covers property tests for round-trip and distance invariants, recall integration tests against full-f32 baselines, and a consumer-simulation soak that builds a mini IVF-PQ on the public surface. The crate also ships tracing instrumentation, a criterion bench harness, and five runnable examples. The public API is committed under SemVer for the 1.x series (no breaking changes until 2.0; the frozen surface is recorded in the ROADMAP), has zero `unsafe`, and is verified on Windows and Linux across stable and the 1.87 MSRV. The full surface is documented in docs/API.md.
iqdb-quantize is a Phase-2 crate, independent of the index layer. It builds on:

- `iqdb-types`: the `DistanceMetric`, `IqdbError`, and `Result` vocabulary
- `iqdb-distance`: the f32 distance kernels SQ8 and PQ delegate to

and is consumed by:

- `iqdb-ivf`: IVF-PQ scores in-cluster codes through `PqAdcTables`
It is an optimization, not a requirement: iQDB runs without it.
Built to the iQDB Rust standard. See REPS.md (Rust Efficiency & Performance Standards) and dev/DIRECTIVES.md for the engineering law and the definition of done. Before a PR: `cargo fmt --all`, `cargo clippy --all-targets --all-features -- -D warnings`, and `cargo test --all-features` must be clean.
Licensed under either of
- Apache License, Version 2.0 — LICENSE-APACHE
- MIT License — LICENSE-MIT
at your option.