iqdb-quantize
iQDB VECTOR QUANTIZATION


iqdb-quantize compresses f32 vectors into smaller representations while preserving search quality. A million 768-dim vectors drop from ~3 GB to as little as ~96 MB, trading a controlled amount of recall for memory.

It implements the three standard schemes — scalar (SQ8), product (PQ), and binary (BQ) — behind a single `Quantizer` trait, and reuses `iqdb-distance` for distance on the compressed codes.



MSRV is 1.87+ (Rust 2024 edition). Scalar, product, and binary quantization. One trait. A quality/space dial.

Status: stable (1.0). The public API is committed under SemVer for the 1.x series — no breaking changes until 2.0. See CHANGELOG.md.


What it does

  • Scalar quantization (SQ8) — f32 → u8 per dimension, ~4× compression; asymmetric distance under every metric
  • Product quantization (PQ) — subvector k-means codebooks, up to ~192× compression, with batch-ADC scoring for IVF-PQ
  • Binary quantization (BQ) — one bit per dimension, 32× compression, Hamming distance on packed u64 words
  • One trait (train → quantize → distance), every method fallible and returning a typed error
  • Asymmetric distance — compress the database, keep the query in f32 for better recall
  • Deterministic — same seed + same data ⇒ byte-identical PQ codebooks on every platform
  • Never panics on bad input — empty, non-finite, mismatched, untrained, and unsupported-metric inputs return a typed IqdbError
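
The compression ratios above follow from simple per-vector arithmetic. The sketch below is illustrative only: `Footprint` and `footprint` are hypothetical names, not part of the crate, and the figures ignore the small fixed cost of calibration data and codebooks. The PQ figure assumes one byte per subvector code.

```rust
/// Raw storage, in bytes, for `n` vectors of `dim` dimensions under each scheme.
/// Hypothetical helper for illustration; not part of iqdb-quantize.
#[derive(Debug, PartialEq)]
struct Footprint {
    f32_bytes: u64,
    sq8_bytes: u64,
    pq_bytes: u64,
    bq_bytes: u64,
}

/// `pq_m` is the number of PQ subvectors, assuming one byte per subvector code.
fn footprint(n: u64, dim: u64, pq_m: u64) -> Footprint {
    Footprint {
        // Uncompressed: four bytes per dimension.
        f32_bytes: n * dim * 4,
        // SQ8: one byte per dimension.
        sq8_bytes: n * dim,
        // PQ: one byte per subvector.
        pq_bytes: n * pq_m,
        // BQ: one bit per dimension, rounded up to whole u64 words.
        bq_bytes: n * dim.div_ceil(64) * 8,
    }
}
```

For one million 768-dim vectors with an assumed `pq_m` of 16, `footprint(1_000_000, 768, 16)` gives 3,072,000,000 bytes of f32 data, 768,000,000 for SQ8 (4×), 96,000,000 for BQ (32×), and 16,000,000 for PQ (192×).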

Installation

```toml
[dependencies]
iqdb-quantize = "1.0"
```

Quick start

Train on a representative sample, then quantize and score. Scalar (SQ8) supports every metric:

```rust
use iqdb_quantize::{Quantizer, ScalarQuantizer};
use iqdb_types::DistanceMetric;

let training: Vec<Vec<f32>> = vec![
    vec![0.10, 0.20, 0.30],
    vec![0.15, 0.18, 0.32],
    vec![0.12, 0.22, 0.28],
];
let refs: Vec<&[f32]> = training.iter().map(Vec::as_slice).collect();

let mut sq = ScalarQuantizer::new();
sq.train(&refs).unwrap();

let code = sq.quantize(&[0.11_f32, 0.21, 0.29]).unwrap();   // 3 bytes
let d = sq.distance(&[0.10_f32, 0.20, 0.30], &code, DistanceMetric::Cosine).unwrap();
assert!(d.is_finite());
```

Product (PQ) trades a little recall for big compression, and precomputes a query table for batch scoring:

```rust
use iqdb_quantize::{ProductQuantizer, Quantizer};
use iqdb_types::DistanceMetric;

let mut pq = ProductQuantizer::with_config(2, 4, 7); // M = 2 subvectors, K = 4, seed = 7
let training: Vec<Vec<f32>> = (0..16)
    .map(|i| { let f = i as f32; vec![f, f + 1.0, f + 2.0, f + 3.0] })
    .collect();
let refs: Vec<&[f32]> = training.iter().map(Vec::as_slice).collect();
pq.train(&refs).unwrap();

// Build the ADC table once per query, then score many codes against it.
let query = [1.0_f32, 2.0, 3.0, 4.0];
let tables = pq.build_query_tables(&query, DistanceMetric::Euclidean).unwrap();
let code = pq.quantize(&query).unwrap();                    // 2 bytes
let d = tables.distance(&code).unwrap();
assert!(d.is_finite());
```
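
To show why a precomputed query table makes batch scoring cheap, here is a minimal standalone sketch of asymmetric distance computation (ADC). It is not the crate's implementation: `sq_l2`, `build_tables`, and `adc_distance` are hypothetical names, the `codebooks[m][k]` layout is an assumption, the result is a squared Euclidean distance, and input validation is omitted.

```rust
/// Squared Euclidean distance between two equal-length slices.
fn sq_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Build one lookup row per subspace: tables[m][k] is the squared distance
/// between the m-th query subvector and centroid k of subspace m.
/// `codebooks[m][k]` is an assumed layout; no validation is performed.
fn build_tables(query: &[f32], codebooks: &[Vec<Vec<f32>>]) -> Vec<Vec<f32>> {
    let dsub = query.len() / codebooks.len();
    codebooks
        .iter()
        .enumerate()
        .map(|(m, cb)| {
            let q_m = &query[m * dsub..(m + 1) * dsub];
            cb.iter().map(|c| sq_l2(q_m, c)).collect()
        })
        .collect()
}

/// Score one code with M table lookups; no per-code f32 vector math is needed.
fn adc_distance(tables: &[Vec<f32>], code: &[u8]) -> f32 {
    tables.iter().zip(code).map(|(row, &k)| row[k as usize]).sum()
}
```

The table costs M × K subvector distances once per query; after that, each database code costs only M lookups and additions, which is what makes scoring many in-cluster codes inexpensive.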

Binary (BQ) stores one bit per dimension (32× compression) and is scored with Hamming distance:

```rust
use iqdb_quantize::{BinaryQuantizer, Quantizer};
use iqdb_types::DistanceMetric;

let mut bq = BinaryQuantizer::new();
bq.train(&[&[0.0_f32, 1.0, 2.0][..], &[2.0_f32, 1.0, 0.0][..]]).unwrap();

let code = bq.quantize(&[0.5_f32, 1.5, 2.5]).unwrap();       // packed u64 words
let d = bq.distance(&[0.5_f32, 1.5, 2.5], &code, DistanceMetric::Hamming).unwrap();
assert_eq!(d, 0.0); // self-distance is zero
```
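
For intuition about what "packed u64 words" and Hamming scoring involve, the following is a minimal standalone sketch rather than the crate's code. `binarize` and `hamming` are hypothetical names, and the per-dimension threshold rule is an assumption; the README does not say how the trained `BinaryQuantizer` picks its thresholds.

```rust
/// Pack one bit per dimension into u64 words.
/// Bit i is set when v[i] is strictly greater than thresholds[i]
/// (an assumed rule, chosen only for illustration).
fn binarize(v: &[f32], thresholds: &[f32]) -> Vec<u64> {
    let mut words = vec![0u64; v.len().div_ceil(64)];
    for (i, (x, t)) in v.iter().zip(thresholds).enumerate() {
        if x > t {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    words
}

/// Hamming distance: XOR the words and count the differing bits.
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}
```

With thresholds `[1.0, 1.0, 1.0]`, the vector `[0.5, 1.5, 2.5]` packs to the single word `0b110`, and its Hamming distance to itself is 0. Because the distance is one XOR and one popcount per word, a 768-dim comparison touches only 12 words.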

Two rules to use quantization well: train on representative data, and search quantized but rerank with full f32. Skipping the rerank is the most common cause of "quantization broke recall" reports.
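
The rerank step can be as small as the sketch below. It assumes the quantized search has already produced a list of candidate ids and that the original f32 vectors are still available; `exact_sq_l2` and `rerank` are hypothetical names, not part of the crate, and squared Euclidean distance stands in for whichever metric is in use.

```rust
/// Exact squared Euclidean distance on the original f32 vectors.
fn exact_sq_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Re-score candidate ids (for example, the top results of a quantized search)
/// with exact f32 distance and return the `k` nearest, closest first.
/// Every id in `candidates` is assumed to be a valid index into `vectors`.
fn rerank(query: &[f32], candidates: &[usize], vectors: &[Vec<f32>], k: usize) -> Vec<usize> {
    let mut scored: Vec<(f32, usize)> = candidates
        .iter()
        .map(|&id| (exact_sq_l2(query, &vectors[id]), id))
        .collect();
    // total_cmp gives a total order on f32, so the sort never panics on NaN.
    scored.sort_by(|a, b| a.0.total_cmp(&b.0));
    scored.into_iter().take(k).map(|(_, id)| id).collect()
}
```

A common pattern is to fetch several times more candidates from the quantized stage than the final `k`, so that approximation error in the compressed scores can be corrected before results are returned.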


How to use it

Every method of the Quantizer trait is fallible and returns iqdb_types::Result. The library never panics on bad input.

  • ScalarQuantizer (SQ8) — per-dimension affine calibration; codes are u8. Supports every DistanceMetric via asymmetric distance through iqdb-distance.
  • ProductQuantizer (PQ) produces M-byte codes via deterministic k-means codebooks. PqAdcTables precomputes per-query lookup tables for batch scoring. Supports Euclidean, DotProduct, and Manhattan. Cosine (no global norm; L2-normalize and use DotProduct instead) and Hamming (wrong code space) return IqdbError::InvalidMetric.
  • BinaryQuantizer (BQ) — one bit per dimension, packed into Vec<u64>. Supports DistanceMetric::Hamming only; other metrics return IqdbError::InvalidMetric.
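
As an illustration of "per-dimension affine calibration", here is a minimal standalone sketch of one common min/max scheme. It is an assumption about the general technique, not the crate's actual calibration; `Sq8` and its methods are hypothetical names, and unlike the library it performs no input validation (it expects a non-empty training set of equal-length vectors).

```rust
/// Per-dimension affine map: code = round((x - min) / step), x ≈ min + code * step.
/// Hypothetical type for illustration; not the crate's ScalarQuantizer.
struct Sq8 {
    min: Vec<f32>,
    step: Vec<f32>,
}

impl Sq8 {
    /// Learn each dimension's range from a non-empty sample of equal-length vectors.
    fn train(samples: &[&[f32]]) -> Sq8 {
        let dim = samples[0].len();
        let mut min = vec![f32::INFINITY; dim];
        let mut max = vec![f32::NEG_INFINITY; dim];
        for s in samples {
            for (i, &x) in s.iter().enumerate() {
                min[i] = min[i].min(x);
                max[i] = max[i].max(x);
            }
        }
        // A constant dimension gets step 1.0 so encoding never divides by zero.
        let step: Vec<f32> = min
            .iter()
            .zip(&max)
            .map(|(lo, hi)| if hi > lo { (hi - lo) / 255.0 } else { 1.0 })
            .collect();
        Sq8 { min, step }
    }

    /// Map each f32 to a u8, clamping values outside the trained range.
    fn encode(&self, v: &[f32]) -> Vec<u8> {
        v.iter()
            .zip(self.min.iter().zip(&self.step))
            .map(|(&x, (&lo, &st))| ((x - lo) / st).round().clamp(0.0, 255.0) as u8)
            .collect()
    }

    /// Approximate reconstruction of the original values.
    fn decode(&self, code: &[u8]) -> Vec<f32> {
        code.iter()
            .zip(self.min.iter().zip(&self.step))
            .map(|(&c, (&lo, &st))| lo + c as f32 * st)
            .collect()
    }
}
```

Values outside the trained range are clamped to the nearest representable code, which is one reason training on representative data matters: a range learned from an unrepresentative sample wastes codes or saturates.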

The full per-item reference, including the metric-support matrix and the error variants, is in docs/API.md.


Status

v1.0.0 (stable). SQ8, PQ, and BQ all ship behind a single Quantizer trait, with the PqAdcTables batch-ADC primitive, deterministic seeded k-means, property tests for round-trip and distance invariants, recall integration tests against full-f32 baselines, a consumer-simulation soak that builds a mini IVF-PQ on the public surface, tracing instrumentation, a criterion bench harness, and five runnable examples. The public API is committed under SemVer for the 1.x series (no breaking changes until 2.0; the frozen surface is recorded in the ROADMAP), has zero unsafe, and is verified on Windows and Linux across stable and the 1.87 MSRV. The full surface is documented in docs/API.md.



Where It Fits

iqdb-quantize is a Phase-2 crate, independent of the index layer. It builds on:

  • iqdb-types — the DistanceMetric, IqdbError, and Result vocabulary
  • iqdb-distance — the f32 distance kernels SQ8 and PQ delegate to

and is consumed by:

  • iqdb-ivf — IVF-PQ scores in-cluster codes through PqAdcTables

It is an optimization, not a requirement: iQDB runs without it.


Standards

Built to the iQDB Rust standard. See REPS.md (Rust Efficiency & Performance Standards) and dev/DIRECTIVES.md for the engineering law and the definition of done. Before a PR: cargo fmt --all, cargo clippy --all-targets --all-features -- -D warnings, and cargo test --all-features must be clean.


License

Licensed under either of

  • Apache License, Version 2.0 (LICENSE-APACHE)
  • MIT license (LICENSE-MIT)

at your option.

COPYRIGHT © 2026 JAMES GOBER.
