vxdb

The vector database that fits in your pocket.

Rust-powered. Python-native. One pip install away.

pip install vxdb

import vxdb

db = vxdb.Database(path="./my_data")  # persistent — data survives restarts
collection = db.create_collection("docs", dimension=384)

embed = your_embedding_function  # OpenAI, Sentence Transformers, Cohere, etc.

collection.upsert(
    ids=["a", "b"],
    vectors=[embed("how to train a model"), embed("best pasta recipe")],
    documents=["how to train a model", "best pasta recipe"],
)

collection.query(vector=embed("machine learning"), top_k=5)

embed() is any function that turns text into vectors — see examples/ for OpenAI, Sentence Transformers, LangChain, and Cohere.
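No model handy? A stand-in embedder is enough to try the snippet above offline. The sketch below is not part of vxdb: `toy_embed` is a hypothetical helper that hashes each word into one of 384 buckets and L2-normalizes the result, so similarity reflects word overlap only, not meaning.

```python
import hashlib
import math


def toy_embed(text: str, dimension: int = 384) -> list[float]:
    """Hypothetical stand-in embedder: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dimension
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        # First four bytes pick the bucket, the fifth picks the sign.
        bucket = int.from_bytes(digest[:4], "big") % dimension
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[bucket] += sign
    norm = math.sqrt(sum(x * x for x in vec))
    # An empty (or fully cancelled) vector is returned as all zeros.
    return [x / norm for x in vec] if norm else vec
```

Set `embed = toy_embed` to run the quick example without an API key, then swap in a real model when relevance matters.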

That's it. No Docker. No config files. No cloud account. No 500 MB of dependencies.

Why developers choose vxdb

Stupid fast

The entire hot path — distance computation, HNSW traversal, BM25 scoring, mmap I/O — is pure Rust with zero GIL contention. Your Python code calls directly into compiled native code via PyO3. No serialization overhead. No REST round-trips. No subprocess.

Stupid light

A single native wheel under 5 MB with zero Python dependencies. Starts in under 10 ms. No numpy. No scipy. No protobuf. No grpcio version conflicts. Just pip install vxdb and you're done.

Runs anywhere

Laptop. CI pipeline. Raspberry Pi. AWS Lambda. Docker container. Air-gapped server. Anywhere Python runs, vxdb runs. No infrastructure required to get started — scale up to a standalone server when you need it.

Hybrid search built-in

Vector similarity + BM25 keyword matching fused via Reciprocal Rank Fusion. One API call. Tunable alpha parameter. No separate search engine needed. No Elasticsearch sidecar.

Other databases like Qdrant, Milvus, and Zvec support hybrid search too — but they require you to run a separate sparse encoder (BM25 or SPLADE) yourself and pass pre-computed sparse vectors. vxdb computes BM25 internally from the documents you already upserted. One call: hybrid_query(vector=..., query="text", alpha=0.5). No extra step.
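To make the keyword half concrete, here is a minimal BM25 scorer in plain Python. It is a sketch, not vxdb's Rust implementation: the whitespace tokenizer, the `k1=1.5` and `b=0.75` defaults, the Lucene-style IDF, and the name `bm25_scores` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query: str, documents: list[str],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Return one BM25 score per document (illustrative, assumed parameters)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs if n_docs else 0.0

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1.0 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Longer-than-average documents are penalized through b.
            length_norm = k1 * (1.0 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / (tf[term] + length_norm)
        scores.append(score)
    return scores
```

Documents that share no term with the query score exactly zero, which is why keyword search shines on exact names and codes.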

Dual-mode: embedded + server

Many databases now offer an "embedded" mode — but the implementations vary widely. Qdrant's local mode is a Python reimplementation (not their Rust engine). Weaviate embedded downloads a Go binary and runs it as a subprocess. Milvus Lite works but is limited to Linux/macOS and recommended for <1M vectors.

vxdb's embedded mode is the real Rust engine compiled directly into a Python extension via PyO3. No serialization. No subprocess. No network. And the same engine powers the standalone REST server — start in a notebook, scale to multi-client HTTP when you're ready. No rewrite.

Quick Start

3 lines to your first search

import vxdb

# Persistent (data survives restarts)
db = vxdb.Database(path="./my_data")

# Or in-memory (ephemeral, great for prototyping)
# db = vxdb.Database()

collection = db.create_collection("docs", dimension=384, metric="cosine")

Insert vectors

collection.upsert(
    ids=["a", "b", "c"],
    vectors=[[0.1, 0.2, ...], [0.3, 0.4, ...], [0.5, 0.6, ...]],
    metadata=[{"type": "article"}, {"type": "blog"}, {"type": "article"}],
    documents=["intro to ML", "my favorite recipes", "deep learning guide"],
)

Search — four ways

# 1. Vector similarity
results = collection.query(vector=[0.1, 0.2, ...], top_k=5)

# 2. Filtered (metadata constraints)
results = collection.query(
    vector=[0.1, ...], top_k=5,
    filter={"type": {"$eq": "article"}}
)

# 3. Hybrid (vector + keyword — the sweet spot)
results = collection.hybrid_query(
    vector=[0.1, ...],
    query="machine learning",
    top_k=5,
    alpha=0.5,  # 0=keyword only, 1=vector only
)

# 4. Keyword only (BM25)
results = collection.keyword_search(query="machine learning", top_k=5)

Every result returns {"id", "score", "metadata", "document"}.
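The filter in search mode 2 follows an operator-per-field shape. The sketch below shows how such a filter could be evaluated against one metadata dict. It is an illustration, not vxdb's engine: the `$and`/`$or` list syntax and the rule that a missing field never matches are assumptions, and `matches` is a hypothetical name.

```python
import operator

# Comparison operators named in the Parameters table, mapped to Python callables.
_COMPARATORS = {
    "$eq": operator.eq, "$ne": operator.ne,
    "$gt": operator.gt, "$gte": operator.ge,
    "$lt": operator.lt, "$lte": operator.le,
    "$in": lambda value, options: value in options,
    "$nin": lambda value, options: value not in options,
}


def matches(metadata: dict, flt: dict) -> bool:
    """Return True if the metadata satisfies every condition in the filter."""
    for key, condition in flt.items():
        if key == "$and":  # assumed shape: {"$and": [filter, filter, ...]}
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":  # assumed shape: {"$or": [filter, filter, ...]}
            if not any(matches(metadata, sub) for sub in condition):
                return False
        else:
            if key not in metadata:  # assumption: missing fields never match
                return False
            for op, operand in condition.items():
                if not _COMPARATORS[op](metadata[key], operand):
                    return False
    return True
```

For example, `matches({"type": "article"}, {"type": {"$eq": "article"}})` is true, while the same filter rejects `{"type": "blog"}`.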

Installation

pip install vxdb

That's the whole thing. Works on macOS, Linux, Windows. Python 3.11+.

For the HTTP client (talking to a remote vxdb server):

pip install 'vxdb[server]'

Embedding Providers

vxdb stores pre-computed vectors — bring any embedding model you want. We have step-by-step notebooks for each:

Provider	Install	API Key?	Notebook
OpenAI	`pip install openai`	Yes	examples/openai_embeddings.ipynb
Sentence Transformers	`pip install sentence-transformers`	No (local)	examples/sentence_transformers.ipynb
LangChain (any provider)	`pip install langchain-openai`	Depends	examples/langchain_integration.ipynb
Cohere	`pip install cohere`	Yes	examples/cohere_embeddings.ipynb
Ollama (local LLMs)	`pip install ollama`	No (local)	—

Or let vxdb embed for you. Attach an embedding_function to a collection and work in text — pass documents to upsert and query_text to query:

from vxdb import Database, EmbeddingFunction

class MyEmbedder(EmbeddingFunction):
    def embed(self, texts: list[str]) -> list[list[float]]:
        return your_model.encode(texts)

db = Database()
docs = db.create_collection("docs", embedding_function=MyEmbedder())  # dimension inferred

docs.upsert(ids=["a", "b"], documents=["how to train a model", "best pasta recipe"])
docs.query(query_text="machine learning", top_k=5)

The embedding_function can be an EmbeddingFunction subclass or any callable list[str] -> list[list[float]]. Passing vectors/vector explicitly always works and bypasses embedding — vxdb never requires or imports your model library.
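Many model APIs embed one string at a time, while the callable form described above takes a batch. A small adapter bridges the two. This is a sketch; `as_batch_embedder` is a hypothetical helper, not something vxdb ships.

```python
from typing import Callable


def as_batch_embedder(
    embed_one: Callable[[str], list[float]],
) -> Callable[[list[str]], list[list[float]]]:
    """Wrap a per-text embedder into a list[str] -> list[list[float]] callable."""

    def embed_batch(texts: list[str]) -> list[list[float]]:
        # list(...) copies each vector so callers get plain Python lists.
        return [list(embed_one(text)) for text in texts]

    return embed_batch
```

The wrapped callable could then be passed as `embedding_function=as_batch_embedder(my_embed)`, where `my_embed` is whatever single-text function your provider gives you.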

Server Mode

Same engine, accessed over HTTP. Deploy it as a standalone service.

The server ships as a separate, optional package — pip install vxdb-server adds the vxdb-server binary without touching the lean core vxdb wheel:

# Install the standalone server (separate package, no extra deps)
pip install vxdb-server

# Start it
vxdb-server --host 0.0.0.0 --port 8080

The Python Client lives in the core package — install it with the server extra (which pulls in httpx):

pip install 'vxdb[server]'

Note: server mode is currently in-memory only — data does not persist across restarts. For persistence, use embedded mode (vxdb.Database(path=...)).

Python client:

from vxdb import Client

client = Client("http://localhost:8080")
coll = client.create_collection("docs", dimension=384)
coll.upsert(ids=["a"], vectors=[[0.1, ...]], documents=["hello world"])
results = coll.hybrid_query(vector=[0.1, ...], query="hello", top_k=5)

cURL:

# Create collection (2 dimensions here so the short vectors below match)
curl -X POST localhost:8080/collections \
  -H "Content-Type: application/json" \
  -d '{"name": "docs", "dimension": 2}'

# Upsert
curl -X POST localhost:8080/collections/docs/upsert \
  -H "Content-Type: application/json" \
  -d '{"ids": ["a"], "vectors": [[0.1, 0.2]], "documents": ["hello world"]}'

# Query
curl -X POST localhost:8080/collections/docs/query \
  -H "Content-Type: application/json" \
  -d '{"vector": [0.1, 0.2], "top_k": 5}'

Docker:

docker build -t vxdb .
docker run -p 8080:8080 vxdb    # ~145 MB Debian-based image

Hybrid Search

Many vector databases make you choose between vector search and keyword search, or bring your own sparse encoder to combine them. vxdb gives you both, fused in a single call.

How it works:

1. You upsert with documents: raw text is tokenized into a built-in BM25 index alongside your vectors.
2. At query time, vector search and BM25 run in parallel, then Reciprocal Rank Fusion merges both ranked lists.
3. You control the blend: alpha=1.0 is pure vector, alpha=0.5 is balanced, alpha=0.0 is pure keyword.
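The fusion step above can be sketched in a few lines. This is an illustration of alpha-weighted Reciprocal Rank Fusion, not vxdb's exact formula: the constant `k=60` (a common RRF default), the way `alpha` weights each list, and the name `rrf_fuse` are assumptions.

```python
def rrf_fuse(vector_ranked: list[str], keyword_ranked: list[str],
             alpha: float = 0.5, k: int = 60, top_k: int = 5) -> list[tuple[str, float]]:
    """Merge two ranked ID lists; each hit contributes weight / (k + rank)."""
    scores: dict[str, float] = {}
    weighted_lists = ((alpha, vector_ranked), (1.0 - alpha, keyword_ranked))
    for weight, ranking in weighted_lists:
        if weight <= 0.0:
            continue  # alpha=1.0 ignores keywords, alpha=0.0 ignores vectors
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first; ties broken by ID for stable output.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:top_k]
```

Because only ranks enter the formula, the two lists never need comparable score scales, which is the main appeal of RRF over mixing raw cosine and BM25 scores.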

When to use it: Specific product names. Error codes. Proper nouns. Anything where exact terms matter alongside semantic meaning. See examples/hybrid_search.ipynb for a deep dive with side-by-side comparisons.

results = collection.hybrid_query(
    vector=embed("lightweight laptop for students"),
    query="MacBook Air M4",
    top_k=5,
    alpha=0.5,
)

How vxdb compares

Feature	vxdb	Zvec (Alibaba)	ChromaDB	Qdrant	Pinecone	Milvus	Weaviate	FAISS
Language	Rust	C++ (Proxima)	Rust (v1.0+)	Rust	Proprietary	Go/C++	Go	C++
Embedded mode	PyO3, true in-process	In-process	In-process	Python-only local mode	No	Milvus Lite	Subprocess (downloads Go binary)	SWIG bindings
Server mode	Yes	No	Yes	Yes	Cloud only	Yes	Yes	No
`pip install` just works	Yes	Yes	Yes	Yes (local mode)	N/A (SaaS)	Yes (Milvus Lite)	Yes (Linux/macOS)	Yes
Python dependencies	None (zero)	DashText SDK	Several	numpy, grpcio, etc.	N/A	grpcio, protobuf, etc.	grpcio, etc.	numpy
Wheel size	~5 MB	~30 MB	~20 MB	~50 MB	N/A	~50 MB+	~100 MB+ (downloads binary)	~20 MB
Startup time	<10 ms	<100 ms	<500 ms	~1-3 s (server)	N/A	~5-10 s (server)	~3-5 s (server)	<10 ms
Hybrid search	Built-in BM25 + RRF	BM25 + RRF + weighted	RRF (dense+sparse)	RRF, DBSF	Sparse+dense	Sparse vectors	BM25 + RRF	No
BM25 without external encoder	Yes (automatic)	Requires DashText SDK	Yes	Requires sparse encoder	No	Requires sparse encoder	Yes	No
Sparse vectors	No	Yes	Yes	Yes	Yes	Yes	No	No
Multi-vector queries	No	Yes	No	Yes	No	No	No	No
Metadata filtering	10 operators	Structured filters	Yes	Yes	Yes	Yes	Yes	No
Persistence	mmap + SQLite + WAL	Custom engine	SQLite	Gridstore	Cloud	RocksDB	LSM	Manual
Crash recovery	WAL	Yes	Yes (v1.0)	Yes	Yes	Yes	Yes	No
Quantization	No (planned)	FP16, INT8, INT4, RaBitQ	No	Scalar/PQ	Yes	Yes	PQ/BQ	PQ/SQ
Docker image	~145 MB	N/A (no server)	~200 MB+	~100 MB	No	~1 GB+	~300 MB+	No
Runs offline	Yes	Yes	Yes	Yes	No	Yes	Yes	Yes
License	Apache 2.0	Apache 2.0	Apache 2.0	Apache 2.0	Proprietary	Apache 2.0	BSD-3	MIT

API Reference

Python (Embedded)

# Database
db = vxdb.Database()                  # in-memory (ephemeral)
db = vxdb.Database(path="./my_data")  # persistent (data survives restarts)
db.create_collection(name, dimension, metric="cosine", index="flat")
db.get_collection(name)
db.list_collections()
db.delete_collection(name)

# Collection
collection.upsert(ids, vectors, metadata=None, documents=None)
collection.query(vector, top_k=10, filter=None)
collection.hybrid_query(vector, query, top_k=10, alpha=0.5)
collection.keyword_search(query, top_k=10)
collection.delete(ids)
collection.count()

vectors accepts a list[list[float]] or a 2-D float32 NumPy array — NumPy arrays are read zero-copy via the buffer protocol, and NumPy is never imported or required.
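If the buffer protocol is unfamiliar, the standard library can demonstrate the idea without NumPy. The sketch below views a flat float32 buffer as a 2-D matrix with no copy. It illustrates the mechanism only; how vxdb consumes the buffer on the Rust side is not shown here, and `as_2d_view` is a hypothetical name.

```python
from array import array


def as_2d_view(flat: array, rows: int, cols: int) -> memoryview:
    """View a flat float32 array as a rows x cols matrix without copying."""
    # memoryview can only reshape via a byte view, hence the cast to "B" first.
    return memoryview(flat).cast("B").cast("f", shape=[rows, cols])


flat = array("f", [0.5, 0.25, 1.0, 2.0, 4.0, 8.0])  # two 3-dimensional vectors
view = as_2d_view(flat, rows=2, cols=3)
before = view.tolist()   # materializes a copy for inspection
flat[0] = 16.0           # write through the original buffer
after = view[0, 0]       # the view sees the change: same memory
```

A 2-D float32 NumPy array exposes its memory the same way, which is what makes reading it without importing NumPy possible.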

REST API

Method	Endpoint	Description
`POST`	`/collections`	Create collection
`GET`	`/collections`	List collections
`DELETE`	`/collections/{name}`	Delete collection
`POST`	`/collections/{name}/upsert`	Upsert vectors (+ optional documents)
`POST`	`/collections/{name}/query`	Vector search (+ optional filter)
`POST`	`/collections/{name}/hybrid`	Hybrid vector + keyword search
`POST`	`/collections/{name}/keyword`	BM25 keyword search
`POST`	`/collections/{name}/delete`	Delete vectors by ID
`GET`	`/collections/{name}/count`	Count vectors
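The endpoints above can be called from any HTTP client. Here is a dependency-free sketch using only the standard library. The `/collections/{name}/hybrid` path comes from the table; the JSON field names (`vector`, `query`, `top_k`, `alpha`) are an assumption that the body mirrors the Python method's arguments, and both function names are hypothetical.

```python
import json
import urllib.request


def build_hybrid_request(base_url: str, collection: str, vector: list[float],
                         query: str, top_k: int = 5,
                         alpha: float = 0.5) -> urllib.request.Request:
    """Build (but do not send) a POST request for the hybrid endpoint."""
    # Assumed body shape: same names as collection.hybrid_query(...).
    payload = {"vector": vector, "query": query, "top_k": top_k, "alpha": alpha}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/collections/{collection}/hybrid",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request):
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running locally, `send(build_hybrid_request("http://localhost:8080", "docs", [0.1, 0.2], "hello"))` would perform the call.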

Parameters

Parameter	Values	Default
`metric`	`"cosine"`, `"euclidean"`, `"dot"`	`"cosine"`
`index`	`"flat"` (exact), `"hnsw"` (approximate)	`"flat"`
`filter`	`$eq` `$ne` `$gt` `$gte` `$lt` `$lte` `$in` `$nin` `$and` `$or`	—
`alpha`	`0.0` (keyword) to `1.0` (vector)	`0.5`
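For intuition about `metric` and the exact `"flat"` index, here is a brute-force search in plain Python. It is a sketch of the idea (score every vector, keep the best `top_k`), not vxdb's Rust code. The ordering convention (higher is better for cosine and dot, lower is better for euclidean) and the name `flat_search` are assumptions.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0  # zero vectors score 0.0


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def flat_search(query, vectors: dict, top_k: int = 10, metric: str = "cosine"):
    """Exhaustively score every stored vector and return the best top_k."""
    measures = {"cosine": cosine, "dot": dot, "euclidean": euclidean}
    if metric not in measures:
        raise ValueError(f"unknown metric: {metric!r}")
    scored = [(doc_id, measures[metric](query, vec)) for doc_id, vec in vectors.items()]
    if metric == "euclidean":
        scored.sort(key=lambda pair: (pair[1], pair[0]))    # smaller distance first
    else:
        scored.sort(key=lambda pair: (-pair[1], pair[0]))   # larger similarity first
    return scored[:top_k]
```

An exhaustive scan costs time linear in the collection size, which is the trade-off the approximate `"hnsw"` index is there to avoid.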

Examples

Interactive Jupyter notebooks with step-by-step walkthroughs:

Notebook	What you'll build
quickstart.ipynb	Every feature in 5 min (no API keys)
openai_embeddings.ipynb	Semantic search with OpenAI embeddings
sentence_transformers.ipynb	Free, local embeddings (no API key)
langchain_integration.ipynb	LangChain + RAG pipeline
cohere_embeddings.ipynb	Multilingual search with Cohere
hybrid_search.ipynb	Deep dive: vector vs keyword vs hybrid

Development

git clone https://github.com/getmykhan/vxdb.git && cd vxdb

# Rust
cargo build --workspace
cargo test --workspace  # 120+ tests

# Python
uv venv .venv && source .venv/bin/activate
uv pip install maturin pytest httpx
maturin develop
PYTHONPATH=python pytest tests/ -v

The codebase is a Cargo workspace:

vxdb/
├── crates/
│   ├── vxdb-core/       # Engine: indexes, distance, storage, hybrid search
│   ├── vxdb-python/     # PyO3 bindings
│   └── vxdb-server/     # Axum REST API server
├── python/vxdb/         # Python package (client SDK, embedding interface)
├── examples/             # Jupyter notebooks
└── tests/                # Python integration tests

Roadmap

Persistent collections (mmap + SQLite + WAL): done
SIMD-accelerated distance computation
Quantization (int8/binary) for reduced memory
GPU acceleration (CUDA/Metal)
HNSW graph serialization (fast restart for large indexes)
Streaming upsert for large datasets
Sparse vector support
gRPC API
Official LangChain VectorStore integration
Kubernetes Helm chart
Benchmarks suite vs Qdrant, ChromaDB, Zvec, FAISS

License

Apache 2.0
