Benchmark the search performance of Approximate Nearest Neighbor (ANN) algorithms implemented in various systems. This repository contains a Python CLI tool to evaluate and compare the efficiency and accuracy of ANN searches across different platforms.
Approximate Nearest Neighbor (ANN) search algorithms are essential for searching high-dimensional data, enabling fast and resource-efficient retrieval of similar items from large datasets. This benchmarking suite aims to provide an empirical basis for comparing the performance of several popular ANN-enabled search systems.
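As a point of reference for what these engines approximate: exact nearest-neighbor search scores a query against every indexed vector, which is precise but scales linearly with corpus size. The snippet below is purely illustrative (NumPy is assumed only for this example; it is not a requirement of the tool), and the exact top-k it produces is the kind of ground truth the precision metrics are measured against.

```python
# Illustrative only: exact (brute-force) top-10 retrieval with inner-product
# scoring, i.e. the ground truth that ANN engines approximate.
import numpy as np

rng = np.random.default_rng(0)
corpus = rng.standard_normal((10_000, 768)).astype(np.float32)  # indexed vectors
query = rng.standard_normal(768).astype(np.float32)

scores = corpus @ query               # inner-product similarity
top_10 = np.argsort(-scores)[:10]     # exact top-10 ids; O(N * d) per query
print(top_10)
```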
Before running the benchmarks, ensure you have the following installed:
- Docker
- Python 3.10 or higher
- uv (Python package manager)
Install uv (if not already installed):
curl -LsSf https://astral.sh/uv/install.sh | sh
Clone the repository and install dependencies:
git clone https://github.com/codelibs/search-ann-benchmark.git
cd search-ann-benchmark
uv sync
Download the dataset:
bash scripts/setup.sh
For GitHub Actions (smaller dataset):
bash scripts/setup.sh gha
# Run Qdrant benchmark with default settings
uv run search-ann-benchmark run qdrant
# Run Elasticsearch with specific configuration
uv run search-ann-benchmark run elasticsearch --target 1m-768-m48-efc200-ef100-ip
# Run with quantization
uv run search-ann-benchmark run elasticsearch --quantization int8 --variant int8
# Skip filtered search benchmark
uv run search-ann-benchmark run chroma --no-filter

# List available engines
uv run search-ann-benchmark list-engines

# List available benchmark targets
uv run search-ann-benchmark list-targets

# Show the configuration for an engine and target
uv run search-ann-benchmark show-config qdrant --target 100k-768-m32-efc200-ef100-ip

# Display previously saved results
uv run search-ann-benchmark show-results results.json
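These commands can also be scripted, for example to benchmark several engines back to back. A minimal sketch using subprocess; the engine names and CLI invocation mirror the examples above:

```python
# Sketch: drive the benchmark CLI for several engines in sequence.
import subprocess

for engine in ["qdrant", "elasticsearch", "chroma"]:
    subprocess.run(
        ["uv", "run", "search-ann-benchmark", "run", engine],
        check=True,
    )
```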
Available benchmark targets:

| Name | Index Size | HNSW M | Description |
|---|---|---|---|
| 100k-768-m32-efc200-ef100-ip | 100,000 | 32 | Small dataset for quick testing |
| 1m-768-m48-efc200-ef100-ip | 1,000,000 | 48 | Medium dataset |
| 5m-768-m48-efc200-ef100-ip | 5,000,000 | 48 | Full dataset |
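The target names encode the index parameters. The reading below is an interpretation based on the naming pattern, not something the tool defines for users: index size, vector dimension, HNSW M, efConstruction, search-time ef, and distance metric (ip = inner product).

```python
# Hypothetical helper: decode a target name such as "1m-768-m48-efc200-ef100-ip".
# The field meanings are assumed from the naming pattern, not part of this tool's API.
def parse_target(name: str) -> dict:
    size, dim, m, efc, ef, metric = name.split("-")
    return {
        "index_size": size,               # "100k", "1m", "5m"
        "dimension": int(dim),            # 768
        "hnsw_m": int(m[1:]),             # "m48" -> 48
        "ef_construction": int(efc[3:]),  # "efc200" -> 200
        "ef_search": int(ef[2:]),         # "ef100" -> 100
        "metric": metric,                 # "ip" = inner product
    }

print(parse_target("1m-768-m48-efc200-ef100-ip"))
```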
Different engines support different quantization modes:
- Qdrant: none, int8
- Elasticsearch: none, int4, int8, bbq
- OpenSearch: none (supports faiss engine variant)
- Weaviate: none, pq
- pgvector: vector, halfvec
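For background, int8-style scalar quantization stores each vector component in a single byte instead of a 32-bit float, roughly quartering vector memory at some cost in accuracy. The sketch below illustrates the general idea only; it is not how any particular engine implements its quantization:

```python
# Illustrative min/max scalar quantization of a float32 vector to 8-bit codes.
# Real engines use more careful schemes (per-segment ranges, re-ranking, etc.).
import numpy as np

vec = np.random.default_rng(1).standard_normal(768).astype(np.float32)

lo, hi = float(vec.min()), float(vec.max())
scale = (hi - lo) / 255.0
codes = np.round((vec - lo) / scale).astype(np.uint8)  # 1 byte per component
restored = codes.astype(np.float32) * scale + lo        # lossy reconstruction

print("max reconstruction error:", float(np.abs(vec - restored).max()))
```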
The repository is organized as follows:

search-ann-benchmark/
├── src/search_ann_benchmark/
│ ├── __init__.py
│ ├── cli.py # CLI entry point
│ ├── config.py # Configuration classes
│ ├── runner.py # Benchmark orchestration
│ ├── core/
│ │ ├── base.py # Abstract engine interface
│ │ ├── docker.py # Docker management
│ │ ├── embedding.py # Embedding loader
│ │ └── metrics.py # Metrics calculation
│ └── engines/
│ ├── qdrant.py
│ ├── elasticsearch.py
│ ├── opensearch.py
│ ├── milvus.py
│ ├── weaviate.py
│ ├── vespa.py
│ ├── pgvector.py
│ ├── chroma.py
│ ├── clickhouse.py
│ ├── lancedb.py
│ ├── redisstack.py
│ └── vald.py
├── tests/
├── scripts/
│ ├── setup.sh # Dataset download
│ └── get_hardware_info.sh
└── .github/workflows/ # CI workflows
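The core/metrics.py module aggregates the latency and precision statistics reported in the results. Precision is presumably the usual ANN-benchmark notion: the overlap between an engine's top-k hits and the exact top-k ground truth. A minimal sketch of that calculation (not the repository's actual code):

```python
# Illustrative precision@k: the fraction of returned ids that appear in the
# exact (brute-force) top-k.
def precision_at_k(approx_ids: list[int], exact_ids: list[int], k: int) -> float:
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

print(precision_at_k([3, 7, 1, 9], [3, 1, 7, 4], k=4))  # -> 0.75
```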
Benchmark results are saved to results.json with the following structure:
{
"variant": "",
"target": "100k-768-m32-efc200-ef100-ip",
"version": "1.13.6",
"settings": { ... },
"results": {
"indexing": {
"execution_time": 123.45,
"process_time": 100.23,
"container": { ... }
},
"top_10": {
"num_of_queries": 10000,
"took": { "mean": 5.2, "std": 1.1, ... },
"hits": { ... },
"precision": { "mean": 0.95, ... }
},
"top_100": { ... },
"top_10_filtered": { ... },
"top_100_filtered": { ... }
},
"timestamp": "2024-01-01T00:00:00"
}
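The show-results command summarizes a saved file, and the same file can also be inspected directly. A small sketch that follows the key names in the example above (other runs may contain different sections):

```python
# Sketch: print a short summary of a results.json file, using the key names
# from the example structure above.
import json

with open("results.json") as f:
    data = json.load(f)

print("target:", data["target"], "| version:", data["version"])
print("indexing execution_time:", data["results"]["indexing"]["execution_time"])
for section in ("top_10", "top_100", "top_10_filtered", "top_100_filtered"):
    r = data["results"].get(section)
    if r:
        print(section, "- mean took:", r["took"]["mean"],
              "mean precision:", r["precision"]["mean"])
```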
Run the test suite and code quality checks:

uv run pytest
uv run ruff check --fix src tests
uv run ruff format src tests
uv run mypy src

To update an engine version, modify the ENGINE_VERSION in:
- The engine config class in src/search_ann_benchmark/engines/<engine>.py
- The corresponding workflow in .github/workflows/run-<engine>-linux.yml
For a comparison of the results, including response times and precision metrics for the different ANN algorithms, see the Benchmark Results Page.
We welcome contributions! If you have suggestions for additional benchmarks, improvements to existing ones, or fixes for any issues, please feel free to open an issue or submit a pull request.
This project is licensed under the Apache License 2.0.