This document describes the performance benchmarking system created for semindex.
It measures the performance of semindex operations, including:
- Indexing performance (how fast code repositories can be indexed)
- Embedding generation performance (how fast text can be converted to embeddings)
- Search query performance (how fast queries can be executed against the index)
- Memory usage during operations
This module contains:
- BenchmarkResult: Data class to store benchmark results
- MemoryTracker: Class to track memory usage
- BenchmarkRunner: Main runner class to execute benchmarks
- Functions to create sample code and repositories for testing
- Dedicated benchmarking functions for each operation type
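A minimal sketch of what the first two pieces might look like; the field names and class internals here are illustrative assumptions, not the exact semindex API:

```python
import tracemalloc
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    """Stores the outcome of a single benchmark run (hypothetical shape)."""
    name: str
    duration_s: float        # wall-clock execution time
    memory_delta_mb: float   # memory consumed during the run
    peak_memory_mb: float    # peak memory while the run was active
    items_processed: int = 0

    @property
    def throughput(self) -> float:
        """Items processed per second (0 if nothing ran)."""
        return self.items_processed / self.duration_s if self.duration_s else 0.0


class MemoryTracker:
    """Context manager that tracks allocations via tracemalloc."""

    def __enter__(self):
        tracemalloc.start()
        self._before, _ = tracemalloc.get_traced_memory()
        return self

    def __exit__(self, *exc):
        current, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        self.delta_mb = (current - self._before) / 1e6
        self.peak_mb = peak / 1e6
        return False
```

Usage is a plain `with MemoryTracker() as mem:` block around the operation being measured, after which `mem.delta_mb` and `mem.peak_mb` hold the memory metrics.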
This script provides:
- Command-line interface to run different benchmark suites
- Multiple test scenarios with different code sizes and complexities
- Comparison of different chunking methods
- Memory usage and execution time measurements
- Throughput calculations (operations per second)
```shell
# Run the complete suite (default)
python scripts/run_benchmarks.py

# Run only indexing benchmarks
python scripts/run_benchmarks.py --benchmark indexing

# Run only embedding benchmarks
python scripts/run_benchmarks.py --benchmark embedding

# Run only search benchmarks
python scripts/run_benchmarks.py --benchmark search

# Run incremental indexing benchmarks
python scripts/run_benchmarks.py --benchmark incremental

# Run the complete suite explicitly
python scripts/run_benchmarks.py --benchmark complete

# Save a plot of the benchmark results
python scripts/run_benchmarks.py --output-plot results.png
```

The benchmark system includes various test scenarios:
- Different Code Sizes
  - Small (50 lines)
  - Medium (200 lines)
  - Large (1000 lines)
- Different File Counts
  - Few large files
  - Many small files
  - Mixed file sizes
- Different Operations
  - Indexing with different chunking methods (symbol vs semantic)
  - Embedding generation with different batch sizes
  - Search with different top-k values
For each benchmark run, the system records:
- Execution time for each operation
- Memory usage before and after operations
- Memory delta (how much memory was consumed)
- Peak memory usage during operations
- Throughput (operations per second)
- Number of items processed
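The metrics above can all be captured around a single callable with the standard library; this `measure` helper is an illustrative sketch, not a semindex function:

```python
import time
import tracemalloc
from typing import Callable


def measure(name: str, op: Callable[[], int]) -> dict:
    """Run `op` once and collect the metrics listed above.

    `op` is assumed to return the number of items it processed.
    """
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    start = time.perf_counter()

    items = op()

    duration = time.perf_counter() - start
    after, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "name": name,
        "duration_s": duration,
        "memory_before_mb": before / 1e6,
        "memory_after_mb": after / 1e6,
        "memory_delta_mb": (after - before) / 1e6,
        "peak_memory_mb": peak / 1e6,
        "items": items,
        "throughput": items / duration if duration else 0.0,
    }
```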
The benchmark system helps identify:
- Performance bottlenecks in the indexing pipeline
- Memory efficiency of different operations
- Optimal batch sizes for embedding generation
- Scalability with different repository sizes
- Effectiveness of different chunking strategies
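Finding an optimal batch size, for example, amounts to timing the embedding call at several batch sizes and comparing throughput. A sketch, assuming a hypothetical `embed_batch` callable that embeds a list of texts:

```python
import time


def sweep_batch_sizes(embed_batch, texts, batch_sizes=(8, 32, 128)):
    """Time embed_batch at several batch sizes and report texts/second.

    Returns the best batch size and the full throughput table.
    """
    results = {}
    for bs in batch_sizes:
        start = time.perf_counter()
        for i in range(0, len(texts), bs):
            embed_batch(texts[i:i + bs])
        elapsed = time.perf_counter() - start
        results[bs] = len(texts) / elapsed if elapsed else float("inf")
    # Highest throughput wins.
    return max(results, key=results.get), results
```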
To add new benchmarks:
- Add a new function in benchmark.py that follows the same pattern
- Update the runner script to call your new benchmark function
- Ensure proper memory and time tracking using the provided utilities
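The steps above might look like this in practice; the function body, the `BENCHMARKS` dispatch table, and all names here are hypothetical, since the real layout of scripts/run_benchmarks.py may differ:

```python
import time


def benchmark_my_operation() -> dict:
    """New benchmark following the same time-tracking pattern."""
    start = time.perf_counter()
    items = sum(1 for _ in range(10_000))   # placeholder for the real operation
    return {"items": items, "duration_s": time.perf_counter() - start}


# Registering the function makes it reachable from --benchmark my_operation.
BENCHMARKS = {
    "my_operation": benchmark_my_operation,
}


def run(name: str) -> dict:
    return BENCHMARKS[name]()
```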