feat: add Valkey vector store handler #12459

Open: daric93 wants to merge 9 commits into mindsdb:main from daric93:feat/valkey-vector-store


@daric93 daric93 commented Jun 2, 2026

Description

Closes: #12457

Adds a new Valkey Vector Store Handler to MindsDB, enabling Valkey (with Search module) as a vector database backend. Uses the valkey-glide async client with GLIDE's Batch pipeline API for efficient bulk operations, wrapped in synchronous methods to implement the VectorStoreHandler interface.
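The sync-over-async wrapping described above can be sketched roughly as follows. This is a minimal illustration, not the handler's actual code: `run_sync` and `fake_ping` are hypothetical names, and the only assumption about the real implementation is that it runs coroutines to completion from synchronous methods and avoids calling `asyncio.run` inside an already running loop.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Hypothetical helper for illustration; the PR's own _run() may differ.
_executor = ThreadPoolExecutor(max_workers=1)


def run_sync(coro):
    """Run a coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No event loop is running in this thread, so asyncio.run is safe.
        return asyncio.run(coro)
    # A loop is already running in this thread; asyncio.run would raise
    # RuntimeError here, so run the coroutine on a worker thread instead.
    return _executor.submit(asyncio.run, coro).result()


async def fake_ping():
    # Stand-in for an awaitable client call.
    await asyncio.sleep(0)
    return "PONG"


print(run_sync(fake_ping()))
```

Each call in this sketch gets a fresh event loop, so a real client that is bound to the loop it was created on would need extra care (for example, a single long-lived loop on a dedicated thread).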

Fixes #AEA-484

What's included

  • Full CRUD operations (create_table, insert, select, delete, drop_table)
  • KNN vector similarity search via Valkey Search (FT.CREATE / FT.SEARCH)
  • Filter expressions for metadata-based queries
  • Configurable vector dimensions, distance metric (COSINE/L2/IP), and index algorithm (FLAT/HNSW)
  • TLS support and configurable request timeout
  • GLIDE Batch (pipeline) for efficient batched reads
  • Handler registration with MindsDB plugin system
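As a rough illustration of the KNN search path listed above, the sketch below assembles the argument list for an `FT.SEARCH` KNN query. It assumes the vector field is named `embeddings` and that vectors are stored as little-endian float32 blobs; `pack_vector` and `build_knn_search` are hypothetical helpers, not functions from the handler.

```python
import struct


def pack_vector(values):
    """Serialize a list of floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(values)}f", *values)


def build_knn_search(index, vector, k=10, field="embeddings", filter_expr="*"):
    """Build FT.SEARCH arguments for a KNN query (field name is an assumption)."""
    # "<filter>=>[KNN k @field $vec]" selects the k nearest neighbours among
    # the documents that match the filter expression ("*" matches everything).
    query = f"{filter_expr}=>[KNN {k} @{field} $vec]"
    return [
        "FT.SEARCH", index, query,
        "PARAMS", "2", "vec", pack_vector(vector),
        "DIALECT", "2",
    ]


cmd = build_knn_search("my_embeddings", [0.1, 0.2, 0.3], k=5)
print(cmd[2])
```

Passing the vector through `PARAMS` keeps the binary blob out of the query string, which avoids having to escape it.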

Files Added (9 files, +2,001 lines)

| File | Purpose |
|---|---|
| valkey_handler.py | Core handler implementation |
| connection_args.py | Connection parameter definitions |
| __init__.py | Plugin registration |
| __about__.py | Package metadata |
| requirements.txt | Dependencies (valkey-glide, numpy) |
| icon.svg | Handler icon |
| README.md | Usage documentation with examples |
| tests/test_valkey_handler.py | Unit + integration tests |
| tests/__init__.py | Test package marker |

Type of change

  • ⚡ New feature (non-breaking change which adds functionality)

Verification Process

  • Test Location: mindsdb/integrations/handlers/valkey_handler/tests/
  • Verification Steps:
  1. Start Valkey with Search module:
    docker run -d --name valkey-test -p 6379:6379 valkey/valkey-bundle:9.1
  2. Run unit tests (no external deps):
    python3 -m pytest mindsdb/integrations/handlers/valkey_handler/tests/ -v -k "Unit"
  3. Run integration tests:
    VALKEY_HOST=localhost VALKEY_PORT=6379 python3 -m pytest mindsdb/integrations/handlers/valkey_handler/tests/ -v -k "Integration"

Test Results

  • Unit Tests: 30 passed ✅
  • Integration Tests: 15 passed ✅ (against valkey/valkey-bundle:9.1, search module v1.2.0)
  • Linting: ruff check + ruff format --check — clean

Configuration

SQL Usage

CREATE DATABASE valkey_store
WITH ENGINE = 'valkey',
PARAMETERS = {
  "host": "localhost",
  "port": 6379,
  "index_algorithm": "HNSW",
  "distance_metric": "COSINE",
  "vector_dimension": 384
};

CREATE TABLE valkey_store.my_embeddings (
  SELECT embeddings, content, metadata FROM model_output
);

Connection Parameters

| Parameter | Default | Description |
|---|---|---|
| host | localhost | Valkey server hostname |
| port | 6379 | Valkey server port |
| password | None | Authentication password |
| db | 0 | Database number |
| vector_dimension | 384 | Vector dimension for indexes |
| distance_metric | COSINE | COSINE, L2, or IP |
| index_algorithm | HNSW | HNSW or FLAT |
| prefix | doc: | Key prefix for document hashes |
| use_tls | False | Enable TLS/SSL |
| request_timeout | 5000 | Request timeout in ms |
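One way to picture these parameters and their defaults is as a small validated config object. The sketch below is illustrative only: `ValkeyConfig` is a hypothetical class, and the validation rules are assumptions drawn from the allowed values in the table, not the handler's actual checks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ValkeyConfig:
    """Hypothetical container mirroring the documented defaults."""
    host: str = "localhost"
    port: int = 6379
    password: Optional[str] = None
    db: int = 0
    vector_dimension: int = 384
    distance_metric: str = "COSINE"
    index_algorithm: str = "HNSW"
    prefix: str = "doc:"
    use_tls: bool = False
    request_timeout: int = 5000  # milliseconds

    def __post_init__(self):
        # Assumed validation based on the values listed in the table.
        if self.distance_metric not in {"COSINE", "L2", "IP"}:
            raise ValueError(f"unsupported distance_metric: {self.distance_metric}")
        if self.index_algorithm not in {"HNSW", "FLAT"}:
            raise ValueError(f"unsupported index_algorithm: {self.index_algorithm}")
        if self.vector_dimension <= 0:
            raise ValueError("vector_dimension must be positive")


config = ValkeyConfig(host="localhost", index_algorithm="FLAT")
print(config.index_algorithm, config.request_timeout)
```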

Checklist

  • My code follows the style guidelines (PEP 8) of MindsDB.
  • I have appropriately commented on my code, especially in complex areas.
  • Necessary documentation updates are either made or tracked in issues.
  • Relevant unit and integration tests are updated or added.

Related

  • No existing files modified — purely additive change

daric93 added 9 commits May 27, 2026 11:04
Add Valkey as a vector store handler using valkey-glide client.
Implements VectorStoreHandler interface with full CRUD + KNN search.
Includes 15 unit tests and 15 integration tests.

Ref: AEA-484
Signed-off-by: Daria Korenieva <daric2612@gmail.com>
- Set author to contributor name (matches chromadb_handler pattern)
- Replace placeholder icon with official Valkey SVG from dashboard-icons
- Add client_name='mindsdb_valkey_handler' to GlideClientConfiguration
  for production observability (CLIENT LIST, monitoring dashboards)

Signed-off-by: Daria Korenieva <daric2612@gmail.com>
…verage
- Add numpy to requirements.txt (missing dependency)
- Fix metadata filter expression syntax for TextField search
- Fix NOT_EQUAL/NOT_IN silent data loss in ID-only lookup (Case B)
- Refactor insert to use single async coroutine (reduce N+1 round-trips)
- Add error logging on connect failure, debug logging on disconnect error
- Document SCAN pagination non-determinism caveat
- Update copyright year to 2026
- Fix test_large_vectors cleanup on skip path
- Add 16 new unit tests covering all fixed code paths

Signed-off-by: Daria Korenieva <daric2612@gmail.com>
…ce, and style
- Fix query injection in metadata filters: add _escape_phrase() for full FT.SEARCH special char escaping in phrase queries
- Fix _escape_tag: add | (TAG union operator) to escaped characters
- Fix _run(): detect running event loop and offload to thread to avoid RuntimeError in async contexts
- Batch inserts with asyncio.gather (batches of 100) instead of sequential awaits
- Add _MAX_SCAN_ITERATIONS safety limit to drop_table and _scan_all_docs
- _scan_all_docs: stop collecting keys early once offset+limit reached
- Pin numpy>=1.21.0,<3 in requirements.txt
- Update README docker image from 9.1.0-rc1 to stable 9.1
- Convert f-string logger calls to lazy %s formatting
- Use built-in generics (list[], | None) instead of typing.List/Optional
- Replace time.sleep() in integration tests with _wait_for_indexing poll helper
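The TAG escaping fix mentioned in this commit could look something like the sketch below. The exact character set the handler escapes is not shown in this conversation, so `_TAG_SPECIAL` is an assumption (common query punctuation, whitespace, and the `|` union operator), and `escape_tag` / `tag_filter` are hypothetical stand-ins for the PR's `_escape_tag`.

```python
# Assumed set of characters with special meaning inside a TAG filter;
# the handler's real list may be longer or shorter.
_TAG_SPECIAL = set(",.<>{}[]\"':;!@#$%^&*()-+=~| \\")


def escape_tag(value: str) -> str:
    """Backslash-escape characters that would otherwise alter the query."""
    return "".join("\\" + ch if ch in _TAG_SPECIAL else ch for ch in value)


def tag_filter(field: str, value: str) -> str:
    """Build an exact-match TAG filter such as @source:{web}."""
    return f"@{field}:{{{escape_tag(value)}}}"


# Without escaping, "a|b" would be read as "a OR b" rather than one literal tag.
print(tag_filter("source", "a|b"))
```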

Signed-off-by: Daria Korenieva <daric2612@gmail.com>
…LS, timeout
- insert now raises Exception on partial failures (prevents silent data loss)
- delete handles NOT_EQUAL/NOT_IN on ID via FT.SEARCH negation filter
- Add request_timeout config (default 5000ms, Glide default was 250ms)
- Add use_tls config for AWS ElastiCache/MemoryDB deployments
- drop_table always cleans up orphaned hash keys even if index not found
- check_connection calls disconnect() on failure (prevents client leak)
- Extract magic number 10000 to _DELETE_SEARCH_LIMIT constant
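The "raise on partial failure" behaviour for insert can be illustrated with a small check over per-item results. This sketch assumes that failed writes surface as exception objects in the result list, which is an assumption about the handler rather than something confirmed in this conversation; `raise_on_partial_failure` is a hypothetical name.

```python
def raise_on_partial_failure(results):
    """Raise if any per-item result in a batch is an exception object."""
    failures = [(i, r) for i, r in enumerate(results) if isinstance(r, Exception)]
    if failures:
        first_index, first_error = failures[0]
        raise Exception(
            f"{len(failures)} of {len(results)} inserts failed; "
            f"first failure at index {first_index}: {first_error}"
        )


# All writes succeeded: nothing is raised.
raise_on_partial_failure([2, 2, 2])

# One write failed: the caller sees an error instead of silently losing a row.
try:
    raise_on_partial_failure([2, TimeoutError("request timed out"), 2])
except Exception as exc:
    print(exc)
```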

Signed-off-by: Daria Korenieva <daric2612@gmail.com>
…g, cache executor

Signed-off-by: Daria Korenieva <daric2612@gmail.com>
…gorithm
- Replace asyncio.gather with GLIDE Batch (pipeline) for batched HGETALL in select Case B and _scan_all_docs (single network round-trip per chunk)
- Make vector index algorithm configurable (FLAT/HNSW) via index_algorithm connection parameter, defaulting to HNSW
- Add VectorFieldAttributesFlat support in create_table
- Update unit test to mock Batch-based exec instead of individual hgetall
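The "one round-trip per chunk" idea can be sketched without any client library by treating the pipeline as a callable. In the sketch below, `exec_batch` is a hypothetical stand-in for whatever executes a batch of HGETALL commands, and the chunk size of 100 is an assumption borrowed from the insert batching mentioned in an earlier commit.

```python
from typing import Callable, Iterator, Sequence


def chunked(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_all(keys: Sequence[str],
              exec_batch: Callable[[Sequence[str]], list],
              chunk_size: int = 100) -> list:
    """Fetch documents for all keys, calling exec_batch once per chunk."""
    results = []
    for chunk in chunked(keys, chunk_size):
        # One call here corresponds to one pipelined network round-trip.
        results.extend(exec_batch(chunk))
    return results


calls = []

def fake_exec_batch(chunk):
    # Records the chunk size and returns one fake document per key.
    calls.append(len(chunk))
    return [{"id": key} for key in chunk]

docs = fetch_all([f"doc:{i}" for i in range(250)], fake_exec_batch)
print(len(docs), calls)
```

With 250 keys and a chunk size of 100, the fake executor is called three times instead of 250, which is the saving the pipeline approach is after.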

Signed-off-by: Daria Korenieva <daric2612@gmail.com>

github-actions Bot commented Jun 2, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@daric93 daric93 changed the title from "Feat/valkey vector store" to "feat: add Valkey vector store handler" Jun 2, 2026

daric93 commented Jun 2, 2026

I have read the CLA Document and I hereby sign the CLA

github-actions Bot added a commit that referenced this pull request Jun 2, 2026

daric93 commented Jun 2, 2026

recheck

@egoriklok

Public no-secret MCP Buyer-Agent Readiness Snapshot for mindsdb:

  • Public signal: Public MCP candidate mindsdb: Query Engine for AI - The only MCP Server you'll ever need. Matched R1 terms: agent, analytics, ci, database, github, mcp, postgres, server.
  • R1 fit: agent, analytics, ci, database, github, mcp, postgres, server.
  • Readiness status: public evidence review needed before an autonomous buyer-agent should rely on this surface.
  • Blind spot 1: explicit auth scopes and delegated-permission boundary.
  • Blind spot 2: spend/API cost cap plus approval semantics before paid actions.
  • Blind spot 3: receipt, audit-log, revocation, or dispute evidence for safe buyer-agent use.
  • Single next question: For mindsdb, is there already a documented policy for agent spend/auth limits, receipt evidence, and revocation before a buyer-agent can invoke it?

No secrets, invoice, payment link, delivery link, private endpoint, paid call, or wallet signature; this is only a free public snapshot.



Development

Successfully merging this pull request may close these issues.

[Integration]: Valkey Vector Store

2 participants