feat: add Valkey vector store handler #12459
Open
daric93 wants to merge 9 commits into
Add Valkey as a vector store handler using valkey-glide client. Implements VectorStoreHandler interface with full CRUD + KNN search. Includes 15 unit tests and 15 integration tests.
Ref: AEA-484
Signed-off-by: Daria Korenieva <daric2612@gmail.com>

- Set author to contributor name (matches chromadb_handler pattern)
- Replace placeholder icon with official Valkey SVG from dashboard-icons
- Add client_name='mindsdb_valkey_handler' to GlideClientConfiguration for production observability (CLIENT LIST, monitoring dashboards)
Signed-off-by: Daria Korenieva <daric2612@gmail.com>

…verage
- Add numpy to requirements.txt (missing dependency)
- Fix metadata filter expression syntax for TextField search
- Fix NOT_EQUAL/NOT_IN silent data loss in ID-only lookup (Case B)
- Refactor insert to use single async coroutine (reduce N+1 round-trips)
- Add error logging on connect failure, debug logging on disconnect error
- Document SCAN pagination non-determinism caveat
- Update copyright year to 2026
- Fix test_large_vectors cleanup on skip path
- Add 16 new unit tests covering all fixed code paths
Signed-off-by: Daria Korenieva <daric2612@gmail.com>

…ce, and style
- Fix query injection in metadata filters: add _escape_phrase() for full FT.SEARCH special char escaping in phrase queries
- Fix _escape_tag: add | (TAG union operator) to escaped characters
- Fix _run(): detect running event loop and offload to thread to avoid RuntimeError in async contexts
- Batch inserts with asyncio.gather (batches of 100) instead of sequential awaits
- Add _MAX_SCAN_ITERATIONS safety limit to drop_table and _scan_all_docs
- _scan_all_docs: stop collecting keys early once offset+limit reached
- Pin numpy>=1.21.0,<3 in requirements.txt
- Update README docker image from 9.1.0-rc1 to stable 9.1
- Convert f-string logger calls to lazy %s formatting
- Use built-in generics (list[], | None) instead of typing.List/Optional
- Replace time.sleep() in integration tests with _wait_for_indexing poll helper
Signed-off-by: Daria Korenieva <daric2612@gmail.com>

…LS, timeout
- insert now raises Exception on partial failures (prevents silent data loss)
- delete handles NOT_EQUAL/NOT_IN on ID via FT.SEARCH negation filter
- Add request_timeout config (default 5000ms, Glide default was 250ms)
- Add use_tls config for AWS ElastiCache/MemoryDB deployments
- drop_table always cleans up orphaned hash keys even if index not found
- check_connection calls disconnect() on failure (prevents client leak)
- Extract magic number 10000 to _DELETE_SEARCH_LIMIT constant
Signed-off-by: Daria Korenieva <daric2612@gmail.com>

…g, cache executor
Signed-off-by: Daria Korenieva <daric2612@gmail.com>

…, extract helper, dedupe tilde

…gorithm
- Replace asyncio.gather with GLIDE Batch (pipeline) for batched HGETALL in select Case B and _scan_all_docs (a single network round-trip per chunk)
- Make vector index algorithm configurable (FLAT/HNSW) via index_algorithm connection parameter, defaulting to HNSW
- Add VectorFieldAttributesFlat support in create_table
- Update unit test to mock Batch-based exec instead of individual hgetall
Signed-off-by: Daria Korenieva <daric2612@gmail.com>
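The commit notes above mention inserting in batches of 100 with asyncio.gather and raising when only part of an insert succeeds. The following is a minimal sketch of that pattern, not the handler's actual code: `chunked`, `insert_in_batches`, and the `write_one` callback are hypothetical names, and `write_one` stands in for whatever coroutine writes a single row to Valkey.

```python
import asyncio
from collections.abc import Iterator, Sequence

BATCH_SIZE = 100  # batch size quoted in the commit notes


def chunked(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


async def insert_in_batches(rows: Sequence, write_one, batch_size: int = BATCH_SIZE) -> int:
    """Write rows batch by batch and raise if any single write failed.

    `write_one` is a hypothetical async callable that stores one row.
    """
    failures = []
    written = 0
    for batch in chunked(rows, batch_size):
        # Run the writes of one batch concurrently; collect exceptions
        # instead of aborting on the first one.
        results = await asyncio.gather(
            *(write_one(row) for row in batch), return_exceptions=True
        )
        for row, result in zip(batch, results):
            if isinstance(result, Exception):
                failures.append((row, result))
            else:
                written += 1
    if failures:
        # Surface partial failures rather than silently dropping rows.
        raise RuntimeError(
            f"{len(failures)} of {len(rows)} inserts failed; "
            f"first error: {failures[0][1]!r}"
        )
    return written
```

Collecting results with `return_exceptions=True` lets every write in a batch finish before the error is reported, so the caller learns how many rows failed instead of seeing only the first exception.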
All contributors have signed the CLA ✍️ ✅

Author: I have read the CLA Document and I hereby sign the CLA

Author: recheck
Public no-secret MCP Buyer-Agent Readiness Snapshot for mindsdb:
No secrets, invoice, payment link, delivery link, private endpoint, paid call, or wallet signature; this is only a free public snapshot.
Description
Closes: #12457
Adds a new Valkey Vector Store Handler to MindsDB, enabling Valkey (with Search module) as a vector database backend. Uses the `valkey-glide` async client with GLIDE's `Batch` pipeline API for efficient bulk operations, wrapped in synchronous methods to implement the `VectorStoreHandler` interface.

Fixes #AEA-484
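Wrapping an async client in synchronous methods, as described above, usually needs a small runner that copes with being called from inside an already running event loop (the commit notes mention this for `_run()`). A minimal sketch under stated assumptions follows: `run_sync` and `_executor` are hypothetical names, and the handler's real `_run()` may differ.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Hypothetical module-level executor, reused across calls.
_executor = ThreadPoolExecutor(max_workers=1)


def run_sync(coro):
    """Run a coroutine to completion from synchronous code.

    If no event loop is running in this thread, run the coroutine directly.
    If one is running, asyncio.run() would raise RuntimeError, so the
    coroutine is executed on a fresh loop in a worker thread instead.
    """
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop in this thread: safe to run here.
        return asyncio.run(coro)
    # A loop is already running: offload and block until the result is ready.
    return _executor.submit(asyncio.run, coro).result()
```

One caveat for this sketch: `asyncio.run` creates a new loop on every call, while an async client object is normally tied to the loop it was created on, so a long-lived client would more likely need one persistent loop (for example in a dedicated thread) than a fresh loop per call.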
What's included
- `FT.CREATE`/`FT.SEARCH`)
- `Batch` (pipeline) for efficient batched reads

Files Added (9 files, +2,001 lines)
- `valkey_handler.py`
- `connection_args.py`
- `__init__.py`
- `__about__.py`
- `requirements.txt`
- `icon.svg`
- `README.md`
- `tests/test_valkey_handler.py`
- `tests/__init__.py`

Type of change
Verification Process
- `mindsdb/integrations/handlers/valkey_handler/tests/`
- `python3 -m pytest mindsdb/integrations/handlers/valkey_handler/tests/ -v -k "Unit"`
- `VALKEY_HOST=localhost VALKEY_PORT=6379 python3 -m pytest mindsdb/integrations/handlers/valkey_handler/tests/ -v -k "Integration"`

Test Results
- `valkey/valkey-bundle:9.1`, search module v1.2.0)
- `ruff check` + `ruff format --check`: clean

Configuration
SQL Usage
Connection Parameters
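| Parameter | Default |
|---|---|
| `host` | `localhost` |
| `port` | `6379` |
| `password` | `None` |
| `db` | `0` |
| `vector_dimension` | `384` |
| `distance_metric` | `COSINE` |
| `index_algorithm` | `HNSW` |
| `prefix` | `doc:` |
| `use_tls` | `False` |
| `request_timeout` | `5000` |

Checklist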
Related