feat(vectorstores): add Valkey vector store integration#1
feat(vectorstores): add Valkey vector store integration#1rileydes-improving wants to merge 4 commits into
Conversation
rileydes-improving
commented
Jun 2, 2026
- Add ValkeyApi credential for host/port/auth configuration
- Add ValkeyUrlApi credential for connection URL-based setup
- Implement ValkeyVectorStore node with full CRUD operations
- Support vector similarity search with metadata filtering
- Add comprehensive integration tests with Valkey 9.1+ and valkey-search v1.2+ module
- Add unit tests covering core functionality and edge cases
- Include Valkey icon asset for UI representation
- Update package.json with valkey-glide dependency
- Add ValkeyApi credential for host/port/auth configuration - Add ValkeyUrlApi credential for connection URL-based setup - Implement ValkeyVectorStore node with full CRUD operations - Support vector similarity search with metadata filtering - Add comprehensive integration tests with Valkey 9.1+ and valkey-search v1.2+ module - Add unit tests covering core functionality and edge cases - Include Valkey icon asset for UI representation - Update package.json with valkey-glide dependency Signed-off-by: Riley Des <riley.desserre@improving.com>
MatthiasHowellYopp
left a comment
There was a problem hiding this comment.
This is a solid, well-structured integration. The test coverage — both unit and integration — is thorough and deterministic. A few issues worth addressing before merge:
Valkey.ts
[warning] createGlideClient uses any for clientConfig (~line 490)
Defeats strict-mode safety and masks type errors. The Glide client API has a typed config interface — use it.
import type { GlideClientConfiguration } from '@valkey/valkey-glide'
const clientConfig: GlideClientConfiguration = { ... }[warning] Connection leak in init() (~line 590)
The persistent GlideClient created in init() has no teardown path. The comment acknowledges this matches the Postgres/TypeORM pattern, but one TCP connection per chatflow instance accumulates indefinitely in long-running servers. At minimum, document a maximum pool size or track a Flowise lifecycle teardown hook issue.
[warning] JSON.parse(valkeyMetadataFilter) without schema validation (~line 578)
A non-array value (e.g. "tag1") would silently produce a string filter instead of the expected array.
const parsed = JSON.parse(valkeyMetadataFilter)
if (!Array.isArray(parsed) && typeof parsed !== 'string') {
throw new Error('valkeyMetadataFilter must be a JSON array of strings or a string')
}
filter = parsed[nit] //@ts-ignore on vectorStoreMethods
Suppresses a real type mismatch. Use the correct IVectorStoreMethods interface or an explicit inline type instead.
[nit] Unused import: Embeddings alongside EmbeddingsInterface (lines 6–7)
Embeddings is only used as a type annotation where EmbeddingsInterface would be more correct and consistent.
ValkeyApi.credential.ts
[nit] default: '6379' is a string but type is 'number' (line 28)
default: 6379Valkey.int.test.ts
[warning] deleteAll test uses scan without cursor loop (~line 80)
Uses client.scan('0', ...) once without looping on the cursor — may miss keys on larger datasets. The TTL test already does this correctly; the deleteAll test should match.
[nit] if (skip()) return makes skipped tests appear green
Intentional per the file header, but test.skip / xit would mark them as skipped in the report instead of passing, giving clearer CI signal.
[nit] catch (err: any) in checkValkeyAvailability
Prefer narrowing over any:
catch (err) {
const msg = err instanceof Error ? err.message : String(err)
}package.json
[nit] Dependency may be in the wrong section
@valkey/valkey-glide appears to be added under devDependencies. If the Valkey node is shipped as part of the components package, it should be a regular dependency.
Verdict
The core implementation is correct and well-tested. The connection-leak concern in init() and the missing filter validation are the most meaningful issues. Everything else is polish. Approve with changes — address the connection-leak documentation and the any typing in createGlideClient before merge.
- Replace `any` in createGlideClient with GlideClientConfiguration type - Add validation for valkeyMetadataFilter after JSON.parse - Pass sortby/dialect/limit from buildQuery to GlideFt.search - Flatten array metadata values into TAG field (fixes silent drop) - Replace @ts-ignore with @ts-expect-error + explanation - Replace lodash flatten with native .flat() - Use EmbeddingsInterface instead of Embeddings class for type assertions - Import GlideClientConfiguration from @valkey/valkey-glide
- Fix port default from string '6379' to number 6379 in credential - Use cursor loop in deleteAll test scan verification - Narrow catch (err: any) to proper instanceof check
Review feedback addressed@edlng @MatthiasHowellYopp — pushed two commits addressing 10 of 13 findings. All unit tests pass (65/65), lint is clean. What changed
Pushbacks (3 findings)Connection leak in
Suggestions for tightening AI-assisted reviewsA few calibration notes that might help catch false positives before posting:
These are calibration points, not complaints. The TAG field array bug and the missing filter validation were real production-data-loss risks, and catching those is exactly what makes a review valuable. More of that, less of the easily-verifiable nits. |
Jonathan-Improving
left a comment
There was a problem hiding this comment.
Valkey Vector Store Node Review
Verdict: COMMENT — Solid implementation that meets AEA-488 acceptance criteria. GLIDE API usage is mostly correct, test coverage is comprehensive, and the Flowise node pattern is followed well. A few items to address before upstream submission.
Summary
- ✅ All AEA-488 acceptance criteria met
- ✅ Comprehensive unit tests (798 lines) and integration tests (494 lines)
- ✅ Correct GlideFt.create/search/info/dropindex usage
- ✅ Proper TAG value escaping in FT.SEARCH queries
- ✅ Connection lifecycle (try/finally/close) in upsert/delete paths
- ✅ Concurrent index creation handled gracefully
⚠️ 2 must-fix: NaN propagation in topK, potential query injection via field names⚠️ 3 suggestions: missing Decoder for binary data, dead createIndexOptions field, inconsistent retry strategy
Findings
See inline comments for details.
- Guard against NaN propagation when topK is non-numeric string - Validate contentKey/metadataKey/vectorKey against /^[a-zA-Z_][a-zA-Z0-9_]*$/ to prevent FT.SEARCH query injection via crafted field names - Remove unused createIndexOptions (interface, field, constructor assignment) - Document that retryStrategy is ClusterBatchOptions-only per valkey-glide API Signed-off-by: Riley Des <riley.desserre@improving.com>
Jonathan-Improving
left a comment
There was a problem hiding this comment.
Re-Review (Round 2): Valkey Vector Store Integration
Verdict: COMMENT — 1 finding (1M). All 8 round-1 findings correctly resolved. GLIDE TypeScript API usage verified correct.
Round 1 Resolution Summary
All 8 findings addressed:
- ✅ TAG field dropping arrays → flatMap handles arrays
- ✅ SORTBY/LIMIT/DIALECT not passed → now passed explicitly
- ✅ NaN propagation on topK → isNaN guard added
- ✅ Field name query injection → regex validation in constructor
- ✅ Dead createIndexOptions code → removed
- ✅ Decoder option → confirmed not needed (returnFields excludes binary)
- ✅ Retry strategy asymmetry → documented (API limitation)
- ✅ lodash flatten → native .flat()
GLIDE API Verification
All TypeScript GLIDE calls verified correct:
- GlideFt.search/create/dropindex/info ✅
- Batch/ClusterBatch constructor + hset + expire + exec ✅
- GlideClient.createClient config (addresses, credentials, useTLS) ✅
- ClusterScanCursor scan loop ✅
- FtSearchOptions structure (params, returnFields, sortby, limit, dialect) ✅
- Field schema (TAG with separator, VECTOR with attributes) ✅
New Finding (Round 2)
See inline comment below.
| [`${this.metadataKey}_tags`]: Object.values(metadata) | ||
| .flatMap((v) => | ||
| Array.isArray(v) | ||
| ? v.filter((x) => x != null && typeof x !== 'object').map(String) | ||
| : v != null && typeof v !== 'object' | ||
| ? [String(v)] | ||
| : [] | ||
| ) | ||
| .join(',') |
There was a problem hiding this comment.
[Suggestion — Medium] TAG field comma-separator collision with metadata values containing commas
Metadata values are joined with , (the TAG separator configured at line 138), but individual values that contain commas are NOT escaped before joining. A metadata value like "San Francisco, CA" produces two tags (San Francisco and CA) instead of one.
Impact: Metadata filtering with TAG-based queries will return incorrect results when metadata values contain commas — a common occurrence in real data (locations, multi-word categories, CSV-like fields).
Suggested fix: Escape commas in individual tag values before joining:
.flatMap((v) =>
Array.isArray(v)
? v.filter((x) => x != null && typeof x !== 'object').map(String)
: v != null && typeof v !== 'object' ? [String(v)] : []
)
.map((t) => t.replace(/,/g, '\\\\,')) // escape commas within values
.join(',')Or use a different TAG separator that is less likely to appear in natural metadata (e.g., | or \x1f).