engrava's search_hybrid() combines up to five scoring signals
into a single ranked result list.
| # | Signal | Weight key | Default | Source |
|---|---|---|---|---|
| 1 | FTS5 keyword | `default_fts_weight` | 0.30 | BM25 full-text score (min-max normalized) |
| 2 | Vector similarity | `default_vector_weight` | 0.55 | Cosine similarity from embedding search |
| 3 | Recency | `default_recency_weight` | 0.10 | Exponential decay based on `current_cycle` |
| 4 | Priority | `default_priority_weight` | 0.05 | Boost multiplier per priority level (P1–P4) |
| 5 | Graph | `default_graph_weight` | 0.00 | 1-hop-weighted neighbour boost (opt-in) |

Default weights sum to 1.0. When a signal is unavailable (e.g. no
current_cycle → recency skipped, no embeddings → vector skipped),
its weight is redistributed proportionally across active signals.

- FTS5 unavailable or empty query → FTS skipped.
- `query_vector` is `None` and no embedding provider → vector skipped.
- `current_cycle` is `None` → recency skipped.
- `priority_weight` is `0.0` → priority skipped.
- `graph_weight` is `0.0` → graph skipped, zero overhead.
- All signals disabled → fallback to `list_thoughts(LIMIT top_k)`.

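The proportional redistribution described above can be pictured as a renormalisation of the weights that remain active. The following is a minimal sketch under that assumption, not engrava's actual code; `redistribute_weights` and the short signal names are hypothetical and used only for illustration.

```python
# Default weights from the table above, keyed by short (hypothetical) signal names.
DEFAULT_WEIGHTS = {
    "fts": 0.30,
    "vector": 0.55,
    "recency": 0.10,
    "priority": 0.05,
    "graph": 0.00,
}


def redistribute_weights(weights, active):
    """Rescale the weights of the active signals so they sum to 1.0.

    Signals that are inactive or have a zero weight are dropped. An empty
    result means every signal is disabled and the caller should fall back
    to a plain listing.
    """
    kept = {name: w for name, w in weights.items() if name in active and w > 0.0}
    total = sum(kept.values())
    if total == 0.0:
        return {}
    return {name: w / total for name, w in kept.items()}


# No current_cycle -> recency is skipped; the remaining 0.90 is rescaled to 1.0.
weights = redistribute_weights(DEFAULT_WEIGHTS, active={"fts", "vector", "priority"})
for name, value in sorted(weights.items()):
    print(f"{name}: {value:.4f}")
```

Under this reading, each surviving signal keeps its share relative to the others: FTS and vector stay in the same 0.30 : 0.55 ratio they had before recency dropped out.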
The keyword signal — and the search_fts() method and the MCP search_keywords
tool that expose it directly — runs your text against an SQLite FTS5 index. engrava
normalises the query before handing it to FTS5, with two modes that switch
automatically on what you type:
Bare queries are matched with OR. A plain natural-language query like
what was my sister doing is treated as a bag of words joined with OR, so a
document matches when it shares any word. BM25's IDF weighting then ranks the
documents that share the most distinctive words first, so common function words
(what, was, my) carry little weight and need no stopword list or stemmer —
this works in any language. (Before this, the words were joined with FTS5's
implicit AND, so a question only matched documents containing every word and
relevant answers were missed.)
```python
# Bare query -> OR-matched: finds docs sharing any content word, best-ranked first
hits = await store.search_fts("what was my sister doing", top_k=10)
```

Expert syntax is preserved unchanged. If your query uses FTS5 operators, it is passed through as written:

- quoted phrases: `"machine learning"` matches the exact phrase;
- uppercase booleans: `AND`, `OR`, `NOT` (must be uppercase) compose terms, e.g. `python AND NOT snake`;
- prefix: a trailing `*` does prefix matching, e.g. `neur*`;
- column filters: `essence:` and `content:` restrict a term to that column, e.g. `content:berlin`.

Punctuation never raises. Unsafe characters split a token into separate terms
rather than breaking the query: a contraction like sister's becomes sister OR s, so it still matches a stored sister's dog. Pasting a URL or a timestamp is
safe too — only the real essence: / content: column filters are honoured, so
http://example.com and 12:30 are treated as ordinary search terms (they do
not become spurious http: / 12: column filters). A genuinely malformed
full-text expression is logged and degraded to zero FTS hits, so the rest of a
hybrid search still returns results.
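
To make the two modes concrete, here is a simplified sketch of how such a normaliser could behave. It is an illustration only and not engrava's implementation: `normalise_query` and both regular expressions are hypothetical, and the sketch decides per query rather than per token, so it is coarser than the behaviour described above.

```python
import re

# Anything that looks like deliberate FTS5 syntax: a double quote, a prefix
# star, an uppercase boolean, or one of the two real column filters.
_EXPERT = re.compile(r'"|\*|\b(?:AND|OR|NOT)\b|\b(?:essence|content):')

# Runs of word characters; punctuation such as ' : / . acts as a separator.
_WORD = re.compile(r"\w+")


def normalise_query(query: str) -> str:
    """Return an FTS5 query string for user-supplied text (illustrative only)."""
    if _EXPERT.search(query):
        # Expert mode: pass the query through as written.
        return query
    # Bare mode: split on unsafe characters and join the terms with OR.
    return " OR ".join(_WORD.findall(query))


for raw in ("what was my sister doing", "sister's", "http://example.com", "12:30"):
    print(f"{raw!r} -> {normalise_query(raw)!r}")
```

In this sketch a URL or timestamp never reaches FTS5 as a column filter, because only the literal `essence:` and `content:` prefixes switch the query into pass-through mode.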
The graph signal uses 1-hop-weighted neighbour boost. If a candidate thought's graph neighbours also match the query, the candidate receives a boost proportional to the neighbour's semantic score and the connecting edge weight.

```
For each candidate C in the fusion pool:
    neighbours = get_edges(C, direction="BOTH", limit=max_neighbors)
                 ordered by edge.weight DESC (deterministic)
    For each (edge, neighbour):
        neighbour_base = max(fts_score[neighbour], vector_score[neighbour])
        boost[C] += edge.weight × neighbour_base × graph_edge_decay
    final_score[C] += graph_weight × boost[C]
```

Key properties:

- Only semantic scores propagate: priority, recency, and graph scores are excluded from `neighbour_base` to prevent hub-cascade effects.
- No new candidates: the graph signal re-ranks existing results, it does not add thoughts to the result set.
- Deterministic: neighbours are sorted by `edge.weight DESC` before the `max_neighbors` cap is applied.

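The boost computation can be sketched in plain Python as follows. This is an in-memory illustration of the pseudocode above, not engrava's batch SQL implementation; the `Edge` dataclass, the `graph_boost` function, and the tie-break on neighbour id are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    neighbour_id: str
    weight: float


def graph_boost(candidate_id, edges, fts_score, vector_score,
                *, graph_edge_decay=0.5, max_neighbors=5):
    """Sum the decayed, edge-weighted semantic scores of a candidate's neighbours."""
    # Sort by weight DESC before applying the cap; the id tie-break is an
    # assumption added here to keep the ordering fully deterministic.
    ranked = sorted(edges.get(candidate_id, []),
                    key=lambda e: (-e.weight, e.neighbour_id))
    boost = 0.0
    for edge in ranked[:max_neighbors]:
        # Only semantic scores propagate; neighbours outside the pool score 0.
        neighbour_base = max(fts_score.get(edge.neighbour_id, 0.0),
                             vector_score.get(edge.neighbour_id, 0.0))
        boost += edge.weight * neighbour_base * graph_edge_decay
    return boost


edges = {"a": [Edge("b", 0.8), Edge("c", 0.4)]}
fts_score = {"b": 0.5}
vector_score = {"b": 0.9, "c": 0.2}

boost = graph_boost("a", edges, fts_score, vector_score)
graph_weight = 0.1
print(f"boost={boost:.2f}, contribution={graph_weight * boost:.3f}")
```

Here candidate `a` gains 0.8 × 0.9 × 0.5 from neighbour `b` and 0.4 × 0.2 × 0.5 from neighbour `c`, and the sum is then scaled by `graph_weight` before being added to its final score.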

```yaml
search:
  default_graph_weight: 0.0        # opt-in (0.0 = disabled)
  graph_edge_decay: 0.5            # decay factor for 1-hop distance
  max_neighbors_per_candidate: 5   # safety cap
```

Per-query override:

```python
result = await store.search_hybrid(
    "python async",
    graph_weight=0.1,
    graph_edge_decay=0.3,
)
```

When `graph_weight=0.0` (default), no graph queries are executed and
there is zero performance impact. When active, the implementation
uses a single batch SQL query bounded by
O(top_k × max_neighbors_per_candidate) rows.
When the graph signal contributes to at least one candidate,
"graph" appears in HybridSearchResult.backends_used.
All weights can be overridden per call via keyword arguments:

```python
result = await store.search_hybrid(
    "quantum computing",
    query_vector=embedding,
    fts_weight=0.4,
    vector_weight=0.4,
    recency_weight=0.1,
    priority_weight=0.05,
    graph_weight=0.05,
    current_cycle=42,
)
```

See `configuration.md` for the full YAML reference
of SearchConfig fields.
After DreamingExtension.run_consolidation() runs its clustering phase,
ThoughtType.REFLECTION meta-thoughts exist in the store. Three knobs
control how hybrid search handles them.
When `include_reflections` is `False`, REFLECTION thoughts are excluded from
`search_hybrid()` results. This is useful when you want raw observations and
insights without higher-order aggregates:

```python
result = await store.search_hybrid(
    "machine learning",
    query_vector=embedding,
    include_reflections=False,
)
```

When REFLECTIONs are included, their final score is multiplied by
`reflection_boost`. The default `1.0` leaves REFLECTIONs competing on equal
footing; raise it above `1.0` for a modest upranking so high-level abstractions
surface for broad queries without dominating narrow ones.

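The effect of the multiplier on a ranked list can be illustrated with a small standalone sketch. It assumes the boost is applied to final scores and the list is then re-sorted; `apply_reflection_boost` and the thought ids are hypothetical and not part of engrava's API.

```python
def apply_reflection_boost(scored, reflection_ids, reflection_boost=1.0):
    """Multiply the score of each REFLECTION by the boost, then re-rank."""
    boosted = [
        (thought_id, score * reflection_boost if thought_id in reflection_ids else score)
        for thought_id, score in scored
    ]
    # Highest score first; sorted() is stable, so ties keep their input order.
    return sorted(boosted, key=lambda pair: pair[1], reverse=True)


scored = [("obs-1", 0.80), ("refl-1", 0.60), ("obs-2", 0.50)]
reflection_ids = {"refl-1"}

neutral = apply_reflection_boost(scored, reflection_ids, reflection_boost=1.0)
boosted = apply_reflection_boost(scored, reflection_ids, reflection_boost=1.5)
print([thought_id for thought_id, _ in neutral])
print([thought_id for thought_id, _ in boosted])
```

With a boost of 1.5 the reflection's score rises from 0.60 to roughly 0.90 and it overtakes `obs-1`; with the default 1.0 the order is unchanged.
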
```python
# Stronger boost: reflections rank near the top for broad queries
result = await store.search_hybrid(
    "patterns in memory",
    query_vector=embedding,
    reflection_boost=1.5,
)

# Disable boost: reflections compete on equal footing
result = await store.search_hybrid(
    "specific fact",
    query_vector=embedding,
    reflection_boost=1.0,
)
```

Configure the default in YAML:

```yaml
search:
  reflection_boost: 1.0  # applies when reflection_boost not overridden per-call
```

`search_reflections_only()` is a convenience helper that returns only REFLECTION thoughts, scored by cosine similarity to the query vector (plus an optional recency blend). It is designed for queries like "what themes exist in my memory?":

```python
result = await store.search_reflections_only(
    "recurring ideas about learning",
    query_vector=embedding,
    top_k=5,
    current_cycle=42,  # optional recency blend
)
for thought_id, score in result.results:
    ref = await store.get_thought(thought_id)
    print(ref.content)  # JSON with member_ids + keywords
```

Key difference from `search_hybrid(include_reflections=True)`:
search_reflections_only() fetches all REFLECTIONs directly from
the store (no pagination gap) and scores them purely by cosine similarity
to the query. It does not compete against regular thoughts for result slots.
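
The cosine-only ranking can be pictured with a short pure-Python sketch. It is an illustration under assumptions, not the library's code: `cosine` and `rank_reflections` are hypothetical helpers, the vectors are toy two-dimensional embeddings, and the optional recency blend is left out.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_reflections(query_vector, reflections, top_k=5):
    """Score every reflection against the query and keep the top_k best."""
    scored = [(rid, cosine(query_vector, vec)) for rid, vec in reflections.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


reflections = {"r1": [1.0, 0.0], "r2": [0.0, 1.0], "r3": [1.0, 1.0]}
top = rank_reflections([1.0, 0.0], reflections, top_k=2)
for reflection_id, score in top:
    print(f"{reflection_id}: {score:.4f}")
```

Because every reflection is scored before the `top_k` cut, no reflection is lost to an earlier candidate limit, which is the "no pagination gap" property described above.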