perf(fts): speed up sync writes and search query latency #1551
MrDirkelz wants to merge 8 commits into
johan-bell
left a comment
Do we really need to keep the DEV flag? Otherwise LGTM.
9ead26a to 2df9456
johan-bell
left a comment
We now call primaryKeys() before applying maxDocCount. For high-frequency trigrams, this loads a large ID list that is immediately discarded by if (ids.length > maxDocCount) continue;. Could we do a count() pre-pass and only fetch keys for usable trigrams to avoid extra IDB work and memory pressure?
const ids = (await db.docs
  .where("fts")
  .between(trigram + ":", trigram + ";", true, false)
  .primaryKeys()) as string[];
We fetch primaryKeys() for every trigram up front. For very common trigrams this can materialize a large key set before we even know it is usable.
}
const matchedDocIds = new Set<string>();
for (const { token, ids } of trigramResults) {
  if (ids.length > maxDocCount) continue;
The maxDocCount guard happens after key materialization, so oversized trigram matches are still fully loaded and then discarded.
usableTrigrams.push({ token, df: ids.length });
for (const id of ids) matchedDocIds.add(id);
Only usable trigrams should reach this accumulation path; consider a count() pre-pass (or capped scan) so we avoid loading IDs for trigrams that fail the threshold.
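A minimal sketch of the suggested `count()` pre-pass, written against a hypothetical `TrigramStore` interface instead of the real Dexie table so that it runs standalone. `TrigramStore`, `InMemoryStore`, and `collectCandidates` are illustrative names, not code from this PR; on the branch the two methods would presumably map onto `.count()` and `.primaryKeys()` over the same `between(...)` range.

```typescript
// Hypothetical abstraction over the per-trigram index range; not from the PR.
interface TrigramStore {
  count(trigram: string): Promise<number>;
  primaryKeys(trigram: string): Promise<string[]>;
}

// In-memory stand-in so the sketch runs without IndexedDB.
class InMemoryStore implements TrigramStore {
  keyFetches = 0; // how many times a full key list was materialised
  private readonly postings: Map<string, string[]>;
  constructor(postings: Map<string, string[]>) {
    this.postings = postings;
  }
  async count(trigram: string): Promise<number> {
    return (this.postings.get(trigram) ?? []).length;
  }
  async primaryKeys(trigram: string): Promise<string[]> {
    this.keyFetches++;
    return (this.postings.get(trigram) ?? []).slice();
  }
}

async function collectCandidates(
  store: TrigramStore,
  trigrams: string[],
  maxDocCount: number,
): Promise<{ usable: { token: string; df: number }[]; matchedDocIds: Set<string> }> {
  // Pass 1: cheap counts only, issued in parallel.
  const counts = await Promise.all(
    trigrams.map(async (token) => ({ token, df: await store.count(token) })),
  );
  // Apply the maxDocCount guard before any key list is loaded.
  const usable = counts.filter(({ df }) => df <= maxDocCount);

  // Pass 2: fetch keys only for trigrams that passed the guard.
  const keyLists = await Promise.all(usable.map(({ token }) => store.primaryKeys(token)));
  const matchedDocIds = new Set<string>();
  for (const ids of keyLists) {
    for (const id of ids) matchedDocIds.add(id);
  }
  return { usable, matchedDocIds };
}
```

The trade-off is one extra round of index reads for every trigram in exchange for never materialising oversized key lists; whether that nets out faster would need measuring on a real corpus.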
Investigate whether we can fetch all the trigram matches from the DB in one query instead of iterating through the trigrams and doing a DB lookup for each one. That could make the search a lot faster.
Corpus-stats recomputation was running synchronously on every bulkPut
batch and on every app startup, scanning all Content docs each time.
With N growing during a fresh sync, this produced O(n²) behaviour on
initial load and ~1s stalls on the first few searches after startup.
Switched both entry points to the existing 10 s debounce — stats now
settle once after sync quiets and once after startup rather than per
batch. Persisted stats carry over between sessions, so BM25 ranking
uses last-session values during the debounce window (tolerably stale).
Measured on a ~700-doc fresh sync: initial load ~7.9s → ~2.4s.
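As a rough illustration of the debounce behaviour described above, here is a small trailing-edge debounce. `createDebounced`, `Schedule`, and `timeoutSchedule` are hypothetical names; the existing 10 s debounce this commit reuses is not shown in this thread, so treat this as a sketch of the pattern only.

```typescript
// Hypothetical helper; not the PR's actual debounce implementation.
type Cancel = () => void;
type Schedule = (fn: () => void, delayMs: number) => Cancel;

// Default scheduler backed by setTimeout; injectable so the logic can be tested without waiting.
const timeoutSchedule: Schedule = (fn, delayMs) => {
  const handle = setTimeout(fn, delayMs);
  return () => clearTimeout(handle);
};

function createDebounced(
  task: () => void,
  delayMs: number,
  schedule: Schedule = timeoutSchedule,
): () => void {
  let cancelPending: Cancel | null = null;
  return () => {
    // Every call resets the timer, so the task runs once after the calls stop.
    if (cancelPending) cancelPending();
    cancelPending = schedule(() => {
      cancelPending = null;
      task();
    }, delayMs);
  };
}

// Usage sketch (names assumed):
//   const scheduleStatsRecompute = createDebounced(recomputeCorpusStats, 10_000);
//   call scheduleStatsRecompute() after each bulkPut batch and once at startup.
```

With this shape, a burst of sync batches triggers a single corpus scan after the last batch, which is what removes the per-batch rescans.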
Search pipeline:
- Merged the separate trigram .count() pass into the primaryKeys pass
(df derived from ids.length) and parallelised trigram lookups via
Promise.all, removing the sequential await-per-trigram pattern.
- Pushed the language filter into the Dexie query so wrong-language
candidates never reach the scoring loops.
- Added an LRU cache of per-doc normalised word sets keyed by
`${_id}:${updatedTimeUtc}`, eliminating repeated stripHtml/normalize
work at query time. Measured step6_wordMatch 64ms → 1ms on warm
cache across repeat queries.
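The word-set cache from the last bullet could look roughly like the sketch below. `LruCache`, `DocLike`, `toWordSet`, `getWordSet`, and the capacity of 500 are assumptions for illustration; `toWordSet` is only a placeholder for the real stripHtml/normalize step, which is not shown in this thread.

```typescript
// Minimal LRU built on Map insertion order; illustrative, not the PR's implementation.
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly capacity: number;
  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: string): V | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }
}

interface DocLike {
  _id: string;
  updatedTimeUtc: number;
  text: string;
}

// Placeholder normalisation: strip tags, lowercase, split on non-alphanumerics.
function toWordSet(html: string): Set<string> {
  const plain = html.replace(/<[^>]*>/g, " ").toLowerCase();
  return new Set(plain.split(/[^a-z0-9]+/).filter((w) => w.length > 0));
}

const wordSetCache = new LruCache<Set<string>>(500);

function getWordSet(doc: DocLike): Set<string> {
  // updatedTimeUtc is part of the key, so an edited doc misses the cache naturally.
  const key = `${doc._id}:${doc.updatedTimeUtc}`;
  let words = wordSetCache.get(key);
  if (!words) {
    words = toWordSet(doc.text);
    wordSetCache.set(key, words);
  }
  return words;
}
```

Keying on the update timestamp avoids explicit invalidation: stale entries are simply never requested again and age out of the LRU.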
Overall search: warm queries 44–120ms (was 106–186ms); cold first
query ~300ms (was 1.4–1.7s due to the startup recompute contention).
Lightweight [perf] console.debug instrumentation left in place so
reviewers can verify on a temporary deployment; console.debug is
filtered from the default devtools output.
This commit removes experimental performance logging code that was added to `ftsSearch.ts`. The instrumentation was used to track the time taken by different stages of the search process.
The performance logging added for debugging the FTS search has been removed.
2df9456 to 979bca1
Implements a new trigram-based inverted index for full-text search. This replaces the previous multi-entry index with a more efficient structure composed of `trigramStats` and `trigramPostings` tables. This change includes:
- A new `trigramIndex.ts` module for managing the index.
- Dexie hooks to automatically update the trigram index on document changes.
- A backfill mechanism to build the index from existing documents.
- Updates to `ftsSearch.ts` to utilize the new index for improved performance.
This commit refactors the trigram index to use a single table (`trigramPostings`) instead of the previous two (`trigramStats` and `trigramPostings`). This simplifies the index structure and reduces the number of database reads required for searching. Additionally, this commit introduces an in-memory cache for `ContentDto` objects to further optimize search performance by avoiding repeated database lookups for frequently accessed documents. The cache is managed using an LRU strategy and invalidated upon document updates.
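To make the "fewer database reads" point concrete, here is a sketch of a single-table lookup that resolves all query trigrams in one batched read. The row shape (`PostingRow`), the `PostingsTable` interface, and `lookupPostings` are assumptions; the actual `trigramPostings` schema is not shown in this thread. The interface only mirrors the contract of a `bulkGet`-style call: results come back in key order, with `undefined` for missing keys.

```typescript
// Assumed row shape: one row per trigram holding its posting list. Not confirmed by the PR.
interface PostingRow {
  trigram: string;
  docIds: string[];
}

// Hypothetical stand-in for the postings table.
interface PostingsTable {
  bulkGet(keys: string[]): Promise<(PostingRow | undefined)[]>;
}

// In-memory implementation so the sketch runs without IndexedDB.
class InMemoryPostings implements PostingsTable {
  reads = 0; // number of round trips made
  private readonly byKey = new Map<string, PostingRow>();
  constructor(rows: PostingRow[]) {
    for (const row of rows) this.byKey.set(row.trigram, row);
  }
  async bulkGet(keys: string[]): Promise<(PostingRow | undefined)[]> {
    this.reads++;
    return keys.map((k) => this.byKey.get(k));
  }
}

async function lookupPostings(
  table: PostingsTable,
  trigrams: string[],
): Promise<Map<string, string[]>> {
  const unique = Array.from(new Set(trigrams));
  // One read for all query trigrams; document frequency falls out of docIds.length.
  const rows = await table.bulkGet(unique);
  const result = new Map<string, string[]>();
  unique.forEach((trigram, i) => {
    const row = rows[i];
    if (row) result.set(trigram, row.docIds);
  });
  return result;
}
```

Because each row carries its own posting list, no second table is needed to learn how common a trigram is.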
Introduces the `HomePageSearch` component and integrates it into the `HomePage.vue` template. Removes unused caching logic from `ftsSearch.ts` and `trigramIndex.ts`.
import IgnorePagePadding from "@/components/IgnorePagePadding.vue";
import HomePagePinned from "@/components/HomePage/HomePagePinned.vue";
import HomePageNewest from "@/components/HomePage/HomePageNewest.vue";
import HomePageSearch from "@/components/HomePage/HomePageSearch.vue";
<template>
  <BasePage>
    <IgnorePagePadding ignoreTop>
      <HomePageSearch />
Update documentation and minor logic in FTS search and indexing. This includes clarifying comments, removing dead code, and simplifying error handling for scheduled recomputations.
FTS search perf comparison: branch `1549` vs `main`
Local IndexedDB, instrumented `shared/src/fts/ftsSearch.ts` with identical phase timers and ran the same queries against each algorithm. Numbers in ms.
Three variants measured:
- `main`. No trigram inverted index, no candidate window, no in-memory caches. Scores every matched doc.
- `1549` with `trigramPostings`, coarse-rank candidate window, plus in-memory `docCache` and `wordSetCache`.
Per-query totals
⚠ "How we can overcome sin" on no-caches hit a sync-contention spike: 2188 ms to load just 60 docs (36 ms/doc vs the normal 5–10 ms). Treat as variance, not algorithmic regression — rerun for a fair picture.
Phase breakdown — "What is the benefits to self-denial"
15 query trigrams, 10 usable, ~442 matched docs in the corpus.
What's driving the win (preserved in no-caches version)
- Σ idf·tf from postings → top `offset+limit+20` docs)
- `trigramPostings` table, single `bulkGet`)
- `count()` + `primaryKeys()`). Saves the ~50 ms `trigramCount` phase outright.
- `[language+_id]` index on docs
- `ContentDto` by `_id`)
- `stripHtml` + `normalizeText` work for docs seen in adjacent queries.
Per-doc scoring cost is identical, the win is in N
The branch isn't faster per doc — it just does the work on far fewer docs.
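The reduction in N comes from the coarse-rank candidate window. A minimal sketch of that step, assuming postings carry a per-document term frequency and using a common BM25-style idf: the `Posting` shape, the `idf` variant, and `coarseCandidates` are illustrative assumptions, not the branch's code.

```typescript
// Assumed posting shape: per-trigram list of (docId, term frequency). Illustrative only.
interface Posting {
  docId: string;
  tf: number;
}

// A common BM25-style idf; the exact variant used on the branch is not shown in this thread.
function idf(totalDocs: number, df: number): number {
  return Math.log(1 + (totalDocs - df + 0.5) / (df + 0.5));
}

function coarseCandidates(
  postingsByTrigram: Map<string, Posting[]>,
  totalDocs: number,
  offset: number,
  limit: number,
  slack: number = 20, // headroom matching the "+20" in the window size
): string[] {
  // Coarse score per doc: sum over query trigrams of idf * tf.
  const coarse = new Map<string, number>();
  postingsByTrigram.forEach((postings) => {
    const weight = idf(totalDocs, postings.length);
    for (const { docId, tf } of postings) {
      coarse.set(docId, (coarse.get(docId) ?? 0) + weight * tf);
    }
  });
  // Highest coarse score first; ties broken by id for a stable order.
  return Array.from(coarse.entries())
    .sort((a, b) => b[1] - a[1] || (a[0] < b[0] ? -1 : 1))
    .slice(0, offset + limit + slack)
    .map(([docId]) => docId);
}
```

Full scoring with field boosts would then run only on the returned ids, so its cost scales with the window size instead of with every matched doc.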
Cost of removing the in-memory caches
`docCache` would have damped some of the `docsLoad` variance. Without it, slow-run spikes are larger.
Recall tradeoff (and why it's fine)
Coarse rank doesn't account for word-match field boosts (e.g. title at 3.0×), so in theory a low-coarse-score doc with a strong title match could be excluded by the candidate window.
In practice this isn't an issue because:
- `slice(0, offset + limit + 20)` grows with offset, so as the user pages forward, more candidates become reachable. A doc that "should have been" in the top 20 by full BM25 but missed the coarse cut will surface within a page or two.
What's still on the table
- `docsLoad` is the remaining bottleneck (~75% of branch latency, baseline ~80 ms uncontended, much worse during sync). Avoiding `db.docs` at search time entirely (precomputed display data in a smaller store) would be the next biggest win.
- Remove the `*fts` multi-entry index from `db.docs`: its only runtime consumer is the fallback path. Removal cuts IDB storage materially and reduces sync write amplification. Backfill iterates docs directly, so it doesn't need the index either.