perf(fts): speed up sync writes and search query latency #1551

Open
MrDirkelz wants to merge 8 commits into main from 1549-shared-increase-fts-search-performance

MrDirkelz (Collaborator) commented Apr 21, 2026

FTS search perf comparison: branch 1549 vs main

Measured against a local IndexedDB: shared/src/fts/ftsSearch.ts was instrumented with identical phase timers, and the same queries were run against each algorithm. All numbers are in ms.

Three variants measured:

  1. main — the algorithm on main. No trigram inverted index, no candidate window, no in-memory caches. Scores every matched doc.
  2. branch+caches — branch 1549 with trigramPostings, coarse-rank candidate window, plus in-memory docCache and wordSetCache.
  3. branch no-caches — the final shipped version: same persistent-IDB optimizations as variant 2 but with both in-memory caches removed (per senior directive: no process-memory caching).

Per-query totals

query main branch+caches branch no-caches
"What is the benefits to self-denial" 1188 465 767
"How to overcome habitual sin" 2501 407
"How we can overcome sin" 1054 2314 ⚠
"14 Bible verses…overcome sin" 1299

⚠ "How we can overcome sin" on no-caches hit a sync-contention spike: 2188 ms to load just 60 docs (36 ms/doc vs the normal 5–10 ms). Treat as variance, not algorithmic regression — rerun for a fair picture.

Phase breakdown — "What is the benefits to self-denial"

15 query trigrams, 10 usable, ~442 matched docs in the corpus.

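| metric | main (slow) | branch+caches | branch no-caches |
|---|---|---|---|
| total | 1188 | 465 | 767 |
| candidates scored | 442 | 60 | 60 |
| corpusStats | 19 | 18 | 23 |
| trigramCount (main only) | 52 | | |
| trigramLookup | 15 | 16 | 8 |
| coarseRank | | 2 | 2 |
| docsLoad | 527 | 343 | 639 |
| bm25 | 94 | 15 | 17 |
| wordMatch | 481 | 71 | 78 |
| sort | 1.3 | 0 | 0.4 |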

What's driving the win (preserved in no-caches version)

| change on the branch | impact | preserved without caches? |
|---|---|---|
| Coarse-rank candidate window (Σ idf·tf from postings → top offset+limit+20 docs) | Dominant. Cuts 442 → 60 docs that get full BM25 + word-match. Multiplies through every later phase. | ✅ Yes |
| Trigram inverted index (trigramPostings table, single bulkGet) | Replaces main's two sequential per-trigram passes (count() + primaryKeys()). Saves the ~50 ms trigramCount phase outright. | ✅ Yes |
| Per-store implicit transactions instead of one big readonly tx | Smaller scopes get granted faster under sync write contention. | ✅ Yes |
| Compound [language+_id] index on docs | Wrong-language candidates miss at the IDB layer: no row read for docs we'd discard. | ✅ Yes |
| Doc cache (LRU 500, ContentDto by _id) | Skipped IDB hydration for overlapping candidate sets during typing/pagination. | ❌ Removed |
| Word-set cache (LRU 500, per-doc normalized field word sets) | Skipped repeat stripHtml + normalizeText work for docs seen in adjacent queries. | ❌ Removed |
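
A minimal sketch of the coarse-rank candidate window described in the first row, to make the Σ idf·tf step concrete. The `Posting` shape, the function name, and the BM25-style idf formula are assumptions for illustration; the PR does not show the actual `trigramPostings` row layout or the idf the branch uses.

```typescript
// Hypothetical posting shape; the real trigramPostings rows may differ.
interface Posting {
  docId: string;
  tf: number; // occurrences of the trigram in this doc
}

/**
 * Coarse-rank docs by summing idf * tf over the query trigrams' postings,
 * then keep only the top (offset + limit + slack) doc ids for full scoring.
 */
function coarseRankWindow(
  postingsByTrigram: Map<string, Posting[]>,
  totalDocs: number,
  offset: number,
  limit: number,
  slack = 20,
): string[] {
  const scores = new Map<string, number>();
  postingsByTrigram.forEach((postings) => {
    const df = postings.length;
    if (df === 0) return;
    // BM25-style idf: an assumption, the branch may use a different formula.
    const idf = Math.log(1 + (totalDocs - df + 0.5) / (df + 0.5));
    for (const { docId, tf } of postings) {
      scores.set(docId, (scores.get(docId) ?? 0) + idf * tf);
    }
  });
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0])) // score desc, id as tie-break
    .slice(0, offset + limit + slack)
    .map(([docId]) => docId);
}
```

Only the ids returned here would go on to doc loading, BM25, and word-match, which is why shrinking this list reduces the cost of every later phase.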

Per-doc scoring cost is roughly identical; the win is in N

| | main | branch+caches | branch no-caches |
|---|---|---|---|
| wordMatch ms / doc | 1.09 | 1.18 | 1.30 |
| bm25 ms / doc | 0.21 | 0.25 | 0.28 |

The branch isn't faster per doc — it just does the work on far fewer docs.

Cost of removing the in-memory caches

| flow | regression |
|---|---|
| Cold first search | None. Both caches start empty anyway; numbers are within run-to-run variance of the cached version. |
| Repeated overlapping queries (interactive typing, pagination) | Theoretical ~30–200 ms slower per follow-up keystroke. Not measured in either trace set; it would need a separate test with repeated queries to quantify. |
| Sync-contention bursts | docCache would have damped some of the docsLoad variance. Without it, slow-run spikes are larger. |

Recall tradeoff (and why it's fine)

Coarse rank doesn't account for word-match field boosts (e.g. title at 3.0×), so in theory a low-coarse-score doc with a strong title match could be excluded by the candidate window.

In practice this isn't an issue because:

  • A title match means the title's trigrams hit, which drives high coarse score anyway — the doc is already in the window.
  • Empirically: title-match queries return the expected doc on the branch.
  • Pagination expands the window: slice(0, offset + limit + 20) grows with offset, so as the user pages forward, more candidates become reachable. A doc that "should have been" in the top 20 by full BM25 but missed the coarse cut will surface within a page or two.

What's still on the table

  • docsLoad is the remaining bottleneck (~75% of branch latency, baseline ~80 ms uncontended, much worse during sync). Avoiding db.docs at search time entirely (precomputed display data in a smaller store) would be the next biggest win.
  • Moving search to a Web Worker wouldn't reduce the work, but it would stop main-thread sync from causing the 6× slow-run amplification.
  • Remove the *fts multi-entry index from db.docs — its only runtime consumer is the fallback path. Removal cuts IDB storage materially and reduces sync write amplification. Backfill iterates docs directly, so it doesn't need the index either.

MrDirkelz linked an issue Apr 21, 2026 that may be closed by this pull request
MrDirkelz self-assigned this Apr 21, 2026
johan-bell previously approved these changes Apr 23, 2026

johan-bell (Collaborator) left a comment

Do we really need to keep the DEV flag? Otherwise LGTM.

johan-bell dismissed their stale review April 23, 2026 08:19

There is a test failing. Please fix it.

MrDirkelz force-pushed the 1549-shared-increase-fts-search-performance branch from 9ead26a to 2df9456 April 28, 2026 11:28

johan-bell (Collaborator) left a comment

We now call primaryKeys() before applying maxDocCount. For high-frequency trigrams, this loads a large ID list that is immediately discarded by if (ids.length > maxDocCount) continue;. Could we do a count() pre-pass and only fetch keys for usable trigrams to avoid extra IDB work and memory pressure?

Comment thread shared/src/fts/ftsSearch.ts Outdated
Comment on lines +111 to +114
    const ids = (await db.docs
        .where("fts")
        .between(trigram + ":", trigram + ";", true, false)
        .primaryKeys()) as string[];

Collaborator

We fetch primaryKeys() for every trigram up front. For very common trigrams this can materialize a large key set before we even know it is usable.

Comment thread shared/src/fts/ftsSearch.ts Outdated
    }
    const matchedDocIds = new Set<string>();
    for (const { token, ids } of trigramResults) {
        if (ids.length > maxDocCount) continue;

Collaborator

The maxDocCount guard happens after key materialization, so oversized trigram matches are still fully loaded and then discarded.

Comment thread shared/src/fts/ftsSearch.ts Outdated
Comment on lines +122 to +123
    usableTrigrams.push({ token, df: ids.length });
    for (const id of ids) matchedDocIds.add(id);

Collaborator

Only usable trigrams should reach this accumulation path; consider a count() pre-pass (or capped scan) so we avoid loading IDs for trigrams that fail the threshold.
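
One way the suggested count() pre-pass could look, sketched against a small hypothetical `TrigramSource` interface rather than the real Dexie table so that it stays self-contained. `collectMatches` and the interface are illustrative names, not code from the PR. The trade-off is one extra round of lookups in exchange for never materializing key lists that the maxDocCount guard would discard.

```typescript
// Hypothetical abstraction over the index; in the PR this would wrap db.docs queries.
interface TrigramSource {
  count(trigram: string): Promise<number>;
  primaryKeys(trigram: string): Promise<string[]>;
}

interface UsableTrigram {
  token: string;
  df: number;
}

async function collectMatches(
  source: TrigramSource,
  trigrams: string[],
  maxDocCount: number,
): Promise<{ usableTrigrams: UsableTrigram[]; matchedDocIds: Set<string> }> {
  // Pass 1: cheap counts only, issued in parallel.
  const counts = await Promise.all(trigrams.map((t) => source.count(t)));

  // Apply the same guard as the reviewed code, but before any keys are loaded.
  const usableTrigrams: UsableTrigram[] = [];
  trigrams.forEach((token, i) => {
    if (counts[i] <= maxDocCount) usableTrigrams.push({ token, df: counts[i] });
  });

  // Pass 2: fetch keys only for trigrams that passed the threshold.
  const keyLists = await Promise.all(
    usableTrigrams.map((u) => source.primaryKeys(u.token)),
  );
  const matchedDocIds = new Set<string>();
  for (const ids of keyLists) {
    for (const id of ids) matchedDocIds.add(id);
  }
  return { usableTrigrams, matchedDocIds };
}
```

Whether this beats the single merged pass depends on how often query trigrams exceed the threshold, so it would need measuring on a realistic corpus.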

Contributor

Investigate if we can fetch all the trigram matches from the DB in one query instead of iterating through the trigrams and doing a DB lookup for each trigram. That will potentially make the search a lot faster
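
A rough sketch of the single-query idea, assuming postings are stored one row per trigram and keyed by the trigram string. `PostingsTable` and `PostingRow` are hypothetical; the interface mirrors the shape of a bulk keyed read that returns one slot per requested key, with `undefined` for misses.

```typescript
// Hypothetical row and table shapes, for illustration only.
interface PostingRow {
  trigram: string;
  docIds: string[];
}

interface PostingsTable {
  // One result slot per requested key, in the same order; undefined when the key is absent.
  bulkGet(keys: string[]): Promise<Array<PostingRow | undefined>>;
}

async function lookupPostings(
  table: PostingsTable,
  trigrams: string[],
): Promise<Map<string, string[]>> {
  // De-duplicate so repeated query trigrams are not requested twice.
  const unique = Array.from(new Set(trigrams));
  // A single round trip replaces the per-trigram lookup loop.
  const rows = await table.bulkGet(unique);
  const postings = new Map<string, string[]>();
  rows.forEach((row, i) => {
    if (row) postings.set(unique[i], row.docIds);
  });
  return postings;
}
```

With this layout, document frequency falls out of `docIds.length`, so no separate count pass is needed.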

MrDirkelz added 3 commits May 5, 2026 13:30
Corpus-stats recomputation was running synchronously on every bulkPut
batch and on every app startup, scanning all Content docs each time.
With N growing during a fresh sync, this produced O(n²) behaviour on
initial load and ~1s stalls on the first few searches after startup.
Switched both entry points to the existing 10 s debounce — stats now
settle once after sync quiets and once after startup rather than per
batch. Persisted stats carry over between sessions, so BM25 ranking
uses last-session values during the debounce window (tolerably stale).

Measured on a ~700-doc fresh sync: initial load ~7.9s → ~2.4s.
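
For reference, a minimal trailing-edge debounce of the kind this message describes: every trigger resets the timer, so the work runs once after activity goes quiet. This is a sketch, not the project's existing debounce; the injectable `Schedule` parameter is an assumption added so the behaviour can be checked without real timers.

```typescript
type Cancel = () => void;
type Schedule = (fn: () => void, delayMs: number) => Cancel;

// Default scheduler backed by real timers.
const timerSchedule: Schedule = (fn, delayMs) => {
  const handle = setTimeout(fn, delayMs);
  return () => clearTimeout(handle);
};

function debounce(
  fn: () => void,
  delayMs: number,
  schedule: Schedule = timerSchedule,
): () => void {
  let cancelPending: Cancel | undefined;
  return () => {
    // A newer trigger supersedes the pending run.
    if (cancelPending) cancelPending();
    cancelPending = schedule(() => {
      cancelPending = undefined;
      fn();
    }, delayMs);
  };
}

// Usage (recomputeCorpusStats is a hypothetical name):
//   const scheduleStats = debounce(recomputeCorpusStats, 10000);
//   call scheduleStats() after every bulkPut batch and once at startup.
```

Calling the debounced function from both entry points means a burst of sync batches collapses into a single recomputation.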

Search pipeline:
- Merged the separate trigram .count() pass into the primaryKeys pass
  (df derived from ids.length) and parallelised trigram lookups via
  Promise.all, removing the sequential await-per-trigram pattern.
- Pushed the language filter into the Dexie query so wrong-language
  candidates never reach the scoring loops.
- Added an LRU cache of per-doc normalised word sets keyed by
  `${_id}:${updatedTimeUtc}`, eliminating repeated stripHtml/normalize
  work at query time. Measured step6_wordMatch 64ms → 1ms on warm
  cache across repeat queries.
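
To illustrate the cache described in the last bullet (which a later commit on this branch removes), here is a small LRU keyed by `${_id}:${updatedTimeUtc}`. The class, the `CachedDoc` shape, and `toWordSet` are simplified stand-ins, not the project's stripHtml/normalizeText code.

```typescript
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = Math.max(1, capacity);
  }

  get(key: string): V | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert so the key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    if (this.entries.size >= this.capacity) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, value);
  }
}

interface CachedDoc {
  _id: string;
  updatedTimeUtc: number;
  text: string;
}

// Simplified stand-in for the real stripHtml + normalizeText pipeline.
function toWordSet(text: string): Set<string> {
  const plain = text.replace(/<[^>]*>/g, " ").toLowerCase();
  return new Set(plain.split(/[^a-z0-9]+/).filter((w) => w.length > 0));
}

function wordSetFor(doc: CachedDoc, cache: LruCache<Set<string>>): Set<string> {
  // An edit changes updatedTimeUtc and therefore the key, so stale sets are never hit.
  const key = `${doc._id}:${doc.updatedTimeUtc}`;
  let words = cache.get(key);
  if (!words) {
    words = toWordSet(doc.text);
    cache.set(key, words);
  }
  return words;
}
```

Including the update timestamp in the key avoids explicit invalidation; old entries simply age out of the LRU.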

Overall search: warm queries 44–120ms (was 106–186ms); cold first
query ~300ms (was 1.4–1.7s due to the startup recompute contention).

Lightweight [perf] console.debug instrumentation left in place so
reviewers can verify on a temporary deployment; console.debug is
filtered from the default devtools output.

This commit removes experimental performance logging code that was added
to `ftsSearch.ts`. The instrumentation was used to track the time taken
by different stages of the search process.

The performance logging added for debugging the FTS search has been
removed.
MrDirkelz force-pushed the 1549-shared-increase-fts-search-performance branch from 2df9456 to 979bca1 May 5, 2026 11:30
MrDirkelz added 4 commits May 5, 2026 14:16
Implements a new trigram-based inverted index for full-text search. This
replaces the previous multi-entry index with a more efficient structure
composed of `trigramStats` and `trigramPostings` tables.

This change includes:
- A new `trigramIndex.ts` module for managing the index.
- Dexie hooks to automatically update the trigram index on document
  changes.
- A backfill mechanism to build the index from existing documents.
- Updates to `ftsSearch.ts` to utilize the new index for improved
  performance.

This commit refactors the trigram index to use a single table
(`trigramPostings`) instead of the previous two (`trigramStats` and
`trigramPostings`). This simplifies the index structure and reduces the
number of database reads required for searching.

Additionally, this commit introduces an in-memory cache for `ContentDto`
objects to further optimize search performance by avoiding repeated
database lookups for frequently accessed documents. The cache is managed
using an LRU strategy and invalidated upon document updates.

Introduces the `HomePageSearch` component and integrates it into the
`HomePage.vue` template.

Removes unused caching logic from `ftsSearch.ts` and `trigramIndex.ts`.

Comment thread app/src/pages/HomePage.vue Outdated
import IgnorePagePadding from "@/components/IgnorePagePadding.vue";
import HomePagePinned from "@/components/HomePage/HomePagePinned.vue";
import HomePageNewest from "@/components/HomePage/HomePageNewest.vue";
import HomePageSearch from "@/components/HomePage/HomePageSearch.vue";

Contributor

remove

<template>
<BasePage>
<IgnorePagePadding ignoreTop>
<HomePageSearch />

Contributor

remove

Update documentation and minor logic in FTS search and indexing. This
includes clarifying comments, removing dead code, and simplifying error
handling for scheduled recomputations.
Development

Successfully merging this pull request may close these issues.

Shared: Increase FTS Search performance

3 participants