perf(fts): speed up sync writes and search query latency by MrDirkelz · Pull Request #1551 · bccsa/luminary

MrDirkelz · 2026-04-21T14:32:57Z

FTS search perf comparison: branch `1549` vs `main`

Setup: local IndexedDB. `shared/src/fts/ftsSearch.ts` was instrumented with identical phase timers, and the same queries were run against each algorithm. All timings are in ms.

Three variants measured:

1. main: the algorithm on `main`. No trigram inverted index, no candidate window, no in-memory caches. Scores every matched doc.
2. branch+caches: branch `1549` with `trigramPostings`, the coarse-rank candidate window, plus the in-memory `docCache` and `wordSetCache`.
3. branch no-caches: the final shipped version. Same persistent-IDB optimizations as variant 2, but with both in-memory caches removed (per senior directive: no process-memory caching).

Per-query totals

| query | main | branch+caches | branch no-caches |
|---|---|---|---|
| "What is the benefits to self-denial" | 1188 | 465 | 767 |
| "How to overcome habitual sin" | 2501 | 407 | - |
| "How we can overcome sin" | 1054 | - | 2314 ⚠ |
| "14 Bible verses…overcome sin" | - | - | 1299 |

⚠ "How we can overcome sin" on no-caches hit a sync-contention spike: 2188 ms to load just 60 docs (36 ms/doc vs the normal 5–10 ms). Treat as variance, not algorithmic regression — rerun for a fair picture.

Phase breakdown — "What is the benefits to self-denial"

15 query trigrams, 10 usable, ~442 matched docs in the corpus.

| metric | main (slow) | branch+caches | branch no-caches |
|---|---|---|---|
| total | 1188 | 465 | 767 |
| candidates scored | 442 | 60 | 60 |
| corpusStats | 19 | 18 | 23 |
| trigramCount (main only) | 52 | - | - |
| trigramLookup | 15 | 16 | 8 |
| coarseRank | - | 2 | 2 |
| docsLoad | 527 | 343 | 639 |
| bm25 | 94 | 15 | 17 |
| wordMatch | 481 | 71 | 78 |
| sort | 1.3 | 0 | 0.4 |

What's driving the win (preserved in no-caches version)

| change on the branch | impact | preserved without caches? |
|---|---|---|
| Coarse-rank candidate window (`Σ idf·tf` from postings → top `offset+limit+20` docs) | Dominant. Cuts 442 → 60 docs that get full BM25 + word-match. Multiplies through every later phase. | ✅ Yes |
| Trigram inverted index (`trigramPostings` table, single `bulkGet`) | Replaces main's two sequential per-trigram passes (`count()` + `primaryKeys()`). Saves the ~50 ms `trigramCount` phase outright. | ✅ Yes |
| Per-store implicit transactions instead of one big readonly tx | Smaller scopes get granted faster under sync write contention. | ✅ Yes |
| Compound `[language+_id]` index on docs | Wrong-language candidates miss at the IDB layer: no row read for docs we'd discard. | ✅ Yes |
| Doc cache (LRU 500, `ContentDto` by `_id`) | Skipped IDB hydration for overlapping candidate sets during typing/pagination. | ❌ Removed |
| Word-set cache (LRU 500, per-doc normalized field word sets) | Skipped repeat `stripHtml` + `normalizeText` work for docs seen in adjacent queries. | ❌ Removed |
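The coarse-rank window from the first row can be sketched roughly as follows. This is a minimal illustration, not the branch's code: the `Postings` shape, the `coarseRank` name, and the idf formula (a standard BM25-style idf) are all assumptions, since the PR does not show the real `trigramPostings` schema or scoring formula.

```typescript
// Hypothetical in-memory view of the postings: trigram -> (docId -> term frequency).
// The real trigramPostings schema is not shown in this PR.
type Postings = Map<string, Map<string, number>>;

function coarseRank(
    queryTrigrams: string[],
    postings: Postings,
    totalDocs: number,
    offset: number,
    limit: number,
): string[] {
    const scores = new Map<string, number>();
    for (const trigram of queryTrigrams) {
        const docs = postings.get(trigram);
        if (!docs) continue;
        // Assumed BM25-style idf; rarer trigrams contribute more.
        const idf = Math.log(1 + (totalDocs - docs.size + 0.5) / (docs.size + 0.5));
        docs.forEach((tf, docId) => {
            // Accumulate sum of idf * tf per document.
            scores.set(docId, (scores.get(docId) ?? 0) + idf * tf);
        });
    }
    // Keep only the top offset + limit + 20 docs for full BM25 + word-match scoring.
    return Array.from(scores.entries())
        .sort((a, b) => b[1] - a[1])
        .slice(0, offset + limit + 20)
        .map(([docId]) => docId);
}
```

Because the slice bound includes `offset`, the candidate set grows as the user pages forward, which is the behaviour the recall discussion relies on.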

Per-doc scoring cost is roughly the same; the win is in N

| metric | main | branch+caches | branch no-caches |
|---|---|---|---|
| wordMatch ms / doc | 1.09 | 1.18 | 1.30 |
| bm25 ms / doc | 0.21 | 0.25 | 0.28 |

The branch isn't faster per doc — it just does the work on far fewer docs.
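The per-doc figures follow directly from the phase breakdown: each phase time divided by the number of candidates scored. A small worked check, using only numbers from the tables above (the variable names are illustrative):

```typescript
// Phase timings (ms) and candidate counts copied from the phase breakdown.
const runs = [
    { variant: "main", candidates: 442, wordMatchMs: 481, bm25Ms: 94 },
    { variant: "branch+caches", candidates: 60, wordMatchMs: 71, bm25Ms: 15 },
    { variant: "branch no-caches", candidates: 60, wordMatchMs: 78, bm25Ms: 17 },
];

// Per-doc cost = phase time / candidates scored, rounded to two decimals.
const perDoc = runs.map((r) => ({
    variant: r.variant,
    wordMatchMsPerDoc: Number((r.wordMatchMs / r.candidates).toFixed(2)),
    bm25MsPerDoc: Number((r.bm25Ms / r.candidates).toFixed(2)),
}));
```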

Cost of removing the in-memory caches

| flow | regression |
|---|---|
| Cold first search | None. Both caches start empty anyway; numbers are within run-to-run variance of the cached version. |
| Repeated overlapping queries (interactive typing, pagination) | Theoretical ~30–200 ms slower per follow-up keystroke. Not measured in either trace set; quantifying it would need a separate test with repeated queries. |
| Sync-contention bursts | `docCache` would have damped some of the `docsLoad` variance. Without it, slow-run spikes are larger. |

Recall tradeoff (and why it's fine)

Coarse rank doesn't account for word-match field boosts (e.g. title at 3.0×), so in theory a low-coarse-score doc with a strong title match could be excluded by the candidate window.

In practice this isn't an issue because:

- A title match means the title's trigrams hit, which drives a high coarse score anyway, so the doc is already in the window.
- Empirically: title-match queries return the expected doc on the branch.
- Pagination expands the window: `slice(0, offset + limit + 20)` grows with `offset`, so as the user pages forward, more candidates become reachable. A doc that "should have been" in the top 20 by full BM25 but missed the coarse cut will surface within a page or two.

What's still on the table

- `docsLoad` is the remaining bottleneck (~75% of branch latency, baseline ~80 ms uncontended, much worse during sync). Avoiding `db.docs` at search time entirely (precomputed display data in a smaller store) would be the next biggest win.
- A Web Worker wouldn't reduce work but would stop main-thread sync from causing the 6× slow-run amplification.
- Remove the `*fts` multi-entry index from `db.docs`. Its only runtime consumer is the fallback path. Removal cuts IDB storage materially and reduces sync write amplification. Backfill iterates docs directly, so it doesn't need the index either.

johan-bell

Do we really need to keep the DEV flag? Otherwise LGTM.

There is a failing test. Please fix it.

johan-bell

We now call primaryKeys() before applying maxDocCount. For high-frequency trigrams, this loads a large ID list that is immediately discarded by if (ids.length > maxDocCount) continue;. Could we do a count() pre-pass and only fetch keys for usable trigrams to avoid extra IDB work and memory pressure?

johan-bell · 2026-04-29T10:54:39Z

+            const ids = (await db.docs
+                .where("fts")
+                .between(trigram + ":", trigram + ";", true, false)
+                .primaryKeys()) as string[];


We fetch primaryKeys() for every trigram up front. For very common trigrams this can materialize a large key set before we even know it is usable.

johan-bell · 2026-04-29T10:54:57Z

-        }
+    const matchedDocIds = new Set<string>();
+    for (const { token, ids } of trigramResults) {
+        if (ids.length > maxDocCount) continue;


The maxDocCount guard happens after key materialization, so oversized trigram matches are still fully loaded and then discarded.

johan-bell · 2026-04-29T10:57:27Z

+        usableTrigrams.push({ token, df: ids.length });
+        for (const id of ids) matchedDocIds.add(id);


Only usable trigrams should reach this accumulation path; consider a count() pre-pass (or capped scan) so we avoid loading IDs for trigrams that fail the threshold.
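A minimal sketch of the suggested `count()` pre-pass, for discussion. The `TrigramLookup` interface and the `collectMatches` name are hypothetical stand-ins so the example runs without IndexedDB; in the PR the equivalent calls go through Dexie on `db.docs`. Note that an earlier commit on this branch deliberately merged the count pass into the `primaryKeys()` pass, so this reintroduces an extra read per trigram in exchange for never materializing oversized key lists.

```typescript
// Hypothetical minimal view of the index (stand-in for the Dexie queries in the diff).
interface TrigramLookup {
    count(trigram: string): Promise<number>;
    primaryKeys(trigram: string): Promise<string[]>;
}

interface UsableTrigram {
    token: string;
    df: number;
}

async function collectMatches(
    index: TrigramLookup,
    trigrams: string[],
    maxDocCount: number,
): Promise<{ usableTrigrams: UsableTrigram[]; matchedDocIds: Set<string> }> {
    // Pass 1: cheap document-frequency counts for every trigram, in parallel.
    const counted = await Promise.all(
        trigrams.map(async (token) => ({ token, df: await index.count(token) })),
    );
    // Drop trigrams over the threshold before any key list is loaded.
    const usableTrigrams = counted.filter((t) => t.df <= maxDocCount);

    // Pass 2: fetch primary keys only for the usable trigrams.
    const keyLists = await Promise.all(
        usableTrigrams.map((t) => index.primaryKeys(t.token)),
    );
    const matchedDocIds = new Set<string>();
    keyLists.forEach((ids) => ids.forEach((id) => matchedDocIds.add(id)));
    return { usableTrigrams, matchedDocIds };
}
```

Whether the extra `count()` reads pay off depends on how many high-frequency trigrams typical queries contain; that would need measuring.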

ivanslabbert · 2026-05-05T07:34:23Z

Investigate whether we can fetch all the trigram matches from the DB in one query instead of iterating through the trigrams and doing a DB lookup for each one. That could make the search a lot faster.
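One way to read this suggestion is a single batched read keyed by trigram, which is what the `trigramPostings` table with a single `bulkGet` in the PR description does. A rough sketch under stated assumptions: the `PostingRow` shape and the `lookupTrigrams` name are hypothetical, and `PostingsTable.bulkGet` only mirrors the shape of Dexie's `Table.bulkGet`, which resolves to one entry per requested key with `undefined` for keys that are missing.

```typescript
// Hypothetical posting row: one row per trigram holding the matching doc ids.
interface PostingRow {
    trigram: string;
    docIds: string[];
}

// Stand-in for a Dexie table; only the bulkGet shape is assumed.
interface PostingsTable {
    bulkGet(keys: string[]): Promise<(PostingRow | undefined)[]>;
}

async function lookupTrigrams(
    table: PostingsTable,
    trigrams: string[],
    maxDocCount: number,
) {
    const unique = Array.from(new Set(trigrams));
    // One round trip for all query trigrams instead of one lookup per trigram.
    const rows = await table.bulkGet(unique);

    const usableTrigrams: { token: string; df: number }[] = [];
    const matchedDocIds = new Set<string>();
    rows.forEach((row) => {
        // Skip trigrams with no postings and trigrams that are too common to be useful.
        if (!row || row.docIds.length > maxDocCount) return;
        usableTrigrams.push({ token: row.trigram, df: row.docIds.length });
        row.docIds.forEach((id) => matchedDocIds.add(id));
    });
    return { usableTrigrams, matchedDocIds };
}
```

In this shape a very common trigram's row is still read in full before it is discarded, so the threshold check saves scoring work but not the read itself.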

Corpus-stats recomputation was running synchronously on every bulkPut batch and on every app startup, scanning all Content docs each time. With N growing during a fresh sync, this produced O(n²) behaviour on initial load and ~1s stalls on the first few searches after startup. Switched both entry points to the existing 10 s debounce: stats now settle once after sync quiets and once after startup rather than per batch. Persisted stats carry over between sessions, so BM25 ranking uses last-session values during the debounce window (tolerably stale). Measured on a ~700-doc fresh sync: initial load ~7.9s → ~2.4s.

Search pipeline:
- Merged the separate trigram .count() pass into the primaryKeys pass (df derived from ids.length) and parallelised trigram lookups via Promise.all, removing the sequential await-per-trigram pattern.
- Pushed the language filter into the Dexie query so wrong-language candidates never reach the scoring loops.
- Added an LRU cache of per-doc normalised word sets keyed by `${_id}:${updatedTimeUtc}`, eliminating repeated stripHtml/normalize work at query time. Measured step6_wordMatch 64ms → 1ms on warm cache across repeat queries.

Overall search: warm queries 44–120ms (was 106–186ms); cold first query ~300ms (was 1.4–1.7s due to the startup recompute contention).

Lightweight [perf] console.debug instrumentation left in place so reviewers can verify on a temporary deployment; console.debug is filtered from the default devtools output.
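For readers unfamiliar with the pattern, the word-set cache described in this commit message can be sketched as a small Map-based LRU. This is an illustrative sketch only: `LruCache` and `wordSetKey` are hypothetical names, and the final version of the branch removed the in-memory caches. Keying by `${_id}:${updatedTimeUtc}` means an updated doc gets a new key, so stale entries are never hit and simply age out.

```typescript
// Minimal LRU built on Map, which iterates keys in insertion order.
class LruCache<V> {
    private readonly entries = new Map<string, V>();
    private readonly capacity: number;

    constructor(capacity: number) {
        this.capacity = capacity;
    }

    get(key: string): V | undefined {
        if (!this.entries.has(key)) return undefined;
        const value = this.entries.get(key) as V;
        // Re-insert so this key becomes the most recently used.
        this.entries.delete(key);
        this.entries.set(key, value);
        return value;
    }

    set(key: string, value: V): void {
        if (this.entries.has(key)) this.entries.delete(key);
        this.entries.set(key, value);
        if (this.entries.size > this.capacity) {
            // The first key in iteration order is the least recently used.
            const oldest = this.entries.keys().next();
            if (!oldest.done) this.entries.delete(oldest.value);
        }
    }

    get size(): number {
        return this.entries.size;
    }
}

// Hypothetical helper following the key scheme named in the commit message.
function wordSetKey(doc: { _id: string; updatedTimeUtc: number }): string {
    return `${doc._id}:${doc.updatedTimeUtc}`;
}
```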

This commit removes experimental performance logging code that was added to `ftsSearch.ts`. The instrumentation was used to track the time taken by different stages of the search process.

The performance logging added for debugging the FTS search has been removed.

Implements a new trigram-based inverted index for full-text search. This replaces the previous multi-entry index with a more efficient structure composed of `trigramStats` and `trigramPostings` tables. This change includes:
- A new `trigramIndex.ts` module for managing the index.
- Dexie hooks to automatically update the trigram index on document changes.
- A backfill mechanism to build the index from existing documents.
- Updates to `ftsSearch.ts` to utilize the new index for improved performance.

This commit refactors the trigram index to use a single table (`trigramPostings`) instead of the previous two (`trigramStats` and `trigramPostings`). This simplifies the index structure and reduces the number of database reads required for searching. Additionally, this commit introduces an in-memory cache for `ContentDto` objects to further optimize search performance by avoiding repeated database lookups for frequently accessed documents. The cache is managed using an LRU strategy and invalidated upon document updates.

Introduces the `HomePageSearch` component and integrates it into the `HomePage.vue` template. Removes unused caching logic from `ftsSearch.ts` and `trigramIndex.ts`.

ivanslabbert · 2026-05-12T08:29:02Z

 import IgnorePagePadding from "@/components/IgnorePagePadding.vue";
 import HomePagePinned from "@/components/HomePage/HomePagePinned.vue";
 import HomePageNewest from "@/components/HomePage/HomePageNewest.vue";
+import HomePageSearch from "@/components/HomePage/HomePageSearch.vue";


ivanslabbert · 2026-05-12T08:29:06Z

 <template>
    <BasePage>
        <IgnorePagePadding ignoreTop>
+            <HomePageSearch />


Update documentation and minor logic in FTS search and indexing. This includes clarifying comments, removing dead code, and simplifying error handling for scheduled recomputations.

MrDirkelz linked an issue Apr 21, 2026 that may be closed by this pull request

Shared: Increase FTS Search performance #1549

Open

MrDirkelz self-assigned this Apr 21, 2026

johan-bell previously approved these changes Apr 23, 2026

MrDirkelz force-pushed the 1549-shared-increase-fts-search-performance branch from 9ead26a to 2df9456 Compare April 28, 2026 11:28

johan-bell reviewed Apr 29, 2026

ivanslabbert reviewed May 5, 2026

MrDirkelz added 3 commits May 5, 2026 13:30

Remove temporary performance logging

ec7eb7f

Remove temporary performance logging

979bca1

MrDirkelz force-pushed the 1549-shared-increase-fts-search-performance branch from 2df9456 to 979bca1 Compare May 5, 2026 11:30

MrDirkelz added 4 commits May 5, 2026 14:16

tmp

c03d766

Add search component to homepage

6f29966

ivanslabbert reviewed May 12, 2026

Refactor FTS search and indexing logic

25f0a0e

MrDirkelz wants to merge 8 commits into main from 1549-shared-increase-fts-search-performance

3 participants
