Skip to content

[ENG-1224 & ENG-343] Replaced fuzzy with minisearch in Discourse Summoning Menu#694

Open
trangdoan982 wants to merge 3 commits intomainfrom
eng-1224-implement-minisearch-library-for-node-summoning-menu-and
Open

[ENG-1224 & ENG-343] Replaced fuzzy with minisearch in Discourse Summoning Menu#694
trangdoan982 wants to merge 3 commits intomainfrom
eng-1224-implement-minisearch-library-for-node-summoning-menu-and

Conversation

@trangdoan982
Copy link
Collaborator

@trangdoan982 trangdoan982 commented Jan 13, 2026

https://www.loom.com/share/ed8e82e53a784c08a2befd4bd32ba98e

Summary by CodeRabbit

  • New Features
    • Enhanced search performance with improved indexing approach.
    • Better search query handling with support for type-based filtering.
    • Optimized empty search results with filtered display capabilities.

✏️ Tip: You can customize this high-level summary in your review settings.


Open with Devin

@linear
Copy link

linear bot commented Jan 13, 2026

@trangdoan982
Copy link
Collaborator Author

@coderabbitai full review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 13, 2026

✅ Actions performed

Full review triggered.

@supabase
Copy link

supabase bot commented Jan 13, 2026

This pull request has been ignored for the connected project zytfjzqyijgagqxrzbmz because there are no changes detected in packages/database/supabase directory. You can change this behaviour in Project Integrations Settings ↗︎.


Preview Branches by Supabase.
Learn more about Supabase Branching ↗︎.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 13, 2026

📝 Walkthrough

Walkthrough

Refactors the DiscourseNodeSearchMenu component to replace fuzzy-based local search with MiniSearch-based indexing. Adds minisearch as a dependency and updates the search implementation to index nodes during initial load and perform searches via MiniSearch queries, while preserving existing UI behavior.

Changes

Cohort / File(s) Change Summary
Package Dependencies
apps/roam/package.json
Added minisearch (^7.2.0) dependency; removed trailing newline
Search Implementation
apps/roam/src/components/DiscourseNodeSearchMenu.tsx
Replaced fuzzy-based allNodes caching with MiniSearch-based indexing; introduced searchWithMiniSearch function; added miniSearchRef to persist MiniSearch instance; updated indexing flow to build documentsToIndex array and index with MiniSearch; adjusted effect/state orchestration to use miniSearchRef; maintains existing block update and UI behavior (+193/-75 lines)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Title check ✅ Passed The title accurately summarizes the main change: replacing fuzzy search with minisearch in the DiscourseNodeSearchMenu component, which is the Discourse Summoning Menu.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In @apps/roam/package.json:
- Line 63: Remove the unused "fuzzy" dependency entry from package.json (the
line containing "fuzzy": "^0.1.3") since the project now uses minisearch; update
the dependencies block by deleting that dependency and run a package manager
clean/install (npm/yarn/pnpm) to update lockfile and node_modules accordingly.

In @apps/roam/src/components/DiscourseNodeSearchMenu.tsx:
- Around line 399-402: The capture call uses item.id which can be undefined
because Result only guarantees text and uid and searchWithMiniSearch maps
results without id; update the mapping in searchWithMiniSearch (and any places
mapping MiniSearch results) to include id: result.uid so items include id, or
change the posthog.capture call in DiscourseNodeSearchMenu to use item.uid (with
a fallback to item.id) — adjust the Result type if needed and update
searchNodesForType mappings to remain consistent with the new shape.
🧹 Nitpick comments (4)
apps/roam/src/components/DiscourseNodeSearchMenu.tsx (4)

168-176: Excessive type casting suggests typing mismatch.

The as unknown as MinisearchResult casts indicate that MiniSearch's generic typing isn't being leveraged correctly. MiniSearch supports generics that should eliminate these casts.

The MiniSearch.search() method should return properly typed results when the instance is typed as MiniSearch<MinisearchResult>. Consider verifying the MiniSearch type definitions to avoid runtime type assumptions.

MiniSearch typescript generics storeFields return type

195-195: Document the score threshold magic number.

The 0.1 score threshold is undocumented. Consider extracting this to a named constant with a comment explaining why this threshold was chosen.

💡 Suggested improvement
+// Minimum relevance score threshold - filters out very weak matches
+const MIN_SEARCH_SCORE = 0.1;
+
 // In searchWithMiniSearch:
-const filteredResults = rawSearchResults.filter((r) => r.score > 0.1);
+const filteredResults = rawSearchResults.filter((r) => r.score > MIN_SEARCH_SCORE);

158-160: Remove or gate performance logging for production.

Multiple console.log statements (lines 158-160, 223-225, 284-286) will output to the console in production. Consider removing them or gating behind a debug flag.

💡 Suggested improvement
+const DEBUG_SEARCH = process.env.NODE_ENV === 'development';
+
 // Then wrap logs:
-console.log(
-  `[MiniSearch] Empty search - ${allDocuments} total documents, ${searchDuration.toFixed(2)}ms`,
-);
+if (DEBUG_SEARCH) {
+  console.log(
+    `[MiniSearch] Empty search - ${allDocuments} total documents, ${searchDuration.toFixed(2)}ms`,
+  );
+}

111-141: Consider moving searchNodesForType outside the component.

This function doesn't depend on component props or state, so it could be defined outside the component to avoid recreation on each render.

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4dadfc5 and e121955.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (2)
  • apps/roam/package.json
  • apps/roam/src/components/DiscourseNodeSearchMenu.tsx
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/main.mdc)

**/*.{ts,tsx}: Use Tailwind CSS for styling where possible
When refactoring inline styles, use tailwind classes
Prefer type over interface in TypeScript
Use explicit return types for functions
Avoid any types when possible
Prefer arrow functions over regular function declarations
Use named parameters (object destructuring) when a function has more than 2 parameters
Use PascalCase for components and types
Use camelCase for variables and functions
Use UPPERCASE for constants
Function names should describe their purpose clearly
Prefer early returns over nested conditionals for better readability

Files:

  • apps/roam/src/components/DiscourseNodeSearchMenu.tsx
apps/roam/**/*.{js,ts,tsx,jsx,json}

📄 CodeRabbit inference engine (.cursor/rules/roam.mdc)

Prefer existing dependencies from package.json when working on the Roam Research extension

Files:

  • apps/roam/src/components/DiscourseNodeSearchMenu.tsx
  • apps/roam/package.json
apps/roam/**/*.{ts,tsx,jsx,js,css,scss}

📄 CodeRabbit inference engine (.cursor/rules/roam.mdc)

Use BlueprintJS 3 components and Tailwind CSS for platform-native UI in the Roam Research extension

Files:

  • apps/roam/src/components/DiscourseNodeSearchMenu.tsx
apps/roam/**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/roam.mdc)

apps/roam/**/*.{ts,tsx,js,jsx}: Use the roamAlphaApi docs from https://roamresearch.com/#/app/developer-documentation/page/tIaOPdXCj when implementing Roam functionality
Use Roam Depot/Extension API docs from https://roamresearch.com/#/app/developer-documentation/page/y31lhjIqU when implementing extension functionality

Files:

  • apps/roam/src/components/DiscourseNodeSearchMenu.tsx
apps/roam/**

📄 CodeRabbit inference engine (.cursor/rules/roam.mdc)

Implement the Discourse Graph protocol in the Roam Research extension

Files:

  • apps/roam/src/components/DiscourseNodeSearchMenu.tsx
  • apps/roam/package.json
🧠 Learnings (8)
📚 Learning: 2025-06-22T10:40:52.752Z
Learnt from: sid597
Repo: DiscourseGraphs/discourse-graph PR: 232
File: apps/roam/src/utils/getAllDiscourseNodesSince.ts:18-31
Timestamp: 2025-06-22T10:40:52.752Z
Learning: In apps/roam/src/utils/getAllDiscourseNodesSince.ts, the user confirmed that querying for `?title` with `:node/title` and mapping it to the `text` field in the DiscourseGraphContent type is the correct implementation for retrieving discourse node content from Roam Research, despite it appearing to query page titles rather than block text content.

Applied to files:

  • apps/roam/src/components/DiscourseNodeSearchMenu.tsx
📚 Learning: 2025-08-25T15:53:21.799Z
Learnt from: sid597
Repo: DiscourseGraphs/discourse-graph PR: 372
File: apps/roam/src/components/DiscourseNodeMenu.tsx:116-116
Timestamp: 2025-08-25T15:53:21.799Z
Learning: In apps/roam/src/components/DiscourseNodeMenu.tsx, when handling tag insertion, multiple leading hashtags (like ##foo) should be preserved as they represent user intent, not normalized to a single hashtag. The current regex /^#/ is correct as it only removes one leading # before adding one back, maintaining any additional hashtags the user intended.

Applied to files:

  • apps/roam/src/components/DiscourseNodeSearchMenu.tsx
📚 Learning: 2025-11-25T00:52:41.934Z
Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-11-25T00:52:41.934Z
Learning: Applies to apps/roam/**/*.{js,ts,tsx,jsx,json} : Prefer existing dependencies from package.json when working on the Roam Research extension

Applied to files:

  • apps/roam/package.json
📚 Learning: 2025-11-25T00:52:36.174Z
Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/obsidian.mdc:0-0
Timestamp: 2025-11-25T00:52:36.174Z
Learning: Applies to apps/obsidian/** : Prefer existing dependencies from apps/obsidian/package.json when adding dependencies to the Obsidian plugin

Applied to files:

  • apps/roam/package.json
📚 Learning: 2025-08-11T19:09:58.252Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 337
File: apps/roam/src/components/DiscourseFloatingMenu.tsx:43-43
Timestamp: 2025-08-11T19:09:58.252Z
Learning: The roam subdirectory (apps/roam) is constrained to React 17 and cannot use React 18+ features like createRoot API. ReactDOM.render should be used instead of createRoot in this subdirectory.

Applied to files:

  • apps/roam/package.json
📚 Learning: 2025-11-25T00:52:41.934Z
Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-11-25T00:52:41.934Z
Learning: Applies to apps/roam/**/*.{ts,tsx,js,jsx} : Use the roamAlphaApi docs from https://roamresearch.com/#/app/developer-documentation/page/tIaOPdXCj when implementing Roam functionality

Applied to files:

  • apps/roam/package.json
📚 Learning: 2025-11-25T00:52:41.934Z
Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-11-25T00:52:41.934Z
Learning: Applies to apps/roam/**/*.{ts,tsx,jsx,js,css,scss} : Use BlueprintJS 3 components and Tailwind CSS for platform-native UI in the Roam Research extension

Applied to files:

  • apps/roam/package.json
📚 Learning: 2025-11-25T00:52:41.934Z
Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-11-25T00:52:41.934Z
Learning: Applies to apps/roam/**/*.{ts,tsx,js,jsx} : Use Roam Depot/Extension API docs from https://roamresearch.com/#/app/developer-documentation/page/y31lhjIqU when implementing extension functionality

Applied to files:

  • apps/roam/package.json
🧬 Code graph analysis (1)
apps/roam/src/components/DiscourseNodeSearchMenu.tsx (4)
apps/roam/src/utils/types.ts (1)
  • Result (42-46)
apps/obsidian/src/components/canvas/TldrawView.tsx (1)
  • onClose (228-256)
apps/obsidian/src/components/BulkIdentifyDiscourseNodesModal.tsx (1)
  • onClose (442-447)
apps/obsidian/src/components/DiscourseContextView.tsx (1)
  • onClose (145-150)
🪛 GitHub Actions: CI
apps/roam/package.json

[error] 1-1: The importer resolution is broken at dependency "react": version "19.1.1" doesn't satisfy range "18.2.0"

🔇 Additional comments (4)
apps/roam/src/components/DiscourseNodeSearchMenu.tsx (4)

38-40: LGTM!

The MinisearchResult type cleanly extends Result with the required type field for MiniSearch integration.


254-279: LGTM!

The MiniSearch initialization is well-structured:

  • Proper configuration with fields, storeFields, and idField
  • Efficient bulk indexing with addAll()
  • Correct storage in ref for persistence across renders

299-314: LGTM!

The search effect correctly:

  • Guards against loading/uninitialized state
  • Filters types based on user selection
  • Includes all necessary dependencies

711-723: LGTM!

The use of ReactDOM.render and ReactDOM.unmountComponentAtNode is correct for this codebase, as the roam subdirectory is constrained to React 17. Based on learnings, this is the expected pattern.

if (!acc[result.type]) {
acc[result.type] = [];
}
if (acc[result.type].length < MAX_ITEMS_PER_TYPE) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trangdoan982 trangdoan982 changed the title [ENG-1224] Replaced fuzzy with minisearch in Discourse Summoning Menu [ENG-1224 & ENG-343] Replaced fuzzy with minisearch in Discourse Summoning Menu Jan 13, 2026
Copy link
Contributor

@mdroidian mdroidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this initial task might have fallen through the cracks:

Okay, so the first round of testing should be to verify the lag time for plugin-testing-akamatsulab and plugin-testing-megacoglab, on computer (old one if possible) and phone, for current and proposed search strategy @doantrang982

Let's run the tests on larger graphs and on slower devices and record the results 👍

@trangdoan982
Copy link
Collaborator Author

@mdroidian
image
it takes consistently ~200-300ms for me to index and query the megacog graph, which has ~5000 nodes. It seems like the order of magnitude scale for the number of nodes and query+index time is the same. So a graph with 10,000 nodes might take ~400-600ms to query (including fetching all the nodes -> index -> query). I think this is a good result for a large graph

@trangdoan982 trangdoan982 force-pushed the eng-1224-implement-minisearch-library-for-node-summoning-menu-and branch from 55e38be to a28d8b3 Compare February 2, 2026 15:37
Copy link

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 potential issue.

View issue and 5 additional flags in Devin Review.

Open in Devin Review

Comment on lines +206 to +209
acc[result.type].push({
text: result.text,
uid: result.uid,
});

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Missing id field in MiniSearch results causes undefined value in analytics tracking

The new MiniSearch implementation fails to include the id field when mapping search results back to Result objects, causing item.id to be undefined when accessed for posthog analytics tracking.

Click to expand

How the bug occurs

In the original implementation using fuzzy search, the searchNodesForType function at line 134-137 created Result objects with id: result.uid:

return results.map(([result]: any) => ({
  id: result.uid,
  text: result.title || result.string,
  uid: result.uid,
}));

However, in the new searchWithMiniSearch function, when mapping results back to Result objects, the id field is omitted:

  • Lines 166-169: { text: r.text, uid: r.uid } (empty search with type filter)
  • Lines 206-209: { text: result.text, uid: result.uid } (search with term)

Later at line 375, the code accesses item.id for analytics:

posthog.capture("Discourse Node: Selected from Search Menu", {
  id: item.id,  // This will be undefined
  text: item.text,
});

Impact

Analytics events will have undefined for the id field instead of the actual document identifier, reducing the usefulness of tracking data.

Recommendation: Add the id field when creating Result objects. For example, change line 206-209 to:

acc[result.type].push({
  id: result.uid,
  text: result.text,
  uid: result.uid,
});

Similarly update lines 166-169.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Contributor

@mdroidian mdroidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there, but looks like we are still missing some of the required data from the initial task :

Okay, so the first round of testing should be to verify the lag time for plugin-testing-akamatsulab and plugin-testing-megacoglab, on computer (old one if possible) and phone, for current and proposed search strategy @doantrang982

Let's run the tests on larger graphs and on slower devices and record the results 👍

RE:

it takes consistently ~200-300ms for me to index and query the megacog graph, which has ~5000 nodes. It seems like the order of magnitude scale for the number of nodes and query+index time is the same. So a graph with 10,000 nodes might take ~400-600ms to query (including fetching all the nodes -> index -> query). I think this is a good result for a large graph

What device were these tests run on? Did you run it on a phone or simulate a slower device as well?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants