Rust native Indexer #12

@malwarebo

Summary

Replace the current TypeScript in-memory indexer (src/nyrve/indexer/index-manager.ts) with a Rust native addon (src/nyrve/indexer/native/) exposed via N-API. The current implementation stores everything in a Map<string, FileIndexEntry>. It works but does not scale: on projects with 10k+ files, the initial index build blocks the extension host thread, symbol search is O(n) over every entry, memory usage grows linearly with project size, and nothing persists across sessions.

Why This Is Needed

The indexer is a dependency for four other Nyrve systems that are already production-complete:

  1. Verification engine - test-runner.ts calls indexManager.searchFiles() to find relevant test files for modified source files. Slow search = slow verification loop.
  2. Structure scanner (Project DNA) - structure-scanner.ts calls indexManager.searchFiles('') to enumerate all files, then getFileSymbols() per file. On a 10k-file project this is thousands of synchronous Map lookups in a tight loop.
  3. Import checker - import-checker.ts resolves every import in modified files against the index. Each resolution tries multiple extensions (['.ts', '.tsx', '.js', ...]), and each attempt is a separate index lookup.
  4. @-mention autocomplete - mention-resolver.ts does fuzzy symbol search on every keystroke. Must return results in <50ms or the dropdown feels laggy.

The TypeScript indexer is the performance ceiling for all of these. A Rust addon with an on-disk index removes that ceiling.

Current State

  • src/nyrve/indexer/index-manager.ts - Full TypeScript implementation (400 lines). In-memory Map-based storage, file watcher integration, fuzzy search. This is the contract the Rust addon must match.
  • src/nyrve/indexer/symbol-extractor.ts - Extracts symbols via VS Code's ILanguageFeaturesService. Stays in TypeScript (depends on Monaco APIs). The Rust addon should accept symbols from this service.
  • src/nyrve/indexer/nyrveignore.ts - .nyrveignore file parsing. Stays in TypeScript.
  • src/nyrve/indexer/native/src/lib.rs - Empty placeholder.
  • src/nyrve/indexer/native/Cargo.toml - Skeleton with no dependencies.

Requirements

Must Have

  • Implement the INyrveIndexManager interface (defined in index-manager.ts lines 57-91) from Rust, exposed via N-API
  • buildIndex() - Index 10k files in <10 seconds (current TS impl: ~60s)
  • updateFile() - Incremental single-file re-index in <50ms
  • searchSymbols(query) - Fuzzy symbol search returning results in <10ms for 100k symbols
  • searchFiles(query) - Fuzzy file path search returning results in <5ms
  • getFileSymbols(path) / getFileEntry(path) - O(1) lookup
  • On-disk persistence - Index stored in .nyrve/index.db (SQLite or custom binary format), survives editor restart
  • Respect .nyrveignore patterns passed from the TypeScript layer
  • Non-blocking - All heavy operations run off the main thread. The N-API binding must use AsyncTask or equivalent
  • Cross-platform builds -macOS (arm64, x64), Windows (x64), Linux (x64, arm64)
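
As a rough illustration of the getFileSymbols / searchSymbols contract above, here is a minimal std-only sketch. Symbol, SymbolIndex, and fuzzy_score are hypothetical names and do not exist in the repository. The subsequence scorer is a simple stand-in for nucleo or fuzzy-matcher, and the search is still a linear scan, so this shows the shape of the API rather than a way to reach the <10ms target.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the symbol data passed in from the TypeScript layer.
#[derive(Debug, Clone, PartialEq)]
pub struct Symbol {
    pub name: String,
    pub line: u32,
}

/// Case-insensitive subsequence match. Returns None unless every character of
/// `query` appears in `candidate` in order. Higher scores are better: +1 per
/// matched character, +2 for a match at position 0, +3 for adjacent matches.
pub fn fuzzy_score(query: &str, candidate: &str) -> Option<i32> {
    let mut score = 0i32;
    let mut cand = candidate.chars().flat_map(char::to_lowercase).enumerate();
    let mut prev: Option<usize> = None;
    for q in query.chars().flat_map(char::to_lowercase) {
        let (idx, _) = cand.by_ref().find(|&(_, c)| c == q)?;
        score += 1;
        if idx == 0 {
            score += 2;
        }
        if prev.map_or(false, |p| idx == p + 1) {
            score += 3;
        }
        prev = Some(idx);
    }
    Some(score)
}

/// Hypothetical in-memory index keyed by file path.
pub struct SymbolIndex {
    by_path: HashMap<String, Vec<Symbol>>,
}

impl SymbolIndex {
    pub fn new() -> Self {
        Self { by_path: HashMap::new() }
    }

    /// Replaces the symbols stored for one file (incremental update).
    pub fn update_file(&mut self, path: &str, symbols: Vec<Symbol>) {
        self.by_path.insert(path.to_string(), symbols);
    }

    /// Hash lookup by path: O(1) on average.
    pub fn get_file_symbols(&self, path: &str) -> Option<&[Symbol]> {
        self.by_path.get(path).map(Vec::as_slice)
    }

    /// Scores every symbol and returns the best `limit` hits as
    /// (path, symbol, score), ordered by score, then name, then path.
    pub fn search_symbols(&self, query: &str, limit: usize) -> Vec<(String, Symbol, i32)> {
        let mut hits = Vec::new();
        for (path, symbols) in &self.by_path {
            for s in symbols {
                if let Some(score) = fuzzy_score(query, &s.name) {
                    hits.push((path.clone(), s.clone(), score));
                }
            }
        }
        hits.sort_by(|a, b| {
            b.2.cmp(&a.2)
                .then_with(|| a.1.name.cmp(&b.1.name))
                .then_with(|| a.0.cmp(&b.0))
        });
        hits.truncate(limit);
        hits
    }
}
```

In a real addon the scan would presumably be replaced by a dedicated matcher over a prebuilt candidate list, with the HashMap kept only for the path lookups.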

Should Have

  • Content hashing (xxhash) for incremental rebuild - skip files that haven't changed since last index
  • Import graph storage -track which files import which, enabling fast "find dependents" queries for the verification engine
  • Memory-mapped file reading for large projects
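
The first two items can be sketched with the standard library alone. This is an illustration under assumptions: HashCache and ImportGraph are hypothetical names, and std's DefaultHasher stands in for xxhash. DefaultHasher output is not guaranteed stable across Rust releases, so a persisted index would need a fixed algorithm such as the one in xxhash-rust.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashMap, HashSet};
use std::hash::Hasher;

/// Stand-in for an xxhash digest of the file content.
pub fn content_hash(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(bytes);
    hasher.finish()
}

/// Hypothetical cache of the last hash seen per path.
pub struct HashCache {
    seen: HashMap<String, u64>,
}

impl HashCache {
    pub fn new() -> Self {
        Self { seen: HashMap::new() }
    }

    /// Records the current hash and returns true when the file is new or
    /// its content differs from the previously recorded hash.
    pub fn needs_reindex(&mut self, path: &str, bytes: &[u8]) -> bool {
        let hash = content_hash(bytes);
        match self.seen.insert(path.to_string(), hash) {
            Some(old) => old != hash,
            None => true,
        }
    }
}

/// Hypothetical import graph that keeps forward and reverse edges in sync,
/// so "find dependents" is a single map lookup.
pub struct ImportGraph {
    imports: HashMap<String, HashSet<String>>,
    dependents: HashMap<String, HashSet<String>>,
}

impl ImportGraph {
    pub fn new() -> Self {
        Self { imports: HashMap::new(), dependents: HashMap::new() }
    }

    /// Replaces the import list of `file`, removing stale reverse edges first.
    pub fn set_imports(&mut self, file: &str, targets: &[&str]) {
        if let Some(old) = self.imports.remove(file) {
            for target in old {
                if let Some(set) = self.dependents.get_mut(&target) {
                    set.remove(file);
                }
            }
        }
        let new: HashSet<String> = targets.iter().map(|t| t.to_string()).collect();
        for target in &new {
            self.dependents
                .entry(target.clone())
                .or_default()
                .insert(file.to_string());
        }
        self.imports.insert(file.to_string(), new);
    }

    /// Files that directly import `file`, sorted for deterministic output.
    pub fn dependents_of(&self, file: &str) -> Vec<String> {
        let mut out: Vec<String> = self
            .dependents
            .get(file)
            .map(|set| set.iter().cloned().collect())
            .unwrap_or_default();
        out.sort();
        out
    }
}
```

With this shape, the verification engine could ask for the dependents of a modified source file instead of fuzzy-searching for related test files, assuming import edges are supplied alongside the symbols.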

Nice to Have

  • Trigram index for content search (grep-like queries against the index without reading files)
  • Watch integration - Rust-side file watcher as alternative to VS Code's IFileService.watch()
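
For the trigram item, a minimal sketch of the idea follows. TrigramIndex is a hypothetical name, and the index is byte-based and case-sensitive for brevity. It returns candidate files only: sharing all trigrams with the query does not guarantee the query appears as a contiguous substring, so a real implementation would still confirm each candidate against the stored or on-disk content.

```rust
use std::collections::{HashMap, HashSet};

/// All distinct 3-byte windows of `text`.
fn trigrams(text: &str) -> HashSet<[u8; 3]> {
    text.as_bytes()
        .windows(3)
        .map(|w| [w[0], w[1], w[2]])
        .collect()
}

/// Hypothetical inverted index from trigram to the ids of files containing it.
pub struct TrigramIndex {
    postings: HashMap<[u8; 3], HashSet<u32>>,
    paths: Vec<String>,
}

impl TrigramIndex {
    pub fn new() -> Self {
        Self { postings: HashMap::new(), paths: Vec::new() }
    }

    /// Registers a file and adds its id to the posting set of each trigram.
    pub fn add_file(&mut self, path: &str, content: &str) {
        let id = self.paths.len() as u32;
        self.paths.push(path.to_string());
        for t in trigrams(content) {
            self.postings.entry(t).or_default().insert(id);
        }
    }

    /// Paths of files that contain every trigram of `query`, in insertion
    /// order. Queries shorter than three bytes cannot be filtered and return
    /// every file.
    pub fn candidates(&self, query: &str) -> Vec<&str> {
        let mut result: Option<HashSet<u32>> = None;
        for t in trigrams(query) {
            let posting = match self.postings.get(&t) {
                Some(p) => p,
                None => return Vec::new(), // a trigram no file contains
            };
            result = Some(match result {
                Some(acc) => acc.intersection(posting).copied().collect(),
                None => posting.clone(),
            });
        }
        let mut ids: Vec<u32> = match result {
            Some(set) => set.into_iter().collect(),
            None => (0..self.paths.len() as u32).collect(),
        };
        ids.sort_unstable();
        ids.into_iter()
            .map(|i| self.paths[i as usize].as_str())
            .collect()
    }
}
```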

Suggested Rust Dependencies

Crate                     Purpose
------------------------  ------------------------------------------------
napi / napi-derive        N-API bindings for Node.js
rusqlite                  SQLite for on-disk persistence
ignore                    .gitignore/.nyrveignore pattern matching
nucleo or fuzzy-matcher   Fast fuzzy matching for symbol/file search
xxhash-rust               Content hashing for incremental rebuild
rayon                     Parallel file reading during initial index build
dashmap                   Concurrent hashmap for the in-memory index

Architecture Guidance

TypeScript (extension host)                     Rust (N-API addon)
------------------------------------           ------------------------------------
NyrveIndexManager                    ------>   nyrve_indexer::IndexEngine
  calls native addon methods                     - SQLite-backed storage
  passes NyrveSymbol[] from                      - In-memory fuzzy search index
  symbol-extractor.ts                            - Rayon-parallelized file scan
                                                 - xxhash content dedup
NyrveSymbolExtractor
  stays in TypeScript
  uses Monaco ILanguageFeaturesService
  feeds symbols INTO the Rust index

The TypeScript NyrveIndexManager class should become a thin wrapper that delegates to the native addon. Symbol extraction stays in TypeScript because it depends on VS Code's language server protocol. The Rust side receives pre-extracted symbols and stores/indexes them.
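
To make the "Rayon-parallelized file scan" box in the diagram concrete, here is a small sketch that splits a batch of files across worker threads. It uses std::thread::scope as a stand-in for rayon so it has no dependencies, and it works on in-memory (path, bytes) pairs instead of reading from disk. FileSummary, summarize, and scan_parallel are hypothetical names, and the per-file work (counting bytes and newlines) is a placeholder for whatever the real engine computes.

```rust
use std::thread;

/// Hypothetical per-file result of the scan.
#[derive(Debug, PartialEq)]
pub struct FileSummary {
    pub path: String,
    pub bytes: usize,
    pub lines: usize,
}

/// Placeholder for the real per-file work (hashing, parsing, and so on).
fn summarize(path: &str, content: &[u8]) -> FileSummary {
    FileSummary {
        path: path.to_string(),
        bytes: content.len(),
        lines: content.iter().filter(|&&b| b == b'\n').count(),
    }
}

/// Splits `files` into up to `workers` contiguous chunks, processes each
/// chunk on its own scoped thread, and returns results in input order.
pub fn scan_parallel(files: &[(String, Vec<u8>)], workers: usize) -> Vec<FileSummary> {
    if files.is_empty() {
        return Vec::new();
    }
    let workers = workers.max(1);
    let chunk_len = (files.len() + workers - 1) / workers;
    thread::scope(|scope| {
        // Spawn every worker before joining any of them.
        let handles: Vec<_> = files
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .map(|(path, content)| summarize(path, content))
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("scan worker panicked"))
            .collect()
    })
}
```

This only covers the parallel scan itself. Keeping the call off the extension host thread is a separate concern that the N-API binding layer would have to handle.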

Acceptance Criteria

  1. npm run compile-check-ts-native passes
  2. INyrveIndexManager interface is fully implemented - all existing callers work without changes
  3. Benchmark on a 10k-file TypeScript project:
    • Initial index build: <10 seconds
    • Symbol search (100k symbols): <10ms p95
    • File search: <5ms p95
    • Incremental update: <50ms
  4. Index persists in .nyrve/index.db - reopening the editor skips full rebuild
  5. Builds on macOS arm64, macOS x64, Linux x64, Windows x64
  6. Existing Nyrve unit tests pass (scripts/test.sh --grep "Nyrve")

Files to Modify

  • src/nyrve/indexer/native/Cargo.toml - Add dependencies
  • src/nyrve/indexer/native/src/lib.rs - Main implementation
  • src/nyrve/indexer/index-manager.ts - Refactor to delegate to native addon
  • package.json - Add native build scripts
  • build/ - Add cross-platform native compilation to CI

References

  • Current interface: src/nyrve/indexer/index-manager.ts lines 57-91
  • Symbol types: src/nyrve/indexer/symbol-extractor.ts (NyrveSymbol, NyrveFileSymbols)
  • Index types: src/nyrve/indexer/index-manager.ts (FileIndexEntry, IndexSearchResult, IndexStats)
  • Ignore patterns: src/nyrve/indexer/nyrveignore.ts
  • Consumers: src/nyrve/agent/verification/test-runner.ts, src/nyrve/agent/verification/import-checker.ts, src/nyrve/memory/dna/structure-scanner.ts, src/nyrve/context/mention-resolver.ts
