Proposal: High-Performance Architecture Refactor (Zero-FFI Queries, Pre-Computed Metadata) #130

@AlexCannonball

Problem Statement

The current architecture relies heavily on dynamic, on-the-fly Tree-sitter node traversal during LSP requests (such as Hover, DocumentSymbol, and the WorkspaceSymbol implementation).

pub fn parse(&mut self, uri: Url, contents: impl AsRef<[u8]>) -> Option<ParsedTree> {
    self.parser.parse(contents, None).map(|t| ParsedTree {
        tree: Arc::new(t),
        uri,
    })
}

This introduces several core limitations:

  1. FFI Overhead: Each node.parent(), node.child(), or node.kind() call crosses the Rust-to-C boundary. Under the current_thread Tokio flavor these calls run on the same thread that serves every other request, so their accumulated latency delays the whole server.
  2. Brittle Navigation & Features: Features like rename (see #129, "Feature request: rename messages from referenced site") are tightly coupled to strict parent-child grammar structures in the AST (message_name vs message_or_enum_type), which makes cross-reference features complex and error-prone to implement via direct AST iteration.
  3. Redundant Parsing: WorkspaceSymbol currently rescans and reparses the entire workspace from scratch on every user query (PR #95, "Add support for workspace symbols"), which causes unnecessary CPU spikes and blocks the single-threaded async loop.

Proposed Solution

Transition from dynamic tree traversal to a Push-Based Document Meta-Model pattern. Instead of querying the AST on each user action, the server should map the AST into safe, native Rust structures once per file change using declarative Tree-sitter SCM queries.

The Core Data Structure Concept

pub struct ProtoEntity {
    pub id: usize,
    pub parent_id: Option<usize>,
    pub meta: EntityMeta, // Name, FQN, lsp::Range, doc-comments, deprecated status
    pub kind: EntityKind, // Message, Field (with tags), Service, RPC, Enum
    pub children: Vec<usize>,
}

pub struct ProtoDocument {
    pub version: i32,
    pub package: String,
    pub imports: Vec<String>,
    pub pool: Vec<ProtoEntity>,           // Flat safe storage
    pub spatial_index: Vec<(u32, usize)>, // Sorted line-to-entity map for O(log N) Hovers
}
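
To illustrate how the spatial_index could serve Hover, here is a minimal sketch, not a final design. It uses simplified stand-ins (Entity, Document, start_line, end_line, and entity_at are hypothetical names, not existing code in this repository) and assumes entities are strictly nested and that ids are assigned in document order. The lookup binary-searches for the last entity starting at or before the cursor line, then walks up parent_id until a range contains the line.

```rust
/// Simplified stand-in for the proposed `ProtoEntity`; only the fields
/// needed for a position lookup are kept. `start_line`/`end_line` are
/// hypothetical and would come from the entity's `lsp::Range`.
#[derive(Debug)]
pub struct Entity {
    pub name: String,
    pub parent_id: Option<usize>,
    pub start_line: u32,
    pub end_line: u32, // inclusive
}

/// Simplified stand-in for the proposed `ProtoDocument`.
pub struct Document {
    pub pool: Vec<Entity>,
    /// (start_line, entity id), sorted ascending.
    pub spatial_index: Vec<(u32, usize)>,
}

impl Document {
    /// Build the sorted index once per file change.
    pub fn new(pool: Vec<Entity>) -> Self {
        let mut spatial_index: Vec<(u32, usize)> = pool
            .iter()
            .enumerate()
            .map(|(id, e)| (e.start_line, id))
            .collect();
        spatial_index.sort();
        Document { pool, spatial_index }
    }

    /// Innermost entity whose line range contains `line`, if any.
    /// Cost: one binary search plus a walk up the nesting depth.
    pub fn entity_at(&self, line: u32) -> Option<&Entity> {
        // Number of index entries with start_line <= line.
        let upper = self
            .spatial_index
            .partition_point(|&(start, _)| start <= line);
        if upper == 0 {
            return None;
        }
        // Start from the last entity that begins at or before `line`.
        // Because entities nest, any entity containing `line` is either
        // this one or one of its ancestors.
        let mut current = Some(self.spatial_index[upper - 1].1);
        while let Some(id) = current {
            let e = &self.pool[id];
            if e.start_line <= line && line <= e.end_line {
                return Some(e);
            }
            current = e.parent_id;
        }
        None
    }
}
```

This sketch resolves positions at line granularity only. Distinguishing two entities on the same line would need the column from lsp::Range as well.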

Strategic Implementation Plan

  1. Phase 1: Internal Meta-Model & Single-Pass Extractor

    • Introduce a declarative METAMODEL_QUERY (SCM) to match all semantic entities (message, field, etc.) in a single execution pass.
    • Parse flat matches into a ProtoDocument using a context-stack approach to build parent-child relations natively in Rust without nested FFI calls.
    • Rewrite DocumentSymbol and Hover to pull directly from this Rust model.
  2. Phase 2: Global Memory Index (Instant Workspace Symbols)

    • Move project-wide scanning out of the active workspace/symbol handler.
    • Run workspace indexing once at startup to populate a HashMap<Url, ProtoDocument>.
    • Make workspace/symbol queries a fast, pure-Rust substring match against the cached pool.
  3. Phase 3: Cross-File Resolution & Fallbacks (fixes #129, "Feature request: rename messages from referenced site")

    • Implement Name Resolution (FQN matching) to link references (like RPC arguments) to their definitions across different files. This completely decouples features like Rename and Go to Definition from raw AST node types.
    • Embed google/protobuf/descriptor.proto as a virtual built-in document to resolve standard options natively, while allowing overrides from local -I include flags.
  4. Phase 4: Incremental Pipeline Optimization

    • Enable TextDocumentSyncKind::Incremental on the LSP side.
    • Update the model cache granularly using Tree-sitter's changed_ranges() diffing mechanism.
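
The context-stack approach from Phase 1 could look roughly like the sketch below. It assumes the query pass has already been reduced to a flat list of matches carrying a name and a byte range (FlatMatch, Entity, and build_pool are hypothetical names used only for illustration). No Tree-sitter calls happen inside the loop: parent-child links and FQNs are derived purely from byte-range containment.

```rust
/// One flattened query match (hypothetical shape). `end_byte` is
/// exclusive, as with Tree-sitter node ranges.
pub struct FlatMatch {
    pub name: String,
    pub start_byte: usize,
    pub end_byte: usize,
}

/// Reduced version of the proposed `ProtoEntity`.
#[derive(Debug)]
pub struct Entity {
    pub id: usize,
    pub parent_id: Option<usize>,
    pub name: String,
    pub fqn: String,
    pub children: Vec<usize>,
}

/// Build the flat pool with parent/child links from range nesting alone.
pub fn build_pool(package: &str, mut matches: Vec<FlatMatch>) -> Vec<Entity> {
    // Document order, with the enclosing range first when starts tie.
    matches.sort_by(|a, b| {
        a.start_byte
            .cmp(&b.start_byte)
            .then(b.end_byte.cmp(&a.end_byte))
    });

    let mut pool: Vec<Entity> = Vec::with_capacity(matches.len());
    // Stack of currently open entities: (entity id, end_byte).
    let mut stack: Vec<(usize, usize)> = Vec::new();

    for m in matches {
        // Close every entity that ended before this match starts.
        while let Some(&(_, end)) = stack.last() {
            if end <= m.start_byte {
                stack.pop();
            } else {
                break;
            }
        }

        let id = pool.len();
        // Whatever is still open on top of the stack encloses this match.
        let parent_id = stack.last().map(|&(pid, _)| pid);
        let fqn = match parent_id {
            Some(pid) => format!("{}.{}", pool[pid].fqn, m.name),
            None if package.is_empty() => m.name.clone(),
            None => format!("{}.{}", package, m.name),
        };
        if let Some(pid) = parent_id {
            pool[pid].children.push(id);
        }
        pool.push(Entity {
            id,
            parent_id,
            name: m.name,
            fqn,
            children: Vec::new(),
        });
        stack.push((id, m.end_byte));
    }
    pool
}
```

Because each entity is pushed and popped at most once, the pass is linear in the number of matches after the initial sort.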

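For Phase 2, a rough sketch of the cached index and the substring match is shown below. To stay dependency-free it keys documents by a plain String URI instead of Url and stores only FQNs per document; WorkspaceIndex, SymbolHit, update, and search are hypothetical names. The case-insensitive comparison is an assumption of this sketch, not something the plan above prescribes.

```rust
use std::collections::HashMap;

/// One workspace/symbol result (hypothetical, reduced shape).
#[derive(Debug, Clone, PartialEq)]
pub struct SymbolHit {
    pub uri: String,
    pub fqn: String,
}

/// In-memory index: document URI -> FQNs of the entities it declares.
pub struct WorkspaceIndex {
    docs: HashMap<String, Vec<String>>,
}

impl WorkspaceIndex {
    pub fn new() -> Self {
        WorkspaceIndex { docs: HashMap::new() }
    }

    /// Called once per file at startup and again whenever a file changes;
    /// replaces whatever was cached for that URI.
    pub fn update(&mut self, uri: &str, fqns: Vec<String>) {
        self.docs.insert(uri.to_string(), fqns);
    }

    /// Case-insensitive substring match over the cached names.
    /// No parsing or file I/O happens here.
    pub fn search(&self, query: &str) -> Vec<SymbolHit> {
        let needle = query.to_lowercase();
        let mut hits = Vec::new();
        for (uri, fqns) in &self.docs {
            for fqn in fqns {
                if fqn.to_lowercase().contains(needle.as_str()) {
                    hits.push(SymbolHit {
                        uri: uri.clone(),
                        fqn: fqn.clone(),
                    });
                }
            }
        }
        // HashMap iteration order is unspecified, so sort for stable output.
        hits.sort_by(|a, b| a.fqn.cmp(&b.fqn).then(a.uri.cmp(&b.uri)));
        hits
    }
}
```

An empty query matches every cached symbol in this sketch; a real handler would probably cap the number of results.
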
I would appreciate your feedback on this direction. If you find the approach acceptable, I can begin deeper research and prepare reviewable PRs, starting with Phase 1 (Meta-Model definition and Extractor logic).
