-
Notifications
You must be signed in to change notification settings - Fork 0
Open
Labels
enhancementNew feature or requestNew feature or request
Description
Summary
Add tree-sitter-based parsing for Python and Go alongside the existing TypeScript Compiler API parser. TypeScript-only support locks out 80%+ of the market.
Motivation
- codebase-memory-mcp supports 64 languages via tree-sitter — single Go binary
- CKB uses SCIP indexes for multi-language support
- CodePathfinder is Python-only but proves demand for non-TS analysis
- Every competitive analysis perspective flagged single-language as the existential risk
- Python is the most common language in AI agent codebases
Approach
Phase 1: Parser Abstraction
- Extract parser interface from current TS Compiler API implementation
- Define
ParsedFilecontract that both parsers satisfy - Keep TS Compiler API for TypeScript (higher quality: type resolution, call sites)
Phase 2: tree-sitter Integration
- Add
tree-sitter+ language grammars as dependencies - Implement parser for Python (functions, classes, imports, exports)
- Implement parser for Go (functions, structs, imports, packages)
- Map tree-sitter AST nodes to existing
ParsedFileinterface
Phase 3: Graph Unification
- Multi-language files in same graph
- Cross-language edges (e.g., Python calling TS via API)
- Metrics compute identically regardless of source language
Tradeoffs
| Aspect | TS Compiler API | tree-sitter |
|---|---|---|
| Type resolution | Full | None |
| Call site confidence | type-resolved | text-inferred only |
| Language support | TypeScript only | 64+ languages |
| Parse speed | Slower | Faster |
| Dependency | typescript npm | tree-sitter + grammars |
Acceptance Criteria
- Python files parsed: functions, classes, imports, exports
- Go files parsed: functions, structs, imports, packages
- Mixed-language repos produce unified graph
- All existing metrics compute on non-TS files
- TS Compiler API still used for .ts/.tsx files (no quality regression)
- Tests for Python and Go parsing with real fixture files
Priority
Short-term — Critical for market viability. Start with Python.
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
enhancementNew feature or requestNew feature or request