Project Structure

This document outlines the modular architecture of Smart Coding MCP.

Directory Structure

smart-coding-mcp/
├── index.js                    # Main entry point, MCP server setup
├── package.json                # Package configuration
├── config.json                 # User configuration
├── LICENSE                     # MIT License
├── README.md                   # Project documentation
├── EXAMPLES.md                 # Usage examples
├── .gitignore                  # Git ignore rules
│
├── lib/                        # Core libraries
│   ├── config.js              # Configuration loader
│   ├── cache.js               # Embeddings cache management
│   └── utils.js               # Shared utilities (chunking, similarity)
│
├── features/                   # Pluggable features
│   ├── hybrid-search.js       # Semantic search feature
│   ├── index-codebase.js      # Code indexing feature
│   └── clear-cache.js         # Cache management feature
│
└── scripts/                    # Utility scripts
    └── clear-cache.js         # Cache management utility

Module Responsibilities

index.js

MCP server initialization
Feature registry and orchestration
Tool request routing
Global state management (embedder, cache)

lib/config.js

Loads and validates configuration from config.json
Provides default configuration values
Resolves file paths

lib/cache.js

EmbeddingsCache class
Manages persistence of embedding vectors
File hash tracking for change detection
Load/save operations for disk cache

lib/utils.js

cosineSimilarity() - Vector similarity calculation
hashContent() - MD5 hashing for change detection
smartChunk() - Language-aware code chunking

features/hybrid-search.js

HybridSearch class
Combines semantic and exact matching
Weighted scoring algorithm
Result formatting with relevance scores
MCP tool: semantic_search

features/index-codebase.js

CodebaseIndexer class
File discovery via glob patterns
Incremental indexing
File watcher for real-time updates
MCP tool: index_codebase

Adding New Features

To extend with a new feature:

1. Create Feature Module

Create features/my-feature.js:

export class MyFeature {
  constructor(embedder, cache, config) {
    this.embedder = embedder;
    this.cache = cache;
    this.config = config;
  }

  async execute(params) {
    // Implementation
    return {
      /* results */
    };
  }
}

export function getToolDefinition(config) {
  return {
    name: "my_tool",
    description: "What this tool does",
    inputSchema: {
      type: "object",
      properties: {
        param1: { type: "string", description: "..." },
      },
      required: ["param1"],
    },
  };
}

export async function handleToolCall(request, instance) {
  const params = request.params.arguments;
  const result = await instance.execute(params);

  return {
    content: [
      {
        type: "text",
        text: JSON.stringify(result, null, 2),
      },
    ],
  };
}

2. Register in index.js

import * as MyFeature from "./features/my-feature.js";

// In initialize():
const myFeature = new MyFeature.MyFeature(embedder, cache, config);

// Add to features array:
const features = [
  // ... existing features
  {
    module: MyFeature,
    instance: myFeature,
    handler: MyFeature.handleToolCall,
  },
];

3. Done!

The feature will automatically:

Be listed in MCP tool discovery
Handle incoming tool requests
Have access to embeddings and cache

Configuration Flow

User creates/edits config.json
lib/config.js loads configuration on startup
Configuration merged with defaults
Passed to all features via constructor

Data Flow

Indexing Flow

User code files
    ↓
glob pattern matching
    ↓
smartChunk() - split into chunks
    ↓
embedder - generate vectors
    ↓
EmbeddingsCache - store in memory + disk

Search Flow

User query
    ↓
embedder - query to vector
    ↓
cosineSimilarity() - score all chunks
    ↓
exact match boost - adjust scores
    ↓
sort and filter - top N results
    ↓
format output - markdown with syntax highlighting

Performance Considerations

Caching Strategy

First Run: Download model (~90MB), index all files, save cache
Subsequent Runs: Load cache from disk, only index changed files
File Changes: Incremental updates via file watcher

Memory Usage

Approximate memory usage:

Base (Node.js + libraries): ~50MB
Embedding model: ~100MB
Vector store: ~10KB per code chunk
Example: 1000 files × 20 chunks/file = ~200MB

Optimization Tips

Reduce chunkSize for large codebases
Disable watchFiles if not needed
Use excludePatterns aggressively
Limit fileExtensions to relevant types

Future Feature Ideas

Potential features to add following this architecture:

Code Complexity Analysis
- Cyclomatic complexity scoring
- Technical debt detection
Pattern Detection
- Anti-pattern identification
- Best practice recommendations
Documentation Generation
- Auto-generate function docs
- README generation from code
Refactoring Suggestions
- Code smell detection
- Automated fix suggestions
Test Coverage Analysis
- Identify untested code paths
- Generate test templates
Dependency Analysis
- Import/export graph
- Dead code detection

Each feature would follow the same pattern:

Class in features/ directory
Access to embedder, cache, config
MCP tool definition and handler
Registration in feature array

Testing Strategy

Recommended testing approach:

Unit Tests: lib/ modules
- Test utilities in isolation
- Mock dependencies
Integration Tests: features/
- Test with sample codebases
- Verify MCP tool contracts
E2E Tests: Full workflow
- Index → Search → Results
- File watching behavior
- Cache persistence

Error Handling

Each module follows defensive error handling:

Config errors → use defaults
File read errors → log and skip
Embedding errors → retry or skip chunk
Cache errors → log but continue
Unknown tools → return helpful error message

All errors logged to stderr for MCP protocol compatibility.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Project Structure

Directory Structure

Module Responsibilities

index.js

lib/config.js

lib/cache.js

lib/utils.js

features/hybrid-search.js

features/index-codebase.js

Adding New Features

1. Create Feature Module

2. Register in index.js

3. Done!

Configuration Flow

Data Flow

Indexing Flow

Search Flow

Performance Considerations

Caching Strategy

Memory Usage

Optimization Tips

Future Feature Ideas

Testing Strategy

Error Handling

FilesExpand file tree

ARCHITECTURE.md

Latest commit

History

ARCHITECTURE.md

File metadata and controls

Project Structure

Directory Structure

Module Responsibilities

index.js

lib/config.js

lib/cache.js

lib/utils.js

features/hybrid-search.js

features/index-codebase.js

Adding New Features

1. Create Feature Module

2. Register in index.js

3. Done!

Configuration Flow

Data Flow

Indexing Flow

Search Flow

Performance Considerations

Caching Strategy

Memory Usage

Optimization Tips

Future Feature Ideas

Testing Strategy

Error Handling