doc : add FIM architecture documentation #126

Draft
ggerganov wants to merge 3 commits into master from gg/fim-architecture-doc

@ggerganov
Member

Description

Comprehensive HTML documentation illustrating the FIM (Fill-in-the-Middle) completion system architecture.

Content

  • Request flow with cache hit/miss branches (Mermaid diagram)
  • Trigger mechanism with dual autocmds and debounce
  • Local context construction (prefix/middle/suffix) with editor visualization
  • LRU cache system with SHA-256 hashing and multi-hash variants
  • Fuzzy matching over 128 typed characters with interactive demo
  • Ring buffer context with similarity-based eviction (Dice coefficient)
  • Request payload and response fields
  • Response rendering with repetition filtering
  • Speculative pipeline for KV cache warm-up
  • Accept modes (full/line/word) and cycling
  • Timing breakdown with gantt chart
  • Full configuration reference with parameters and keymaps
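
For readers who have not seen the cache section yet, here is a minimal Python sketch of an LRU cache keyed by a SHA-256 hash of the local context. It is an illustration only, not the plugin's code: the names `context_key` and `LRUCache`, the NUL separator, and the default capacity are all assumptions made for this example.

```python
import hashlib
from collections import OrderedDict
from typing import Optional


def context_key(prefix: str, middle: str, suffix: str) -> str:
    """Hash the local context into a fixed-size cache key (hypothetical layout)."""
    # A NUL separator keeps ("ab", "c", "") distinct from ("a", "bc", "").
    joined = "\x00".join((prefix, middle, suffix))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


class LRUCache:
    """Least-recently-used cache built on an ordered dictionary."""

    def __init__(self, capacity: int = 250) -> None:  # capacity is an assumed default
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None  # cache miss
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: str) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        # Evict from the least recently used end until within capacity.
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)
```

On a cache hit the stored completion can be shown immediately; on a miss the request goes to the server and the response is stored under the same key.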

Technical details

  • Pure HTML/CSS/JS, no external dependencies
  • Dark theme matching GitHub's dark mode
  • Interactive fuzzy matching demo
  • Responsive layout
  • ~42KB, 640 lines

Screenshots

The documentation includes visual diagrams for:

  • Request flow with branching logic
  • FIM context construction (prefix/middle/suffix)
  • Fuzzy matching character-by-character visualization
  • Ring buffer chunk lifecycle
  • Speculative pipeline warm-up

AI Usage Disclosure: YES. llama.cpp + pi

Comprehensive HTML documentation illustrating the FIM (Fill-in-the-Middle)
completion system architecture, covering:

- Request flow with cache hit/miss branches
- Trigger mechanism with dual autocmds
- Local context construction (prefix/middle/suffix)
- LRU cache system with SHA-256 hashing
- Fuzzy matching over 128 typed characters
- Ring buffer context with similarity eviction
- Request payload and response fields
- Response rendering with repetition filtering
- Speculative pipeline for KV cache warm-up
- Accept modes and keybindings
- Timing breakdown
- Full configuration reference

Includes interactive demo for fuzzy matching visualization.
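
A minimal Python sketch of the idea behind the fuzzy matching, under the assumption that it reuses a cached completion when the characters typed since that completion was generated match its beginning. The function name `reuse_cached`, the plain dictionary keyed by raw prefix text, and the lookup order are hypothetical; only the 128-character window comes from the description above.

```python
from typing import Dict, Optional


def reuse_cached(cache: Dict[str, str], prefix: str,
                 max_back: int = 128) -> Optional[str]:
    """Return the unused tail of a cached completion, or None on a miss.

    Walks back over up to `max_back` recently typed characters, looking for
    an earlier prefix whose cached completion starts with what was typed.
    """
    for back in range(0, min(max_back, len(prefix)) + 1):
        head = prefix[:len(prefix) - back]   # context at the time of caching
        typed = prefix[len(prefix) - back:]  # characters typed since then
        cached = cache.get(head)
        if cached is None:
            continue
        # Reusable only if the typed text agrees with the suggestion and
        # something is still left to suggest.
        if cached.startswith(typed) and len(cached) > len(typed):
            return cached[len(typed):]
    return None
```

For example, with a cached completion `o(bar):` stored for the prefix `def fo`, typing on to `def foo(` yields the remainder `bar):` without a new request.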

Assisted-by: llama.cpp:local pi
ggerganov force-pushed the gg/fim-architecture-doc branch from 65d32cc to 1f91b61 on May 14, 2026 13:32
ggerganov added 2 commits May 14, 2026 16:48
Assisted-by: llama.cpp:local pi
Illustrate the instruct-based editing flow: interaction sequence, prompt
construction, request lifecycle states, buffer visualization, keymap table,
and request object schema.

Assisted-by: llama.cpp:local pi