Fast hybrid code search for agents. Pure Go, single static binary, drop-in MCP-compatible with MinishLab/semble — same tool schemas, same output format, install steps swapped to a Go binary.
ken is a Go port of semble: BM25 lexical + Model2Vec semantic embeddings + RRF fusion + a code-aware reranker, with the retrieval algorithm ported verbatim from semble's search.py + ranking/*.py.
- ~97% recall@10 in the default (hybrid) mode — 0.967 NL / 0.995 symbol on semble's 1,251-query benchmark, vs grep's ~99.9% — while costing an agent ~46× fewer tokens than
grep + Read(4,120 vs 189,773 median tokens on NL queries — measured in the same default hybrid mode). For "find the chunk that answers this," that's a 1–2 order-of-magnitude token win at near-parity recall. (Reproduce:docs/BENCH.md.) - Single static binary. Pure Go, no cgo, no Python interpreter on cold start, no GIL on indexing. Cross-compiles to Linux / macOS / Windows (amd64/arm64) for free.
- Drop-in for semble. Same
search/find_relatedMCP tool schemas and the same markdown-string wire format — swap thecommand:path and existing agents work unchanged. - Local, CPU-only. Embedding inference, BM25, and fusion all run on the CPU. No API keys, no GPU, no vector DB, air-gapped friendly.
One knob controls recall. The 82–91% figure in the token-budget tables is the BM25-only fallback ken runs in when no embedding model is installed. ken-mcp fetches the model automatically on first run (~60 MB, pure-Go, no Python — serves bm25 until it lands, then upgrades to the ~97% hybrid path;
KEN_MCP_AUTO_FETCH=0to disable). For the CLI, runken download-modelonce. Exhaustive enumeration (refactors, pre-rename audits) still belongs to grep; ken is for "find the chunk that answers this."
- docs/USERS.md — agent users. Install ken-mcp, point your agent at it, use the nine tools. 5-minute on-ramp.
- docs/DEVELOPERS.md — SDK authors and tuners. The
mcp.Runembedded-corpus library, prebuilt indices,fs.FSindexing, custom chunkers, tuning rerank, performance expectations. - docs/DESIGN.md + docs/internal/DECISIONS.md — algorithm spec + every architectural decision (ADRs).
- docs/BENCH.md — benchmark reproduction (NDCG, token-budget recall, the hybrid-vs-BM25 decomposition).
Install via a package manager:
# macOS / Linux (Homebrew) — installs both `ken` and `ken-mcp`:
brew install --cask townsendmerino/tap/ken# Windows (Scoop):
scoop bucket add townsendmerino https://github.com/townsendmerino/scoop-bucket
scoop install kenOr with Go:
# Install both binaries (Go 1.26+).
go install github.com/townsendmerino/ken/cmd/ken@latest
go install github.com/townsendmerino/ken/cmd/ken-mcp@latest
# Download the default Model2Vec model (~60 MB, one-time). Pure Go, no Python.
# (ken-mcp auto-fetches this on first run; the CLI needs it explicitly.)
# This is the single biggest retrieval-quality lever — it puts you on the ~97% path.
ken download-model
# Search any local repo from the CLI.
ken search /path/to/myrepo "save model to disk" --model ~/.ken/modelOr skip the model and use lexical-only mode (BM25-only costs ~14 pp recall@10 vs the hybrid default — see docs/BENCH.md):
ken search /path/to/myrepo "validateToken" --mode bm25Pre-built binaries for macOS, Linux, and Windows (amd64/arm64) are attached to each release — .tar.gz for macOS/Linux, .zip for Windows.
As of v0.3, ken index <path> defaults to watch mode — it stays alive and re-indexes on change (2 s debounce); --no-watch restores build-once-and-exit. ken-mcp always watches, so an agent editing the repo mid-session sees its own changes without a restart. ken also respects nested .gitignore files (per-directory, matching git).
ken-mcp speaks JSON-RPC over stdio and serves the same two core tools (search, find_related) semble does, with the same arg shapes and markdown output.
# Claude Code
claude mcp add ken -s user -- /absolute/path/to/ken-mcp# ~/.codex/config.toml
[mcp_servers.ken]
command = "/absolute/path/to/ken-mcp"// ~/.opencode/config.json
{ "mcp": { "ken": { "type": "local", "command": ["/absolute/path/to/ken-mcp"] } } }| Variable | Default | Purpose |
|---|---|---|
KEN_MCP_DEFAULT_REPO |
(unset) | Pre-indexed source; lets tools omit the repo arg. |
KEN_MCP_MODE |
hybrid |
bm25 / semantic / hybrid. Serves bm25 while the model is missing — fetched on first run by default (see KEN_MCP_AUTO_FETCH). |
KEN_MCP_MODEL_DIR |
~/.ken/model |
Path to a Model2Vec snapshot containing model.safetensors. Falls back to ~/.ken/model (where ken download-model writes) when unset. |
KEN_MCP_AUTO_FETCH |
1 |
On first run with a model-needing mode and no model present, fetch potion-code-16M (~60 MB) in the background, serving bm25 until it lands then upgrading to hybrid. 0 disables (serve bm25, warn). |
KEN_MCP_CHUNKER |
regex |
regex / treesitter / line / markdown. See Choosing a chunker. |
KEN_MCP_CACHE_SIZE |
16 |
LRU bound on the repo→Index cache. |
KEN_MCP_LOG_LEVEL |
warn |
debug / info / warn / error. All logs go to stderr; stdout is the JSON-RPC channel (details). |
KEN_ALLOW_PRIVATE_CLONE_TARGETS |
0 |
Off by default: for http(s) repo URLs, ken rejects loopback / link-local / RFC1918 addresses (SSRF guard). Set 1 to allow internal git hosts. |
The full env reference — including the KEN_DB_* database variables — is in docs/USERS.md and docs/db-indexing.md. For agents that should route between ken and grep deliberately (rather than ken's default "prefer ken" instruction), see the routing snippet in docs/USERS.md.
Both core tools return a formatted markdown string identical to semble's _format_results output. (ken-mcp also exposes seven structural tools — definition, references, callers, outline, symbols, recently_changed, status — plus reindex_db when a database is configured; see docs/USERS.md.)
| Arg | Type | Required | Default | Description |
|---|---|---|---|---|
query |
string | ✓ | — | Natural language or code query. |
repo |
string | — | https:// / http:// URL or local directory. Required if no KEN_MCP_DEFAULT_REPO. |
|
mode |
hybrid|semantic|bm25 |
hybrid |
Search mode. | |
top_k |
int | 5 |
Number of results. |
| Arg | Type | Required | Default | Description |
|---|---|---|---|---|
file_path |
string | ✓ | — | Path as it appears in a search result. |
line |
int (1-indexed) | ✓ | — | A line inside the chunk to seed the similarity search. |
repo |
string | — | Same as for search. |
|
top_k |
int | 5 |
Number of similar chunks. |
ken's hybrid retrieval is calibrated for source code (Python / Go / TypeScript / Java / Rust have language-aware chunking; others fall back to the line chunker) and documentation (markdown chunked on heading boundaries, code blocks/tables kept atomic, frontmatter handled). Mixed code-and-docs corpora route per file by extension.
It also indexes database schemas alongside code — static .sql files (with migration-history folding) and live introspection of Postgres / SQLite / MySQL / MariaDB — so an agent answering "how do users get authenticated" gets the Go function, the SQL it runs, the users table definition, and the FK relationships in one ranked list. Full reference (Tier-1/Tier-2, row sampling, LISTEN/NOTIFY, the reindex_db tool, PII stance, all KEN_DB_* vars): docs/db-indexing.md.
For plain prose with no code or structured docs, BM25 mode (--mode=bm25) carries the load; the semantic model is code-trained and unvalidated on literary text.
gitignore-respecting walk
→ regex chunker (Python / Go / TS / Java / Rust) with line-chunker fallback
→ BM25 (Lucene variant, k1=1.5, b=0.75) + Model2Vec semantic (cosine over a dense matrix)
→ α-weighted RRF fusion (α auto-detected: 0.3 for symbol queries, 0.5 for NL)
→ file-coherence boost + query-type boosts (definition / embedded-symbol / stem-match)
→ path penalties (test files, compat / legacy, `.d.ts`) + file-saturation decay
→ top-k
The retrieval algorithm is a verbatim port of semble's search.py + ranking/*.py; see docs/DESIGN.md §7 for every constant and pipeline-order subtlety, and §4 for the Model2Vec inference contract (three-tensor safetensors, the mapping[] indirection, the float64 precision that's load-bearing for cosine parity).
| Property | semble | ken |
|---|---|---|
| Language / distribution | Python · uvx / pip |
Go · single static binary |
| Cold start | ~500 ms (interpreter + numpy + model) | ~10–20 ms ken search over a tiny index |
| Retrieval algorithm | reference implementation | verbatim port (constants + pipeline order from search.py + ranking/*.py) |
| NDCG@10 on semble's benchmark | 0.854 | 0.842 hybrid (gap 0.012, full 63 repos × 1,251 queries) |
| Recall@10 on agent queries | (not measured) | ~0.97 hybrid (0.967 NL / 0.995 symbol); BM25-only fallback ~0.84 |
| Tokens to recall@10 | (not measured) | ~46× fewer than grep+Read on NL queries (4,120 vs 189,773 median, hybrid) |
| MCP server | yes | yes — drop-in (same schemas + wire format) |
| Binary size | n/a | release (slim) ken ~22 MB · ken-mcp ~38 MB |
Requires huggingface-cli |
yes | no — ken download-model fetches direct from HF |
Full methodology, the per-ablation breakdown (semantic-raw matches semble within 0.003, validating the embedding + tokenizer + ANN port), the CoIR-CSN-Python external anchor, and every footnote are in docs/BENCH.md.
The default regex chunker handles most cases well. The opt-in treesitter chunker (--chunker=treesitter / KEN_MCP_CHUNKER=treesitter, pure-Go gotreesitter) measurably wins for Kotlin, Zig, TypeScript, Java, PHP and loses on Python, C, Rust, Lua, Scala — net Δ −0.004 NDCG overall (within noise), so it stays opt-in. The full per-language recommendation table is in docs/BENCH.md; the default-stays-regex rationale is ADR-011.
The mcp.Run library lets you bake a //go:embed corpus + the Model2Vec model into one static MCP server binary — no backend, no vector DB, no per-query network egress, version-pinned by build artifact. ~20 lines of main.go, go build, push to a GitHub release; users brew install and add one line to their agent config. The walker and indexer take any fs.FS (embed.FS, fstest.MapFS, tarball-backed), which also gives agent sandboxing by construction.
Full guide — the canonical pattern, prebuilt indices for fast cold start, the binary-size contract, and the opt-in mcp/db package — is in docs/DEVELOPERS.md.
Live demos (downloadable mcp.Run binaries over real codebases, with audit transcripts): demos/v0.1.0 release — Kubernetes v1.31.0 (59,795 chunks) and PostgreSQL 17.0 (64,506 chunks). Writeup: I shipped two downloadable code search binaries. The audit caught two bugs..
The risk register with explicit triggers is in docs/DESIGN.md §10; the living 1.0-readiness tracker is docs/internal/road-to-1.0.md. Retrieval is treated as closed for 1.0 (the relevance curve is flat); remaining work is polish + onboarding (getting fresh installs onto the hybrid path) + distribution.
ken is a port. The retrieval algorithm is verbatim from MinishLab/semble (Python); the Go implementation was written by Claude under fixed constraints: pure Go / no cgo, algorithm constants ported verbatim and never tuned, original source wins whenever Claude's reconstruction diverges from semble's live code. That last rule caught five material errors during the rerank-pipeline port — each a confident-sounding hallucination that was wrong when checked against the Python source. The discipline of always checking, the verbatim-port rule, and the 11k-input tokenizer parity harness (which surfaced three bugs an 18-case spot-check missed) are human-supplied. Every architectural decision is recorded in docs/internal/DECISIONS.md.
ken stands on MinishLab's shoulders — the retrieval algorithm, the model, the whole embedding-table approach are theirs.
- semble — the original Python implementation. © Thomas van Dongen, MIT.
- model2vec — the static-embedding library whose three-tensor format ken implements. © Thomas van Dongen, MIT.
- potion-code-16M — model weights, distilled from
nomic-ai/CodeRankEmbed(MIT), itself fromSnowflake/snowflake-arctic-embed-m-long(Apache-2.0). © Minish Lab. Redistributed perNOTICE.
ken is MIT-licensed. It bundles attribution for the redistributed model weights in NOTICE and a generated dependency-license list in THIRD_PARTY_LICENSES.md; every link in the provenance chain is permissive (MIT, Apache-2.0, MPL-2.0). See docs/DESIGN.md §6.
For contributors: CLAUDE.md has the build/test/formatting conventions and the project's invariants (precision contract, stdout/stderr contract).