ken

Fast hybrid code search for agents. Pure Go, single static binary, drop-in MCP-compatible with MinishLab/semble — same tool schemas, same output format, install steps swapped to a Go binary.

ken is a Go port of semble: BM25 lexical + Model2Vec semantic embeddings + RRF fusion + a code-aware reranker, with the retrieval algorithm ported verbatim from semble's search.py + ranking/*.py.

Why ken

~97% recall@10 in the default (hybrid) mode — 0.967 NL / 0.995 symbol on semble's 1,251-query benchmark, vs grep's ~99.9% — while costing an agent ~46× fewer tokens than grep + Read (4,120 vs 189,773 median tokens on NL queries — measured in the same default hybrid mode). For "find the chunk that answers this," that's a 1–2 order-of-magnitude token win at near-parity recall. (Reproduce: docs/BENCH.md.)
Single static binary. Pure Go, no cgo, no Python interpreter on cold start, no GIL on indexing. Cross-compiles to Linux / macOS / Windows (amd64/arm64) for free.
Drop-in for semble. Same search / find_related MCP tool schemas and the same markdown-string wire format — swap the command: path and existing agents work unchanged.
Local, CPU-only. Embedding inference, BM25, and fusion all run on the CPU. No API keys, no GPU, no vector DB, air-gapped friendly.

One knob controls recall. The 82–91% figure in the token-budget tables is the BM25-only fallback ken runs in when no embedding model is installed. ken-mcp fetches the model automatically on first run (~60 MB, pure-Go, no Python — serves bm25 until it lands, then upgrades to the ~97% hybrid path; KEN_MCP_AUTO_FETCH=0 to disable). For the CLI, run ken download-model once. Exhaustive enumeration (refactors, pre-rename audits) still belongs to grep; ken is for "find the chunk that answers this."

Where to start

docs/USERS.md — agent users. Install ken-mcp, point your agent at it, use the nine tools. 5-minute on-ramp.
docs/DEVELOPERS.md — SDK authors and tuners. The mcp.Run embedded-corpus library, prebuilt indices, fs.FS indexing, custom chunkers, tuning rerank, performance expectations.
docs/DESIGN.md + docs/internal/DECISIONS.md — algorithm spec + every architectural decision (ADRs).
docs/BENCH.md — benchmark reproduction (NDCG, token-budget recall, the hybrid-vs-BM25 decomposition).

Quickstart

Install via a package manager:

# macOS / Linux (Homebrew) — installs both `ken` and `ken-mcp`:
brew install --cask townsendmerino/tap/ken

# Windows (Scoop):
scoop bucket add townsendmerino https://github.com/townsendmerino/scoop-bucket
scoop install ken

Or with Go:

# Install both binaries (Go 1.26+).
go install github.com/townsendmerino/ken/cmd/ken@latest
go install github.com/townsendmerino/ken/cmd/ken-mcp@latest

# Download the default Model2Vec model (~60 MB, one-time). Pure Go, no Python.
# (ken-mcp auto-fetches this on first run; the CLI needs it explicitly.)
# This is the single biggest retrieval-quality lever — it puts you on the ~97% path.
ken download-model

# Search any local repo from the CLI.
ken search /path/to/myrepo "save model to disk" --model ~/.ken/model

Or skip the model and use lexical-only mode (BM25-only costs ~14 pp recall@10 vs the hybrid default — see docs/BENCH.md):

ken search /path/to/myrepo "validateToken" --mode bm25

Pre-built binaries for macOS, Linux, and Windows (amd64/arm64) are attached to each release — .tar.gz for macOS/Linux, .zip for Windows.

As of v0.3, ken index <path> defaults to watch mode — it stays alive and re-indexes on change (2 s debounce); --no-watch restores build-once-and-exit. ken-mcp always watches, so an agent editing the repo mid-session sees its own changes without a restart. ken also respects nested .gitignore files (per-directory, matching git).

Install as an MCP server

ken-mcp speaks JSON-RPC over stdio and serves the same two core tools (search, find_related) semble does, with the same arg shapes and markdown output.

# Claude Code
claude mcp add ken -s user -- /absolute/path/to/ken-mcp

// ~/.cursor/mcp.json  (or .cursor/mcp.json) — also .vscode/mcp.json with "servers"
{ "mcpServers": { "ken": { "command": "/absolute/path/to/ken-mcp" } } }

# ~/.codex/config.toml
[mcp_servers.ken]
command = "/absolute/path/to/ken-mcp"

// ~/.opencode/config.json
{ "mcp": { "ken": { "type": "local", "command": ["/absolute/path/to/ken-mcp"] } } }

Core environment variables

Variable	Default	Purpose
`KEN_MCP_DEFAULT_REPO`	(unset)	Pre-indexed source; lets tools omit the `repo` arg.
`KEN_MCP_MODE`	`hybrid`	`bm25` / `semantic` / `hybrid`. Serves `bm25` while the model is missing — fetched on first run by default (see `KEN_MCP_AUTO_FETCH`).
`KEN_MCP_MODEL_DIR`	`~/.ken/model`	Path to a Model2Vec snapshot containing `model.safetensors`. Falls back to `~/.ken/model` (where `ken download-model` writes) when unset.
`KEN_MCP_AUTO_FETCH`	`1`	On first run with a model-needing mode and no model present, fetch `potion-code-16M` (~60 MB) in the background, serving `bm25` until it lands then upgrading to hybrid. `0` disables (serve bm25, warn).
`KEN_MCP_CHUNKER`	`regex`	`regex` / `treesitter` / `line` / `markdown`. See Choosing a chunker.
`KEN_MCP_CACHE_SIZE`	`16`	LRU bound on the repo→Index cache.
`KEN_MCP_LOG_LEVEL`	`warn`	`debug` / `info` / `warn` / `error`. All logs go to stderr; stdout is the JSON-RPC channel (details).
`KEN_ALLOW_PRIVATE_CLONE_TARGETS`	`0`	Off by default: for `http(s)` `repo` URLs, ken rejects loopback / link-local / RFC1918 addresses (SSRF guard). Set `1` to allow internal git hosts.

The full env reference — including the KEN_DB_* database variables — is in docs/USERS.md and docs/db-indexing.md. For agents that should route between ken and grep deliberately (rather than ken's default "prefer ken" instruction), see the routing snippet in docs/USERS.md.

Tools

Both core tools return a formatted markdown string identical to semble's _format_results output. (ken-mcp also exposes seven structural tools — definition, references, callers, outline, symbols, recently_changed, status — plus reindex_db when a database is configured; see docs/USERS.md.)

`search`

Arg	Type	Required	Default	Description
`query`	string	✓	—	Natural language or code query.
`repo`	string		—	`https://` / `http://` URL or local directory. Required if no `KEN_MCP_DEFAULT_REPO`.
`mode`	`hybrid`\|`semantic`\|`bm25`		`hybrid`	Search mode.
`top_k`	int		`5`	Number of results.

`find_related`

Arg	Type	Required	Default	Description
`file_path`	string	✓	—	Path as it appears in a `search` result.
`line`	int (1-indexed)	✓	—	A line inside the chunk to seed the similarity search.
`repo`	string		—	Same as for `search`.
`top_k`	int		`5`	Number of similar chunks.

What ken indexes

ken's hybrid retrieval is calibrated for source code (Python / Go / TypeScript / Java / Rust have language-aware chunking; others fall back to the line chunker) and documentation (markdown chunked on heading boundaries, code blocks/tables kept atomic, frontmatter handled). Mixed code-and-docs corpora route per file by extension.

It also indexes database schemas alongside code — static .sql files (with migration-history folding) and live introspection of Postgres / SQLite / MySQL / MariaDB — so an agent answering "how do users get authenticated" gets the Go function, the SQL it runs, the users table definition, and the FK relationships in one ranked list. Full reference (Tier-1/Tier-2, row sampling, LISTEN/NOTIFY, the reindex_db tool, PII stance, all KEN_DB_* vars): docs/db-indexing.md.

For plain prose with no code or structured docs, BM25 mode (--mode=bm25) carries the load; the semantic model is code-trained and unvalidated on literary text.

How it works

gitignore-respecting walk
    → regex chunker (Python / Go / TS / Java / Rust) with line-chunker fallback
    → BM25 (Lucene variant, k1=1.5, b=0.75)  +  Model2Vec semantic (cosine over a dense matrix)
    → α-weighted RRF fusion (α auto-detected: 0.3 for symbol queries, 0.5 for NL)
    → file-coherence boost + query-type boosts (definition / embedded-symbol / stem-match)
    → path penalties (test files, compat / legacy, `.d.ts`) + file-saturation decay
    → top-k

The retrieval algorithm is a verbatim port of semble's search.py + ranking/*.py; see docs/DESIGN.md §7 for every constant and pipeline-order subtlety, and §4 for the Model2Vec inference contract (three-tensor safetensors, the mapping[] indirection, the float64 precision that's load-bearing for cosine parity).

Comparison to semble

Property	semble	ken
Language / distribution	Python · `uvx` / `pip`	Go · single static binary
Cold start	~500 ms (interpreter + numpy + model)	~10–20 ms `ken search` over a tiny index
Retrieval algorithm	reference implementation	verbatim port (constants + pipeline order from `search.py` + `ranking/*.py`)
NDCG@10 on semble's benchmark	0.854	0.842 hybrid (gap 0.012, full 63 repos × 1,251 queries)
Recall@10 on agent queries	(not measured)	~0.97 hybrid (0.967 NL / 0.995 symbol); BM25-only fallback ~0.84
Tokens to recall@10	(not measured)	~46× fewer than grep+Read on NL queries (4,120 vs 189,773 median, hybrid)
MCP server	yes	yes — drop-in (same schemas + wire format)
Binary size	n/a	release (slim) `ken` ~22 MB · `ken-mcp` ~38 MB
Requires `huggingface-cli`	yes	no — `ken download-model` fetches direct from HF

Full methodology, the per-ablation breakdown (semantic-raw matches semble within 0.003, validating the embedding + tokenizer + ANN port), the CoIR-CSN-Python external anchor, and every footnote are in docs/BENCH.md.

Choosing a chunker

The default regex chunker handles most cases well. The opt-in treesitter chunker (--chunker=treesitter / KEN_MCP_CHUNKER=treesitter, pure-Go gotreesitter) measurably wins for Kotlin, Zig, TypeScript, Java, PHP and loses on Python, C, Rust, Lua, Scala — net Δ −0.004 NDCG overall (within noise), so it stays opt-in. The full per-language recommendation table is in docs/BENCH.md; the default-stays-regex rationale is ADR-011.

For SDK authors: ship docs as a single binary

The mcp.Run library lets you bake a //go:embed corpus + the Model2Vec model into one static MCP server binary — no backend, no vector DB, no per-query network egress, version-pinned by build artifact. ~20 lines of main.go, go build, push to a GitHub release; users brew install and add one line to their agent config. The walker and indexer take any fs.FS (embed.FS, fstest.MapFS, tarball-backed), which also gives agent sandboxing by construction.

Full guide — the canonical pattern, prebuilt indices for fast cold start, the binary-size contract, and the opt-in mcp/db package — is in docs/DEVELOPERS.md.

Live demos (downloadable mcp.Run binaries over real codebases, with audit transcripts): demos/v0.1.0 release — Kubernetes v1.31.0 (59,795 chunks) and PostgreSQL 17.0 (64,506 chunks). Writeup: I shipped two downloadable code search binaries. The audit caught two bugs..

Roadmap

The risk register with explicit triggers is in docs/DESIGN.md §10; the living 1.0-readiness tracker is docs/internal/road-to-1.0.md. Retrieval is treated as closed for 1.0 (the relevance curve is flat); remaining work is polish + onboarding (getting fresh installs onto the hybrid path) + distribution.

How this was built

ken is a port. The retrieval algorithm is verbatim from MinishLab/semble (Python); the Go implementation was written by Claude under fixed constraints: pure Go / no cgo, algorithm constants ported verbatim and never tuned, original source wins whenever Claude's reconstruction diverges from semble's live code. That last rule caught five material errors during the rerank-pipeline port — each a confident-sounding hallucination that was wrong when checked against the Python source. The discipline of always checking, the verbatim-port rule, and the 11k-input tokenizer parity harness (which surfaced three bugs an 18-case spot-check missed) are human-supplied. Every architectural decision is recorded in docs/internal/DECISIONS.md.

Acknowledgments

ken stands on MinishLab's shoulders — the retrieval algorithm, the model, the whole embedding-table approach are theirs.

potion-code-16M — model weights, distilled from nomic-ai/CodeRankEmbed (MIT), itself from Snowflake/snowflake-arctic-embed-m-long (Apache-2.0). © Minish Lab. Redistributed per NOTICE.

License

ken is MIT-licensed. It bundles attribution for the redistributed model weights in NOTICE and a generated dependency-license list in THIRD_PARTY_LICENSES.md; every link in the provenance chain is permissive (MIT, Apache-2.0, MPL-2.0). See docs/DESIGN.md §6.

For contributors: CLAUDE.md has the build/test/formatting conventions and the project's invariants (precision contract, stdout/stderr contract).

Name		Name	Last commit message	Last commit date
Latest commit History 248 Commits
.claude		.claude
.github		.github
bench		bench
cmd		cmd
demos		demos
docs		docs
internal		internal
mcp		mcp
outputs		outputs
scripts		scripts
testdata		testdata
.gitignore		.gitignore
.golangci.yml		.golangci.yml
.goreleaser.yml		.goreleaser.yml
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
NOTICE		NOTICE
README.md		README.md
THIRD_PARTY_LICENSES.md		THIRD_PARTY_LICENSES.md
go.mod		go.mod
go.sum		go.sum

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ken

Why ken

Where to start

Quickstart

Install as an MCP server

Core environment variables

Tools

`search`

`find_related`

What ken indexes

How it works

Comparison to semble

Choosing a chunker

For SDK authors: ship docs as a single binary

Roadmap

How this was built

Acknowledgments

License

About

Uh oh!

Releases 22

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

ken

Why ken

Where to start

Quickstart

Install as an MCP server

Core environment variables

Tools

search

find_related

What ken indexes

How it works

Comparison to semble

Choosing a chunker

For SDK authors: ship docs as a single binary

Roadmap

How this was built

Acknowledgments

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 22

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`search`

`find_related`

Packages