GusCayresMindsight · May 15, 2026
diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
@@ -0,0 +1,26 @@
+name: Tests
+
+on:
+  push:
+    branches: [master]
+  pull_request:
+
+jobs:
+  unit-tests:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+          cache: 'npm'
+
+      - name: Install dependencies
+        run: npm ci --ignore-scripts
+
+      - name: Rebuild native add-on
+        run: npm rebuild better-sqlite3
+
+      - name: Run unit tests
+        run: npx cucumber-js
diff --git a/AGENTS.md b/AGENTS.md
@@ -0,0 +1,97 @@
+# OpenCode Workspace — AGENTS.md
+
+## What this repo is
+
+A tmux workspace manager + **MCP tool-retrieval layer** for OpenCode. Tool retrieval operates in three modes:
+
+1. **One-shot** — before each `opencode run` session, the prompt is embedded, the corpus is searched, and deny-rules are injected into a temp config.
+2. **TUI first-message hook** — an OpenCode plugin (`lib/tool-retrieval.plugin.js`, installed to `~/.config/opencode/plugins/ow-tool-retrieval.js`) fires on the user's first TUI message, runs retrieval, and injects the results as system context via `client.session.prompt({ noReply: true })`.
+3. **On-demand MCP tool** — the `tool-retrieval` MCP server (launched via `opencode-workspace mcp-serve`) exposes a `search_tools(query, k?)` tool. The agent calls this proactively whenever it believes it needs additional or different MCP capabilities.
+
+Plain Node.js (CommonJS, no TypeScript, no build step). Requires Node ≥ 18.
+
+---
+
+## Developer commands
+
+```bash
+make install      # npm install -g .
+make test         # npx cucumber-js (Cucumber suite; the same command CI runs)
+make smoke        # node bin/cli.js index && node bin/smoke.js  (real validation)
+make update       # bumps package.json "opencode.version" from GitHub API — does NOT run npm install
+```
+
+**One-shot usage:**
+```bash
+opencode-workspace index             # incremental; builds corpus before first one-shot
+opencode-workspace index --force     # re-embeds every tool regardless of cache
+opencode-workspace "find open PRs"   # retrieval → temp config → opencode run
+OPENCODE_WORKSPACE_RETRIEVAL=off opencode-workspace "any prompt"  # bypass retrieval
+opencode-workspace stats --last 10
+opencode-workspace mcp env GITHUB_TOKEN  # store MCP credential
+```
+
+**Standalone retrieval (new):**
+```bash
+opencode-workspace retrieve "list GitHub pull requests"   # human-readable top-K
+opencode-workspace retrieve --json "run browser tests"    # JSON array output
+opencode-workspace retrieve --k 5 "query a database"      # override top-K count
+```
+
+**Fresh-install order (matters):**
+1. `npm install -g .`
+2. `opencode-workspace install` — installs uv, glab, opencode 1.15.0, semgrep
+3. `opencode-workspace mcp env NOTION_TOKEN` / `GITHUB_TOKEN` (if needed)
+4. `opencode-workspace index` — corpus must exist before any one-shot
+5. `make smoke` — asserts GitHub PR query returns a GitHub tool as top-1
+
+---
+
+## Architecture — what to know before editing
+
+**Indexing** (`src/cmd/index.js`): reads `lib/opencode.json.template`, spawns each MCP server (max 4 parallel, 15 s timeout), calls `listTools()`, hashes `description+inputSchema` to skip unchanged tools, embeds `"<server> / <tool_name>: <description>"`, stores in SQLite.
+
+**One-shot** (`src/cmd/oneshot.js`): embeds prompt → cosine-searches corpus → collects unique server names from top-K → reads `~/.config/opencode/opencode.json` for existing user permissions → writes merged temp config to `/tmp/ow-<uuid>.json` with deny-rules for every server NOT in top-K → `OPENCODE_CONFIG=/tmp/ow-<uuid>.json opencode run "..."` → deletes temp file.
+
+**TUI first-message hook** (`lib/tool-retrieval.plugin.js`): an OpenCode plugin installed to `~/.config/opencode/plugins/ow-tool-retrieval.js` by `opencode-workspace install`. Subscribes to the `message.updated` event. On the first user message per session, it calls `opencode-workspace retrieve --json "<text>"` as a subprocess, then calls `client.session.prompt({ noReply: true, … })` to inject the retrieved tool list as system context before the LLM responds. Soft-fails silently on any error so normal operation is never interrupted.
+
+**On-demand retrieval tool** (`src/mcp/tool-retrieval-server.js`): an MCP stdio server launched as `opencode-workspace mcp-serve`. Always present in the template config and never denied by permission rules, because it is listed in `ALWAYS_ALLOWED` in `src/retrieval/permissions.js`. Exposes `search_tools(query, k?)`, which the agent calls proactively when it suspects it needs a tool it does not currently know about.
+
+**Standalone retrieval** (`src/cmd/retrieve.js`): `opencode-workspace retrieve [--json] [--k N] "<query>"`. Used by the plugin subprocess and directly by users or scripts.
+
+**`lib/opencode.json.template`** is the single source of truth for which MCP servers exist. Editing it affects both indexing and retrieval.
+
+---
+
+## Runtime file locations
+
+| Path | Purpose |
+|---|---|
+| `~/.config/opencode-workspace/config.json` | User config; auto-created with defaults if absent |
+| `~/.config/opencode-workspace/tools.db` | SQLite corpus (265 tools when fully indexed) |
+| `~/.config/opencode-workspace/sessions.jsonl` | Per-session telemetry; may not exist until first one-shot |
+| `~/.local/share/opencode/mcp.env` | MCP secrets (`KEY=value`, one per line) |
+| `~/.config/opencode/opencode.json` | Global OpenCode config — read by this tool for permission merging |
+| `~/.config/opencode/plugins/ow-tool-retrieval.js` | TUI first-message hook plugin; installed by `opencode-workspace install` |
+| `/tmp/ow-<uuid>.json` | Temp per-session config; deleted after opencode exits |
+| `~/.cache/huggingface/` | ONNX model cache (~23 MB, auto-downloaded on first use) |
+
+---
+
+## Gotchas
+
+- **`make test` runs the Cucumber suite**: `make test` and CI both run `npx cucumber-js`. `make smoke` is the end-to-end validation; it requires a live indexed corpus.
+- **`sqlite-vec` is optional**: absent → transparent fallback to brute-force in-process cosine search. Performance difference only.
+- **`bun:sqlite` first, then `better-sqlite3`**: `db.js` tries `bun:sqlite` first; when that import throws (for example under plain Node), the error is caught and `better-sqlite3` is used. Do not remove the fallback.
+- **Embedding text format must stay consistent**: `"<server> / <tool_name>: <description>"` — index and search must use the same string and same model. Mixing models silently produces wrong results.
+- **Permissions are deny-only, server-level**: if any tool from a server is in top-K, all tools on that server stay accessible. User rules from `~/.config/opencode/opencode.json` are never overridden.
+- **Permission key format**: `mcp_<server_name>_*`. Underscores delimit the prefix and suffix only; hyphens inside the server name are kept, so server `brave-search-mcp-server` becomes `mcp_brave-search-mcp-server_*`.
+- **All retrieval messages go to `stderr`**; opencode stdout is untouched.
+- **`postinstall` runs `cmdInstall`**: `npm install` triggers dependency installation; each step fails with a warning rather than aborting.
+- **`workspaces/`** at repo root is gitignored; treat it as external, since it is not part of this package.
+- **PATH**: `cli.js` prepends `~/.local/bin` and `~/.opencode/bin` on every run; tools installed there are always found.
+- **`make update`** only edits `package.json`; does not reinstall. Run `npm install -g .` manually after if you want the new binary version.
+- **`docs/*.feature`** are documentation only — no step implementations exist.
+- **`ALWAYS_ALLOWED` in `src/retrieval/permissions.js`**: servers listed here are never denied by the one-shot permission generator. Currently contains `tool-retrieval` so the on-demand search_tools MCP tool is always callable.
+- **Plugin is global**: `ow-tool-retrieval.js` is installed into `~/.config/opencode/plugins/` (the OpenCode global plugin directory), not `~/.config/opencode-workspace/`. It fires for all opencode sessions, but soft-fails if the corpus is absent.
+- **Plugin uses ES module syntax** (`export const`): OpenCode plugins are loaded by Bun (which supports ESM). The rest of this codebase uses CommonJS — do not mix them in the same file.
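Review note on the permission gotchas in AGENTS.md: the deny-only, server-level behaviour can be illustrated with a minimal sketch. This is not the code in `src/retrieval/permissions.js`; the function name `buildPermissions` and its argument shapes are hypothetical, and only the key format, the `ALWAYS_ALLOWED` idea, and the "user rules are never overridden" rule come from the document.

```javascript
'use strict';

// Assumption: plays the role that ALWAYS_ALLOWED has in src/retrieval/permissions.js.
const ALWAYS_ALLOWED = new Set(['tool-retrieval']);

/**
 * Build a permission map that denies every server absent from the top-K hits.
 * allServers:      server names from the template (hypothetical input shape)
 * hits:            retrieval results, each carrying a `server` field
 * userPermissions: existing "permission" object from the user's opencode.json
 */
function buildPermissions(allServers, hits, userPermissions = {}) {
  const keep = new Set(hits.map((h) => h.server));
  const generated = {};
  for (const server of allServers) {
    if (keep.has(server) || ALWAYS_ALLOWED.has(server)) continue;
    // Hyphens in the server name are kept; only prefix and suffix use underscores.
    generated[`mcp_${server}_*`] = 'deny';
  }
  // User rules are spread last, so a generated rule can never replace one.
  return { ...generated, ...userPermissions };
}

const permissions = buildPermissions(
  ['github', 'aws-knowledge', 'brave-search-mcp-server', 'tool-retrieval'],
  [{ server: 'github', tool: 'list_pull_requests', score: 0.923 }],
  { 'mcp_github_*': 'deny' } // a server the user already denied stays denied
);
console.log(JSON.stringify(permissions, null, 2));
```

Because only `deny` entries are ever generated, merging cannot re-enable anything: in the example, `github` is in the top-K but remains denied because the user's own rule is carried over unchanged.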
diff --git a/Makefile b/Makefile
@@ -1,19 +1,24 @@
 # Usage:
 #   make install   — install the package globally from this local repo
-#   make test      — run a quick smoke test of the CLI
+#   make test      — run the Cucumber test suite (npx cucumber-js)
+#   make smoke     — end-to-end: index all MCP servers, assert top retrieval result
 #   make update    — update pinned dependency versions to their latest releases
 
-.PHONY: install test update
+.PHONY: install test smoke update
+
 
 install:
 	npm install -g .
 
 test:
-	@echo "--- help ---"
-	opencode-workspace --help
-	@echo "--- unknown command exits non-zero ---"
-	! opencode-workspace bogus >/dev/null 2>&1
-	@echo "All checks passed."
+	npx cucumber-js
+
+smoke:
+	@echo "=== Step 1: index MCP tool corpus ==="
+	node bin/cli.js index
+	@echo ""
+	@echo "=== Step 2: retrieval assertion ==="
+	node bin/smoke.js
 
 update:
 	@node -e " \

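Review note on the `smoke` target: its first step runs `index`, which the docs in this PR describe as incremental, skipping tools whose `description + inputSchema` hash is unchanged. A rough sketch of that check follows, assuming an in-memory `Map` as the cache (the project stores this state in SQLite) and hypothetical helper names `toolHash` and `needsEmbedding`.

```javascript
'use strict';
const crypto = require('crypto');

// sha256(description + JSON.stringify(inputSchema)), the formula given in the README.
function toolHash(tool) {
  return crypto
    .createHash('sha256')
    .update(tool.description + JSON.stringify(tool.inputSchema))
    .digest('hex');
}

// `cache` is a hypothetical Map of "<server>/<name>" -> last stored hash.
function needsEmbedding(cache, server, tool) {
  const key = `${server}/${tool.name}`;
  const hash = toolHash(tool);
  if (cache.get(key) === hash) return false; // unchanged: skip re-embedding
  cache.set(key, hash);
  return true;
}

const cache = new Map();
const tool = {
  name: 'list_pull_requests',
  description: 'List pull requests',
  inputSchema: { type: 'object' },
};
const firstRun = needsEmbedding(cache, 'github', tool);
const secondRun = needsEmbedding(cache, 'github', tool);
const afterEdit = needsEmbedding(cache, 'github', {
  ...tool,
  description: 'List open pull requests',
});
console.log({ firstRun, secondRun, afterEdit });
```

One caveat with this formula: `JSON.stringify` is sensitive to key order, so a server that returns the same schema with reordered keys would trigger a re-embed. That costs time but does not affect correctness.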
diff --git a/README.md b/README.md
@@ -3,6 +3,11 @@
 Launches [OpenCode](https://opencode.ai) AI agents in a tmux split-pane layout, from any directory.
 Auto-creates a tmux session if you're not already in one.
 
+Includes a **tool-retrieval layer**: before each one-shot session the user's
+prompt is embedded and used to cosine-search the full MCP tool corpus.
+Only the servers that own the top-K matching tools are exposed to the LLM,
+cutting context overhead from 10+ servers down to those few.
+
 ## Install
 
 ```bash
@@ -13,32 +18,157 @@ npm install -g @gus/opencode-workspace
 ## Setup (first time)
 
 ```bash
-# Add your API keys via the mcp env command (stored securely)
+# 1. Store API keys
 opencode-workspace mcp env NOTION_TOKEN
 opencode-workspace mcp env GITHUB_TOKEN
+opencode-workspace mcp env BRAVE_API_KEY   # optional
+
+# 2. Build the tool corpus (connect to every MCP server and embed their tools)
+opencode-workspace index
 ```
 
+`index` is incremental — re-run it whenever you add or update an MCP server.
+Each tool is only re-embedded when its description or input schema changes.
+
 ## Usage
 
 ```bash
-opencode-workspace                  # launch OpenCode agent (default, auto-creates tmux)
-opencode-workspace agent            # same as above
-opencode-workspace term             # split pane to the right, plain terminal
+# TUI mode (no one-shot retrieval; opens interactive agent in a tmux split)
+opencode-workspace
+opencode-workspace agent
+
+# One-shot mode (retrieves tools, then runs opencode non-interactively)
+opencode-workspace "find open PRs assigned to me and draft a summary"
+opencode-workspace "run the test suite and report any failures"
+
+# Disable retrieval entirely for a single session (A/B baseline)
+OPENCODE_WORKSPACE_RETRIEVAL=off opencode-workspace "your prompt"
+
+# Inspect what tools were retrieved in past sessions
+opencode-workspace stats
+opencode-workspace stats --last 10
 ```
 
 ## Commands
 
 | Command | Description |
 |---|---|
-| `opencode-workspace` (default) | Launch the OpenCode agent. Auto-creates a tmux session if needed. |
+| `opencode-workspace` | Launch TUI agent. Auto-creates tmux session if needed. |
+| `opencode-workspace "<prompt>"` | One-shot: embed prompt → retrieve top-K tools → run `opencode run`. |
+| `opencode-workspace index` | Index all MCP servers. Incremental; only re-embeds changed tools. |
+| `opencode-workspace index --force` | Force re-embed of all tools regardless of schema cache. |
+| `opencode-workspace stats` | Summarise retrieval history from `~/.config/opencode-workspace/sessions.jsonl`. |
+| `opencode-workspace stats --last N` | Limit to last N sessions. |
 | `opencode-workspace install` | Install dependencies: uv, glab, opencode, semgrep. |
-| `opencode-workspace agent` | Split a pane to the right in the current directory and run opencode. |
-| `opencode-workspace term` | Split a pane to the right as a plain terminal. |
-| `opencode-workspace mcp env VAR_NAME` | Prompt for a secret and store it in `~/.local/share/opencode/mcp.env`. |
+| `opencode-workspace agent` | TUI alias (same as bare invocation, no one-shot retrieval). |
+| `opencode-workspace term` | Split a plain terminal pane. |
+| `opencode-workspace mcp env VAR` | Store a secret in `~/.local/share/opencode/mcp.env`. |
 
-## MCP servers included
+## Configuration
+
+`~/.config/opencode-workspace/config.json` (created automatically with defaults):
+
+```json
+{
+  "embedding": {
+    "provider": "local",
+    "model": "Xenova/all-MiniLM-L6-v2"
+  },
+  "retrieval": {
+    "k": 10,
+    "strategy": "topk"
+  }
+}
+```
+
+### Embedding providers
+
+| Provider | `"provider"` value | Notes |
+|---|---|---|
+| Local ONNX (default) | `"local"` | `Xenova/all-MiniLM-L6-v2`, ~23 MB downloaded on first use to `~/.cache/huggingface`. No API key needed. |
+| OpenAI | `"openai"` | Set `OPENAI_API_KEY` or add `"apiKey"` to the config. Default model: `text-embedding-3-small`. |
+| Voyage | `"voyage"` | Not yet implemented. |
+| Cohere | `"cohere"` | Not yet implemented. |
 
-The bundled template configures these MCP servers out of the box:
+### Retrieval strategies
+
+| `"strategy"` | Status |
+|---|---|
+| `"topk"` | Implemented — cosine top-K over the full corpus. |
+| `"agent_first"` | Placeholder (not implemented). |
+| `"graph"` | Placeholder (not implemented). |
+| `"active"` | Placeholder (not implemented). |
+
+### Kill switch
+
+```bash
+OPENCODE_WORKSPACE_RETRIEVAL=off opencode-workspace "prompt"
+```
+
+Bypasses all retrieval and permission filtering. Behaviour is identical to
+running `opencode run "prompt"` directly. Use this as the A/B baseline.
+
+## Inspecting what was retrieved
+
+```bash
+# Plain text summary
+opencode-workspace stats
+
+# Raw JSONL (one record per session)
+jq . ~/.config/opencode-workspace/sessions.jsonl
+```
+
+Each record:
+
+```json
+{
+  "ts": "2026-05-15T12:00:00.000Z",
+  "session_id": "uuid",
+  "prompt": "find open PRs...",
+  "retrieved_tools": [
+    { "server": "github", "tool": "list_pull_requests", "score": 0.923 }
+  ],
+  "corpus_size": 84,
+  "embedding_model": "Xenova/all-MiniLM-L6-v2",
+  "k": 10
+}
+```
+
+## Smoke test
+
+Verifies that `index` + retrieval are working end-to-end:
+
+```bash
+make smoke
+```
+
+This runs `opencode-workspace index`, then asserts that querying
+`"list open pull requests on GitHub"` returns a GitHub tool as the top result.
+
+## How it works
+
+1. **`index`** — connects to every MCP server in `lib/opencode.json.template`
+   (using `@modelcontextprotocol/sdk`), calls `listTools()`, and stores
+   `{server, name, description, inputSchema}` plus a 384-dim embedding of
+   `"{server} / {tool}: {description}"` in a SQLite DB at
+   `~/.config/opencode-workspace/tools.db`.
+   A tool is not re-embedded when `sha256(description + JSON.stringify(schema))`
+   is unchanged, which keeps re-runs fast.
+
+2. **One-shot** — the prompt is embedded with the same model, cosine-searched
+   against the corpus (via `sqlite-vec` if installed, otherwise in-process
+   brute-force), and the top-K tools are identified.
+   A temporary config is written to `/tmp/ow-<uuid>.json` that extends the
+   workspace template with `"permission": { "mcp_<server>_*": "deny" }` for
+   every server absent from the top-K results.
+   `opencode run "<prompt>"` is then spawned with `OPENCODE_CONFIG` pointing
+   at that temp file. The file is deleted when opencode exits.
+
+3. **Compose, never overwrite** — only deny rules are generated; user-defined
+   permission entries in `~/.config/opencode/opencode.json` are preserved and
+   merged. A server the user has already denied cannot be re-enabled.
+
+## MCP servers included
 
 | Server | Description |
 |---|---|
@@ -50,10 +180,30 @@ The bundled template configures these MCP servers out of the box:
 | `aws-knowledge` | AWS docs & regional availability (remote) |
 | `sequential-thinking` | Structured reasoning via `@modelcontextprotocol/server-sequential-thinking` |
 | `github` | GitHub API via `@modelcontextprotocol/server-github` (requires `GITHUB_TOKEN`) |
+| `brave-search-mcp-server` | Web search via Brave (requires `BRAVE_API_KEY`) |
 
 ## Prerequisites
 
 - `tmux`
 - `git`
 - `curl`
 - Node.js >= 18
+
+## References
+
+This implementation is based on the following work:
+
+> Lumer, E., Nizar, F., Gulati, A., Honaganahalli Basavaraju, P., & Subbiah, V. K. (2025). *Tool-to-Agent Retrieval: Bridging Tools and Agents for Scalable LLM Multi-Agent Systems.* arXiv:2511.01854. https://arxiv.org/abs/2511.01854
+
+```bibtex
+@misc{lumer2025tooltoagent,
+  title         = {Tool-to-Agent Retrieval: Bridging Tools and Agents for Scalable LLM Multi-Agent Systems},
+  author        = {Lumer, Elias and Nizar, Faheem and Gulati, Anmol and Honaganahalli Basavaraju, Pradeep and Subbiah, Vamse Kumar},
+  year          = {2025},
+  eprint        = {2511.01854},
+  archivePrefix = {arXiv},
+  primaryClass  = {cs.CL},
+  url           = {https://arxiv.org/abs/2511.01854}
+}
+```
+
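Review note on the README's "How it works" section: the brute-force fallback used when `sqlite-vec` is absent amounts to scoring every corpus row against the query and keeping the best K. The sketch below is an illustration under stated assumptions, not the project's search code. The names `cosine` and `topK` are hypothetical, the tool names `web_search` and `think` are invented for the example, and toy 3-dimensional vectors stand in for the 384-dimensional embeddings.

```javascript
'use strict';

// Cosine similarity of two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Score every row, sort descending by score, keep the first k.
function topK(queryVec, corpus, k = 10) {
  return corpus
    .map((row) => ({
      server: row.server,
      tool: row.tool,
      score: cosine(queryVec, row.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const corpus = [
  { server: 'github', tool: 'list_pull_requests', embedding: [0.9, 0.1, 0.0] },
  { server: 'brave-search-mcp-server', tool: 'web_search', embedding: [0.1, 0.9, 0.1] },
  { server: 'sequential-thinking', tool: 'think', embedding: [0.0, 0.2, 0.9] },
];

const hits = topK([1, 0, 0], corpus, 2);
// Unique server names among the hits; every other server would receive a deny rule.
const servers = [...new Set(hits.map((h) => h.server))];
console.log(hits, servers);
```

A full scan is linear in corpus size, which is reasonable for a corpus of a few hundred tools; by the document's own account, `sqlite-vec` changes only the performance, not the results.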