This repository was archived by the owner on May 18, 2026. It is now read-only.
Merged
223 changes: 31 additions & 192 deletions README.md
@@ -1,209 +1,48 @@
# opencode-workspace

Launches [OpenCode](https://opencode.ai) AI agents in a tmux split-pane layout, from any directory.
Auto-creates a tmux session if you're not already in one.

Includes a **tool-retrieval layer**: before each one-shot session the user's
prompt is embedded and used to cosine-search the full MCP tool corpus.
Only the most relevant servers are exposed to the LLM, cutting context overhead
from 10+ servers down to the top-K matches.

## Install
Includes a **tool-retrieval layer**: before each one-shot session the prompt is embedded and
cosine-searched against the MCP tool corpus, cutting context from 10+ servers down to the top-K matches.

```bash
npm install -g @gus/opencode-workspace
# postinstall automatically sets up: uv, glab, opencode, semgrep
```

## Setup (first time)

```bash
# 1. Store API keys
opencode-workspace mcp env NOTION_TOKEN
opencode-workspace mcp env GITHUB_TOKEN
opencode-workspace mcp env BRAVE_API_KEY # optional

# 2. Build the tool corpus (connect to every MCP server and embed their tools)
opencode-workspace index
```

`index` is incremental — re-run it whenever you add or update an MCP server.
Each tool is only re-embedded when its description or input schema changes.

## Usage

```bash
# TUI mode (no retrieval — opens interactive agent in a tmux split)
opencode-workspace
opencode-workspace agent

# One-shot mode (retrieves tools, then runs opencode non-interactively)
opencode-workspace "find open PRs assigned to me and draft a summary"
opencode-workspace "run the test suite and report any failures"

# Disable retrieval entirely for a single session (A/B baseline)
OPENCODE_WORKSPACE_RETRIEVAL=off opencode-workspace "your prompt"

# Inspect what tools were retrieved in past sessions
opencode-workspace stats
opencode-workspace stats --last 10
opencode-workspace index # build tool corpus (first time)
opencode-workspace "find open PRs" # one-shot: retrieve tools + run opencode
opencode-workspace # TUI mode: interactive agent in tmux split
```

## Commands

| Command | Description |
|---|---|
| `opencode-workspace` | Launch TUI agent. Auto-creates tmux session if needed. |
| `opencode-workspace "<prompt>"` | One-shot: embed prompt → retrieve top-K tools → run `opencode run`. |
| `opencode-workspace index` | Index all MCP servers. Incremental; only re-embeds changed tools. |
| `opencode-workspace index --force` | Force re-embed of all tools regardless of schema cache. |
| `opencode-workspace stats` | Summarise retrieval history from `~/.config/opencode-workspace/sessions.jsonl`. |
| `opencode-workspace stats --last N` | Limit to last N sessions. |
| `opencode-workspace install` | Install dependencies: uv, glab, opencode, semgrep. |
| `opencode-workspace agent` | TUI alias (same as bare invocation, no retrieval). |
| `opencode-workspace term` | Split a plain terminal pane. |
| `opencode-workspace mcp env VAR` | Store a secret in `~/.local/share/opencode/mcp.env`. |
## Documentation

## Configuration

`~/.config/opencode-workspace/config.json` (created automatically with defaults):

```json
{
"embedding": {
"provider": "local",
"model": "Xenova/all-MiniLM-L6-v2"
},
"retrieval": {
"k": 10,
"strategy": "topk"
}
}
```
All behaviour is specified as Gherkin feature files in [`docs/`](docs/):

### Embedding providers

| Provider | `"provider"` value | Notes |
|---|---|---|
| Local ONNX (default) | `"local"` | `Xenova/all-MiniLM-L6-v2`, ~23 MB downloaded on first use to `~/.cache/huggingface`. No API key needed. |
| OpenAI | `"openai"` | Set `OPENAI_API_KEY` or add `"apiKey"` to the config. Default model: `text-embedding-3-small`. |
| Voyage | `"voyage"` | Not yet implemented. |
| Cohere | `"cohere"` | Not yet implemented. |

### Retrieval strategies

| `"strategy"` | Status |
| Feature file | What it covers |
|---|---|
| `"topk"` | Implemented — cosine top-K over the full corpus. |
| `"agent_first"` | Placeholder (not implemented). |
| `"graph"` | Placeholder (not implemented). |
| `"active"` | Placeholder (not implemented). |

### Kill switch

```bash
OPENCODE_WORKSPACE_RETRIEVAL=off opencode-workspace "prompt"
```

Bypasses all retrieval and permission filtering. Behaviour is identical to
running `opencode run "prompt"` directly. Use this as the A/B baseline.

## Inspecting what was retrieved
| [`docs/prerequisites.feature`](docs/prerequisites.feature) | Node ≥ 18, tmux, git, curl |
| [`docs/installation.feature`](docs/installation.feature) | `npm install`, postinstall, `opencode-workspace install` |
| [`docs/mcp-env.feature`](docs/mcp-env.feature) | `mcp env VAR` — storing secrets in `mcp.env` |
| [`docs/mcp-servers.feature`](docs/mcp-servers.feature) | The 10 bundled MCP servers and their configuration |
| [`docs/indexing.feature`](docs/indexing.feature) | `index` — crawling MCP servers and building the corpus |
| [`docs/configuration.feature`](docs/configuration.feature) | `config.json` — embedding providers and retrieval strategy |
| [`docs/retrieval.feature`](docs/retrieval.feature) | One-shot retrieval, kill switch, fallthrough behaviour |
| [`docs/permissions.feature`](docs/permissions.feature) | Deny-rule generation and composition with user config |
| [`docs/telemetry.feature`](docs/telemetry.feature) | Session records, `stats` command |
| [`docs/tui-commands.feature`](docs/tui-commands.feature) | TUI mode: `agent`, `term`, tmux layout |
| [`docs/tool-retrieval-mcp.feature`](docs/tool-retrieval-mcp.feature) | On-demand `search_tools` MCP tool |
| [`docs/tui-retrieval.feature`](docs/tui-retrieval.feature) | TUI first-message hook plugin |
| [`docs/smoke-test.feature`](docs/smoke-test.feature) | `make smoke` — end-to-end validation |

Scenarios tagged `@wip` require a live environment (real binaries, tmux, network) and are skipped
by `npm test`. Run `make smoke` for end-to-end validation.

## Running the tests

```bash
# Plain text summary
opencode-workspace stats

# Raw JSONL (one record per session)
cat ~/.config/opencode-workspace/sessions.jsonl | jq .
```

Each record:

```json
{
"ts": "2026-05-15T12:00:00.000Z",
"session_id": "uuid",
"prompt": "find open PRs...",
"retrieved_tools": [
{ "server": "github", "tool": "list_pull_requests", "score": 0.923 }
],
"corpus_size": 84,
"embedding_model": "Xenova/all-MiniLM-L6-v2",
"k": 10
}
npm test # unit tests — skips @wip scenarios
make smoke # end-to-end: real MCP servers, real index, real retrieval
```

## Smoke test

Verifies that `index` + retrieval are working end-to-end:

```bash
make smoke
```

This runs `opencode-workspace index`, then asserts that querying
`"list open pull requests on GitHub"` returns a GitHub tool as the top result.

## How it works

1. **`index`** — connects to every MCP server in `lib/opencode.json.template`
(using `@modelcontextprotocol/sdk`), calls `listTools()`, and stores
`{server, name, description, inputSchema}` plus a 384-dim embedding of
`"{server} / {tool}: {description}"` in a SQLite DB at
`~/.config/opencode-workspace/tools.db`.
Embeddings are skipped when `sha256(description + JSON.stringify(schema))`
is unchanged — making re-runs fast.

2. **One-shot** — the prompt is embedded with the same model, cosine-searched
against the corpus (via `sqlite-vec` if installed, otherwise in-process
brute-force), and the top-K tools are identified.
A temporary config is written to `/tmp/ow-<uuid>.json` that extends the
workspace template with `"permission": { "mcp_<server>_*": "deny" }` for
every server absent from the top-K results.
`opencode run "<prompt>"` is then spawned with `OPENCODE_CONFIG` pointing
at that temp file. The file is deleted when opencode exits.

3. **Compose, never overwrite** — only deny rules are generated; user-defined
permission entries in `~/.config/opencode/opencode.json` are preserved and
merged. A server the user has already denied cannot be re-enabled.
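
The three steps above can be sketched in a few lines of JavaScript. This is a minimal illustration under stated assumptions, not the package's actual code: the function names (`toolFingerprint`, `cosine`, `topK`, `buildPermissions`) are hypothetical, the 3-dimensional vectors stand in for the real 384-dimensional embeddings, the `search_pages` and `fetch` tool names are made up for the example, and the merge order in `buildPermissions` (user entries applied last) is one possible reading of "compose, never overwrite".

```javascript
const { createHash } = require('node:crypto');

// Step 1: change detection. Hash the description plus the serialized schema,
// so a tool is re-embedded only when this fingerprint changes.
function toolFingerprint(tool) {
  return createHash('sha256')
    .update(tool.description + JSON.stringify(tool.inputSchema))
    .digest('hex');
}

// Cosine similarity between two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Step 2: in-process brute-force search (the fallback path when sqlite-vec is absent).
function topK(queryVec, corpus, k) {
  return corpus
    .map((t) => ({ server: t.server, tool: t.name, score: cosine(queryVec, t.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Step 3: generate deny rules for servers with no tool in the top-K results.
// Assumption: user entries are spread last so generated rules never replace them.
function buildPermissions(allServers, hits, userPermission = {}) {
  const keep = new Set(hits.map((h) => h.server));
  const generated = {};
  for (const server of allServers) {
    if (!keep.has(server)) generated[`mcp_${server}_*`] = 'deny';
  }
  return { ...generated, ...userPermission };
}

// Toy corpus: hypothetical tools with 3-dim embeddings for illustration only.
const corpus = [
  { server: 'github', name: 'list_pull_requests', embedding: [1, 0, 0] },
  { server: 'notion', name: 'search_pages', embedding: [0, 1, 0] },
  { server: 'fetch', name: 'fetch', embedding: [0, 0, 1] },
];
const hits = topK([0.9, 0.1, 0], corpus, 1);
console.log(hits[0].server, buildPermissions(['github', 'notion', 'fetch'], hits));
```

Because only deny rules are ever generated, a server the user has already denied stays denied whichever way the two objects are merged; the merge order only matters for keys the user has explicitly set to something else.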

## MCP servers included

| Server | Description |
|---|---|
| `notion` | Notion API via `@notionhq/notion-mcp-server` |
| `gitlab` | GitLab CLI via `glab mcp serve` |
| `playwright` | Browser automation via `@playwright/mcp` |
| `fetch` | HTTP fetch via `mcp-server-fetch` (uvx) |
| `semgrep` | Code scanning via `semgrep mcp` |
| `aws-knowledge` | AWS docs & regional availability (remote) |
| `sequential-thinking` | Structured reasoning via `@modelcontextprotocol/server-sequential-thinking` |
| `github` | GitHub API via `@modelcontextprotocol/server-github` (requires `GITHUB_TOKEN`) |
| `brave-search-mcp-server` | Web search via Brave (requires `BRAVE_API_KEY`) |

## Prerequisites

- `tmux`
- `git`
- `curl`
- Node.js >= 18

## References

This implementation is based on the following work:

> Lumer, E., Nizar, F., Gulati, A., Honaganahalli Basavaraju, P., & Subbiah, V. K. (2025). *Tool-to-Agent Retrieval: Bridging Tools and Agents for Scalable LLM Multi-Agent Systems.* arXiv:2511.01854. https://arxiv.org/abs/2511.01854

```bibtex
@misc{lumer2025tooltoagent,
title = {Tool-to-Agent Retrieval: Bridging Tools and Agents for Scalable LLM Multi-Agent Systems},
author = {Lumer, Elias and Nizar, Faheem and Gulati, Anmol and Honaganahalli Basavaraju, Pradeep and Subbiah, Vamse Kumar},
year = {2025},
eprint = {2511.01854},
archivePrefix = {arXiv},
primaryClass = {cs.CL},
url = {https://arxiv.org/abs/2511.01854}
}
```

> Lumer, E., Nizar, F., Gulati, A., Honaganahalli Basavaraju, P., & Subbiah, V. K. (2025).
> *Tool-to-Agent Retrieval: Bridging Tools and Agents for Scalable LLM Multi-Agent Systems.*
> arXiv:2511.01854. <https://arxiv.org/abs/2511.01854>
1 change: 1 addition & 0 deletions cucumber.js
@@ -6,6 +6,7 @@ module.exports = {
'unit-tests/support/hooks.js',
'unit-tests/step-definitions/**/*.steps.js',
],
tags: 'not @wip',
format: ['progress-bar', 'summary'],
timeout: 30000,
},
62 changes: 62 additions & 0 deletions docs/installation.feature
Original file line number Diff line number Diff line change
@@ -0,0 +1,62 @@
Feature: Installation
Running "npm install -g @gus/opencode-workspace" installs the package and
automatically triggers a postinstall hook that sets up all required system
dependencies. The explicit "opencode-workspace install" command can be
re-run at any time to repair or update individual dependencies.

Each install step is wrapped in a try/catch so a single failure (for
example, a network error when downloading glab) warns and continues rather
than aborting the entire setup.

@wip
Scenario: Postinstall runs automatically after npm install
When "npm install -g @gus/opencode-workspace" is run
Then the postinstall hook calls "opencode-workspace install" automatically

@wip
Scenario: Install sets up uv if not already present
Given "uv" is not installed on the system
When the user runs "opencode-workspace install"
Then uv is downloaded and installed via the Astral installer script
And uv is available on PATH under ~/.local/bin

@wip
Scenario: Install sets up glab if not already present
Given "glab" is not installed on the system
When the user runs "opencode-workspace install"
Then the latest glab release is fetched from the GitLab API
And the glab binary is installed to ~/.local/bin/glab

@wip
Scenario: Install sets up opencode if not already present
Given "opencode" is not installed on the system
When the user runs "opencode-workspace install"
Then opencode is installed at the version pinned in package.json["opencode"]["version"]
And the installer script is fetched from https://opencode.ai/install

@wip
Scenario: Install sets up semgrep if not already present
Given "semgrep" is not installed on the system
When the user runs "opencode-workspace install"
Then semgrep is installed via "uv tool install semgrep"

@wip
Scenario: Install copies the TUI retrieval plugin
When the user runs "opencode-workspace install"
Then the file ~/.config/opencode/plugins/ow-tool-retrieval.js is created
And its contents match lib/tool-retrieval.plugin.js

@wip
Scenario: Already-installed dependencies are skipped without error
Given all dependencies (uv, glab, opencode, semgrep) are already installed
When the user runs "opencode-workspace install"
Then each dependency's existing version is logged to stdout
And no download or install step is retried

@wip
Scenario: A failing install step warns and continues
Given the glab download fails with a network error
When the user runs "opencode-workspace install"
Then a warning is printed containing "glab failed"
And a hint "Re-run: opencode-workspace install" is printed
And the remaining steps (opencode, semgrep, plugin) still run
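
The warn-and-continue behaviour specified in this feature could look roughly like the sketch below. It is an assumption-laden illustration, not the package's installer: `runInstallSteps` and the step objects are hypothetical names, and each step is assumed to be a synchronous function that throws on failure. Only the warning text ("glab failed") and the re-run hint come from the scenarios above.

```javascript
// Run each install step in order. A failing step is logged as a warning and
// the loop continues, so one broken download does not abort the whole setup.
function runInstallSteps(steps, log = console) {
  const failed = [];
  for (const { name, run } of steps) {
    try {
      run();
    } catch (err) {
      failed.push(name);
      log.warn(`${name} failed: ${err.message}`);
      log.warn('Re-run: opencode-workspace install');
    }
  }
  return failed; // names of the steps that need a retry
}

// Stub steps: glab simulates a network error, the others just record that they ran.
const ran = [];
const failedSteps = runInstallSteps([
  { name: 'uv', run: () => ran.push('uv') },
  { name: 'glab', run: () => { throw new Error('network error'); } },
  { name: 'opencode', run: () => ran.push('opencode') },
  { name: 'semgrep', run: () => ran.push('semgrep') },
  { name: 'plugin', run: () => ran.push('plugin') },
]);
console.log({ ran, failedSteps });
```

Returning the list of failed step names, rather than throwing at the end, keeps the postinstall hook from failing the whole `npm install` while still letting a caller report what needs attention.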
47 changes: 47 additions & 0 deletions docs/mcp-env.feature
@@ -0,0 +1,47 @@
Feature: MCP Environment Secrets (mcp env)
"opencode-workspace mcp env VAR_NAME" prompts for a secret value and stores
it in ~/.local/share/opencode/mcp.env in KEY=value format, one entry per line.

MCP servers that reference {env:VAR_NAME} in lib/opencode.json.template
automatically receive the stored value at startup via environment injection.
The directory is created if it does not exist. Re-running the command with
the same key updates the value in-place without duplicating the entry.

@wip
Scenario: Secret is stored after interactive prompt
Given the user runs "opencode-workspace mcp env GITHUB_TOKEN"
When the user types a secret value and presses Enter
Then the value is stored in ~/.local/share/opencode/mcp.env as "GITHUB_TOKEN=<value>"
And "Saved GITHUB_TOKEN to <path>" is printed to stdout

Scenario: mcp.env uses KEY=value format with one entry per line
Given ~/.local/share/opencode/mcp.env contains:
"""
GITHUB_TOKEN=ghp_abc123
NOTION_TOKEN=secret_xyz
"""
When the mcp.env file is parsed
Then GITHUB_TOKEN resolves to "ghp_abc123"
And NOTION_TOKEN resolves to "secret_xyz"

Scenario: Storing a second key does not overwrite the first
Given ~/.local/share/opencode/mcp.env already contains "GITHUB_TOKEN=ghp_abc123"
When "NOTION_TOKEN=secret_xyz" is added to mcp.env
Then both GITHUB_TOKEN and NOTION_TOKEN are present in mcp.env

Scenario: Storing an existing key updates its value in-place
Given ~/.local/share/opencode/mcp.env already contains "GITHUB_TOKEN=old_token"
When "GITHUB_TOKEN=new_token" is written to mcp.env
Then GITHUB_TOKEN resolves to "new_token"
And there is only one GITHUB_TOKEN entry in mcp.env

Scenario: The mcp.env directory is created automatically if absent
Given ~/.local/share/opencode/ does not exist
When the mcp.env file is written
Then the directory ~/.local/share/opencode/ is created automatically

@wip
Scenario: Missing VAR_NAME argument prints usage and exits with code 1
When the user runs "opencode-workspace mcp env" without a variable name
Then "Usage: opencode-workspace mcp env VAR_NAME" is printed to stderr
And the process exits with code 1
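
The parsing and in-place update rules in this feature can be illustrated with a short sketch. The helper names (`parseEnv`, `upsertEnv`, `saveSecret`) are hypothetical and the real implementation may differ, for example in how it treats comments or quoted values; only the KEY=value format, the one-entry-per-key rule, and the automatic directory creation are taken from the scenarios above.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Parse KEY=value lines into an object. Blank lines and lines without a key are skipped.
function parseEnv(text) {
  const out = {};
  for (const line of text.split('\n')) {
    const idx = line.indexOf('=');
    if (idx <= 0) continue;
    out[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return out;
}

// Insert or update one key, keeping the other entries and their order.
function upsertEnv(text, key, value) {
  const lines = text.split('\n').filter((l) => l.trim() !== '');
  const i = lines.findIndex((l) => l.startsWith(`${key}=`));
  if (i >= 0) lines[i] = `${key}=${value}`; // update in place, no duplicate entry
  else lines.push(`${key}=${value}`);
  return lines.join('\n') + '\n';
}

// Write the secret to disk, creating the parent directory if it does not exist.
function saveSecret(file, key, value) {
  fs.mkdirSync(path.dirname(file), { recursive: true });
  const current = fs.existsSync(file) ? fs.readFileSync(file, 'utf8') : '';
  fs.writeFileSync(file, upsertEnv(current, key, value));
}

const sample = 'GITHUB_TOKEN=old_token\nNOTION_TOKEN=secret_xyz\n';
console.log(parseEnv(upsertEnv(sample, 'GITHUB_TOKEN', 'new_token')));
```

Splitting on the first `=` only means values that themselves contain `=` (as some base64-style tokens do) survive a round trip.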