Demesne

An agent-agnostic, local, containerised agent-orchestration MCP server you drive from your agent of choice. It runs untrusted shell, scripts, and AI coding agents in disposable containers, decoupling agent reasoning from execution effects. Host mounts are read-only; outbound network access is governed by egress allowlists.

Warning

Alpha, best-effort. demesne is early software and is largely built using itself (its own containerised agents do much of the work). Expect rough edges, gaps, and breaking changes between versions, and review what it does before relying on it.

What you can do

Ask your agent to run through demesne:

One-off scripts: execute a shell command in a fresh sandbox and collect output. (Example)
Headless React-widget rendering: render and screenshot a React widget inside a sandbox via the demesne-built browser image (Playwright + Chromium + Node 22, works at egress=none). (How-to)
Video / audio / image conversion: transcode, convert, and manipulate media inside a sandbox via the demesne-built media image (ffmpeg + ImageMagick + libvips + audio tooling, works at egress=none). (How-to)
Offline interactive-fiction build/playtest: compile a Twine story and playtest it headlessly via the demesne-built twine image (Tweego + Twine story formats + Chromium, works at egress=none). (Reference)
Offline HTML5-game build/playtest: build and playtest a web game via the demesne-built webgamedev image (a warm Phaser + Vite + TypeScript template + Chromium, works at egress=none). (Reference)
Long-running research with open internet: spawn a research agent with unrestricted outbound access. (Reference)
Delegated coding-agent tasks: hand off a prompt to a sub-agent running inside a sandbox. (Example)
Persistent sessions: create a sandbox, run multiple commands, upload/download files, then destroy it. (Example)
Multi-agent orchestration: the orchestrator agent is itself a containerised run that spawns child sandboxes for its workers and verifier, dispatching tasks and judging results across the tree. (Example)
Ready-made orchestration skills: a library of SKILL.md pipeline definitions (migration sweeps, corpus map-reduce, document ETL, and more) to drop into your agent and adapt. Pre-alpha: in principle ready to use but largely untested, so regard them as examples of what could be tried. (Example skills)

Together these let your agent take on larger tasks more autonomously. You can push script execution that would be awkward to security-review, autonomous research, and entire multi-agent pipelines into containers that run with no permission prompts. That gives you much of the autonomy you'd otherwise reach for --dangerously-skip-permissions to get, but with the host kept at arm's length by a container boundary, read-only mounts, and egress allowlists. (That boundary is container-level isolation, not a hard security guarantee; see SECURITY.md.) You also don't pre-declare the pipeline: your agent composes the orchestration prompt itself for the task at hand, and the containerised orchestrator adapts the layout and subagents as it runs.

How it works

Containerised agents can themselves spawn sandboxes, and — with appropriate configuration — get a read-only subset of your host's MCP server tools proxied in through a per-sandbox tunnel. See docs/reference/nested-sandboxes.md.

Get started

a. Install the binary

Download a release binary from the GitHub releases page. Builds are available for linux/amd64, darwin/amd64, darwin/arm64, and windows/amd64.
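As a rough aid for picking a download, the sketch below maps the current host to one of the four build targets listed above. It is an illustration only: the helper name `release_target` and the alias table are assumptions, and the sketch says nothing about the actual release asset file names.

```python
import platform
from typing import Optional

# Build targets listed in this README.
SUPPORTED_TARGETS = {"linux/amd64", "darwin/amd64", "darwin/arm64", "windows/amd64"}

# Common spellings reported by platform.machine(), normalised to Go-style names.
_ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "arm64": "arm64",
    "aarch64": "arm64",
}


def release_target(system: Optional[str] = None,
                   machine: Optional[str] = None) -> Optional[str]:
    """Return the "os/arch" target for a host, or None if no build is listed.

    Hypothetical helper: defaults to the current host when no arguments are given.
    """
    os_name = (system or platform.system()).lower()
    arch = _ARCH_ALIASES.get((machine or platform.machine()).lower())
    target = f"{os_name}/{arch}"
    return target if target in SUPPORTED_TARGETS else None
```

A `None` result (for example on linux/arm64) suggests building from source instead.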

To build from source instead, see CONTRIBUTING.md.

b. Run a local OpenSandbox

demesne needs a running OpenSandbox instance. See docs/reference/requirements.md for prerequisites; Step 2 of the Quickstart walks through launching one locally.

c. Wire into Claude Code or Codex

See docs/how-to/wire-into-mcp-client.md for the per-client config snippets and full env var reference.

For the full walkthrough, see Quickstart.

Tools

| Tool | Description | Reference |
| --- | --- | --- |
| `sandbox_script` | Run a shell command in a fresh sandbox and tear it down. Returns exit code, stdout, stderr, and the `/out` host path. | ref |
| `sandbox_create` | Create a persistent sandbox. Returns a `sandbox_id` handle and the `/out` host path. TTL is 24h, refreshed by each `sandbox_exec`. | ref |
| `sandbox_exec` | Run a shell command in an existing sandbox. Refreshes TTL. Returns exit code, stdout, and stderr. | ref |
| `sandbox_upload` | Copy a host file into an existing sandbox. | ref |
| `sandbox_download` | Copy a file out of an existing sandbox; written under `<output_dir>/downloads/<basename>`. Returns the host path. | ref |
| `sandbox_destroy` | Kill an existing sandbox. Host output dir is preserved. | ref |
| `sandbox_agent` | Run an AI coding agent in a fresh sandbox against a caller-supplied prompt. Provider is inferred from `model` (defaults to codex/gpt-5.5 when Codex credentials are configured, otherwise claude-code/sonnet). Outbound HTTPS is restricted to the vendor proxy. Returns exit code, stdout, stderr, the `/out` host path, and the (indicative) cost summary. | ref |
| `sandbox_research` | Run a long-running research agent with no input mounts and unrestricted outbound internet access. Returns exit code, stdout, stderr, the `/out` host path, and the (indicative) cost summary. | ref |
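Your agent normally issues these calls for you, but it can help to see what one looks like on the wire. MCP tool invocations are JSON-RPC 2.0 messages with the method `tools/call`; the sketch below builds one for `sandbox_script`. The helper name `tools_call_request` and the `command` argument key are assumptions for illustration; the per-tool reference pages are the authority on parameter names.

```python
import json
from typing import Any, Dict


def tools_call_request(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialise an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# "command" is a hypothetical argument name; check the sandbox_script reference.
request = tools_call_request(1, "sandbox_script", {"command": "echo hello"})
```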

Background / async jobs

All three spawn tools (`sandbox_script`, `sandbox_agent`, `sandbox_research`) accept an optional `background: true` parameter. When set, the tool returns immediately with `{job_id, status: "running"}` instead of blocking. Use the complementary job-control tools to manage the run:

| Tool | Description |
| --- | --- |
| `sandbox_status` | Non-blocking snapshot of status, elapsed time, stdout tail, and cost. |
| `sandbox_wait` | Block up to `timeout_seconds` (default 30, max 120) for a terminal state; returns the final result or a `"still running"` sentinel if the timeout elapses. Call in a loop to poll. |
| `sandbox_cancel` | Cancel the job and its entire descendant subtree; idempotent on already-terminal jobs. |

Use `background: true` when a run might exceed the ~240s client tool-call timeout, for example a multi-hour research agent or a long compilation. The job registry is in-memory, so jobs do not survive MCP-server restarts: after a restart, a stale `job_id` returns an error. Completed jobs are retained for about 1h by a TTL reaper.
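The poll-in-a-loop pattern for `sandbox_wait` might look roughly like the sketch below. It is written against a hypothetical `call_tool(name, arguments)` callable standing in for whatever MCP client you use. The `job_id` argument name and the assumption that the sentinel appears in a `status` field are guesses at the response shape, not documented behaviour.

```python
import time
from typing import Any, Callable, Dict, Optional

# Hypothetical client hook: takes a tool name and arguments, returns the result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def wait_for_job(call_tool: ToolCaller, job_id: str, deadline_seconds: float,
                 poll_timeout: int = 120) -> Optional[Dict[str, Any]]:
    """Poll sandbox_wait until the job is terminal; cancel it once the deadline passes."""
    poll_timeout = min(poll_timeout, 120)  # timeout_seconds is capped at 120
    deadline = time.monotonic() + deadline_seconds
    while time.monotonic() < deadline:
        result = call_tool(
            "sandbox_wait",
            {"job_id": job_id, "timeout_seconds": poll_timeout},
        )
        # Assumed shape: "status" reads "still running" until the job ends.
        if result.get("status") != "still running":
            return result
    # Deadline passed: cancel the job and its descendant subtree.
    call_tool("sandbox_cancel", {"job_id": job_id})
    return None
```

Each `sandbox_wait` call blocks for at most 120 seconds, so every individual tool call stays well under the ~240s client timeout regardless of how long the job itself runs.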

For a step-by-step walkthrough of the persistent-sandbox lifecycle, see the Quickstart and the sandbox_create / sandbox_exec reference pages.
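As a minimal sketch of that lifecycle, the helper below creates a sandbox, runs a list of commands in it, and destroys it even if a call fails. The `call_tool` callable is a hypothetical stand-in for your MCP client, and the `command` argument name for `sandbox_exec` is an assumption; only the `sandbox_id` handle is taken from the tool table.

```python
from typing import Any, Callable, Dict, List

# Hypothetical client hook: takes a tool name and arguments, returns the result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_in_session(call_tool: ToolCaller, commands: List[str]) -> List[Dict[str, Any]]:
    """Create a sandbox, run each command in it, and always destroy it afterwards."""
    created = call_tool("sandbox_create", {})
    sandbox_id = created["sandbox_id"]  # handle returned by sandbox_create
    results: List[Dict[str, Any]] = []
    try:
        for command in commands:
            # "command" is an assumed argument name for sandbox_exec.
            results.append(
                call_tool("sandbox_exec", {"sandbox_id": sandbox_id, "command": command})
            )
    finally:
        # The host output dir is preserved after destroy, so cleanup is safe here.
        call_tool("sandbox_destroy", {"sandbox_id": sandbox_id})
    return results
```

Destroying in a `finally` block avoids leaving a sandbox running until its 24h TTL expires when a command or client call raises.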

Docs


|  |  |
| --- | --- |
| Quickstart | Five steps to your first `sandbox_script` call |
| Docs | Tutorials, how-to recipes, reference, explanation |
| Examples | Runnable example calls |
| Example skills | Ready-to-use orchestration pipelines you can adapt (pre-alpha) |

Contributing

See CONTRIBUTING.md for building from source, linting, and tests.

Status

See CHANGELOG.md for milestone history.

