Local Studio is a local-first workstation for running, managing, and using self-hosted LLM backends. One machine can launch models, watch GPU/runtime state, chat with OpenAI-compatible endpoints, and run agent sessions against local or remote controllers.
It is built from three modules that share one controller API:
- controller/: Bun/Hono backend. Owns model lifecycle (launch, evict, recipes, downloads, runtime process coordination), an OpenAI-compatible proxy (chat, models, tokenization, audio), system state (GPU metrics, logs, usage, settings, SSE), and controller integrations.
- frontend/: Next.js 16 + React 19 UI and the macOS Electron desktop shell. Hosts /agent (Pi coding agent runtime), settings, usage, recipes, logs, and the browser-facing API routes.
- cli/: Bun CLI for checking and operating a controller from a terminal, with headless commands and an interactive TUI.
A controller is the backend process the UI and CLI talk to — the Bun/Hono
server in controller/. You can run one locally or point the frontend at a
remote controller on a GPU host. The controller owns model lifecycle, the
OpenAI-compatible proxy, system state, and SSE event streams.
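Because the proxy is OpenAI-compatible, a client can talk to it with an ordinary chat-completions payload. The sketch below only builds the request; it does not send it. It assumes the conventional OpenAI path /v1/chat/completions, which this document does not confirm. The helper name buildChatRequest and the model id my-local-model are hypothetical.

```typescript
// Hypothetical helper: describes a chat request for a controller without sending it.
// Assumption: the proxy serves the conventional OpenAI path /v1/chat/completions.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildChatRequest(
  baseUrl: string,
  model: string,
  messages: ChatMessage[],
  apiKey?: string,
): PreparedRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Only attach a bearer token when a key is supplied (e.g. for a remote controller).
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`;
  return {
    // Strip trailing slashes so the joined URL never contains "//v1".
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    method: "POST",
    headers,
    body: JSON.stringify({ model, messages }),
  };
}

// "my-local-model" is a placeholder; use an id the controller actually reports.
const chatRequest = buildChatRequest("http://127.0.0.1:8080/", "my-local-model", [
  { role: "user", content: "Hello" },
]);
```

The returned object can be passed to fetch as fetch(chatRequest.url, chatRequest) once a model is running.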
flowchart LR
User["User"] --> Desktop["Electron desktop app"]
User --> Web["Next.js web UI"]
User --> CLI["Bun CLI"]
Desktop --> Frontend["Frontend server / API routes"]
Web --> Frontend
CLI --> Controller["Controller API (Bun + Hono)"]
Frontend --> Controller
Controller --> Runtime["Inference runtime process"]
Runtime --> Backends["vLLM / SGLang / llama.cpp / MLX recipes"]
Controller --> Data["Local data directory"]
Controller --> Events["SSE status and runtime events"]
Frontend --> Agent["Pi coding agent runtime"]
flowchart TB
subgraph Frontend["frontend/"]
AgentPage["/agent"]
Settings["/settings and controller config"]
Usage["/usage"]
Recipes["/recipes"]
ProxyRoutes["/api/* proxy and agent routes"]
DesktopMain["desktop/ Electron shell"]
end
subgraph Controller["controller/"]
HttpApp["src/http/app.ts"]
Engines["src/modules/engines"]
Models["src/modules/models"]
Proxy["src/modules/proxy"]
Studio["src/modules/studio"]
System["src/modules/system"]
Audio["src/modules/audio"]
Stores["src/stores"]
end
ProxyRoutes --> HttpApp
HttpApp --> Engines
HttpApp --> Models
HttpApp --> Proxy
HttpApp --> Studio
HttpApp --> System
HttpApp --> Audio
System --> Stores
Prerequisites: Bun 1.x (controller, CLI), Node.js 20+ and npm (frontend),
Python 3.10+ on PATH (uv strongly recommended; engine installs fall back to
pip), Git. vLLM/SGLang serving on Linux needs NVIDIA driver + CUDA; Apple
Silicon uses the MLX backend.
Run the preflight check first (toolchain, ports, directories, network):
npm run doctor
Start the controller (listens on 127.0.0.1:8080, data dir + SQLite created
automatically, model weights in LOCAL_STUDIO_MODELS_DIR, default /models):
cd controller && bun install && bun src/main.ts
Start the frontend in a second terminal, then open http://localhost:3000/setup:
cd frontend && npm ci && npm run dev
npm ci runs a postinstall patch against @earendil-works/pi-ai. If that step
prints a warning, agent streaming may misrender. The setup wizard walks through
choosing a models directory, installing an engine, downloading a model,
launching it, and benchmarking. Engine installs (vLLM/SGLang/MLX) land in
<data dir>/runtime/venvs/<backend>-latest.
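The path conventions above can be expressed as two small functions. This is a sketch, not the controller's code: the function names are hypothetical, and it assumes the <backend> path segment is the lowercase backend id (vllm, sglang, mlx), which this document does not state.

```typescript
// Hypothetical helpers for the documented layout; not taken from the controller source.
// Assumption: <backend> is the lowercase backend id.
type InstallableBackend = "vllm" | "sglang" | "mlx";

// <data dir>/runtime/venvs/<backend>-latest
function engineVenvPath(dataDir: string, backend: InstallableBackend): string {
  const root = dataDir.replace(/\/+$/, ""); // tolerate a trailing slash
  return `${root}/runtime/venvs/${backend}-latest`;
}

// LOCAL_STUDIO_MODELS_DIR, falling back to the documented default /models.
function modelsDir(env: Record<string, string | undefined>): string {
  return env.LOCAL_STUDIO_MODELS_DIR || "/models";
}

// "/srv/local-studio" is a placeholder data directory.
const exampleVenv = engineVenvPath("/srv/local-studio/", "vllm");
const exampleModels = modelsDir({});
```

Taking the environment as a parameter rather than reading process.env directly keeps the helper easy to exercise in isolation.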
Optional CLI:
cd cli && bun install && bun src/main.ts status
The agent surface lives at /agent in the frontend. It uses
@earendil-works/pi-coding-agent through the frontend runtime rather than
shelling out to a separate agent process for normal turns. Agent skills and
extensions are loaded by the frontend runtime and surfaced in the session UI.
Agent file operations are local-only, stored under data/agentfs.
Recipes launch through the controller runtime layer. Wired backend families:
- vllm: vLLM server recipes through configured/discovered/system/Docker/bundled targets.
- sglang: SGLang launch-server recipes through configured or discovered Python targets.
- llamacpp: llama.cpp llama-server recipes for GGUF models.
- mlx: MLX mlx_lm.server recipes for Apple Silicon.
Runtime target discovery is surfaced in Settings; selections persist in the controller data directory.
Build the frontend, then serve it with the standalone server:
cd frontend && npm run build && npm run start
npm run start launches the standalone server (scripts/start-standalone.mjs).
Never use plain next start — it breaks SSE streaming. The controller runs the
same way in production as in development: bun src/main.ts.
The controller binds 127.0.0.1 by default. Binding a non-loopback host (e.g.
LOCAL_STUDIO_HOST=0.0.0.0) requires LOCAL_STUDIO_API_KEY — startup throws
without it. On a trusted LAN you may instead set
LOCAL_STUDIO_ALLOW_UNAUTHENTICATED=true to opt out of authentication.
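The binding rule above amounts to a startup guard. The following is a minimal sketch of that logic, not the controller's actual check: the function names, the loopback test, and the error message are assumptions for illustration.

```typescript
// Hypothetical startup guard; the real controller's check may differ in detail.
interface BindEnv {
  LOCAL_STUDIO_HOST?: string;
  LOCAL_STUDIO_API_KEY?: string;
  LOCAL_STUDIO_ALLOW_UNAUTHENTICATED?: string;
}

// Assumption: "loopback" means localhost, ::1, or any 127.x.y.z address.
function isLoopback(host: string): boolean {
  return host === "localhost" || host === "::1" || /^127(\.\d{1,3}){3}$/.test(host);
}

// Returns the host to bind, or throws when a non-loopback bind has no API key
// and unauthenticated access was not explicitly allowed.
function assertSafeBind(env: BindEnv): string {
  const host = env.LOCAL_STUDIO_HOST ?? "127.0.0.1";
  if (isLoopback(host)) return host;
  if (env.LOCAL_STUDIO_API_KEY) return host;
  if (env.LOCAL_STUDIO_ALLOW_UNAUTHENTICATED === "true") return host;
  throw new Error(
    `Refusing to bind ${host} without LOCAL_STUDIO_API_KEY ` +
      "(or LOCAL_STUDIO_ALLOW_UNAUTHENTICATED=true on a trusted LAN)",
  );
}
```

For example, assertSafeBind({ LOCAL_STUDIO_HOST: "0.0.0.0" }) throws, while adding LOCAL_STUDIO_API_KEY to the same object returns "0.0.0.0".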
Point the frontend at a remote controller with BACKEND_URL or
NEXT_PUBLIC_API_URL (default http://localhost:8080). The CLI uses
LOCAL_STUDIO_URL.
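A sketch of how a client might resolve the controller URL from these variables. The precedence (BACKEND_URL before NEXT_PUBLIC_API_URL) and the CLI falling back to the same default are assumptions; this document names the variables but not their order. Both function names are hypothetical.

```typescript
// Hypothetical resolvers; precedence and the CLI default are assumptions.
type Env = Record<string, string | undefined>;

const DEFAULT_CONTROLLER_URL = "http://localhost:8080";

function normalizeUrl(raw: string): string {
  return raw.replace(/\/+$/, ""); // drop trailing slashes for stable joining
}

// Frontend: BACKEND_URL, then NEXT_PUBLIC_API_URL, then the documented default.
function resolveFrontendControllerUrl(env: Env): string {
  return normalizeUrl(env.BACKEND_URL || env.NEXT_PUBLIC_API_URL || DEFAULT_CONTROLLER_URL);
}

// CLI: LOCAL_STUDIO_URL, with the same default assumed.
function resolveCliControllerUrl(env: Env): string {
  return normalizeUrl(env.LOCAL_STUDIO_URL || DEFAULT_CONTROLLER_URL);
}
```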
Remote deployment is handled by scripts/deploy-remote.sh. Configure
.env.local first (see .env.example):
REMOTE_HOST=192.168.x.x
REMOTE_USER=username
REMOTE_PATH=/home/user/project
REMOTE_URL=https://your-domain.example
./scripts/deploy-remote.sh controller # sync + build + restart controller
./scripts/deploy-remote.sh frontend # sync + build + restart frontend
./scripts/deploy-remote.sh status # inspect remote processes
Local daemon helpers: ./scripts/daemon-start.sh, daemon-status.sh,
daemon-stop.sh.
npm run check # contracts + structure + frontend quality + controller/cli typecheck
npm run test:e2e # controller integration + frontend e2e
The configured pre-push hook (.githooks/pre-push) checks conventional commits
and runs the frontend quality gate (npm --prefix frontend run check:quality)
before pushing.
Releases are automated. Pushing conventional commits to main triggers the
release.yml workflow, which runs semantic-release (release.config.cjs): it
analyzes commits since the last tag, cuts the next tag (feat → minor, others
→ patch, breaking → major), and publishes a GitHub Release with generated notes.
There is no npm publish (private monorepo, protected main). Do not tag by hand.
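To make the bump rule concrete, here is a small sketch that applies it to a list of commit messages. It illustrates the rule as stated above (feat gives minor, breaking gives major, any other conventional commit gives patch). It is not semantic-release's implementation, and the actual behavior depends on release.config.cjs. The function names are hypothetical.

```typescript
// Illustrative only: mirrors the rule described in this README, not semantic-release itself.
type Bump = "major" | "minor" | "patch";

// Classify one commit message; returns null for non-conventional subjects.
function bumpForCommit(message: string): Bump | null {
  const [subject = ""] = message.split("\n");
  const match = /^(\w+)(\([^)]*\))?(!)?: .+/.exec(subject);
  if (!match) return null;
  const breaking = match[3] === "!" || /^BREAKING[ -]CHANGE: /m.test(message);
  if (breaking) return "major";
  return match[1] === "feat" ? "minor" : "patch";
}

// Pick the highest bump across all commits and apply it to the current version.
function nextVersion(current: string, messages: string[]): string {
  const rank: Record<Bump, number> = { patch: 1, minor: 2, major: 3 };
  let best: Bump | null = null;
  for (const message of messages) {
    const bump = bumpForCommit(message);
    if (bump && (best === null || rank[bump] > rank[best])) best = bump;
  }
  if (best === null) return current; // nothing releasable
  const [major = 0, minor = 0, patch = 0] = current.split(".").map(Number);
  if (best === "major") return `${major + 1}.0.0`;
  if (best === "minor") return `${major}.${minor + 1}.0`;
  return `${major}.${minor}.${patch + 1}`;
}
```

With these definitions, nextVersion("1.2.3", ["feat: add recipe import", "fix: log rotation"]) yields "1.3.0", and a subject such as "feat!: drop legacy proxy route" moves the same version to "2.0.0".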
Contributions should be small, focused, and easy to review. Start from the
latest main, one logical change per branch, no formatting-only rewrites, no
secrets or build artifacts. Run npm run check (and npm run test:e2e for
behavior changes) before opening a PR; include a concise summary, the validation
commands you ran, and screenshots for UI changes. See AGENTS.md for the full
code standards an agent (or contributor) must follow.
See LICENSE.