Portfolio project. A local cross-device workbench that controls multiple AI coding CLIs — Codex, Claude, Codewhale — from one browser UI. Built as an end-to-end demonstration of full-stack engineering: Python runtime layer, vanilla-JS web UI, Electron shell, cross-device control over LAN/Tailscale, and real-time parsing of three different CLI stream protocols. Not actively seeking adoption; the source is public so the design and the code can be read.
Agent Desk runs the product layer (projects, conversations, messages, runtime state, queue/cancel/stop, handoffs) locally and treats mature CLIs as execution adapters behind it. You start the server on a Windows workstation and control any of the CLIs from the same machine, a Mac, or a phone browser — the conversation, queue state, and runtime status all survive switching devices.
- Runtime adapter pattern: three vendor CLIs (Codex, Claude, Codewhale) plugged into a uniform send/stop/restore interface, each with its own session model, stream format, and process lifecycle.
- Real-time stream parsing: line-by-line NDJSON parsers for claude --output-format stream-json, codex --json, and codewhale exec --output-format stream-json, including handling of terminal escape sequences that upstream CLIs embed mid-stream.
- Cross-device state: durable local message store, runtime status that survives page reloads, project-scoped runtime keys, and a pairing gate for non-loopback access.
- Surgical compatibility shims: for example, when deepseek-tui was renamed to codewhale upstream, a 30-line resolve_deepseek_executable() PATH probe replaced a 47-file hard rename (see commit 6d84254).
- Operational hygiene: 700+ unit tests, smoke scripts, and an isolated public-preview launcher (Start-AgentDeskPreview.cmd) that uses a separate data root so portfolio screenshots never leak real local history.
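The stream-parsing bullet above can be illustrated with a minimal sketch. It is not the parser in this repository: the function name parse_ndjson_lines, the regular expression, and the event field names used in the usage note are assumptions made for illustration. The idea is to strip terminal escape sequences from each line, then keep only the lines that decode to a JSON object.

```python
import json
import re
from typing import Iterable, Iterator

# Matches CSI sequences (ESC [ ... final byte) and OSC sequences
# (ESC ] ... terminated by BEL or ESC \). Other escape forms are not covered.
ANSI_ESCAPE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"
)


def parse_ndjson_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per line that holds a JSON object, skipping noise."""
    for raw in lines:
        cleaned = ANSI_ESCAPE.sub("", raw).strip()
        if not cleaned:
            continue  # blank line, or a line that was only escape codes
        try:
            event = json.loads(cleaned)
        except json.JSONDecodeError:
            continue  # banner or progress text that is not JSON
        if isinstance(event, dict):
            yield event  # arrays and scalars are ignored in this sketch
```

Feeding it the stdout lines of a CLI process (for example, iterating over a subprocess pipe opened in text mode) would produce a stream of event dicts. Silently dropping malformed lines is a design choice for the sketch; a real parser would probably log them.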
AI coding tools are powerful, but the workflow gets messy fast:
- Codex, Claude, Codewhale, and other CLIs each have their own session model.
- Long-running work is hard to follow from another device.
- A user should not have to understand whether a CLI process is alive, dead, or resumable.
- Sending context from one agent conversation to another usually means manual copy-paste.
- Phone access to a local coding agent should feel like continuing the same conversation, not remoting into a terminal.
Agent Desk treats mature CLIs as execution engines and keeps the product layer local: projects, conversations, messages, runtime state, queue state, stop/resume, and handoffs.
As provider-native mobile clients improve, Agent Desk should not compete as a generic mobile chat app for one provider. Its job is the local project control layer: one place to run and resume multiple CLI agents on the user's own workstation, keep the project/session identity stable, and let agents hand work to each other with user confirmation.
The phase-1 target is:
Phone / Mac / Windows browser
-> Agent Desk web UI
-> Windows host
-> local Codex / Claude / Codewhale process
The current golden path is:
- Add or select a project folder.
- Create or select a conversation.
- Choose provider settings such as adapter, model, thinking level, and permission mode.
- Send a message to the local CLI runtime.
- See running, queued, canceled, stopped, failed, or resumable state.
- Reopen the browser and continue the same Agent Desk conversation.
The product plan lives in docs/phase-1-launch-prd.md. A portfolio-ready demo outline lives in docs/portfolio-showcase.md, with the release checklist in docs/public-preview-checklist.md and the current evidence packet in docs/mature-preview-evidence.md.
- Project/folder based conversation list.
- Local web UI and Electron desktop shell.
- Windows LAN/Tailscale access for phone and Mac control surfaces.
- Pairing gate for non-loopback browser access.
- Durable local message store.
- Runtime adapters for Codex, Claude, and Codewhale flows.
- Visible queued message state.
- Cancel queued messages before they are sent.
- Stop the current running turn.
- Restore/resume a resumable conversation from the UI.
- Import/discover local provider sessions.
- Inter-conversation handoff protocol for agent collaboration.
- Focused unit tests around runtime state, queueing, restore, and UI wiring.
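The queue, cancel, and stop items in the list above imply a small per-conversation state machine. The following is a rough sketch of that idea only; the class names TurnQueue and QueuedMessage, the method names, and the state strings are hypothetical and are not taken from the Agent Desk source.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional


@dataclass
class QueuedMessage:
    message_id: str
    text: str
    state: str = "queued"  # hypothetical states: queued, running, canceled, completed


class TurnQueue:
    """Pending messages for one conversation; at most one turn runs at a time."""

    def __init__(self) -> None:
        self._pending: Deque[QueuedMessage] = deque()
        self.running: Optional[QueuedMessage] = None

    def enqueue(self, message_id: str, text: str) -> QueuedMessage:
        message = QueuedMessage(message_id, text)
        self._pending.append(message)
        return message

    def cancel(self, message_id: str) -> bool:
        """Cancel a message only if it has not been sent to the runtime yet."""
        for message in self._pending:
            if message.message_id == message_id:
                self._pending.remove(message)
                message.state = "canceled"
                return True
        return False  # unknown id, or the message is already running

    def start_next(self) -> Optional[QueuedMessage]:
        """Promote the oldest queued message when no turn is in flight."""
        if self.running is not None or not self._pending:
            return None
        self.running = self._pending.popleft()
        self.running.state = "running"
        return self.running

    def finish_running(self, state: str = "completed") -> Optional[QueuedMessage]:
        """Mark the in-flight turn as finished (or stopped, failed, ...)."""
        finished, self.running = self.running, None
        if finished is not None:
            finished.state = state
        return finished
```

In this sketch, stopping the current turn would be modeled as finish_running("stopped") after the adapter has terminated the CLI process, while cancel only ever touches messages that are still waiting.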
- Not a hosted SaaS.
- Not a generic mobile chat client for one provider.
- Not a replacement for Codex, Claude Code, or Codewhale.
- Not a secure multi-user server.
- Not a production remote-access gateway.
- Not a full IDE.
Use it on trusted local networks or through your own private network tooling.
Requirements:
- Python 3.10+
- Node.js if you want the Electron desktop shell
- At least one local AI CLI, depending on the adapter you want to use
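To check which CLI is actually installed, a PATH probe is enough. The sketch below shows the general shape of such a probe, in the spirit of the resolve_deepseek_executable() shim mentioned earlier; the body of that function is not shown in this README, so the helper name resolve_executable and its signature are assumptions. It relies only on the standard library's shutil.which.

```python
import shutil
from typing import Callable, Iterable, Optional


def resolve_executable(
    candidates: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first candidate found on PATH, else None.

    `which` is injectable so the probe can be tested without real binaries.
    """
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    # Prefer the new upstream name, fall back to the old one.
    print(resolve_executable(["codewhale", "deepseek-tui"]))
```

Listing the new name first and the old name second is what lets a rename be absorbed in one place instead of across every call site.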
From the repository root:
python -m pip install -e .
Start the local browser UI:
agent-desk-ui
Then open:
http://127.0.0.1:8765
This is the normal local product surface. It uses your real Agent Desk data root, usually ~\.agent-desk, so it can show real local projects, conversations, provider sessions, runtime failures, and logs.
Start the desktop shell:
.\Start-AgentDesk.cmd
Start the browser UI for phone/Mac access over LAN or Tailscale:
.\Start-AgentDeskRemote.cmd
The remote launcher prints:
- the local URL;
- any detected Tailscale URL;
- the current pairing code;
- stdout/stderr log paths.
After pairing once, the browser stores a local token so the same device can keep using Agent Desk without re-entering the code.
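The pairing flow described above (loopback requests pass, other devices present a code once and then reuse a stored token) might look roughly like this. It is a sketch under assumptions, not the project's implementation: the class name PairingGate, the six-digit code format, and the in-memory token set are all hypothetical.

```python
import hmac
import ipaddress
import secrets
from typing import Optional, Set


class PairingGate:
    """Allow loopback clients freely; require a paired token from anyone else."""

    def __init__(self) -> None:
        # Hypothetical format: a six-digit code printed by the launcher.
        self.pairing_code = f"{secrets.randbelow(10**6):06d}"
        self._tokens: Set[str] = set()

    def pair(self, submitted_code: str) -> Optional[str]:
        """Exchange a correct pairing code for a long-lived device token."""
        ok = hmac.compare_digest(
            submitted_code.encode("utf-8"), self.pairing_code.encode("utf-8")
        )
        if not ok:
            return None
        token = secrets.token_urlsafe(32)
        self._tokens.add(token)
        return token

    def is_allowed(self, remote_addr: str, token: Optional[str] = None) -> bool:
        try:
            if ipaddress.ip_address(remote_addr).is_loopback:
                return True
        except ValueError:
            pass  # unparseable address: treat as non-loopback
        return token is not None and token in self._tokens
```

A real gate would also persist tokens across restarts and rate-limit pairing attempts; both are left out here to keep the sketch short.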
Start an isolated public-preview demo without your real ~\.agent-desk history:
.\Start-AgentDeskPreview.cmd
Then open:
http://127.0.0.1:8776
This is the isolated public-preview surface. The preview launcher uses a separate data directory under %TEMP%, so old local conversations and failed runtime records do not appear in portfolio screenshots or demos.
It also seeds a small Demo Workspace with Builder/Reviewer roles and sample files, so the first screen is useful without touching your real local history.
Use .\Start-AgentDeskPreview.cmd -Fresh when the demo must start from a clean preview data root.
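The isolation idea behind the preview launcher can be sketched in a few lines: resolve a data root under the system temp directory and optionally wipe it first. The function name preview_data_root and the directory name agent-desk-preview are assumptions for illustration; the real launcher is a .cmd script and may organize things differently.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional, Union


def preview_data_root(
    fresh: bool = False, base: Optional[Union[str, Path]] = None
) -> Path:
    """Return an isolated data directory, recreating it when fresh=True."""
    base_dir = Path(base) if base is not None else Path(tempfile.gettempdir())
    root = base_dir / "agent-desk-preview"  # hypothetical directory name
    if fresh and root.exists():
        shutil.rmtree(root)  # drop old preview state, never the real data root
    root.mkdir(parents=True, exist_ok=True)
    return root
```

Because the preview root never points at the real ~\.agent-desk directory, deleting it for a fresh start cannot touch real conversations.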
Install Node dependencies:
npm install
Run Electron:
npm run desktop
Run the LAN helper:
npm run desktop:lan
Check Electron entrypoints:
npm run desktop:check
Useful focused checks:
python -m compileall src\agent_postbox
python -m unittest discover -s tests -p "test_postbox.py" -q
python -m unittest discover -s tests -p "test_runtime_status.py" -q
python -m unittest discover -s tests -p "test_runtime_state_store.py" -q
python scripts\smoke_agent_desk_web.py
python scripts\smoke_agent_desk_queue_stop.py
python scripts\smoke_agent_desk_handoff.py
python scripts\smoke_agent_desk_public_hygiene.py
python scripts\smoke_agent_desk_preview_launcher.py
Public-preview smoke without spending provider quota:
python scripts\smoke_agent_desk_preview.py
The public-preview release checklist is docs/public-preview-checklist.md.
Refresh the portfolio screenshots from the isolated preview:
node scripts\capture_agent_desk_preview_assets.mjs http://127.0.0.1:8776
Before a live demo, run the real provider send smoke. It starts a local CLI runtime and may use model quota:
python scripts\smoke_agent_desk_send.py --adapter codex
Or run the preview smoke plus a real provider send:
python scripts\smoke_agent_desk_preview.py --real-send --adapter codex
When Chrome is available through the local DevTools endpoint, run the desktop and mobile viewport smokes against the preview URL:
python scripts\smoke_agent_desk_preview.py --desktop-cdp --mobile-cdp --preview-url http://127.0.0.1:8776
The CDP smokes expect a running preview server and a Chrome DevTools endpoint. On Windows, launch Chrome with:
& "$env:ProgramFiles\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9229 --user-data-dir="$env:TEMP\agent-desk-mobile-cdp"
The mobile smoke checks both 390x844 and a narrow 320x720 viewport. Use AGENT_DESK_CDP, AGENT_DESK_DESKTOP_WIDTH, AGENT_DESK_DESKTOP_HEIGHT, AGENT_DESK_MOBILE_WIDTH, and AGENT_DESK_MOBILE_HEIGHT to target a different browser endpoint or viewport when running the lower-level CDP scripts directly.
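As an illustration of how a script might consume these variables, here is a small sketch. The environment variable names come from the text above; the helper names, the desktop default of 1280x800, and the default endpoint URL (built from the port in the Chrome command) are assumptions, not values confirmed by the smoke scripts.

```python
import os
from typing import Mapping, Optional, Tuple


def viewport_from_env(
    kind: str,
    default_width: int,
    default_height: int,
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[int, int]:
    """Read AGENT_DESK_<KIND>_WIDTH / _HEIGHT, falling back to the defaults."""
    env = os.environ if env is None else env
    width = int(env.get(f"AGENT_DESK_{kind}_WIDTH", default_width))
    height = int(env.get(f"AGENT_DESK_{kind}_HEIGHT", default_height))
    return width, height


def cdp_endpoint(env: Optional[Mapping[str, str]] = None) -> str:
    """Read AGENT_DESK_CDP; the fallback URL is an assumed default."""
    env = os.environ if env is None else env
    return env.get("AGENT_DESK_CDP", "http://127.0.0.1:9229")


if __name__ == "__main__":
    print(cdp_endpoint())
    print(viewport_from_env("MOBILE", 390, 844))
    print(viewport_from_env("DESKTOP", 1280, 800))  # assumed desktop default
```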
Full test discovery:
python -m unittest discover -s tests -p "test*.py"
Agent Desk has two layers:
- Product layer: projects, conversations, messages, work sessions, handoffs, pairing, UI state.
- Runtime layer: local adapters that start, resume, stop, and send turns to CLI processes.
The durable product state belongs to Agent Desk. Provider-native sessions are useful for resume and diagnostics, but they are not the public user model.
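One way to picture the boundary between the two layers is an abstract adapter interface that the product layer calls without knowing which CLI sits behind it. This is a sketch only: the class names RuntimeAdapter and EchoAdapter, the method signatures, and the event shapes are hypothetical, and the echo adapter spawns no process.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, Optional


class RuntimeAdapter(ABC):
    """Uniform send/stop/restore surface the product layer could depend on."""

    name: str

    @abstractmethod
    def send(self, conversation_id: str, text: str) -> Iterator[dict]:
        """Run one turn and yield parsed stream events."""

    @abstractmethod
    def stop(self, conversation_id: str) -> bool:
        """Stop the running turn; return True if something was stopped."""

    @abstractmethod
    def restore(self, conversation_id: str) -> Optional[str]:
        """Return a provider session id to resume, or None if there is none."""


class EchoAdapter(RuntimeAdapter):
    """Stand-in adapter for tests: echoes the input instead of calling a CLI."""

    name = "echo"

    def __init__(self) -> None:
        self._sessions: Dict[str, str] = {}

    def send(self, conversation_id: str, text: str) -> Iterator[dict]:
        self._sessions.setdefault(conversation_id, f"echo-{conversation_id}")
        yield {"type": "text", "text": text}
        yield {"type": "done"}

    def stop(self, conversation_id: str) -> bool:
        return False  # echo turns finish immediately, nothing to stop

    def restore(self, conversation_id: str) -> Optional[str]:
        return self._sessions.get(conversation_id)
```

Keeping the provider session id inside the adapter matches the point above: the product layer owns the conversation identity, and the provider-native session is a resume detail.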
The internal Python package is named agent_postbox and ships a lower-level file-backed message CLI (session-send, session-inbox, session-doctor, etc.). It is the storage layer underneath Agent Desk and is useful for headless smoke tests and inter-agent scripting. See docs/legacy-mailbox.md for the historical context, why the package name kept the old prefix, and example commands.
Agent Desk is local-first developer software. It can launch local commands and connect browser clients to local AI runtimes.
- Do not expose it directly to the public internet.
- Use trusted networks or private network tooling.
- Do not put secrets, credentials, cookies, JWTs, or private customer data into public demos.
- Review route files and command automation before enabling them.
- Treat local runtime logs and message stores as private development data.
See SECURITY.md for more notes.
This is a finished portfolio project, not an actively developed product. The first cross-device CLI control loop works end-to-end across the three target CLIs, and the source is public so the design and the code can be read.
It is not trying to win against vendor-native GUIs (Claude Code Desktop, Codex Desktop, etc.) — those have full-time teams and ship faster than a side project ever could. The value here is the demonstration: a single person designing the product layer / runtime adapter pattern / cross-device control surface end to end, with the test discipline (~700 unit tests, isolated preview launcher, public-hygiene smokes) to back it up.
I still use it on my own machine; I may push occasional polish or upstream-rename fixes (as in the recent codewhale rename). Larger feature work is out of scope.
