
ttcd77/agent-desk


Agent Desk

Portfolio project. A local cross-device workbench that controls multiple AI coding CLIs — Codex, Claude, Codewhale — from one browser UI. Built as an end-to-end demonstration of full-stack engineering: Python runtime layer, vanilla-JS web UI, Electron shell, cross-device control over LAN/Tailscale, and real-time parsing of three different CLI stream protocols. Not actively seeking adoption; the source is public so the design and the code can be read.

Agent Desk runs the product layer (projects, conversations, messages, runtime state, queue/cancel/stop, handoffs) locally and treats mature CLIs as execution adapters behind it. You start the server on a Windows workstation and control any of the CLIs from the same machine, a Mac, or a phone browser — the conversation, queue state, and runtime status all survive switching devices.

[Screenshot: Agent Desk desktop preview]

What this project demonstrates

  • Runtime adapter pattern: three vendor CLIs (Codex, Claude, Codewhale) plugged into a uniform send/stop/restore interface, each with its own session model, stream format, and process lifecycle.
  • Real-time stream parsing: line-by-line NDJSON parsers for claude --output-format stream-json, codex --json, and codewhale exec --output-format stream-json, including handling of terminal escape sequences that upstream CLIs embed mid-stream.
  • Cross-device state: durable local message store, runtime status that survives page reloads, project-scoped runtime keys, and pairing-gated non-loopback access.
  • Surgical compatibility shims: for example, when deepseek-tui was renamed to codewhale upstream, a 30-line resolve_deepseek_executable() PATH probe replaced a 47-file hard rename (see commit 6d84254).
  • Operational hygiene: about 700 unit tests, smoke scripts, and an isolated public-preview launcher (Start-AgentDeskPreview.cmd) that uses a separate data root so portfolio screenshots never leak real local history.
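
To make the stream-parsing point concrete, here is a minimal sketch of a line-by-line NDJSON reader that strips terminal escape sequences before decoding. It is not the project's parser: the function name `iter_events` and the `type`/`text` event fields are assumptions for illustration, and the real event schemas of the three CLIs differ.

```python
import json
import re
from typing import Iterable, Iterator

# CSI sequences (e.g. "\x1b[2K") and OSC sequences ended by BEL or ST.
ANSI_ESCAPE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"
)


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per line that holds a JSON object; skip everything else."""
    for raw in lines:
        # Remove escape sequences first so they cannot corrupt the JSON text.
        line = ANSI_ESCAPE.sub("", raw).strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # plain-text chatter such as progress messages
        if isinstance(event, dict):
            yield event


# Hypothetical sample: two JSON events mixed with escape codes and noise.
sample = [
    '\x1b[2K{"type": "text", "text": "hi"}\n',
    "warming up...\n",
    '{"type": "result"}\x1b[0m\n',
]
for event in iter_events(sample):
    print(event["type"])
```

Skipping undecodable lines keeps the UI responsive when a CLI prints something unexpected; a stricter variant could log those lines for diagnostics instead of dropping them.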

Why It Exists

AI coding tools are powerful, but the workflow gets messy fast:

  • Codex, Claude, Codewhale, and other CLIs each have their own session model.
  • Long-running work is hard to follow from another device.
  • A user should not have to understand whether a CLI process is alive, dead, or resumable.
  • Sending context from one agent conversation to another usually means manual copy-paste.
  • Phone access to a local coding agent should feel like continuing the same conversation, not remoting into a terminal.

Agent Desk treats mature CLIs as execution engines and keeps the product layer local: projects, conversations, messages, runtime state, queue state, stop/resume, and handoffs.

As provider-native mobile clients improve, Agent Desk should not compete as a generic mobile chat app for one provider. Its job is the local project control layer: one place to run and resume multiple CLI agents on the user's own workstation, keep the project/session identity stable, and let agents hand work to each other with user confirmation.

Current Scope

The phase-1 target is:

Phone / Mac / Windows browser
  -> Agent Desk web UI
      -> Windows host
          -> local Codex / Claude / Codewhale process

The current golden path is:

  1. Add or select a project folder.
  2. Create or select a conversation.
  3. Choose provider settings such as adapter, model, thinking level, and permission mode.
  4. Send a message to the local CLI runtime.
  5. See running, queued, canceled, stopped, failed, or resumable state.
  6. Reopen the browser and continue the same Agent Desk conversation.

The product plan lives in docs/phase-1-launch-prd.md. A portfolio-ready demo outline lives in docs/portfolio-showcase.md, with the release checklist in docs/public-preview-checklist.md and the current evidence packet in docs/mature-preview-evidence.md.

Features In This Preview

  • Project/folder based conversation list.
  • Local web UI and Electron desktop shell.
  • Windows LAN/Tailscale access for phone and Mac control surfaces.
  • Pairing gate for non-loopback browser access.
  • Durable local message store.
  • Runtime adapters for Codex, Claude, and Codewhale flows.
  • Visible queued message state.
  • Cancel queued messages before they are sent.
  • Stop the current running turn.
  • Restore/resume a resumable conversation from the UI.
  • Import/discover local provider sessions.
  • Inter-conversation handoff protocol for agent collaboration.
  • Focused unit tests around runtime state, queueing, restore, and UI wiring.
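
The queue, cancel, and stop behaviour listed above can be pictured with a small in-memory model. This is only a sketch under assumed names (`ConversationQueue`, `QueuedMessage`, and the state strings are hypothetical); the actual store is durable and lives in the project's own runtime layer.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class QueuedMessage:
    message_id: str
    text: str
    state: str = "queued"


@dataclass
class ConversationQueue:
    pending: Deque[QueuedMessage] = field(default_factory=deque)
    running: Optional[QueuedMessage] = None

    def enqueue(self, message_id: str, text: str) -> QueuedMessage:
        msg = QueuedMessage(message_id, text)
        self.pending.append(msg)
        return msg

    def cancel(self, message_id: str) -> bool:
        """Cancel a message only while it is still waiting in the queue."""
        for msg in self.pending:
            if msg.message_id == message_id:
                self.pending.remove(msg)
                msg.state = "canceled"
                return True  # return at once; the deque was just mutated
        return False

    def start_next(self) -> Optional[QueuedMessage]:
        """Promote the oldest queued message if no turn is running."""
        if self.running is not None or not self.pending:
            return None
        self.running = self.pending.popleft()
        self.running.state = "running"
        return self.running

    def stop_running(self) -> Optional[QueuedMessage]:
        """Mark the current turn as stopped and free the slot."""
        stopped, self.running = self.running, None
        if stopped is not None:
            stopped.state = "stopped"
        return stopped
```

Keeping "cancel" (for queued messages) separate from "stop" (for the running turn) mirrors the two distinct UI actions in the feature list.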

What It Is Not Yet

  • Not a hosted SaaS.
  • Not a generic mobile chat client for one provider.
  • Not a replacement for Codex, Claude Code, or Codewhale.
  • Not a secure multi-user server.
  • Not a production remote-access gateway.
  • Not a full IDE.

Use it on trusted local networks or through your own private network tooling.

Install

Requirements:

  • Python 3.10+
  • Node.js if you want the Electron desktop shell
  • At least one local AI CLI, depending on the adapter you want to use
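
To check which CLIs are available before starting, a PATH probe along these lines may help. It is a sketch, not the project's resolve_deepseek_executable(): the `CANDIDATES` table and `find_cli` are hypothetical names, and the executable names are taken from the commands quoted earlier in this README.

```python
import shutil
from typing import Callable, Optional

# Candidate executable names per adapter; the first one found wins.
# "deepseek-tui" is kept as a fallback for installs that predate the rename.
CANDIDATES = {
    "codex": ("codex",),
    "claude": ("claude",),
    "codewhale": ("codewhale", "deepseek-tui"),
}


def find_cli(
    adapter: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first matching executable, or None."""
    for name in CANDIDATES[adapter]:
        path = which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    for adapter in CANDIDATES:
        print(f"{adapter}: {find_cli(adapter) or 'not found on PATH'}")
```

The `which` parameter exists only so the lookup can be replaced in tests without touching the real PATH.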

From the repository root:

python -m pip install -e .

Start the local browser UI:

agent-desk-ui

Then open:

http://127.0.0.1:8765

This is the normal local product surface. It uses your real Agent Desk data root, usually ~\.agent-desk, so it can show real local projects, conversations, provider sessions, runtime failures, and logs.

Windows Launchers

Start the desktop shell:

.\Start-AgentDesk.cmd

Start the browser UI for phone/Mac access over LAN or Tailscale:

.\Start-AgentDeskRemote.cmd

The remote launcher prints:

  • the local URL;
  • any detected Tailscale URL;
  • the current pairing code;
  • stdout/stderr log paths.

After pairing once, the browser stores a local token so the same device can keep using Agent Desk without re-entering the code.
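
The pairing flow can be sketched roughly as follows, assuming a six-digit code and an opaque random token. `PairingGate` and its methods are hypothetical names; the real code format, token storage, and expiry rules are not described here and may differ.

```python
import hmac
import ipaddress
import secrets
from typing import Optional


class PairingGate:
    """Loopback clients pass freely; other clients need a paired token."""

    def __init__(self) -> None:
        # Assumed format: six decimal digits, printed by the launcher.
        self.pairing_code = f"{secrets.randbelow(10**6):06d}"
        self._tokens = set()

    def pair(self, code: str) -> Optional[str]:
        """Exchange a correct pairing code for a long-lived device token."""
        supplied = code.encode()
        expected = self.pairing_code.encode()
        if not hmac.compare_digest(supplied, expected):
            return None
        token = secrets.token_urlsafe(32)
        self._tokens.add(token)
        return token

    def is_allowed(self, client_ip: str, token: Optional[str] = None) -> bool:
        if ipaddress.ip_address(client_ip).is_loopback:
            return True  # same machine: no pairing needed
        return token is not None and token in self._tokens
```

A browser on the LAN would call `pair()` once with the printed code, store the returned token locally, and send it with later requests.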

Start an isolated public-preview demo without your real ~\.agent-desk history:

.\Start-AgentDeskPreview.cmd

Then open:

http://127.0.0.1:8776

This is the isolated public-preview surface. The preview launcher uses a separate data directory under %TEMP%, so old local conversations and failed runtime records do not appear in portfolio screenshots or demos. It also seeds a small Demo Workspace with Builder/Reviewer roles and sample files, so the first screen is useful without touching your real local history. Use .\Start-AgentDeskPreview.cmd -Fresh when the demo must start from a clean preview data root.
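
The separation between the real and preview data roots could be modelled like this. It is a sketch under assumptions: `resolve_data_root` and the `agent-desk-preview` directory name are hypothetical, and the launcher's actual layout under %TEMP% is not documented here.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def resolve_data_root(
    preview: bool = False,
    fresh: bool = False,
    temp_dir: Optional[str] = None,
    home: Optional[str] = None,
) -> Path:
    """Pick the data directory for a normal or an isolated preview run."""
    if not preview:
        # Normal surface: the real per-user data root.
        return Path(home or Path.home()) / ".agent-desk"
    # Preview surface: a separate directory under the temp folder.
    root = Path(temp_dir or tempfile.gettempdir()) / "agent-desk-preview"
    if fresh and root.exists():
        shutil.rmtree(root)  # comparable in spirit to the -Fresh switch
    root.mkdir(parents=True, exist_ok=True)
    return root
```

Because the preview branch never reads the home directory, nothing from the real history can appear in a demo, whatever state the real data root is in.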

Desktop Shell

Install Node dependencies:

npm install

Run Electron:

npm run desktop

Run the LAN helper:

npm run desktop:lan

Check Electron entrypoints:

npm run desktop:check

Tests

Useful focused checks:

python -m compileall src\agent_postbox
python -m unittest discover -s tests -p "test_postbox.py" -q
python -m unittest discover -s tests -p "test_runtime_status.py" -q
python -m unittest discover -s tests -p "test_runtime_state_store.py" -q
python scripts\smoke_agent_desk_web.py
python scripts\smoke_agent_desk_queue_stop.py
python scripts\smoke_agent_desk_handoff.py
python scripts\smoke_agent_desk_public_hygiene.py
python scripts\smoke_agent_desk_preview_launcher.py

Public-preview smoke without spending provider quota:

python scripts\smoke_agent_desk_preview.py

The public-preview release checklist is docs/public-preview-checklist.md.

Refresh the portfolio screenshots from the isolated preview:

node scripts\capture_agent_desk_preview_assets.mjs http://127.0.0.1:8776

Before a live demo, run the real provider send smoke. It starts a local CLI runtime and may use model quota:

python scripts\smoke_agent_desk_send.py --adapter codex

Or run the preview smoke plus a real provider send:

python scripts\smoke_agent_desk_preview.py --real-send --adapter codex

When Chrome is available through the local DevTools endpoint, run the desktop and mobile viewport smokes against the preview URL:

python scripts\smoke_agent_desk_preview.py --desktop-cdp --mobile-cdp --preview-url http://127.0.0.1:8776

The CDP smokes expect a running preview server and a Chrome DevTools endpoint. On Windows, launch Chrome from PowerShell with:

& "$env:ProgramFiles\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9229 --user-data-dir="$env:TEMP\agent-desk-mobile-cdp"

The mobile smoke checks both 390x844 and a narrow 320x720 viewport. Use AGENT_DESK_CDP, AGENT_DESK_DESKTOP_WIDTH, AGENT_DESK_DESKTOP_HEIGHT, AGENT_DESK_MOBILE_WIDTH, and AGENT_DESK_MOBILE_HEIGHT to target a different browser endpoint or viewport when running the lower-level CDP scripts directly.

Full test discovery:

python -m unittest discover -s tests -p "test*.py"

Architecture

Agent Desk has two layers:

  • Product layer: projects, conversations, messages, work sessions, handoffs, pairing, UI state.
  • Runtime layer: local adapters that start, resume, stop, and send turns to CLI processes.

The durable product state belongs to Agent Desk. Provider-native sessions are useful for resume and diagnostics, but they are not the public user model.
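
One way to picture the boundary between the two layers is an abstract adapter with the send/stop/restore operations named earlier. The class and method signatures below are assumptions for illustration, not the project's actual interface, and `EchoAdapter` is a stand-in that launches no CLI.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, Optional


class RuntimeAdapter(ABC):
    """What the product layer would need from any CLI runtime."""

    name: str

    @abstractmethod
    def send(self, conversation_id: str, text: str) -> Iterator[dict]:
        """Run one turn and yield parsed stream events."""

    @abstractmethod
    def stop(self, conversation_id: str) -> bool:
        """Stop the running turn; return True if something was stopped."""

    @abstractmethod
    def restore(self, conversation_id: str) -> Optional[str]:
        """Return a provider session id that can be resumed, if any."""


class EchoAdapter(RuntimeAdapter):
    """Stand-in adapter for tests: echoes the text, starts no process."""

    name = "echo"

    def __init__(self) -> None:
        self._sessions: Dict[str, str] = {}

    def send(self, conversation_id: str, text: str) -> Iterator[dict]:
        # Remember a fake provider session so restore() has something to find.
        self._sessions.setdefault(conversation_id, f"echo-{conversation_id}")
        yield {"type": "text", "text": text}
        yield {"type": "done"}

    def stop(self, conversation_id: str) -> bool:
        return False  # nothing long-running to stop

    def restore(self, conversation_id: str) -> Optional[str]:
        return self._sessions.get(conversation_id)
```

The conversation id belongs to the product layer, while the provider session id stays an adapter detail, which matches the point that provider-native sessions are not the public user model.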

Legacy Mailbox

The internal Python package is named agent_postbox and ships a lower-level file-backed message CLI (session-send, session-inbox, session-doctor, etc.). It is the storage layer underneath Agent Desk and is useful for headless smoke tests and inter-agent scripting. See docs/legacy-mailbox.md for the historical context, why the package name kept the old prefix, and example commands.

Security Notes

Agent Desk is local-first developer software. It can launch local commands and connect browser clients to local AI runtimes.

  • Do not expose it directly to the public internet.
  • Use trusted networks or private network tooling.
  • Do not put secrets, credentials, cookies, JWTs, or private customer data into public demos.
  • Review route files and command automation before enabling them.
  • Treat local runtime logs and message stores as private development data.

See SECURITY.md for more notes.

Status

This is a finished portfolio project, not an actively developed product. The first cross-device CLI control loop works end-to-end across the three target CLIs, and the source is public so the design and the code can be read.

It is not trying to win against vendor-native GUIs (Claude Code Desktop, Codex Desktop, etc.). Those have full-time teams and ship faster than a side project ever could. The value here is the demonstration: a single person designing the product layer, the runtime adapter pattern, and the cross-device control surface end to end, with the test discipline (~700 unit tests, isolated preview launcher, public-hygiene smokes) to back it up.

I still use it on my own machine; I may push occasional polish or upstream-rename fixes (as in the recent codewhale rename). Larger feature work is out of scope.
