🤖 feat: sub-workspaces as subagents #1219

ThomasK33 · 2025-12-18T11:42:31Z

Implements “sub-workspaces as subagents” by introducing agent Tasks backed by child workspaces spawned via the new task tool.

Built-in presets: research, explore
Config + UI for max parallel tasks / nesting depth
Restart-safe orchestration (queueing, report delivery to parent, auto-resume)
Explicit reporting via agent_report + leaf auto-cleanup
Sidebar nesting for child workspaces

Validation:

make static-check
bun test src/node/services/tools/task.test.ts src/node/services/taskService.test.ts

📋 Implementation Plan

🤖 Sub-workspaces as subagents (Mux)

Decisions (confirmed)

Lifecycle: auto-delete the subagent workspace after it completes (and after its child tasks complete).
Isolation (runtime-aware): create subagent workspaces using the parent workspace’s runtime (runtimeConfig); prefer runtime.forkWorkspace(...) (when implemented) so the child starts from the parent’s branch.
Results: when the subagent finishes, it calls agent_report and we post the report back into the parent workspace.
Limits (configurable): max parallel subagents + max nesting depth (defaults: 3 parallel, depth 3).
Durability: if Mux restarts while tasks are running, tasks resume and the parent awaits existing tasks (no duplicate spawns).
Delegation: expose a task tool so any agent workspace can spawn sub-agent tasks (depth-limited).
Built-in presets (v1): Research + Explore.

Recommended approach: Workspace Tasks (net +~1700 LoC product code)

Represent each subagent as a Task (as described in subagents.md), implemented as a child workspace plus orchestration.

This keeps the v1 scope small while keeping the API surface task-shaped so we can later reuse it for non-agent tasks (e.g., background bashes).

High-level architecture

flowchart TD
  Parent["Parent workspace"]
  TaskTool["tool: task"]
  Spawn["Task.create(parentId, kind=agent, agentType, prompt)"]
  Child["Child workspace (agent)"]

  ReportTool["tool-call-end: agent_report"]
  Report["Append report message to parent history + emit chat event"]
  Cleanup["Remove child workspace + delete runtime resources"]

  StreamEndNoReport["stream-end (no report)"]
  Reminder["Send reminder: toolPolicy requires agent_report"]

  Parent --> TaskTool --> Spawn --> Child
  Child --> ReportTool --> Report --> Cleanup
  Child --> StreamEndNoReport --> Reminder --> Child

Data model

Alignment with subagents.md (what we’re matching)

Agent identity: Claude’s agentId maps cleanly to Mux’s workspaceId for the child workspace.
Spawning: Claude’s Task(subagent_type=…, prompt=…) becomes Mux tool task, backed by Task.create({ parentWorkspaceId, kind: "agent", agentType, prompt }).
Tool filtering: Claude’s tools/disallowedTools maps to Mux’s existing toolPolicy (applied in order).
Result propagation: Agent tasks use an explicit agent_report tool call (child → parent) plus a backend retry if the tool wasn’t called. (Future: bash tasks can map to existing background bash output, or be unified behind a Task.output API.)
Background vs foreground: task({ run_in_background: true, ... }) returns immediately; otherwise the tool blocks until the child calls agent_report (with a timeout).

Extend workspace metadata with optional fields:

parentWorkspaceId?: string — enables nesting in the UI
agentType?: "research" | "explore" | string — selects an agent preset

(These are optional so existing configs stay valid.)
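As a rough illustration of the two optional fields, the sketch below uses a plain interface. The real definitions live in the schema files named later in the plan; `WorkspaceMetadataSketch`, `id`, `name`, and `isAgentTaskWorkspace` are placeholder names, and treating "both fields present" as the marker of an agent task workspace is an assumption.

```typescript
// Illustrative only: every name except parentWorkspaceId and agentType is a placeholder.
interface WorkspaceMetadataSketch {
  id: string;
  name: string;
  /** Set only on child (task) workspaces; enables nesting in the UI. */
  parentWorkspaceId?: string;
  /** Selects an agent preset; kept open-ended so new presets need no type change. */
  agentType?: "research" | "explore" | (string & {});
}

/** Assumption: a workspace counts as an agent task workspace when both fields are set. */
function isAgentTaskWorkspace(ws: WorkspaceMetadataSketch): boolean {
  return ws.parentWorkspaceId !== undefined && ws.agentType !== undefined;
}

// An entry written before this change remains valid because both fields are optional.
const legacy: WorkspaceMetadataSketch = { id: "ws-1", name: "main" };
const child: WorkspaceMetadataSketch = {
  id: "ws-2",
  name: "agent_research_ws-2",
  parentWorkspaceId: "ws-1",
  agentType: "research",
};
```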

Agent presets (built-in)

Create a small registry of agent presets that define:

toolPolicy (enforced)
systemPrompt (preset-defined; can replace or append; v1 uses replace so each subagent can fully override the parent’s user instructions)
A required reporting mechanism: the agent must call agent_report exactly once, when it has a final answer.

Implementation detail: for agent task workspaces, treat the preset’s systemPrompt as the effective prompt (internal mode), instead of always appending to the parent workspace’s system message.

Initial presets:

Research: allow web_search + web_fetch (and optionally file_read), disallow edits.
Explore: allow read-only repo exploration (likely file_read + bash for rg/git), disallow file edits.

Both presets should enable:

task (so agents can spawn subagents when useful)
agent_report (so leaf tasks have a single, explicit channel for reporting back)

Enforce max nesting depth from settings (default 3) in the backend to prevent runaway recursion.

Note: Mux doesn’t currently have a “grep/glob” tool, so Explore will need either bash or a future safe-search tool.
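A preset registry along these lines might look like the sketch below. The rule shape (`ToolPolicyRule`), the `promptMode` field, the `file_edit` tool name, and the prompt strings are all assumptions made for illustration; Mux's actual toolPolicy type and tool names may differ.

```typescript
// Hypothetical rule shape; not taken from the Mux codebase.
type ToolPolicyRule = { tool: string; action: "enable" | "disable" };

interface AgentPreset {
  agentType: string;
  toolPolicy: ToolPolicyRule[];
  systemPrompt: string;
  promptMode: "replace" | "append"; // v1 uses "replace"
}

// Both presets enable delegation and explicit reporting.
const SHARED_RULES: ToolPolicyRule[] = [
  { tool: "task", action: "enable" },
  { tool: "agent_report", action: "enable" },
];

const PRESETS = new Map<string, AgentPreset>([
  ["research", {
    agentType: "research",
    toolPolicy: [
      { tool: "file_edit", action: "disable" }, // "file_edit" is a placeholder name
      { tool: "web_search", action: "enable" },
      { tool: "web_fetch", action: "enable" },
      ...SHARED_RULES,
    ],
    systemPrompt: "Research the question, then call agent_report exactly once.",
    promptMode: "replace",
  }],
  ["explore", {
    agentType: "explore",
    toolPolicy: [
      { tool: "file_edit", action: "disable" }, // placeholder name
      { tool: "file_read", action: "enable" },
      { tool: "bash", action: "enable" }, // needed for rg/git until a safe search tool exists
      ...SHARED_RULES,
    ],
    systemPrompt: "Explore the repository read-only, then call agent_report exactly once.",
    promptMode: "replace",
  }],
]);

/** Unknown agent types return undefined so the caller can reject the spawn. */
function getPreset(agentType: string): AgentPreset | undefined {
  return PRESETS.get(agentType);
}
```

Using a `Map` rather than an object literal avoids accidental hits on inherited keys such as `constructor` when the agent type comes from model output.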

Implementation steps

1) Schemas + types (IPC boundary)

Net +~50 LoC

Extend:
- WorkspaceMetadataSchema / FrontendWorkspaceMetadataSchema (src/common/orpc/schemas/workspace.ts)
- WorkspaceConfigSchema (src/common/orpc/schemas/project.ts)
Thread the new fields through WorkspaceMetadata / FrontendWorkspaceMetadata types.

2) Persist config (workspace tree + task settings)

Net +~320 LoC

Workspace tree fields
- Ensure config write paths preserve parentWorkspaceId and agentType.
- Update Config.getAllWorkspaceMetadata() to include the new fields when constructing metadata.
Task settings (global; shown in Settings UI)
- Persist taskSettings in ~/.mux/config.json, e.g.:
  - maxParallelAgentTasks (default 3)
  - maxTaskNestingDepth (default 3)
- Settings UI
  - Add a small Settings section (e.g. “Tasks”) with two number inputs.
  - Read via api.config.getConfig(); persist via api.config.saveConfig().
  - Clamp to safe ranges (e.g., parallel 1–10, depth 1–5) and show the defaults.
Task durability fields (per agent task workspace)
- Persist a minimal task state for child workspaces (e.g., taskStatus: queued|running|awaiting_report) so we can rehydrate and resume after restart.
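The clamping step could be sketched as follows, using the example ranges and defaults given above. The helper names (`clampInt`, `normalizeTaskSettings`) are illustrative, and falling back to the default for non-numeric input is an assumption.

```typescript
interface TaskSettings {
  maxParallelAgentTasks: number;
  maxTaskNestingDepth: number;
}

const DEFAULT_TASK_SETTINGS: TaskSettings = {
  maxParallelAgentTasks: 3,
  maxTaskNestingDepth: 3,
};

// Example ranges from the plan: parallel 1-10, depth 1-5.
const RANGES = {
  maxParallelAgentTasks: { min: 1, max: 10 },
  maxTaskNestingDepth: { min: 1, max: 5 },
} as const;

/** Rounds to an integer and clamps; non-numeric input falls back to the default. */
function clampInt(value: unknown, min: number, max: number, fallback: number): number {
  if (typeof value !== "number" || !Number.isFinite(value)) return fallback;
  return Math.min(max, Math.max(min, Math.round(value)));
}

/** Normalizes whatever was read from config.json into safe task settings. */
function normalizeTaskSettings(
  raw?: Partial<Record<keyof TaskSettings, unknown>>,
): TaskSettings {
  return {
    maxParallelAgentTasks: clampInt(
      raw?.maxParallelAgentTasks,
      RANGES.maxParallelAgentTasks.min,
      RANGES.maxParallelAgentTasks.max,
      DEFAULT_TASK_SETTINGS.maxParallelAgentTasks,
    ),
    maxTaskNestingDepth: clampInt(
      raw?.maxTaskNestingDepth,
      RANGES.maxTaskNestingDepth.min,
      RANGES.maxTaskNestingDepth.max,
      DEFAULT_TASK_SETTINGS.maxTaskNestingDepth,
    ),
  };
}
```

Running the same normalization on both read and write keeps a hand-edited config file from bypassing the limits enforced in the Settings UI.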

3) Backend Task API: Task.create

Net +~450 LoC
Add a new task operation (ORPC + service) that is intentionally generic:

Task.create({ parentWorkspaceId, kind, ... })
- Return a task-shaped result: { taskId, kind, status }.

V1: implement kind: "agent" (sub-workspace agent task):

Validate parent workspace exists.
Enforce limits from taskSettings (configurable):
- Max nesting depth (maxTaskNestingDepth, default 3) by walking the parentWorkspaceId chain.
- Max parallel agent tasks (maxParallelAgentTasks, default 3) by counting running agent tasks globally (across the app).
- If parallel limit is reached: persist as status: "queued" and start later (FIFO).
Create a new child workspace ID + generated name (e.g., agent_research_<id>; must match [a-z0-9_-]{1,64}).
Runtime-aware: create the child workspace using the parent workspace’s runtimeConfig (Local/Worktree/SSH).
- Prefer runtime.forkWorkspace(...) (when implemented) so the child starts from the parent’s branch.
- Otherwise fall back to runtime.createWorkspace(...) with the same runtime config (no branch isolation).
Write workspace config entry including { parentWorkspaceId, agentType, taskStatus }.
When the task is started, send the initial prompt message into the child workspace.
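The limit checks in Task.create can be sketched as a pure function over the persisted workspace list. This is a minimal sketch under stated assumptions: a top-level workspace has depth 0, only tasks with status `running` count against the parallel limit, and the names `WorkspaceEntry`, `admitAgentTask`, and `agentWorkspaceName` are hypothetical.

```typescript
type TaskStatus = "queued" | "running" | "awaiting_report" | "reported";

interface WorkspaceEntry {
  id: string;
  parentWorkspaceId?: string;
  taskStatus?: TaskStatus;
}

interface TaskLimits {
  maxParallelAgentTasks: number;
  maxTaskNestingDepth: number;
}

type Admission =
  | { ok: true; status: "queued" | "running"; depth: number }
  | { ok: false; reason: string };

/** Walks the parentWorkspaceId chain; a top-level workspace has depth 0. */
function nestingDepth(id: string, byId: Map<string, WorkspaceEntry>): number {
  let depth = 0;
  const seen = new Set<string>();
  let current = byId.get(id);
  while (current !== undefined && current.parentWorkspaceId !== undefined) {
    if (seen.has(current.id)) throw new Error(`Cycle in workspace tree at ${current.id}`);
    seen.add(current.id);
    depth += 1;
    current = byId.get(current.parentWorkspaceId);
  }
  return depth;
}

/** Decides whether a new child task is rejected, queued, or started right away. */
function admitAgentTask(
  parentId: string,
  workspaces: WorkspaceEntry[],
  limits: TaskLimits,
): Admission {
  const byId = new Map<string, WorkspaceEntry>(
    workspaces.map((w): [string, WorkspaceEntry] => [w.id, w]),
  );
  if (!byId.has(parentId)) return { ok: false, reason: "parent workspace not found" };

  const depth = nestingDepth(parentId, byId) + 1; // depth the new child would have
  if (depth > limits.maxTaskNestingDepth) {
    return { ok: false, reason: `nesting depth ${depth} exceeds ${limits.maxTaskNestingDepth}` };
  }

  // Global count across the app, not per parent.
  const running = workspaces.filter((w) => w.taskStatus === "running").length;
  const status = running >= limits.maxParallelAgentTasks ? "queued" : "running";
  return { ok: true, status, depth };
}

/** Generated name constrained to [a-z0-9_-]{1,64}. */
function agentWorkspaceName(agentType: string, id: string): string {
  return `agent_${agentType}_${id}`.toLowerCase().replace(/[^a-z0-9_-]/g, "_").slice(0, 64);
}
```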

Durability / restart:

On app startup, rehydrate queued/running tasks from config and resume them:
- queued tasks are scheduled respecting maxParallelAgentTasks
- running tasks get a synthetic “Mux restarted; continue + call agent_report” message.
Parent await semantics (restart-safe):
- While a parent workspace has any descendant agent tasks in queued|running|awaiting_report, treat it as “awaiting” and avoid starting new subagent tasks from it.
- When the final descendant task reports, automatically resume any parent partial stream that was waiting on the task tool call.
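One way to picture the startup pass is a planner that turns persisted task state into a list of actions. This is only a sketch: `PersistedTask`, `queuedAt` (used here for FIFO ordering), `StartupAction`, and the wording of the restart message are assumptions, and tasks in `awaiting_report` are left untouched because the plan does not say how they are handled.

```typescript
interface PersistedTask {
  taskId: string;
  taskStatus: "queued" | "running" | "awaiting_report" | "reported";
  queuedAt: number; // hypothetical timestamp used for FIFO ordering
}

type StartupAction =
  | { kind: "resume"; taskId: string; message: string }
  | { kind: "start"; taskId: string };

const RESTART_MESSAGE = "Mux restarted; continue and call agent_report when done.";

/**
 * Running tasks keep their slots and get a synthetic nudge; queued tasks
 * fill whatever slots remain, oldest first.
 */
function planStartup(tasks: PersistedTask[], maxParallelAgentTasks: number): StartupAction[] {
  const running = tasks.filter((t) => t.taskStatus === "running");
  const queued = tasks
    .filter((t) => t.taskStatus === "queued")
    .sort((a, b) => a.queuedAt - b.queuedAt);
  const freeSlots = Math.max(0, maxParallelAgentTasks - running.length);

  return [
    ...running.map((t): StartupAction => ({
      kind: "resume",
      taskId: t.taskId,
      message: RESTART_MESSAGE,
    })),
    ...queued.slice(0, freeSlots).map((t): StartupAction => ({
      kind: "start",
      taskId: t.taskId,
    })),
  ];
}
```

Because the plan is derived only from persisted state, running it twice yields the same actions, which fits the "no duplicate spawns" requirement.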

Design note: keep the return type “task-shaped” (e.g., { taskId, kind, status }) so we can later add kind: "bash" tasks that wrap existing background bashes.

4) Tool: `task` (agents can spawn sub-agents)

Net +~250 LoC
Expose a Claude-like Task tool to the LLM (but backed by Mux workspaces):

Tool: task
- Input (v1): { subagent_type: string, prompt: string, description?: string, run_in_background?: boolean }
- Behavior:
  - Spawn (or enqueue) a child agent task via Task.create({ parentWorkspaceId: <current workspaceId>, kind: "agent", agentType: subagent_type, prompt, ... }).
  - If run_in_background is true: return immediately { status: "queued" | "running", taskId }.
  - Otherwise: block (potentially across queue + execution) until the child calls agent_report (or timeout) and return { status: "completed", reportMarkdown }.
  - Durability: if this foreground wait is interrupted (app restart), the child task continues; when it reports, we persist the tool output into the parent message and auto-resume the parent stream.
- Wire-up: add to TOOL_DEFINITIONS + register in getToolsForModel(); inject taskService into ToolConfiguration so the tool can call Task.create.
Guardrails
- Enforce maxTaskNestingDepth and maxParallelAgentTasks from settings (defaults: depth 3, parallel 3).
  - If parallel limit is reached, new tasks are queued and the parent blocks/awaits until a slot is available.
- Disallow spawning new tasks after the workspace has called agent_report.
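The foreground wait could be modeled as an in-process registry of pending promises, as in the sketch below. `ReportWaiters`, `wait`, and `deliver` are hypothetical names, and the sketch assumes at most one waiter per task. A `false` result from `deliver` corresponds to the restart case, where no in-memory waiter exists and the report has to be written into the parent's persisted tool call instead.

```typescript
interface AgentReport {
  reportMarkdown: string;
  title?: string;
}

interface Waiter {
  resolve: (report: AgentReport) => void;
  timer: ReturnType<typeof setTimeout>;
}

class ReportWaiters {
  private readonly waiters = new Map<string, Waiter>();

  /** Foreground path: resolves when the child reports, rejects on timeout. */
  wait(taskId: string, timeoutMs: number): Promise<AgentReport> {
    return new Promise<AgentReport>((resolve, reject) => {
      const timer = setTimeout(() => {
        this.waiters.delete(taskId);
        reject(new Error(`Timed out waiting for agent_report from ${taskId}`));
      }, timeoutMs);
      this.waiters.set(taskId, { resolve, timer });
    });
  }

  /** Returns false when nobody is waiting in this process (e.g. after a restart). */
  deliver(taskId: string, report: AgentReport): boolean {
    const waiter = this.waiters.get(taskId);
    if (waiter === undefined) return false;
    clearTimeout(waiter.timer);
    this.waiters.delete(taskId);
    waiter.resolve(report);
    return true;
  }
}
```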

5) Enforce preset tool policy + system prompt

Net +~130 LoC
In the backend send/stream path:

Compute an effective tool policy:
- effectivePolicy = [...(options.toolPolicy ?? []), ...presetPolicy]
- Apply presetPolicy last so callers cannot re-enable restricted tools.
System prompt strategy for agent task workspaces (per preset):
- Replace (default): ignore the parent workspace’s user instructions and use the preset’s systemPrompt as the effective instructions (internal-only agent mode).
- Implementation: add an internal system-message variant (e.g., "agent") that starts from an empty base prompt (no user custom instructions), then apply preset.systemPrompt.
- Append (optional): keep the normal workspace system message and append preset instructions.
Ensure the preset prompt covers:
- When/how to delegate via the task tool (available subagent_types).
- When/how to call agent_report (final answer only; after any spawned tasks complete).
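The ordering argument can be made concrete with a small evaluator. This sketch assumes a simplified rule shape and "last matching rule wins" semantics, which is consistent with rules being applied in order but is not confirmed against Mux's actual toolPolicy implementation; `ToolPolicyRule`, `mergePolicies`, and `isToolAllowed` are illustrative names.

```typescript
// Hypothetical rule shape; the real toolPolicy type may differ.
type ToolPolicyRule = { tool: string; action: "enable" | "disable" };

/** Caller rules first, preset rules last, so the preset has the final say. */
function mergePolicies(
  callerPolicy: ToolPolicyRule[] | undefined,
  presetPolicy: ToolPolicyRule[],
): ToolPolicyRule[] {
  return [...(callerPolicy ?? []), ...presetPolicy];
}

/** Applies rules in order; the last rule naming the tool decides. */
function isToolAllowed(
  tool: string,
  policy: ToolPolicyRule[],
  defaultAllowed = true,
): boolean {
  let allowed = defaultAllowed;
  for (const rule of policy) {
    if (rule.tool === tool) allowed = rule.action === "enable";
  }
  return allowed;
}

// A caller tries to re-enable edits; the preset's later "disable" still wins.
const effectivePolicy = mergePolicies(
  [{ tool: "file_edit", action: "enable" }],
  [
    { tool: "file_edit", action: "disable" },
    { tool: "agent_report", action: "enable" },
  ],
);
```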

6) Auto-report back + auto-delete (orchestrator)

Net +~450 LoC
Add a small reporting tool + orchestrator that ensures the child reports back explicitly, and make it durable across restarts.

Tool: agent_report
- Input: { reportMarkdown: string, title?: string } (or similar)
- Execution: no side effects; return { success: true } (the backend uses the tool-call args as the report payload)
- Wire-up: add to TOOL_DEFINITIONS + register in getToolsForModel() as a non-runtime tool
Orchestrator behavior
- Primary path: handle tool-call-end for agent_report
  1. Validate workspaceId is an agent task workspace and has parentWorkspaceId.
  2. Persist completion (durable):
    - Update child workspace config: taskStatus: "reported" (+ reportedAt).
  3. Deliver report to the parent (durable):
    - Append an assistant message to the parent workspace history (so the user can read the report).
    - If the parent has a partial assistant message containing a pending task tool call, update that tool part from input-available → output-available with { reportMarkdown, title } (like the ask_user_question restart-safe fallback).
    - Emit tool-call-end + workspace.onChat events so the UI updates immediately.
  4. Auto-resume the parent (durable tool call semantics):
    - If the parent has a partial message and no active stream, call workspace.resumeStream(parent) after writing the tool output.
    - Only auto-resume once the parent has no remaining running descendant tasks (so it doesn’t spawn duplicates).
  5. Cleanup:
    - If the task has no remaining child tasks, delete the workspace + runtime resources (branch/worktree if applicable).
    - Otherwise, mark it pending cleanup and delete it once its subtree is gone.
- Enforcement path: if a stream ends without an agent_report call
  1. Send a synthetic "please report now" message into the child workspace with a toolPolicy that requires only agent_report.
  2. If still missing after one retry, fall back to posting the child's final text parts (last resort) and clean up to avoid hanging sub-workspaces.
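The cleanup rule (delete leaves immediately, defer parents until their subtree is gone) and the one-retry enforcement rule can be sketched as two small functions. The `TaskNode` shape and the function names are assumptions; real cleanup would also remove runtime resources, which this in-memory sketch does not model.

```typescript
interface TaskNode {
  id: string;
  parentId?: string;
  isAgentTask: boolean; // false for the user's top-level workspace
  pendingCleanup: boolean;
}

function hasChildren(id: string, nodes: Map<string, TaskNode>): boolean {
  return Array.from(nodes.values()).some((n) => n.parentId === id);
}

/**
 * Called after a task has reported. Returns the ids deleted, in order.
 * A task with live children is only marked; it is deleted later, when the
 * last child's cleanup cascades upward.
 */
function cleanupAfterReport(reportedId: string, nodes: Map<string, TaskNode>): string[] {
  const deleted: string[] = [];
  const reported = nodes.get(reportedId);
  if (reported === undefined || !reported.isAgentTask) return deleted;

  if (hasChildren(reportedId, nodes)) {
    reported.pendingCleanup = true;
    return deleted;
  }

  let current: TaskNode | undefined = reported;
  while (current !== undefined) {
    nodes.delete(current.id);
    deleted.push(current.id);
    const parent: TaskNode | undefined =
      current.parentId === undefined ? undefined : nodes.get(current.parentId);
    current =
      parent !== undefined && parent.pendingCleanup && !hasChildren(parent.id, nodes)
        ? parent
        : undefined;
  }
  return deleted;
}

/** Enforcement path: one reminder, then fall back to the child's final text. */
function onStreamEndWithoutReport(remindersSent: number): "remind" | "fallback" {
  return remindersSent < 1 ? "remind" : "fallback";
}
```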

7) UI: nested sidebar rows

Net +~100 LoC

Update sorting/rendering so child workspaces appear directly below the parent with indentation.
Add a small depth prop to WorkspaceListItem and adjust left padding.
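The sorting step amounts to a depth-first flattening of the workspace tree. The sketch below is one possible shape for it; `SidebarWorkspace`, `SidebarRow`, and `flattenWorkspaceTree` are illustrative names, and treating a workspace whose parent is missing as a top-level row is an assumption.

```typescript
interface SidebarWorkspace {
  id: string;
  name: string;
  parentWorkspaceId?: string;
}

interface SidebarRow {
  workspace: SidebarWorkspace;
  depth: number; // would be passed to WorkspaceListItem to compute left padding
}

/** Depth-first flattening: each child appears directly below its parent. */
function flattenWorkspaceTree(workspaces: SidebarWorkspace[]): SidebarRow[] {
  const ids = new Set(workspaces.map((w) => w.id));
  const childrenOf = new Map<string, SidebarWorkspace[]>();
  const roots: SidebarWorkspace[] = [];

  for (const ws of workspaces) {
    const parentId = ws.parentWorkspaceId;
    if (parentId === undefined || !ids.has(parentId)) {
      roots.push(ws); // orphans are shown at the top level rather than hidden
    } else {
      const siblings = childrenOf.get(parentId) ?? [];
      siblings.push(ws);
      childrenOf.set(parentId, siblings);
    }
  }

  const rows: SidebarRow[] = [];
  const visit = (ws: SidebarWorkspace, depth: number): void => {
    rows.push({ workspace: ws, depth });
    for (const child of childrenOf.get(ws.id) ?? []) visit(child, depth + 1);
  };
  for (const root of roots) visit(root, 0);
  return rows;
}
```

Input order is preserved among siblings, so any existing sort applied before flattening still determines the order of top-level rows.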

8) No user-facing launcher (agent-orchestrated only)

Net +~0 LoC

Do not add slash commands / command palette actions for spawning tasks.
Tasks are launched exclusively via the model calling the task tool from the parent workspace.

9) Tests

~200 LoC tests (not counted in product LoC estimate)

Unit test: workspace tree flattening preserves parent→child adjacency.
Unit/integration test: task tool spawns/enqueues a child agent task and enforces maxTaskNestingDepth.
Unit/integration test: queueing respects maxParallelAgentTasks (extra tasks stay queued until a slot frees).
Unit/integration test: agent_report posts report to parent, updates waiting task tool output (restart-safe), and triggers cleanup (and reminder path when missing).
Unit test: toolPolicy merge guarantees presets can’t be overridden.

Follow-ups (explicitly out of scope for v1)

More presets (Review, Writer). “Writer” likely needs non-auto-delete so the branch/diff persists.
Task.create(kind: "bash") tasks that wrap existing background bashes (and optionally render under the parent like agent tasks).
Safe “code search” tools (Glob/Grep) to avoid granting bash to Explore.
Deeper nesting UX (collapse/expand, depth cap visuals).

Generated with codex cli • Model: gpt-5.2 • Thinking: xhigh

ThomasK33 · 2025-12-18T14:12:49Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

src/node/services/taskService.ts

ThomasK33 · 2025-12-18T14:46:43Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

src/node/services/taskService.ts

ThomasK33 · 2025-12-18T15:20:50Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

src/node/services/taskService.ts

ThomasK33 · 2025-12-18T16:43:47Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

src/browser/components/Settings/sections/TasksSection.tsx

ThomasK33 · 2025-12-18T17:24:29Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

src/node/services/aiService.ts

ThomasK33 · 2025-12-18T17:44:38Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

src/node/services/taskService.ts

ThomasK33 · 2025-12-18T18:08:12Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

src/node/services/taskService.ts

ThomasK33 · 2025-12-18T19:53:03Z

@codex review

chatgpt-codex-connector · 2025-12-18T20:07:38Z

Codex Review: Didn't find any major issues. You're on a roll.

ThomasK33 force-pushed the codex-cli-subagents branch 3 times, most recently from ce6d41b to 1c1ae0d (December 18, 2025 13:11)

chatgpt-codex-connector bot reviewed Dec 18, 2025

src/node/services/taskService.ts (outdated)

chatgpt-codex-connector bot reviewed Dec 18, 2025

src/node/services/taskService.ts

chatgpt-codex-connector bot reviewed Dec 18, 2025

src/node/services/taskService.ts

chatgpt-codex-connector bot reviewed Dec 18, 2025

src/browser/components/Settings/sections/TasksSection.tsx

ThomasK33 force-pushed the codex-cli-subagents branch from 106c139 to 943b4e6 (December 18, 2025 17:23)

chatgpt-codex-connector bot reviewed Dec 18, 2025

src/node/services/aiService.ts

chatgpt-codex-connector bot reviewed Dec 18, 2025

src/node/services/taskService.ts

chatgpt-codex-connector bot reviewed Dec 18, 2025

src/node/services/taskService.ts

ThomasK33 force-pushed the codex-cli-subagents branch 6 times, most recently from a7dcf94 to 61a814b (December 19, 2025 15:52)

🤖 feat: subagent tasks and reliable report delivery

846aa28

Change-Id: I98401f98f52a9ba82adc854ef796fa7da0494553 Signed-off-by: Thomas Kosiewski <tk@coder.com>

ThomasK33 force-pushed the codex-cli-subagents branch from 61a814b to 846aa28 (December 19, 2025 16:03)

fix: include taskId in test assertion for blocking task result

8e70272

Change-Id: I08c56fc42bd03e6c403b0ab320d429b5c6950eed Signed-off-by: Thomas Kosiewski <tk@coder.com>
