@ThomasK33 ThomasK33 commented Dec 18, 2025

Implements “sub-workspaces as subagents” by introducing agent Tasks backed by child workspaces spawned via the new task tool.

  • Built-in presets: research, explore
  • Config + UI for max parallel tasks / nesting depth
  • Restart-safe orchestration (queueing, report delivery to parent, auto-resume)
  • Explicit reporting via agent_report + leaf auto-cleanup
  • Sidebar nesting for child workspaces

Validation:

  • make static-check
  • bun test src/node/services/tools/task.test.ts src/node/services/taskService.test.ts

📋 Implementation Plan

🤖 Sub-workspaces as subagents (Mux)

Decisions (confirmed)

  • Lifecycle: auto-delete the subagent workspace after it completes (and after its child tasks complete).
  • Isolation (runtime-aware): create subagent workspaces using the parent workspace’s runtime (runtimeConfig); prefer runtime.forkWorkspace(...) (when implemented) so the child starts from the parent’s branch.
  • Results: when the subagent finishes, it calls agent_report and we post the report back into the parent workspace.
  • Limits (configurable): max parallel subagents + max nesting depth (defaults: 3 parallel, depth 3).
  • Durability: if Mux restarts while tasks are running, tasks resume and the parent awaits existing tasks (no duplicate spawns).
  • Delegation: expose a task tool so any agent workspace can spawn sub-agent tasks (depth-limited).
  • Built-in presets (v1): Research + Explore.

Recommended approach: Workspace Tasks (net +~1700 LoC product code)

Represent each subagent as a Task (as described in subagents.md), implemented as a child workspace plus orchestration.

This keeps the v1 scope small while keeping the API surface task-shaped so we can later reuse it for non-agent tasks (e.g., background bashes).

High-level architecture

flowchart TD
  Parent["Parent workspace"]
  TaskTool["tool: task"]
  Spawn["Task.create(parentId, kind=agent, agentType, prompt)"]
  Child["Child workspace (agent)"]

  ReportTool["tool-call-end: agent_report"]
  Report["Append report message to parent history + emit chat event"]
  Cleanup["Remove child workspace + delete runtime resources"]

  StreamEndNoReport["stream-end (no report)"]
  Reminder["Send reminder: toolPolicy requires agent_report"]

  Parent --> TaskTool --> Spawn --> Child
  Child --> ReportTool --> Report --> Cleanup
  Child --> StreamEndNoReport --> Reminder --> Child

Data model

Alignment with subagents.md (what we’re matching)
  • Agent identity: Claude’s agentId maps cleanly to Mux’s workspaceId for the child workspace.
  • Spawning: Claude’s Task(subagent_type=…, prompt=…) becomes Mux tool task, backed by Task.create({ parentWorkspaceId, kind: "agent", agentType, prompt }).
  • Tool filtering: Claude’s tools/disallowedTools map to Mux’s existing toolPolicy (rules applied in order).
  • Result propagation: Agent tasks use an explicit agent_report tool call (child → parent) plus a backend retry if the tool wasn’t called. (Future: bash tasks can map to existing background bash output, or be unified behind a Task.output API.)
  • Background vs foreground: task({ run_in_background: true, ... }) returns immediately; otherwise the tool blocks until the child calls agent_report (with a timeout).

Extend workspace metadata with optional fields:

  • parentWorkspaceId?: string — enables nesting in the UI
  • agentType?: "research" | "explore" | string — selects an agent preset

(These are optional so existing configs stay valid.)
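A minimal sketch of how the two optional fields might sit on the metadata type. The `WorkspaceMetadataSketch` interface and the `isAgentTaskWorkspace` helper are hypothetical names used for illustration; the actual schemas live in the ORPC schema files and may look different.

```typescript
// Hypothetical shape for illustration; not the real WorkspaceMetadataSchema.
type AgentType = "research" | "explore" | (string & {});

interface WorkspaceMetadataSketch {
  id: string;
  name: string;
  // Both fields are optional so entries written before this change stay valid.
  parentWorkspaceId?: string;
  agentType?: AgentType;
}

// Assumption: a workspace is an agent task only when both fields are present.
function isAgentTaskWorkspace(meta: WorkspaceMetadataSketch): boolean {
  return meta.parentWorkspaceId !== undefined && meta.agentType !== undefined;
}

// A pre-existing workspace without the new fields.
const legacyWorkspace: WorkspaceMetadataSketch = { id: "ws-1", name: "main" };

// A child workspace spawned as a research subagent.
const childWorkspace: WorkspaceMetadataSketch = {
  id: "ws-2",
  name: "agent_research_ws-2",
  parentWorkspaceId: "ws-1",
  agentType: "research",
};
```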

Agent presets (built-in)

Create a small registry of agent presets that define:

  • toolPolicy (enforced)
  • systemPrompt (preset-defined; can replace or append; v1 uses replace so each subagent can fully override the parent’s user instructions)
  • A required reporting mechanism: the agent must call agent_report exactly once when it has a final answer

Implementation detail: for agent task workspaces, treat the preset’s systemPrompt as the effective prompt (internal mode), instead of always appending to the parent workspace’s system message.

Initial presets:

  • Research: allow web_search + web_fetch (and optionally file_read), disallow edits.
  • Explore: allow read-only repo exploration (likely file_read + bash for rg/git), disallow file edits.

Both presets should enable:

  • task (so agents can spawn subagents when useful)
  • agent_report (so leaf tasks have a single, explicit channel for reporting back)

Enforce max nesting depth from settings (default 3) in the backend to prevent runaway recursion.

Note: Mux doesn’t currently have a “grep/glob” tool; Explore will either need bash or we add a future safe-search tool.
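The preset registry described above could be sketched roughly as follows. This version models the tool policy as an allow-list, which is a simplification of my own; the interface, the `PRESETS` map, the helper names, and the placeholder prompts are all assumptions. Only the tool names come from the plan.

```typescript
// Sketch only: registry shape and helper names are assumptions, not Mux's code.
interface AgentPresetSketch {
  agentType: string;
  systemPrompt: string; // placeholder text below, not the real prompts
  promptMode: "replace" | "append"; // v1 uses "replace"
  allowedTools: string[]; // simplification: anything not listed is disallowed
}

// Tools both presets enable per the plan: delegation and explicit reporting.
const SHARED_TOOLS: string[] = ["task", "agent_report"];

const PRESETS = new Map<string, AgentPresetSketch>();
PRESETS.set("research", {
  agentType: "research",
  systemPrompt: "You are a research subagent. Call agent_report exactly once when done.",
  promptMode: "replace",
  allowedTools: ["web_search", "web_fetch"].concat(SHARED_TOOLS),
});
PRESETS.set("explore", {
  agentType: "explore",
  systemPrompt: "You are a read-only exploration subagent. Call agent_report exactly once when done.",
  promptMode: "replace",
  allowedTools: ["file_read", "bash"].concat(SHARED_TOOLS),
});

// Returns undefined for unknown agent types so the caller can reject the spawn.
function getPreset(agentType: string): AgentPresetSketch | undefined {
  return PRESETS.get(agentType);
}

function isToolAllowed(preset: AgentPresetSketch, tool: string): boolean {
  return preset.allowedTools.indexOf(tool) !== -1;
}
```

An allow-list keeps the sketch free of guesses about the names of editing tools; the real toolPolicy is an ordered rule list, so the actual presets would express "disallow edits" differently.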


Implementation steps

1) Schemas + types (IPC boundary)

Net +~50 LoC

  • Extend:
    • WorkspaceMetadataSchema / FrontendWorkspaceMetadataSchema (src/common/orpc/schemas/workspace.ts)
    • WorkspaceConfigSchema (src/common/orpc/schemas/project.ts)
  • Thread the new fields through WorkspaceMetadata / FrontendWorkspaceMetadata types.

2) Persist config (workspace tree + task settings)

Net +~320 LoC

  • Workspace tree fields

    • Ensure config write paths preserve parentWorkspaceId and agentType.
    • Update Config.getAllWorkspaceMetadata() to include the new fields when constructing metadata.
  • Task settings (global; shown in Settings UI)

    • Persist taskSettings in ~/.mux/config.json, e.g.:
      • maxParallelAgentTasks (default 3)
      • maxTaskNestingDepth (default 3)
    • Settings UI
      • Add a small Settings section (e.g. “Tasks”) with two number inputs.
      • Read via api.config.getConfig(); persist via api.config.saveConfig().
      • Clamp to safe ranges (e.g., parallel 1–10, depth 1–5) and show the defaults.
  • Task durability fields (per agent task workspace)

    • Persist a minimal task state for child workspaces (e.g., taskStatus: queued|running|awaiting_report) so we can rehydrate and resume after restart.
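As an illustration of the clamping step, here is a small sketch that normalizes persisted task settings. The bounds are the example ranges from the plan (parallel 1–10, depth 1–5); `normalizeTaskSettings` and `clampInt` are hypothetical helpers, and falling back to the default on non-numeric input is an assumption.

```typescript
interface TaskSettings {
  maxParallelAgentTasks: number;
  maxTaskNestingDepth: number;
}

// Defaults from the plan.
const DEFAULT_TASK_SETTINGS: TaskSettings = {
  maxParallelAgentTasks: 3,
  maxTaskNestingDepth: 3,
};

// Example ranges from the plan; the shipped bounds may differ.
const PARALLEL_RANGE = { min: 1, max: 10 };
const DEPTH_RANGE = { min: 1, max: 5 };

// Non-numeric or non-finite input falls back to the default (assumption).
function clampInt(value: unknown, min: number, max: number, fallback: number): number {
  if (typeof value !== "number" || !Number.isFinite(value)) {
    return fallback;
  }
  return Math.min(max, Math.max(min, Math.round(value)));
}

// Accepts whatever was read from config.json and returns safe settings.
function normalizeTaskSettings(
  raw: Partial<Record<keyof TaskSettings, unknown>> | undefined
): TaskSettings {
  return {
    maxParallelAgentTasks: clampInt(
      raw?.maxParallelAgentTasks,
      PARALLEL_RANGE.min,
      PARALLEL_RANGE.max,
      DEFAULT_TASK_SETTINGS.maxParallelAgentTasks
    ),
    maxTaskNestingDepth: clampInt(
      raw?.maxTaskNestingDepth,
      DEPTH_RANGE.min,
      DEPTH_RANGE.max,
      DEFAULT_TASK_SETTINGS.maxTaskNestingDepth
    ),
  };
}
```

Applying the same function on both the Settings UI write path and the config read path would keep a hand-edited config file from bypassing the limits.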

3) Backend Task API: Task.create

Net +~450 LoC

Add a new task operation (ORPC + service) that is intentionally generic:

  • Task.create({ parentWorkspaceId, kind, ... })
    • Return a task-shaped result: { taskId, kind, status }.

V1: implement kind: "agent" (sub-workspace agent task):

  1. Validate parent workspace exists.
  2. Enforce limits from taskSettings (configurable):
    • Max nesting depth (maxTaskNestingDepth, default 3) by walking the parentWorkspaceId chain.
    • Max parallel agent tasks (maxParallelAgentTasks, default 3) by counting running agent tasks globally (across the app).
    • If parallel limit is reached: persist as status: "queued" and start later (FIFO).
  3. Create a new child workspace ID + generated name (e.g., agent_research_<id>; must match [a-z0-9_-]{1,64}).
  4. Runtime-aware: create the child workspace using the parent workspace’s runtimeConfig (Local/Worktree/SSH).
    • Prefer runtime.forkWorkspace(...) (when implemented) so the child starts from the parent’s branch.
    • Otherwise fall back to runtime.createWorkspace(...) with the same runtime config (no branch isolation).
  5. Write workspace config entry including { parentWorkspaceId, agentType, taskStatus }.
  6. When the task is started, send the initial prompt message into the child workspace.
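The limit checks in step 2 can be sketched as a pure function over the workspace tree. This is one possible reading: depth is counted as the number of ancestors (a top-level workspace has depth 0), and a child is rejected when its depth would exceed `maxTaskNestingDepth`. The `WorkspaceNode`, `nestingDepth`, and `admitAgentTask` names are hypothetical.

```typescript
interface WorkspaceNode {
  id: string;
  parentWorkspaceId?: string;
}

interface Limits {
  maxParallelAgentTasks: number;
  maxTaskNestingDepth: number;
}

// Walks the parentWorkspaceId chain. Throws on a missing parent or a cycle
// instead of looping forever on corrupted config.
function nestingDepth(id: string, byId: Map<string, WorkspaceNode>): number {
  const start = byId.get(id);
  if (!start) throw new Error(`unknown workspace: ${id}`);
  const seen = new Set<string>([start.id]);
  let current: WorkspaceNode = start;
  let depth = 0;
  while (current.parentWorkspaceId !== undefined) {
    const parent = byId.get(current.parentWorkspaceId);
    if (!parent) throw new Error(`missing parent: ${current.parentWorkspaceId}`);
    if (seen.has(parent.id)) throw new Error(`cycle detected at: ${parent.id}`);
    seen.add(parent.id);
    depth += 1;
    current = parent;
  }
  return depth;
}

type Admission =
  | { ok: true; status: "running" | "queued"; childDepth: number }
  | { ok: false; reason: string };

// runningCount is the number of agent tasks currently running across the app.
function admitAgentTask(
  parentId: string,
  byId: Map<string, WorkspaceNode>,
  runningCount: number,
  limits: Limits
): Admission {
  if (!byId.has(parentId)) {
    return { ok: false, reason: "parent workspace not found" };
  }
  const childDepth = nestingDepth(parentId, byId) + 1;
  if (childDepth > limits.maxTaskNestingDepth) {
    return { ok: false, reason: "max nesting depth exceeded" };
  }
  // At the parallel limit the task is persisted as queued and started later.
  const status = runningCount >= limits.maxParallelAgentTasks ? "queued" : "running";
  return { ok: true, status, childDepth };
}

// Example tree: root -> a -> b -> c
const exampleTree = new Map<string, WorkspaceNode>();
exampleTree.set("root", { id: "root" });
exampleTree.set("a", { id: "a", parentWorkspaceId: "root" });
exampleTree.set("b", { id: "b", parentWorkspaceId: "a" });
exampleTree.set("c", { id: "c", parentWorkspaceId: "b" });
```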

Durability / restart:

  • On app startup, rehydrate queued/running tasks from config and resume them:

    • queued tasks are scheduled respecting maxParallelAgentTasks
    • running tasks get a synthetic “Mux restarted; continue + call agent_report” message.
  • Parent await semantics (restart-safe):

    • While a parent workspace has any descendant agent tasks in queued|running|awaiting_report, treat it as “awaiting” and avoid starting new subagent tasks from it.
    • When the final descendant task reports, automatically resume any parent partial stream that was waiting on the task tool call.
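A rough sketch of the startup rehydration decision: given the task states read from config, work out which queued tasks can start and which running tasks need the "Mux restarted" nudge. The `queuedAt` field (used for FIFO order) and the choice to count `awaiting_report` tasks against the parallel limit are assumptions, as are all names here.

```typescript
type TaskStatus = "queued" | "running" | "awaiting_report" | "reported";

interface PersistedTask {
  workspaceId: string;
  taskStatus: TaskStatus;
  queuedAt: number; // hypothetical timestamp used only for FIFO ordering
}

interface ResumePlan {
  nudge: string[]; // running tasks that get the synthetic restart message
  start: string[]; // queued tasks that fit into free slots, oldest first
  stillQueued: string[]; // queued tasks that must keep waiting
}

function planResume(tasks: PersistedTask[], maxParallel: number): ResumePlan {
  // Assumption: tasks awaiting a report still occupy a slot.
  const active = tasks.filter(
    (t) => t.taskStatus === "running" || t.taskStatus === "awaiting_report"
  );
  // filter() returns a copy, so sorting does not mutate the input.
  const queued = tasks
    .filter((t) => t.taskStatus === "queued")
    .sort((a, b) => a.queuedAt - b.queuedAt);
  const freeSlots = Math.max(0, maxParallel - active.length);
  return {
    nudge: tasks.filter((t) => t.taskStatus === "running").map((t) => t.workspaceId),
    start: queued.slice(0, freeSlots).map((t) => t.workspaceId),
    stillQueued: queued.slice(freeSlots).map((t) => t.workspaceId),
  };
}

const persistedExample: PersistedTask[] = [
  { workspaceId: "a", taskStatus: "running", queuedAt: 0 },
  { workspaceId: "b", taskStatus: "queued", queuedAt: 2 },
  { workspaceId: "c", taskStatus: "queued", queuedAt: 1 },
  { workspaceId: "d", taskStatus: "queued", queuedAt: 3 },
  { workspaceId: "e", taskStatus: "reported", queuedAt: 0 },
];
```

Because the plan is computed from persisted state only, running it twice after a crash yields the same answer, which is what prevents duplicate spawns.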

Design note: keep the return type “task-shaped” (e.g., { taskId, kind, status }) so we can later add kind: "bash" tasks that wrap existing background bashes.

4) Tool: task (agents can spawn sub-agents)

Net +~250 LoC

Expose a Claude-like Task tool to the LLM (but backed by Mux workspaces):

  • Tool: task

    • Input (v1): { subagent_type: string, prompt: string, description?: string, run_in_background?: boolean }

    • Behavior:

      • Spawn (or enqueue) a child agent task via Task.create({ parentWorkspaceId: <current workspaceId>, kind: "agent", agentType: subagent_type, prompt, ... }).

      • If run_in_background is true: return immediately { status: "queued" | "running", taskId }.

      • Otherwise: block (potentially across queue + execution) until the child calls agent_report (or timeout) and return { status: "completed", reportMarkdown }.

      • Durability: if this foreground wait is interrupted (app restart), the child task continues; when it reports, we persist the tool output into the parent message and auto-resume the parent stream.

    • Wire-up: add to TOOL_DEFINITIONS + register in getToolsForModel(); inject taskService into ToolConfiguration so the tool can call Task.create.

  • Guardrails

    • Enforce maxTaskNestingDepth and maxParallelAgentTasks from settings (defaults: depth 3, parallel 3).
      • If parallel limit is reached, new tasks are queued and the parent blocks/awaits until a slot is available.
    • Disallow spawning new tasks after the workspace has called agent_report.
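To make the tool contract concrete, here is a sketch of the input and result shapes along with a hand-rolled validator. The field names come from the plan; `parseTaskToolInput` and `isForeground` are hypothetical helpers, and the real tool presumably validates through its schema definitions instead.

```typescript
// Input shape from the plan (v1).
interface TaskToolInput {
  subagent_type: string;
  prompt: string;
  description?: string;
  run_in_background?: boolean;
}

// Background calls return immediately; foreground calls return the report.
type TaskToolResult =
  | { status: "queued" | "running"; taskId: string }
  | { status: "completed"; reportMarkdown: string };

// Validates untrusted tool-call arguments coming from the model.
function parseTaskToolInput(raw: unknown): TaskToolInput {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("task input must be an object");
  }
  const r = raw as Partial<Record<keyof TaskToolInput, unknown>>;
  const subagentType = r.subagent_type;
  const prompt = r.prompt;
  const description = r.description;
  const runInBackground = r.run_in_background;
  if (typeof subagentType !== "string" || subagentType.length === 0) {
    throw new Error("subagent_type must be a non-empty string");
  }
  if (typeof prompt !== "string" || prompt.trim().length === 0) {
    throw new Error("prompt must be a non-empty string");
  }
  if (description !== undefined && typeof description !== "string") {
    throw new Error("description must be a string");
  }
  if (runInBackground !== undefined && typeof runInBackground !== "boolean") {
    throw new Error("run_in_background must be a boolean");
  }
  const input: TaskToolInput = { subagent_type: subagentType, prompt };
  if (typeof description === "string") input.description = description;
  if (typeof runInBackground === "boolean") input.run_in_background = runInBackground;
  return input;
}

// Foreground (blocking) is the default when run_in_background is omitted.
function isForeground(input: TaskToolInput): boolean {
  return input.run_in_background !== true;
}

const backgroundResult: TaskToolResult = { status: "queued", taskId: "task-1" };
```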

5) Enforce preset tool policy + system prompt

Net +~130 LoC

In the backend send/stream path:

  • Compute an effective tool policy:
    • effectivePolicy = [...(options.toolPolicy ?? []), ...presetPolicy]
    • Apply presetPolicy last so callers cannot re-enable restricted tools.
  • System prompt strategy for agent task workspaces (per preset):
    • Replace (default): ignore the parent workspace’s user instructions and use the preset’s systemPrompt as the effective instructions (internal-only agent mode).
    • Implementation: add an internal system-message variant (e.g., "agent") that starts from an empty base prompt (no user custom instructions), then apply preset.systemPrompt.
    • Append (optional): keep the normal workspace system message and append preset instructions.
  • Ensure the preset prompt covers:
    • When/how to delegate via the task tool (available subagent_types).
    • When/how to call agent_report (final answer only; after any spawned tasks complete).
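The ordering argument behind the policy merge can be shown with a simplified model in which each rule names one tool and the last matching rule wins. The `PolicyRule` shape and both helpers are assumptions made for this sketch; Mux's actual toolPolicy rule format may differ.

```typescript
// Simplified rule model (assumption): exact tool name, last match wins.
interface PolicyRule {
  tool: string;
  action: "enable" | "disable";
}

// Preset rules go last so that caller rules cannot re-enable restricted tools.
function mergePolicies(
  callerPolicy: PolicyRule[] | undefined,
  presetPolicy: PolicyRule[]
): PolicyRule[] {
  return [...(callerPolicy ?? []), ...presetPolicy];
}

// Applies rules in order; the final matching rule decides.
function isToolEnabled(tool: string, policy: PolicyRule[], defaultEnabled = true): boolean {
  let enabled = defaultEnabled;
  for (const rule of policy) {
    if (rule.tool === tool) {
      enabled = rule.action === "enable";
    }
  }
  return enabled;
}

// A caller tries to turn bash back on for a preset that disables it.
const callerPolicy: PolicyRule[] = [{ tool: "bash", action: "enable" }];
const presetPolicy: PolicyRule[] = [
  { tool: "bash", action: "disable" },
  { tool: "web_search", action: "enable" },
];
const effectivePolicy = mergePolicies(callerPolicy, presetPolicy);
```

With the order reversed (preset first, caller last), the caller's rule would win, which is exactly the override the merge order is meant to rule out.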

6) Auto-report back + auto-delete (orchestrator)

Net +~450 LoC

Add a small reporting tool + orchestrator that ensures the child reports back explicitly, and make it durable across restarts.

  • Tool: agent_report

    • Input: { reportMarkdown: string, title?: string } (or similar)
    • Execution: no side effects; return { success: true } (the backend uses the tool-call args as the report payload)
    • Wire-up: add to TOOL_DEFINITIONS + register in getToolsForModel() as a non-runtime tool
  • Orchestrator behavior

    • Primary path: handle tool-call-end for agent_report

      1. Validate workspaceId is an agent task workspace and has parentWorkspaceId.
      2. Persist completion (durable):
        • Update child workspace config: taskStatus: "reported" (+ reportedAt).
      3. Deliver report to the parent (durable):
        • Append an assistant message to the parent workspace history (so the user can read the report).
        • If the parent has a partial assistant message containing a pending task tool call, update that tool part from input-available to output-available with { reportMarkdown, title } (like the ask_user_question restart-safe fallback).
        • Emit tool-call-end + workspace.onChat events so the UI updates immediately.
      4. Auto-resume the parent (durable tool call semantics):
        • If the parent has a partial message and no active stream, call workspace.resumeStream(parent) after writing the tool output.
        • Only auto-resume once the parent has no remaining running descendant tasks (so it doesn’t spawn duplicates).
      5. Cleanup:
        • If the task has no remaining child tasks, delete the workspace + runtime resources (branch/worktree if applicable).
        • Otherwise, mark it pending cleanup and delete it once its subtree is gone.
    • Enforcement path: if a stream ends without an agent_report call

      1. Send a synthetic "please report now" message into the child workspace with a toolPolicy that requires only agent_report.
      2. If still missing after one retry, fall back to posting the child's final text parts (last resort) and clean up to avoid hanging sub-workspaces.
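The cleanup rule in step 5 (delete leaves immediately, defer parents until their subtree is gone) could be sketched as below. `TaskNode`, its `reported` flag, and `collectDeletable` are hypothetical; the sketch assumes the caller has already marked the reporting task as reported, and it never deletes a workspace without a parent, since that is a user workspace rather than an agent task.

```typescript
interface TaskNode {
  id: string;
  parentWorkspaceId?: string;
  reported: boolean; // true once agent_report was delivered to the parent
}

// Returns the workspaces that can be deleted now, child first. After the
// reporting task is removed, a reported ancestor that was pending cleanup may
// become childless, so the walk continues upward.
function collectDeletable(reportedId: string, nodes: Map<string, TaskNode>): string[] {
  const remaining = new Map(nodes); // work on a copy; the input is not mutated
  const deletable: string[] = [];
  let currentId: string | undefined = reportedId;
  while (currentId !== undefined) {
    const node = remaining.get(currentId);
    // Stop at unknown ids, unreported tasks, and top-level user workspaces.
    if (!node || !node.reported || node.parentWorkspaceId === undefined) {
      break;
    }
    const hasChildren = Array.from(remaining.values()).some(
      (n) => n.parentWorkspaceId === node.id
    );
    if (hasChildren) {
      break; // stays pending cleanup until its subtree is gone
    }
    deletable.push(node.id);
    remaining.delete(node.id);
    currentId = node.parentWorkspaceId;
  }
  return deletable;
}

// Example: root -> A (reported, pending cleanup) -> { B (reported), C (running) }
const cleanupTree = new Map<string, TaskNode>();
cleanupTree.set("root", { id: "root", reported: false });
cleanupTree.set("A", { id: "A", parentWorkspaceId: "root", reported: true });
cleanupTree.set("B", { id: "B", parentWorkspaceId: "A", reported: true });
cleanupTree.set("C", { id: "C", parentWorkspaceId: "A", reported: false });
```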

7) UI: nested sidebar rows

Net +~100 LoC

  • Update sorting/rendering so child workspaces appear directly below the parent with indentation.
  • Add a small depth prop to WorkspaceListItem and adjust left padding.
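One way to produce the nested ordering is a depth-first flatten that emits each workspace followed by its descendants, tagged with a depth for indentation. The `SidebarWorkspace`, `SidebarRow`, and `flattenWorkspaceTree` names are hypothetical, and the sketch treats a workspace whose parent is missing as a top-level row.

```typescript
interface SidebarWorkspace {
  id: string;
  name: string;
  parentWorkspaceId?: string;
}

interface SidebarRow {
  workspace: SidebarWorkspace;
  depth: number; // 0 for top-level rows; drives left padding
}

// Preserves the input order among siblings and keeps each child directly
// below its parent. Cycles are not handled in this sketch.
function flattenWorkspaceTree(workspaces: SidebarWorkspace[]): SidebarRow[] {
  const ids = new Set(workspaces.map((w) => w.id));
  const childrenOf = new Map<string, SidebarWorkspace[]>();
  const roots: SidebarWorkspace[] = [];

  for (const w of workspaces) {
    const parentId = w.parentWorkspaceId;
    if (parentId !== undefined && ids.has(parentId)) {
      const siblings = childrenOf.get(parentId) ?? [];
      siblings.push(w);
      childrenOf.set(parentId, siblings);
    } else {
      roots.push(w); // orphans (parent already deleted) render at top level
    }
  }

  const rows: SidebarRow[] = [];
  const visit = (w: SidebarWorkspace, depth: number): void => {
    rows.push({ workspace: w, depth });
    for (const child of childrenOf.get(w.id) ?? []) {
      visit(child, depth + 1);
    }
  };
  for (const root of roots) {
    visit(root, 0);
  }
  return rows;
}

const sidebarExample: SidebarWorkspace[] = [
  { id: "p1", name: "feature-a" },
  { id: "p2", name: "feature-b" },
  { id: "c1", name: "agent_research_c1", parentWorkspaceId: "p1" },
  { id: "g1", name: "agent_explore_g1", parentWorkspaceId: "c1" },
  { id: "c2", name: "agent_explore_c2", parentWorkspaceId: "p2" },
];
```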

8) No user-facing launcher (agent-orchestrated only)

Net +~0 LoC

  • Do not add slash commands / command palette actions for spawning tasks.
  • Tasks are launched exclusively via the model calling the task tool from the parent workspace.

9) Tests

~200 LoC tests (not counted in product LoC estimate)

  • Unit test: workspace tree flattening preserves parent→child adjacency.
  • Unit/integration test: task tool spawns/enqueues a child agent task and enforces maxTaskNestingDepth.
  • Unit/integration test: queueing respects maxParallelAgentTasks (extra tasks stay queued until a slot frees).
  • Unit/integration test: agent_report posts report to parent, updates waiting task tool output (restart-safe), and triggers cleanup (and reminder path when missing).
  • Unit test: toolPolicy merge guarantees presets can’t be overridden.

Follow-ups (explicitly out of scope for v1)

  • More presets (Review, Writer). “Writer” likely needs non-auto-delete so the branch/diff persists.
  • Task.create(kind: "bash") tasks that wrap existing background bashes (and optionally render under the parent like agent tasks).
  • Safe “code search” tools (Glob/Grep) to avoid granting bash to Explore.
  • Deeper nesting UX (collapse/expand, depth cap visuals).

Generated with codex cli • Model: gpt-5.2 • Thinking: xhigh

@ThomasK33 force-pushed the codex-cli-subagents branch 3 times, most recently from ce6d41b to 1c1ae0d (December 18, 2025 13:11)
@ThomasK33
Member Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@ThomasK33
Member Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. You're on a roll.

@ThomasK33 force-pushed the codex-cli-subagents branch 6 times, most recently from a7dcf94 to 61a814b (December 19, 2025 15:52)
Change-Id: I98401f98f52a9ba82adc854ef796fa7da0494553
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I08c56fc42bd03e6c403b0ab320d429b5c6950eed
Signed-off-by: Thomas Kosiewski <tk@coder.com>