Coordinator Restraint: Give Copilot Room to Breathe #587

@bradygaster

Summary

The coordinator (Copilot, via squad.agent.md) has been doing too much. Not badly — earnestly. Copilot puts in incredible effort and Squad wouldn't exist without it. But over time, the coordinator prompt has accumulated behaviors that cause it to step into domain work instead of routing to the team. This makes Squad feel less like a team and more like one agent with helpers.

We want to give Copilot room to breathe. Less weight on its shoulders. More trust in the team it orchestrates.

The Problem

The coordinator's identity is built around "What can I launch RIGHT NOW?" — which is great for parallelism but leads to:

  1. Doing domain work inline instead of routing to an agent (e.g., attempting a release without reading the release-process skill)
  2. Filling skill gaps itself instead of suggesting the team expand (e.g., no Java expert? The coordinator wings it rather than recommending onboarding)
  3. Not reading its own playbook: .copilot/skills/ contains foundational process knowledge (release process, git workflow, reviewer protocol) that the coordinator wasn't even told to check (fixed in a recent commit, but the pattern is symptomatic)
  4. Doing anticipatory work inline rather than routing it through the proper agents

What's Good (Don't Touch)

These behaviors are working well and should be preserved exactly as-is:

  • Parallel fan-out — spawn many agents at once. "Decompose broadly" is great. We want MORE teammates involved, not fewer.
  • Background mode default — cheap to collect later
  • "Feels Heard" acknowledgment — users should never see a blank screen
  • Ralph's continuous loop — keep the pipeline moving
  • Task tool requirement — real agents, not role-play
  • Model selection hierarchy — cost-first-unless-code is sound
  • Drop-box pattern — conflict-free shared writes

What Needs to Change

1. Coordinator Identity (Mindset)

Current: "What can I launch RIGHT NOW?" — makes speed the identity
Proposed: Something like "Who already knows how to do this?" — same urgency, but the first instinct is to look before launching

The coordinator should be fast but deferential. Not lazy. Not passive. Just... trusting the team.

2. Pre-Flight Checklist (New Section)

Before any action, the coordinator should:

  1. Check .copilot/skills/ and .squad/skills/ — do I have a playbook for this?
  2. Check routing.md — who handles this domain?
  3. Is this within Direct Mode scope? (status, factual, from-context)
  4. Route to the right agent(s)

This replaces the "launch first, think later" pattern with "think first, launch fast."
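
To make the ordering concrete, here is a minimal sketch of the checklist as a function. Everything in it is an assumption for illustration: the real coordinator is a prompt, not Python, and the `preflight` name, the `PreflightResult` shape, and the dictionary-based skill and routing tables are hypothetical stand-ins for .copilot/skills/, .squad/skills/, and routing.md.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Assumed Direct Mode categories, taken from the checklist's parenthetical.
DIRECT_MODE_KINDS = {"status", "factual", "from-context"}


@dataclass
class PreflightResult:
    skills: List[str] = field(default_factory=list)  # playbooks to read first
    agents: List[str] = field(default_factory=list)  # who the work is routed to
    direct: bool = False                             # coordinator may answer itself


def preflight(domain: str, kind: str,
              skills: Dict[str, List[str]],
              routing: Dict[str, List[str]]) -> PreflightResult:
    """Run the four checklist steps, in order, for a single request."""
    result = PreflightResult()
    # 1. Is there a playbook for this domain?
    result.skills = skills.get(domain, [])
    # 2. Who handles this domain according to the routing table?
    owners = routing.get(domain, [])
    # 3. Only status / factual / from-context requests stay with the coordinator.
    if kind in DIRECT_MODE_KINDS:
        result.direct = True
        return result
    # 4. Everything else is routed. An empty list signals a skill gap.
    result.agents = owners
    return result
```

In this sketch an empty `agents` list on a non-direct request is the signal that no one on the team covers the domain, which is where the Skill Gap Protocol in the next section takes over.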

3. Skill Gap Protocol (New Section)

When a request doesn't match any team member's expertise:

The Squad-opinionated default: Recommend expanding the team.
The user's choice: They can say "just handle it" and the coordinator routes to the closest agent.

Example: "No one on the team specializes in Java. I'd recommend onboarding a Java engineer — want me to cast one? Or I can route this to {closest match} who can likely handle it."

This is philosophically important. A true Squaddie grows the team. The coordinator CAN handle it, but shouldn't assume that's what the user wants.
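
A rough sketch of that default-plus-override behavior follows. The function names, the `SKIP_PHRASES` list, and the `("route" | "cast" | "ask", agent)` return shape are hypothetical; the issue does not specify which phrases count as "just handle it" or how the choice is represented.

```python
from typing import Optional, Tuple

# Assumed phrases that mean "skip the dialog"; the real signal list is unspecified.
SKIP_PHRASES = ("just handle it", "just do it")


def skill_gap_prompt(domain: str, closest_agent: str) -> str:
    """Squad-opinionated default: recommend expanding the team, offer a fallback."""
    return (
        f"No one on the team specializes in {domain}. "
        f"I'd recommend onboarding a {domain} engineer. Want me to cast one? "
        f"Or I can route this to {closest_agent}, who can likely handle it."
    )


def resolve_gap(user_reply: str, closest_agent: str) -> Tuple[str, Optional[str]]:
    """Map the user's reply to an action: route, cast, or ask again."""
    reply = user_reply.strip().lower()
    # The user's choice always wins: route immediately, no ceremony.
    if any(phrase in reply for phrase in SKIP_PHRASES):
        return ("route", closest_agent)
    if reply.startswith("yes") or "cast" in reply:
        return ("cast", None)
    # Anything else is ambiguous, so the coordinator should not assume.
    return ("ask", None)
```

The point of the sketch is the ordering: the skip check comes first, so a user who signals urgency is never held up by the recommendation.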

4. Eager Execution Refinement

The Eager Execution Philosophy section currently says "launch aggressively, collect results later." The word "aggressively" should evolve. The coordinator should still decompose broadly and spawn agents eagerly — but it should never do domain work itself as part of that eagerness.

Keep: Spawning 4 agents in parallel for a multi-domain task
Fix: The coordinator deciding to handle one of those domains itself because it seems faster
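
One way to picture the Keep/Fix distinction is a fan-out helper that launches every domain in parallel but refuses any plan where the coordinator owns a domain. This is only a sketch: `fan_out`, the `spawn` callback, and the `"coordinator"` owner name are hypothetical, and threads stand in for whatever mechanism actually launches agents.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def fan_out(assignments: Dict[str, str],
            spawn: Callable[[str, str], str]) -> Dict[str, str]:
    """Launch one agent per domain in parallel and collect the results.

    assignments maps domain -> agent name; spawn(agent, domain) is an
    assumed launcher that returns the agent's result.
    """
    # Fix: the coordinator may never assign a domain to itself.
    if "coordinator" in assignments.values():
        raise ValueError("coordinator must route domain work, not do it")
    # Keep: every domain is launched at once; results are collected afterwards.
    with ThreadPoolExecutor(max_workers=len(assignments) or 1) as pool:
        futures = {domain: pool.submit(spawn, agent, domain)
                   for domain, agent in assignments.items()}
        return {domain: future.result() for domain, future in futures.items()}
```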

5. Skill-Aware Routing (Partially Done)

A recent commit added .copilot/skills/ to the routing check (previously only .squad/skills/ was referenced). The full list of copilot-level skills includes:

  • release-process — how to ship
  • git-workflow — branch and commit conventions
  • reviewer-protocol — review gates
  • agent-collaboration — multi-agent patterns
  • ci-validation-gates — CI requirements
  • And more

The coordinator should check these BEFORE acting, not after things break.

UX Impact Assessment

These changes trade apparent capability for actual capability. The old coordinator looks competent because it always produces something. The new coordinator admits what it doesn't know, which is better engineering but requires an honest assessment of how users will feel.

Positive UX Changes

  • Better first-pass quality. Agents get skill references in their spawn prompts, so they know the release process, git workflow, etc. before starting. Fewer "oops I didn't know about that" revision loops.
  • Honest gap surfacing. Instead of the coordinator silently winging Java code (and producing garbage), users get: "No one covers Java. Want me to cast an engineer?" Transparent, respectful, and the output quality goes up.
  • Same speed perception. "Feels Heard" stays, parallel fan-out stays, background mode stays. The launch table looks identical. Users still see agents working immediately.

Negative UX Changes (Must Address During Implementation)

  • New interruption point. The Skill Gap Protocol adds an ask_user dialog when no agent matches. For users who liked the coordinator "just handling it" (even badly), this is a ~10-second pause that didn't exist before. Mitigation: If a user says "just do it" or similar urgency signal, route to closest agent immediately — no ceremony. Make the gap dialog fast and skippable.
  • First message latency. The Pre-Flight Checklist reads skills before routing. This adds ~2-3s to the first turn of a session while skills are read alongside team.md and routing.md. Mitigation: Cache aggressively. After the first message, skills are in context — subsequent messages should see zero added latency. The parallel read (skills + team.md + routing.md in one turn) keeps this minimal.
  • Tasks that "just worked" will now stop and ask. A user who said "write me some Terraform" and got mediocre-but-functional output from the coordinator will now get a gap dialog instead. Technically better output, but it feels like a regression until they understand why. Mitigation: The gap dialog should clearly explain the value: "I can cast a Terraform engineer who'll get this right the first time (~30s), or route to {closest agent}." Frame the pause as quality, not obstruction.
  • "Five stages of Squad" friction. This change will be felt most by new users going through the "disdain it, accept it, power through it, live with it, love it" journey. The coordinator admitting gaps may initially read as less capable. Mitigation: Ensure the gap dialog is confident, not apologetic. "No one covers this yet" puts the emphasis on growth, not limitation.
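
The "cache aggressively" mitigation for first-message latency can be sketched as a read-once cache. In practice skills stay in the model's context after the first turn, so this is an analogy rather than the mechanism; `SkillCache` and its injected `loader` are hypothetical names.

```python
from typing import Callable, Dict


class SkillCache:
    """Read each skill at most once per session; later lookups are free."""

    def __init__(self, loader: Callable[[str], str]) -> None:
        self._loader = loader          # assumed function that reads one skill
        self._cache: Dict[str, str] = {}
        self.loads = 0                 # number of real reads, for measurement

    def get(self, name: str) -> str:
        if name not in self._cache:
            # Only the first request for a skill pays the read cost.
            self._cache[name] = self._loader(name)
            self.loads += 1
        return self._cache[name]
```

The `loads` counter gives a cheap way to confirm that only the first message of a session pays for skill reads.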

UX Principles for Implementation

  1. Never apologetic. The gap dialog should feel like a recommendation from a confident team lead, not an error message.
  2. Always skippable. Every new dialog must have a "just do it" escape hatch that routes immediately.
  3. Repeated gaps should self-heal. If the same domain gap triggers twice in a session, the second time should say: "This is the second time we've hit a {domain} gap. Adding a team member would prevent this." Nudge, don't nag.
  4. Speed perception matters more than actual speed. The "Feels Heard" acknowledgment buys goodwill. As long as users see something immediately, 2-3s of pre-flight is invisible.
  5. Measure the before/after. We should track: (a) how often the coordinator was doing domain work itself (before), (b) how often the gap dialog fires (after), (c) whether users choose "cast" vs "route to closest" vs "skip". This tells us if the interruption frequency is acceptable.
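
Principles 3 and 5 can share one small piece of session state. The sketch below is hypothetical (`GapTracker` and its method names are not from the coordinator prompt): it nudges exactly once, on the second gap in a domain, and tallies which option users pick.

```python
from collections import Counter
from typing import Optional

VALID_CHOICES = {"cast", "route", "skip"}


class GapTracker:
    """Per-session record of skill gaps and of what users chose to do."""

    def __init__(self) -> None:
        self.gaps: Counter = Counter()     # domain -> times the gap fired
        self.choices: Counter = Counter()  # cast / route / skip tallies

    def record_gap(self, domain: str) -> Optional[str]:
        """Count a gap; return a nudge only on the second occurrence."""
        self.gaps[domain] += 1
        if self.gaps[domain] == 2:
            return (f"This is the second time we've hit a {domain} gap. "
                    "Adding a team member would prevent this.")
        return None  # first hit, or third and later: nudge, don't nag

    def record_choice(self, choice: str) -> None:
        if choice not in VALID_CHOICES:
            raise ValueError(f"unknown choice: {choice}")
        self.choices[choice] += 1
```

Comparing `gaps` against `choices` over a session is one way to judge whether the interruption frequency is acceptable.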

How We'll Measure Success

  1. Coordinator never does domain work inline — every domain task goes through an agent
  2. Skill gaps trigger team expansion offers — not coordinator workarounds
  3. Skills are read before routing — the playbook is consulted, not ignored
  4. Fan-out stays broad — no reduction in team parallelism
  5. User satisfaction — the coordinator feels helpful without feeling overbearing
  6. Gap dialog frequency — track how often it fires and what users choose (cast/route/skip)
  7. First-message latency — confirm the Pre-Flight Checklist doesn't add perceptible delay

Implementation Approach

  • All changes are to .squad-templates/squad.agent.md (canonical source)
  • sync-templates.mjs propagates to all copies
  • Each change should be documented with old text → new text
  • Rollback = revert the commit. Changes are in one file.

Files

  • .squad-templates/squad.agent.md — the canonical coordinator prompt
  • .copilot/skills/ — copilot-level skills the coordinator should read
  • .squad/skills/ — team-level skills agents should read

Proposals

Detailed proposals from the team:

  • .squad/proposals/coordinator-restraint.md — Flight's architecture proposal (surgical changes, line-by-line)
  • .squad/proposals/prompt-architecture-analysis.md — Procedures' prompt analysis (aggression taxonomy, Pre-Read pattern)

Context

This emerged from a design conversation about the coordinator being "too aggro." The five stages of Squad: disdain it, accept it, power through it, live with it, love it. We're making that journey smoother by giving Copilot less to carry and the team more room to work.

Metadata

Labels

  • enhancement: New feature or improvement
  • go:needs-research: Needs investigation
  • squad: Squad triage inbox — Lead will assign to a member
  • squad:fido: Assigned to FIDO (Quality Owner)
