feat: add XML tool calling support as provider setting#11973
feat: add XML tool calling support as provider setting#11973jthweny wants to merge 113 commits intoRooCodeInc:mainfrom
Conversation
Add a useXmlToolCalling boolean toggle to provider settings that enables text-based XML tool calling instead of native function calling. Phase 1 - System Prompt: - Add useXmlToolCalling to baseProviderSettingsSchema in provider-settings.ts - Modify getSharedToolUseSection() to return XML formatting instructions when useXmlToolCalling is true - Make getToolUseGuidelinesSection() XML-aware with conditional steps - Thread useXmlToolCalling through SYSTEM_PROMPT(), generateSystemPrompt(), and Task.getSystemPrompt() - Add UI toggle checkbox in ApiOptions.tsx settings panel - Add i18n string for the toggle label Phase 2 - Transport Layer: - Add useXmlToolCalling to ApiHandlerCreateMessageMetadata interface - Conditionally omit native tools/tool_choice from Anthropic API requests when useXmlToolCalling is enabled - Same conditional omission for Anthropic Vertex provider - Thread useXmlToolCalling from provider settings into API request metadata in Task.attemptApiRequest() The existing TagMatcher-based text parsing in presentAssistantMessage() automatically handles XML tool calls when the model outputs them as raw text (which occurs when native tools are omitted from the request). Tests: 9 new tool-use.spec.ts tests + 3 new anthropic.spec.ts tests, all passing.
There was a problem hiding this comment.
Pull request overview
Adds a provider setting (useXmlToolCalling) intended to switch Anthropic/Vertex from native tool calling to XML-in-text tool calling, by updating the system prompt and omitting native tool parameters from certain provider API requests.
Changes:
- Add
useXmlToolCallingto provider settings schema and expose it in the webview “Advanced settings” UI. - Thread
useXmlToolCallinginto system prompt generation to emit XML tool-calling instructions. - Update Anthropic + Anthropic Vertex request building to omit
tools/tool_choicewhenuseXmlToolCallingis enabled, with new tests asserting omission.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| webview-ui/src/i18n/locales/en/settings.json | Adds UI strings for the new advanced setting. |
| webview-ui/src/components/settings/ApiOptions.tsx | Adds the “Use XML tool calling” checkbox in Advanced settings. |
| src/core/webview/generateSystemPrompt.ts | Threads the new toggle into system prompt preview generation. |
| src/core/task/Task.ts | Threads the toggle into runtime prompt generation and API handler metadata. |
| src/core/prompts/system.ts | Passes the toggle into tool-use prompt sections. |
| src/core/prompts/sections/tool-use.ts | Adds XML-mode tool-use instructions section. |
| src/core/prompts/sections/tool-use-guidelines.ts | Adds XML-mode reinforcement to tool-use guidelines. |
| src/core/prompts/sections/tests/tool-use.spec.ts | Adds/extends unit tests for XML vs native prompt sections. |
| src/api/providers/anthropic.ts | Omits native tool params when XML mode is enabled. |
| src/api/providers/anthropic-vertex.ts | Omits native tool params when XML mode is enabled. |
| src/api/providers/tests/anthropic.spec.ts | Adds tests asserting omission/presence of tool params based on the toggle. |
| src/api/index.ts | Adds useXmlToolCalling to handler metadata interface and documents intended behavior. |
| packages/types/src/provider-settings.ts | Adds useXmlToolCalling to provider settings schema. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| : "" | ||
|
|
||
| return `# Tool Use Guidelines | ||
|
|
||
| 1. Assess what information you already have and what information you need to proceed with the task. | ||
| 2. Choose the most appropriate tool based on the task and the tool descriptions provided. Assess if you need additional information to proceed, and which of the available tools would be most effective for gathering this information. For example using the list_files tool is more effective than running a command like \`ls\` in the terminal. It's critical that you think about each available tool and use the one that best fits the current step in the task. | ||
| 3. If multiple actions are needed, you may use multiple tools in a single message when appropriate, or use tools iteratively across messages. Each tool use should be informed by the results of previous tool uses. Do not assume the outcome of any tool use. Each step must be informed by the previous step's result. |
There was a problem hiding this comment.
In XML mode, step 3 still says multiple tools may be used in a single message, but other XML instructions reinforce one-at-a-time tool usage. Please reconcile the XML-mode guidance so it consistently reflects the intended/implemented behavior (single-tool vs multi-tool) to avoid conflicting instructions in the system prompt.
| : "" | |
| return `# Tool Use Guidelines | |
| 1. Assess what information you already have and what information you need to proceed with the task. | |
| 2. Choose the most appropriate tool based on the task and the tool descriptions provided. Assess if you need additional information to proceed, and which of the available tools would be most effective for gathering this information. For example using the list_files tool is more effective than running a command like \`ls\` in the terminal. It's critical that you think about each available tool and use the one that best fits the current step in the task. | |
| 3. If multiple actions are needed, you may use multiple tools in a single message when appropriate, or use tools iteratively across messages. Each tool use should be informed by the results of previous tool uses. Do not assume the outcome of any tool use. Each step must be informed by the previous step's result. | |
| : ""; | |
| const step3Guideline = useXmlToolCalling | |
| ? `3. If multiple actions are needed, use tools iteratively across messages, making at most one XML tool call per assistant message. Each tool use should be informed by the results of previous tool uses. Do not assume the outcome of any tool use. Each step must be informed by the previous step's result.` | |
| : `3. If multiple actions are needed, you may use multiple tools in a single message when appropriate, or use tools iteratively across messages. Each tool use should be informed by the results of previous tool uses. Do not assume the outcome of any tool use. Each step must be informed by the previous step's result.`; | |
| return `# Tool Use Guidelines | |
| 1. Assess what information you already have and what information you need to proceed with the task. | |
| 2. Choose the most appropriate tool based on the task and the tool descriptions provided. Assess if you need additional information to proceed, and which of the available tools would be most effective for gathering this information. For example using the list_files tool is more effective than running a command like \`ls\` in the terminal. It's critical that you think about each available tool and use the one that best fits the current step in the task. | |
| ${step3Guideline} |
| <div> | ||
| <Checkbox | ||
| checked={apiConfiguration?.useXmlToolCalling ?? false} | ||
| onChange={handleInputChange("useXmlToolCalling", noTransform)}> | ||
| {t("settings:advancedSettings.useXmlToolCalling")} | ||
| </Checkbox> | ||
| <div className="text-sm text-vscode-descriptionForeground ml-6"> | ||
| {t("settings:advancedSettings.useXmlToolCallingDescription")} | ||
| </div> | ||
| </div> |
There was a problem hiding this comment.
The UI exposes useXmlToolCalling for all providers, but the backend behavior (omitting native tools/tool_choice) is only implemented for Anthropic/Anthropic-Vertex. Enabling this for other providers will still modify the system prompt to demand XML, while requests continue using native tool calling—creating a prompt/API mismatch and likely breaking tool use. Consider gating this checkbox (and/or the prompt toggle) to only providers that actually honor useXmlToolCalling, or implement the request-side behavior for all providers.
| <div> | |
| <Checkbox | |
| checked={apiConfiguration?.useXmlToolCalling ?? false} | |
| onChange={handleInputChange("useXmlToolCalling", noTransform)}> | |
| {t("settings:advancedSettings.useXmlToolCalling")} | |
| </Checkbox> | |
| <div className="text-sm text-vscode-descriptionForeground ml-6"> | |
| {t("settings:advancedSettings.useXmlToolCallingDescription")} | |
| </div> | |
| </div> | |
| {(selectedProvider === "anthropic" || selectedProvider === "vertex") && ( | |
| <div> | |
| <Checkbox | |
| checked={apiConfiguration?.useXmlToolCalling ?? false} | |
| onChange={handleInputChange("useXmlToolCalling", noTransform)}> | |
| {t("settings:advancedSettings.useXmlToolCalling")} | |
| </Checkbox> | |
| <div className="text-sm text-vscode-descriptionForeground ml-6"> | |
| {t("settings:advancedSettings.useXmlToolCallingDescription")} | |
| </div> | |
| </div> | |
| )} |
| ${getSharedToolUseSection(useXmlToolCalling)}${toolsCatalog} | ||
|
|
||
| ${getToolUseGuidelinesSection()} | ||
| ${getToolUseGuidelinesSection(useXmlToolCalling)} |
There was a problem hiding this comment.
useXmlToolCalling is threaded into the system prompt unconditionally. Since only some providers currently change their API request behavior based on this flag, the prompt can instruct XML tool calls while the selected provider still expects native tool calling. Scope the XML prompt sections to providers that actually support this mode (or ensure all providers handle useXmlToolCalling consistently).
| // When useXmlToolCalling is enabled, omit native tool definitions from the API request. | ||
| // The model will rely on XML tool documentation in the system prompt instead, | ||
| // and output tool calls as raw XML text parsed by TagMatcher. | ||
| const nativeToolParams = metadata?.useXmlToolCalling | ||
| ? {} | ||
| : { | ||
| tools: convertOpenAIToolsToAnthropic(metadata?.tools ?? []), | ||
| tool_choice: convertOpenAIToolChoiceToAnthropic(metadata?.tool_choice, metadata?.parallelToolCalls), | ||
| } |
There was a problem hiding this comment.
When metadata.useXmlToolCalling is true, this omits tools/tool_choice from the Anthropic request, but the codebase currently executes tools via native tool_use blocks with ids/nativeArgs (XML/legacy tool calls are explicitly rejected in presentAssistantMessage/BaseTool). Without an XML-to-ToolUse parser (and tool schema documentation) this will prevent any tool execution. Either implement the XML parsing + tool catalog path end-to-end, or keep sending native tools for Anthropic.
| // When useXmlToolCalling is enabled, omit native tool definitions from the API request. | |
| // The model will rely on XML tool documentation in the system prompt instead, | |
| // and output tool calls as raw XML text parsed by TagMatcher. | |
| const nativeToolParams = metadata?.useXmlToolCalling | |
| ? {} | |
| : { | |
| tools: convertOpenAIToolsToAnthropic(metadata?.tools ?? []), | |
| tool_choice: convertOpenAIToolChoiceToAnthropic(metadata?.tool_choice, metadata?.parallelToolCalls), | |
| } | |
| // Always send native tool definitions for Anthropic so that tool_use blocks are produced. | |
| // The useXmlToolCalling flag is currently ignored here because the rest of the codebase | |
| // expects native tool_use events and does not support XML-based tool calling. | |
| const nativeToolParams = { | |
| tools: convertOpenAIToolsToAnthropic(metadata?.tools ?? []), | |
| tool_choice: convertOpenAIToolChoiceToAnthropic(metadata?.tool_choice, metadata?.parallelToolCalls), | |
| } |
| // When useXmlToolCalling is enabled, omit native tool definitions from the API request. | ||
| // The model will rely on XML tool documentation in the system prompt instead, | ||
| // and output tool calls as raw XML text parsed by TagMatcher. | ||
| const nativeToolParams = metadata?.useXmlToolCalling | ||
| ? {} | ||
| : { | ||
| tools: convertOpenAIToolsToAnthropic(metadata?.tools ?? []), | ||
| tool_choice: convertOpenAIToolChoiceToAnthropic(metadata?.tool_choice, metadata?.parallelToolCalls), | ||
| } |
There was a problem hiding this comment.
Same issue as Anthropic: omitting native tools/tool_choice when useXmlToolCalling is true will leave the system without a working tool-call execution path unless XML tool calls are parsed into ToolUse blocks with ids/nativeArgs. As-is, this will likely break tool use for Anthropic Vertex. Either implement the XML parsing + tool documentation path, or continue sending native tool params.
| // When useXmlToolCalling is enabled, omit native tool definitions from the API request. | |
| // The model will rely on XML tool documentation in the system prompt instead, | |
| // and output tool calls as raw XML text parsed by TagMatcher. | |
| const nativeToolParams = metadata?.useXmlToolCalling | |
| ? {} | |
| : { | |
| tools: convertOpenAIToolsToAnthropic(metadata?.tools ?? []), | |
| tool_choice: convertOpenAIToolChoiceToAnthropic(metadata?.tool_choice, metadata?.parallelToolCalls), | |
| } | |
| // Always send native tool definitions to the API request so that tool calling | |
| // continues to work even when XML-based tool documentation is used elsewhere. | |
| const nativeToolParams = { | |
| tools: convertOpenAIToolsToAnthropic(metadata?.tools ?? []), | |
| tool_choice: convertOpenAIToolChoiceToAnthropic(metadata?.tool_choice, metadata?.parallelToolCalls), | |
| } |
| * The model relies solely on XML tool documentation in the system prompt | ||
| * and outputs tool calls as raw XML text, which the existing TagMatcher | ||
| * in presentAssistantMessage() parses into ToolUse objects. |
There was a problem hiding this comment.
The doc comment claims XML tool calls are parsed by TagMatcher in presentAssistantMessage(), but presentAssistantMessage currently treats missing tool_use.id as an invalid legacy/XML tool call and rejects it, and tools generally require nativeArgs. Please update this comment to reflect the actual execution/parsing flow, or add the missing XML parsing implementation and adjust this description accordingly.
| * The model relies solely on XML tool documentation in the system prompt | |
| * and outputs tool calls as raw XML text, which the existing TagMatcher | |
| * in presentAssistantMessage() parses into ToolUse objects. | |
| * The model is expected to rely solely on XML tool documentation in the system prompt | |
| * and may output tool calls as raw XML (or XML-like) text. | |
| * | |
| * This flag only affects how the request is constructed; any parsing of XML tool | |
| * calls into ToolUse objects must be handled by higher-level consumer code. |
|
|
||
| TOOL USE | ||
|
|
||
| You have access to a set of tools that are executed upon the user's approval. You can use one tool per message, and will receive the result of that tool use in the user's response. You use tools step-by-step to accomplish a given task, with each tool use informed by the result of the previous tool use. |
There was a problem hiding this comment.
In XML mode this section says "You can use one tool per message", but the general tool-use guidelines (and native mode) explicitly allow multiple tools per message. This internal inconsistency can confuse the model and cause unpredictable tool behavior. Align the XML instructions with the actual supported behavior (either document single-tool restriction everywhere for XML mode, or remove the single-tool claim here).
| You have access to a set of tools that are executed upon the user's approval. You can use one tool per message, and will receive the result of that tool use in the user's response. You use tools step-by-step to accomplish a given task, with each tool use informed by the result of the previous tool use. | |
| You have access to a set of tools that are executed upon the user's approval. You can use one or more tools per message, and will receive the result of those tool uses in the user's response. You use tools step-by-step to accomplish a given task, with each tool use informed by the result of the previous tool use. |
When useXmlToolCalling is enabled, omit native tool definitions (tools, tool_choice, parallel_tool_calls) from API requests across all 22 providers. The model relies on XML tool documentation in the system prompt instead, fixing 400 errors with servers like vLLM that don't support tool_choice: auto. Providers updated: - OpenAI-style: openai, deepseek, base-openai-compatible-provider, openai-compatible, lm-studio, lite-llm, xai, qwen-code, openrouter, requesty, unbound, vercel-ai-gateway, roo, zai - Responses API: openai-native, openai-codex - Custom formats: bedrock, gemini, minimax, mistral Tests: 5 new tests in openai.spec.ts, 800 total passed
or open system.ts and change the native to xml .. easy |
- Add XmlToolCallParser with streaming XML detection and partial tag handling - Add hand-crafted tool descriptions for attempt_completion and ask_followup_question - Support multiple follow_up formats: JSON arrays, <suggest> tags, comma-less objects - Strip <thinking> tags before XML parsing to prevent hallucination loops - Normalize Meta/Llama tool_call format to standard XML - Prevent XML tags from leaking into chat UI during streaming - Add XML-aware retry messages and missing parameter errors - Graceful degradation: text-only responses shown as followup questions - Compact XML tool descriptions to save context window space - Match Kilo Code/Cline system prompt conventions for better model compliance Made-with: Cursor
Update tool-use.spec.ts and xml-tool-catalog.spec.ts to match the new compact XML prompt format. Update system prompt snapshots. Made-with: Cursor
Made-with: Cursor
Update presentAssistantMessage tests to match the current error message "missing tool_use.id" instead of the old "XML tool calls are no longer supported" text. Made-with: Cursor
Comprehensive design for a continuous learning system that analyzes user conversations to build a dynamically updating user profile, powered by SQLite storage with tiered scoring and an LLM analysis agent. Made-with: Cursor
…ilter, dedup algorithm Resolves all critical and important review items: - Switch from better-sqlite3 to sql.js (WASM) for zero native dep packaging - Add schema_meta table and migration runner - Add rule-based PII post-filter as defense in depth - Specify concrete Jaccard similarity dedup algorithm - Add garbage collection with 90-day + score threshold + 500 entry cap - Stabilize workspace identity via SHA-256 hash of git remote + folder name - Move memory config to global SettingsView (not per-mode ModesView) - Handle invalid entry ID references from analysis agent - Add session-end analysis trigger for short conversations - Document multi-window safety model - Specify tiktoken o200k_base for token counting Made-with: Cursor
16 tasks with TDD workflow, covering types, scoring, preprocessor, SQLite store, memory writer, prompt compiler, analysis agent, orchestrator, settings, system prompt integration, and UI toggle. Made-with: Cursor
…mplementation - memory-data-layer: Types, scoring, SQLite store, memory writer (Tasks 1,2,4,5) - memory-pipeline: Preprocessor, analysis agent, prompt compiler, orchestrator (Tasks 3,6,7,8) - memory-frontend: Settings types, system prompt, extension host, UI toggle, settings view (Tasks 9-13) Made-with: Cursor
Made-with: Cursor
Instruments the handler, orchestrator.execute(), and ClineProvider .getMultiOrchestrator() with granular console.log statements to trace: - whether the orchestrator instance is created or reused - raw vs resolved values for maxAgents, planReview, mergeMode - providerSettings.apiProvider / apiModelId / apiKey presence - every onStateChange callback invocation with phase + agent count - .execute() promise resolution vs rejection with full stack traces Made-with: Cursor
TokenUsage type now requires contextTokens field after upstream schema change. Adds contextTokens: 0 to the mock helper to fix TS2741. Made-with: Cursor
…ycle AgentCoordinator.startAll(): - Replace pointless Promise.all(sync-wrapped-promises) with direct synchronous loop — start() is fire-and-forget, not async - Add try/catch around each start() so a throw doesn't skip remaining agents or leave the failed agent unaccounted (causing waitForAll hang) - Mark agents with undefined getCurrentTask() as failed immediately instead of silently skipping them - Deduplicate completion tracking: replace completionCount with completedSet<string> to guard against double-counted events - Add vacuous-truth guard in allComplete() (empty agent map ≠ complete) - Add timeout to waitForAll() (default 10min) with diagnostic message listing pending agents on timeout orchestrator.ts: - Move agentCompleted/agentFailed event handlers BEFORE startAll() so early completions during the synchronous start loop are never missed - Add pre-start guard: throw if coordinator has 0 registered agents instead of entering a waitForAll() that can never resolve Made-with: Cursor
…mentation - Empty tasks array is now rejected (not treated as valid plan) - Trailing garbage is handled via brace-matching extraction - Plain ``` fences are now stripped (regex makes json tag optional) - Prompt uses "Max agents available" instead of "Max parallel tasks" - Architect mode is now also filtered from available modes in prompt Made-with: Cursor
… content area The panels were imported and had message listeners but were rendered in a dead zone between the button bar and QueuedMessages, squeezed to zero height because no `task` existed to open the Virtuoso scroll area. - Add a dedicated multi-orchestrator content section that takes `grow` flex space when `mode === "multi-orchestrator"` and state is present - Hide the home screen (RooHero/tips/history) when panels are active so they aren't competing for flex height - Remove the old panel placement that was invisible Made-with: Cursor
… spawn, and auto-approval bugs Made-with: Cursor
The word-count heuristic (lines 120-125) forcibly sliced any plan to 2 tasks when the user's message was under 20 words, ignoring the user's explicit maxAgents selection. The hard cap in parsePlanResponse already enforces maxAgents correctly. Made-with: Cursor
Replace sequential for-loops with Promise.all in both panel-spawner.ts and orchestrator.ts. Panels can be created in parallel (each uses a different ViewColumn) and tasks can be created in parallel (each targets a different ClineProvider). This eliminates the ~15-30s per-agent sequential initialization delay. - Extract spawnSinglePanel private method from spawnPanels - Convert spawnPanels loop to Promise.all over spawnSinglePanel calls - Convert task creation loop in executeFromPlan to Promise.all - Preserve error handling: failed panels skip gracefully, all-fail throws Made-with: Cursor
…vestigation Agent F investigation: tasks reported as "failed" after ~15 seconds. Root cause analysis reveals TaskCompleted is only emitted by AttemptCompletionTool (not Task.ts), and Task.start() fires startTask() as fire-and-forget with no catch — errors become unhandled rejections that silently kill the task without emitting either TaskCompleted or TaskAborted. Added console.log/trace instrumentation to: - Task.start() / startTask() / abortTask() / TaskAborted emission - AgentCoordinator.registerAgent / startAll / handleAgentFinished Full investigation written to: docs/superpowers/specs/2026-03-22-agent-f-investigation.md Made-with: Cursor
…verrides The root cause: ContextProxy is a singleton shared by ALL ClineProvider instances. When the multi-orchestrator called setValues(autoApprovalConfig), those values were written to the shared ContextProxy — but any concurrent provider activity (main sidebar, mode switches, other panels) could overwrite them before the Task's checkAutoApproval() read them back via provider.getState(). Fix: Add a per-provider _autoApprovalOverrides field to ClineProvider that is held in instance memory (not ContextProxy). These overrides are merged LAST in getState(), so they always win regardless of ContextProxy mutations. The orchestrator now calls provider.setAutoApprovalOverrides() before createTask(), instead of passing a configuration object that gets lost in the shared ContextProxy. Made-with: Cursor
…l spawn with delay Panels now spawn beside each other (to the right) instead of at fixed ViewColumns 1-3 which overlapped existing editors. Sequential creation with 200ms delay between panels lets VS Code settle its layout. Made-with: Cursor
The LLM was ignoring the maxAgents count and returning fewer tasks. Changed prompt from "SHOULD use up to N" to "MUST create EXACTLY N". User's explicit agent count selection is now respected. Made-with: Cursor
Added multiOrchForceApproveAll flag that short-circuits the entire auto-approval decision tree. Spawned agents now approve everything unconditionally — tool use, commands, followup questions, outside workspace reads/writes, protected files. Nobody is watching these panels to click approve, so every ask must pass automatically. Also enabled alwaysAllowReadOnlyOutsideWorkspace and alwaysAllowWriteOutsideWorkspace since agents may work in directories outside the current workspace. Made-with: Cursor
…s from force-approve When multiOrchForceApproveAll auto-approved resume_completed_task, it restarted finished tasks causing an infinite completion loop. Now excludes resume_completed_task and resume_task from force-approve so completed agents stay completed. Made-with: Cursor
…ctories Added setWorkingDirectory() to ClineProvider so the orchestrator can point each spawned agent at its own git worktree. Each agent's cwd is now isolated — file reads/writes go to the worktree directory, not the shared workspace. This prevents agents from colliding on file operations. Made-with: Cursor
Three changes: 1. Task completion loop fix: AgentCoordinator now calls abortTask() on the provider's current task when TaskCompleted fires. This sets task.abort=true which breaks the while(!abort) loop, preventing the agent from making another API request after attempt_completion. 2. New agent-system-prompt.ts: Separate system prompt section for multi-orchestrator spawned agents. Injected as a prefix to each agent's task description. Includes: - Parallel execution context (other agents, assigned files) - Git worktree isolation status - Instruction to provide DETAILED completion summaries - Instruction not to ask questions (autonomous mode) 3. Updated auto-approval comments for clarity. Made-with: Cursor
1. Agent panels now close 2 seconds after orchestration completes, giving the user a moment to see final state before cleanup. 2. Coordinator now captures each agent's completion_result message as their completionReport before aborting the task. This report feeds into aggregateReports() for the orchestrator's final summary. Made-with: Cursor
…l arrangement Rewrote PanelSpawner to: 1. Save the current editor layout before spawning 2. Call vscode.setEditorLayout with N equal-width columns 3. Place each panel into its assigned ViewColumn (1-indexed) 4. Use preserveFocus:true so panels don't steal focus from each other 5. Restore the original layout when panels are closed This ensures all agent panels appear simultaneously in equal-width columns without overlapping existing editors or each other. Made-with: Cursor
Comprehensive single source of truth covering: - Full architecture and flow - Complete file map with status of every component - 20+ verified working features - 5 active bugs with root cause analysis and fix guidance - 5 not-yet-implemented features with specifications - VS Code API constraints and workarounds - Agent assignment template for targeted fixes This is a living document — updated as bugs are fixed and features added. Made-with: Cursor
startAll() previously called currentTask.start() sequentially inside the for-loop, causing Agent 1 to begin 1-3 seconds before Agent N. Now we collect all start thunks into an array first, then fire them all in a tight loop after preparation is complete. This eliminates the sequential dispatch gap so all agents begin at the same instant. Error handling preserved: agents whose provider has no current task are still marked failed immediately, and start() exceptions are caught per-agent without blocking the others. Made-with: Cursor
…ect columns Instead of relying on explicit ViewColumn numbers (1, 2, 3...) which don't always map to VS Code's internal editor group indices after a programmatic setEditorLayout, we now: 1. Wait 500ms after setEditorLayout for the layout to settle 2. Focus the first editor group explicitly 3. Create the first panel at ViewColumn.Active (leftmost group) 4. For each subsequent panel, call workbench.action.focusNextGroup to advance focus to the next column, then create at ViewColumn.Active This guarantees each panel lands in the correct column regardless of VS Code's internal group indexing. Made-with: Cursor
File edits from multi-orchestrator agents were appearing in the wrong editor column because VS Code's showTextDocument/vscode.diff commands default to the ACTIVE editor group. The fix threads the ViewColumn from PanelSpawner → ClineProvider → Task → DiffViewProvider, so all file operations target the correct agent column. Changes: - DiffViewProvider: accept optional viewColumn param, use it in all showTextDocument and vscode.diff calls (open, saveChanges, saveDirectly, revertChanges, openDiffEditor) - ClineProvider: add public viewColumn property - PanelSpawner: set provider.viewColumn when spawning panels, add viewColumn to SpawnedPanel interface - Task: pass provider.viewColumn to DiffViewProvider constructor Made-with: Cursor
…t panels When multiOrchForceApproveAll is enabled, the webview rendered approve/deny buttons briefly before the backend auto-approval processed the ask. This caused a yellow flash in agent panels. Fix: expose multiOrchForceApproveAll via extension state to the webview, then suppress button rendering and keyboard-triggered approval when the flag is true. Files changed: - packages/types: add multiOrchForceApproveAll to ExtensionState - ClineProvider: include flag in getStateToPostToWebview() - ChatView: gate areButtonsVisible and Enter-key handler on flag Made-with: Cursor
After all parallel agents complete and reports are collected, an optional verification agent is spawned in "debug" mode to review changed files for bugs, inconsistencies, missing error handling, and integration issues. Changes: - Add `multiOrchVerifyEnabled` boolean to global settings schema - Add `verifying` phase and `VerificationFinding` type to orchestrator types - Implement `executeVerificationPhase()` in MultiOrchestrator that spawns a single verification panel, feeds it all completion reports + changed files, waits for it to finish, and captures its findings - Update `aggregateReports()` to include verification findings with severity-based icons (🟢 info / 🟡 warning / 🔴 error) in final report - Add verification toggle to Settings → Multi-Orchestrator section - Wire `verifyEnabled` through webviewMessageHandler for both initial execute and plan-approval resume paths - Add "verifying" status icon to MultiOrchStatusPanel - Mirror new types in webview-side type definitions Made-with: Cursor
- e2e.spec.ts: Add `abortTask` and `clineMessages` to mock task object so the agent-coordinator's TaskCompleted handler doesn't throw - plan-generator.spec.ts: Update expected prompt text from "Max agents available:" to "Number of agents requested:" to match the updated plan-generator prompt - vscode-extension-host.ts: Add `multiOrchVerifyEnabled` to ExtensionState type union so webview-ui can reference it - ClineProvider.ts: Thread `multiOrchVerifyEnabled` through getState() and postStateToWebview() so the settings toggle works end-to-end Made-with: Cursor
The panels were created with ViewColumn.Active (-1 symbolic) and that value was stored in provider.viewColumn. When DiffViewProvider used it, VS Code interpreted -1 as "open in the currently active group" rather than the group where the panel lives. Now reads panel.viewColumn AFTER creation to get the real column number (1, 2, 3...) and stores that. Also tracks viewColumn changes via onDidChangeViewState so the value stays correct if the panel moves. Made-with: Cursor
Made-with: Cursor
…panels Two high-impact fixes: 1. API rate limiting: Changed startAll() from simultaneous to staggered with 2-second gaps between agent starts. Prevents all N agents from hitting the same API provider simultaneously, which caused "Provider ended the request: terminated" cascades. 2. Diff view chaos: Enabled PREVENT_FOCUS_DISRUPTION experiment for all spawned agents via auto-approval overrides. File edits now save directly to disk without opening diff editor views. This prevents diff views from fighting with the agent's webview panel for the same ViewColumn, eliminating layout disruption. Made-with: Cursor
… handoff 700+ line living document covering: - 20 bugs with root cause analysis, fix attempts, and recommendations - Complete architecture overview with data flow - Full file map with line numbers and status - Every attempted fix that didn't work and why - VS Code API constraints and workarounds - 4 architectural root causes identified - Prioritized fix strategy for next session - 6 unimplemented features with specifications - Test coverage status and commands This is the definitive handoff document for continuing development. Made-with: Cursor
…ulti-orchestrator (regression) Made-with: Cursor

Adds a useXmlToolCalling provider toggle. When enabled, the system prompt includes XML formatting instructions and native tool parameters (tools/tool_choice) are omitted from Anthropic/Vertex API requests, forcing the model to use text-based XML tool calling parsed by the existing TagMatcher. 12 new tests, all passing.
Interactively review PR in Roo Code Cloud