agent_harness turns Agent Zero into a coding-first, DeerFlow-inspired implementation harness.
It adds explicit run modes, checkpointed risky actions, thread-scoped workspace data, memory curation, task-graph orchestration, and a DeerFlow-like core bridge over Agent Zero's existing runtime.
Agent Harness is the closest DeerFlow-style coding harness that fits naturally inside Agent Zero's plugin architecture.
It is designed for:
- deep implementation tasks that benefit from an explicit run lifecycle
- safer autonomous coding with checkpoints before risky actions
- project rule curation instead of silent memory writes
- thread-local uploads, outputs, and scratch workspace data
- structured orchestration for larger coding tasks
- exposing DeerFlow-like backend state without replacing Agent Zero itself
This plugin is not a literal port of the full DeerFlow backend.
Today it does not ship:
- a standalone LangGraph server
- a separate DeerFlow gateway service
- DeerFlow's channel app layer
- DeerFlow's full sandbox/provider stack
Instead, it brings DeerFlow's coding-harness ideas into Agent Zero by using Agent Zero's own plugin hooks, API handlers, tools, and UI.
- DeerFlow-aligned coding modes:
flash,standard,pro,ultra - explicit run state with objective, phase, risk level, failures, verification, and task tracking
- guarded execution with approval checkpoints for risky actions
- manual and queued memory curation for reusable repo rules
- DeerFlow-style thread data roots for
workspace,uploads, andoutputs - optional multi-step task-graph orchestration with parallel subagent dispatch in
ultra - a DeerFlow-like core facade for thread state, configured models, available skills, and memory status
- dashboard UI for current run, task graph, pending checkpoints, memory queue, and recent accepted rules
-
Enable the plugin in Agent Zero.
This plugin is not
always_enabled, so it must be turned on through Agent Zero's plugin controls. It supports global, per-project, and per-agent configuration. -
Open the Agent Harness UI.
The plugin provides:
- a dashboard for active run state and review queues
- a status strip above the chat input when
show_status_uiis enabled - a sidebar entry for quick access
-
Start a run.
For most coding tasks, start in
pro.Use
ultrawhen the work is large enough to benefit from decomposition and parallel subagents. -
Work in chat as usual.
The harness injects workflow instructions into the runtime and tracks the run as the agent moves through inspect, plan, implementation, verification, and completion.
-
Review checkpoints and memory proposals.
Risky actions will pause behind a checkpoint. Reusable rules are captured into a review queue instead of being silently persisted.
-
Verify and complete.
The harness expects tests or other verification to be recorded before a run is completed.
| Mode | Best For | Planning | Subagents | Default Policy |
|---|---|---|---|---|
flash |
small, low-risk coding tasks | usually skipped | disabled | subagent_limit=0, repair_limit=0 |
standard |
moderate single-agent work | optional | disabled | subagent_limit=0, repair_limit=1 |
pro |
most serious coding work | recommended for multi-step tasks | disabled | subagent_limit=0, repair_limit=1 |
ultra |
larger, multi-file, decomposable work | required before multi-file execution | enabled | subagent_limit=3, repair_limit=3 |
Migration note:
- legacy stored values such as
assist,build, andsurgeare normalized to the new DeerFlow-style modes so older saved settings do not break existing chats
Each harness run tracks a concrete coding workflow:
inspectplanimplementverifyrepairsummarizecomplete
The practical behavior depends on mode:
flashfavors the shortest safe pathstandardkeeps the work thoughtful but single-agentproencourages deliberate inspect -> plan -> implement -> verify workultraexpects task decomposition and supports dispatch/collect loops for parallel work
When a task graph exists, the harness tracks:
- pending sub-tasks
- dispatched sub-tasks
- completed sub-tasks
- failed sub-tasks
The dashboard renders both a Mermaid graph and a text fallback view of the graph.
Agent Harness adds review checkpoints before risky behavior.
By default, checkpoints are required for:
- dependency installation commands
- destructive shell commands
- edits to protected files or folders
Default protected paths:
agent.pyinitialize.pyusr/plugins/
The guardrail matcher treats folder-style entries such as usr/plugins/ as directory prefixes, so nested files under that path are protected too.
The plugin also exposes a manual checkpoint tool so the agent can request approval proactively when the action is risky even if it is not caught by a pattern.
Instead of silently persisting every reusable observation, the harness can queue memory proposals for review.
Each proposal includes:
- the candidate rule text
- a reason
- source metadata
- a target scope
Supported scopes:
projectagentglobal
Accepted rules are surfaced back into future harness prompts as accepted project rules.
When workspace support is enabled, the plugin prepares DeerFlow-style thread data directories for each chat context.
If the chat is attached to a project, the data is rooted under that project.
If the chat is not attached to a project, the data is rooted under the chat storage folder.
Layout:
.harness/
threads/
<context_id>/
user-data/
workspace/
uploads/
outputs/
offloads/
runs/
Meaning:
workspace/: thread-local scratch space for implementation workuploads/: files uploaded into the threadoutputs/: generated artifacts for the threadoffloads/: auxiliary markdown offloadsruns/: saved run logs
Project-backed workspaces also ensure the project's .gitignore includes:
.harness/workspace/.harness/offloads/.harness/threads/
The plugin includes several user-facing surfaces:
The dashboard shows:
- current run objective, mode, phase, status, and risk level
- tracked tasks
- task graph progress
- pending checkpoints
- pending memory proposals
- recent accepted rules
- the latest verification result
When enabled, the plugin can display a compact harness status strip above the chat input so the current run remains visible without opening the full dashboard.
The plugin also registers sidebar/status entrypoints so the harness can be reached from the existing Agent Zero UI shell.
The plugin exposes three primary tools inside Agent Zero.
This is the main orchestration tool.
Supported actions:
start: begin a run with a mode, objective, and optional constraintsstatus: summarize the current runphase: move the run to a new phaseplan: submit a task graph for decomposed workdispatch: spawn ready sub-tasks in parallelcollect: collect completed sub-task results and update the graphtask: upsert tracked task itemsverification: record a verification resultfailure: record a bounded failure or repair loopclean: clear workspace/offloads while preserving outputs and run logscomplete: finish the run after verification and task completion
This tool creates a manual review checkpoint with:
- a reason
- a proposed action
- optional tool metadata
- risk level
Use it when the action is risky and should pause for approval even if the guardrail layer has not blocked it automatically.
This tool submits candidate reusable rules into the memory review queue.
Use it when the agent learns a durable repo convention that should be reviewed before being persisted.
If memory_curation_enabled is turned off, proposed rules are accepted immediately instead of being queued.
The plugin uses Agent Zero lifecycle extensions to keep the harness active throughout a run.
Important runtime hooks:
extensions/python/tool_execute_before/_20_harness_guardrails.pyBlocks risky execution and opens checkpoints.extensions/python/tool_execute_after/_20_harness_tool_events.pyTracks tool activity and sub-task outcomes.extensions/python/message_loop_prompts_after/_20_harness_runtime.pyInjects harness instructions into the active prompt.extensions/python/chat_model_call_after/_20_harness_cost.pyTracks token usage and budget information.extensions/python/monologue_start/_20_harness_workspace.pyPrepares thread-scoped workspace roots before work begins.
The plugin exposes DeerFlow-style host state through a local facade.
Core bridge helpers:
helpers/deerflow_core.pyhelpers/deerflow_client.pyhelpers/workspace.py
The bridge currently surfaces:
- thread path summary
- configured models
- available skills
- memory status
- thread uploads
- thread artifacts
This is intended to make Agent Zero feel more like DeerFlow at the harness boundary, even though the underlying runtime is still Agent Zero.
All plugin API routes live under /plugins/agent_harness/.
| Endpoint | Purpose |
|---|---|
run |
start, inspect current run state, stop a run, and submit checkpoint decisions |
state |
dashboard state, pending checkpoints, memory queue, recent rules, latest verification |
memory_queue |
accept or reject queued memory proposals |
deerflow_core |
DeerFlow-like thread, model, skill, and memory summary |
thread_data |
inspect thread data roots or clean them up |
thread_uploads |
list, upload, or delete thread-local uploads |
thread_artifacts |
list generated artifacts or download a specific artifact |
Behavior notes:
runis what the dashboard uses forstart,checkpoint_decide, andstopstateis the dashboard polling endpointthread_uploadsacceptsGETandPOSTand supportsaction=uploadandaction=deletethread_artifactssupports listing artifacts and downloading a requested file viaGETthread_datasupportsaction=statusandaction=cleanup
The plugin ships a default config in default_config.yaml and supports both per-project and per-agent overrides.
| Setting | Default | Meaning |
|---|---|---|
ambient_assist_enabled |
true |
keep lightweight coding guidance active even without an explicit run |
default_deep_mode |
pro |
default mode when starting a deep harness workflow |
memory_curation_enabled |
true |
queue reusable rules for review instead of silently persisting them |
show_status_ui |
true |
display the compact harness status strip |
max_auto_edit_files |
8 |
cap on automatic broad edit fan-out before the task should slow down |
dependency_install_requires_checkpoint |
true |
require approval before install commands |
destructive_actions_require_checkpoint |
true |
require approval before destructive commands |
protected_paths |
see config | files and directory prefixes that always require approval before editing |
mode_policies |
varies by mode | subagent and repair-loop policy per mode |
context_pressure_threshold |
0.7 |
threshold used to classify context pressure |
context_model_window |
128000 |
assumed context window for pressure calculations |
workspace_enabled |
true |
create DeerFlow-style thread workspace roots |
token_budget |
0 |
optional hard budget for usage tracking; 0 means disabled |
cost_tracking_enabled |
true |
track prompt/completion token usage during runs |
Current default mode policies:
flash:subagent_limit=0,repair_limit=0standard:subagent_limit=0,repair_limit=1pro:subagent_limit=0,repair_limit=1ultra:subagent_limit=3,repair_limit=3
The plugin includes local DeerFlow-style assets so it is usable without copying files out of the upstream DeerFlow repository.
Included assets:
Install.mdscripts/check_deerflow_harness.pyscripts/import_deerflow_public_skills.pyskills/public/bootstrapskills/public/find-skills
Use these when you want to:
- verify that the plugin has all expected DeerFlow-style assets
- compare local plugin state against a DeerFlow checkout
- import DeerFlow public skills into Agent Zero
Basic parity and plugin checks:
./.venv/bin/python usr/plugins/agent_harness/scripts/check_deerflow_harness.py
./.venv/bin/pytest -q tests/test_agent_harness_plugin.py tests/test_harness_*.pyVerify against a local DeerFlow checkout:
./.venv/bin/python usr/plugins/agent_harness/scripts/check_deerflow_harness.py --source /path/to/deer-flowOptional DeerFlow public skill import:
./.venv/bin/python usr/plugins/agent_harness/scripts/import_deerflow_public_skills.py --source /path/to/deer-flowFor most users, this flow works well:
- Enable the plugin for the current project or agent.
- Start a
prorun for normal coding work. - Switch to
ultraonly when the task is large enough to justify decomposition. - Let checkpoints slow you down before risky actions instead of bypassing them.
- Review memory proposals so durable project rules stay clean and intentional.
- Treat
outputs/as the canonical place for thread-local generated artifacts. - Run verification before completing the harness run.
Make sure the plugin is enabled for the current scope and that the current chat context exists.
No run exists until one is started. The dashboard can still load, but run-specific controls will remain idle until start is called.
Check the pending checkpoint queue. The block may be caused by:
- a dependency installation command
- a destructive command
- a protected path edit
Remember that uploads and outputs are thread-scoped, not global. A different chat context will have a different thread data root.
This plugin gets close at the harness layer, but full parity would require Agent Zero core work beyond the plugin boundary, especially around LangGraph-style execution and a separate gateway/runtime service.
If you are extending the plugin itself, these are the most important files:
plugin.yamldefault_config.yamlapi/run.pyapi/state.pyapi/memory_queue.pyapi/deerflow_core.pyapi/thread_data.pyapi/thread_uploads.pyapi/thread_artifacts.pytools/harness_run.pytools/harness_checkpoint.pytools/harness_memory_propose.pyhelpers/deerflow_core.pyhelpers/deerflow_client.pyhelpers/workspace.pywebui/dashboard.htmlwebui/config.htmlwebui/harness-store.js
Install.mddocs/plans/2026-04-01-deerflow-core-parity-design.md
If you want the design rationale behind the DeerFlow parity bridge, read the design doc. If you want the operational bootstrap and verification path, read Install.md.