Benchmark Projects creates standard Agent Zero projects from a bundled benchmark template, launches hidden benchmark runs from the plugin dashboard, and stores run artifacts inside each created project.
- creates benchmark-ready projects under `usr/projects/`
- keeps benchmark orchestration in the plugin, not in agent-visible tools
- launches one task chat per task with an isolated working directory
- stores results under `benchmark_runs/<run_id>/`
- compares runs within the same project
V1 ships with one bundled benchmark preset: `project_b`.
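
The storage layout described above can be sketched in a few lines of Python. This is only an illustration, not the plugin's actual code: the helper names `run_dir` and `list_run_dirs` are hypothetical, and it assumes each project lives in its own directory under `usr/projects/` with a `benchmark_runs/` folder directly inside it.

```python
from pathlib import Path


def run_dir(project_dir: Path, run_id: str) -> Path:
    """Build the path for one run: <project_dir>/benchmark_runs/<run_id>/.

    Hypothetical helper; the layout is assumed from the description above.
    """
    return project_dir / "benchmark_runs" / run_id


def list_run_dirs(project_dir: Path) -> list[Path]:
    """Return the run directories of a single project, sorted by name.

    Returns an empty list if the project has no benchmark_runs/ folder yet.
    """
    runs_root = project_dir / "benchmark_runs"
    if not runs_root.is_dir():
        return []
    return sorted(p for p in runs_root.iterdir() if p.is_dir())
```

Because every run sits under its own project directory, listing `benchmark_runs/` for one project yields exactly the set of runs that can be compared with each other.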