OpenCaw is an open source framework library for AI-assisted development that standardizes instructions, skills, commands, and architecture guidance for tools such as Cursor, Codex, and Claude.
It provides a structured system that allows teams to:
- Standardize AI agent behavior across repositories
- Reuse architecture frameworks and coding standards
- Define reusable commands and skills
- Offload memory and learning fragments into project-local storage
- Maintain consistent enterprise-ready development workflows
OpenCaw is designed so it can be installed into an existing repository as a submodule or cloned tool directory (such as .codex, .cursor, or .claude) while keeping project-specific artifacts separate from the shared instruction system.
- Install
- Examples
- Contributing
- Architecture Frameworks
- Roles
- Role-Skill Bindings
- Sub-Agent Orchestration
- Goals
- Skills
- Commands
- Skills & Commands Guide
- Validation
- Task Management
- AI Memory System
- License
Before installing OpenCaw in your project, you must first fork the repository.
This ensures:
- you control updates
- you can modify roles, skills, or commands
- upstream updates can be merged safely
- any enterprise security policies are satisfied
Visit:
https://github.com/TimothyMeadows/OpenCaw
Click Fork and create a fork under your GitHub account or organization.
Example fork location:
https://github.com/<your-org>/OpenCaw
After forking, use your fork URL in all installation commands instead of the upstream repository.
OpenCaw can be installed in an existing repository in two primary ways.
Submodules allow the instruction system to be updated centrally while individual projects control the version they use.
Example installation as `.codex`, `.cursor`, or `.claude`:

```
git submodule add https://github.com/<your-org>/OpenCaw .codex
git submodule update --init --recursive
```

or

```
git submodule add https://github.com/<your-org>/OpenCaw .cursor
```

or

```
git submodule add https://github.com/<your-org>/OpenCaw .claude
```

Update at any time:

```
git submodule update --remote
```

If a repository needs a customized version of OpenCaw, it can be cloned instead.

```
git clone https://github.com/<your-org>/OpenCaw .codex
```

or

```
git clone https://github.com/<your-org>/OpenCaw .cursor
```

or

```
git clone https://github.com/<your-org>/OpenCaw .claude
```

This allows you to modify instructions without affecting the upstream repository.
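After installation, a host repository might look like the following sketch. The layout is illustrative; the point is that project-specific artifacts such as `.ai/` and `ARCHITECTURE.md` live outside the tool directory:

```
your-repo/
├── .codex/            # OpenCaw submodule or clone (shared instruction system)
│   ├── .roles/
│   ├── skills/
│   ├── commands/
│   └── .architecture/
├── .ai/               # project-local tasks, goals, and memory
├── ARCHITECTURE.md    # generated for this repository
└── AGENTS.md          # host bootstrap that activates OpenCaw
```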
OpenCaw should auto-load in most sessions once the host bootstrap is present in the root AGENTS.md. If needed, you can still force activation by saying:
Read AGENTS.md instructions
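A minimal host bootstrap in the root `AGENTS.md` might look like the following; the exact wording is hypothetical and depends on the tool directory name you chose during installation:

```markdown
<!-- AGENTS.md (repository root) - hypothetical bootstrap -->
Read `.codex/AGENTS.md` instructions and follow them as the governing
instruction system for this repository.
```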
These are some examples of how to use OpenCaw after it has been installed and activated.
use role security-engineer + sre and review this repository for vulnerabilities and recommend fixes
use role qa-engineer and generate full test coverage for the current feature including edge cases
use role devops-automator and create a CI/CD pipeline as a github action for gcp with build, test, and deploy stages for this repository
use role fullstack-engineer and build a calculator app with a simple UI, basic arithmetic operations, tests, task tracking, architecture generation if missing, and final verification
use 4 agents with project-manager + fullstack-engineer + qa-engineer to split the checkout refactor into safe parallel lanes, then integrate and verify the result
goal: modernize the reporting module across these five tasks. Automatically raise each task PR after validation, run post-PR QA, then continue to the next task. Do not merge PRs automatically.
work on #123 and implement the fix with tests and verification evidence
OpenCaw deterministically resolves your prompt into:
- Roles activated -> sets perspective and priorities
- Skills selected -> plans and reasons about the work
- Tasks + issues created/updated -> `.ai/tasks/TODO.md` + `.ai/tasks/<task>/TASK.md` + `.ai/tasks/OPEN_ISSUES.md`
- Sub-agent lanes planned when useful -> `.ai/tasks/<task>/SUBAGENTS.md` captures parallel lanes, roles, ownership, and verification
- Goal flow selected when explicit -> `.ai/goals/<goal>/GOAL.md` governs automatic task-to-PR-to-QA progression
- Architecture ensured -> generates `ARCHITECTURE.md` if missing
- Commands executed -> builds, tests, scans, or deploys
- Verification performed -> tests/logs prove correctness
- Memory updated -> `.ai/` captures reusable lessons
To be more specific, it will:

- activate the `fullstack-engineer` role
- check whether `ARCHITECTURE.md` exists
- if missing, ask which architecture templates apply
- generate `ARCHITECTURE.md`
- update `.ai/tasks/TODO.md` with an ordered checklist
- create or import a task file such as `.ai/tasks/create-calculator-app/TASK.md`
- create/link a matching GitHub issue, or import an existing one from a prompt like `Work on #123`, and add the URL to `.ai/tasks/OPEN_ISSUES.md`
- if the prompt requests multiple agents/developers/workers, or the task has safe natural parallelism, create a lane plan such as `.ai/tasks/create-calculator-app/SUBAGENTS.md`
- validate that each lane has a resolved role, safe ownership boundaries, dependencies, an expected output, and a verification path
- apply appropriate skills such as `create-task-file`, `manage-task-issues`, `orchestrate-subagents`, `generate-architecture`, `solution-build`, and `test-dotnet`
- use appropriate commands based on the stack, such as `./commands/dotnet-restore.sh`, `./commands/dotnet-build.sh`, and `./commands/dotnet-test.sh`; `./commands/create-subagent-plan.sh` and `./commands/validate-subagent-plan.sh` for parallel lane planning; and `./commands/comment-issue-test-results.sh` for task issue QA evidence
- implement the application, using sub-agents only for lanes that can run safely in parallel
- record sub-agent lane results when a task-backed `SUBAGENTS.md` exists
- run validation and verification before completion
- create a PR readiness report and ask the user whether they are ready for the branch to be pushed and a PR opened
- if explicit goal flow is active, use `./commands/pr-readiness-check.sh --goal` and automatically push/open the task PR without asking for PR readiness confirmation
- otherwise, only after user approval, push/open the PR with GitHub tools in priority order: `gh`, then an available `github` CLI/wrapper, then GitHub MCP/app connector tools only when both CLI options are unavailable or unsuitable
- associate the PR with the task issue (`Closes #<issue-number>`)
- immediately run post-PR QA once the PR is confirmed available
- post QA/Playwright evidence as a GitHub PR comment, including inline screenshot URLs when screenshots are part of the proof
- in goal flow, move to the next task only after post-PR QA completes successfully
- when a later goal task depends on a previous unmerged task or risks conflicts, branch from the previous task branch or PR head and record the dependency
- at goal completion, generate a report with PR links in approval order, branch dependencies, QA evidence, and conflict-risk notes for human approval
- notify the user that the PR is ready for review and that the agent can move to the next task if any remain
- update memory files if durable lessons are discovered
This is the intended OpenCaw experience:
- The user gives one high-level prompt
- OpenCaw resolves the role
- OpenCaw selects the right skills
- OpenCaw uses the right commands
- OpenCaw creates task structure and verification flow
- OpenCaw completes the work in a governed, repeatable way
The detailed example above shows the same work broken down step by step for users who want to see the full workflow explicitly.
Contributions to OpenCaw are welcome.
Typical workflow:
- Fork the repository
- Create a feature branch
- Implement your improvement
- Submit a pull request
Example:
```
git clone https://github.com/<your-org>/OpenCaw
cd OpenCaw
git checkout -b feature/add-architecture-framework
```

After making changes:

```
git add .
git commit -m "Add new architecture framework"
```

Before pushing or opening a pull request, stop and confirm the branch is ready for human review. After confirmation, prefer `gh` for GitHub PR work, fall back to an available `github` CLI/wrapper, and use GitHub MCP/app connector tools only when both CLI options are unavailable or unsuitable. Once the PR is available, run QA and post the result as a GitHub PR comment.
When contributing:
- Keep architecture frameworks enterprise-ready
- Maintain clear documentation
- Follow existing file structure and conventions
OpenCaw supports multi-architecture repositories.
Frameworks are located in:
.architecture/
These frameworks allow AI agents to generate a unified ARCHITECTURE.md file for a repository by combining multiple architecture standards.
By default, generation is read-reference based so ARCHITECTURE.md stays concise and contains directives such as `Read ./.architecture/DOTNET.md instructions`. Use inline generation only when full embedded content is explicitly required.
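For example, assuming a .NET service backed by PostgreSQL, a read-reference `ARCHITECTURE.md` could contain little more than directives. The file below is an illustrative sketch, not actual generated output:

```markdown
# Architecture

Read `./.architecture/DOTNET.md` instructions
Read `./.architecture/POSTGRESDB.md` instructions
```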
Example supported frameworks include:
- DOTNET
- DOTNET_ASPIRE
- NODE
- MAUI
- EMBEDDED_FIRMWARE
- PYTHON
- PLAYWRIGHT
- NEXTJS
- SPA
- REACT
- ANGULAR
- VUE
- AZURE
- SIGNALR_WEBSOCKETS
- MSSQL
- MYSQL
- POSTGRESDB
- SQLITE
- COSMOSDB
- AZURE_STORAGE_TABLES
- DATABRICKS
- TERRAFORM
- KUBERNETES
- HELM
- MICROSERVICES
- EVENT_DRIVEN
- SOLIDITY
- GITHUB_ACTIONS
- AZURE_DEVOPS
Language/tool alignment guidance is documented in:
.architecture/LANGUAGE_SUPPORT.md
Agents will ask which architectures apply if ARCHITECTURE.md does not exist and generate it automatically.
OpenCaw includes a library of engineering roles in:
.roles/
Each role is stored as:
.roles/<domain>/<role-name>/ROLE.md
Current engineering catalog:
.roles/computer-science/<role-name>/ROLE.md
To browse available roles, categories, and aliases, see:
.roles/INDEX.md
To activate a role, the user can request a matching role name or a common alias.
Role references may be:

- an unqualified role name, for example `backend-architect`
- an alias from `.roles/INDEX.md`, for example `security`
- a domain-qualified role id, for example `computer-science/backend-architect`

Examples:

```
use role backend-architect
use role security
act as sre
```
Resolution behavior:
- If an unqualified role name or alias maps to exactly one role across all domains, activate it directly.
- If an unqualified role name or alias maps to multiple roles across domains, prompt the user to choose a domain-qualified role before continuing.
- If both an exact role-name match and alias match exist, exact role-name match wins.
- If no matching role exists, continue with baseline behavior.
Deterministic helper command:
```
./commands/resolve-role.sh "<role-name|alias|domain/role-name>"
```
OpenCaw also supports combining roles in one session.
Examples:

```
use role backend-architect + security-engineer
use roles frontend-developer + qa-engineer
```
When multiple roles are requested:
- the first role acts as the primary perspective by default
- later roles add specialist constraints, review lenses, or guidance
- stricter or safer guidance should win when roles conflict, unless the user says otherwise
OpenCaw includes default bindings between common engineering roles, reusable skills, and preferred commands.
See:
.roles/ROLE_SKILL_MAP.md
.roles/ROLE_SKILL_MAP.json
These mappings allow role casting to do more than change tone or perspective.
When a role is activated, OpenCaw should:
- prioritize the skills associated with that role
- prefer commands associated with that role
- apply shared skills such as planning, debugging, review, refactoring, and verification
- bias reasoning toward the role's domain expertise
- resolve bindings by checking `<domain>/<role>` first, then fall back to `<role>` for backward compatibility
Examples:

- `backend-architect` → architecture review, service boundaries, dependency audits
- `frontend-developer` → components, feature modules, rendering, accessibility
- `fullstack-engineer` → end-to-end feature delivery, API/UI integration, full-flow verification
- `security-engineer` → threat modeling, security audits, dependency vulnerability review
- `sre` → incident analysis, resilience design, performance review
Multi-role sessions should merge bindings in the same order as the requested roles.
OpenCaw supports a portable sub-agent flow for complex tasks that have safe parallel work. The flow is centered on the computer-science/project-manager role and the orchestrate-subagents skill.
Use sub-agents when:
- the user explicitly requests a number of agents, developers, workers, or parallel lanes
- the task has independent research, implementation, QA, documentation, or review work
- each lane can have a clear owner, role, scope, expected output, and verification path
- implementation lanes can declare non-overlapping write sets
Avoid sub-agents when:
- the next step is a critical-path blocker the main agent needs immediately
- multiple lanes would edit the same files without an integration strategy
- the role, scope, output, or verification path is unclear
- the overhead of coordination is larger than the work itself
For substantial task-backed work, OpenCaw stores the lane plan in:
.ai/tasks/<task-name>/SUBAGENTS.md
SUBAGENTS.md captures:
- requested and effective capacity
- lane IDs such as `lane-1`, `lane-2`, and `lane-3`
- the resolved OpenCaw role for each lane
- agent type: `explorer`, `worker`, or `default`
- scope and write set
- dependencies between lanes
- expected output and verification evidence
- integration order, conflict risks, and final verification
- lane results after agents finish
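Pulling those fields together, a lane plan might look roughly like this sketch; the field names and layout are illustrative, and the real format comes from `./commands/create-subagent-plan.sh`:

```markdown
# SUBAGENTS: create-calculator-app

Requested capacity: 3
Effective capacity: 2

## lane-1
- Role: computer-science/fullstack-engineer
- Type: worker
- Write set: src/calculator/, tests/calculator/
- Depends on: none
- Expected output: arithmetic engine with passing unit tests

## lane-2
- Role: computer-science/qa-engineer
- Type: explorer
- Write set: none (read-only review)
- Depends on: lane-1
- Expected output: edge-case report and verification evidence
```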
For complex prompts, the project-manager planning flow is:
- Identify the requested agent/developer count, if one was provided.
- Determine the task's natural parallelism.
- Create at most the requested number of active lanes.
- Assign each lane a resolved OpenCaw role.
- Use `explorer` lanes for read-only investigation and `worker` lanes for implementation.
- Require worker lanes to declare disjoint write sets.
- Keep the main agent responsible for orchestration, blockers, integration, final verification, and user communication.
- Record lane outputs and verification evidence before final handoff.
Helper commands:
```
./commands/create-subagent-plan.sh "<task_name>" ["agent_count"] [--dry-run]
./commands/validate-subagent-plan.sh "<task_name|path>"
./commands/record-subagent-result.sh "<task_name>" "<lane_id>" "<status>" "<summary_file>" [--dry-run]
```

Good prompts make capacity and ownership expectations explicit:
use 3 agents with project-manager + backend-architect + qa-engineer to plan and implement #123. Split only safe parallel lanes, keep write sets separate, then integrate and verify.
use 4 workers for this migration. Have project-manager create the SUBAGENTS.md lane plan first, use specialist roles for each lane, and reserve one lane for QA/review if implementation cannot safely use all four.
act as project-manager + senior-developer. Break this feature into sub-agent lanes, use explorer agents for investigation, worker agents for non-overlapping patches, and record lane evidence before final verification.
- Ask for a specific agent count only when there is enough work to split cleanly.
- Prefer fewer high-quality lanes over filling every requested seat.
- Use specialist roles for specialist lanes, such as `qa-engineer` for verification or `security-engineer` for threat review.
- Keep each worker lane's write set narrow and explicit.
- Put cross-cutting or risky changes behind the main agent or a single owner.
- Validate the lane plan before spawning or assigning work.
- Integrate lane outputs deliberately, then run the final verification from the main agent.
A goal is an explicitly requested automated multi-task delivery flow.
Normal task flow is conservative:
- tasks run one by one unless the project-manager role identifies safe parallel lanes
- PR creation waits for the human readiness confirmation gate
- post-PR QA runs after the PR is available
Goal flow is different:
- tasks still require planning, implementation, validation, PR creation, and post-PR QA
- after each task completes local validation, OpenCaw may automatically raise the task PR
- post-PR QA still runs immediately after the PR is available
- the next goal task does not start until post-PR QA completes
- if a later task depends on earlier unmerged work or is likely to conflict later, OpenCaw should branch from the earlier task branch or PR head and record that chain
Goal flow is the only exception to the normal human PR readiness confirmation prompt. It never means auto-merge, merge approval, or auto-merge enablement, and it does not skip QA.
Goal flow activates only when the user explicitly requests it, for example:
goal: finish the onboarding cleanup across the planned tasks, raising each PR automatically after validation and QA before moving on
use goal flow for this migration plan
Task planning may also mark the mode explicitly:
Flow: goal
Goal Flow: enabled
The ordinary ## Goal section inside a TASK.md file does not activate automated goal flow by itself.
Goal files live in:
.ai/goals/<goal-name>/GOAL.md
Create one with:

```
./commands/create-goal-file.sh "<goal_name>" ["Goal Title"] [--dry-run]
```

Each goal file tracks:
- goal outcome and success criteria
- ordered task queue
- current task
- branch chain for dependent or conflict-prone PRs
- automation rules
- PRs raised per task
- post-PR QA evidence
- stop conditions and review notes
- final completion report path and approval order
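Putting those fields together, a goal file might look roughly like this; the layout is an illustrative sketch, and the real format comes from `./commands/create-goal-file.sh`:

```markdown
# GOAL: modernize-reporting

Outcome: reporting module migrated to the new data layer
Flow: goal

## Task queue
1. extract-report-queries
2. migrate-report-rendering (branches from the task 1 PR head)

## Automation rules
- Raise each task PR automatically after validation and run post-PR QA
- Never merge PRs automatically
```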
For each completed goal task:
- Run local validation.
- Generate readiness evidence with `./commands/pr-readiness-check.sh --goal`.
- Automatically push/open the PR.
- Confirm the PR is available.
- Run post-PR QA.
- Post QA evidence to the PR.
- If the next task depends on this unmerged work or risks conflict, base that next task on this task branch or PR head.
- Continue to the next goal task only after post-PR QA completes.
Stop goal automation if validation fails, PR creation fails, post-PR QA fails, a merge conflict blocks progress, role resolution is ambiguous, or a required product/security decision was not already covered by the goal plan.
When all goal tasks have completed post-PR QA, generate a completion report:

```
./commands/create-goal-completion-report.sh "<goal_name|goal_dir|goal_file>" [--dry-run]
```

The report is the human approval packet. It should include:
- PR links in dependency/approval order
- branch base/head notes for each PR
- stacked branch dependencies
- post-PR QA evidence links
- merge-conflict risk notes
Humans can then approve and merge in order, reducing conflict risk across dependent PRs. Goal flow does not merge the PRs itself.
Skills provide reusable instructions for AI agents to perform structured tasks.
Example skill locations:
skills/
skills/generate-architecture/
skills/create-task-file/
skills/test-dotnet/
Skills should:
- Define clear intent
- Provide deterministic instructions
- Avoid hidden behavior
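As a sketch of those properties, a skill file might look like this; the section names are illustrative, and the authoritative schema is `skills/SCHEMA.md`:

```markdown
# Skill: create-task-file

## Intent
Create a task folder with a TASK.md and link a matching GitHub issue.

## Instructions
1. Derive the task name from the user's request.
2. Run `./commands/create-task-file.sh "<task_name>"`.
3. Verify the task appears in `.ai/tasks/TODO.md`.
```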
Commands provide reusable CLI workflows for automation tasks.
Examples include:
commands/generate-architecture.sh
commands/create-task-file.sh
commands/dotnet-restore.sh
commands/dotnet-build.sh
commands/dotnet-test.sh
commands/security-scan.sh
commands/clean-context.sh
Commands should remain:
- deterministic
- platform-safe
- clearly documented
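Those properties can be sketched as a command skeleton. This is a hypothetical example written for illustration, not a script that ships with OpenCaw; the authoritative requirements are in `commands/SCHEMA.md`:

```shell
# Hypothetical command skeleton: deterministic, platform-safe POSIX sh,
# documented usage, and a --dry-run preview with no side effects.
cat > demo-command.sh <<'EOF'
#!/usr/bin/env sh
set -eu                      # fail fast, treat unset variables as errors

usage() { echo "usage: $0 <target_dir> [--dry-run]" >&2; exit 64; }
[ "$#" -ge 1 ] || usage
TARGET_DIR=$1

if [ "${2:-}" = "--dry-run" ]; then
  echo "dry-run: would process $TARGET_DIR"   # preview only
  exit 0
fi
echo "processing $TARGET_DIR"
EOF
chmod +x demo-command.sh
./demo-command.sh ./src --dry-run
```

Running the last line prints the dry-run preview without touching `./src`, which is the pattern the `--dry-run` flags on the real OpenCaw commands follow.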
OpenCaw may include reusable assets under:
assets/
Current testing assets include:
- `assets/playwright/` - Playwright config, package-script, and Azure DevOps starter templates
- `assets/playwright/reports/` - Markdown report templates for non-interactive test evidence
- `assets/playwright-cli/references/` - Playwright CLI reference notes for discovery workflows
Host repositories still own real tests, credentials, app-specific artifacts, and generated reports.
OpenCaw includes built-in validation commands for its role, skill, and command schemas.
Available commands:
commands/validate-roles.sh
commands/validate-skills.sh
commands/validate-commands.sh
commands/validate-opencaw.sh
Recommended usage:

```
./commands/validate-opencaw.sh
```

Or run individual checks:

```
./commands/validate-roles.sh
./commands/validate-skills.sh
./commands/validate-commands.sh
```

These validators check:

- `.roles/SCHEMA.md` compliance
- `skills/SCHEMA.md` compliance
- `commands/SCHEMA.md` compliance
- naming conventions
- required metadata and sections
- executable shell command requirements
OpenCaw supports structured task tracking using the .ai/tasks directory.
.ai/tasks/
.ai/tasks/TODO.md
.ai/tasks/<task-name>/TASK.md
.ai/tasks/OPEN_ISSUES.md
Rules:
- `TODO.md` contains the ordered list of tasks
- Each task folder contains a detailed `TASK.md`
- Each substantial task is backed by one GitHub issue
- Existing GitHub issues can be imported directly with `./commands/import-task-from-issue.sh "<issue-ref>"`, where `<issue-ref>` can be `#123`, `123`, or a full issue URL
- Track only open issue URLs (one per line) in `OPEN_ISSUES.md`
- Sync and remove closed issue URLs from `.ai/tasks` tracking
- Agents update progress as tasks are completed
- Agents ask for human PR readiness approval before pushing or opening a PR
- PRs for task-backed work should include issue linkage (for example `Closes #123`)
- GitHub PR operations should prefer `gh`, then an available `github` CLI/wrapper, then GitHub MCP/app connector tools only when both CLI options are unavailable or unsuitable
- After a PR is confirmed available, QA should start immediately and post result comments to the PR, including inline screenshot URLs when screenshots are part of the evidence
- QA/Playwright runs may also post or link result comments to the linked issue for task history
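The pruning rule can be illustrated with plain shell. This is only a sketch of the idea, not the logic of `./commands/sync-task-issues.sh`, and the issue URLs are made up:

```shell
# Two tracked issues, one URL per line, as OPEN_ISSUES.md requires.
cat > OPEN_ISSUES.md <<'EOF'
https://github.com/acme/app/issues/123
https://github.com/acme/app/issues/456
EOF

# Suppose issue 123 has been closed on GitHub: drop its URL from tracking.
grep -v 'issues/123$' OPEN_ISSUES.md > OPEN_ISSUES.tmp
mv OPEN_ISSUES.tmp OPEN_ISSUES.md

cat OPEN_ISSUES.md   # only the still-open issue URL remains
```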
AI learning artifacts are stored outside the tool directory to prevent pollution of the shared instruction system.
Example:
.ai/
.ai/MEMORY.md
.ai/RULES.md
.ai/DEBUG.md
These files allow agents to:
- record lessons learned
- prevent repeated mistakes
- store debugging knowledge
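As an illustration, a memory entry might look like this; the entry format is hypothetical, since OpenCaw leaves the exact layout to the project:

```markdown
<!-- .ai/MEMORY.md - hypothetical entry format -->
## Flaky Playwright selector in checkout flow
- Lesson: prefer role-based selectors over CSS classes in the checkout pages
- Evidence: .ai/tasks/fix-checkout-e2e/TASK.md
```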
OpenCaw is released under the MIT License.
See the LICENSE file for full details.
OpenCaw separates thinking from execution using:
- Skills → reusable reasoning patterns (WHAT to do)
- Commands → deterministic scripts (HOW to do it)
Skills live in:
./skills/<skill-name>/SKILL.md
They are automatically used by the agent when relevant or when a role is active.
You can explicitly invoke a skill:
use skill create-task-file
use skill manage-task-issues
use skill clean-context
Or combine them:
use skill create-task-file + manage-task-issues + test-dotnet
| Skill | Purpose |
|---|---|
| `create-task-file` | Create a task file and link a matching issue |
| `goal-flow` | Manage explicit automated goals across task PRs, post-PR QA, branch chaining, and final approval reporting |
| `manage-task-issues` | Sync and prune open issue tracking |
| `orchestrate-subagents` | Plan and coordinate parallel sub-agent lanes with OpenCaw roles |
| `clean-context` | Compact context after substantial work |
| `pr-readiness-gate` | Require human confirmation before push or PR creation |
| `post-pr-qa` | Run QA after PR availability and post PR evidence comments |
| `solution-build` | Build the .NET solution |
| `test-dotnet` | Run .NET tests for verification |
| `playwright-e2e-tests` | Design or run Playwright browser verification |
| `playwright-browser-discovery` | Discover selectors and dynamic browser behavior before test authoring |
| `playwright-test-refinement` | Diagnose, rerun, and stabilize Playwright tests |
| `playwright-reporting` | Generate non-interactive Playwright evidence reports |
| `install-database-cli-tools` | Install or preview database CLI tooling setup |
| `database-cli-query` | Run database connect/query workflows by engine |
When using roles:
use role backend-architect
Skills are automatically prioritized:
- task tracking
- verification workflows
- role-specific command selection
Commands live in:
./commands/*.sh
They are executable scripts used for repeatable workflows.
Run directly:

```
./commands/validate-opencaw.sh
```

Or invoke via agent:

```
run command validate-opencaw
run command dotnet-build
```
| Command | Purpose |
|---|---|
| `validate-opencaw.sh` | Validate entire OpenCaw setup |
| `dotnet-restore.sh` | Restore .NET dependencies |
| `dotnet-build.sh` | Build .NET project |
| `dotnet-test.sh` | Run tests |
| `playwright-install.sh` | Install Playwright browsers in a host repository |
| `playwright-test.sh` | Run Playwright tests with project/grep/headed options |
| `playwright-show-report.sh` | Generate non-interactive report summaries from Playwright outputs |
| `playwright-report-summary.sh` | Convert Playwright JSON results into a Markdown run report |
| `playwright-artifact-index.sh` | Index screenshots, traces, videos, logs, and report artifacts |
| `playwright-discovery-report.sh` | Summarize `.playwright-cli` discovery snapshots and artifacts |
| `playwright-evidence-report.sh` | Generate a bundle report linking all Playwright evidence reports |
| `create-goal-completion-report.sh` | Create the final human approval report for a completed goal |
| `create-goal-file.sh` | Create a task-backed automated goal file under `.ai/goals` |
| `create-task-file.sh` | Create a task file and optionally link/create an issue |
| `create-task-issue.sh` | Create/link a GitHub issue for a task and track its URL |
| `create-subagent-plan.sh` | Create a task-backed `SUBAGENTS.md` lane plan |
| `validate-subagent-plan.sh` | Validate sub-agent lane roles, fields, dependencies, and write sets |
| `record-subagent-result.sh` | Append lane result evidence to `SUBAGENTS.md` |
| `import-task-from-issue.sh` | Import a task from an existing GitHub issue number/URL and link tracking files |
| `sync-task-issues.sh` | Remove closed issue URLs from active `.ai/tasks` tracking |
| `pr-readiness-check.sh` | Create a non-destructive readiness report and required PR approval prompt, or record goal-flow automation with `--goal` |
| `link-pr-to-task-issue.sh` | Add issue-closing linkage to a PR body |
| `comment-pr-qa-results.sh` | Post QA evidence to a PR comment with inline screenshot URL support |
| `comment-issue-test-results.sh` | Post QA/Playwright results and screenshot references to an issue |
| `clean-context.sh` | Compress context and refresh high-signal summaries |
| `security-scan.sh` | Run security checks |
| `install-database-cli-tools.sh` | Print or execute database CLI install commands |
| `database-cli-query.sh` | Execute engine-specific database query/connect commands |
Example:

```
use role code-migrator
use skill dependency-audit-dotnet
```

Then:

```
./commands/dotnet-build.sh
./commands/dotnet-test.sh
```

- Use skills first to plan and reason
- Use commands second to execute
- Combine roles + skills for precision
- Always verify with commands before completion
| Layer | Responsibility |
|---|---|
| Role | Perspective |
| Skill | Thinking |
| Command | Execution |