diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index 680f9da..0269050 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -15,3 +15,4 @@ Before submitting, please ensure the following actions have been taken: - [ ] You have checked to ensure there aren't any other Pull Requests open for the same update. - [ ] All tests are passing, locally and on any CI tools in use. - [ ] If you know the whom should review this PR please assign it to them. +- [ ] GitHub Copilot review requested (if available for this repo). diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md new file mode 100644 index 0000000..1bb07f1 --- /dev/null +++ b/.github/copilot-instructions.md @@ -0,0 +1,190 @@ +# GitHub Copilot Workspace Instructions + +## Repository Overview +This template ships with AgentFactory-compatible agent definitions, validation tooling, and append-only logs. It provides a flexible, organized structure compatible with major AI platforms, and is designed for plug-and-play use with OpenCode and GitHub Copilot. + +## Key Architecture Principles + +### 1. Flexible Directory Structure (SHOULD) +- Agent files SHOULD be organized in the `agents/` directory +- Nested subdirectories are ALLOWED for better organization (e.g., `agents/testing/`, `agents/security/`) +- Configuration and documentation files are typically stored at the repository root +- Organization SHOULD follow conventions compatible with GitHub Copilot, OpenAI ChatGPT, Google Gemini, agent-based IDEs (OpenCode.ai), and Google Colab +- Directory structure SHOULD be logical and self-documenting + +### 2. Required Agent File Format (MUST) +Every agent file MUST include these headings in this exact order: +1. `## Purpose` - What the agent does and why it exists +2. `## Inputs` - Required and optional inputs the agent needs +3. `## Outputs` - What the agent produces +4. 
`## Behavior` - How the agent processes inputs to produce outputs +5. `## Constraints` - Limitations, boundaries, and operational constraints + +### 3. Append-Only Files (MUST) +The following files are append-only and MUST NOT have content removed or modified: +- `specs.md` - Technical specifications and requirements +- `agent_runs.md` - Log of agent execution runs +- `decisions.md` - Architectural and design decisions + +**Only add new entries at the end of these files. Never modify or delete existing entries.** + +### 4. No Fabrication Policy (MUST) +- MUST NOT fabricate citations, references, or test results +- All claims MUST be verifiable and traceable to sources +- Use `[ASSUMPTION]` tags for speculative content +- Provide verifiable sources for all external information + +### 5. Tagging Conventions +Use these tags consistently throughout documentation: +- `[SPEC]` - Specification entries +- `[ASSUMPTION]` - Assumptions made +- `[RISK]` - Identified risks +- `[TODO]` - Pending tasks +- `[DECISION]` - Design decisions +- `[TEST]` - Test cases +- `[DONE]` - Completed items + +## Core Files and Their Purpose + +### Configuration Files +- **agents.yaml** - Central agent registry and configuration schema + - Defines all agents with metadata (id, name, description, tags, status) + - Specifies validation rules (MUST/SHOULD/MAY requirements) + - Lists allowed tags for categorization + +- **agents.md** - Agent documentation rules and guidelines + - Template for creating new agents + - MUST/SHOULD/MAY requirements explained + - Validation procedures + +### Agent Files +Located in `agents/` directory: +- **Architect.md** - Spec Author + System Designer +- **Builder.md** - Implementer / Artifact Producer +- **Skeptic.md** - Adversarial Reviewer / Breaker +- **Editor.md** - Clarity + Structure Editor +- **ProjectManager.md** - Packaging + Orchestration +- **CitationOfficer.md** - Evidence Auditor + Claim Tracker +- **ChatGPT.md** - Generalist Execution Agent +- 
**OpenCodeManager.md** - Session orchestration for OpenCode + +### Documentation Files +- **README.md** - Quick start and overview +- **specs.md** - Technical specifications (append-only) +- **decisions.md** - Design decisions (append-only) +- **agent_runs.md** - Execution log (append-only) +- **INSTALL.md** - Installation instructions +- **CHANGELOG.md** - Version history + +## Development Workflow + +### Adding a New Agent +1. Define the agent in `agents.yaml` with all required fields: + - Unique ID + - Name and description + - At least one tag from the allowed list + - Status (active/inactive) + - File path in agents/ directory + +2. Create the agent markdown file in `agents/` directory following the template + +3. Ensure all five required headings are present in correct order + +4. Run `./validate_agents.sh` to verify compliance + +5. Document the addition in `decisions.md` (append-only) + +### Modifying Existing Agents +1. Update the agent definition in `agents.yaml` if metadata changes +2. Update the markdown file with changes +3. Increment the version number +4. Document changes in agent's version history +5. Record the decision in `decisions.md` (append-only) +6. 
Run validation: `./validate_agents.sh` + +### Running Validation +```bash +./validate_agents.sh +``` + +This script validates: +- Required headings in agent files +- Unique agent IDs +- Proper tag usage +- File existence +- Markdown format compliance + +## Code Style and Standards + +### Markdown +- Use GitHub Flavored Markdown +- Use code blocks with language specification +- Use H2 (`##`) for required agent headings +- Keep lines reasonably short for readability + +### YAML +- Use 2-space indentation +- Follow schema defined in `agents.yaml` +- Validate YAML syntax before committing + +### Versioning +- Follow Semantic Versioning (SemVer 2.0.0) +- Document version changes in agent files +- Update CHANGELOG.md for releases + +## Important Constraints + +### What You MUST Do +- Validate all changes with `./validate_agents.sh` before committing +- Include all five required headings in agent files +- Use unique IDs for all agents +- Tag all agents with at least one allowed tag +- Append to append-only files (never modify existing content) +- Mark assumptions clearly with `[ASSUMPTION]` tags + +### What You MUST NOT Do +- Modify or delete content in append-only files (specs.md, decisions.md, agent_runs.md) +- Fabricate citations or test results +- Remove or override working code without explicit reason +- Create agents without required headings +- Use duplicate agent IDs + +## Copilot Agent Integration + +This repository is designed to work seamlessly with GitHub Copilot Agents. The agent definitions in the `agents/` directory can be: + +1. **Referenced by Copilot** to understand system architecture and roles +2. **Used as templates** for creating new AI agent definitions +3. 
**Validated automatically** via the validation script in CI/CD + +When working with GitHub Copilot Workspace: +- Copilot can read agent definitions to understand the agent factory pattern +- Copilot can help create new agent definitions following the established format +- Copilot can validate changes against the specifications +- Copilot respects the directory structure and append-only constraints + +## Testing and Quality + +### Validation Tests +Run the validation suite: +```bash +./validate_agents.sh +``` + +Expected output: All tests should pass with green checkmarks. + +### Manual Review Requirements +- Citation verification (TEST-005-1) requires human review +- Ensure all external references are accurate +- Verify test results match actual execution + +## Questions and Support + +For questions or issues: +1. Check the [SUPPORT.md](/.github/SUPPORT.md) guide +2. Review [agents.md](/agents.md) for agent documentation rules +3. Check [specs.md](/specs.md) for technical specifications +4. Refer to [decisions.md](/decisions.md) for design rationale + +## License +See [LICENSE](/LICENSE) for usage rights and restrictions. 
diff --git a/.github/workflows/validate-agents.yml b/.github/workflows/validate-agents.yml new file mode 100644 index 0000000..ef29a38 --- /dev/null +++ b/.github/workflows/validate-agents.yml @@ -0,0 +1,57 @@ +name: Validate Agent Definitions + +on: + push: + branches: [ main, develop ] + paths: + - 'agents/**' + - 'agents.yaml' + - 'agents.md' + - 'validate_agents.sh' + pull_request: + branches: [ main, develop ] + paths: + - 'agents/**' + - 'agents.yaml' + - 'agents.md' + - 'validate_agents.sh' + workflow_dispatch: + +jobs: + validate: + runs-on: ubuntu-latest + name: Validate Agent Factory Specifications + permissions: + contents: read + + steps: + - name: Checkout repository + uses: actions/checkout@v4 + + - name: Make validation script executable + run: chmod +x validate_agents.sh + + - name: Run Agent Factory Validation + run: ./validate_agents.sh + + - name: Check for required files + run: | + echo "Checking for required files..." + [ -f agents.yaml ] && echo "✓ agents.yaml exists" || (echo "✗ agents.yaml missing" && exit 1) + [ -f agents.md ] && echo "✓ agents.md exists" || (echo "✗ agents.md missing" && exit 1) + [ -f specs.md ] && echo "✓ specs.md exists" || (echo "✗ specs.md missing" && exit 1) + [ -f decisions.md ] && echo "✓ decisions.md exists" || (echo "✗ decisions.md missing" && exit 1) + [ -f agent_runs.md ] && echo "✓ agent_runs.md exists" || (echo "✗ agent_runs.md missing" && exit 1) + [ -d agents ] && echo "✓ agents/ directory exists" || (echo "✗ agents/ directory missing" && exit 1) + + - name: Validation Summary + if: success() + run: | + echo "==========================================" + echo "✓ All Agent Factory validations passed!" + echo "==========================================" + echo "" + echo "✓ Validation script completed successfully" + echo "✓ Required files present" + echo "" + echo "Repository is ready for deployment." 
diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 0000000..0e38714 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,214 @@ +# AGENTS.md + +This file provides practical guidance for AI agents working on repositories created from this template. + +## Project Overview + +This template is preloaded with AgentFactory-style agent definitions, validation, and append-only logs so OpenCode and Copilot can work plug-and-play. + +## Key Files and Structure + +- **agents/** - Agent definition files (markdown format) + - Can be flat or nested in subdirectories + - Each agent has 5 required headings: Purpose, Inputs, Outputs, Behavior, Constraints + +- **specialisms/** - Domain-specific standards and guidelines + - Testing.md, Security.md, Coder.md, etc. + +- **agents.yaml** - Central registry + - All agents MUST be registered here with unique IDs + - Defines tags, file paths, and metadata + +- **Append-only files** (DO NOT modify existing content, only append): + - `specs.md` - Technical specifications + - `decisions.md` - Architectural decisions + - `agent_runs.md` - Execution logs + +- **traits/** - Reusable behavioral modules for agents/specialisms/tasks +- **workflows/** - Task-level execution patterns and validation gates +- **tasks/** - Recurring and stateful task tracking (cadence, wip, triage) + +## Development Workflow + +### Adding a New Agent + +1. Create the agent markdown file in `agents/` (or subdirectory) +2. Include all 5 required headings in order: + - `## Purpose` + - `## Inputs` + - `## Outputs` + - `## Behavior` + - `## Constraints` +3. Add entry to `agents.yaml` with: + - Unique ID (e.g., "my-agent-001") + - Name, description, version + - At least one tag from allowed_tags list + - File path relative to repository root + - Status (active/inactive) +4. Run validation: `./validate_agents.sh` +5. 
If making architectural decisions, append to `decisions.md` + +### Validation + +Always run before committing: +```bash +./validate_agents.sh +``` + +This validates: +- Required headings present and in correct order +- Unique agent IDs +- Tags from allowed list +- File paths exist +- Markdown format + +### Python Validation (Alternative) + +For more detailed validation: +```python +import yaml + +with open('agents.yaml', 'r') as f: + data = yaml.safe_load(f) + +# Check unique IDs +ids = [agent['id'] for agent in data['agents']] +assert len(ids) == len(set(ids)), "Duplicate agent IDs found" +``` + +## Coding Conventions + +### Agent Files (Markdown) +- Use GitHub Flavored Markdown +- Use H2 (`##`) for required headings +- Use H3 (`###`) for subsections +- Include code blocks with language specification +- Keep lines reasonably short for readability + +### YAML Files +- Use 2-space indentation +- Quote strings containing special characters +- Validate YAML syntax before committing + +### Append-Only Files +- **NEVER** delete or modify existing content +- **ALWAYS** add new entries at the end +- Include date stamps in entry headers +- Use horizontal rules (`---`) to separate entries + +## Testing + +Run all validation tests: +```bash +./validate_agents.sh +``` + +Check specific aspects: +```bash +# Check for required headings in a specific file +grep -E "^## (Purpose|Inputs|Outputs|Behavior|Constraints)" agents/MyAgent.md + +# Validate YAML syntax +python3 -c "import yaml; yaml.safe_load(open('agents.yaml'))" + +# Check for nested files (if needed) +find agents/ -name "*.md" -type f +``` + +## Common Tasks + +### View Current Agents +```bash +# List all agent files +find agents/ -name "*.md" -type f + +# View agents in registry +grep " - id:" agents.yaml +``` + +### Check Append-Only File Status +```bash +# View recent additions to specs.md +tail -50 specs.md + +# View recent decisions +tail -50 decisions.md + +# View recent agent runs +tail -50 agent_runs.md +``` + 
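The `tail` commands above only show recent entries; they do not confirm that earlier entries were left untouched. A minimal sketch of such a check follows, assuming the previous version of the file is available as text (for example from version control). The function names `first_divergence` and `is_append_only` are hypothetical and not part of `validate_agents.sh`.

```python
from typing import Optional


def first_divergence(old_text: str, new_text: str) -> Optional[int]:
    """Return the 1-based line number where existing content changed, or None.

    Every line of the old version must still be present, unchanged and in the
    same position, in the new version. Extra lines at the end are allowed.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    for number, old_line in enumerate(old_lines, start=1):
        # A missing or altered line means existing content was modified.
        if number > len(new_lines) or new_lines[number - 1] != old_line:
            return number
    return None


def is_append_only(old_text: str, new_text: str) -> bool:
    """True when the new version only adds lines after the old content."""
    return first_divergence(old_text, new_text) is None
```

One way to use it would be to read the committed version of `specs.md`, `decisions.md`, or `agent_runs.md` and the working copy, then pass both strings to `is_append_only` before committing.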
+### Update Agent Metadata +1. Edit `agents.yaml` to update metadata +2. Increment version number +3. Update `last_modified` date +4. Run `./validate_agents.sh` +5. Append decision to `decisions.md` explaining why + +## Important Rules + +### MUST DO +- Include all 5 required headings in agent files +- Register agents in agents.yaml with unique IDs +- Use tags from the allowed_tags list +- Run validation before committing +- Append to append-only files (never modify existing content) +- Mark assumptions with `[ASSUMPTION]` tags + +### MUST NOT DO +- Modify or delete content in append-only files (specs.md, decisions.md, agent_runs.md) +- Fabricate citations or test results +- Use duplicate agent IDs +- Create agents without required headings +- Skip validation before committing + +## Directory Structure + +Both flat and nested structures are supported: + +``` +# Flat structure (original, still valid) +agents/ +├── Architect.md +├── Builder.md +├── Tester.md +└── SecurityReviewer.md + +# Nested structure (now supported) +agents/ +├── core/ +│ ├── Architect.md +│ └── Builder.md +├── quality/ +│ ├── Tester.md +│ └── SecurityReviewer.md +└── documentation/ + └── Editor.md +``` + +The agents.yaml file paths should match the actual file locations. 
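The path rule above can be sketched as a small check over the parsed registry. This is an illustration under assumptions: the registry is taken to be a dict with an `agents` list (as in the Python validation snippet earlier in this file), each entry is assumed to carry `id` and `file_path` keys, and `missing_agent_files` is a hypothetical helper rather than something `validate_agents.sh` provides.

```python
from pathlib import Path
from typing import List


def missing_agent_files(registry: dict, repo_root: Path) -> List[str]:
    """Return "id: path" strings for registry entries whose file is missing.

    `registry` is the already-parsed content of agents.yaml (assumed shape:
    {"agents": [{"id": ..., "file_path": ...}, ...]}).
    """
    missing = []
    for agent in registry.get("agents", []):
        path = agent.get("file_path")
        # Paths are treated as relative to the repository root.
        if path is None or not (repo_root / path).is_file():
            missing.append(f"{agent.get('id', '<no id>')}: {path}")
    return missing
```

An empty result means every registered path resolves to a file, whether the layout is flat or nested.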
+ +## Troubleshooting + +### Validation Fails +- Check that all 5 headings are present and spelled correctly +- Verify agent ID is unique in agents.yaml +- Ensure tags are from the allowed_tags list +- Confirm file path in agents.yaml matches actual file location + +## Further Documentation + +- `.github/copilot-instructions.md` - Detailed instructions for GitHub Copilot +- `agents.md` - Agent documentation guidelines and template +- `specs.md` - Technical specifications (append-only) +- `decisions.md` - Architectural decisions (append-only) +- `README.md` - Project overview and quick start + +## Platform Compatibility + +This repository structure is compatible with: +- GitHub Copilot Workspace +- OpenAI ChatGPT (this AGENTS.md file) +- Google Gemini +- Google Colab +- Agent-based IDEs (OpenCode.ai) diff --git a/README.md b/README.md index 4487529..de622cf 100644 --- a/README.md +++ b/README.md @@ -21,18 +21,21 @@ as per the other documents feel free to adapt it to fit your needs, adding, modi For more details, see [INSTALL](/INSTALL.md). -## AI Agents (GitHub Copilot) -This template includes AI agent definitions for use with GitHub Copilot. These agents provide structured roles and instructions for different types of development tasks: +## AI Agents (AgentFactory) +This template includes AgentFactory-compatible AI agent definitions and validation tooling. 
These agents provide structured roles and instructions for different types of development tasks: - **Architect** - Spec authoring and system design - **Builder** - Implementation and artifact production -- **Skeptic** - Adversarial review and testing +- **Tester** - Test creation and validation +- **SecurityReviewer** - Security analysis and compliance +- **Skeptic** - Adversarial review and edge cases - **Editor** - Clarity and structure improvements -- **Project Manager** - Coordination and packaging -- **Citation Officer** - Evidence auditing +- **ProjectManager** - Coordination and packaging +- **CitationOfficer** - Evidence auditing - **ChatGPT** - General-purpose execution +- **OpenCodeManager** - Session orchestration for OpenCode -For details on using these agents, see [agents.md](/agents.md). Agent definitions are in [.github/agents/](/.github/agents/). +For details, see [AGENTS.md](/AGENTS.md) and [agents.md](/agents.md). Agent definitions live under [agents/](/agents/). ## Getting involved If for any reason you wish or need to get involved, please start by reading our [CODE OF CONDUCT](/CODE_OF_CONDUCT.md) @@ -47,4 +50,4 @@ should either be available. ## Legal Unless otherwise stated, or where written permission has been given by the copyright holder, this software is for use by the copyright holder only. Public availability of this repository or any of it's contents does grant anyone licence -or rights of any kind. Further information can be found in the [LICENSE](/LICENSE) document. \ No newline at end of file +or rights of any kind. Further information can be found in the [LICENSE](/LICENSE) document. diff --git a/agent_runs.md b/agent_runs.md new file mode 100644 index 0000000..6376e86 --- /dev/null +++ b/agent_runs.md @@ -0,0 +1,321 @@ +# Agent Runs Log + +**NOTE: This file is append-only. Do not modify or remove existing entries. 
Only add new run logs at the end.** + +--- + +## Log Entry Format + +Each agent run entry MUST follow this format: + +``` +## Run #XXX - [Agent-ID] Agent Name +**Date**: YYYY-MM-DD HH:MM:SS UTC +**Status**: Success | Failure | Partial | Aborted +**Duration**: XXm XXs +**Executor**: User/System identifier + +### Inputs +- Input parameter 1: value +- Input parameter 2: value + +### Outputs +- Output 1: description/value +- Output 2: description/value + +### Result Summary +Brief description of what happened during the run. + +### Issues Encountered +- Issue 1: description (if any) +- Issue 2: description (if any) + +### Actions Taken +- Action 1: description +- Action 2: description + +### Related References +- Decision: DEC-XXX +- Spec: SPEC-XXX +- Commit: [commit-hash] + +--- +``` + +## Purpose + +This log maintains a historical record of all agent executions in the Agent Factory. This helps with: +- Debugging and troubleshooting +- Performance tracking +- Audit trail +- Learning from past runs + +## Initial Entry + +--- + +## Run #001 - System Initialization +**Date**: 2026-01-28 17:54:00 UTC +**Status**: Success +**Duration**: 0m 0s +**Executor**: System + +### Inputs +- Action: Initialize Agent Factory structure +- Configuration: Default settings + +### Outputs +- Created agents.yaml configuration file +- Created agents.md documentation file +- Created specs.md specifications file +- Created agent_runs.md log file (this file) +- Created decisions.md decisions file + +### Result Summary +Successfully initialized the Agent Factory repository structure with all required files and documentation. 
+ +### Issues Encountered +None + +### Actions Taken +- Created base configuration files +- Established documentation structure +- Defined validation rules and tests +- Set up append-only file system + +### Related References +- Spec: SPEC-001 (File Structure) +- Spec: SPEC-002 (Agent File Format) +- Spec: SPEC-003 (Tags and Metadata) +- Spec: SPEC-004 (Append-Only Files) +- Spec: SPEC-005 (No Fabrication) +- Spec: SPEC-006 (Markdown Preference) + +--- + +## Run #002 - Agent Analysis and Recommendations +**Date**: 2026-01-28 19:48:00 UTC +**Status**: Success +**Duration**: 15m 30s +**Executor**: GitHub Copilot Agent + +### Inputs +- Action: Analyze existing agents and recommend new agents and specialisms +- Configuration: Review all existing agent definitions, specialisms, and repository structure +- Context: 7 existing agents, 3 existing specialisms + +### Outputs +- Created agent_recommendations.md with comprehensive analysis +- Identified 7 recommended new agents (2 high priority, 4 medium, 1 low) +- Identified 7 recommended new specialisms (2 high priority, 3 medium, 2 low) +- Analyzed gaps in current agent coverage +- Proposed phased implementation plan + +### Result Summary +Successfully analyzed the existing agent system and produced comprehensive recommendations for new agents and specialisms. Analysis identified key gaps in testing, security, deployment, documentation, integration, data modeling, and performance optimization. Recommendations are prioritized and include implementation guidance. 
+ +### Issues Encountered +None + +### Actions Taken +- Reviewed all 7 existing agent definitions in agents/ directory +- Reviewed all 3 existing specialisms in specialisms/ directory +- Analyzed agents.yaml configuration and validation rules +- Reviewed specs.md for technical requirements +- Reviewed decisions.md for design rationale +- Identified gaps in lifecycle coverage +- Developed prioritized recommendations +- Created comprehensive recommendations document +- Proposed integration with existing workflow +- Defined success metrics + +### Recommendations Summary +**High Priority:** +- Tester Agent (Test Creator + Quality Validator) +- SecurityReviewer Agent (Security Analyst + Compliance Auditor) +- Security Specialism +- Testing Specialism + +**Medium Priority:** +- Deployer Agent (Deployment Engineer + Operations Specialist) +- DocWriter Agent (Technical Writer + UX Documentation Specialist) +- Integrator Agent (Integration Architect + API Designer) +- DataModeler Agent (Data Architect + Schema Designer) +- API Design Specialism +- Deployment Specialism +- Documentation Specialism + +**Low Priority:** +- Optimizer Agent (Performance Engineer + Efficiency Analyst) +- Data Specialism +- Performance Specialism + +### Related References +- Output: agent_recommendations.md +- Spec: SPEC-001 (File Structure) +- Spec: SPEC-002 (Agent File Format) +- Spec: SPEC-003 (Tags and Metadata) +- Decision: DEC-009 (Agent Analysis Recommendations - see decisions.md) + +--- + +## Run #003 - Phase 1 Agent Implementation +**Date**: 2026-01-29 01:32:00 UTC +**Status**: Success +**Duration**: 25m 15s +**Executor**: GitHub Copilot Agent + +### Inputs +- Action: Implement Phase 1 high-priority agent recommendations +- Configuration: Based on agent_recommendations.md Phase 1 specifications +- Context: 7 existing agents, 3 existing specialisms + +### Outputs +- Created agents/Tester.md (6,408 bytes, 217 lines) +- Created agents/SecurityReviewer.md (7,290 bytes, 239 lines) +- Created 
specialisms/Testing.md (3,888 bytes, 147 lines) +- Created specialisms/Security.md (5,957 bytes, 221 lines) +- Updated agents.yaml with 2 new agent entries +- Added 2 new tags to allowed_tags: security, quality + +### Result Summary +Successfully implemented Phase 1 high-priority recommendations by creating Tester and SecurityReviewer agents with their supporting specialisms. Both agents follow the required structure with all five headings (Purpose, Inputs, Outputs, Behavior, Constraints). New agents integrated into agents.yaml with unique IDs and appropriate tags. + +### Issues Encountered +- Validation script TEST-003-2 has a bug: uses `awk '{print $2}'` which extracts "id:" literal instead of actual ID values (should use $3) +- Python YAML validation confirms all agent IDs are unique +- All agent files pass required heading validation +- All tags are from allowed list +- Flat file structure maintained + +### Actions Taken +- Designed Tester agent based on Testing specialism standards + - Comprehensive test planning and execution + - Coverage analysis and quality metrics + - Unit, integration, and acceptance testing +- Designed SecurityReviewer agent based on Security specialism standards + - OWASP Top 10 compliance checking + - Vulnerability assessment with severity classification + - Security best practices enforcement +- Created Testing specialism defining: + - Test types and naming conventions + - Quality gates for testing + - Coverage metrics + - AAA pattern and test independence +- Created Security specialism defining: + - Security principles (least privilege, defense in depth) + - OWASP Top 10 checklist + - Secure coding practices + - Severity classification +- Added new tags: security, quality +- Updated agents.yaml with proper YAML formatting +- Validated with Python YAML parser + +### Quality Metrics +- Agent files: 2 created +- Specialism files: 2 created +- Total lines added: ~824 lines +- Validation: 25/26 tests pass (1 known script bug) +- Python 
validation: All checks pass +- Required headings: All present in correct order +- File structure: Flat (no nested directories) +- Tag compliance: All tags from allowed list + +### Related References +- Recommendation: agent_recommendations.md (Phase 1) +- Decision: DEC-009 (Agent Expansion Strategy) +- Previous Run: Run #002 (Recommendations Analysis) +- Spec: SPEC-001 (File Structure) - maintained +- Spec: SPEC-002 (Agent File Format) - followed +- Spec: SPEC-003 (Tags and Metadata) - compliant + +--- + +## Run #004 - Add AGENTS.md for OpenAI ChatGPT Compatibility +**Date**: 2026-01-29 01:44:00 UTC +**Status**: Success +**Duration**: 8m 30s +**Executor**: GitHub Copilot Agent + +### Inputs +- Action: Review and align with OpenAI ChatGPT agents.md standard +- Reference: https://github.com/agentsmd/agents.md +- Context: User requested alignment with OpenAI AGENTS.md format + +### Outputs +- Created AGENTS.md (5,694 bytes, 202 lines) +- Added DEC-012 to decisions.md +- Added Run #004 to agent_runs.md + +### Result Summary +Successfully created AGENTS.md file following the OpenAI ChatGPT agents.md standard. The file provides simple, practical instructions for AI agents working on the AgentFactory project, complementing the existing comprehensive documentation in .github/copilot-instructions.md. 
+ +### Key Features of AGENTS.md +- **Project Overview**: Brief description of AgentFactory +- **Key Files Structure**: Quick reference to important files +- **Development Workflow**: Step-by-step agent creation process +- **Coding Conventions**: Style guidelines for markdown, YAML, append-only files +- **Testing**: Validation commands and checks +- **Common Tasks**: Frequently used commands and operations +- **Important Rules**: Clear MUST/MUST NOT lists +- **Directory Structure**: Examples of flat and nested structures +- **Troubleshooting**: Solutions to common issues +- **Platform Compatibility**: List of supported AI platforms + +### Platform Compatibility +The AGENTS.md format is recognized by: +- OpenAI ChatGPT (primary target) +- GitHub Copilot (reads AGENTS.md as fallback) +- Google Gemini +- Google Colab +- Agent-based IDEs (OpenCode.ai) +- Any AI agent following the agents.md convention + +### Documentation Strategy +The project now has layered documentation: +1. **AGENTS.md** - Simple, practical quick reference (NEW) +2. **.github/copilot-instructions.md** - Comprehensive GitHub Copilot guide +3. **agents.md** - Agent definition template and guidelines +4. **README.md** - Human-readable project overview +5. **specs.md** - Technical specifications (append-only) +6. **decisions.md** - Architectural decisions (append-only) + +Each serves a different purpose and audience, with minimal duplication. 
+ +### Design Choices +- **Practical over Comprehensive**: Focus on common tasks +- **Code Examples**: Include actual commands to run +- **Quick Reference**: Easy to scan and find information +- **Troubleshooting Section**: Address known issues upfront +- **Platform-Agnostic**: Language works for any AI agent + +### Issues Encountered +None + +### Actions Taken +- Reviewed OpenAI agents.md standard and examples +- Created AGENTS.md following the format +- Structured content for AI agent consumption +- Included practical examples and commands +- Documented common workflows +- Added troubleshooting section +- Listed platform compatibility +- Documented decision in DEC-012 +- Added this run log entry + +### Quality Metrics +- File created: 1 +- Lines added: 202 +- Size: 5.7 KB +- Sections: 11 main sections +- Code examples: Multiple bash and Python snippets +- Validation: File follows standard markdown format + +### Related References +- Standard: https://github.com/agentsmd/agents.md +- Decision: DEC-012 (AGENTS.md Addition) +- Previous Run: Run #003 (Phase 1 Implementation) +- Related Decision: DEC-011 (Flexible Directory Structure) + +--- + diff --git a/agents.md b/agents.md index 0a9c6de..0f95860 100644 --- a/agents.md +++ b/agents.md @@ -1,51 +1,27 @@ -# AI Agents Documentation +# Agents Documentation ## Overview -This document defines the rules and structure for AI agent files used with GitHub Copilot. All agents MUST follow these rules to ensure consistency and compatibility with Copilot's agent framework. - -## Purpose -These agent definitions provide structured instructions for GitHub Copilot to assume specific roles when working on tasks. 
Each agent has: -- A clear role and objective -- Defined scope (what's in and out of scope) -- Specific inputs and outputs -- Operating constraints -- Success criteria - -## Agent Directory Structure -``` -.github/ - agents/ - Architect.md - Builder.md - Skeptic.md - Editor.md - ProjectManager.md - CitationOfficer.md - ChatGPT.md -agents.yaml (configuration) -agents.md (this file) -``` +This document defines the rules and structure for agent files in the Agent Factory repository. All agents MUST follow these rules to ensure consistency and maintainability. ## Rules for Agent Files ### MUST Requirements -1. **File Location**: All agent files MUST be stored in `.github/agents/` directory - - **Test**: Verify all agent files are in the correct location - - **Pass**: All agent files found in `.github/agents/` - - **Fail**: Any agent file found elsewhere - -2. **Required Headings**: All agent files MUST include the following headings: - - `## Role and Objective` - What the agent does and optimizes for - - `## Scope` - What is in scope and out of scope - - `## Inputs` - Required and optional inputs - - `## Outputs` - Primary and secondary outputs - - `## Constraints` - Time, budget, style, safety constraints - - `## Success Criteria` - Measurable criteria for success +1. **Flexible File Structure**: Agent files MUST be stored under the `agents/` directory and MAY be nested in subdirectories. All `file_path` values in `agents.yaml` MUST point to existing files. + - **Test**: Verify all agent file paths defined in `agents.yaml` exist + - **Pass**: Every referenced file exists + - **Fail**: Any referenced file is missing + +2. 
**Required Headings**: All agent files MUST include the following headings in order: + - `## Purpose` - What the agent does + - `## Inputs` - What data/parameters the agent requires + - `## Outputs` - What the agent produces + - `## Behavior` - How the agent operates + - `## Constraints` - Limitations and boundaries - **Test**: Parse markdown files and verify all headings are present - - **Pass**: All required headings exist - - **Fail**: Any required heading is missing + - **Pass**: All required headings exist in the correct order + - **Fail**: Any required heading is missing or out of order -3. **Tags**: All agent files MUST have at least one tag defined in agents.yaml +3. **Tags**: All agent files MUST have at least one tag defined in the agents.yaml file - **Test**: Verify each agent in agents.yaml has tags array with at least one entry - **Pass**: All agents have 1+ tags - **Fail**: Any agent has zero tags @@ -60,101 +36,113 @@ agents.md (this file) - **Pass**: All files exist - **Fail**: Any referenced file is missing -6. **No Fabrication**: Agent documentation MUST NOT fabricate citations, results, or data. All references MUST be verifiable or marked as `[ASSUMPTION]`. +6. **No Fabrication**: Agent documentation MUST NOT fabricate citations, results, or data. All references MUST be verifiable. - **Test**: Manual review or citation validation - - **Pass**: All citations are verifiable or properly tagged + - **Pass**: All citations are verifiable - **Fail**: Fabricated or unverifiable citations found ### SHOULD Requirements 1. **Descriptive Names**: Agent names SHOULD be descriptive and clearly indicate their purpose 2. **Version Control**: Agents SHOULD include semantic version numbers -3. **Examples**: Agent documentation SHOULD include usage examples or operating procedures -4. **Operating Procedure**: Agents SHOULD document their step-by-step operating procedure +3. **Examples**: Agent documentation SHOULD include usage examples +4. 
**Change History**: Changes to agents SHOULD be documented in CHANGELOG.md ### MAY Requirements 1. **Additional Metadata**: Agents MAY include additional custom metadata fields 2. **External Resources**: Agents MAY reference external documentation or resources -3. **Specializations**: Agents MAY reference domain-specific addendum files -4. **Standard Response Format**: Agents MAY document their expected output format +3. **Diagrams**: Agent documentation MAY include diagrams or flowcharts +4. **Performance Notes**: Agents MAY document performance characteristics + +## Agent File Template -## Agent Roles +```markdown +# Agent Name -### Core Agents +## Purpose +Describe what this agent does and why it exists. -1. **Architect** - Spec Author + System Designer - - Converts user goals into testable, unambiguous specs - - Defines interfaces, constraints, risks, and acceptance tests - - Produces specs suitable for implementation by other agents +## Inputs +- Input 1: Description and format +- Input 2: Description and format -2. **Builder** - Implementer / Artifact Producer - - Turns specs into deliverables - - Produces runnable/usable artifacts that are flat-file compatible - - Ensures compliance with quality gates +## Outputs +- Output 1: Description and format +- Output 2: Description and format -3. **Skeptic** - Adversarial Reviewer / Breaker - - Stress-tests specs and artifacts - - Finds ambiguity, edge cases, contradictions, and failure modes - - Proposes minimal patches to improve robustness +## Behavior +Describe how the agent processes inputs to produce outputs. -4. **Editor** - Clarity + Structure Editor - - Improves readability and structure without changing intent - - Enforces tagging and required headings - - Standardizes terminology and tightens requirements language +1. Step 1 +2. Step 2 +3. Step 3 -5. 
**Project Manager** - Packaging + Orchestration - - Coordinates the agent pipeline - - Maintains logs and ensures consistency - - Produces "next actions" and release-ready bundles +## Constraints +- Constraint 1: Description +- Constraint 2: Description -6. **Citation Officer** - Evidence Auditor + Claim Tracker - - Audits for unsupported factual claims - - Enforces "no fabricated citations" policy - - Produces claim→evidence maps +## Tags +Tags are defined in agents.yaml and help categorize this agent. -7. **ChatGPT** - Generalist Execution Agent - - Default agent for interactive tasks - - Can perform any role when needed - - Maintains auditability via logs and tags +## Version History +- v1.0.0 (2026-01-28): Initial version +``` -## Using Agents with GitHub Copilot +## Validation Process -To reference an agent in your Copilot instructions: +To validate agent files, run the following checks: -1. **In Comments**: Reference the agent by name - ```python - # @Architect: Please create a spec for user authentication +1. **Structure Check**: Verify all referenced agent files exist + ```bash + grep 'file_path:' agents.yaml | awk '{print $2}' | tr -d '"' | while read -r path; do + [ -f "$path" ] || echo "FAIL: Missing $path" + done ``` -2. **In Issues/PRs**: Tag the agent in descriptions - ```markdown - @Skeptic: Please review this implementation for edge cases +2. 
**Heading Check**: Verify required headings exist in correct order + ```bash + for file in agents/*.md; do + # Check presence of all required headings; skip the order check if any is missing + grep -q "^## Purpose" "$file" && + grep -q "^## Inputs" "$file" && + grep -q "^## Outputs" "$file" && + grep -q "^## Behavior" "$file" && + grep -q "^## Constraints" "$file" || + { echo "FAIL: Missing headings in $file"; continue; } + + # Check heading order (Purpose < Inputs < Outputs < Behavior < Constraints) + purpose_line=$(grep -n "^## Purpose" "$file" | cut -d: -f1) + inputs_line=$(grep -n "^## Inputs" "$file" | cut -d: -f1) + outputs_line=$(grep -n "^## Outputs" "$file" | cut -d: -f1) + behavior_line=$(grep -n "^## Behavior" "$file" | cut -d: -f1) + constraints_line=$(grep -n "^## Constraints" "$file" | cut -d: -f1) + + [ "$purpose_line" -lt "$inputs_line" ] && + [ "$inputs_line" -lt "$outputs_line" ] && + [ "$outputs_line" -lt "$behavior_line" ] && + [ "$behavior_line" -lt "$constraints_line" ] || + echo "FAIL: Headings out of order in $file" + done ``` -3. **In Custom Instructions**: Load agent definitions - ```markdown - Use the Architect agent role from .github/agents/Architect.md +3. **YAML Validation**: Validate agents.yaml structure + ```bash + # Requires yq or similar YAML parser + yq eval '.agents[].id' agents.yaml | sort | uniq -d | grep . && echo "FAIL: Duplicate IDs" || echo "PASS" ``` -## Tagging Conventions - -Use these tags in agent outputs to maintain auditability: -- `[SPEC]` - Specification or requirement -- `[ASSUMPTION]` - Explicit assumption made due to missing information -- `[RISK]` - Identified risk or concern -- `[TODO]` - Action item or follow-up needed -- `[DECISION]` - Non-trivial design decision made -- `[TEST]` - Test case or verification step -- `[DONE]` - Completed item or resolved issue - ## Adding New Agents To add a new agent: -1. Create the agent markdown file in `.github/agents/` +1. Ensure the `agents/` directory exists (create it if needed: `mkdir -p agents`) 2. 
Define the agent in `agents.yaml` with all required fields -3. Ensure all required headings are present -4. Add appropriate tags from the allowed list -5. Update this documentation if needed +3. Create the agent markdown file under `agents/` (subdirectories allowed) +4. Ensure all required headings are present +5. Add appropriate tags from the allowed list +6. Run validation tests +7. Document the addition in `decisions.md` + +**Note**: The `agents/` directory MUST exist before you reference agent files in `agents.yaml`. If you define an agent in `agents.yaml` before creating the directory, the validation script will report an error. ## Modifying Existing Agents @@ -163,28 +151,16 @@ When modifying agents: 1. Update the agent definition in `agents.yaml` if metadata changes 2. Update the markdown file with changes 3. Increment the version number -4. Document significant changes in the repository's CHANGELOG.md or commit messages - -## Validation - -These agent files follow a standard structure with required headings. To validate: -- Verify all files exist in `.github/agents/` -- Check all required headings are present -- Confirm all agents in `agents.yaml` reference existing files -- Verify all agent IDs are unique +4. Document changes in the agent's version history +5. Record the decision in `decisions.md` -A validation script may be added in the future for automated checking. 
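The heading rule this diff introduces in `agents.md` (five `## ` headings, in a fixed order) can also be checked with one reusable function. The following is a minimal sketch under stated assumptions, not the repository's validation script: `check_heading_order` is a hypothetical helper name, and the glob assumes agent files sit at most one subdirectory below `agents/`, as `agents/opencode/OpenCodeManager.md` does.

```bash
#!/usr/bin/env bash
# Sketch only: check_heading_order is a hypothetical helper, not part of the repository.
# It reports the first required heading that is missing or out of order.

check_heading_order() {
  local file="$1" prev=0 heading line
  for heading in Purpose Inputs Outputs Behavior Constraints; do
    # Line number of the first line that is exactly "## <heading>" (trailing spaces allowed)
    line=$(grep -n -m 1 -E "^## ${heading}[[:space:]]*$" "$file" | cut -d: -f1) || true
    if [ -z "$line" ]; then
      echo "FAIL: Missing '## $heading' in $file"
      return 1
    fi
    if [ "$line" -le "$prev" ]; then
      echo "FAIL: '## $heading' out of order in $file"
      return 1
    fi
    prev=$line
  done
  echo "PASS: $file"
}

# Check top-level agent files and files one subdirectory down (assumed layout).
failed=0
for file in agents/*.md agents/*/*.md; do
  [ -f "$file" ] || continue   # skip glob patterns that matched nothing
  check_heading_order "$file" || failed=1
done
[ "$failed" -eq 0 ] || echo "One or more agent files failed the heading check"
```

Because the function stops at the first problem, a file with a missing heading is reported once as missing and is never compared numerically against an empty line number.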
+## Append-Only Files -## Compatibility with GitHub Copilot +The following files are **append-only** and MUST NOT have content removed or modified (only additions at the end): +- `specs.md` - Technical specifications and requirements +- `agent_runs.md` - Log of agent execution runs +- `decisions.md` - Architectural and design decisions -These agent definitions are designed to work with: -- GitHub Copilot Workspace -- GitHub Copilot Chat -- GitHub Copilot CLI -- Custom agent frameworks +## Markdown Output Preference -The structured format ensures Copilot can: -- Understand the agent's role and constraints -- Follow the specified operating procedure -- Produce outputs in the expected format -- Maintain consistency across tasks +All agent outputs and documentation SHOULD be in Markdown format for consistency and readability. diff --git a/agents.yaml b/agents.yaml index e3333c0..364aa72 100644 --- a/agents.yaml +++ b/agents.yaml @@ -1,155 +1,91 @@ # agents.yaml -# Configuration file for AI Agent definitions -# This file defines the structure and metadata for agents in GitHub Copilot +# Configuration file for Agent Factory +# This file defines the structure and metadata for agents in the system # All agents MUST be defined here before they can be used # -# IMPORTANT: This file should be placed at the repository root or in .github/ -# The agents themselves are defined in .github/agents/ directory +# IMPORTANT: Before defining agents with file_path, ensure the agents/ directory exists. +# The validation script will check for this directory and create it if needed. 
# Schema version schema_version: "1.0.0" # Agent definitions agents: - - id: "architect-001" - name: "Architect" - description: "Spec Author + System Designer - converts user goals into testable specs" - version: "1.0.0" - tags: - - specification - - design - - planning - required_headings: - - "Role and Objective" - - "Scope" - - "Inputs" - - "Outputs" - - "Constraints" - - "Success Criteria" - file_path: ".github/agents/Architect.md" - status: "active" - created_date: "2026-01-28" - last_modified: "2026-01-28" - - - id: "builder-001" - name: "Builder" - description: "Implementer / Artifact Producer - turns specs into deliverables" - version: "1.0.0" - tags: - - implementation - - development - - artifacts - required_headings: - - "Role and Objective" - - "Scope" - - "Inputs" - - "Outputs" - - "Constraints" - - "Success Criteria" - file_path: ".github/agents/Builder.md" - status: "active" - created_date: "2026-01-28" - last_modified: "2026-01-28" - - - id: "skeptic-001" - name: "Skeptic" - description: "Adversarial Reviewer / Breaker - finds edge cases and failure modes" - version: "1.0.0" - tags: - - review - - testing - - quality-assurance - required_headings: - - "Role and Objective" - - "Scope" - - "Inputs" - - "Outputs" - - "Constraints" - - "Success Criteria" - file_path: ".github/agents/Skeptic.md" - status: "active" - created_date: "2026-01-28" - last_modified: "2026-01-28" - - - id: "editor-001" - name: "Editor" - description: "Clarity + Structure Editor - improves readability without changing intent" + - id: "example-agent-001" + name: "Example Documentation Agent" + description: "A demonstration agent showing the required structure for agent documentation" version: "1.0.0" tags: - documentation - - clarity - - structure + - automation + - utility required_headings: - - "Role and Objective" - - "Scope" + - "Purpose" - "Inputs" - "Outputs" + - "Behavior" - "Constraints" - - "Success Criteria" - file_path: ".github/agents/Editor.md" + file_path: 
"agents/example-documentation-agent.md" status: "active" created_date: "2026-01-28" last_modified: "2026-01-28" - - id: "project-manager-001" - name: "Project Manager" - description: "Packaging + Orchestration - coordinates the agent pipeline" + - id: "tester-001" + name: "Tester" + description: "Test Creator + Quality Validator - Creates comprehensive test suites, validates test coverage, ensures quality gates are met through systematic testing" version: "1.0.0" tags: - - coordination - - packaging - - orchestration + - testing + - quality + - automation required_headings: - - "Role and Objective" - - "Scope" + - "Purpose" - "Inputs" - "Outputs" + - "Behavior" - "Constraints" - - "Success Criteria" - file_path: ".github/agents/ProjectManager.md" + file_path: "agents/Tester.md" status: "active" - created_date: "2026-01-28" - last_modified: "2026-01-28" + created_date: "2026-01-29" + last_modified: "2026-01-29" - - id: "citation-officer-001" - name: "Citation Officer" - description: "Evidence Auditor + Claim Tracker - ensures no fabricated citations" + - id: "security-reviewer-001" + name: "SecurityReviewer" + description: "Security Analyst + Compliance Auditor - Identifies security vulnerabilities, enforces security best practices, ensures compliance with security standards" version: "1.0.0" tags: - - verification - - citations - - audit + - security + - analysis + - quality required_headings: - - "Role and Objective" - - "Scope" + - "Purpose" - "Inputs" - "Outputs" + - "Behavior" - "Constraints" - - "Success Criteria" - file_path: ".github/agents/CitationOfficer.md" + file_path: "agents/SecurityReviewer.md" status: "active" - created_date: "2026-01-28" - last_modified: "2026-01-28" + created_date: "2026-01-29" + last_modified: "2026-01-29" - - id: "chatgpt-001" - name: "ChatGPT" - description: "Generalist Execution Agent - default agent for interactive tasks" + - id: "opencode-manager-001" + name: "OpenCodeManager" + description: "Orchestration + Session Control - 
Coordinates OpenCode sessions, enforces AgentFactory rules, and manages role handoffs" version: "1.0.0" tags: - - general-purpose - - interactive - - execution + - automation + - integration + - utility required_headings: - - "Role and Objective" - - "Scope" + - "Purpose" - "Inputs" - "Outputs" + - "Behavior" - "Constraints" - - "Success Criteria" - file_path: ".github/agents/ChatGPT.md" + file_path: "agents/opencode/OpenCodeManager.md" status: "active" - created_date: "2026-01-28" - last_modified: "2026-01-28" + created_date: "2026-02-05" + last_modified: "2026-02-05" # Validation rules validation_rules: @@ -161,8 +97,6 @@ validation_rules: test: "check_required_headings" - rule: "All agents MUST have at least one tag" test: "check_tags_present" - - rule: "All agent files MUST be in .github/agents/ directory" - test: "check_file_location" - rule: "All agent file paths MUST exist" test: "check_file_exists" @@ -179,33 +113,21 @@ validation_rules: # Allowed tags (extensible list) allowed_tags: - - specification - - design - - planning - - implementation - - development - - artifacts - - review + - automation + - analysis - testing - - quality-assurance - documentation - - clarity - - structure - - coordination - - packaging - - orchestration - - verification - - citations - - audit - - general-purpose - - interactive - - execution + - deployment + - monitoring + - integration + - utility + - security + - quality # Required headings for agent documentation files required_headings: - - "Role and Objective" - - "Scope" + - "Purpose" - "Inputs" - "Outputs" + - "Behavior" - "Constraints" - - "Success Criteria" diff --git a/.github/agents/Architect.md b/agents/Architect.md similarity index 63% rename from .github/agents/Architect.md rename to agents/Architect.md index 10e9478..e4fbc78 100644 --- a/.github/agents/Architect.md +++ b/agents/Architect.md @@ -5,7 +5,7 @@ - role: Spec Author + System Designer - primary_objective: Convert a user goal into a testable, unambiguous 
spec (v0.1 → refined), including interfaces, constraints, risks, and acceptance tests, suitable for flat-file execution by other agents. -## Role and Objective +## Purpose Architect is responsible for turning ambiguous intent into a spec that other agents can implement without needing further clarification. Architect optimizes for: - clarity over completeness, - testability over prose, @@ -14,21 +14,6 @@ Architect is responsible for turning ambiguous intent into a spec that other age Architect is the "source of truth" for **what** gets built and **how success is measured**, not for building the artifact itself (that's Builder). -## Scope -**In scope** -- Write and refine task specs with clear sections for context, requirements, constraints, assumptions, risks, and tests. -- Define acceptance criteria as pass/fail checks. -- Identify unknowns and convert them into `[ASSUMPTION]` or explicit open questions. -- Produce interface definitions (inputs/outputs/file names) compatible with flat-file repos. -- Identify risks, failure modes, and mitigations. -- Maintain audit trail: provide `[DECISION]` entries when design choices are made. - -**Out of scope** -- Implementing full artifacts unless explicitly requested as a fallback. -- Inventing external facts, citations, benchmarks, or results not provided. -- Creating nested directory structures or multi-file trees beyond flat filenames. -- Overriding `agents.yaml` / `agents.md` constraints. - ## Inputs **Required** - `agents.yaml` @@ -47,7 +32,7 @@ Architect is the "source of truth" for **what** gets built and **how success is ## Outputs **Primary** -- A new spec section appended to `specs.md` (append-only), with clear sections for context, requirements, constraints, assumptions, risks, and tests. +- A new spec section appended to `specs.md` (append-only), compliant with `schema.spec_fields`. - A "Spec v0.1" (or refined v0.2+) in markdown with required headings. 
**Secondary** @@ -59,6 +44,23 @@ Architect is the "source of truth" for **what** gets built and **how success is - Decision log: `decisions.md` (snippets) - Run log: `agent_runs.md` (snippets) +## Behavior +Architect processes tasks through the following workflow: + +**In scope** +- Write and refine task specs following `schema.spec_fields`. +- Define acceptance criteria as pass/fail checks. +- Identify unknowns and convert them into `[ASSUMPTION]` or explicit open questions. +- Produce interface definitions (inputs/outputs/file names) compatible with flat-file repos. +- Identify risks, failure modes, and mitigations. +- Maintain audit trail: provide `[DECISION]` entries when design choices are made. + +**Out of scope** +- Implementing full artifacts unless explicitly requested as a fallback. +- Inventing external facts, citations, benchmarks, or results not provided. +- Creating nested directory structures or multi-file trees beyond flat filenames. +- Overriding `agents.yaml` / `agents.md` constraints. + ## Constraints - time_budget: "interactive; produce spec in one response" - word_budget: "tight and testable; no essay specs" @@ -74,55 +76,3 @@ Architect is the "source of truth" for **what** gets built and **how success is - Risks include mitigations. - Definition of Done is measurable. - Spec is appended to `specs.md` without overwriting existing specs. - -## Operating Procedure -### intake phase -1. Restate the user's goal in one paragraph. -2. List unknowns as bullets. Convert any that cannot be resolved into explicit `[ASSUMPTION]`. -3. Provide a 3–7 step plan for creating the spec. - -### spec phase -1. Draft "Spec v0.1" following required structure. -2. Write testable success criteria (pass/fail). -3. Enumerate assumptions `[ASSUMPTION]` and risks `[RISK]`. -4. Define clear interfaces: inputs, outputs, file names. - -### production phase -1. Refine spec based on feedback. -2. Ensure all MUST/SHOULD/MAY requirements are testable. -3. 
Add acceptance criteria as explicit checks. - -### review phase -1. Self-check for ambiguity and missing details. -2. Verify interfaces are complete and unambiguous. -3. Ensure risks have mitigations. - -### finalization phase -1. Append final spec to `specs.md` (if applicable). -2. Provide spec in a code block with filename. -3. List next actions for Builder/Skeptic. - -## Definition of Done -- Spec is implementable without clarification. -- All requirements are testable. -- Interfaces are explicit. -- Assumptions and risks are documented. -- Spec is ready for handoff to Builder. - -## Standard Response Format -**Header** -- Goal restatement -- Unknowns + assumptions -- Spec creation plan - -**Deliverable** -- Spec v0.1 (or refined version) in code block with filename - -**Notes** -- Assumptions -- Risks + mitigations -- Open questions (if any) - -**Next actions** -- 3–5 follow-up tasks for Builder/Skeptic/Editor - diff --git a/.github/agents/Builder.md b/agents/Builder.md similarity index 89% rename from .github/agents/Builder.md rename to agents/Builder.md index 117e538..bd5c883 100644 --- a/.github/agents/Builder.md +++ b/agents/Builder.md @@ -5,35 +5,17 @@ - role: Implementer / Artifact Producer - primary_objective: Produce the requested artifact v0.1 from a given spec, ensuring it is runnable/usable, flat-file compatible, and aligned to quality gates. -## Role and Objective +## Purpose Builder turns specs into deliverables. Builder optimizes for: - correctness and usability over theoretical elegance, - minimal viable implementation over feature creep, -- reproducibility and clear run steps, +- reproducibility and clear run-steps, - strict adherence to spec requirements (MUST > SHOULD > MAY). Builder does **not** invent requirements. If something is missing from the spec, Builder flags it as an unknown or proposes a patch for Architect. 
-## Scope -**In scope** -- Implement artifacts described by a spec: - - markdown docs - - code files and scripts - - minimal runnable prototypes - - data schemas (as files) - - figure-generation scripts (if applicable) -- Provide "Known limitations" and "Next iteration targets". -- Provide basic tests/checklists where applicable. -- Keep outputs compatible with a flat-file repo. - -**Out of scope** -- Rewriting the spec to add new features without approval. -- Fabricating experiment results, benchmarks, citations, or "I executed this" claims. -- Multi-directory scaffolding (unless explicitly allowed, which currently it is not). -- Unsafe/harmful deliverables. - ## Inputs **Required** - A task spec (from `specs.md` or pasted into chat) @@ -64,22 +46,27 @@ Builder does **not** invent requirements. If something is missing from the spec, - `{artifact_name}.yaml` for configs - `README.md` only if explicitly requested (to avoid ambiguous repo-level changes) -## Constraints -- time_budget: "one session; prioritize v0.1" -- word_budget: "enough to be runnable; avoid bloat" -- compute_budget: "chat-only; provide run steps rather than executing" -- style: "simple, intent-named functions, clear headings, explicit assumptions" -- citations: "no fabricated citations; mark unknowns [ASSUMPTION]" -- safety: "refuse or redirect wrongdoing; no partial harmful guidance" +## Behavior +Builder implements artifacts through the following workflow: -## Success Criteria -- Artifact satisfies all MUST requirements in the spec. -- Artifact is copy/paste-ready into flat files with explicit filenames. -- Basic failure modes are handled or documented. -- Run steps are clear enough that a user can execute from a clean checkout. -- Any deviations from spec are documented as `[RISK]` + mitigation or `[ASSUMPTION]`. 
+**In scope** +- Implement artifacts described by a spec: + - markdown docs + - code files and scripts + - minimal runnable prototypes + - data schemas (as files) + - figure-generation scripts (if applicable) +- Provide "Known limitations" and "Next iteration targets". +- Provide basic tests/checklists where applicable. +- Keep outputs compatible with a flat-file repo. + +**Out of scope** +- Rewriting the spec to add new features without approval. +- Fabricating experiment results, benchmarks, citations, or "I executed this" claims. +- Multi-directory scaffolding (unless explicitly allowed, which currently it is not). +- Unsafe/harmful deliverables. -## Operating Procedure +**Operating Procedure** ### intake phase 1. Restate the task + target file outputs. 2. List missing info required to build. @@ -113,13 +100,21 @@ Builder does **not** invent requirements. If something is missing from the spec, 2. Provide usage notes. 3. Provide a short changelog entry. -## Definition of Done +## Constraints +- time_budget: "one session; prioritize v0.1" +- word_budget: "enough to be runnable; avoid bloat" +- compute_budget: "chat-only; provide run steps rather than executing" +- style: "simple, intent-named functions, clear headings, explicit assumptions" +- citations: "no fabricated citations; mark unknowns [ASSUMPTION]" +- safety: "refuse or redirect wrongdoing; no partial harmful guidance" + +**Definition of Done** - Artifact v0.1 delivered in file-ready format. - MUST requirements pass via explicit checks/tests. - Limitations and next targets listed. - Flat-file constraint respected. 
-## Standard Response Format +**Standard Response Format** **Header** - Task restatement - Missing info + assumptions diff --git a/.github/agents/ChatGPT.md b/agents/ChatGPT.md similarity index 86% rename from .github/agents/ChatGPT.md rename to agents/ChatGPT.md index a95b570..9898e36 100644 --- a/.github/agents/ChatGPT.md +++ b/agents/ChatGPT.md @@ -5,24 +5,11 @@ - role: Generalist Execution Agent (interactive) - primary_objective: Turn provided specs + context into concrete, testable deliverables inside a ChatGPT chat session, while maintaining auditability via logs and tags. -## Role and Objective +## Purpose ChatGPT operates as the default "doer" agent in the Agent Factory system. It can (a) draft or refine specs, (b) produce artifacts (text/code/figures instructions), (c) perform reviews against quality gates, and (d) package outputs in the flat-file format required by this repo. ChatGPT must behave deterministically where possible: make assumptions explicit, separate facts from guesses, and produce outputs that a human can copy into files with minimal friction. -## Scope -**In scope** -- Convert a user request into: Spec v0.1 → Artifact v0.1 → Review → Finalization. -- Produce flat-file artifacts as markdown/yaml/json/code blocks with explicit filenames. -- Maintain auditability by generating append-only entries for `agent_runs.md` and `decisions.md` when requested (or when a decision is non-trivial). -- Enforce tagging conventions: `[SPEC] [ASSUMPTION] [RISK] [TODO] [DECISION] [TEST] [DONE]`. - -**Out of scope** -- Claiming that external files were read when they were not provided. -- Fabricating citations, results, benchmark numbers, experiment logs, or "I ran this" claims. -- Writing instructions intended for wrongdoing, harm, or unsafe behavior. -- Creating nested directories or referencing paths outside the flat-file hierarchy. 
- ## Inputs **Required** - `agents.yaml` (this system spec; may be pasted into chat) @@ -52,23 +39,22 @@ ChatGPT must behave deterministically where possible: make assumptions explicit, - Specs: `specs.md` - Logs: `agent_runs.md`, `decisions.md` -## Constraints -- time_budget: "interactive" (must complete within a single chat response whenever feasible) -- word_budget: "as needed, but avoid bloat; prioritize testability" -- compute_budget: "chat-only unless user provides runnable environment details" -- style: "clear headings, tagged statements, reproducible steps" -- citations: "no fabricated citations; if not provided, use [ASSUMPTION]" -- safety: "refuse or redirect harmful wrongdoing requests; do not provide partial harmful instructions" +## Behavior +ChatGPT processes tasks through the following workflow: -## Success Criteria -- Output uses the flat-file constraint (no nested paths). -- Each deliverable is provided in a code block with a filename header comment. -- Each phase includes its minimum outputs as described in this agent specification and its Operating Procedure. -- Non-trivial claims are either supported by provided sources or tagged `[ASSUMPTION]`. -- A reviewer can reproduce the logic chain from inputs → decisions → outputs. -- Clear "tests" or checks exist for pass/fail of key requirements. +**In scope** +- Convert a user request into: Spec v0.1 → Artifact v0.1 → Review → Finalization. +- Produce flat-file artifacts as markdown/yaml/json/code blocks with explicit filenames. +- Maintain auditability by generating append-only entries for `agent_runs.md` and `decisions.md` when requested (or when a decision is non-trivial). +- Enforce tagging conventions: `[SPEC] [ASSUMPTION] [RISK] [TODO] [DECISION] [TEST] [DONE]`. -## Operating Procedure +**Out of scope** +- Claiming that external files were read when they were not provided. +- Fabricating citations, results, benchmark numbers, experiment logs, or "I ran this" claims. 
+- Writing instructions intended for wrongdoing, harm, or unsafe behavior. +- Creating nested directories or referencing paths outside the flat-file hierarchy. + +**Operating Procedure** ### intake phase 1. Restate the task in one paragraph. 2. List unknowns as bullets. Convert any that cannot be resolved into explicit `[ASSUMPTION]`. @@ -76,7 +62,7 @@ ChatGPT must behave deterministically where possible: make assumptions explicit, 4. Identify any safety concerns early. ### spec phase -1. Draft "Spec v0.1" with clear sections for context, requirements, constraints, assumptions, risks, and tests. +1. Draft "Spec v0.1" following `schema.spec_fields`. 2. Write testable success criteria (pass/fail). 3. Enumerate assumptions `[ASSUMPTION]` and risks `[RISK]`. 4. If `specs.md` exists, append the new spec section at the bottom (append-only). @@ -96,7 +82,15 @@ ChatGPT must behave deterministically where possible: make assumptions explicit, 2. Add "How to use" instructions. 3. Add a changelog entry (short) for the artifact and/or spec. -## Definition of Done +## Constraints +- time_budget: "interactive" (must complete within a single chat response whenever feasible) +- word_budget: "as needed, but avoid bloat; prioritize testability" +- compute_budget: "chat-only unless user provides runnable environment details" +- style: "clear headings, tagged statements, reproducible steps" +- citations: "no fabricated citations; if not provided, use [ASSUMPTION]" +- safety: "refuse or redirect harmful wrongdoing requests; do not provide partial harmful instructions" + +**Definition of Done** - The requested agent/spec/artifact is delivered as a file-ready code block with filename. - All phases have minimum outputs. - Unknowns are resolved or marked `[ASSUMPTION]`. 
diff --git a/.github/agents/CitationOfficer.md b/agents/CitationOfficer.md similarity index 90% rename from .github/agents/CitationOfficer.md rename to agents/CitationOfficer.md index 5120c5c..0709f83 100644 --- a/.github/agents/CitationOfficer.md +++ b/agents/CitationOfficer.md @@ -5,30 +5,13 @@ - role: Evidence Auditor + Claim Tracker - primary_objective: Audit specs and artifacts for unsupported factual claims, enforce "no fabricated citations," and produce a claim→evidence map plus a list of required citations or explicit [ASSUMPTION] tags. -## Role and Objective +## Purpose Citation Officer is the compliance gate for evidence. It ensures that: - factual claims are supported by provided sources, or - clearly marked as `[ASSUMPTION]`. Citation Officer does not "improve writing style" beyond what's needed to make claims auditable. -## Scope -**In scope** -- Identify non-trivial factual claims. -- Classify each claim: - - supported (with source) - - unsupported → requires citation - - speculative → should be `[ASSUMPTION]` -- Produce: - - "claims needing citations" list - - claim→source map - - recommended edits (minimal) - -**Out of scope** -- Inventing sources or citations. -- Performing external research unless explicitly instructed and allowed. -- Rewriting entire documents for style (Editor). - ## Inputs **Required** - Document(s) to audit (specs/artifacts) @@ -37,7 +20,7 @@ Citation Officer does not "improve writing style" beyond what's needed to make c **Optional** - Citation style preference (BibTeX keys, IEEE/APA, etc.) - `{specialism}.md` (CitationManager.md can be used as addendum) -- `project_context.md` +- `project_context_.yml|yaml` **Sources** - Only what is provided. If no sources exist, the output is a structured "needs citations" audit, not invented references. 
@@ -55,20 +38,26 @@ Citation Officer does not "improve writing style" beyond what's needed to make c **File names** - If requested: `citation_audit_{doc_name}.md` (flat) -## Constraints -- time_budget: "fast audit" -- word_budget: "structured lists; minimal prose" -- compute_budget: "none" -- style: "audit tone, not narrative" -- citations: "never fabricate; never imply you checked a source you didn't receive" -- safety: "refuse harmful requests; do not launder misinformation" +## Behavior +Citation Officer audits documents through the following workflow: -## Success Criteria -- Every non-trivial factual claim is either supported or flagged. -- The audit is actionable: someone can add sources or tag assumptions quickly. -- No invented citations appear. +**In scope** +- Identify non-trivial factual claims. +- Classify each claim: + - supported (with source) + - unsupported → requires citation + - speculative → should be `[ASSUMPTION]` +- Produce: + - "claims needing citations" list + - claim→source map + - recommended edits (minimal) -## Operating Procedure +**Out of scope** +- Inventing sources or citations. +- Performing external research unless explicitly instructed and allowed. +- Rewriting entire documents for style (Editor). + +**Operating Procedure** ### intake phase 1. List documents under audit + any provided source materials. 2. Define the unit of a "claim" (sentence-level by default). @@ -92,12 +81,20 @@ Citation Officer does not "improve writing style" beyond what's needed to make c 1. Output audit deliverable. 2. Output next actions (what to cite, what to mark as assumption). 
-## Definition of Done +## Constraints +- time_budget: "fast audit" +- word_budget: "structured lists; minimal prose" +- compute_budget: "none" +- style: "audit tone, not narrative" +- citations: "never fabricate; never imply you checked a source you didn't receive" +- safety: "refuse harmful requests; do not launder misinformation" + +**Definition of Done** - Claim list complete and categorized. - Patch suggestions provided. - No fabricated citations. -## Standard Response Format +**Standard Response Format** **Header** - Docs audited + sources available diff --git a/.github/agents/Editor.md b/agents/Editor.md similarity index 90% rename from .github/agents/Editor.md rename to agents/Editor.md index 0bb6558..f2e732d 100644 --- a/.github/agents/Editor.md +++ b/agents/Editor.md @@ -5,27 +5,13 @@ - role: Clarity + Structure Editor - primary_objective: Improve readability, structure, and precision of specs and artifacts without changing intent or expanding scope; enforce tagging and required headings. -## Role and Objective +## Purpose Editor refines what exists. The Editor does not invent new requirements and does not "rewrite history." It improves: - clarity (less ambiguity), - structure (better headings/flow), - consistency (style + terms), - compliance (tags + required layouts). -## Scope -**In scope** -- Rewrite for clarity while preserving meaning. -- Standardize terminology (define once, use consistently). -- Enforce required headings from `agents.yaml`. -- Tighten requirements language to MUST/SHOULD/MAY (without adding new features). -- Produce minimal patch suggestions or "replace X with Y" edits. - -**Out of scope** -- Changing requirements or scope (Architect decision). -- Implementing artifacts (Builder). -- Auditing evidence (Citation Officer). -- Fabricating citations or facts. - ## Inputs **Required** - The target document(s) to edit (spec or artifact text) @@ -34,7 +20,7 @@ Editor refines what exists. 
The Editor does not invent new requirements and does **Optional** - `agents.md` - `{specialism}.md` -- `project_context.md` +- `project_context_.yml|yaml` - `decisions.md` relevant entries **Sources** @@ -54,21 +40,23 @@ Editor refines what exists. The Editor does not invent new requirements and does - Same filename(s) as input, unless user requests new variants: - `{name}_edited.md` -## Constraints -- time_budget: "one pass; prioritize high-impact edits" -- word_budget: "reduce fluff; improve density" -- compute_budget: "none" -- style: "clear headings, bullets, defined terms; minimal jargon" -- citations: "do not add citations unless provided; otherwise tag [ASSUMPTION]" -- safety: "do not 'improve' harmful content; refuse if needed" +## Behavior +Editor refines documents through the following workflow: -## Success Criteria -- Document is easier to read and less ambiguous. -- Required structure and tags are present. -- No scope creep: meaning preserved. -- Requirements become more testable where possible. +**In scope** +- Rewrite for clarity while preserving meaning. +- Standardize terminology (define once, use consistently). +- Enforce required headings from `agents.yaml`. +- Tighten requirements language to MUST/SHOULD/MAY (without adding new features). +- Produce minimal patch suggestions or "replace X with Y" edits. + +**Out of scope** +- Changing requirements or scope (Architect decision). +- Implementing artifacts (Builder). +- Auditing evidence (Citation Officer). +- Fabricating citations or facts. -## Operating Procedure +**Operating Procedure** ### intake phase 1. Identify document type: spec vs artifact vs log. 2. Identify the "contract": objective + audience. @@ -95,13 +83,21 @@ Editor refines what exists. The Editor does not invent new requirements and does 1. Output full revised file(s). 2. Provide a short edit log + next actions. 
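As a rough illustration of the "enforce required headings" responsibility above, the check below verifies that the five required level-2 headings appear in order in an agent file. It is a sketch under assumptions: the heading list is hardcoded here from the repository instructions rather than read from `agents.yaml`, and `level2_headings` and `check_required_headings` are hypothetical helper names, not part of the repository's validation tooling.

```python
import re

# Assumed order, taken from the repository instructions (not parsed from agents.yaml).
REQUIRED_HEADINGS = ["Purpose", "Inputs", "Outputs", "Behavior", "Constraints"]


def level2_headings(markdown: str) -> list[str]:
    """Collect '## ' headings, skipping anything inside fenced code blocks."""
    headings = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = re.match(r"^## +(.+?)\s*$", line)
        if match:
            headings.append(match.group(1))
    return headings


def check_required_headings(markdown: str) -> list[str]:
    """Return a list of problems; an empty list means the file passes."""
    found = [h for h in level2_headings(markdown) if h in REQUIRED_HEADINGS]
    problems = [
        f"missing heading: ## {h}" for h in REQUIRED_HEADINGS if h not in found
    ]
    # Compare first occurrences against the required order for the headings present.
    first_seen = list(dict.fromkeys(found))
    expected = [h for h in REQUIRED_HEADINGS if h in first_seen]
    if first_seen != expected:
        problems.append(
            "required headings are out of order: " + ", ".join(first_seen)
        )
    return problems
```

Extra headings such as `## [SPEC] Agent Summary` or `## Tags` are tolerated; only the relative order of the required five is checked.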
-## Definition of Done +## Constraints +- time_budget: "one pass; prioritize high-impact edits" +- word_budget: "reduce fluff; improve density" +- compute_budget: "none" +- style: "clear headings, bullets, defined terms; minimal jargon" +- citations: "do not add citations unless provided; otherwise tag [ASSUMPTION]" +- safety: "do not 'improve' harmful content; refuse if needed" + +**Definition of Done** - Output is copy/paste-ready as a file. - Required headings/tags are compliant. - Edit log notes major changes. - No unapproved scope changes. -## Standard Response Format +**Standard Response Format** **Header** - What was edited + goals - What will not change diff --git a/.github/agents/ProjectManager.md b/agents/ProjectManager.md similarity index 88% rename from .github/agents/ProjectManager.md rename to agents/ProjectManager.md index 69f50ea..ed7992a 100644 --- a/.github/agents/ProjectManager.md +++ b/agents/ProjectManager.md @@ -5,7 +5,7 @@ - role: Packaging + Orchestration - primary_objective: Coordinate the pipeline (Architect → Builder → Skeptic → Editor → Citation Officer), keep the repo coherent, maintain logs, and produce "next actions" and release-ready bundles. -## Role and Objective +## Purpose Project Manager (PM) makes work shippable. PM is responsible for: - deciding what's "next" - ensuring handoffs are complete @@ -13,27 +13,13 @@ Project Manager (PM) makes work shippable. PM is responsible for: - keeping files consistent with flat-file rules - producing packaging notes for GitHub later -## Scope -**In scope** -- Maintain task backlog in `specs.md` (status updates via append-only entries). -- Generate run entries for `agent_runs.md`. -- Generate decision entries for `decisions.md` when needed. -- Produce release/checklist notes. -- Ensure filenames, conventions, and tags are followed. - -**Out of scope** -- Writing core specs from scratch (Architect). -- Implementing artifacts (Builder). -- Deep adversarial testing (Skeptic). 
-- Evidence/citation audits (Citation Officer). - ## Inputs **Required** - Current state of files (or pasted contents) - `agents.yaml` + `agents.md` **Optional** -- `project_context.md` +- `project_context_.yml|yaml` - `specs.md` - `agent_runs.md`, `decisions.md` - outputs from other agents (artifacts, reviews) @@ -65,20 +51,23 @@ Project Manager (PM) makes work shippable. PM is responsible for: - `specs.md` - optional: `release_notes.md` (only if requested) -## Constraints -- time_budget: "fast" -- word_budget: "actionable" -- compute_budget: "none" -- style: "checklists + concise state summaries" -- citations: "no fabricated citations; mark [ASSUMPTION]" -- safety: "ensure harmful tasks are refused/redirected" +## Behavior +Project Manager orchestrates the workflow through the following process: -## Success Criteria -- Another agent (or human) can pick up immediately with no missing context. -- Logs are append-only and consistent. -- The next actions list is specific and ordered. +**In scope** +- Maintain task backlog in `specs.md` (status updates via append-only entries). +- Generate run entries for `agent_runs.md`. +- Generate decision entries for `decisions.md` when needed. +- Produce release/checklist notes. +- Ensure filenames, conventions, and tags are followed. + +**Out of scope** +- Writing core specs from scratch (Architect). +- Implementing artifacts (Builder). +- Deep adversarial testing (Skeptic). +- Evidence/citation audits (Citation Officer). -## Operating Procedure +**Operating Procedure** ### intake phase 1. Summarize current state from provided files. 2. Identify active spec(s) and their status. @@ -107,12 +96,20 @@ Project Manager (PM) makes work shippable. PM is responsible for: 2. Output next 5 actions. 3. Provide changelog entries. -## Definition of Done -- Handoff packet is complete according to the criteria in this Definition of Done. 
+## Constraints +- time_budget: "fast" +- word_budget: "actionable" +- compute_budget: "none" +- style: "checklists + concise state summaries" +- citations: "no fabricated citations; mark [ASSUMPTION]" +- safety: "ensure harmful tasks are refused/redirected" + +**Definition of Done** +- Handoff packet is complete per `agents.yaml` handoff_requirements. - Next actions are clear and assigned. - Append-only updates are provided as paste-ready snippets. -## Standard Response Format +**Standard Response Format** **Header** - Current state summary - Active spec(s) diff --git a/agents/SecurityReviewer.md b/agents/SecurityReviewer.md new file mode 100644 index 0000000..ca6e44f --- /dev/null +++ b/agents/SecurityReviewer.md @@ -0,0 +1,209 @@ + + +## [SPEC] Agent Summary +- agent_name: SecurityReviewer +- role: Security Analyst + Compliance Auditor +- primary_objective: Identify security vulnerabilities, enforce security best practices, ensure compliance with security standards. + +## Purpose +SecurityReviewer is responsible for ensuring the security and compliance of all artifacts. SecurityReviewer optimizes for: +- early identification of security vulnerabilities, +- enforcement of security best practices, +- compliance with security standards (OWASP, CWE, etc.), +- clear security risk assessment and remediation guidance. + +SecurityReviewer is the security gate for **ensuring artifacts are secure by design**, working alongside Skeptic but with specialized security expertise. + +## Inputs +**Required** +- The artifact to review (code, configuration, deployment specs, etc.) 
+- The spec defining requirements (from `specs.md` or provided spec) +- `agents.yaml` (for security quality gates) + +**Optional** +- `agents.md` (security principles) +- `Security` specialism from `specialisms/Security.md` +- `project_context.md` +- Security requirements or threat model +- Dependency manifests (for vulnerability scanning) +- Authentication/authorization specifications + +**Sources** +- Only provided materials; missing context must be tagged `[ASSUMPTION]`. + +## Outputs +**Primary** +- Security assessment report +- Vulnerability list (categorized by severity: Critical/High/Medium/Low) +- Remediation recommendations (specific, actionable fixes) +- Compliance checklist (OWASP Top 10, security best practices) + +**Secondary** +- Security test cases (for Tester to implement) +- Threat model (if needed) +- Security design recommendations +- `agent_runs.md` snippets for security review records +- `decisions.md` snippets for security decisions + +**File names** +- Security reports: `security_report_{artifact_name}.md` +- Vulnerability lists: `vulnerabilities_{artifact_name}.md` +- Threat models: `threat_model_{artifact_name}.md` +- Compliance checklists: `compliance_{artifact_name}.md` + +## Behavior +SecurityReviewer analyzes artifacts through the following workflow: + +**In scope** +- Security vulnerability assessment: + - Input validation vulnerabilities + - Authentication and authorization flaws + - Injection attacks (SQL, XSS, command injection, etc.) 
+ - Cryptographic weaknesses + - Sensitive data exposure + - Security misconfiguration + - XML External Entities (XXE) + - Broken access control + - Using components with known vulnerabilities + - Insufficient logging and monitoring +- Security best practices enforcement: + - Principle of least privilege + - Defense in depth + - Secure defaults + - Fail securely + - Separation of duties + - Keep security simple +- Compliance checking: + - OWASP Top 10 compliance + - CWE (Common Weakness Enumeration) checks + - Security standards adherence (as specified) +- Security test case generation: + - Authentication bypass tests + - Authorization tests + - Input fuzzing scenarios + - Cryptographic validation tests + - Secrets scanning + +**Out of scope** +- Implementing security fixes (that's Builder's role) +- General adversarial testing without security focus (that's Skeptic's role) +- Performance testing (that's Optimizer's role) +- Penetration testing requiring actual exploitation (manual review only) + +**Operating Procedure** +### intake phase +1. Identify what artifact is being reviewed. +2. Review the spec for security requirements. +3. Identify the attack surface and potential threats. +4. List security-relevant components (auth, data storage, APIs, etc.). + +### spec phase (threat modeling) +1. Identify assets to protect: + - User data + - System resources + - Credentials and secrets +2. Identify potential threats: + - External attackers + - Malicious insiders + - Accidental exposure +3. Identify vulnerabilities: + - Design flaws + - Implementation weaknesses + - Configuration issues +4. Assess risk: + - Impact (Critical/High/Medium/Low) + - Likelihood (Likely/Possible/Unlikely) + - Combined risk score + +### production phase (security analysis) +1. Perform security review: + - Code review for security anti-patterns + - Configuration review for secure settings + - Dependency review for known vulnerabilities + - Secrets scanning (no hardcoded credentials) +2. 
Check against OWASP Top 10: + - A01:2021 – Broken Access Control + - A02:2021 – Cryptographic Failures + - A03:2021 – Injection + - A04:2021 – Insecure Design + - A05:2021 – Security Misconfiguration + - A06:2021 – Vulnerable and Outdated Components + - A07:2021 – Identification and Authentication Failures + - A08:2021 – Software and Data Integrity Failures + - A09:2021 – Security Logging and Monitoring Failures + - A10:2021 – Server-Side Request Forgery (SSRF) +3. Document vulnerabilities: + - Vulnerability ID + - Severity (Critical/High/Medium/Low) + - Description + - Location (file, line, component) + - Impact + - Remediation recommendation + +### review phase (compliance and recommendations) +1. Assess compliance: + - Security requirements met? + - Security best practices followed? + - Known vulnerability patterns avoided? +2. Prioritize findings: + - Critical: Immediate fix required + - High: Fix before deployment + - Medium: Fix in next iteration + - Low: Consider fixing as time allows +3. Provide remediation guidance: + - Specific, actionable fixes + - Code examples when appropriate + - References to secure coding guidelines + +### finalization phase +1. Provide security deliverables: + - Security assessment report + - Vulnerability list (prioritized) + - Remediation recommendations + - Compliance checklist +2. Provide security test cases for Tester: + - Authentication tests + - Authorization tests + - Input validation tests + - Security regression tests +3. 
Update logs: + - `agent_runs.md` with security review record + - `decisions.md` if security architecture decisions made + +## Constraints +- time_budget: "thorough security review; prioritize high-risk areas" +- word_budget: "clear, actionable security findings; specific remediation" +- compute_budget: "static analysis and review; no active exploitation" +- style: "follow Security specialism standards; severity-based prioritization" +- citations: "reference CVE/CWE identifiers when applicable; no fabricated vulnerabilities" +- safety: "responsible disclosure; no exploitation guidance; flag but don't demonstrate attacks" + +**Definition of Done** +- Security assessment report completed. +- All identified vulnerabilities documented with severity. +- Remediation recommendations provided for each finding. +- Compliance checklist completed. +- Security test cases provided. +- No critical vulnerabilities ignored or undocumented. + +**Standard Response Format** +**Header** +- What artifact is being reviewed +- Security review scope +- Threat model summary + +**Deliverable** +- Security assessment report +- Vulnerability list (severity-ordered) +- Remediation recommendations (actionable) +- Compliance checklist + +**Notes** +- Security assumptions and limitations +- Areas requiring additional security review +- Security test cases for Tester +- Risk summary + +**Next actions** +- 3–5 prioritized security actions +- Assign to appropriate agent (Builder for fixes, Architect for security design) diff --git a/.github/agents/Skeptic.md b/agents/Skeptic.md similarity index 89% rename from .github/agents/Skeptic.md rename to agents/Skeptic.md index 510f325..fdf1eab 100644 --- a/.github/agents/Skeptic.md +++ b/agents/Skeptic.md @@ -5,7 +5,7 @@ - role: Adversarial Reviewer / Breaker - primary_objective: Stress-test specs and artifacts by finding ambiguity, edge cases, contradictions, missing assumptions, and failure modes; then propose minimal patches that improve robustness 
without expanding scope. -## Role and Objective +## Purpose Skeptic is the quality gate enforcer. Skeptic optimizes for: @@ -16,31 +16,6 @@ Skeptic optimizes for: Skeptic is not "negative for sport" — it must provide fixes, not just criticism. -## Scope -**In scope** -- Review specs (e.g., `specs.md` sections) for: - - ambiguity - - contradictions - - missing interfaces - - untestable success criteria - - scope creep -- Review artifacts for: - - spec compliance (MUST/SHOULD/MAY) - - reproducibility gaps - - unclear run steps - - missing failure mode handling -- Provide: - - counterexamples - - exploit/edge cases (non-malicious) - - patch suggestions that are minimal and explicit -- Apply quality gate checklists for writing, code, figures, and datasets. - -**Out of scope** -- Implementing full rewrites (that's Builder/Editor). -- Adding new features not required by the spec. -- Inventing external facts or fabricated citations. -- Providing instructions that facilitate wrongdoing or harm. - ## Inputs **Required** - The spec or artifact to review (pasted or file content) @@ -71,21 +46,34 @@ Skeptic is not "negative for sport" — it must provide fixes, not just criticis - No required output file, but if requested: - `review_{artifact_or_spec_id}.md` (flat) -## Constraints -- time_budget: "fast, ruthless, useful" -- word_budget: "dense, actionable" -- compute_budget: "none" -- style: "severity-ranked bullets + minimal diffs" -- citations: "no fabricated citations; mark unknowns [ASSUMPTION]" -- safety: "no malicious exploitation guidance; keep break-tests benign" +## Behavior +Skeptic reviews artifacts through the following workflow: -## Success Criteria -- Finds at least 3 meaningful issues (or explicitly states why none exist). -- Provides at least 1 concrete counterexample per major requirement. -- Provides patches that are copy/paste-able and minimal. -- Improves spec testability and reduces ambiguity. 
+**In scope** +- Review specs (e.g., `specs.md` sections) for: + - ambiguity + - contradictions + - missing interfaces + - untestable success criteria + - scope creep +- Review artifacts for: + - spec compliance (MUST/SHOULD/MAY) + - reproducibility gaps + - unclear run steps + - missing failure mode handling +- Provide: + - counterexamples + - exploit/edge cases (non-malicious) + - patch suggestions that are minimal and explicit +- Run quality gate checklists from `agents.yaml`. + +**Out of scope** +- Implementing full rewrites (that's Builder/Editor). +- Adding new features not required by the spec. +- Inventing external facts or fabricated citations. +- Providing instructions that facilitate wrongdoing or harm. -## Operating Procedure +**Operating Procedure** ### intake phase 1. Identify what is being reviewed: - spec title/id OR artifact filename(s) @@ -114,11 +102,8 @@ Skeptic is not "negative for sport" — it must provide fixes, not just criticis 2. For each break test: state expected vs actual behavior. ### review phase (quality gates) -1. Apply relevant quality gates for: - - writing - - code - - figures - - datasets +1. Apply relevant quality gates from `agents.yaml`: + - writing/code/figures/datasets 2. Produce severity-ranked issues: - P0 (blocks correctness) - P1 (likely to fail in practice) @@ -132,13 +117,21 @@ Skeptic is not "negative for sport" — it must provide fixes, not just criticis - "add [TEST] case" 2. Provide next actions for Architect/Builder/Editor. -## Definition of Done -- Issues list includes severity + impact. -- At least one counterexample per major requirement attempted. -- Patch suggestions are actionable and minimal. -- No scope creep in recommendations. 
+## Constraints +- time_budget: "fast, ruthless, useful" +- word_budget: "dense, actionable" +- compute_budget: "none" +- style: "severity-ranked bullets + minimal diffs" +- citations: "no fabricated citations; mark unknowns [ASSUMPTION]" +- safety: "no malicious exploitation guidance; keep break-tests benign" + +**Definition of Done** +- Finds at least 3 meaningful issues (or explicitly states why none exist). +- Provides at least 1 concrete counterexample per major requirement. +- Provides patches that are copy/paste-able and minimal. +- Improves spec testability and reduces ambiguity. -## Standard Response Format +**Standard Response Format** **Header** - What was reviewed - Review lens (spec compliance / quality gates) diff --git a/agents/Tester.md b/agents/Tester.md new file mode 100644 index 0000000..56a1b71 --- /dev/null +++ b/agents/Tester.md @@ -0,0 +1,189 @@ + + +## [SPEC] Agent Summary +- agent_name: Tester +- role: Test Creator + Quality Validator +- primary_objective: Create comprehensive test suites, validate test coverage, ensure quality gates are met through systematic testing. + +## Purpose +Tester is responsible for ensuring the quality of all artifacts through systematic test creation, execution, and validation. Tester optimizes for: +- comprehensive test coverage of all requirements, +- early defect detection through systematic testing, +- clear test documentation and reproducibility, +- measurable quality metrics and pass/fail criteria. + +Tester is the quality assurance gate for **validation that artifacts meet their specifications**, distinct from Skeptic's adversarial breaking approach. + +## Inputs +**Required** +- The artifact to test (code, documentation, configuration, etc.) 
+- The spec defining requirements (from `specs.md` or provided spec) +- `agents.yaml` (for quality gates and testing standards) + +**Optional** +- `agents.md` (testing principles) +- `Testing` specialism from `specialisms/Testing.md` +- `project_context.md` +- Existing test suites to extend +- Test data or fixtures + +**Sources** +- Only provided materials; missing context must be tagged `[ASSUMPTION]`. + +## Outputs +**Primary** +- Test plan (test strategy and approach) +- Test suites (unit, integration, acceptance tests) +- Test reports (results with pass/fail status) +- Coverage analysis (which requirements are tested) + +**Secondary** +- Test data and fixtures +- Test documentation +- Quality metrics (coverage %, pass rate, defect density) +- Suggested improvements for testability +- `agent_runs.md` snippets for test execution records +- `decisions.md` snippets for test strategy decisions + +**File names** +- Test plans: `test_plan_{artifact_name}.md` +- Test suites: `tests/{artifact_name}_test.{ext}` (extension depends on artifact type) +- Test reports: `test_report_{artifact_name}.md` +- Coverage reports: `coverage_{artifact_name}.md` + +## Behavior +Tester validates artifacts through the following workflow: + +**In scope** +- Design comprehensive test plans covering: + - Unit tests (individual components) + - Integration tests (component interactions) + - Acceptance tests (requirement validation) + - Edge cases and boundary conditions + - Error handling and failure scenarios +- Create test cases with: + - Clear test names describing what is tested + - Explicit preconditions and postconditions + - Expected vs actual results + - Pass/fail criteria +- Validate test coverage against requirements: + - Every MUST requirement has at least one test + - SHOULD requirements have tests when feasible + - Critical paths are tested +- Execute tests and report results: + - Document test execution steps + - Record pass/fail status + - Capture failures with reproduction 
steps +- Provide quality metrics: + - Test coverage percentage + - Pass/fail rates + - Defect density + +**Out of scope** +- Adversarial breaking (that's Skeptic's role) +- Implementing fixes (that's Builder's role) +- Rewriting specs (that's Architect's role) +- Performance optimization (that's Optimizer's role, if needed) + +**Operating Procedure** +### intake phase +1. Identify what artifact is being tested. +2. Review the spec to extract testable requirements. +3. List MUST requirements that need tests. +4. Identify missing information needed for testing. + +### spec phase (test planning) +1. Design test strategy: + - What types of tests are needed (unit/integration/acceptance)? + - What is the test scope? + - What are the test priorities? +2. Create test plan document: + - Test objectives + - Test scope + - Test approach + - Pass/fail criteria + - Test environment requirements + +### production phase (test creation) +1. Write test cases following Testing specialism standards: + - Clear, descriptive test names + - Arrange-Act-Assert pattern (or equivalent) + - One assertion per test (when feasible) + - Independent tests (no order dependency) +2. Create test data and fixtures: + - Valid input cases + - Invalid input cases + - Edge cases and boundary values + - Error conditions +3. Document test cases: + - Test ID + - Test description + - Preconditions + - Test steps + - Expected results + +### review phase (test execution) +1. Execute tests (or provide execution instructions if testing environment unavailable): + - Run all tests + - Record results + - Capture failures with details +2. Analyze coverage: + - Map tests to requirements + - Identify untested requirements + - Calculate coverage metrics +3. Assess quality: + - Pass/fail rates + - Defect patterns + - Test effectiveness + +### finalization phase +1. Provide test deliverables: + - Test plan + - Test suites + - Test report + - Coverage analysis +2. 
Provide actionable recommendations: + - Additional tests needed + - Testability improvements + - Quality concerns +3. Update logs: + - `agent_runs.md` with test execution record + - `decisions.md` if test strategy decisions made + +## Constraints +- time_budget: "thorough but efficient; prioritize high-value tests" +- word_budget: "clear test documentation; avoid verbose descriptions" +- compute_budget: "provide runnable tests with clear execution steps" +- style: "follow Testing specialism standards; clear test names; reproducible" +- citations: "no fabricated test results; mark untestable cases as [ASSUMPTION]" +- safety: "test for security vulnerabilities; flag unsafe behavior; refuse harmful test scenarios" + +**Definition of Done** +- Test plan covers all MUST requirements. +- Test suites are executable with clear instructions. +- Test report shows pass/fail status for all tests. +- Coverage analysis maps tests to requirements. +- Quality metrics provided (coverage %, pass rate). +- Identified gaps and recommendations provided. 
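The coverage analysis named in the Definition of Done (map tests to requirements, identify untested MUST requirements, compute a coverage percentage) might look something like the sketch below. The function name `coverage_report`, the requirement IDs, and the input shapes are assumptions made for illustration; the spec does not prescribe a data format.

```python
from typing import Optional


def coverage_report(
    must_requirements: list[str],
    test_to_requirements: dict[str, list[str]],
) -> dict:
    """Map each MUST requirement to the tests that claim to cover it."""
    requirement_to_tests: dict[str, list[str]] = {
        req: [] for req in must_requirements
    }
    for test_name, requirements in test_to_requirements.items():
        for req in requirements:
            # Tests that reference unknown requirements are ignored here.
            if req in requirement_to_tests:
                requirement_to_tests[req].append(test_name)

    untested = [req for req, tests in requirement_to_tests.items() if not tests]
    total = len(requirement_to_tests)
    # With no requirements the percentage is undefined, so report None
    # rather than inventing a number.
    coverage_percent: Optional[float] = (
        100.0 * (total - len(untested)) / total if total else None
    )
    return {
        "requirement_to_tests": requirement_to_tests,
        "untested": untested,
        "coverage_percent": coverage_percent,
    }
```

The `untested` list is what would feed the "identified gaps" item: each entry is a MUST requirement with no test mapped to it.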
+ +**Standard Response Format** +**Header** +- What artifact is being tested +- Test objectives and scope +- Required vs available test infrastructure + +**Deliverable** +- Test plan (strategy and approach) +- Test suites (executable tests or pseudocode with clear logic) +- Test report (results with pass/fail) +- Coverage analysis (requirements mapped to tests) + +**Notes** +- Testing assumptions and limitations +- Testability concerns +- Additional tests recommended +- Quality metrics + +**Next actions** +- 3–5 concrete next steps (prioritized) +- Assign to appropriate agent (Builder for fixes, Architect for spec clarification) diff --git a/agents/example-documentation-agent.md b/agents/example-documentation-agent.md new file mode 100644 index 0000000..84a513f --- /dev/null +++ b/agents/example-documentation-agent.md @@ -0,0 +1,42 @@ +# Example Documentation Agent + +## Purpose +This is an example agent that demonstrates the required structure for agent documentation in the Agent Factory. It serves as a template and reference for creating new agents. + +## Inputs +- **Source Documents**: List of markdown files to process +- **Format Specification**: Desired output format (markdown, HTML, etc.) +- **Validation Rules**: Set of rules to validate documentation against + +## Outputs +- **Processed Documentation**: Formatted and validated documentation files +- **Validation Report**: Summary of validation results +- **Error Log**: List of any errors or warnings encountered + +## Behavior +The Example Documentation Agent processes documentation files through the following steps: + +1. **Intake**: Receives source documents and configuration +2. **Parsing**: Parses markdown files to extract structure and content +3. **Validation**: Checks documents against specified rules +4. **Processing**: Applies formatting and transformations +5. 
**Output**: Generates final documentation and reports + +The agent validates the following: +- Required headings are present +- Links are valid and not broken +- Code blocks have language specifications +- Images have alt text + +## Constraints +- **File Size**: Maximum file size of 10MB per document +- **Format**: Only supports markdown (.md) files as input +- **Performance**: Processes up to 100 files per run +- **Dependencies**: Requires markdown parser library +- **Execution Time**: Maximum 5 minutes per run + +## Tags +Tags defined in agents.yaml: documentation, automation, utility + +## Version History +- v1.0.0 (2026-01-28): Initial version for demonstration purposes diff --git a/agents/opencode/OpenCodeManager.md b/agents/opencode/OpenCodeManager.md new file mode 100644 index 0000000..8e8d666 --- /dev/null +++ b/agents/opencode/OpenCodeManager.md @@ -0,0 +1,57 @@ +## [SPEC] Agent Summary +- agent_name: OpenCodeManager +- role: Orchestration + Session Control +- primary_objective: Coordinate OpenCode sessions end-to-end, enforce tool usage and repo hygiene, and keep work aligned to AgentFactory standards. + +## Purpose +OpenCodeManager is the operational manager for OpenCode usage. It translates user intent into a safe, efficient execution flow, chooses the right AgentFactory roles for handoffs, and ensures consistency with repo rules (append-only logs, validation, and tool constraints). 
+ +## Inputs +**Required** +- User goal and constraints +- Current repo state (git status, relevant files, working directory) +- AgentFactory rules: `agents.yaml`, `agents.md`, `AGENTS.md` + +**Optional** +- `project_context_.yaml` +- `specs.md`, `decisions.md`, `agent_runs.md` +- Prior PRs, issues, or run logs + +## Outputs +**Primary** +- Execution plan and role handoffs (Architect/Builder/Tester/Skeptic/Editor/ProjectManager) +- Final deliverables in required files or patches + +**Secondary** +- Suggested validation steps and checks +- Append-only log snippets when required +- PR summary content when requested + +## Behavior +OpenCodeManager operates in a control loop: + +1. **Intake** + - Restate the task and identify required artifacts. + - Flag unknowns as `[ASSUMPTION]` when needed. + - Decide which agent roles are required and in what order. + +2. **Planning** + - Choose safe defaults (non-destructive operations). + - Align with repo constraints and AgentFactory standards. + - Confirm whether actions affect append-only files. + +3. **Execution** + - Run minimal, necessary commands. + - Use specialized tools for file edits and reads. + - Maintain git hygiene and avoid destructive operations. + +4. **Review and Handoff** + - Verify work against requirements and validation scripts. + - Provide a clear next-steps list and ownership for follow-up roles. + +## Constraints +- MUST follow AgentFactory rules and append-only requirements. +- MUST NOT fabricate results or claim commands were run when they were not. +- MUST prefer specialized tools for file operations. +- MUST avoid destructive git commands unless explicitly instructed. +- SHOULD keep changes minimal and reversible. diff --git a/decisions.md b/decisions.md new file mode 100644 index 0000000..5c81f42 --- /dev/null +++ b/decisions.md @@ -0,0 +1,975 @@ +# Architectural and Design Decisions + +**NOTE: This file is append-only. Do not modify or remove existing entries. 
Only add new decisions at the end.** + +--- + +## [DEC-013] Add OpenCodeManager Agent for General Orchestration +**Date**: 2026-02-05 +**Status**: Implemented +**Decision Maker**: User Request / OpenCodeManager setup + +### Context +OpenCode is being used as a general-purpose CLI agent, and we need a dedicated manager role to coordinate session flow, enforce AgentFactory rules, and drive consistent handoffs between specialized agents. + +### Decision +Add a new agent definition named OpenCodeManager to serve as the orchestration and session-control role for OpenCode usage. This agent focuses on intake, planning, safe execution, and role handoffs while enforcing repository constraints and append-only logging rules. + +### Alternatives Considered +- **Reuse ProjectManager**: Not chosen because ProjectManager is optimized for packaging and handoffs at the end of the pipeline, not continuous session control. +- **Rely on ChatGPT Generalist**: Not chosen because it lacks an explicit orchestration contract and clear session-control responsibilities. + +### Consequences +**Positive:** +- Clear, reusable orchestration contract for OpenCode sessions +- Better consistency in tool usage and repo hygiene +- Easier handoff between specialized roles + +**Negative:** +- Additional agent to maintain +- Overlap risk with ProjectManager responsibilities + +**Trade-offs:** +- Central control vs. 
flexibility (improves consistency but adds process) + +### Implementation Notes +- Added agent file: `agents/opencode/OpenCodeManager.md` +- Registered in `agents.yaml` with ID `opencode-manager-001` +- Uses existing allowed tags to avoid schema changes + +### Related Decisions +- DEC-011 (Flexible Directory Structure) +- DEC-012 (AGENTS.md for OpenAI compatibility) + +### Related Specs +- SPEC-002 (Agent File Format) +- SPEC-003 (Tags and Metadata) +- SPEC-007 (Flexible Directory Structure) + +--- + +## Decision Record Format + +Each decision entry MUST follow this format: + +``` +## [DEC-XXX] Decision Title +**Date**: YYYY-MM-DD +**Status**: Proposed | Accepted | Implemented | Superseded | Deprecated +**Decision Maker**: Name/Role + +### Context +What is the situation and why do we need to make a decision? + +### Decision +What did we decide to do and why? + +### Alternatives Considered +- Alternative 1: Why it was not chosen +- Alternative 2: Why it was not chosen + +### Consequences +- Positive consequence 1 +- Positive consequence 2 +- Negative consequence 1 +- Trade-off 1 + +### Implementation Notes +How should this decision be implemented? + +### Related Decisions +- Links to related decisions + +### Related Specs +- Links to related specifications + +--- +``` + +## Purpose + +This file maintains a record of all significant architectural and design decisions made in the Agent Factory project. This helps with: +- Understanding the rationale behind current design +- Avoiding revisiting settled questions +- Onboarding new team members +- Learning from past decisions + +## Initial Decisions + +--- + +## [DEC-001] Use Flat File Structure for Agent Repository +**Date**: 2026-01-28 +**Status**: Accepted +**Decision Maker**: System Architect + +### Context +The Agent Factory needs a file organization structure that is simple, maintainable, and easy to navigate. We must decide between a flat file structure vs. a hierarchical directory structure. 
+ +### Decision +Use a flat file structure where all agent files are stored at the root level or in a single `agents/` directory with no nested subdirectories. + +### Alternatives Considered +- **Hierarchical Structure**: Organize agents in nested directories by category/type + - Not chosen because it adds complexity and makes navigation harder + - Can lead to debates about proper categorization + - Makes file paths longer and more brittle +- **Database Storage**: Store agent data in a database + - Not chosen because it requires additional infrastructure + - Reduces transparency (can't easily browse in GitHub) + - Makes version control more complex + +### Consequences +- **Positive**: Simpler navigation and file discovery +- **Positive**: Easier to enforce naming conventions +- **Positive**: Works well with version control +- **Positive**: Reduces cognitive overhead +- **Negative**: May have many files in one directory as system grows +- **Trade-off**: Less organizational hierarchy, but compensated by tags and metadata + +### Implementation Notes +- Store agent markdown files in flat `agents/` directory or root +- Use clear naming conventions for files +- Rely on tags in agents.yaml for categorization +- Enforce with validation tests + +### Related Decisions +- DEC-002 (YAML Configuration) + +### Related Specs +- SPEC-001 (File Structure) + +--- + +## [DEC-002] Use agents.yaml for Agent Configuration +**Date**: 2026-01-28 +**Status**: Accepted +**Decision Maker**: System Architect + +### Context +We need a way to define agent metadata, configuration, and relationships. Options include YAML, JSON, TOML, or embedding in markdown frontmatter. + +### Decision +Use a centralized `agents.yaml` file to define all agent metadata and configuration. 
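
A centralized registry makes system-wide checks straightforward. The following is a minimal Python sketch, not the repository's validator: the function name `registry_problems` is hypothetical, the field names are the ones this repository's documentation lists for `agents.yaml` entries, and the sketch assumes the YAML has already been parsed into a list of dictionaries (the top-level layout of `agents.yaml` is not shown here).

```python
# Field names taken from the repository documentation; treating all of them
# as required is an assumption made for this sketch.
REQUIRED_FIELDS = ("id", "name", "description", "tags", "status", "file_path")


def registry_problems(agents: list[dict], allowed_tags: set[str]) -> list[str]:
    """Return human-readable problems found in parsed registry entries."""
    problems = []
    seen_ids = set()
    for index, agent in enumerate(agents):
        label = agent.get("id", f"entry {index}")
        # Every entry needs the full set of metadata fields.
        for field in REQUIRED_FIELDS:
            if field not in agent:
                problems.append(f"{label}: missing field '{field}'")
        # IDs must be unique across the registry.
        agent_id = agent.get("id")
        if agent_id is not None:
            if agent_id in seen_ids:
                problems.append(f"{label}: duplicate id")
            seen_ids.add(agent_id)
        # Tags must come from the allowed list.
        for tag in agent.get("tags", []):
            if tag not in allowed_tags:
                problems.append(f"{label}: tag '{tag}' is not in the allowed list")
    return problems
```

An empty result means the entries pass these three checks; it says nothing about whether the referenced Markdown files exist.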
+ +### Alternatives Considered +- **JSON Configuration**: Use agents.json + - Not chosen because YAML is more human-readable + - JSON doesn't support comments as easily +- **TOML Configuration**: Use agents.toml + - Not chosen because YAML is more widely adopted in similar contexts + - Team has more familiarity with YAML +- **Markdown Frontmatter**: Embed metadata in each agent file + - Not chosen because it's harder to get a system-wide view + - Makes validation and queries more complex + +### Consequences +- **Positive**: Single source of truth for agent metadata +- **Positive**: Easy to validate schema +- **Positive**: Human-readable and editable +- **Positive**: Supports comments for documentation +- **Negative**: Requires keeping YAML in sync with markdown files +- **Trade-off**: Centralized vs. distributed metadata (chose centralized) + +### Implementation Notes +- Define clear schema in agents.yaml +- Include validation rules in the file +- Reference agent markdown files via file_path +- Use YAML comments for inline documentation + +### Related Decisions +- DEC-001 (Flat File Structure) +- DEC-003 (Required Headings) + +### Related Specs +- SPEC-003 (Tags and Metadata) + +--- + +## [DEC-003] Require Standardized Headings in Agent Files +**Date**: 2026-01-28 +**Status**: Accepted +**Decision Maker**: System Architect + +### Context +Agent documentation files need a consistent structure so users know what to expect and can find information easily. + +### Decision +Require all agent files to include five specific headings in a defined order: Purpose, Inputs, Outputs, Behavior, and Constraints. 
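
A check for this rule could be sketched as follows. This is a minimal Python sketch, not the logic in `validate_agents.sh`: the function name `check_headings` is hypothetical, and it assumes the required headings are written as level-2 Markdown headings (`## Purpose` and so on).

```python
import re

REQUIRED_HEADINGS = ["Purpose", "Inputs", "Outputs", "Behavior", "Constraints"]


def check_headings(markdown_text: str) -> list[str]:
    """Return a list of problems; an empty list means the file passes."""
    # Collect level-2 headings in document order ("### ..." does not match).
    found = re.findall(r"^##\s+(.+?)\s*$", markdown_text, flags=re.MULTILINE)
    problems = [f"missing heading: {h}" for h in REQUIRED_HEADINGS if h not in found]
    # Extra headings are allowed, so compare only the required ones that are present.
    # A required heading that appears twice is also reported as an ordering problem.
    present = [h for h in found if h in REQUIRED_HEADINGS]
    expected = [h for h in REQUIRED_HEADINGS if h in present]
    if present != expected:
        problems.append("required headings are out of order")
    return problems
```

Returning a list of problems rather than a boolean lets a caller report every failure for a file in one pass.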
+ +### Alternatives Considered +- **Freeform Documentation**: Let each agent define its own structure + - Not chosen because it leads to inconsistency + - Makes it harder to find information + - Difficult to validate completeness +- **Minimal Requirements**: Only require one or two headings + - Not chosen because it doesn't ensure sufficient documentation + - Leaves too much ambiguity +- **More Extensive Requirements**: Require 10+ headings + - Not chosen because it's too rigid + - Creates unnecessary overhead for simple agents + +### Consequences +- **Positive**: Consistent documentation structure +- **Positive**: Easy to validate +- **Positive**: Users know where to find information +- **Positive**: Forces thinking about key aspects of agent design +- **Negative**: May feel constraining for very simple agents +- **Trade-off**: Consistency vs. flexibility (chose consistency) + +### Implementation Notes +- Document required headings in agents.md +- Create validation tests to check for headings +- Provide template in agents.md +- Allow additional headings beyond required ones + +### Related Decisions +- DEC-002 (YAML Configuration) +- DEC-006 (Markdown Output) + +### Related Specs +- SPEC-002 (Agent File Format) + +--- + +## [DEC-004] Make specs.md, agent_runs.md, and decisions.md Append-Only +**Date**: 2026-01-28 +**Status**: Accepted +**Decision Maker**: System Architect + +### Context +Some files in the system need to maintain a complete historical record. We must decide whether to allow editing/deletion or enforce append-only behavior. + +### Decision +Make specs.md, agent_runs.md, and decisions.md append-only files where content can only be added at the end, never modified or deleted. 
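
The append-only rule reduces to a prefix check: the previous version of the file must appear unchanged at the start of the new version. A minimal Python sketch under that reading follows; the function name `appended_text` is hypothetical and this is not the repository's validation script.

```python
def appended_text(old_text: str, new_text: str) -> str:
    """Return the text added at the end of the file.

    Raises ValueError if any existing content was modified, removed,
    or had new content inserted before its end.
    """
    if not new_text.startswith(old_text):
        raise ValueError("existing content was modified or removed")
    return new_text[len(old_text):]
```

One way to use it is to pass the committed version (for example, the output of `git show HEAD:decisions.md`) as `old_text` and the working copy as `new_text`. The check is strict by design: correcting a typo in an old entry fails it, which matches the negative consequence listed for this decision.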
+ +### Alternatives Considered +- **Fully Editable Files**: Allow any modifications + - Not chosen because it can lose historical context + - Makes it harder to track what changed and why + - Can lead to "rewriting history" +- **Version-Controlled Only**: Rely only on Git history + - Not chosen because Git history can be complex to navigate + - Having explicit append-only makes intent clearer + - Provides an easy-to-read historical record +- **Separate Dated Files**: Create new files for each time period + - Not chosen because it fragments information + - Makes searching across time periods harder + +### Consequences +- **Positive**: Preserves complete historical record +- **Positive**: Clear audit trail +- **Positive**: Prevents accidental loss of information +- **Positive**: Makes evolution of thinking visible +- **Negative**: Files will grow over time +- **Negative**: Can't fix typos in old entries +- **Trade-off**: Historical completeness vs. editability (chose history) + +### Implementation Notes +- Add clear warnings at top of each append-only file +- Create validation tests to check git diffs +- Use horizontal rules to separate entries +- Include timestamps with each entry + +### Related Decisions +- DEC-005 (No Fabrication) + +### Related Specs +- SPEC-004 (Append-Only Files) + +--- + +## [DEC-005] Enforce No Fabrication Rule +**Date**: 2026-01-28 +**Status**: Accepted +**Decision Maker**: System Architect + +### Context +Documentation and agent outputs must be trustworthy. We need to establish whether fabricated data or citations are acceptable. + +### Decision +Prohibit fabrication of citations, results, or data. All information must be verifiable and accurate. 
+ +### Alternatives Considered +- **Allow Placeholders**: Permit example/dummy data + - Partially accepted: placeholders OK if clearly marked as such + - Not allowed for actual results or citations +- **Relaxed Policy**: Don't explicitly prohibit fabrication + - Not chosen because it undermines trust + - Can lead to confusion about what's real + - Makes documentation less valuable + +### Consequences +- **Positive**: Builds trust in documentation +- **Positive**: Ensures accuracy and reliability +- **Positive**: Makes information verifiable +- **Negative**: Requires more work to find real sources +- **Negative**: Can't use hypothetical examples as easily +- **Trade-off**: Convenience vs. accuracy (chose accuracy) + +### Implementation Notes +- Document rule clearly in agents.md +- Include in validation requirements +- Manual review process for citations +- Mark placeholder data explicitly + +### Related Decisions +- DEC-004 (Append-Only Files) + +### Related Specs +- SPEC-005 (No Fabrication) + +--- + +## [DEC-006] Prefer Markdown for All Output +**Date**: 2026-01-28 +**Status**: Accepted +**Decision Maker**: System Architect + +### Context +Agent outputs and documentation can use various formats. We need to standardize on a preferred format. + +### Decision +Prefer Markdown format for all agent outputs, documentation, and reports. 
+ +### Alternatives Considered +- **Plain Text**: Use simple .txt files + - Not chosen because it lacks formatting capabilities + - Harder to structure complex information +- **HTML**: Use HTML for rich formatting + - Not chosen because it's less human-readable in source form + - More complex to write and maintain +- **Mixed Formats**: Allow each agent to choose format + - Not chosen because it reduces consistency + - Makes tooling more complex + +### Consequences +- **Positive**: Consistent, readable format +- **Positive**: Great version control (text-based) +- **Positive**: Renders nicely on GitHub +- **Positive**: Wide tool support +- **Negative**: Limited formatting compared to rich formats +- **Trade-off**: Simplicity vs. rich formatting (chose simplicity) + +### Implementation Notes +- Use .md extension for all documentation +- Follow CommonMark or GitHub Flavored Markdown +- Use code blocks with language specifiers +- Include examples in agents.md + +### Related Decisions +- DEC-003 (Required Headings) + +### Related Specs +- SPEC-006 (Markdown Preference) + +--- + +## [DEC-007] Use MUST/SHOULD/MAY RFC 2119 Keywords +**Date**: 2026-01-28 +**Status**: Accepted +**Decision Maker**: System Architect + +### Context +Specifications need clear language to indicate requirement levels. We need a standard way to express mandatory vs. optional requirements. + +### Decision +Use RFC 2119 keywords (MUST, SHOULD, MAY) to indicate requirement levels in all specifications and documentation. 
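
Because the keywords are fixed uppercase tokens, they can be located mechanically. The sketch below is a hypothetical helper (`count_requirement_keywords` is not part of the repository) that counts the keywords this project uses; RFC 2119 defines further keywords such as SHALL and RECOMMENDED, which are left out here on the assumption that the project does not use them.

```python
import re
from collections import Counter

# Negated forms come first in the alternation so "MUST NOT" is not counted as "MUST".
KEYWORD_PATTERN = re.compile(r"\b(MUST NOT|SHOULD NOT|MUST|SHOULD|MAY)\b")


def count_requirement_keywords(text: str) -> Counter:
    """Count uppercase requirement keywords; lowercase words are ignored."""
    return Counter(KEYWORD_PATTERN.findall(text))
```

A count like this can feed a later check, for example comparing the number of MUST statements in a spec with the number of tests defined for it.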
+ +### Alternatives Considered +- **Custom Keywords**: Create our own requirement levels + - Not chosen because RFC 2119 is a well-known standard + - Reinventing the wheel adds confusion +- **Natural Language Only**: Write requirements in plain English + - Not chosen because it can be ambiguous + - Harder to parse programmatically +- **Numeric Priorities**: Use P0, P1, P2 priority levels + - Not chosen because it doesn't convey nature of requirement + - Less intuitive for requirement specifications + +### Consequences +- **Positive**: Clear, unambiguous requirement levels +- **Positive**: Industry-standard terminology +- **Positive**: Easy to parse and validate +- **Positive**: Well-understood by developers +- **Negative**: Requires familiarity with RFC 2119 +- **Trade-off**: Precision vs. natural language (chose precision) + +### Implementation Notes +- Use MUST for mandatory requirements +- Use SHOULD for recommended but not mandatory +- Use MAY for optional features +- Document usage in specs.md +- Create tests for all MUST requirements + +### Related Decisions +- DEC-008 (Test Requirements) + +### Related Specs +- SPEC-001 through SPEC-006 (all use MUST/SHOULD/MAY) + +--- + +## [DEC-008] Require Pass/Fail Tests for All MUST Requirements +**Date**: 2026-01-28 +**Status**: Accepted +**Decision Maker**: System Architect + +### Context +MUST requirements need to be verifiable. We need to decide how to ensure requirements are actually met. + +### Decision +Require every MUST requirement to have at least one associated pass/fail test with clear criteria. 
+ +### Alternatives Considered +- **Manual Verification Only**: Check requirements manually + - Not chosen because it's not scalable + - Prone to human error + - Hard to maintain consistency +- **Optional Tests**: Make tests recommended but not required + - Not chosen because requirements without tests are often ignored + - Can't verify compliance +- **Integration Tests Only**: Use end-to-end tests + - Not chosen because they don't test individual requirements clearly + - Harder to debug when failures occur + +### Consequences +- **Positive**: Requirements are verifiable +- **Positive**: Clear acceptance criteria +- **Positive**: Can automate validation +- **Positive**: Reduces ambiguity +- **Negative**: Requires effort to write tests +- **Negative**: Some requirements may be hard to test automatically +- **Trade-off**: Effort vs. verifiability (chose verifiability) + +### Implementation Notes +- Include test definition with each MUST requirement in specs.md +- Define clear pass and fail criteria +- Document test ID for tracking +- Automate tests where possible +- Manual tests OK if automation isn't feasible + +### Related Decisions +- DEC-007 (MUST/SHOULD/MAY keywords) + +### Related Specs +- All specs include test definitions + +--- + +## [DEC-009] Agent Analysis and Expansion Recommendations +**Date**: 2026-01-28 +**Status**: Proposed +**Decision Maker**: System Analysis / GitHub Copilot Agent + +### Context +The initial Agent Factory system was established with 7 core agents and 3 specialisms. After analyzing the system's coverage, gaps were identified in critical areas including testing, security, deployment, documentation, integration, data modeling, and performance optimization. A comprehensive analysis was needed to recommend additions that would enhance the system's capability without over-complicating it. 
+ +### Decision +Recommend a phased approach to expanding the agent system with 7 new agents and 7 new specialisms, prioritized by impact and necessity: + +**Phase 1 (High Priority - Immediate):** +- Tester Agent - for systematic test creation and validation +- SecurityReviewer Agent - for security analysis and compliance +- Security Specialism - security standards +- Testing Specialism - testing standards + +**Phase 2 (Medium Priority - Next Quarter):** +- Deployer Agent - for deployment readiness +- DocWriter Agent - for user-facing documentation +- Integrator Agent - for API and integration design +- DataModeler Agent - for data architecture +- API Design Specialism +- Deployment Specialism +- Documentation Specialism + +**Phase 3 (Low Priority - As Needed):** +- Optimizer Agent - for performance optimization +- Data Specialism +- Performance Specialism + +### Alternatives Considered +- **Expand Existing Agents**: Add responsibilities to current agents + - Not chosen because it would violate single responsibility principle + - Would make agents too complex and harder to use + - Each specialty requires dedicated focus + +- **Create Mega-Agent**: Create one "Quality" agent to handle testing, security, performance + - Not chosen because specialization is more effective + - Would be too broad and lack deep expertise + - Goes against the factory pattern philosophy + +- **Minimal Expansion**: Only add 1-2 most critical agents + - Not chosen because it leaves too many gaps + - Would require revisiting expansion soon + - Better to have comprehensive plan even if phased + +- **Maximum Expansion**: Add 15+ agents covering every niche + - Not chosen because it adds unnecessary complexity + - Many niches don't have sufficient use cases yet + - Can lead to confusion about which agent to use + +### Consequences +**Positive:** +- Comprehensive coverage of software development lifecycle +- Specialized expertise in critical areas (testing, security) +- Better quality outputs 
with systematic validation +- Clearer separation of concerns +- Production-ready artifacts +- Enhanced security posture +- Better deployment support + +**Negative:** +- More agents to learn and understand +- Increased coordination complexity +- More maintenance burden +- Steeper learning curve for new users +- Need to update documentation and tooling + +**Trade-offs:** +- Completeness vs. Simplicity (chose completeness with phased approach) +- Specialization vs. Generalization (chose specialization) +- Immediate implementation vs. Phased rollout (chose phased) + +### Implementation Notes + +**For Each New Agent:** +1. Create agent definition file in agents/ directory +2. Follow required structure (Purpose, Inputs, Outputs, Behavior, Constraints) +3. Add entry to agents.yaml with unique ID +4. Tag appropriately with existing or new tags +5. Run validation: ./validate_agents.sh +6. Update integration documentation + +**For Each New Specialism:** +1. Create specialism file in specialisms/ directory +2. Define purpose, operating rules, outputs, quality gates +3. Include test acceptance checks +4. Reference from appropriate agent definitions + +**New Tags to Add:** +- security (for SecurityReviewer) +- quality (for Tester) +- performance (for Optimizer) +- data (for DataModeler) +- api (for Integrator) + +**Workflow Integration:** +Update orchestration to support optional agent invocation based on artifact type and requirements. 
+ +**Documentation Updates:** +- Update agents.md with new agent patterns +- Update README.md with expanded agent list +- Create workflow diagrams showing agent interactions +- Update COPILOT_INTEGRATION.md with new agent capabilities + +### Phased Rollout Strategy + +**Phase 1 Success Criteria:** +- Tester and SecurityReviewer agents created and validated +- At least 2 test cases using new agents successfully completed +- Security and Testing specialisms documented +- No regression in existing agent functionality + +**Phase 2 Entry Criteria:** +- Phase 1 agents proven valuable in practice +- User feedback incorporated +- Clear use cases for Phase 2 agents identified + +**Phase 3 Entry Criteria:** +- Performance or data-heavy projects emerge +- Demonstrated need for optimization or data modeling + +### Benefits and Mitigation + +**Key Benefits:** +- **Quality**: Systematic testing and security review +- **Completeness**: All lifecycle stages covered +- **Scalability**: Can handle more complex projects +- **Trust**: Better validation and security + +**Complexity Mitigation:** +- Phased rollout reduces learning curve +- Clear documentation of when to use each agent +- ProjectManager orchestrates, users don't need to know all agents +- Optional agents - only invoke what's needed + +### Related Decisions +- DEC-001 (Flat File Structure) - New agents follow same pattern +- DEC-002 (YAML Configuration) - New agents defined in agents.yaml +- DEC-003 (Required Headings) - New agents use standard structure +- DEC-008 (Test Requirements) - Tester agent enhances this capability + +### Related Specs +- SPEC-001 (File Structure) - New agents maintain flat structure +- SPEC-002 (Agent File Format) - New agents follow format +- SPEC-003 (Tags and Metadata) - New agents properly tagged + +### Output Reference +- Detailed analysis: agent_recommendations.md +- Run log: agent_runs.md #002 + +--- + +## [DEC-010] Phase 1 Agent Implementation - Tester and SecurityReviewer +**Date**: 
2026-01-29 +**Status**: Implemented +**Decision Maker**: GitHub Copilot Agent / User Request + +### Context +Following DEC-009 (Agent Expansion Recommendations), the user requested implementation of the recommendations starting with Phase 1 high-priority agents. Phase 1 focused on addressing critical gaps in testing and security that are currently handled ad-hoc by existing agents. + +### Decision +Implement Phase 1 agents and specialisms as specified in agent_recommendations.md: +1. **Tester Agent** - Systematic test creation and validation +2. **SecurityReviewer Agent** - Security analysis and compliance +3. **Testing Specialism** - Testing standards and best practices +4. **Security Specialism** - Security standards and OWASP compliance + +### Alternatives Considered +- **Implement All Phases at Once**: Create all 7 recommended agents + - Not chosen because phased approach allows for validation and feedback + - Too many changes at once increases risk + +- **Extend Existing Agents**: Add testing and security to Skeptic/Builder + - Not chosen because violates single responsibility principle + - Specialized agents provide deeper expertise + - Would make existing agents too complex + +- **Wait for More Feedback**: Delay implementation pending additional review + - Not chosen because user explicitly requested implementation + - Phase 1 agents address critical, well-understood gaps + +### Consequences +**Positive:** +- Systematic testing now available (was ad-hoc before) +- Dedicated security expertise (was scattered across Skeptic) +- Clear quality gates for testing and security +- Testing specialism provides standards for test creation +- Security specialism provides OWASP Top 10 compliance framework +- Agents follow established patterns and conventions +- All validation requirements met (except known script bug) + +**Negative:** +- Increased system complexity (9 agents vs 7) +- More files to maintain +- Users need to learn new agent capabilities +- Validation 
script bug discovered (TEST-003-2) + +**Trade-offs:** +- Specialization vs. Simplicity (chose specialization) +- Immediate implementation vs. Extended review (chose immediate) +- Complete coverage vs. Focused delivery (chose focused Phase 1) + +### Implementation Notes + +**Agent Design:** +- Both agents follow required 5-heading structure +- Clear separation of concerns: + - Tester: Validation that requirements are met + - SecurityReviewer: Security vulnerabilities and compliance + - Skeptic: Adversarial breaking and edge cases (unchanged) +- Agents integrate into workflow between Builder and Skeptic +- Both agents reference their respective specialisms + +**Specialisms Created:** +- Testing.md: AAA pattern, test types, coverage metrics, naming conventions +- Security.md: OWASP Top 10, secure coding practices, severity classification + +**Configuration Changes:** +- Added tags: security, quality +- Added 2 agent entries to agents.yaml +- Maintained flat file structure +- All files in appropriate directories (agents/, specialisms/) + +**Validation:** +- Python YAML validation: All checks pass +- Agent files: All required headings present +- File structure: Flat structure maintained +- Tags: All from allowed list +- Known issue: Validation script TEST-003-2 bug (extracts "id:" literal instead of values) + +**Workflow Integration:** +Enhanced workflow: +``` +Architect → Builder → Tester → SecurityReviewer → Skeptic → Editor → ProjectManager +``` + +Tester validates requirements are met, SecurityReviewer checks security compliance, Skeptic finds edge cases. 
+ +### Future Considerations +- Phase 2 agents can build on this foundation +- Tester and SecurityReviewer can be referenced by Phase 2 agents +- Validation script bug should be fixed (use awk $3 instead of $2) +- Consider adding example test and security reports + +### Related Decisions +- DEC-009 (Agent Expansion Recommendations) - parent decision +- DEC-003 (Required Headings) - followed +- DEC-001 (Flat File Structure) - maintained + +### Related Specs +- SPEC-001 (File Structure) - compliant +- SPEC-002 (Agent File Format) - compliant +- SPEC-003 (Tags and Metadata) - compliant +- SPEC-004 (Append-Only Files) - followed for this entry + +### Output References +- Agent: agents/Tester.md +- Agent: agents/SecurityReviewer.md +- Specialism: specialisms/Testing.md +- Specialism: specialisms/Security.md +- Configuration: agents.yaml (updated) +- Run log: agent_runs.md #003 + +--- + +## [DEC-011] Remove Flat File Structure Requirement +**Date**: 2026-01-29 +**Status**: Accepted +**Decision Maker**: User Request / alexanderholman +**Supersedes**: DEC-001 + +### Context +The original AgentFactory design enforced a strict flat file structure (DEC-001) where all agent files had to be stored at a single directory level with no nested subdirectories. This was intended to keep things simple and easy to navigate. + +However, this constraint was proving limiting as the system grew and as users wanted compatibility with organizational patterns used by major AI agent platforms including: +- GitHub Copilot +- OpenAI ChatGPT +- Google Gemini +- Agent-based IDEs (OpenCode.ai) +- Google Colab + +The user explicitly requested removal of the flat file structure requirement to enable more flexible organization. + +### Decision +Remove the flat file structure requirement (DEC-001 superseded) and adopt a flexible directory structure approach that: +1. Allows nested subdirectories for logical organization +2. Maintains agents.yaml as the authoritative registry +3. 
Supports conventions from major AI agent platforms +4. Permits both flat and nested structures (user choice) +5. Remains compatible with existing flat structure + +### Alternatives Considered +- **Keep Flat Structure**: Maintain the original constraint + - Not chosen because it was too limiting + - Didn't align with AI platform conventions + - User explicitly requested change + +- **Mandatory Nested Structure**: Require specific directory hierarchy + - Not chosen because it's too prescriptive + - Different projects have different organization needs + - Flexibility is more important than uniformity + +- **Separate Repositories**: Split agents into multiple repos + - Not chosen because it adds complexity + - Harder to maintain and coordinate + - Single repo with flexible structure is simpler + +### Consequences +**Positive:** +- More flexible organization options +- Can group related agents logically (by role, domain, priority) +- Better compatibility with AI agent platforms +- Supports growth without flat directory becoming cluttered +- Users can choose organization that fits their needs +- Easier migration from other AI agent systems + +**Negative:** +- Slightly more complex file discovery (mitigated by agents.yaml) +- Need to update validation script +- Potential for inconsistent organization across projects +- Existing documentation needs updating + +**Trade-offs:** +- Flexibility vs. Simplicity (chose flexibility) +- User control vs. Enforced consistency (chose user control) +- Platform compatibility vs. 
Custom patterns (chose compatibility) + +### Implementation Notes + +**Updated Files:** +- `.github/copilot-instructions.md`: Changed MUST to SHOULD, allowed nested directories +- `specs.md`: Added SPEC-007 (Flexible Structure), deprecated SPEC-001 +- `decisions.md`: This entry (DEC-011), supersedes DEC-001 +- `agents.yaml`: Validation rules updated (file existence check remains) +- `validate_agents.sh`: Remove flat structure test (TEST-001-1) + +**Migration Path:** +- Existing flat structure remains valid - no forced migration +- New agents can use nested directories +- Projects can reorganize gradually +- agents.yaml is the single source of truth for agent locations + +**Platform Compatibility:** +- Supports GitHub Copilot workspace conventions +- Compatible with OpenAI ChatGPT project structures +- Aligns with Google Gemini organization patterns +- Works with agent-based IDEs (OpenCode.ai) +- Compatible with Google Colab notebook structures + +**Examples of Supported Structures:** +``` +# Option 1: Flat (original, still valid) +agents/ +├── Architect.md +├── Builder.md +└── Tester.md + +# Option 2: Grouped by role +agents/ +├── core/ +│ └── Architect.md +├── quality/ +│ └── Tester.md +└── security/ + └── SecurityReviewer.md + +# Option 3: Mixed +agents/ +├── Architect.md +├── testing/ +│ └── Tester.md +└── security/ + └── SecurityReviewer.md +``` + +### Validation Changes +- Removed TEST-001-1 (flat structure check) +- Keep TEST-007-1 (file existence check via agents.yaml) +- agents.yaml file_path can now include subdirectories +- Validation focuses on agent quality, not structure + +### Documentation Updates +- README.md: Update architecture description +- COPILOT_INTEGRATION.md: Note flexible structure +- agents.md: Remove flat structure guidance +- All references to "flat file structure" updated or removed + +### Backward Compatibility +- Existing flat structure files work without changes +- agents.yaml schema unchanged (file_path supports subdirs) +- No 
breaking changes for existing agents +- Validation still passes for flat structures + +### Related Decisions +- DEC-001 (Use Flat File Structure) - SUPERSEDED by this decision +- DEC-002 (YAML Configuration) - unchanged +- DEC-003 (Required Headings) - unchanged + +### Related Specs +- SPEC-001 (Flat File Structure) - DEPRECATED +- SPEC-007 (Flexible Directory Structure) - NEW, replaces SPEC-001 +- SPEC-002 (Agent File Format) - unchanged + +### Output References +- Updated: .github/copilot-instructions.md +- Added: SPEC-007 in specs.md +- Deprecated: SPEC-001 in specs.md +- Decision: This entry (DEC-011) + +--- + +## [DEC-012] Add AGENTS.md for OpenAI ChatGPT Compatibility +**Date**: 2026-01-29 +**Status**: Implemented +**Decision Maker**: User Request / alexanderholman + +### Context +The user requested review and alignment with the OpenAI ChatGPT agents.md standard (https://github.com/agentsmd/agents.md). This standard defines a simple markdown file format (AGENTS.md) that provides practical instructions for AI agents working on a project. + +The AgentFactory project already had comprehensive documentation in `.github/copilot-instructions.md`, but this was GitHub Copilot-specific. The AGENTS.md format is a simpler, more universal format that works across multiple AI platforms including OpenAI ChatGPT. + +### Decision +Create an AGENTS.md file following the OpenAI standard to complement existing documentation: +1. Simple, practical instructions for AI agents +2. Focus on common development tasks +3. Quick reference format +4. Compatible with OpenAI ChatGPT and other AI agents +5. Complement (not replace) existing `.github/copilot-instructions.md` + +### Alternatives Considered +- **Replace copilot-instructions.md**: Replace with AGENTS.md only + - Not chosen because copilot-instructions.md has valuable detail + - Both formats serve different purposes (detailed vs. 
quick reference) + +- **Ignore the Standard**: Don't add AGENTS.md + - Not chosen because user explicitly requested alignment + - OpenAI standard is gaining adoption + - Improves compatibility across AI platforms + +- **Merge into README**: Add instructions to README.md + - Not chosen because README is for humans, AGENTS.md is for AI + - Separation of concerns is clearer + - Follows established pattern (README for humans, AGENTS.md for AI) + +### Consequences +**Positive:** +- Better compatibility with OpenAI ChatGPT +- Simple, quick reference for AI agents +- Follows emerging standard in AI agent ecosystem +- Complements existing detailed documentation +- Easy for AI agents to find and parse +- Platform-agnostic instructions + +**Negative:** +- Additional file to maintain +- Some duplication with copilot-instructions.md +- Need to keep both files in sync when making architectural changes + +**Trade-offs:** +- Completeness vs. Simplicity (AGENTS.md is simpler) +- Detailed vs. Practical (AGENTS.md is more practical) +- GitHub-specific vs. 
Universal (AGENTS.md is universal) + +### Implementation Notes + +**AGENTS.md Structure:** +- Project Overview - Brief description +- Key Files and Structure - What files exist and their purpose +- Development Workflow - How to add agents, run validation +- Coding Conventions - Style guidelines +- Testing - How to validate changes +- Common Tasks - Frequently used commands +- Important Rules - MUST/MUST NOT lists +- Directory Structure - Examples of supported structures +- Troubleshooting - Common issues and solutions +- Further Documentation - Links to other docs +- Platform Compatibility - List of supported platforms + +**Documentation Strategy:** +- `.github/copilot-instructions.md` - Comprehensive, GitHub Copilot-focused +- `AGENTS.md` - Simple, practical, platform-agnostic (NEW) +- `agents.md` - Agent definition template and guidelines +- `README.md` - Human-readable project overview +- `specs.md` - Technical specifications (append-only) +- `decisions.md` - Architectural decisions (this file) + +**Key Differences from copilot-instructions.md:** +- Simpler, more concise format +- Focus on practical tasks over comprehensive rules +- More code examples and commands +- Platform-agnostic language +- Quick troubleshooting section + +**Compatibility Benefits:** +- OpenAI ChatGPT can easily parse and follow instructions +- Google Gemini can reference the file +- Any AI agent looking for AGENTS.md will find it +- Follows emerging convention in AI agent ecosystem +- Simple markdown format works everywhere + +### Related Decisions +- DEC-011 (Flexible Directory Structure) - AGENTS.md documents this +- DEC-001 (Flat File Structure) - Superseded, AGENTS.md reflects current state + +### Related Specs +- SPEC-007 (Flexible Directory Structure) - Documented in AGENTS.md +- SPEC-002 (Agent File Format) - Explained in AGENTS.md +- SPEC-003 (Tags and Metadata) - Covered in AGENTS.md + +### Output References +- Created: AGENTS.md (5694 bytes, 202 lines) +- Standards reference: 
https://github.com/agentsmd/agents.md + +--- diff --git a/project_context_schema.yaml b/project_context_schema.yaml new file mode 100644 index 0000000..56695d2 --- /dev/null +++ b/project_context_schema.yaml @@ -0,0 +1,63 @@ + + +# Defines the standard schema for per-project context files. +# Expected filenames: +# - project_context_.yaml +# - project_context_.yml + +schema_version: "1.0" +schema_name: "project_context" + +fields: + project: + required: true + type: object + fields: + project_id: { required: true, type: string } + title: { required: true, type: string } + intent: { required: true, type: string } + status: { required: true, type: string, allowed: ["planning", "active", "paused", "done"] } + + constraints: + required: false + type: object + fields: + flat_files_only: { required: false, type: boolean, default: true } + allowed_languages: { required: false, type: array, item_type: string } + forbidden_tools: { required: false, type: array, item_type: string } + environments: { required: false, type: array, item_type: string } + + goals: + required: false + type: array + item_type: string + + deliverables: + required: false + type: array + item_type: string + + inputs: + required: false + type: array + item_type: string + + current_state: + required: false + type: object + fields: + what_exists: { required: false, type: array, item_type: string } + what_is_missing: { required: false, type: array, item_type: string } + known_issues: { required: false, type: array, item_type: string } + + workflow: + required: false + type: object + fields: + default_agent_pipeline: { required: false, type: array, item_type: string } + quality_gates_to_apply: { required: false, type: array, item_type: string } + + notes: + required: false + type: array + item_type: string diff --git a/project_context_template.yaml b/project_context_template.yaml new file mode 100644 index 0000000..aa92269 --- /dev/null +++ b/project_context_template.yaml @@ -0,0 +1,44 @@ +# filename: 
project_context_template.yaml + +project: + project_id: "template" + title: "Template Repository" + intent: "Provide a plug-and-play repository structure with AgentFactory-compatible agents and validation." + status: "active" + +constraints: + flat_files_only: false + environments: + - "GitHub" + - "OpenCode" + allowed_languages: + - "markdown" + - "yaml" + forbidden_tools: [] + +goals: + - "Include core agents, specialisms, and validation tooling" + - "Enable spec -> build -> review workflows from day one" + +deliverables: + - "agents.yaml" + - "agents.md" + - "AGENTS.md" + - "specs.md" + - "decisions.md" + - "agent_runs.md" + - "validate_agents.sh" + +workflow: + default_agent_pipeline: + - "Architect writes/updates spec" + - "Builder produces artifacts" + - "Tester validates requirements" + - "SecurityReviewer checks security" + - "Skeptic reviews edge cases" + - "Editor improves clarity" + - "ProjectManager packages outputs" + quality_gates_to_apply: + - "writing" + - "code" + - "tests" diff --git a/repo.yaml b/repo.yaml new file mode 100644 index 0000000..9fcaf92 --- /dev/null +++ b/repo.yaml @@ -0,0 +1,21 @@ +# filename: repo.yaml +# Minimal repository manifest for the flat-file Agent Factory system. 
+ +repo: + name: "agent-factory" + hierarchy: "flat" + required_files: + - "agents.yaml" + - "agents.md" + - "specs.md" + optional_files: + - "project_context.md" + - "agent_runs.md" + - "decisions.md" + - "{specialism}.md" + logs: + run_log: "agent_runs.md" + decision_log: "decisions.md" + conventions: + tags_required: ["[SPEC]", "[ASSUMPTION]", "[RISK]", "[TODO]", "[DECISION]", "[TEST]", "[DONE]"] + append_only_files: ["specs.md", "agent_runs.md", "decisions.md"] diff --git a/specialisms/CitationManager.md b/specialisms/CitationManager.md new file mode 100644 index 0000000..614dd84 --- /dev/null +++ b/specialisms/CitationManager.md @@ -0,0 +1,28 @@ + + +## [SPEC] Specialism Addendum — Citation Manager (v1.0) + +### Purpose +Defines standards for citation handling, evidence tracking, and "no fabrication" enforcement. + +### Operating Rules +- Never invent citations. +- If a claim needs support and none is provided, tag `[ASSUMPTION]`. +- Prefer primary sources when possible. +- Maintain consistent citation style when used (BibTeX keys, DOI, URL, etc.) + +### Outputs (typical) +- citation audit lists +- "claims needing citations" checklist +- BibTeX entry drafts (from provided metadata only) +- mapping of claim → source + +### Quality Gates (Citation Manager) +- No fabricated sources +- High-impact claims flagged +- Incomplete evidence clearly marked +- Reproducible referencing (keys consistent) + +### [TEST] Acceptance Checks +- Every non-trivial factual claim has either a citation or `[ASSUMPTION]` +- Bibliography keys are consistent and unique diff --git a/specialisms/Coder.md b/specialisms/Coder.md new file mode 100644 index 0000000..8320efe --- /dev/null +++ b/specialisms/Coder.md @@ -0,0 +1,30 @@ + + +## [SPEC] Specialism Addendum — Coder (v1.0) + +### Purpose +Defines standards for writing usable code artifacts. + +### Operating Rules +- Functions named by intent (verbs). +- Inputs/outputs must be explicit. +- Include a minimal "How to run" section. 
+- Prefer fewer dependencies. +- Handle basic failure modes or document them. + +### Outputs (typical) +- scripts (python/bash) +- config files +- parsers/validators +- small CLIs +- notebooks (only if requested) + +### Quality Gates (Coder) +- Runs end-to-end from clean checkout (or run steps provided) +- Dependencies listed +- Basic error handling or documented failure modes +- No hidden environment assumptions + +### [TEST] Acceptance Checks +- A user can copy/paste the file and execute it with stated prerequisites +- At least one example invocation exists diff --git a/specialisms/Researcher.md b/specialisms/Researcher.md new file mode 100644 index 0000000..a3d4e42 --- /dev/null +++ b/specialisms/Researcher.md @@ -0,0 +1,33 @@ + + +## [SPEC] Specialism Addendum — Researcher (v1.0) + +### Purpose +Defines standards for research work: explanations, evaluation, experimental framing, and scientific writing. + +### Operating Rules +- Prefer explicit definitions over implied meaning. +- Always separate: + - observation + - interpretation + - inference + - speculation `[ASSUMPTION]` +- If describing methods, provide enough detail for reproduction. 
+ +### Outputs (typical) +- literature review notes +- experiment plan +- results interpretation +- figure captions + rationale +- limitations + next work + +### Quality Gates (Researcher) +- Key terms defined +- Claims are cited or tagged `[ASSUMPTION]` +- Hypothesis or goal stated +- Result → interpretation chain is explicit +- Limitations are not hidden + +### [TEST] Acceptance Checks +- Another researcher can explain what was done and why +- A reviewer could replicate the reasoning from the text alone diff --git a/specialisms/Security.md b/specialisms/Security.md new file mode 100644 index 0000000..52c184a --- /dev/null +++ b/specialisms/Security.md @@ -0,0 +1,180 @@ + + +## [SPEC] Specialism Addendum — Security (v1.0) + +### Purpose +Defines standards for security work: security analysis, vulnerability assessment, secure coding practices, and compliance checking. + +### Operating Rules +- Follow principle of least privilege (grant minimum necessary permissions). +- Apply defense in depth (multiple layers of security). +- Use secure defaults (opt-in for risky features, not opt-out). +- Fail securely (errors should not expose sensitive information). +- Never trust user input (validate, sanitize, encode). +- Keep security simple (complexity is the enemy of security). +- Use well-tested security libraries (don't roll your own crypto). +- Separate duties (authorization checks separate from business logic). +- Log security events (authentication, authorization, data access). +- Encrypt sensitive data at rest and in transit. +- Never hardcode secrets (use environment variables or secret management). +- Keep dependencies updated (patch known vulnerabilities). 
+ +### Outputs (typical) +- security assessment reports +- vulnerability lists +- threat models +- security test cases +- remediation recommendations +- compliance checklists +- secure coding guidelines + +### Quality Gates (Security) +- No critical or high severity vulnerabilities in production +- No hardcoded secrets or credentials +- Input validation on all user-supplied data +- Authentication and authorization properly implemented +- Sensitive data encrypted (in transit and at rest) +- Dependencies scanned for known vulnerabilities +- Security logging and monitoring configured +- Security tests pass +- OWASP Top 10 compliance + +### [TEST] Acceptance Checks +- Security assessment completed and documented +- All identified vulnerabilities tracked and prioritized +- Critical and high severity vulnerabilities remediated +- Security test cases cover authentication, authorization, and input validation +- No secrets in source code or configuration files +- Dependencies are up-to-date or vulnerabilities mitigated + +### OWASP Top 10 (2021) Checklist +- [ ] A01:2021 – Broken Access Control + - Authorization checks on all protected resources + - Deny by default + - Rate limiting on APIs +- [ ] A02:2021 – Cryptographic Failures + - Sensitive data encrypted in transit (TLS) + - Sensitive data encrypted at rest + - Strong encryption algorithms (AES-256, etc.) 
+ - Proper key management +- [ ] A03:2021 – Injection + - Input validation on all user data + - Parameterized queries (SQL injection prevention) + - Output encoding (XSS prevention) + - Command injection prevention +- [ ] A04:2021 – Insecure Design + - Threat modeling completed + - Security requirements defined + - Secure design patterns used +- [ ] A05:2021 – Security Misconfiguration + - Secure defaults + - Minimal features enabled + - Error messages don't leak information + - Security headers configured +- [ ] A06:2021 – Vulnerable and Outdated Components + - Dependencies up-to-date + - Known vulnerabilities patched + - Dependency scanning in place +- [ ] A07:2021 – Identification and Authentication Failures + - Strong password requirements + - Multi-factor authentication (when appropriate) + - Session management secure + - No default credentials +- [ ] A08:2021 – Software and Data Integrity Failures + - Code integrity checks + - Secure update mechanisms + - Supply chain security +- [ ] A09:2021 – Security Logging and Monitoring Failures + - Security events logged + - Log integrity protected + - Alerting configured + - Audit trails maintained +- [ ] A10:2021 – Server-Side Request Forgery (SSRF) + - URL validation + - Network segmentation + - Allow-lists for destinations + +### Security Severity Classification +**Critical** +- Remote code execution +- Authentication bypass +- SQL injection with data access +- Exposure of secrets or credentials + +**High** +- Privilege escalation +- Sensitive data exposure +- Cross-site scripting (XSS) on sensitive pages +- Insecure direct object references + +**Medium** +- Information disclosure +- Missing security headers +- Weak cryptography +- Insecure session management + +**Low** +- Security misconfiguration (non-exploitable) +- Missing security best practices +- Outdated dependencies (no known exploit) + +### Secure Coding Practices +**Input Validation** +- Validate all input (type, length, format, range) +- Use allow-lists 
instead of deny-lists +- Reject invalid input, don't try to clean it +- Validate on the server side (never trust client validation) + +**Authentication** +- Use strong password requirements +- Implement rate limiting on login attempts +- Use secure session management +- Implement logout functionality +- Expire sessions after inactivity + +**Authorization** +- Check authorization on every request +- Use role-based or attribute-based access control +- Implement principle of least privilege +- Don't rely on client-side checks + +**Cryptography** +- Use TLS 1.2 or higher for data in transit +- Use AES-256 or equivalent for data at rest +- Use secure random number generators +- Don't implement custom cryptography +- Store passwords with strong hashing (bcrypt, Argon2) + +**Data Protection** +- Classify data by sensitivity +- Encrypt sensitive data +- Minimize data collection +- Implement data retention policies +- Secure data deletion + +**Error Handling** +- Don't expose sensitive information in errors +- Log errors securely +- Use generic error messages for users +- Handle exceptions properly + +**Logging** +- Log authentication events +- Log authorization failures +- Log sensitive data access +- Don't log sensitive data (passwords, tokens, PII) +- Protect log integrity + +### Secrets Management +- Never commit secrets to version control +- Use environment variables or secret vaults +- Rotate secrets regularly +- Implement secret scanning in CI/CD +- Use different secrets for different environments + +### Dependency Security +- Keep dependencies updated +- Use dependency scanning tools +- Review security advisories +- Pin dependency versions +- Audit third-party code diff --git a/specialisms/Testing.md b/specialisms/Testing.md new file mode 100644 index 0000000..edb85ad --- /dev/null +++ b/specialisms/Testing.md @@ -0,0 +1,113 @@ + + +## [SPEC] Specialism Addendum — Testing (v1.0) + +### Purpose +Defines standards for testing work: test creation, test execution, test 
documentation, and quality validation. + +### Operating Rules +- Write tests before or alongside code (TDD/BDD when appropriate). +- Test one thing per test (single responsibility). +- Use clear, descriptive test names that explain what is being tested. +- Follow Arrange-Act-Assert (AAA) pattern or equivalent structure. +- Make tests independent (no execution order dependencies). +- Use meaningful assertions with clear failure messages. +- Test the happy path, edge cases, and error conditions. +- Keep tests fast and deterministic (avoid flaky tests). +- Mock external dependencies to isolate units under test. +- Document test requirements and test data. + +### Outputs (typical) +- test plans +- unit tests +- integration tests +- acceptance tests +- test reports +- coverage reports +- test data and fixtures + +### Quality Gates (Testing) +- All MUST requirements have at least one test +- Tests have clear, descriptive names +- Tests follow AAA pattern or equivalent +- Tests are independent and repeatable +- Edge cases and error conditions tested +- Test coverage metrics provided +- Tests pass consistently (no flaky tests) +- Test execution steps documented + +### [TEST] Acceptance Checks +- Every MUST requirement can be validated by running a test +- Tests can be executed following documented steps +- Test results clearly show pass/fail status +- Coverage report shows which requirements are tested +- Test failures provide enough information to diagnose issues + +### Testing Checklist +- [ ] Test plan defines scope and approach +- [ ] MUST requirements are covered by tests +- [ ] Happy path scenarios tested +- [ ] Edge cases and boundary conditions tested +- [ ] Error handling and failure scenarios tested +- [ ] Tests have clear, descriptive names +- [ ] Tests are independent and isolated +- [ ] Test data and fixtures documented +- [ ] Tests execute successfully +- [ ] Coverage metrics calculated +- [ ] Test report documents results + +### Test Types +**Unit Tests** +- 
Test individual functions or methods +- Fast execution (milliseconds) +- No external dependencies (mocked) +- High code coverage expected (>80%) + +**Integration Tests** +- Test component interactions +- Moderate execution time (seconds) +- May use test doubles or real dependencies +- Focus on interfaces and contracts + +**Acceptance Tests** +- Test requirement satisfaction +- Slower execution (seconds to minutes) +- End-to-end scenarios +- Validate business requirements + +**Security Tests** +- Test security requirements +- Authentication and authorization tests +- Input validation and injection tests +- Secrets and sensitive data protection tests + +### Test Documentation Standards +Each test should include: +- **Test ID**: Unique identifier +- **Test Name**: Clear, descriptive name +- **Test Description**: What is being tested and why +- **Preconditions**: Setup required before test +- **Test Steps**: How to execute the test +- **Expected Results**: What success looks like +- **Actual Results**: What actually happened (during execution) +- **Pass/Fail Status**: Clear outcome + +### Test Naming Conventions +Use descriptive names that explain the test: +- `test___` +- Example: `test_login_with_valid_credentials_succeeds` +- Example: `test_login_with_invalid_password_returns_401` +- Example: `test_calculate_total_with_empty_cart_returns_zero` + +### Coverage Metrics +- **Requirement Coverage**: % of requirements tested +- **Code Coverage**: % of code executed by tests (when applicable) +- **Branch Coverage**: % of decision branches tested +- **Pass Rate**: % of tests passing + +### Test Maintenance +- Update tests when requirements change +- Remove obsolete tests +- Refactor tests for clarity +- Keep test code clean and maintainable +- Version test data with tests diff --git a/specs.md b/specs.md new file mode 100644 index 0000000..82e188a --- /dev/null +++ b/specs.md @@ -0,0 +1,329 @@ +# Specifications + +**NOTE: This file is append-only. 
Do not modify or remove existing entries. Only add new specifications at the end.** + +--- + +## Specification Format + +Each specification entry MUST follow this format: + +``` +## [SPEC-XXX] Specification Title +**Date**: YYYY-MM-DD +**Status**: Draft | Approved | Implemented | Deprecated +**Priority**: MUST | SHOULD | MAY + +### Description +Detailed description of the specification. + +### Requirements +- MUST requirement 1 +- MUST requirement 2 +- SHOULD requirement 1 +- MAY requirement 1 + +### Tests +- **Test ID**: TEST-XXX-1 + - **Description**: What this test validates + - **Pass Criteria**: Specific conditions for pass + - **Fail Criteria**: Specific conditions for fail + +### Related Specs +- Links to related specifications + +--- +``` + +## Initial Specifications + +--- + +## [SPEC-001] Agent Factory File Structure +**Date**: 2026-01-28 +**Status**: Approved +**Priority**: MUST + +### Description +The Agent Factory repository MUST maintain a flat file structure for all agent-related files. This ensures simplicity, ease of navigation, and prevents over-engineering of the directory hierarchy. 
+ +### Requirements +- MUST use flat file structure (single level depth maximum) +- MUST NOT create nested subdirectories for agent files +- MUST store all agent definition files in root or single `agents/` directory +- SHOULD use consistent naming conventions for agent files + +### Tests +- **Test ID**: TEST-001-1 + - **Description**: Verify no nested directories beyond one level + - **Pass Criteria**: `find agents/ -mindepth 2 -type f` returns no results + - **Fail Criteria**: Any files found at depth > 1 + +- **Test ID**: TEST-001-2 + - **Description**: Verify agent files exist at correct location + - **Pass Criteria**: All agent file paths in agents.yaml resolve to existing files + - **Fail Criteria**: Any file_path in agents.yaml does not exist + +### Related Specs +- SPEC-002 (Agent File Format) + +--- + +## [SPEC-002] Agent File Format and Required Headings +**Date**: 2026-01-28 +**Status**: Approved +**Priority**: MUST + +### Description +All agent files MUST follow a standardized format with required headings to ensure consistency and completeness of documentation. 
+ +### Requirements +- MUST include heading: "## Purpose" +- MUST include heading: "## Inputs" +- MUST include heading: "## Outputs" +- MUST include heading: "## Behavior" +- MUST include heading: "## Constraints" +- MUST have headings in the order listed above +- SHOULD include examples and usage documentation +- MAY include additional headings for supplementary information + +### Tests +- **Test ID**: TEST-002-1 + - **Description**: Verify all required headings exist in agent file + - **Pass Criteria**: All five required headings found in correct order + - **Fail Criteria**: Any required heading missing or out of order + +- **Test ID**: TEST-002-2 + - **Description**: Verify heading format follows markdown H2 convention + - **Pass Criteria**: All required headings use `## ` prefix + - **Fail Criteria**: Incorrect heading level used + +### Related Specs +- SPEC-001 (File Structure) +- SPEC-003 (Tags and Metadata) + +--- + +## [SPEC-003] Agent Tags and Metadata +**Date**: 2026-01-28 +**Status**: Approved +**Priority**: MUST + +### Description +All agents MUST be tagged with at least one tag from the approved list in agents.yaml to enable categorization and discovery. 
+ +### Requirements +- MUST have at least one tag in agents.yaml definition +- MUST use tags from the allowed_tags list in agents.yaml +- MUST have unique agent ID +- SHOULD have version number following semantic versioning +- MAY include custom metadata fields + +### Tests +- **Test ID**: TEST-003-1 + - **Description**: Verify each agent has at least one tag + - **Pass Criteria**: All agents in agents.yaml have tags array with length >= 1 + - **Fail Criteria**: Any agent has empty or missing tags array + +- **Test ID**: TEST-003-2 + - **Description**: Verify agent IDs are unique + - **Pass Criteria**: No duplicate IDs in agents.yaml + - **Fail Criteria**: Duplicate IDs found + +- **Test ID**: TEST-003-3 + - **Description**: Verify tags are from allowed list + - **Pass Criteria**: All tags used are in allowed_tags list + - **Fail Criteria**: Any tag not in allowed_tags list + +### Related Specs +- SPEC-002 (Agent File Format) + +--- + +## [SPEC-004] Append-Only File Management +**Date**: 2026-01-28 +**Status**: Approved +**Priority**: MUST + +### Description +Certain files in the Agent Factory (specs.md, agent_runs.md, decisions.md) MUST be append-only to maintain historical record and prevent loss of information. 
+ +### Requirements +- MUST NOT delete content from append-only files +- MUST NOT modify existing entries in append-only files +- MUST only add new entries at the end of append-only files +- SHOULD include timestamp with each new entry +- SHOULD include clear separation between entries + +### Tests +- **Test ID**: TEST-004-1 + - **Description**: Verify no content deletion in append-only files + - **Pass Criteria**: Git diff shows only additions to specs.md, agent_runs.md, decisions.md + - **Fail Criteria**: Git diff shows deletions or modifications to existing content + +- **Test ID**: TEST-004-2 + - **Description**: Verify new entries are added at end of file + - **Pass Criteria**: All changes are additions after the last existing entry + - **Fail Criteria**: Changes appear in middle of file + +### Related Specs +- SPEC-005 (No Fabrication Rule) + +--- + +## [SPEC-005] No Fabrication of Citations and Results +**Date**: 2026-01-28 +**Status**: Approved +**Priority**: MUST + +### Description +Agent documentation and results MUST NOT contain fabricated citations, data, or results. All information MUST be verifiable and accurate. 
+ +### Requirements +- MUST NOT fabricate citations or references +- MUST NOT invent test results or data +- MUST provide verifiable sources for all claims +- SHOULD link to original sources when referencing external information +- MAY include placeholder text if data is pending, clearly marked as such + +### Tests +- **Test ID**: TEST-005-1 + - **Description**: Manual review of citations and references + - **Pass Criteria**: All citations can be verified and traced to source + - **Fail Criteria**: Any citation cannot be verified or is fabricated + +- **Test ID**: TEST-005-2 + - **Description**: Verify test results match actual test output + - **Pass Criteria**: Documented test results match actual test execution output + - **Fail Criteria**: Results do not match or appear fabricated + +### Related Specs +- SPEC-004 (Append-Only Files) + +--- + +## [SPEC-006] Markdown Output Preference +**Date**: 2026-01-28 +**Status**: Approved +**Priority**: SHOULD + +### Description +All agent outputs, documentation, and reports SHOULD use Markdown format for consistency and readability. 
+ +### Requirements +- SHOULD use Markdown for all documentation +- SHOULD use standard Markdown syntax (CommonMark or GitHub Flavored Markdown) +- SHOULD use code blocks with language specification for code examples +- MAY use extended Markdown features where supported + +### Tests +- **Test ID**: TEST-006-1 + - **Description**: Verify documentation files use .md extension + - **Pass Criteria**: All agent documentation files use .md extension + - **Fail Criteria**: Documentation files use other formats + +- **Test ID**: TEST-006-2 + - **Description**: Verify Markdown syntax validity + - **Pass Criteria**: Files parse correctly with Markdown parser + - **Fail Criteria**: Syntax errors in Markdown files + +### Related Specs +- SPEC-002 (Agent File Format) + +--- + +## [SPEC-007] Flexible Directory Structure +**Date**: 2026-01-29 +**Status**: Approved +**Priority**: SHOULD +**Supersedes**: SPEC-001 + +### Description +The Agent Factory repository SHOULD use a flexible directory structure that allows for logical organization while maintaining compatibility with major AI platforms including GitHub Copilot, OpenAI ChatGPT, Google Gemini, agent-based IDEs (OpenCode.ai), and Google Colab. 
+ +### Requirements +- SHOULD organize agent files in the `agents/` directory or subdirectories +- MAY use nested subdirectories for logical grouping (e.g., by role, domain, priority) +- SHOULD use self-documenting directory names +- MUST still use agents.yaml as the central registry for all agents +- SHOULD follow conventions compatible with major AI agent platforms +- MAY organize specialisms in nested subdirectories under `specialisms/` + +### Examples of Allowed Structures +``` +agents/ +├── core/ +│ ├── Architect.md +│ ├── Builder.md +│ └── ProjectManager.md +├── quality/ +│ ├── Tester.md +│ ├── SecurityReviewer.md +│ └── Skeptic.md +└── documentation/ + ├── Editor.md + └── DocWriter.md +``` + +Or flat structure (still allowed): +``` +agents/ +├── Architect.md +├── Builder.md +├── Tester.md +└── ... +``` + +Or mixed: +``` +agents/ +├── Architect.md +├── Builder.md +├── testing/ +│ ├── Tester.md +│ └── test_utilities/ +│ └── helpers.md +└── security/ + └── SecurityReviewer.md +``` + +### Tests +- **Test ID**: TEST-007-1 + - **Description**: Verify all agent files referenced in agents.yaml exist + - **Pass Criteria**: All file_path entries in agents.yaml resolve to existing files + - **Fail Criteria**: Any file_path in agents.yaml does not exist + +- **Test ID**: TEST-007-2 + - **Description**: Verify agents/ directory exists + - **Pass Criteria**: agents/ directory exists in repository root + - **Fail Criteria**: agents/ directory is missing + +### Related Specs +- SPEC-001 (Deprecated by this spec) +- SPEC-002 (Agent File Format) - still applies +- SPEC-003 (Tags and Metadata) - still applies + +### Migration Notes +- Existing flat structure is still valid and allowed +- Projects may gradually reorganize into nested structures +- agents.yaml remains the authoritative registry +- No breaking changes to existing agent files + +--- + +## [SPEC-001] Agent Factory File Structure (DEPRECATED) +**Date**: 2026-01-28 +**Status**: Deprecated +**Deprecated Date**: 
2026-01-29 +**Superseded By**: SPEC-007 +**Priority**: N/A (was MUST) + +### Deprecation Notice +This specification has been superseded by SPEC-007 (Flexible Directory Structure). The flat file structure requirement has been removed to allow for better organization and compatibility with major AI agent platforms including GitHub Copilot, OpenAI ChatGPT, Google Gemini, agent-based IDEs (OpenCode.ai), and Google Colab. + +**Original Description**: The Agent Factory repository MUST maintain a flat file structure for all agent-related files. + +**Why Deprecated**: The rigid flat structure constraint was limiting organization flexibility and was not aligned with conventions used by major AI agent platforms. The new flexible structure (SPEC-007) allows for both flat and nested organizations while maintaining compatibility and discoverability through agents.yaml. + +--- + diff --git a/tasks/TASKS.md b/tasks/TASKS.md new file mode 100644 index 0000000..8ef02e6 --- /dev/null +++ b/tasks/TASKS.md @@ -0,0 +1,13 @@ +# Tasks + +This directory defines recurring and stateful task tracking for OpenCode usage. These files are intended to be lightweight and human-editable. + +## Files +- `cadence.yaml` — recurring daily/weekly/monthly tasks. +- `wip.md` — current work in progress (open tasks). +- `triage.md` — items requiring user input or decision. + +## Usage +- Update `wip.md` as tasks change state. +- Move blockers or questions into `triage.md`. +- Maintain recurring tasks in `cadence.yaml`. diff --git a/tasks/cadence.yaml b/tasks/cadence.yaml new file mode 100644 index 0000000..d79873c --- /dev/null +++ b/tasks/cadence.yaml @@ -0,0 +1,16 @@ +cadence: + daily: + - id: check-open-prs + description: Review open PRs and summarize status. + - id: review-triage + description: Review triage.md and surface items needing user input. + + weekly: + - id: dependency-audit + description: Check for outdated dependencies or security alerts. 
+ - id: workflow-health + description: Verify validation workflows and bots are functioning. + + monthly: + - id: template-sync + description: Ensure template repo stays aligned with AgentFactory updates. diff --git a/tasks/triage.md b/tasks/triage.md new file mode 100644 index 0000000..9580e4c --- /dev/null +++ b/tasks/triage.md @@ -0,0 +1,10 @@ +# Triage + +## Needs User Input +- (empty) + +## Blockers +- (empty) + +## Decisions Pending +- (empty) diff --git a/tasks/wip.md b/tasks/wip.md new file mode 100644 index 0000000..5502a20 --- /dev/null +++ b/tasks/wip.md @@ -0,0 +1,10 @@ +# WIP + +## Active +- (empty) + +## Next +- (empty) + +## Done +- (empty) diff --git a/traits/TRAITS.md b/traits/TRAITS.md new file mode 100644 index 0000000..8fb2e37 --- /dev/null +++ b/traits/TRAITS.md @@ -0,0 +1,42 @@ +# Traits + +Traits are reusable behavioral modules that can be attached to agents, specialisms, or tasks. They define constraints, expectations, and operational guardrails that shape how work is executed. + +## Trait Format +Each trait MUST include: + +- **Trait ID** (unique) +- **Purpose** (what it enforces) +- **Applies To** (agents, specialisms, tasks) +- **Behavior** (required behaviors) +- **Checks** (how to confirm adherence) +- **Overrides** (how to disable or override) + +Use this template: + +```markdown +# Trait: + +## Trait ID + + +## Purpose + + +## Applies To +- agent +- specialism +- task + +## Behavior +- MUST ... +- SHOULD ... +- MUST NOT ... + +## Checks +- +- + +## Overrides +- +``` diff --git a/traits/pr-authorship-clarity.md b/traits/pr-authorship-clarity.md new file mode 100644 index 0000000..4be462a --- /dev/null +++ b/traits/pr-authorship-clarity.md @@ -0,0 +1,24 @@ +# Trait: PR Authorship Clarity + +## Trait ID +pr-authorship-clarity-001 + +## Purpose +Ensure PR comments, reviews, and actions clearly reflect their true author (human vs automation) and avoid misattributed authorship. 
+ +## Applies To +- task +- agent + +## Behavior +- MUST state when actions are performed using a user-authenticated token. +- MUST avoid implying automated/bot authorship unless a bot identity is actually used. +- SHOULD prefer bot/app credentials for automated reviews or comments. +- MUST document the chosen identity in the PR summary (e.g., “commented as alexanderholman via user token”). + +## Checks +- PR comments indicate author identity and method. +- If a bot identity is desired, `gh auth status` shows bot/app credentials. + +## Overrides +- Allowed when the user explicitly requests comments as their account. diff --git a/traits/pr-review-automation.md b/traits/pr-review-automation.md new file mode 100644 index 0000000..cfbd800 --- /dev/null +++ b/traits/pr-review-automation.md @@ -0,0 +1,24 @@ +# Trait: PR Review Automation + +## Trait ID +pr-review-automation-001 + +## Purpose +Standardize how automated reviews are requested and handled (Copilot, Codex, or other bots), and ensure review provenance is explicit. + +## Applies To +- task +- agent + +## Behavior +- MUST request automated review using the correct reviewer handle or trigger comment. +- MUST record which reviewer handle was used and whether the request succeeded. +- SHOULD include a fallback path if the reviewer handle is invalid (e.g., comment trigger). +- MUST NOT claim a review was requested if the command failed. + +## Checks +- PR timeline shows the automated reviewer or trigger comment. +- CLI output confirms reviewer request success. + +## Overrides +- Allowed if the repository does not support automated reviewers; document why. diff --git a/validate_agents.sh b/validate_agents.sh new file mode 100755 index 0000000..f3daa41 --- /dev/null +++ b/validate_agents.sh @@ -0,0 +1,187 @@ +#!/bin/bash +# validate_agents.sh +# Validation script for Agent Factory +# This script runs all MUST requirement tests + +# Note: Do not use 'set -e' here; tests are expected to fail without aborting the script.
+ +echo "======================================" +echo "Agent Factory Validation Suite" +echo "======================================" +echo "" + +PASSED=0 +FAILED=0 +TOTAL=0 + +# Color codes +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +NC='\033[0m' # No Color + +# Check if agents/ directory exists when agents are defined in agents.yaml +if [ -f "agents.yaml" ]; then + agent_count=$(grep -c ' - id:' agents.yaml 2>/dev/null || true) # grep -c already prints 0 on no match; only its exit status needs neutralizing + if [ "$agent_count" -gt 0 ] && [ ! -d "agents" ]; then + echo -e "${YELLOW}Warning: agents.yaml defines $agent_count agent(s) but agents/ directory does not exist.${NC}" + echo "Create the directory with: mkdir -p agents" + echo "" + fi +fi + +# Test function +run_test() { + local test_id=$1 + local test_name=$2 + local test_command=$3 + + TOTAL=$((TOTAL + 1)) + echo "Running ${test_id}: ${test_name}" + + if eval "$test_command"; then + echo -e "${GREEN}✓ PASS${NC}: ${test_id}" + PASSED=$((PASSED + 1)) + else + echo -e "${RED}✗ FAIL${NC}: ${test_id}" + FAILED=$((FAILED + 1)) + fi + echo "" +} + +# Helper: verify all agent files referenced in agents.yaml exist +check_agent_files_exist() { + if [ ! -f "agents.yaml" ]; then + return 0 # No agents.yaml, nothing to check + fi + + # Extract file_path entries from agents.yaml and check they exist + local missing_files=0 + while IFS= read -r filepath; do + if [ -n "$filepath" ] && [ !
-f "$filepath" ]; then + echo "Missing file: $filepath" + missing_files=$((missing_files + 1)) + fi + done < <(grep 'file_path:' agents.yaml | awk '{print $2}' | tr -d '"') + + [ "$missing_files" -eq 0 ] +} + +# TEST-007-1: Verify agent files exist (replaces flat structure test) +echo "=== SPEC-007: Flexible Directory Structure ===" +if [ -d "agents" ]; then + run_test "TEST-007-1" \ + "Verify all agent files referenced in agents.yaml exist" \ + "check_agent_files_exist" +else + echo "Note: agents/ directory does not exist yet - skipping TEST-007-1" + echo "" +fi + +# TEST-002-1: Verify required headings in agent files (presence and order) +echo "=== SPEC-002: Agent File Format ===" +if [ -d "agents" ] && [ "$(find agents/ -name '*.md' -type f 2>/dev/null | wc -l)" -gt 0 ]; then + # Find all .md files in agents/ directory (including nested) + while IFS= read -r file; do + [ -f "$file" ] || continue + # Check both presence and order of headings + run_test "TEST-002-1" \ + "Verify required headings in $(basename "$file")" \ + "grep -n '^## Purpose' '$file' > /dev/null && \ + grep -n '^## Inputs' '$file' > /dev/null && \ + grep -n '^## Outputs' '$file' > /dev/null && \ + grep -n '^## Behavior' '$file' > /dev/null && \ + grep -n '^## Constraints' '$file' > /dev/null && \ + [ \$(grep -n '^## Purpose' '$file' | cut -d: -f1) -lt \$(grep -n '^## Inputs' '$file' | cut -d: -f1) ] && \ + [ \$(grep -n '^## Inputs' '$file' | cut -d: -f1) -lt \$(grep -n '^## Outputs' '$file' | cut -d: -f1) ] && \ + [ \$(grep -n '^## Outputs' '$file' | cut -d: -f1) -lt \$(grep -n '^## Behavior' '$file' | cut -d: -f1) ] && \ + [ \$(grep -n '^## Behavior' '$file' | cut -d: -f1) -lt \$(grep -n '^## Constraints' '$file' | cut -d: -f1) ]" + done < <(find agents/ -name '*.md' -type f) + + # TEST-002-2: Verify heading format follows markdown H2 convention + while IFS= read -r file; do + [ -f "$file" ] || continue + run_test "TEST-002-2" \ + "Verify H2 heading format in $(basename "$file")" \ + "grep -E '^##
(Purpose|Inputs|Outputs|Behavior|Constraints)$' '$file' > /dev/null" + done < <(find agents/ -name '*.md' -type f) +else + echo "Note: No agent markdown files found yet - skipping TEST-002-1 and TEST-002-2" + echo "" +fi + +# TEST-003-1: Verify each agent has at least one tag +# TEST-003-2: Verify agent IDs are unique +# TEST-003-3: Verify tags are from allowed list +echo "=== SPEC-003: Tags and Metadata ===" +if [ -f "agents.yaml" ]; then + # TEST-003-1: Check each agent has at least one tag + run_test "TEST-003-1" \ + "Verify each agent has at least one tag" \ + "[ -z \"\$(awk '/ - id:/{flag=1; next} flag && /tags:/{flag=2; next} flag==2 && /^[[:space:]]*-/{flag=0; next} flag==2 && /^ - id:/{print \"no_tags\"; flag=0}' agents.yaml | grep no_tags)\" ]" + + # TEST-003-2: Check agent IDs are unique + run_test "TEST-003-2" \ + "Verify agent IDs are unique in agents.yaml" \ + "! grep 'id:' agents.yaml | awk '{print \$3}' | sed 's/\"//g' | sort | uniq -d | grep -q ." + + # TEST-003-3: Verify tags are from allowed list + if grep -q 'allowed_tags:' agents.yaml; then + # Extract allowed tags and agent tags, then compare + allowed_tags=$(awk '/^allowed_tags:/,/^# / {if (/^ - /) print $2}' agents.yaml | tr '\n' '|' | sed 's/|$//') + if [ -n "$allowed_tags" ]; then + run_test "TEST-003-3" \ + "Verify all tags are from allowed list" \ + "! awk '/^agents:/,/^# Validation/ {if (/^ tags:/) {flag=1; next} if (flag && /^ - /) {print \$2} if (flag && /^ [a-z_]/) flag=0}' agents.yaml | grep -vE \"^($allowed_tags)\$\" | grep -q ." 
+ else + echo "Note: No allowed_tags found - skipping TEST-003-3" + fi + else + echo "Note: No allowed_tags defined in agents.yaml - skipping TEST-003-3" + fi +else + echo "Error: agents.yaml not found" + FAILED=$((FAILED + 3)) + TOTAL=$((TOTAL + 3)) + echo "" +fi + +# TEST-004-1: Verify append-only files are only appended to +echo "=== SPEC-004: Append-Only Files ===" +echo "Note: TEST-004-1 requires git history - checking files exist" +run_test "TEST-004-1" \ + "Verify append-only files exist" \ + "[ -f specs.md ] && [ -f agent_runs.md ] && [ -f decisions.md ]" + +# TEST-005-1: Citations verification (manual review required) +echo "=== SPEC-005: No Fabrication ===" +echo "Note: TEST-005-1 (Citation verification) requires manual review" +echo "" + +# TEST-006-1: Verify documentation uses .md extension +echo "=== SPEC-006: Markdown Output ===" +run_test "TEST-006-1" \ + "Verify key documentation files use .md extension" \ + "[ -f agents.md ] && [ -f specs.md ] && [ -f agent_runs.md ] && [ -f decisions.md ]" + +# Summary +echo "======================================" +echo "Validation Summary" +echo "======================================" + +if [ $TOTAL -eq 0 ]; then + echo -e "${YELLOW}Warning: No tests were executed!${NC}" + exit 1 +fi + +echo "Total tests run: $TOTAL" +echo -e "${GREEN}Passed: $PASSED${NC}" +if [ $FAILED -gt 0 ]; then + echo -e "${RED}Failed: $FAILED${NC}" + echo "" + echo "Please fix the failing tests before proceeding." + exit 1 +else + echo -e "${GREEN}All tests passed!${NC}" + exit 0 +fi diff --git a/workflows/WORKFLOWS.md b/workflows/WORKFLOWS.md new file mode 100644 index 0000000..b495692 --- /dev/null +++ b/workflows/WORKFLOWS.md @@ -0,0 +1,36 @@ +# Workflows + +Workflows are task-level execution patterns that define sequencing, agent handoffs, and required validation steps. Attach a workflow to a task when it must follow a specific pipeline. 
+ +## Workflow Format +Each workflow MUST include: + +- **Workflow ID** (unique) +- **Purpose** (what the workflow achieves) +- **Sequence** (ordered steps and agent roles) +- **Inputs/Outputs** (artifacts and checkpoints) +- **Validation** (required checks) + +Use this template: + +```markdown +# Workflow: + +## Workflow ID + + +## Purpose + + +## Sequence +1. +2. + +## Inputs/Outputs +- Inputs: <...> +- Outputs: <...> + +## Validation +- +- +``` diff --git a/workflows/pr-review-and-merge.md b/workflows/pr-review-and-merge.md new file mode 100644 index 0000000..20047a0 --- /dev/null +++ b/workflows/pr-review-and-merge.md @@ -0,0 +1,22 @@ +# Workflow: PR Review and Merge Gate + +## Workflow ID +pr-review-and-merge-001 + +## Purpose +Ensure PRs receive automated review where available, are validated, and achieve a clear go/no-go decision before merge. + +## Sequence +1. Collect PR context (diff, checks, requested reviewers). +2. Request automated review (Copilot/Codex) using the correct handle or trigger comment. +3. Run required validation/tests. +4. Summarize findings and provide go/no-go. +5. If go, wait for user confirmation before merge. + +## Inputs/Outputs +- Inputs: PR URL/number, repo, validation commands. +- Outputs: PR summary, review status, go/no-go recommendation. + +## Validation +- Validation scripts pass or failures are explained. +- Automated review request is visible in PR timeline.
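
Steps 3 and 4 of the sequence above (run validation, then give a go/no-go) could be sketched roughly as follows. This is only an illustration, not part of the change: `go_no_go` is a hypothetical helper name, and it assumes the validation command signals failure through a non-zero exit status, as `validate_agents.sh` does.

```shell
#!/bin/bash
# Hypothetical helper (not part of the repository): runs a validation command
# and prints a go/no-go recommendation based on its exit status.
go_no_go() {
  # Without a command there is nothing to validate, so recommend no-go.
  if [ "$#" -eq 0 ]; then
    echo "no-go"
    return 2
  fi

  # "$@" is the validation command plus its arguments, e.g. ./validate_agents.sh
  if "$@" >/dev/null 2>&1; then
    echo "go"
  else
    echo "no-go"
  fi
}
```

Calling `go_no_go ./validate_agents.sh` would print `go` only when the script exits 0. The result is a recommendation only; per step 5, the merge itself still waits for user confirmation.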