- ❌ NO shortcuts - do the work properly or don't do it
- ❌ NO fake data - use real data, real tests, real results
- ❌ NO false claims - only report what actually works and is verified
- ✅ ALWAYS implement all code/tests with proper implementation
- ✅ ALWAYS verify before claiming success
- ✅ ALWAYS use real database queries, not mocks, for integration tests
- ✅ ALWAYS run actual tests, not assume they pass
We value the quality we deliver to our users.
Sentinel is an AI-powered platform for automating the entire API testing lifecycle using specialized AI agents. The project combines Claude-Flow orchestration with Agentic QE Fleet for comprehensive testing.
- Frontend: React-based UI (Port 3000) with Redux state management
- Backend Services: Python microservices with FastAPI (Ports 8000-8005, 8088)
- API Gateway (8000), Auth (8005), Spec (8001), Orchestration (8002), Execution (8003), Data (8004)
- Rust Core: High-performance agent core (8088) powered by ruv-swarm
- Database: PostgreSQL with pgvector extension (Port 5432)
- Message Broker: RabbitMQ for asynchronous task processing (Ports 5672/15672)
- Observability: Prometheus (9090), Jaeger (16686)
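For quick reference, the service-to-port mapping above can be captured in a small lookup (a sketch; the keys are shorthand for the components listed, not actual module names):

```python
# Port map for the core Sentinel services described above.
SERVICE_PORTS = {
    "frontend": 3000,
    "api_gateway": 8000,
    "spec": 8001,
    "orchestration": 8002,
    "execution": 8003,
    "data": 8004,
    "auth": 8005,
    "rust_core": 8088,
    "postgres": 5432,
    "rabbitmq": 5672,
    "rabbitmq_mgmt": 15672,
    "prometheus": 9090,
    "jaeger": 16686,
}

def port_of(service: str) -> int:
    """Look up a service's port; raises KeyError for unknown names."""
    return SERVICE_PORTS[service]
```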
The platform implements both Python and Rust agents for optimal performance:
- Functional-Positive-Agent: Valid "happy path" tests (Python/Rust)
- Functional-Negative-Agent: Boundary value analysis and negative tests (Python/Rust)
- Functional-Stateful-Agent: Multi-step workflows with SODG graphs (Python/Rust)
- Security-Auth-Agent: BOLA, authorization bypass (Python/Rust)
- Security-Injection-Agent: SQL/NoSQL/Command/LLM injection (Python/Rust)
- Performance-Planner-Agent: k6/JMeter/Locust scripts (Python/Rust)
- Data-Mocking-Agent: Schema-aware test data (Python/Rust)
Performance: Rust agents provide 18-21x faster execution with automatic fallback to Python for resilience.
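The Rust-first execution with automatic Python fallback can be sketched as follows (hypothetical function names; the real dispatch lives in the orchestration service):

```python
# Sketch of Rust-first agent execution with graceful Python fallback.
# run_rust_agent / run_python_agent are hypothetical stand-ins for the
# real agent entry points.

def execute_agent(task, run_rust_agent, run_python_agent):
    """Try the high-performance Rust core first; fall back to Python."""
    try:
        return run_rust_agent(task)
    except Exception:
        # Rust core unavailable or failed: degrade gracefully to Python.
        return run_python_agent(task)
```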
- Consciousness Verification: Self-modifying test generation with pattern learning
- Psycho-Symbolic Reasoning: Combines psychological models with symbolic logic
- Temporal Consciousness: Nanosecond-precision scheduling
- Knowledge Graph Integration: Semantic understanding of API relationships
- Sublinear Solvers: O(log n) performance for large-scale optimization
Supports multiple LLM providers with automatic fallback:
- Anthropic (Default): Claude Opus 4.1/4, Sonnet 4, Haiku 3.5
- OpenAI: GPT-4 Turbo, GPT-4, GPT-3.5 Turbo
- Google: Gemini 2.5 Pro/Flash, Gemini 2.0 Flash
- Mistral: Large, Small 3, Codestral
- Ollama (Local): DeepSeek-R1, Llama 3.3, Qwen 2.5
Configure via: cd sentinel_backend/scripts && ./switch_llm.sh
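A minimal sketch of the automatic provider fallback described above (hypothetical interface; the actual chain is configured via switch_llm.sh and the backend settings):

```python
# Sketch: try LLM providers in priority order, returning the first success.
# The provider callables here are hypothetical stand-ins for real clients.

def complete_with_fallback(prompt, providers):
    """providers: ordered list of (name, callable) pairs."""
    errors = {}
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:
            errors[name] = exc  # record the failure and try the next provider
    raise RuntimeError(f"all providers failed: {list(errors)}")
```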
- 540+ comprehensive tests with 97.8% pass rate
- 184 AI agent tests (Phase 1 complete)
- 272 LLM provider tests (Phase 2 complete)
- 45+ Playwright E2E tests for frontend
- Performance testing: Load, stress, concurrent execution
# Complete setup
make setup
# Start all services
make start
# Initialize/repair database
make init-db
# Check service status
make status
# Run tests in Docker
cd sentinel_backend && ./run_tests.sh -d

Release Workflow:
git checkout -b release/vX.X.X # 1. Create branch
# Update all version files above # 2. Update versions
npm run test:fast # 3. Run tests
git commit -m "chore(release): bump version to vX.X.X"
git push -u origin release/vX.X.X # 4. Push branch
gh pr create # 5. Create PR to main
# Wait for CI and review # 6. Review
# Merge PR # 7. Merge
git tag vX.X.X && git push origin vX.X.X # 8. Tag
gh release create vX.X.X # 9. GitHub release
npm publish --access public # 10. npm publish

19 QE Agents: Test generation, coverage analysis, performance, security, flaky detection, QX analysis
11 QE Subagents: TDD specialists, code reviewers, integration testers
41 QE Skills: agentic-quality-engineering, tdd-london-chicago, api-testing-patterns, six-thinking-hats, brutal-honesty-review, sherlock-review, cicd-pipeline-qe-orchestrator, accessibility-testing, shift-left-testing, testability-scoring (contributed by @fndlalit)
8 Slash Commands: /aqe-execute, /aqe-generate, /aqe-coverage, /aqe-quality
- Agent Reference - All 19 main agents + 11 subagents with capabilities and usage
- Skills Reference - All 41 QE skills organized by category
- Usage Guide - Complete usage examples and workflows
Spawn agents:
Task("Generate tests", "Create test suite for UserService", "qe-test-generator")
Task("Analyze coverage", "Find gaps using O(log n)", "qe-coverage-analyzer")

Check learning status:
aqe learn status --agent test-gen
aqe patterns list --framework jest

- Use Task tool for agent execution (not just MCP)
- Batch all operations in single messages (TodoWrite, file ops, etc.)
- Test with actual databases, not mocks
- Document only what actually works
ABSOLUTE RULES:
- ALL operations MUST be concurrent/parallel in a single message
- NEVER save working files, text/mds and tests to the root folder
- ALWAYS organize files in appropriate subdirectories
- USE CLAUDE CODE'S TASK TOOL for spawning agents concurrently, not just MCP
MANDATORY PATTERNS:
- TodoWrite: ALWAYS batch ALL todos in ONE call (5-10+ todos minimum)
- Task tool (Claude Code): ALWAYS spawn ALL agents in ONE message with full instructions
- File operations: ALWAYS batch ALL reads/writes/edits in ONE message
- Bash commands: ALWAYS batch ALL terminal operations in ONE message
- Memory operations: ALWAYS batch ALL memory store/retrieve in ONE message
Claude Code's Task tool is the PRIMARY way to spawn Claude Flow agents:
// ✅ CORRECT: Use Claude Code's Task tool for parallel agent execution
[Single Message]:
Task("Research agent", "Analyze requirements and patterns...", "researcher")
Task("Coder agent", "Implement core features...", "coder")
Task("Tester agent", "Create comprehensive tests...", "tester")
Task("Reviewer agent", "Review code quality...", "reviewer")
Task("Architect agent", "Design system architecture...", "system-architect")

Claude Flow MCP tools are ONLY for coordination setup:
- mcp__claude-flow__swarm_init - Initialize coordination topology
- mcp__claude-flow__agent_spawn - Define agent types for coordination
- mcp__claude-flow__task_orchestrate - Orchestrate high-level workflows
NEVER save to root folder. Use these directories:
- /src - Source code files
- /tests - Test files
- /docs - Documentation and markdown files
- /config - Configuration files
- /scripts - Utility scripts
- /examples - Example code
This project uses SPARC (Specification, Pseudocode, Architecture, Refinement, Completion) methodology with Claude-Flow orchestration for systematic Test-Driven Development.
- npx claude-flow sparc modes - List available modes
- npx claude-flow sparc run <mode> "<task>" - Execute specific mode
- npx claude-flow sparc tdd "<feature>" - Run complete TDD workflow
- npx claude-flow sparc info <mode> - Get mode details
- npx claude-flow sparc batch <modes> "<task>" - Parallel execution
- npx claude-flow sparc pipeline "<task>" - Full pipeline processing
- npx claude-flow sparc concurrent <mode> "<tasks-file>" - Multi-task processing
- npm run build - Build project
- npm run test - Run tests
- npm run lint - Linting
- npm run typecheck - Type checking
- Specification - Requirements analysis (sparc run spec-pseudocode)
- Pseudocode - Algorithm design (sparc run spec-pseudocode)
- Architecture - System design (sparc run architect)
- Refinement - TDD implementation (sparc tdd)
- Completion - Integration (sparc run integration)
- Modular Design: Files under 500 lines
- Environment Safety: Never hardcode secrets
- Test-First: Write tests before implementation
- Clean Architecture: Separate concerns
- Documentation: Keep updated
coder, reviewer, tester, planner, researcher
hierarchical-coordinator, mesh-coordinator, adaptive-coordinator, collective-intelligence-coordinator, swarm-memory-manager
byzantine-coordinator, raft-manager, gossip-coordinator, consensus-builder, crdt-synchronizer, quorum-manager, security-manager
perf-analyzer, performance-benchmarker, task-orchestrator, memory-coordinator, smart-agent
github-modes, pr-manager, code-review-swarm, issue-tracker, release-manager, workflow-automation, project-board-sync, repo-architect, multi-repo-swarm
sparc-coord, sparc-coder, specification, pseudocode, architecture, refinement
backend-dev, mobile-dev, ml-developer, cicd-engineer, api-docs, system-architect, code-analyzer, base-template-generator
tdd-london-swarm, production-validator
migration-planner, swarm-init
- Task tool: Spawn and run agents concurrently for actual work
- File operations (Read, Write, Edit, MultiEdit, Glob, Grep)
- Code generation and programming
- Bash commands and system operations
- Implementation work
- Project navigation and analysis
- TodoWrite and task management
- Git operations
- Package management
- Testing and debugging
- Swarm initialization (topology setup)
- Agent type definitions (coordination patterns)
- Task orchestration (high-level planning)
- Memory management
- Neural features
- Performance tracking
- GitHub integration
KEY: MCP coordinates the strategy, Claude Code's Task tool executes with real agents.
# Add MCP servers (Claude Flow required, others optional)
claude mcp add claude-flow npx claude-flow@alpha mcp start
claude mcp add ruv-swarm npx ruv-swarm mcp start # Optional: Enhanced coordination

swarm_init, agent_spawn, task_orchestrate
swarm_status, agent_list, agent_metrics, task_status, task_results
memory_usage, neural_status, neural_train, neural_patterns
github_swarm, repo_analyze, pr_enhance, issue_triage, code_review
benchmark_run, features_detect, swarm_monitor
- Optional: Use MCP tools to set up coordination topology
- REQUIRED: Use Claude Code's Task tool to spawn agents that do actual work
- REQUIRED: Each agent runs hooks for coordination
- REQUIRED: Batch all operations in single messages
// Single message with all agent spawning via Claude Code's Task tool
[Parallel Agent Execution]:
Task("Backend Developer", "Build REST API with Express. Use hooks for coordination.", "backend-dev")
Task("Frontend Developer", "Create React UI. Coordinate with backend via memory.", "coder")
Task("Database Architect", "Design PostgreSQL schema. Store schema in memory.", "code-analyzer")
Task("Test Engineer", "Write Jest tests. Check memory for API contracts.", "tester")
Task("DevOps Engineer", "Setup Docker and CI/CD. Document in memory.", "cicd-engineer")
Task("Security Auditor", "Review authentication. Report findings via hooks.", "reviewer")
// All todos batched together
TodoWrite { todos: [...8-10 todos...] }
// All file operations together
Write "backend/server.js"
Write "frontend/App.jsx"
Write "database/schema.sql"

1️⃣ BEFORE Work:
npx claude-flow@alpha hooks pre-task --description "[task]"
npx claude-flow@alpha hooks session-restore --session-id "swarm-[id]"

2️⃣ DURING Work:
npx claude-flow@alpha hooks post-edit --file "[file]" --memory-key "swarm/[agent]/[step]"
npx claude-flow@alpha hooks notify --message "[what was done]"

3️⃣ AFTER Work:
npx claude-flow@alpha hooks post-task --task-id "[task]"
npx claude-flow@alpha hooks session-end --export-metrics true

// Step 1: MCP tools set up coordination (optional, for complex tasks)
[Single Message - Coordination Setup]:
mcp__claude-flow__swarm_init { topology: "mesh", maxAgents: 6 }
mcp__claude-flow__agent_spawn { type: "researcher" }
mcp__claude-flow__agent_spawn { type: "coder" }
mcp__claude-flow__agent_spawn { type: "tester" }
// Step 2: Claude Code Task tool spawns ACTUAL agents that do the work
[Single Message - Parallel Agent Execution]:
// Claude Code's Task tool spawns real agents concurrently
Task("Research agent", "Analyze API requirements and best practices. Check memory for prior decisions.", "researcher")
Task("Coder agent", "Implement REST endpoints with authentication. Coordinate via hooks.", "coder")
Task("Database agent", "Design and implement database schema. Store decisions in memory.", "code-analyzer")
Task("Tester agent", "Create comprehensive test suite with 90% coverage.", "tester")
Task("Reviewer agent", "Review code quality and security. Document findings.", "reviewer")
// Batch ALL todos in ONE call
TodoWrite { todos: [
{id: "1", content: "Research API patterns", status: "in_progress", priority: "high"},
{id: "2", content: "Design database schema", status: "in_progress", priority: "high"},
{id: "3", content: "Implement authentication", status: "pending", priority: "high"},
{id: "4", content: "Build REST endpoints", status: "pending", priority: "high"},
{id: "5", content: "Write unit tests", status: "pending", priority: "medium"},
{id: "6", content: "Integration tests", status: "pending", priority: "medium"},
{id: "7", content: "API documentation", status: "pending", priority: "low"},
{id: "8", content: "Performance optimization", status: "pending", priority: "low"}
]}
// Parallel file operations
Bash "mkdir -p app/{src,tests,docs,config}"
Write "app/package.json"
Write "app/src/server.js"
Write "app/tests/server.test.js"
Write "app/docs/API.md"

Message 1: mcp__claude-flow__swarm_init
Message 2: Task("agent 1")
Message 3: TodoWrite { todos: [single todo] }
Message 4: Write "file.js"
// This breaks parallel coordination!

- Auto-assign agents by file type
- Validate commands for safety
- Prepare resources automatically
- Optimize topology by complexity
- Cache searches
- Auto-format code
- Train neural patterns
- Update memory
- Analyze performance
- Track token usage
- Generate summaries
- Persist state
- Track metrics
- Restore context
- Export workflows
Agent activity is automatically visualized in real-time when services are running.
# Terminal 1: Start backend services (WebSocket + REST API)
npx tsx scripts/start-visualization-services.ts
# Terminal 2: Start frontend dashboard
cd frontend && npm run dev

Then open http://localhost:3000 to view the dashboard.
Task agents automatically emit visualization events via Claude Code hooks:
- PreToolUse hook: Emits `agent:spawned` and `agent:started` events when the Task tool is invoked
- PostToolUse hook: Emits `agent:completed` or `agent:error` events when the Task completes
No manual action required - just use the Task tool and agents appear in the visualization!
For custom workflows or debugging:
# Emit spawn event with agent type
npx tsx scripts/emit-agent-event.ts spawn <agentId> <agentType>
# Emit start event
npx tsx scripts/emit-agent-event.ts start <agentId>
# Emit completion event with duration (ms)
npx tsx scripts/emit-agent-event.ts complete <agentId> [duration]
# Emit error event
npx tsx scripts/emit-agent-event.ts error <agentId> "Error message"

import { emitAgentSpawn, emitAgentComplete, emitAgentError } from './src/visualization';
// Emit events in your code
await emitAgentSpawn('my-agent', 'researcher');
await emitAgentComplete('my-agent', 5000);
await emitAgentError('my-agent', 'Something went wrong');

| Event | Status | When |
|---|---|---|
| `agent:spawned` | idle | Agent created |
| `agent:started` | running | Agent begins work |
| `agent:completed` | completed | Agent finishes |
| `agent:error` | error | Agent fails |
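The event-to-status mapping in the table above can be expressed as a simple lookup (a sketch for consumers of these visualization events):

```python
# Status assigned to an agent when each visualization event arrives,
# mirroring the event table above.
EVENT_STATUS = {
    "agent:spawned": "idle",
    "agent:started": "running",
    "agent:completed": "completed",
    "agent:error": "error",
}

def status_for(event: str) -> str:
    """Return the dashboard status for an event, or 'unknown'."""
    return EVENT_STATUS.get(event, "unknown")
```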
- Start with basic swarm init
- Scale agents gradually
- Use memory for context
- Monitor progress regularly
- Train patterns from success
- Enable hooks automation
- Use GitHub tools first
- Documentation: https://github.com/ruvnet/claude-flow
- Issues: https://github.com/ruvnet/claude-flow/issues
Remember: Claude Flow coordinates, Claude Code creates!
Do what has been asked; nothing more, nothing less. NEVER create files unless they're absolutely necessary for achieving your goal. ALWAYS prefer editing an existing file to creating a new one. NEVER proactively create documentation files (*.md) or README files. Only create documentation files if explicitly requested by the User. Never save working files, text/mds and tests to the root folder.
Generated by: Agentic QE Fleet v2.3.0 | Initialization Date: 2025-12-08T13:48:54.968Z | Fleet Topology: hierarchical