Common questions about Project O and their answers.
Project O is a self-evolving AI Agent system that can modify its own code at runtime. It combines:
- Gerbil Scheme for metaprogramming and self-modification
- Elixir/OTP for industrial-grade fault tolerance
- Zig for high-performance infrastructure
- Rust for compute-intensive operations
"O" represents:
- Origin: The starting point of self-evolving systems
- Ouroboros: The snake eating its own tail, symbolizing self-reference
- Optimization: Continuous self-improvement
Key features:
- True Self-Evolution: Can modify its own code, not just parameters
- Fault Tolerance: Elixir supervision prevents permanent failure
- Shadow Testing: Tests changes in isolated instances before applying
- Multi-Threaded Evolution: Runs parallel evolution experiments
- Zero Data Loss: Checkpoints + WAL ensure durability
Each language serves a specific purpose:
| Language | Purpose | Reason |
|---|---|---|
| Elixir | Supervision | Battle-tested fault tolerance (OTP) |
| Gerbil | Agent logic | Lisp metaprogramming for self-modification |
| Zig | Infrastructure | Fast, safe, simple C interop |
| Rust | Compute | Memory safety, SIMD optimization |
Gerbil advantages:
- Compiled macros (AOT) for better performance
- Native C FFI through Gambit
- Single-instance module system (faster)
- Production-ready (used in real systems)
Comparison:
| Feature | Gerbil | Racket | Common Lisp |
|---|---|---|---|
| Compiled macros | ✅ | ❌ | ✅ |
| Performance | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ |
| C FFI | Native | Additional layer | CFFI |
| Production use | ✅ | ✅ | ✅ |
Elixir was chosen over Erlang for:
- More modern syntax and tooling
- Better developer experience
- Active ecosystem
- Same BEAM VM benefits
- Easier to attract contributors
Running O without Elixir is not recommended. The Elixir supervision layer is critical for:
- Preventing permanent failure during evolution
- State persistence and recovery
- Shadow testing orchestration
- Multi-threaded evolution management
Without Elixir, the agent could destroy itself during evolution.
The evolution cycle:
- Detection: Agent identifies an improvement opportunity
- Generation: Generates new code using LLM or templates
- Checkpoint: Saves current state to Elixir
- Shadow Test: Tests new code in isolated instance
- Evaluation: Compares performance metrics
- Decision: Promotes if better, rejects if worse
- Hot Reload: Loads new code without restart
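The decision logic of this cycle can be sketched in a few lines. This is an illustrative Python sketch, not Project O's actual API (the real system implements this in Elixir and Gerbil); `run_benchmark`, the checkpoint shape, and the function name are all hypothetical:

```python
def evolution_cycle(current_code, candidate_code, run_benchmark):
    """Run one evolution cycle; return the code that should stay live."""
    checkpoint = {"code": current_code}        # 3. Checkpoint: save state first
    baseline = run_benchmark(current_code)     # 5. Evaluation needs a baseline
    candidate = run_benchmark(candidate_code)  # 4. Shadow test in isolation
    if candidate > baseline:                   # 6. Decision: strictly better?
        return candidate_code                  # 7. Hot reload (promote)
    return checkpoint["code"]                  # Rejected: restore from checkpoint
```

Note the ordering: the checkpoint is taken before any candidate code runs, so a rejected or crashing candidate can never lose the known-good state.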
After a crash:
- Elixir detects heartbeat timeout (5 seconds)
- Supervisor restarts GerbilManager
- New Gerbil process starts with the `--restore` flag
- Loads the last checkpoint from MemoryVault
- Replays WAL entries since the checkpoint
- Agent resumes from its pre-crash state
- Total downtime after detection: ~50-100ms
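The detection step boils down to a stale-heartbeat check. A minimal Python sketch of the idea — in Project O this is an Elixir supervisor concern, and the class and method names here are invented for illustration:

```python
import time

class Watchdog:
    """Flag a restart when no heartbeat arrives within the timeout."""

    def __init__(self, timeout=5.0):
        self.timeout = timeout
        self.last_beat = time.monotonic()

    def beat(self):
        # Called by the supervised process on every heartbeat.
        self.last_beat = time.monotonic()

    def check(self, now=None):
        """Return 'restart' when the heartbeat is stale, else 'ok'."""
        now = time.monotonic() if now is None else now
        return "restart" if now - self.last_beat > self.timeout else "ok"
```

In the real system the supervisor reacts to "restart" by relaunching the Gerbil process with `--restore`, which is what keeps downtime in the millisecond range once the timeout fires.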
Three-layer approach:
- Checkpoints: Full state snapshots every 5 minutes
  - Stored in DETS (disk-backed Erlang term storage)
  - File backup for redundancy
  - Compressed with zstd
- WAL (Write-Ahead Log): Every operation logged before execution
  - Segment-based files
  - Automatic rotation
  - Replay on recovery
- Shared Memory: Hot-path data (metrics, indexes)
  - No serialization overhead
  - Atomic operations
  - Fast access
Maximum data loss: < 1 second (WAL flush interval)
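The checkpoint-plus-WAL interaction can be illustrated with a toy log: every operation is appended before it is applied, and on recovery only entries newer than the last checkpoint are replayed. A Python sketch of the concept, not Project O's actual WAL format:

```python
class WAL:
    """Toy write-ahead log: append before applying, replay on recovery."""

    def __init__(self):
        self.entries = []  # (seq, op) pairs; a real WAL appends to disk

    def append(self, seq, op):
        # Must happen BEFORE the operation mutates live state.
        self.entries.append((seq, op))

    def replay_since(self, checkpoint_seq, state, apply):
        """Rebuild state from a checkpoint by replaying newer entries."""
        for seq, op in self.entries:
            if seq > checkpoint_seq:
                state = apply(state, op)
        return state
```

Because every durable operation exists in the log before execution, the only window for loss is the flush interval — which is why the document bounds data loss at under one second.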
Shadow testing runs new code in an isolated instance:
```
User Requests
  ├─ 90% of traffic ──────────────▶ Main Instance (production)
  └─ 10% of traffic (duplicated) ─▶ Shadow Instance (testing new code)
```
Process:
- Spawn shadow instance with new code
- Route 10% of traffic to shadow
- Compare metrics (latency, errors, memory)
- Promote if better, reject if worse
- Main instance unaffected during testing
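The promote/reject decision in step 4 can be sketched as a simple metric comparison. This Python sketch is illustrative only; the metric names and the 5% improvement margin are assumptions, not Project O's actual thresholds:

```python
def compare_shadow(main_metrics, shadow_metrics, margin=0.05):
    """Promote only when the shadow is clearly faster and no less safe."""
    faster = shadow_metrics["latency_ms"] <= main_metrics["latency_ms"] * (1 - margin)
    safe = shadow_metrics["error_rate"] <= main_metrics["error_rate"]
    return "promote" if (faster and safe) else "reject"
```

Requiring a margin rather than any improvement avoids promoting on noise from the small 10% traffic sample.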
Genetic Algorithm Approach:
- Population: Spawn 50 shadow instances
- Mutation: Each has different code variation
- Competition: All process same tasks
- Evaluation: Measure performance
- Selection: Keep top performers
- Crossover: Mix code from best instances
- Repeat: Iterate for N generations
Result: Finds optimal code through parallel experimentation
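The selection/crossover loop above can be sketched with a toy genetic algorithm. This Python sketch uses bitstrings as stand-ins for code variants; the fitness function, population size, and `keep` ratio are illustrative assumptions, not Project O's parameters:

```python
import random

def evolve(population, fitness, generations=10, keep=0.2, rng=random):
    """Toy GA: rank, keep top performers, refill via crossover, repeat."""
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        survivors = ranked[: max(2, int(len(ranked) * keep))]  # selection
        children = []
        while len(survivors) + len(children) < len(population):
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(a))
            children.append(a[:cut] + b[cut:])                 # crossover
        population = survivors + children
    return max(population, key=fitness)
```

Because the top performers always survive each generation, best-so-far fitness never regresses — the same property shadow promotion gives the single-instance cycle.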
Overhead of the Elixir supervision layer:
| Metric | Without Elixir | With Elixir | Overhead |
|---|---|---|---|
| Latency | 10ms | 11ms | +10% |
| Throughput | 10K QPS | 9K QPS | -10% |
| Memory | 80MB | 100MB | +25% |
Trade-off: ~10% performance for automatic recovery and durability
Yes, with optimizations:
- Shared Memory: Hot path data bypasses serialization
- Batch Operations: WAL writes batched (100 entries)
- Async Checkpoints: Background thread, non-blocking
- Connection Pooling: Database connections pooled
Target: 5,000+ QPS per instance
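The batching optimization trades a bounded flush delay for far fewer disk syncs. A minimal Python sketch of the idea — the 100-entry threshold matches the text, but the class shape and in-memory "storage" are illustrative assumptions:

```python
class BatchedWAL:
    """Buffer WAL appends; flush once the batch reaches a threshold."""

    def __init__(self, batch_size=100):
        self.batch_size = batch_size
        self.buffer = []
        self.flushed = []  # stands in for durable storage

    def append(self, entry):
        self.buffer.append(entry)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # One write/fsync per batch instead of one per entry.
        self.flushed.extend(self.buffer)
        self.buffer.clear()
```

A real implementation would also flush on a timer so that a slow trickle of entries still hits disk within the stated sub-second loss window.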
Per instance:
- Base: 80-100MB
- Memory blocks: ~1KB each
- Checkpoints: 50-100MB (compressed)
- WAL: 10-20MB per hour
Total: 150-200MB per agent instance
Recovery timeline:
- Heartbeat timeout detection: 5 seconds
- Supervisor restart: 10ms
- Checkpoint load: 1-2 seconds
- WAL replay: 100ms (1000 entries)
Total (after the 5-second detection window): ~2 seconds worst case, ~100ms typical
Required:
- Elixir 1.14+ and Erlang/OTP 25+
- Gerbil Scheme 0.18+
Optional:
- Zig 0.13+ (for infrastructure layer)
- Rust 1.70+ (for compute layer)
- Docker (for containerized deployment)
See GETTING_STARTED.md for details.
To run the tests:

```bash
cd o_supervisor
mix test                     # All tests
mix test --cover             # With coverage
mix test test/file_test.exs  # Specific file
mix test.watch               # Watch mode
```

Elixir debugging:
```elixir
# In IEx (start with: iex -S mix)

# Get process state
:sys.get_state(OSupervisor.MemoryVault)

# Trace messages
:sys.trace(OSupervisor.GerbilManager, true)

# Start Observer (GUI)
:observer.start()
```

Gerbil debugging:
```scheme
;; Add debug prints
(displayln "Debug: " variable)

;; Use the REPL: gerbil repl

;; Trace execution
(import :std/debug/trace)
(trace-call my-function args)
```

Contribution workflow:
- Read CONTRIBUTING.md
- Create feature branch
- Write tests first (TDD)
- Implement feature
- Update documentation
- Create ADR if architectural change
- Submit pull request
Elixir (supervision/infrastructure):
- `o_supervisor/lib/o_supervisor/` - Core modules
- `o_supervisor/test/` - Tests
Gerbil (agent logic):
- `gerbil/agent/` - Agent modules
- `gerbil/ffi/` - FFI bindings
- `gerbil/utils/` - Utilities
Zig (infrastructure):
- `zig/` - Infrastructure modules
Rust (compute):
- `rust/` - Compute modules
Option 1: Docker Compose (Recommended)

```bash
docker-compose up -d
```

Option 2: Elixir Release

```bash
cd o_supervisor
MIX_ENV=prod mix release
_build/prod/rel/o_supervisor/bin/o_supervisor start
```

Option 3: Kubernetes (coming soon)
Minimum:
- CPU: 2 cores
- RAM: 4GB
- Disk: 10GB
- OS: Linux or macOS
Recommended:
- CPU: 4+ cores
- RAM: 8GB+
- Disk: 50GB+ (for checkpoints/WAL)
- OS: Linux (Ubuntu 20.04+)
Built-in monitoring:
- Prometheus metrics: http://localhost:9568/metrics
- Grafana dashboards: http://localhost:3000
- Health check: http://localhost:4000/health

Key metrics:
- `o_supervisor_health_metrics` - Agent health
- `o_supervisor_checkpoint_created` - Checkpoint events
- `o_supervisor_wal_appended` - WAL operations
- `vm_memory_total` - Memory usage
What to back up:
- Checkpoints: `data/checkpoints/`
- WAL logs: `data/wal/`
- Configuration: `o_supervisor/config/`
Backup script:

```bash
#!/bin/bash
BACKUP_DIR="/backups/o_$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR"
cp -r data/checkpoints "$BACKUP_DIR/"
cp -r data/wal "$BACKUP_DIR/"
cp -r o_supervisor/config "$BACKUP_DIR/"
tar -czf "$BACKUP_DIR.tar.gz" "$BACKUP_DIR"
rm -rf "$BACKUP_DIR"
```

Restore procedure:

```bash
# Stop O
docker-compose down

# Extract backup
tar -xzf backup.tar.gz

# Restore files
cp -r backup/checkpoints data/
cp -r backup/wal data/
cp -r backup/config o_supervisor/

# Start O
docker-compose up -d
```

If O won't start, check:
- Elixir/Erlang installed: `elixir --version`
- Gerbil installed: `gerbil version`
- Data directories exist: `ls data/`
- Ports available: `lsof -i :4000`
- Logs: `tail -f data/logs/o_supervisor.log`
If checkpoints are corrupted:

```bash
# Remove corrupted checkpoints
rm data/checkpoints/*.ckpt
rm data/checkpoints/checkpoints.dets

# Restart O (will create a new checkpoint)
docker-compose restart o_supervisor
```

To compact old WAL segments:

```bash
cd o_supervisor
iex -S mix
```

Then, in IEx:

```elixir
OSupervisor.WALManager.compact_old_segments()
```

For high memory usage, check:
- Number of memory blocks: Too many?
- Checkpoint size: Too large?
- Shadow instances: Too many running?
Solutions:
- Reduce `max_concurrent_shadows` in config
- Implement memory block pruning
- Increase checkpoint compression level
Profile with fprof:

```elixir
# In IEx
:fprof.trace([:start])
# Run your operation
:fprof.trace([:stop])
:fprof.profile()
:fprof.analyse()
```

Common causes:
- Too frequent checkpoints
- Large WAL entries
- Slow disk I/O
- Network latency
Security features:
- Input validation on all messages
- Sandboxed code execution (planned)
- Resource limits per shadow instance
- Encrypted data at rest (planned)
- Encrypted data in transit (planned)
Security considerations:
- O can modify its own code (by design)
- Shadow testing provides safety net
- Elixir supervision prevents permanent damage
- WAL provides audit trail
DO NOT open public issues for security vulnerabilities.
Instead:
- Email: security@project-o.example.com
- Include: Description, steps to reproduce, impact
- We'll respond within 24 hours
- We'll work with you on disclosure timeline
O is designed for legitimate AI agent development. Like any powerful tool, it can be misused. We:
- Provide security guidelines
- Implement safety mechanisms
- Monitor for abuse
- Reserve right to revoke access
See CONTRIBUTING.md for:
- Code contributions
- Documentation improvements
- Bug reports
- Feature requests
- Community support
- Documentation: Check the `docs/` directory
- FAQ: This document
- Issues: Search existing issues
- Discussions: GitHub Discussions
- New Issue: Open one if not found
Yes! See IMPLEMENTATION_CHECKLIST.md:
- Phase 0 ✅ Complete - Elixir foundation
- Phase 1 🚧 In Progress - Gerbil core
- Phase 2 📋 Planned - Infrastructure (Zig)
- Phase 3 📋 Planned - Protected evolution
- Phase 4 📋 Planned - Multi-threaded evolution
- Phase 5 📋 Planned - Advanced features
MIT License. See LICENSE file.
Yes! O is designed to be extensible:
Elixir modules:

```elixir
defmodule OSupervisor.MyCustomModule do
  use GenServer
  # Your implementation
end

# Add to the supervision tree in application.ex
```

Gerbil modules:
```scheme
;;; my-module.ss
(export #t my-function)

(def (my-function arg)
  ;; Your implementation
  )
```

Yes! O's LLM integration is pluggable:
```scheme
;; gerbil/agent/llm.ss
(def (make-llm provider: provider model: model ...)
  (case provider
    (:openai (make-openai-client ...))
    (:anthropic (make-anthropic-client ...))
    (:ollama (make-ollama-client ...))
    (:custom (make-custom-client ...))))
```

Yes! Each instance is independent:
```bash
# Instance 1
PORT=4000 iex -S mix

# Instance 2
PORT=4001 iex -S mix

# Or with Docker
docker-compose up --scale o_supervisor=3
```

Yes! This is a Phase 5 goal:
- Agent analyzes evolution success rate
- Generates new evolution strategies
- Tests strategies in shadow instances
- Adopts better strategies
- Meta-evolution: evolving how to evolve
- Documentation: `docs/`
- GitHub Issues
- GitHub Discussions
Last Updated: 2026-01-16
Version: 1.0