GrandRegentSarva/TARS

TARS -- Autonomous Drone Observability & Learning Platform

Existing drone platforms monitor missions. Our platform learns from missions.

Every failure, mitigation, outcome, agent decision, and evaluation becomes operational knowledge that improves future mission recommendations.


What Is TARS?

TARS is a runtime feedback system for autonomous drones. It lets drone agents continuously trace, evaluate, and introspect their own behavior — detecting telemetry anomalies, decision inconsistencies, and mission failures to iteratively refine future actions.

Built with PX4, Gazebo, MAVSDK, Python, Gemini, Phoenix, Neo4j, and Redis.


Architecture

TARS is organized as a layered pipeline. Each phase builds on the phases before it:

| Layer | What It Does | Port |
|---|---|---|
| Phase 1 -- Mission Foundation | Runs PX4 SITL + Gazebo headless simulations, collects async telemetry via MAVSDK, injects faults (GPS block, battery drain, sensor cascade, etc.), and writes structured JSON output. | - |
| Phase 2 -- Mission Replay | Imports mission JSON into PostgreSQL, provides ordered replay frames with elapsed timing, and exposes a FastAPI REST API for mission queries. | 8000 |
| Phase 3 -- State Engine | Transforms replay frames into classified mission states (phase, health, risk score) and stores them as Redis timelines. Deterministic phase classification and additive risk scoring. | 8002 |
| Phase 4 -- Incident Engine | Evaluates state timelines against 7 rule types across 4 severity levels. Collapses consecutive matches into bounded incidents with gap-based merging and persistence thresholds. | 8003 |
| Phase 5 -- Gemini Reasoning | Analyzes bounded incidents using Google Gemini (via ADK) to produce structured, advisory-only root-cause assessments. Provider-neutral interface with versioned prompts and control-command rejection at the model boundary. | 8004 |
| Phase 6 -- Phoenix Integration | Instruments the reasoning layer with OpenTelemetry tracing and exports spans to Arize Phoenix. Produces parent-child span hierarchies with OpenInference semantic conventions and configurable content capture (full / metadata / disabled). | - |
| Phase 7 -- Operational Memory | Projects bounded facts from Phases 2, 4, and 5 into a Neo4j graph. Connects missions → incidents → root causes → mitigations → outcomes. Answers "Have we seen this before?" with provenance-preserving history queries. | 8005 |
| Phase 8 -- Phoenix MCP | Analysis-only self-introspection via Phoenix traces. Lets the reasoning agent inspect its own prior reasoning through 3 read-only MCP tools (search, summary, compare). Fail-open design: Phoenix unavailability never blocks reasoning. Supports 4 content modes, secret redaction, and not_an_evaluation enforcement. | - |
| Phase 9 -- Evaluation Layer | Measures reasoning quality against bounded ground-truth labels, mission outcomes, and incident facts. Produces durable, inspectable evaluation scores (root-cause accuracy, recommendation quality, consistency, false positives/negatives) without changing operational behavior. | 8006 |
| Phase 10 -- Learning Engine | Turns evaluated mission history into candidate operational knowledge. Aggregates Phase 9 evaluations, Phase 7 operational memory, and safe trace metadata into candidate knowledge with evidence, confidence, and provenance. Candidate knowledge is not truth; it is input to Phase 11 validation. | 8007 |

Each phase that lists a port above runs as its own FastAPI service with its own configuration; Phases 1, 6, and 8 have no standalone API. Phases communicate over HTTP rather than through shared databases, which keeps them loosely coupled.
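For quick reference, the port assignments above can be captured in a small lookup. This is only an illustrative sketch: `PHASE_PORTS` and `health_url` are hypothetical names, not part of the TARS codebase. It assumes every service runs on localhost and exposes the `/health` path shown in the per-phase sections of this README (for Phase 10 that path is an assumption).

```python
# Ports as listed in the architecture table. Phases 1, 6, and 8 expose no
# service of their own, so they are deliberately absent from the map.
PHASE_PORTS = {2: 8000, 3: 8002, 4: 8003, 5: 8004, 7: 8005, 9: 8006, 10: 8007}


def health_url(phase: int, host: str = "localhost") -> str:
    """Return the health-check URL for a phase that runs its own API."""
    if phase not in PHASE_PORTS:
        raise ValueError(f"Phase {phase} has no standalone API")
    return f"http://{host}:{PHASE_PORTS[phase]}/health"
```

For example, `health_url(4)` yields `http://localhost:8003/health`, which matches the Incident API health check.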


Quick Start

Prerequisites

  • Docker 24+ with Docker Compose v2
  • Python 3.10+
  • ~10GB disk space for PX4 Docker image (one-time download)
  • Optional: QGroundControl for visual drone tracking

1. Clone and Setup

# Clone the repository first, then enter the checkout (adjust the path to your clone)
cd ~/Desktop/Projects/TARS

# Copy environment config
cp .env.example .env

# Create a Python virtual environment and install dependencies
python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt

2. Start the Simulation

# First run builds the Docker image (~15-30 min, downloads PX4 source + compiles)
./scripts/start_simulation.sh

# Wait until you see "Ready for takeoff" in the logs

Note: The first build takes a while because it clones and compiles the entire PX4 firmware. Subsequent starts are fast (~10 seconds).

3. Run a Mission (in a new terminal)

# Activate the venv first
. .venv/bin/activate

# Run the default square mission with telemetry collection
./scripts/run_mission.sh

# Or with a custom mission ID
MISSION_ID=my_first_mission ./scripts/run_mission.sh

# Run a mission with a fault scenario (faults recorded in output JSON)
FAULT_SCENARIO=s1 ./scripts/run_mission.sh

4. View the Output

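# Telemetry is saved as JSON in output/; pretty-print the most recent mission file
# (json.tool fails on several concatenated files, so pass exactly one)
python3 -m json.tool "$(ls -t output/mission_*.json | head -1)" | head -50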
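If you prefer to inspect the output from Python, a minimal sketch follows. It makes no assumption about the mission schema and only reports the top-level keys and their types. The helper names `latest_mission_file` and `top_level_summary` are hypothetical and exist only for this example.

```python
import json
from pathlib import Path


def latest_mission_file(output_dir="output"):
    """Return the most recently modified mission_*.json file, or None."""
    files = sorted(
        Path(output_dir).glob("mission_*.json"),
        key=lambda path: path.stat().st_mtime,
    )
    return files[-1] if files else None


def top_level_summary(path):
    """Map each top-level key of a mission file to the type name of its value."""
    data = json.loads(Path(path).read_text())
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    return {"<root>": type(data).__name__}
```

Calling `top_level_summary(latest_mission_file())` after a mission run gives a quick overview of what the file contains before you import it into Phase 2.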

5. Inject Faults (in another terminal)

# Activate the venv first
. .venv/bin/activate

# Interactive fault injection while a mission is running
PYTHONPATH=src .venv/bin/python3 -m tars.phase1.fault_injector

6. Stop the Simulation

./scripts/start_simulation.sh --stop

Project Structure

TARS/
|-- plans/                              # Architecture and planning docs
|   |-- phase-1-mission-foundation.md
|   |-- phase-2-mission-replay-system.md
|   |-- phase-3-state-engine.md
|   |-- phase-4-incident-engine.md
|   |-- phase-5-gemini-reasoning-layer.md
|   |-- phase-6-phoenix-integration.md
|   |-- phase-7-neo4j-operational-memory.md
|   +-- phase-8-phoenix-mcp.md
|-- docker/                             # Docker setup
|   |-- Dockerfile.px4-sitl             # PX4 SITL + Gazebo headless
|   +-- docker-compose.yml              # PX4 SITL + PostgreSQL + Redis + Neo4j
|-- src/
|   +-- tars/
|       |-- phase1/                     # Phase 1 -- Mission Foundation
|       |   |-- telemetry_collector.py  # Async telemetry streaming via MAVSDK
|       |   |-- mission_runner.py       # Autonomous mission execution
|       |   |-- fault_injector.py       # Fault injection + scenarios
|       |   +-- models/
|       |       +-- telemetry.py        # Pydantic data models
|       |-- phase2/                     # Phase 2 -- Mission Replay System
|       |   |-- api.py                  # FastAPI app and routes
|       |   |-- config.py              # Environment settings
|       |   |-- database.py            # Async SQLAlchemy engine/session
|       |   |-- importer.py            # Phase 1 JSON import + validation
|       |   |-- replay.py              # Replay frame construction
|       |   |-- service.py             # Mission query orchestration
|       |   +-- models/
|       |       |-- db.py              # SQLAlchemy ORM tables
|       |       +-- schemas.py         # API request/response schemas
|       |-- phase3/                     # Phase 3 -- State Engine
|       |   |-- api.py                  # FastAPI app (port 8002)
|       |   |-- config.py              # Environment settings
|       |   |-- models.py              # Pydantic models and enums
|       |   |-- phase_classifier.py    # Deterministic phase rules
|       |   |-- risk.py                # Risk scoring and health assessment
|       |   |-- state_processor.py     # Frame-to-state transformation
|       |   |-- store.py               # Async Redis state store
|       |   |-- replay_client.py       # HTTP client for Phase 2 API
|       |   +-- service.py             # Processing orchestration
|       |-- phase4/                     # Phase 4 -- Incident Engine
|       |   |-- api.py                  # FastAPI app (port 8003)
|       |   |-- config.py              # Environment settings
|       |   |-- models.py              # Incident enums and schemas
|       |   |-- rules.py               # Deterministic state rules
|       |   |-- statistics.py          # Rolling windows and trend detection
|       |   |-- detector.py            # Incident collapser
|       |   |-- store.py               # Async Redis incident store
|       |   |-- state_client.py        # HTTP client for Phase 3 API
|       |   +-- service.py             # Detection orchestration
|       |-- phase5/                     # Phase 5 -- Gemini Reasoning Layer
|       |   |-- api.py                  # FastAPI app (port 8004)
|       |   |-- config.py              # Environment settings
|       |   |-- models.py              # Reasoning schemas and provider protocol
|       |   |-- prompts.py             # Versioned system instruction and prompt
|       |   |-- agent.py               # Google ADK Gemini agent configuration
|       |   |-- provider.py            # Gemini + fake reasoning providers
|       |   |-- incident_client.py     # HTTP client for Phase 4 API
|       |   |-- store.py               # Async Redis reasoning store
|       |   +-- service.py             # Reasoning orchestration
|       |-- phase6/                     # Phase 6 -- Phoenix Integration
|       |   |-- config.py              # PhoenixSettings (env-driven)
|       |   |-- attributes.py          # Stable trace attribute constants
|       |   +-- tracing.py             # TracerProvider setup, OTLP exporter
|       |-- phase7/                     # Phase 7 -- Operational Memory
|       |   |-- api.py                  # FastAPI app (port 8005)
|       |   |-- config.py               # Environment settings
|       |   |-- models.py               # Graph models, enums, request/response
|       |   |-- database.py             # Async Neo4j driver lifecycle
|       |   |-- schema.py               # Constraints and indexes
|       |   |-- mapper.py               # Pure mapping + deterministic IDs
|       |   |-- repository.py           # Graph MERGE/MATCH operations
|       |   |-- service.py              # Sync + query orchestration
|       |   |-- phase2_client.py        # HTTP client for Phase 2 API
|       |   |-- phase4_client.py        # HTTP client for Phase 4 API
|       |   +-- phase5_client.py        # HTTP client for Phase 5 API
|       |-- phase8/                     # Phase 8 -- Phoenix MCP Self-Introspection
|       |   |-- config.py               # PhoenixMCPSettings (env-driven, disabled by default)
|       |   |-- models.py               # Pydantic models, secret redaction, safety bounds
|       |   |-- phoenix_client.py       # GraphQL client + FakePhoenixTraceClient
|       |   |-- summarizer.py           # Raw trace → safe summary conversion
|       |   |-- mcp_tools.py            # 3 read-only MCP tool definitions
|       |   |-- tool_policy.py          # IntrospectionPolicy decision engine
|       |   +-- service.py              # IntrospectionService orchestration
|       |-- phase9/                     # Phase 9 -- Evaluation Layer
|       |   |-- api.py                  # FastAPI app (port 8006)
|       |   |-- config.py               # Environment settings + weight validation
|       |   |-- models.py               # Evaluation schemas, enums, metric contracts
|       |   |-- database.py             # Async SQLAlchemy engine/session
|       |   |-- repository.py           # PostgreSQL evaluation persistence
|       |   |-- evaluator.py            # Deterministic scoring (root cause, recommendation, consistency)
|       |   |-- ground_truth.py         # Multi-source ground-truth resolution
|       |   |-- service.py              # Evaluation orchestration
|       |   |-- phoenix_exporter.py     # Optional Phoenix eval export
|       |   +-- adapters/
|       |       |-- phase4_client.py    # Read-only Phase 4 incident client
|       |       |-- phase5_client.py    # Read-only Phase 5 reasoning client
|       |       +-- phase7_client.py    # Read-only Phase 7 outcome client
|       +-- phase10/                    # Phase 10 -- Learning Engine
|           |-- api.py                  # FastAPI app (port 8007)
|           |-- config.py              # Environment settings + weight validation
|           |-- models.py              # Learning schemas, enums, candidate contracts
|           |-- database.py            # Async SQLAlchemy engine/session
|           |-- repository.py          # PostgreSQL candidate knowledge persistence
|           |-- service.py             # Learning run orchestration
|           |-- evidence_loader.py     # Phase 9 + Phase 7 evidence merge
|           |-- pattern_miner.py       # Deterministic pattern grouping
|           |-- scorer.py              # Versioned confidence scoring
|           |-- statement_templates.py # Cautious association language templates
|           +-- adapters/
|               |-- phase9_client.py   # Read-only Phase 9 evaluation client
|               |-- phase7_client.py   # Read-only Phase 7 memory client
|               +-- phoenix_client.py  # Read-only Phoenix trace metadata client
|-- migrations/                         # Alembic database migrations
|   |-- env.py
|   +-- versions/
|-- scripts/
|   |-- start_simulation.sh             # Launch/stop PX4 simulation
|   |-- run_mission.sh                  # Run a Phase 1 mission
|   |-- start_replay_api.sh             # Start Phase 2 API server
|   |-- import_mission.sh               # Import mission JSON via API
|   |-- start_state_api.sh              # Start Phase 3 State API server
|   |-- process_mission_state.sh        # Process a mission through Phase 3
|   |-- start_incident_api.sh           # Start Phase 4 Incident API server
|   |-- process_mission_incidents.sh    # Detect incidents for a mission
|   |-- start_reasoning_api.sh          # Start Phase 5 Reasoning API server
|   |-- analyze_incident.sh            # Analyze an incident through Phase 5
|   |-- start_memory_api.sh            # Start Phase 7 Memory API server
|   |-- sync_mission_memory.sh         # Sync a mission into Neo4j graph
|   |-- query_similar_incidents.sh     # Query similar historical incidents
|   |-- start_evaluation_api.sh        # Start Phase 9 Evaluation API server
|   +-- start_learning_api.sh          # Start Phase 10 Learning API server
|-- tests/
|   |-- phase2/                         # Phase 2 tests
|   |   |-- test_importer.py
|   |   |-- test_replay.py
|   |   +-- test_api.py
|   |-- phase3/                         # Phase 3 tests
|   |   |-- test_phase_classifier.py
|   |   |-- test_risk.py
|   |   |-- test_state_processor.py
|   |   |-- test_store.py
|   |   +-- test_api.py
|   |-- phase4/                         # Phase 4 tests
|   |   |-- test_rules.py
|   |   |-- test_statistics.py
|   |   |-- test_detector.py
|   |   |-- test_store.py
|   |   +-- test_api.py
|   |-- phase5/                         # Phase 5 tests
|   |   |-- test_models.py
|   |   |-- test_prompts.py
|   |   |-- test_client.py
|   |   |-- test_provider.py
|   |   |-- test_store.py
|   |   |-- test_service.py
|   |   +-- test_api.py
|   |-- phase6/                         # Phase 6 tests
|   |   |-- test_config.py              # 31 configuration tests
|   |   |-- test_tracing.py             # 14 tracing bootstrap tests
|   |   +-- test_reasoning_traces.py    # 44 reasoning trace tests
|   |-- phase7/                         # Phase 7 tests
|   |   |-- test_models.py              # Model validation tests
|   |   |-- test_mapper.py              # Mapping + deterministic ID tests
|   |   |-- test_clients.py             # Upstream HTTP client tests
|   |   |-- test_repository.py          # Graph operation tests
|   |   |-- test_service.py             # Service orchestration tests
|   |   +-- test_api.py                 # API endpoint tests
|   |-- phase8/                         # Phase 8 tests (149 tests)
|   |   |-- test_config.py              # 18 configuration tests
|   |   |-- test_models.py              # 30 model validation tests
|   |   |-- test_summarizer.py          # 17 summarization tests
|   |   |-- test_phoenix_client.py      # 18 fake client tests
|   |   |-- test_mcp_tools.py           # 12 MCP tool tests
|   |   |-- test_tool_policy.py         # 9 policy decision tests
|   |   +-- test_reasoning_integration.py  # 45 integration tests
|   |-- phase9/                         # Phase 9 tests
|   |   |-- test_config.py              # Configuration validation tests
|   |   |-- test_models.py              # Model validation tests
|   |   |-- test_evaluator.py           # Deterministic scoring tests
|   |   |-- test_ground_truth.py        # Ground-truth resolution tests
|   |   |-- test_repository.py          # Persistence tests
|   |   |-- test_service.py             # Service orchestration tests
|   |   |-- test_phoenix_exporter.py    # Phoenix export tests
|   |   +-- test_api.py                 # API endpoint tests
|   +-- phase10/                        # Phase 10 tests
|       |-- test_config.py              # Configuration validation tests
|       |-- test_models.py              # Model + enum validation tests
|       |-- test_evidence_loader.py     # Evidence merge + dedup tests
|       |-- test_pattern_miner.py       # Deterministic pattern grouping tests
|       |-- test_scorer.py              # Confidence scoring tests
|       |-- test_repository.py          # Persistence tests
|       |-- test_service.py             # Service orchestration tests
|       +-- test_api.py                 # API endpoint tests
|-- output/                             # Telemetry JSON files
|-- alembic.ini                         # Alembic configuration
|-- pytest.ini                          # Pytest configuration
|-- requirements.txt                    # Python dependencies
|-- .env.example                        # Configuration template
+-- README.md

Phase 2 -- Mission Replay System

Phase 2 runs independently of PX4/Gazebo. You only need PostgreSQL and the Phase 2 API.

1. Start PostgreSQL

docker compose -f docker/docker-compose.yml up postgres -d

2. Run Database Migrations

PYTHONPATH=src .venv/bin/alembic upgrade head

3. Start the Replay API

./scripts/start_replay_api.sh

# API docs available at http://localhost:8000/docs
# Health check at http://localhost:8000/health

4. Import a Mission

# Via the script (requires API running)
./scripts/import_mission.sh output/mission_20260608_120000.json

# Or via curl
curl -X POST http://localhost:8000/api/v1/missions/import \
  -H "Content-Type: application/json" \
  -d '{"path": "output/mission_20260608_120000.json", "overwrite": false}'

5. Query Missions

# List all missions
curl http://localhost:8000/api/v1/missions

# Get mission detail (includes faults)
curl http://localhost:8000/api/v1/missions/mission_20260608_120000

# Get telemetry events
curl http://localhost:8000/api/v1/missions/mission_20260608_120000/events

# Replay a mission
curl http://localhost:8000/api/v1/missions/mission_20260608_120000/replay

# Replay with time range
curl "http://localhost:8000/api/v1/missions/mission_20260608_120000/replay?from_ms=5000&to_ms=30000"
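To make the idea of ordered replay frames with elapsed timing concrete, here is a minimal sketch. It is not the code in `replay.py`. The `timestamp_ms` field name, the `ReplayFrame` shape, and the inclusive treatment of `from_ms` / `to_ms` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplayFrame:
    index: int        # position within the returned replay
    elapsed_ms: int   # time since the first event of the mission
    payload: dict     # the original telemetry event


def build_replay_frames(events, from_ms=None, to_ms=None):
    """Order events by timestamp and attach elapsed time relative to the first one."""
    ordered = sorted(events, key=lambda event: event["timestamp_ms"])
    if not ordered:
        return []
    start = ordered[0]["timestamp_ms"]
    frames = []
    for event in ordered:
        elapsed = event["timestamp_ms"] - start
        # Assumed semantics: both bounds are inclusive and expressed as elapsed time.
        if from_ms is not None and elapsed < from_ms:
            continue
        if to_ms is not None and elapsed > to_ms:
            continue
        frames.append(ReplayFrame(len(frames), elapsed, event))
    return frames
```

Elapsed time is always measured from the first event of the whole mission, so a filtered replay keeps the same time axis as the full one.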

6. Run Tests

# Requires PostgreSQL running
PYTHONPATH=src .venv/bin/pytest tests/phase2/ -v

Phase 3 -- State Engine

Phase 3 runs independently of PX4/Gazebo. You need Redis, the Phase 2 API (for replay data), and the Phase 3 State API.
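The architecture overview describes Phase 3 risk scoring as additive. The sketch below shows what an additive score could look like. The factor names, weights, and health thresholds are invented for this example and are not taken from `risk.py`.

```python
# Hypothetical factor weights; the real values live in the Phase 3 code.
RISK_WEIGHTS = {
    "low_battery": 0.4,
    "gps_degraded": 0.3,
    "high_vibration": 0.2,
    "link_loss": 0.3,
}


def risk_score(active_factors):
    """Sum the weights of the distinct active factors, capped at 1.0."""
    total = 0.0
    for factor in set(active_factors):
        total += RISK_WEIGHTS[factor]  # an unknown factor raises KeyError
    return min(1.0, round(total, 3))


def health_label(score):
    """Map a risk score onto a coarse health label (assumed thresholds)."""
    if score >= 0.7:
        return "critical"
    if score >= 0.3:
        return "degraded"
    return "nominal"
```

Because the score is a plain sum over active factors, the same set of factors always produces the same score, which keeps the classification deterministic.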

1. Start Redis

docker compose -f docker/docker-compose.yml up redis -d

2. Start the Phase 2 Replay API

Phase 3 fetches replay frames from Phase 2, so the Phase 2 API must be running:

# Start PostgreSQL + run migrations if not already done
docker compose -f docker/docker-compose.yml up postgres -d
PYTHONPATH=src .venv/bin/alembic upgrade head
./scripts/start_replay_api.sh

3. Start the State API

./scripts/start_state_api.sh

# API docs available at http://localhost:8002/docs
# Health check at http://localhost:8002/health

4. Process a Mission

# Via the script (requires both APIs running)
./scripts/process_mission_state.sh mission_20260608_120000

# Or via curl
curl -X POST http://localhost:8002/api/v1/state/process/mission_20260608_120000 \
  -H "Content-Type: application/json" \
  -d '{}'

# Process with time range (partial replay -- does not update current state)
curl -X POST http://localhost:8002/api/v1/state/process/mission_20260608_120000 \
  -H "Content-Type: application/json" \
  -d '{"from_ms": 5000, "to_ms": 30000}'

5. Query State

# Get current state snapshot
curl http://localhost:8002/api/v1/state/mission_20260608_120000/current

# Get full state timeline
curl http://localhost:8002/api/v1/state/mission_20260608_120000/timeline

# Get timeline for a time range
curl "http://localhost:8002/api/v1/state/mission_20260608_120000/timeline?from_ms=5000&to_ms=30000"

# Get state at a specific time
curl http://localhost:8002/api/v1/state/mission_20260608_120000/at/15000

# Get processing status
curl http://localhost:8002/api/v1/state/mission_20260608_120000/status
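A lookup such as "state at a specific time" can be served from a sorted timeline with a binary search. The sketch below is one possible approach, assuming that the endpoint returns the most recent state at or before the requested time and that each state carries an `elapsed_ms` field. Neither assumption is confirmed by this README.

```python
from bisect import bisect_right


def state_at(timeline, at_ms):
    """Return the latest state whose elapsed_ms is <= at_ms, or None if none exists.

    The timeline must already be sorted by elapsed_ms in ascending order.
    """
    times = [state["elapsed_ms"] for state in timeline]
    position = bisect_right(times, at_ms)
    return timeline[position - 1] if position else None
```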

6. Run Tests

# Pure logic tests (no Redis required)
PYTHONPATH=src .venv/bin/pytest tests/phase3/test_phase_classifier.py tests/phase3/test_risk.py tests/phase3/test_state_processor.py -v

# All tests including Redis integration (requires Redis running)
PYTHONPATH=src .venv/bin/pytest tests/phase3/ -v

Phase 4 -- Incident Engine

Phase 4 runs independently of PX4/Gazebo. You need Redis, the Phase 3 State API (for state timelines), and the Phase 4 Incident API.
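The architecture overview says Phase 4 collapses consecutive rule matches into bounded incidents using gap-based merging and persistence thresholds. A minimal sketch of that idea follows. The function name, parameters, and the exact comparison rules are assumptions, not the behaviour of `detector.py`.

```python
def collapse_matches(match_times_ms, max_gap_ms, min_duration_ms):
    """Merge rule matches into (start_ms, end_ms) incidents.

    Matches separated by at most max_gap_ms are merged into one incident.
    Incidents shorter than min_duration_ms are dropped as non-persistent.
    """
    incidents = []
    start = end = None
    for t in sorted(match_times_ms):
        if start is None:
            start = end = t
        elif t - end <= max_gap_ms:
            end = t  # close enough to the previous match: extend the incident
        else:
            incidents.append((start, end))
            start = end = t
    if start is not None:
        incidents.append((start, end))
    return [(s, e) for s, e in incidents if e - s >= min_duration_ms]
```

With a 1000 ms gap and a 1000 ms persistence threshold, an isolated single-frame match is discarded while two sustained runs survive as separate incidents.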

1. Start Redis and Phase 3

# Start Redis
docker compose -f docker/docker-compose.yml up redis -d

# Start Phase 2 + Phase 3 APIs (Phase 4 depends on Phase 3 timelines;
# the Phase 2 API also needs PostgreSQL running and migrated, as in the Phase 3 setup)
./scripts/start_replay_api.sh &
./scripts/start_state_api.sh &

2. Start the Incident API

./scripts/start_incident_api.sh

# API docs available at http://localhost:8003/docs
# Health check at http://localhost:8003/health

3. Process Mission Incidents

# Via the script (requires Phase 3 + Phase 4 APIs running)
./scripts/process_mission_incidents.sh mission_20260608_120000

# Or via curl
curl -X POST http://localhost:8003/api/v1/incidents/process/mission_20260608_120000 \
  -H "Content-Type: application/json" \
  -d '{}'

4. Query Incidents

# List all incidents for a mission
curl http://localhost:8003/api/v1/incidents/mission_20260608_120000

# List incidents within a time range
curl "http://localhost:8003/api/v1/incidents/mission_20260608_120000?from_ms=5000&to_ms=30000"

# Get a specific incident by ID
curl http://localhost:8003/api/v1/incidents/mission_20260608_120000/inc_abc123

# Get processing status
curl http://localhost:8003/api/v1/incidents/mission_20260608_120000/status

5. Run Tests

# Pure logic tests (no Redis required)
PYTHONPATH=src .venv/bin/pytest tests/phase4/test_rules.py tests/phase4/test_statistics.py tests/phase4/test_detector.py -v

# All tests including Redis integration (requires Redis running)
PYTHONPATH=src .venv/bin/pytest tests/phase4/ -v

Phase 5 -- Gemini Reasoning Layer

Phase 5 runs independently of PX4/Gazebo. You need Redis, the Phase 4 Incident API (for incident data), and the Phase 5 Reasoning API.

1. Start Redis and Phase 4

# Start Redis
docker compose -f docker/docker-compose.yml up redis -d

# Start Phase 2 + Phase 3 + Phase 4 APIs
# (the Phase 2 API also needs PostgreSQL running and migrated, as in the Phase 3 setup)
./scripts/start_replay_api.sh &
./scripts/start_state_api.sh &
./scripts/start_incident_api.sh &

2. Configure Gemini (Optional for Startup)

# Set your Gemini API key in .env
echo "GEMINI_API_KEY=your-key-here" >> .env

# Or export directly
export GEMINI_API_KEY=your-key-here

Note: The API starts without a Gemini key, but analysis endpoints will return configuration errors until one is set. The health endpoint reports gemini: unconfigured.

3. Start the Reasoning API

./scripts/start_reasoning_api.sh

# API docs available at http://localhost:8004/docs
# Health check at http://localhost:8004/health

4. Analyze an Incident

# Via the script (requires Phase 4 + Phase 5 APIs running)
./scripts/analyze_incident.sh mission_20260608_120000 inc_abc123

# Or via curl
curl -X POST http://localhost:8004/api/v1/reasoning/analyze/mission_20260608_120000/inc_abc123 \
  -H "Content-Type: application/json" \
  -d '{"overwrite": true}'

5. Query Analyses

# Get analysis for a specific incident
curl http://localhost:8004/api/v1/reasoning/mission_20260608_120000/inc_abc123

# List all analyses for a mission
curl http://localhost:8004/api/v1/reasoning/mission_20260608_120000

# Reuse existing analysis (no Gemini call)
curl -X POST http://localhost:8004/api/v1/reasoning/analyze/mission_20260608_120000/inc_abc123 \
  -H "Content-Type: application/json" \
  -d '{"overwrite": false}'
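The `overwrite` flag decides whether a stored analysis is reused or the provider is called again. The sketch below illustrates that decision with an in-memory dict standing in for the Redis store and a counting stub standing in for Gemini. `CountingProvider` and `analyze_incident` are hypothetical names for this example only.

```python
class CountingProvider:
    """Stub reasoning provider that records how often it is called."""

    def __init__(self):
        self.calls = 0

    def analyze(self, incident):
        self.calls += 1
        return {"root_cause": "unknown", "advisory_only": True}


def analyze_incident(store, provider, mission_id, incident_id, incident, overwrite=False):
    """Return (analysis, reused). Reuse the stored analysis unless overwrite is set."""
    key = (mission_id, incident_id)
    if not overwrite and key in store:
        return store[key], True  # no provider call
    result = provider.analyze(incident)
    store[key] = result
    return result, False
```

The first request for an incident always reaches the provider. Later requests with `overwrite` set to false are answered from the store.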

6. Run Tests

# Pure logic tests (no Redis or Gemini required)
PYTHONPATH=src .venv/bin/pytest tests/phase5/test_models.py tests/phase5/test_prompts.py tests/phase5/test_provider.py tests/phase5/test_client.py -v

# All tests including Redis integration (requires Redis running)
PYTHONPATH=src .venv/bin/pytest tests/phase5/ -v

Phase 6 -- Phoenix Integration

Phase 6 instruments the Phase 5 reasoning pipeline with OpenTelemetry tracing and exports spans to Arize Phoenix. It has no separate API; it hooks into Phase 5's service layer.

Configuration

Set the following in .env:

PHOENIX_ENABLED=true
PHOENIX_ENDPOINT=http://localhost:6006
PHOENIX_PROJECT_NAME=tars-reasoning
PHOENIX_CONTENT_MODE=full  # full | metadata | disabled
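As an illustration of how these variables might be read and validated, here is a small sketch. It is not the `PhoenixSettings` class in `phase6/config.py`. The `PhoenixEnv` name, the accepted boolean spellings, and the defaults used when a variable is missing are all assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping

VALID_CONTENT_MODES = ("full", "metadata", "disabled")


@dataclass(frozen=True)
class PhoenixEnv:
    enabled: bool
    endpoint: str
    project_name: str
    content_mode: str


def load_phoenix_env(env: Mapping[str, str] = os.environ) -> PhoenixEnv:
    """Read the Phoenix variables from a mapping and reject unknown content modes."""
    # Defaults below are assumptions for the sketch, not documented TARS defaults.
    mode = env.get("PHOENIX_CONTENT_MODE", "metadata").strip().lower()
    if mode not in VALID_CONTENT_MODES:
        raise ValueError(f"PHOENIX_CONTENT_MODE must be one of {VALID_CONTENT_MODES}")
    enabled = env.get("PHOENIX_ENABLED", "false").strip().lower() in ("1", "true", "yes")
    return PhoenixEnv(
        enabled=enabled,
        endpoint=env.get("PHOENIX_ENDPOINT", "http://localhost:6006"),
        project_name=env.get("PHOENIX_PROJECT_NAME", "tars-reasoning"),
        content_mode=mode,
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests without touching the process environment.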

Run Tests

# All Phase 6 tests (no Phoenix required -- tracing is mocked)
PYTHONPATH=src .venv/bin/pytest tests/phase6/ -v

Phase 7 -- Operational Memory

Phase 7 projects bounded facts from Phases 2, 4, and 5 into a Neo4j graph database. It connects missions → incidents → root causes → mitigations → outcomes and answers "Have we seen this before?" with provenance-preserving history queries.
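To show the shape of a "Have we seen this before?" query, the sketch below uses plain dictionaries as a stand-in for the Neo4j graph. It treats two incidents as similar when they share a root cause, which is an assumption made for this example. The class and method names are hypothetical and unrelated to `repository.py`.

```python
from collections import defaultdict


class MemoryGraph:
    """In-memory stand-in for the mission -> incident -> root cause -> mitigation -> outcome graph."""

    def __init__(self):
        self.mission_of = {}                  # incident_id -> mission_id
        self.root_causes = defaultdict(set)   # incident_id -> root causes
        self.mitigations = defaultdict(list)  # incident_id -> mitigation texts
        self.outcomes = defaultdict(list)     # incident_id -> outcome statuses

    def add_incident(self, mission_id, incident_id, root_cause):
        self.mission_of[incident_id] = mission_id
        self.root_causes[incident_id].add(root_cause)

    def add_mitigation(self, incident_id, text):
        self.mitigations[incident_id].append(text)

    def add_outcome(self, incident_id, status):
        self.outcomes[incident_id].append(status)

    def similar(self, incident_id, limit=10):
        """Return earlier incidents that share a root cause, with their history."""
        causes = self.root_causes.get(incident_id, set())
        results = []
        for other in sorted(self.root_causes):
            shared = causes & self.root_causes[other]
            if other == incident_id or not shared:
                continue
            results.append({
                "incident_id": other,
                "mission_id": self.mission_of[other],  # provenance
                "shared_root_causes": sorted(shared),
                "mitigations": list(self.mitigations.get(other, [])),
                "outcomes": list(self.outcomes.get(other, [])),
            })
        return results[:limit]
```

Each result keeps the mission it came from, so a recommendation can always be traced back to the history that supports it.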

1. Start Neo4j and Upstream APIs

# Start Neo4j (+ PostgreSQL and Redis for upstream phases)
docker compose -f docker/docker-compose.yml up neo4j postgres redis -d

# Start Phase 2 + Phase 3 + Phase 4 + Phase 5 APIs
./scripts/start_replay_api.sh &
./scripts/start_state_api.sh &
./scripts/start_incident_api.sh &
./scripts/start_reasoning_api.sh &

2. Configure Neo4j

# Set your Neo4j password in .env (must match docker-compose)
echo "NEO4J_PASSWORD=tars" >> .env

# Or export directly
export NEO4J_PASSWORD=tars

Note: The default Docker Compose configuration sets the Neo4j password to tars. Schema constraints and indexes are created automatically on API startup.

3. Start the Memory API

./scripts/start_memory_api.sh

# API docs available at http://localhost:8005/docs
# Health check at http://localhost:8005/health

4. Sync a Mission

# Via the script (requires upstream APIs running)
./scripts/sync_mission_memory.sh mission_20260608_120000

# Or via curl
curl -X POST http://localhost:8005/api/v1/memory/sync \
  -H "Content-Type: application/json" \
  -d '{"mission_id": "mission_20260608_120000"}'

# Check sync status
curl http://localhost:8005/api/v1/memory/sync/mission_20260608_120000

5. Query Operational Memory

# Get incident neighborhood (root causes, mitigations, outcomes)
curl http://localhost:8005/api/v1/memory/incidents/inc_abc123

# Find similar historical incidents
curl "http://localhost:8005/api/v1/memory/incidents/inc_abc123/similar?limit=10"

# Or via the script
./scripts/query_similar_incidents.sh inc_abc123

6. Record Mitigations and Outcomes

# Record an applied mitigation
curl -X POST http://localhost:8005/api/v1/memory/mitigations \
  -H "Content-Type: application/json" \
  -d '{
    "incident_id": "inc_abc123",
    "mitigation_text": "Switched to backup GPS receiver",
    "applied_by": "operator"
  }'

# Record an outcome
curl -X POST http://localhost:8005/api/v1/memory/outcomes \
  -H "Content-Type: application/json" \
  -d '{
    "incident_id": "inc_abc123",
    "status": "recovered",
    "description": "GPS signal restored after switching to backup receiver",
    "mitigation_application_id": "ma_xyz789"
  }'

7. Run Tests

# All Phase 7 tests (no Neo4j required -- all graph operations are mocked)
PYTHONPATH=src .venv/bin/pytest tests/phase7/ -v

# Individual test modules
PYTHONPATH=src .venv/bin/pytest tests/phase7/test_models.py tests/phase7/test_mapper.py -v
PYTHONPATH=src .venv/bin/pytest tests/phase7/test_clients.py tests/phase7/test_repository.py -v
PYTHONPATH=src .venv/bin/pytest tests/phase7/test_service.py tests/phase7/test_api.py -v

Phase 8 -- Phoenix MCP Self-Introspection

Phase 8 adds analysis-only trace introspection to the reasoning pipeline. The reasoning agent can inspect its own prior reasoning traces through Phoenix, using 3 read-only MCP tools. This is not an evaluation layer — all outputs carry not_an_evaluation=True and explicit limitation warnings.

Design Principles

  • Read-only: No trace creation, modification, or evaluation scores
  • Fail-open: Phoenix unavailability never blocks reasoning — returns empty context
  • Bounded: Max 10 traces per query, 2000-char summaries, 5s query timeout
  • Safe: Secret redaction on all summaries, no raw telemetry exposure
  • Opt-in: Disabled by default, requires both config flag and per-request flag
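Three of these principles (bounded, safe, fail-open) can be sketched in a few lines. The example below is illustrative only. The redaction pattern is a simplified assumption, `safe_summary` and `introspect` are hypothetical names, and the 5 s query timeout is not modelled here.

```python
import re

MAX_TRACES = 10
MAX_SUMMARY_CHARS = 2000

# Simplified assumption: redact the value in "name=value" or "name: value"
# pairs whose name looks like a credential.
SECRET_PATTERN = re.compile(r"(?i)(api[_-]?key|token|password|secret)(\s*[=:]\s*)(\S+)")


def safe_summary(text, max_chars=MAX_SUMMARY_CHARS):
    """Redact secret-looking values, then truncate to the summary limit."""
    redacted = SECRET_PATTERN.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)
    return redacted[:max_chars]


def introspect(fetch_traces, max_traces=MAX_TRACES):
    """Return bounded, redacted summaries. Any fetch failure yields an empty context."""
    try:
        traces = fetch_traces()
    except Exception:
        return []  # fail-open: the caller continues reasoning without introspection
    return [safe_summary(trace) for trace in traces[:max_traces]]
```

Redaction runs before truncation so that the length limit applies to the text that is actually returned.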

Configuration

# Enable in .env
PHOENIX_MCP_ENABLED=true
PHOENIX_MCP_CONTENT_MODE=summary          # metadata | summary | full_dev | disabled
PHOENIX_MCP_GRAPHQL_ENDPOINT=http://localhost:6006/graphql
PHOENIX_MCP_MAX_TRACES=10
PHOENIX_MCP_MAX_SUMMARY_CHARS=2000
PHOENIX_MCP_QUERY_TIMEOUT_S=5.0
PHOENIX_MCP_REDACT_SECRETS=true
PHOENIX_MCP_REQUIRE_REQUEST_FLAG=true     # Require use_introspection=true per request

MCP Tools

| Tool | Description |
|---|---|
| search_traces | Find prior reasoning traces by mission_id, incident_type, root_cause, outcome, time range |
| get_trace_summary | Get a safe summary of a specific trace (redacted, truncated, no raw content) |
| compare_traces | Compare 2–5 traces, producing descriptive observations (never evaluative scores) |

Using Introspection in Analysis

# Analyze with introspection enabled (requires PHOENIX_MCP_ENABLED=true)
curl -X POST http://localhost:8004/api/v1/reasoning/analyze/mission_001/inc_abc123 \
  -H "Content-Type: application/json" \
  -d '{"use_introspection": true}'

The response includes introspection metadata when traces are found:

{
  "introspection_used": true,
  "introspection_trace_ids": ["trace_abc", "trace_def"],
  "introspection_summary": "Consulted 2 prior traces for similar incidents"
}

Content Modes

| Mode | What's Included |
|---|---|
| disabled | Introspection completely off |
| metadata | Trace IDs, timestamps, incident types (no reasoning content) |
| summary | Metadata + redacted summaries and stage info |
| full_dev | Everything including raw content (development only) |
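The four modes nest: each one exposes a superset of the previous. A minimal sketch of that projection follows. The trace field names (`trace_id`, `timestamp`, `incident_type`, `summary`, `stages`, `raw_content`) and the `project_trace` function are assumptions for illustration, and the summary is assumed to be redacted already.

```python
METADATA_FIELDS = ("trace_id", "timestamp", "incident_type")


def project_trace(trace, mode):
    """Return the view of a trace that the given content mode allows, or None if disabled."""
    if mode == "disabled":
        return None
    if mode not in ("metadata", "summary", "full_dev"):
        raise ValueError(f"unknown content mode: {mode}")
    view = {key: trace[key] for key in METADATA_FIELDS if key in trace}
    if mode in ("summary", "full_dev"):
        view["summary"] = trace.get("summary", "")  # assumed already redacted
        view["stages"] = trace.get("stages", [])
    if mode == "full_dev":
        view["raw_content"] = trace.get("raw_content")  # development only
    return view
```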

Health Check

# Phoenix MCP status is included in the Phase 5 health endpoint
curl http://localhost:8004/health
# Response includes: "phoenix_mcp": "ok" | "disabled" | "unavailable"

Run Tests

# All Phase 8 tests (no Phoenix required -- uses FakePhoenixTraceClient)
PYTHONPATH=src .venv/bin/pytest tests/phase8/ -v

# Individual test modules
PYTHONPATH=src .venv/bin/pytest tests/phase8/test_config.py tests/phase8/test_models.py -v
PYTHONPATH=src .venv/bin/pytest tests/phase8/test_summarizer.py tests/phase8/test_phoenix_client.py -v
PYTHONPATH=src .venv/bin/pytest tests/phase8/test_mcp_tools.py tests/phase8/test_tool_policy.py -v
PYTHONPATH=src .venv/bin/pytest tests/phase8/test_reasoning_integration.py -v

Phase 9 -- Evaluation Layer

Phase 9 measures the quality of reasoning outputs against bounded ground-truth labels, mission outcomes, and incident facts. It produces durable, inspectable evaluation scores without changing operational behavior.

Design Principles

  • Analysis-only: Never calls flight-control APIs, invokes Gemini, or mutates upstream records
  • Bounded metrics: All scores are [0.0, 1.0], all results carry advisory_only=True
  • Explicit evidence: Missing ground truth produces insufficient_evidence, not invented scores
  • Fail-open: Phoenix and Phase 7 are optional; their unavailability does not fail evaluation
  • Deterministic: All scoring uses versioned aliases and families, no LLM calls during evaluation
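The sketch below illustrates how these principles could combine for a single metric, root-cause accuracy. The alias table and function names are invented for this example and do not reflect the versioned aliases in `evaluator.py`. The point is only the shape: a bounded score, an explicit insufficient_evidence result, and no model calls.

```python
# Hypothetical alias table mapping label variants onto one canonical name.
ROOT_CAUSE_ALIASES = {
    "gps_jamming": "gps_interference",
    "gps_interference": "gps_interference",
    "battery_sag": "battery_drain",
    "battery_drain": "battery_drain",
}


def normalize(label):
    """Lower-case a label, replace spaces with underscores, and resolve aliases."""
    key = label.strip().lower().replace(" ", "_")
    return ROOT_CAUSE_ALIASES.get(key, key)


def score_root_cause(predicted, ground_truth):
    """Score a predicted root cause against ground truth without inventing evidence."""
    if not ground_truth:
        return {"status": "insufficient_evidence", "score": None, "advisory_only": True}
    matched = bool(predicted) and normalize(predicted) == normalize(ground_truth)
    return {"status": "scored", "score": 1.0 if matched else 0.0, "advisory_only": True}
```

A missing label therefore never turns into a 0.0, which would be indistinguishable from a wrong answer.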

1. Start PostgreSQL and Run Migrations

docker compose -f docker/docker-compose.yml up postgres -d
PYTHONPATH=src .venv/bin/alembic upgrade head

2. Start the Evaluation API

./scripts/start_evaluation_api.sh

# API docs available at http://localhost:8006/docs
# Health check at http://localhost:8006/health

3. Create a Ground-Truth Label

curl -X POST http://localhost:8006/api/v1/evaluations/labels \
  -H "Content-Type: application/json" \
  -d '{
    "mission_id": "mission_20260618_120000",
    "incident_id": "inc_abc123",
    "root_cause": "gps_interference",
    "preferred_mitigation": "switch_to_visual_odometry",
    "outcome": "recovered",
    "source": "operator_label",
    "labeled_by": "operator"
  }'

4. Evaluate a Reasoning Result

# With inline ground truth
curl -X POST http://localhost:8006/api/v1/evaluations/evaluate \
  -H "Content-Type: application/json" \
  -d '{
    "mission_id": "mission_20260618_120000",
    "incident_id": "inc_abc123",
    "reasoning_id": "reason_abc123",
    "ground_truth": {
      "root_cause": "gps_interference",
      "preferred_mitigation": "switch_to_visual_odometry",
      "outcome": "recovered"
    }
  }'

# Using stored labels (no inline ground truth needed)
curl -X POST http://localhost:8006/api/v1/evaluations/evaluate \
  -H "Content-Type: application/json" \
  -d '{
    "mission_id": "mission_20260618_120000",
    "incident_id": "inc_abc123",
    "reasoning_id": "reason_abc123"
  }'
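
For scripted use, the two curl calls above can be mirrored with the Python standard library. This is a sketch only: the helper name `build_evaluate_request` is hypothetical, and the payload fields are copied from the curl examples rather than from the API schema.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8006/api/v1/evaluations"


def build_evaluate_request(
    mission_id: str,
    incident_id: str,
    reasoning_id: str,
    ground_truth: Optional[dict] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the evaluate endpoint."""
    payload = {
        "mission_id": mission_id,
        "incident_id": incident_id,
        "reasoning_id": reasoning_id,
    }
    if ground_truth is not None:
        # Inline ground truth is optional; omit it to rely on stored labels.
        payload["ground_truth"] = ground_truth
    return urllib.request.Request(
        f"{BASE_URL}/evaluate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the Evaluation API running, pass the returned object to `urllib.request.urlopen` to send it.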

5. Query Evaluations

# Get a specific evaluation
curl http://localhost:8006/api/v1/evaluations/eval_abc123

# Get all evaluations for a mission
curl http://localhost:8006/api/v1/evaluations/mission/mission_20260618_120000

# Get all evaluations for a reasoning result
curl http://localhost:8006/api/v1/evaluations/reasoning/reason_abc123

6. Batch Evaluate

curl -X POST http://localhost:8006/api/v1/evaluations/batch \
  -H "Content-Type: application/json" \
  -d '{
    "targets": [
      {"mission_id": "mission_001", "incident_id": "inc_001", "reasoning_id": "reason_001"},
      {"mission_id": "mission_001", "incident_id": "inc_002", "reasoning_id": "reason_002"}
    ]
  }'
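
The configuration section lists EVALUATION_BATCH_LIMIT with a default of 50 targets per batch request. How the API responds to an oversized batch is not documented here, so a client may prefer to split large target lists up front. A minimal sketch, with `chunk_targets` as a hypothetical helper name:

```python
from typing import Iterator, List


def chunk_targets(targets: List[dict], batch_limit: int = 50) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most batch_limit targets."""
    if batch_limit < 1:
        raise ValueError("batch_limit must be at least 1")
    for start in range(0, len(targets), batch_limit):
        yield targets[start:start + batch_limit]
```

Each yielded list can then be sent as the "targets" array of one batch request.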

7. Run Tests

# All Phase 9 tests (no PostgreSQL, Phoenix, Gemini, Neo4j, or PX4 required)
PYTHONPATH=src .venv/bin/pytest tests/phase9/ -v

# Individual test modules
PYTHONPATH=src .venv/bin/pytest tests/phase9/test_config.py tests/phase9/test_models.py -v
PYTHONPATH=src .venv/bin/pytest tests/phase9/test_evaluator.py tests/phase9/test_ground_truth.py -v
PYTHONPATH=src .venv/bin/pytest tests/phase9/test_service.py tests/phase9/test_api.py -v

Phase 10 -- Learning Engine

Phase 10 mines Phase 9 evaluation records, Phase 7 operational memory, and safe trace metadata for repeated patterns. It produces candidate knowledge with evidence, confidence, and provenance. Candidate knowledge is NOT truth — it is input to Phase 11 validation.

Design Principles

  • Deterministic only — no Gemini, no LLM calls
  • advisory_only=True on every candidate, always
  • Cautious language — "is associated with", never "causes" or "fixes"
  • No validated status — candidates are proposed, superseded, retired, or rejected
  • No flight-control impact — candidates never change recommendations
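
These constraints can be pictured as a small data model. The sketch below is illustrative only: the class names, field names, and the word-level language check are assumptions, not the project's actual models. Only the four status values and the fixed advisory_only flag come from the principles above.

```python
import re
from dataclasses import dataclass, field
from enum import Enum


class CandidateStatus(str, Enum):
    # Deliberately no "validated" member: validation belongs to Phase 11.
    PROPOSED = "proposed"
    SUPERSEDED = "superseded"
    RETIRED = "retired"
    REJECTED = "rejected"


# Hypothetical word list for the cautious-language rule.
FORBIDDEN_WORDS = {"causes", "fixes"}


@dataclass(frozen=True)
class Candidate:
    candidate_id: str
    statement: str
    status: CandidateStatus = CandidateStatus.PROPOSED
    # Not settable through the constructor, so it is always True.
    advisory_only: bool = field(default=True, init=False)

    def __post_init__(self) -> None:
        words = set(re.findall(r"[a-z]+", self.statement.lower()))
        used = words & FORBIDDEN_WORDS
        if used:
            raise ValueError(f"causal language not allowed: {sorted(used)}")
```

A statement such as "GPS noise is associated with position drift" is accepted, while "GPS noise causes position drift" is rejected at construction time.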

1. Start PostgreSQL and Run Migrations

docker compose -f docker/docker-compose.yml up postgres -d
PYTHONPATH=src .venv/bin/alembic upgrade head

2. Start the Learning API

./scripts/start_learning_api.sh
# Or manually:
PYTHONPATH=src .venv/bin/uvicorn tars.phase10.api:app --host 0.0.0.0 --port 8007

3. Trigger a Learning Run

# Full learning run (mines all candidate types)
curl -X POST http://localhost:8007/api/v1/learning/runs \
  -H "Content-Type: application/json" \
  -d '{}'

# Dry run (no persistence, returns candidates without saving)
curl -X POST http://localhost:8007/api/v1/learning/runs \
  -H "Content-Type: application/json" \
  -d '{"dry_run": true}'

# Filter by mission
curl -X POST http://localhost:8007/api/v1/learning/runs \
  -H "Content-Type: application/json" \
  -d '{"mission_ids": ["mission_001", "mission_002"]}'

# Filter by candidate type
curl -X POST http://localhost:8007/api/v1/learning/runs \
  -H "Content-Type: application/json" \
  -d '{"candidate_types": ["mitigation_effectiveness", "root_cause_pattern"]}'

4. Query Learning Runs

# Get a specific learning run
curl http://localhost:8007/api/v1/learning/runs/{run_id}

5. Browse Candidates

# List all proposed candidates
curl "http://localhost:8007/api/v1/learning/candidates?status=proposed"

# Filter by type
curl "http://localhost:8007/api/v1/learning/candidates?candidate_type=mitigation_effectiveness"

# Paginate
curl "http://localhost:8007/api/v1/learning/candidates?limit=10&offset=20"

# Get a specific candidate
curl http://localhost:8007/api/v1/learning/candidates/{candidate_id}

# Get evidence for a candidate
curl http://localhost:8007/api/v1/learning/candidates/{candidate_id}/evidence
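
When the filters are combined programmatically, building the query string with `urllib.parse.urlencode` avoids quoting mistakes. The helper name `candidates_url` is hypothetical. The query parameter names are taken from the curl examples above, and combining them in one request is assumed to be supported.

```python
from typing import Optional
from urllib.parse import urlencode

CANDIDATES_URL = "http://localhost:8007/api/v1/learning/candidates"


def candidates_url(
    status: Optional[str] = None,
    candidate_type: Optional[str] = None,
    limit: Optional[int] = None,
    offset: Optional[int] = None,
) -> str:
    """Return the list URL with only the filters that were provided."""
    params = {
        "status": status,
        "candidate_type": candidate_type,
        "limit": limit,
        "offset": offset,
    }
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{CANDIDATES_URL}?{query}" if query else CANDIDATES_URL
```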

6. Retire a Candidate

curl -X POST http://localhost:8007/api/v1/learning/candidates/{candidate_id}/retire \
  -H "Content-Type: application/json" \
  -d '{"reason": "Superseded by newer analysis"}'

7. Run Tests

# All Phase 10 tests (no PostgreSQL, Phoenix, Gemini, Neo4j, or PX4 required)
PYTHONPATH=src .venv/bin/pytest tests/phase10/ -v

# Individual test modules
PYTHONPATH=src .venv/bin/pytest tests/phase10/test_config.py tests/phase10/test_models.py -v
PYTHONPATH=src .venv/bin/pytest tests/phase10/test_evidence_loader.py tests/phase10/test_pattern_miner.py -v
PYTHONPATH=src .venv/bin/pytest tests/phase10/test_scorer.py tests/phase10/test_repository.py -v
PYTHONPATH=src .venv/bin/pytest tests/phase10/test_service.py tests/phase10/test_api.py -v

Telemetry Output Format

Each mission produces a JSON file in output/:

{
  "mission_id": "mission_001",
  "drone_id": "tars-sim-01",
  "start_time": "2024-01-15T10:30:00Z",
  "end_time": "2024-01-15T10:35:42Z",
  "faults_injected": [],
  "telemetry": [
    {
      "timestamp": "2024-01-15T10:30:01Z",
      "position": {
        "latitude_deg": 47.3977,
        "longitude_deg": 8.5456,
        "absolute_altitude_m": 488.5,
        "relative_altitude_m": 22.3
      },
      "battery": {
        "voltage_v": 11.8,
        "remaining_percent": 87.0
      },
      "gps": {
        "num_satellites": 12,
        "fix_type": "FIX_3D"
      },
      "attitude": {
        "roll_deg": 2.1,
        "pitch_deg": -1.3,
        "yaw_deg": 145.7
      },
      "flight_mode": "MISSION",
      "health": {
        "is_gyrometer_calibration_ok": true,
        "is_accelerometer_calibration_ok": true,
        "is_magnetometer_calibration_ok": true,
        "is_home_position_ok": true,
        "is_global_position_ok": true
      }
    }
  ],
  "mission_result": "SUCCESS",
  "summary": {
    "total_snapshots": 342,
    "duration_seconds": 342.0,
    "max_altitude_m": 22.5,
    "distance_traveled_m": 450.2,
    "min_battery_percent": 81.2,
    "max_speed_m_s": 5.2,
    "collection_rate_hz": 1.0
  }
}
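
As a sanity check on a mission file, some of the summary fields can be recomputed from the raw snapshots. The sketch below assumes that max_altitude_m is derived from relative_altitude_m and that duration_seconds is the difference between end_time and start_time. Neither derivation is stated in this README, and the function names are hypothetical.

```python
from datetime import datetime


def parse_utc(timestamp: str) -> datetime:
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def summarize(mission: dict) -> dict:
    """Recompute a few summary fields from a parsed mission JSON document."""
    snapshots = mission["telemetry"]
    duration = parse_utc(mission["end_time"]) - parse_utc(mission["start_time"])
    return {
        "total_snapshots": len(snapshots),
        "duration_seconds": duration.total_seconds(),
        # default=None keeps an empty telemetry list from raising ValueError.
        "max_altitude_m": max(
            (s["position"]["relative_altitude_m"] for s in snapshots), default=None
        ),
        "min_battery_percent": min(
            (s["battery"]["remaining_percent"] for s in snapshots), default=None
        ),
    }
```

To run it against a real file, parse the file with `json.load` and pass the resulting dictionary to `summarize`.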

Fault Injection

Interactive Mode

PYTHONPATH=src .venv/bin/python3 -m tars.phase1.fault_injector

Available commands:

| Command | Fault | Effect |
| --- | --- | --- |
| 1 | GPS Block | Complete GPS signal loss |
| 2 | GPS Noise | Noisy/jumpy GPS readings |
| 3 | Battery Drain | Accelerated battery discharge |
| 4 | Baro Offset | Incorrect altitude readings |
| 5 | Mag Offset | Corrupted compass heading |
| 6 | Wind | 8 m/s wind from north with moderate turbulence |
| 7 | Restore All | Remove all injected faults |

Pre-built Scenarios

| Scenario | Inspired By | What Happens |
| --- | --- | --- |
| s1 -- GPS Degradation | NASA Ingenuity | Progressive GPS noise -> block |
| s2 -- Altitude Confusion | Amazon MK30 | Conflicting altitude sensors |
| s3 -- Sensor Cascade | Bell 525 | Multiple sensors fail simultaneously |
| s4 -- Wind Shear | Drone delivery incidents | Progressive crosswind -> severe gust |

Configuration

Edit .env or set environment variables:

Phase 1

Variable Default Description
PX4_CONNECTION udp://:14540 MAVSDK connection string
TELEMETRY_RATE_HZ 1 Snapshots per second
DRONE_ID tars-sim-01 Drone identifier
OUTPUT_DIR output Telemetry output directory
MISSION_ID auto-generated Mission identifier
FAULT_SCENARIO (none) Run a fault scenario during the mission (s1-s4)
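
A sketch of how these variables might be read with their documented defaults. The `Phase1Config` class and `load_phase1_config` function are hypothetical and do not describe the project's actual configuration module. MISSION_ID and FAULT_SCENARIO are left out because their defaults are generated or empty.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Phase1Config:
    px4_connection: str
    telemetry_rate_hz: float
    drone_id: str
    output_dir: str


def load_phase1_config(env: Optional[Mapping[str, str]] = None) -> Phase1Config:
    """Read Phase 1 settings, falling back to the defaults in the table above."""
    env = os.environ if env is None else env
    return Phase1Config(
        px4_connection=env.get("PX4_CONNECTION", "udp://:14540"),
        telemetry_rate_hz=float(env.get("TELEMETRY_RATE_HZ", "1")),
        drone_id=env.get("DRONE_ID", "tars-sim-01"),
        output_dir=env.get("OUTPUT_DIR", "output"),
    )
```

Passing a plain dictionary instead of the process environment keeps the loader easy to test.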

Phase 2

Variable Default Description
DATABASE_URL postgresql+asyncpg://tars:tars@localhost:5432/tars PostgreSQL connection string
API_HOST 0.0.0.0 FastAPI server host
API_PORT 8000 FastAPI server port

Phase 3

Variable Default Description
REDIS_URL redis://localhost:6379/0 Redis connection string
PHASE2_API_URL http://localhost:8000 Phase 2 Replay API base URL
STATE_API_HOST 0.0.0.0 State API server host
STATE_API_PORT 8002 State API server port

Phase 4

Variable Default Description
PHASE3_API_URL http://localhost:8002 Phase 3 State API base URL
INCIDENT_API_HOST 0.0.0.0 Incident API server host
INCIDENT_API_PORT 8003 Incident API server port
INCIDENT_MAX_GAP_MS 5000 Max gap between matches to merge
INCIDENT_MIN_STATES 3 Min states for persistence threshold
INCIDENT_HIGH_RISK 0.8 Risk threshold for immediate incident
INCIDENT_ELEVATED_RISK 0.6 Risk threshold for elevated detection
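
To show how INCIDENT_MAX_GAP_MS and INCIDENT_MIN_STATES could interact, here is a minimal sketch of gap-based merging with a persistence threshold. It is an illustration under stated assumptions, not the Incident Engine's code: the function name is hypothetical, the input is a bare list of matching-state timestamps in milliseconds, and the risk thresholds are not modeled.

```python
from typing import List, Tuple


def collapse_matches(
    timestamps_ms: List[int],
    max_gap_ms: int = 5000,
    min_states: int = 3,
) -> List[Tuple[int, int, int]]:
    """Group rule matches into (start_ms, end_ms, state_count) incidents."""
    incidents = []
    group: List[int] = []

    def flush() -> None:
        # Keep a group only if it persisted for enough states.
        if len(group) >= min_states:
            incidents.append((group[0], group[-1], len(group)))

    for ts in sorted(timestamps_ms):
        if group and ts - group[-1] > max_gap_ms:
            # Gap too large to merge: close the current group and start a new one.
            flush()
            group = []
        group.append(ts)
    flush()
    return incidents
```

With the defaults, matches at 0, 1000, and 2000 ms form one incident, while an isolated pair at 9000 and 10000 ms is dropped for falling below the three-state threshold.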

Phase 5

Variable Default Description
PHASE4_API_URL http://localhost:8003 Phase 4 Incident API base URL
REASONING_API_HOST 0.0.0.0 Reasoning API server host
REASONING_API_PORT 8004 Reasoning API server port
INCIDENT_CLIENT_TIMEOUT 30.0 HTTP client timeout for Phase 4 calls
GEMINI_API_KEY (empty) Gemini API key (required for live reasoning)
GEMINI_MODEL gemini-2.5-flash Gemini model identifier
GEMINI_TEMPERATURE 0.1 Gemini temperature (low for stable reasoning)

Phase 6

Variable Default Description
PHOENIX_ENABLED true Enable/disable Phoenix tracing
PHOENIX_ENDPOINT http://localhost:6006 Phoenix OTLP endpoint
PHOENIX_PROJECT_NAME tars-reasoning Phoenix project name
PHOENIX_CONTENT_MODE full Content capture: full, metadata, disabled
PHOENIX_EXPORT_TIMEOUT_SECONDS 5 OTLP export timeout
PHOENIX_BATCH_EXPORT true Use batch span processor

Phase 7

Variable Default Description
NEO4J_URI bolt://localhost:7687 Neo4j Bolt connection URI
NEO4J_USER neo4j Neo4j username
NEO4J_PASSWORD (empty) Neo4j password
NEO4J_DATABASE neo4j Neo4j database name
MEMORY_API_HOST 0.0.0.0 Memory API server host
MEMORY_API_PORT 8005 Memory API server port
PHASE2_API_URL http://localhost:8000 Phase 2 Replay API base URL
PHASE4_API_URL http://localhost:8003 Phase 4 Incident API base URL
PHASE5_API_URL http://localhost:8004 Phase 5 Reasoning API base URL
MEMORY_CLIENT_TIMEOUT 30.0 HTTP client timeout for upstream calls
MEMORY_QUERY_DEFAULT_LIMIT 20 Default result limit for queries
MEMORY_QUERY_MAX_LIMIT 100 Maximum result limit for queries

Phase 8

Variable Default Description
PHOENIX_MCP_ENABLED false Enable Phoenix MCP self-introspection
PHOENIX_MCP_CONTENT_MODE summary Content capture: metadata, summary, full_dev, disabled
PHOENIX_MCP_GRAPHQL_ENDPOINT http://localhost:6006/graphql Phoenix GraphQL endpoint
PHOENIX_MCP_MAX_TRACES 10 Maximum traces per query
PHOENIX_MCP_MAX_SUMMARY_CHARS 2000 Maximum summary length
PHOENIX_MCP_QUERY_TIMEOUT_S 5.0 Query timeout in seconds
PHOENIX_MCP_CACHE_TTL_S 300 Cache TTL in seconds
PHOENIX_MCP_REDACT_SECRETS true Redact secrets from summaries
PHOENIX_MCP_ALLOWED_TOOLS search_traces,get_trace_summary,compare_traces Allowed MCP tools
PHOENIX_MCP_REQUIRE_REQUEST_FLAG true Require per-request use_introspection flag

Phase 9

Variable Default Description
EVALUATION_ENABLED true Enable the Phase 9 evaluation service
EVALUATION_DATABASE_URL postgresql+asyncpg://tars:tars@localhost:5432/tars PostgreSQL connection string
EVALUATION_VERSION v1.0 Evaluator version stamped on results
EVALUATION_API_HOST 0.0.0.0 Evaluation API server host
EVALUATION_API_PORT 8006 Evaluation API server port
EVALUATION_BATCH_LIMIT 50 Maximum targets per batch request
EVALUATION_CONSISTENCY_MIN_CASES 3 Minimum cases for consistency scoring
EVALUATION_SIMILARITY_LIMIT 20 Maximum similar cases to compare
EVALUATION_EXPORT_PHOENIX false Export eval scores to Phoenix
EVALUATION_REQUIRE_OPERATOR_LABEL false Require explicit operator labels
EVALUATION_ROOT_CAUSE_WEIGHT 0.40 Overall score root-cause weight
EVALUATION_RECOMMENDATION_WEIGHT 0.35 Overall score recommendation weight
EVALUATION_CONSISTENCY_WEIGHT 0.15 Overall score consistency weight
EVALUATION_FALSE_POSITIVE_WEIGHT 0.05 Overall score false-positive penalty
EVALUATION_FALSE_NEGATIVE_WEIGHT 0.05 Overall score false-negative penalty
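
The five default weights sum to 1.0. This README does not give the formula that combines them, so the sketch below shows one plausible reading only: a weighted sum in which the two penalty terms enter as (1 - rate). The function name, argument names, and that penalty treatment are all assumptions.

```python
DEFAULT_WEIGHTS = {
    "root_cause": 0.40,
    "recommendation": 0.35,
    "consistency": 0.15,
    "false_positive": 0.05,
    "false_negative": 0.05,
}


def overall_score(
    root_cause: float,
    recommendation: float,
    consistency: float,
    false_positive: float,
    false_negative: float,
    weights: dict = DEFAULT_WEIGHTS,
) -> float:
    """Hypothetical weighted combination; every input is expected in [0.0, 1.0]."""
    components = {
        "root_cause": root_cause,
        "recommendation": recommendation,
        "consistency": consistency,
        # Assumption: penalties enter as (1 - rate), so zero errors earn full credit.
        "false_positive": 1.0 - false_positive,
        "false_negative": 1.0 - false_negative,
    }
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} component out of range: {value}")
    total = sum(weights[name] * value for name, value in components.items())
    # Clamp to guard against floating-point drift just outside [0.0, 1.0].
    return min(1.0, max(0.0, total))
```

Under this reading, a result with a correct root cause, a half-credit recommendation, full consistency, and no false positives or negatives scores about 0.825.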

Phase 10

Variable Default Description
LEARNING_ENABLED true Enable the Phase 10 learning service
LEARNING_DATABASE_URL postgresql+asyncpg://tars:tars@localhost:5432/tars PostgreSQL connection string
LEARNING_VERSION v1.0 Learning engine version stamped on candidates
LEARNING_API_HOST 0.0.0.0 Learning API server host
LEARNING_API_PORT 8007 Learning API server port
LEARNING_MIN_EVALUATED_CASES 5 Minimum evaluated cases to form a pattern
LEARNING_MIN_DISTINCT_MISSIONS 3 Minimum distinct missions for diversity
LEARNING_MIN_CONFIDENCE 0.60 Minimum confidence to emit a candidate
LEARNING_MIN_SUCCESS_RATE 0.70 Minimum success rate for mitigation patterns
LEARNING_MAX_FALSE_POSITIVE_RATE 0.20 Maximum false-positive rate allowed
LEARNING_SCORING_SUPPORT_WEIGHT 0.35 Confidence score support weight
LEARNING_SCORING_OUTCOME_WEIGHT 0.25 Confidence score outcome weight
LEARNING_SCORING_EVALUATION_WEIGHT 0.20 Confidence score evaluation weight
LEARNING_SCORING_DIVERSITY_WEIGHT 0.10 Confidence score diversity weight
LEARNING_SCORING_CONTRADICTION_WEIGHT 0.10 Confidence score contradiction penalty weight
PHASE9_API_URL http://localhost:8006 Phase 9 API base URL
PHASE7_API_URL http://localhost:8005 Phase 7 API base URL
LEARNING_TRACE_METADATA_ENABLED false Enable Phoenix trace metadata enrichment
PHOENIX_BASE_URL http://localhost:6006 Phoenix API base URL
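
The five LEARNING_MIN_* and LEARNING_MAX_* settings read as gates a pattern must clear before it is emitted as a candidate. The sketch below assumes all gates must pass together and that the comparisons are inclusive. Both points are assumptions, as are the class and function names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LearningThresholds:
    # Defaults copied from the Phase 10 table above.
    min_evaluated_cases: int = 5
    min_distinct_missions: int = 3
    min_confidence: float = 0.60
    min_success_rate: float = 0.70
    max_false_positive_rate: float = 0.20


def passes_thresholds(
    evaluated_cases: int,
    distinct_missions: int,
    confidence: float,
    success_rate: float,
    false_positive_rate: float,
    thresholds: LearningThresholds = LearningThresholds(),
) -> bool:
    """Return True only when every gate is satisfied (assumed AND semantics)."""
    return (
        evaluated_cases >= thresholds.min_evaluated_cases
        and distinct_missions >= thresholds.min_distinct_missions
        and confidence >= thresholds.min_confidence
        and success_rate >= thresholds.min_success_rate
        and false_positive_rate <= thresholds.max_false_positive_rate
    )
```

For example, a pattern seen in 6 evaluated cases across only 2 missions would not be emitted under these defaults, whatever its confidence.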

Hardware Notes

Developed and tested on:

  • CPU: Intel i3-6100U (2C/4T @ 2.3GHz)
  • RAM: 8GB
  • GPU: Intel HD 520 (integrated)
  • OS: Pop!_OS 22.04

Gazebo runs in headless mode (no 3D rendering) to fit within RAM constraints. Use QGroundControl on the host for visual drone tracking on a 2D map.


Roadmap

Phase Name Status
1 Mission Foundation (PX4 + Gazebo + MAVSDK) ✅ Done
2 Mission Replay System (FastAPI + PostgreSQL) ✅ Done
3 State Engine (Python + Redis) ✅ Done
4 Incident Engine (Rules + Statistical Detection) ✅ Done
5 Gemini Reasoning Layer (Google ADK) ✅ Done
6 Phoenix Integration (OpenInference Tracing) ✅ Done
7 Neo4j Operational Memory ✅ Done
8 Phoenix MCP (Self-Introspection) ✅ Done
9 Evaluation Layer (Reasoning Quality Metrics) ✅ Done
10 Learning Engine ✅ Current
11 Knowledge Validation Planned
12 Adaptive Recommendation Engine Planned

License

MIT
