
glitchymagic/ai-thinking-daemon

ai-thinking-daemon

An autonomous AI thinking agent that runs 24/7. It maintains a priority-based stimulus queue, generates thoughts via LLM endpoints, filters repetitive output through anti-rumination detection, rotates across multiple model servers, and persists a stream of thoughts with quality scoring.

Why This Exists

Most AI systems only think when asked. This daemon thinks continuously -- processing events, generating observations, making predictions, and building a persistent stream of scored thoughts. It is designed for systems that benefit from ambient intelligence: a background process that watches, thinks, and surfaces insights without being prompted.

The anti-rumination system is the key innovation. Without it, an autonomous LLM tends to loop on the same observations. The daemon uses topic signatures, concept overlap detection, and pattern matching to suppress repetitive thoughts and force novelty in each thinking cycle.

Architecture

                     +-------------------+
                     |  Context Sources  |
                     |  (pluggable)      |
                     +--------+----------+
                              |
                              v
                     +-------------------+
                     |  Event Poller     |
                     |  (60s interval)   |
                     +--------+----------+
                              |
                    creates Stimulus objects
                              |
                              v
+----------+        +-------------------+        +-------------------+
| Timer    |------->|  Priority Queue   |------->|  LLM Client       |
| Prompts  |        |  P1=reactive      |        |  (multi-endpoint   |
| (5min,   |        |  P2=scheduled     |        |   rotation with    |
|  4hr)    |        |  P3=deliberative  |        |   health checks)   |
+----------+        |  P4=deep          |        +--------+----------+
                    +-------------------+                 |
                                                          v
                                                +-------------------+
                                                | Anti-Rumination   |
                                                | Filter            |
                                                | - topic sigs      |
                                                | - concept overlap |
                                                | - pattern detect  |
                                                +--------+----------+
                                                         |
                                              pass / discard
                                                         |
                                                         v
                                                +-------------------+
                                                | Thought Stream    |
                                                | (JSONL + scoring) |
                                                +-------------------+
                                                         |
                                                         v
                                                +-------------------+
                                                | Briefing Writer   |
                                                | (30min summaries) |
                                                +-------------------+

Features

  • Priority Queue: Stimuli are prioritized (P1 reactive > P2 scheduled > P3 deliberative > P4 deep). Reactive events (new data, alerts) preempt scheduled thinking.
  • Multi-Endpoint LLM Client: Supports both OpenAI-compatible and Ollama endpoints. Auto-discovers models, rotates on failure, cools down unhealthy endpoints.
  • Anti-Rumination Filter: Three-layer deduplication:
    1. Topic signatures: Extracts key concepts, identifiers, and named entities from each thought
    2. Concept overlap: Computes a Jaccard-style set overlap, normalized by the smaller topic set, between each new thought and recent thoughts (>55% = duplicate)
    3. Pattern detection: Catches self-narration patterns ("You are thinking about...")
  • Quality Scoring: Each thought gets an importance score (0-10) and a self-assessed quality score (1-5) based on novelty, length, and content type.
  • Thought Stream: Append-only JSONL file with full metadata (timestamp, model, latency, scores, stimulus info). Weekly rotation.
  • Rolling Briefings: Every 30 minutes, the daemon summarizes its recent thoughts into a briefing document for other systems to consume.
  • Deliberative Prompts: Rotating set of thinking prompts that force diverse cognitive modes (analytical, contrarian, forward-looking, meta-cognitive, creative).
  • Self-Healing: Watches its own thought output rate. If no thought is produced for 30 minutes, resets HTTP clients and re-probes all endpoints.
  • Memory Leak Protection: Monitors RSS and self-restarts if memory exceeds threshold.
  • HTTP API: Local REST API for health checks, state inspection, pause/resume, and injecting session context.
  • Pause Windows: Configurable time windows where the daemon pauses (e.g., during scheduled reflection jobs).
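
The priority ordering described in the Priority Queue bullet can be sketched with Python's `heapq`. This is a minimal illustration, not the daemon's actual implementation: the `Stimulus` fields follow the ones used in the extension example in this README, while the `StimulusQueue` class name and the `seq` tie-breaker (insertion order within the same priority) are assumptions.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class Stimulus:
    priority: int                        # 1=reactive, 2=scheduled, 3=deliberative, 4=deep
    seq: int = field(default=0)          # hypothetical tie-breaker: insertion order
    layer: str = field(default="", compare=False)
    trigger: str = field(default="", compare=False)
    description: str = field(default="", compare=False)


class StimulusQueue:
    """Hypothetical min-heap wrapper: the lowest priority number pops first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def push(self, stimulus):
        # Stamp the insertion order so equal priorities stay first-in, first-out.
        stimulus.seq = next(self._counter)
        heapq.heappush(self._heap, stimulus)

    def pop(self):
        # Return None instead of raising when the queue is empty.
        return heapq.heappop(self._heap) if self._heap else None


queue = StimulusQueue()
queue.push(Stimulus(priority=3, layer="deliberative", trigger="timer", description="hourly review"))
queue.push(Stimulus(priority=1, layer="reactive", trigger="alert", description="error spike"))
queue.push(Stimulus(priority=1, layer="reactive", trigger="alert", description="second alert"))
order = [queue.pop().description for _ in range(3)]
print(order)
```

Both reactive stimuli come out before the deliberative one, in the order they were pushed.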

Quick Start

# Install dependencies
pip install requests

# Configure at least one LLM endpoint
export THINKING_DAEMON_LLM_URL="http://localhost:8800/v1/chat/completions"
export THINKING_DAEMON_MODEL="your-model-name"

# Create a system prompt
cat > system_prompt.md << 'EOF'
You are a thinking daemon. You observe data, make predictions, and surface insights.
When you have an important observation, prefix it with SAVE: to persist it.
When you make a prediction, prefix it with PREDICT: to track it.
Keep thoughts concise (20-80 words). Be specific. Reference real data.
EOF

# Run
python3 thinking_daemon.py --system-prompt system_prompt.md

Configuration

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| THINKING_DAEMON_LLM_URL | http://localhost:8800/v1/chat/completions | Primary OpenAI-compatible endpoint |
| THINKING_DAEMON_MODEL | (auto-discover) | Model name for primary endpoint |
| THINKING_DAEMON_OLLAMA_URLS | http://localhost:11434 | Comma-separated Ollama URLs |
| THINKING_DAEMON_MLX_URLS | (same as primary) | Comma-separated MLX server URLs |
| THINKING_DAEMON_OLLAMA_MODEL_PREFERENCES | qwen2.5:32b,llama3:70b | Comma-separated model preferences for Ollama |
| THINKING_DAEMON_PORT | 8768 | HTTP API port |
| THINKING_DAEMON_STREAM_FILE | ./thought_stream.jsonl | Path to thought stream output |
| THINKING_DAEMON_STATE_FILE | ./daemon_state.json | Path to persistent state |
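
As a rough sketch of how these variables and their defaults might be read, the snippet below parses them into a plain dictionary. The variable names and defaults come from the list above; the `load_config` and `split_csv` helpers and the dictionary keys are hypothetical and are not taken from the daemon's source.

```python
import os


def split_csv(value):
    """Split a comma-separated string, dropping blanks and surrounding spaces."""
    return [item.strip() for item in value.split(",") if item.strip()]


def load_config(env=None):
    """Hypothetical loader: read the documented variables with their defaults."""
    env = os.environ if env is None else env
    primary = env.get("THINKING_DAEMON_LLM_URL", "http://localhost:8800/v1/chat/completions")
    return {
        "llm_url": primary,
        # None stands for "auto-discover the model" in this sketch.
        "model": env.get("THINKING_DAEMON_MODEL") or None,
        "ollama_urls": split_csv(env.get("THINKING_DAEMON_OLLAMA_URLS", "http://localhost:11434")),
        # The documented default for MLX URLs is "same as primary".
        "mlx_urls": split_csv(env.get("THINKING_DAEMON_MLX_URLS", primary)),
        "ollama_model_preferences": split_csv(
            env.get("THINKING_DAEMON_OLLAMA_MODEL_PREFERENCES", "qwen2.5:32b,llama3:70b")
        ),
        "port": int(env.get("THINKING_DAEMON_PORT", "8768")),
        "stream_file": env.get("THINKING_DAEMON_STREAM_FILE", "./thought_stream.jsonl"),
        "state_file": env.get("THINKING_DAEMON_STATE_FILE", "./daemon_state.json"),
    }


config = load_config({"THINKING_DAEMON_OLLAMA_URLS": "http://a:11434, http://b:11434"})
print(config["ollama_urls"], config["port"])
```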

Timing Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| Poll interval | 60s | How often to check context sources for new events |
| Think interval | 5 min | How often to generate deliberative thoughts |
| Deep interval | 4 hours | How often to run deep analysis prompts |
| Briefing interval | 30 min | How often to write rolling briefings |
| LLM timeout | 120s | Timeout for LLM calls |
| Endpoint cooldown | 60s | How long to skip a failed endpoint |
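
To illustrate how the endpoint cooldown could interact with rotation, here is a minimal sketch. It assumes round-robin selection and an injectable clock for testing; the `EndpointRotator` class and its methods are hypothetical, and the daemon's real client may choose endpoints differently.

```python
import time


class EndpointRotator:
    """Hypothetical round-robin rotator that skips endpoints still cooling down."""

    def __init__(self, urls, cooldown_s=60.0, clock=time.monotonic):
        self._urls = list(urls)
        self._cooldown_s = cooldown_s
        self._clock = clock
        self._failed_at = {}   # url -> time of the most recent failure
        self._next = 0         # index to try first on the next pick

    def mark_failed(self, url):
        self._failed_at[url] = self._clock()

    def is_healthy(self, url):
        failed = self._failed_at.get(url)
        return failed is None or self._clock() - failed >= self._cooldown_s

    def pick(self):
        """Return the next healthy URL, or None if every endpoint is cooling down."""
        for offset in range(len(self._urls)):
            idx = (self._next + offset) % len(self._urls)
            url = self._urls[idx]
            if self.is_healthy(url):
                self._next = (idx + 1) % len(self._urls)
                return url
        return None


now = [0.0]  # fake clock so the example does not need to sleep
rotator = EndpointRotator(["http://a", "http://b"], cooldown_s=60.0, clock=lambda: now[0])
first = rotator.pick()            # round-robin starts at the first URL
rotator.mark_failed("http://b")
second = rotator.pick()           # "http://b" is cooling down, so it is skipped
now[0] = 61.0
third = rotator.pick()            # cooldown elapsed, "http://b" is eligible again
print(first, second, third)
```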

Anti-Rumination: How It Works

The biggest challenge with autonomous LLM thinking is rumination -- the model saying the same thing over and over in slightly different words.

Topic Signatures

Each thought is decomposed into a set of topic tokens:

  • Uppercase identifiers (2-5 letters, e.g., stock tickers, acronyms)
  • Domain concept phrases (configurable list)
  • Named entities extracted via regex
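
A signature extractor along these lines might look like the sketch below. The 2-5 letter uppercase rule comes from the list above; the concept phrase list, the named-entity regex (two or more consecutive capitalized words), and the `topic_signature` function name are illustrative assumptions.

```python
import re

# Uppercase identifiers of 2-5 letters, e.g. tickers and acronyms.
UPPER_ID = re.compile(r"\b[A-Z]{2,5}\b")

# Hypothetical domain phrases; the real list is configurable.
CONCEPT_PHRASES = ("error rate", "memory leak", "price action")

# Assumption: treat runs of two or more capitalized words as named entities.
NAMED_ENTITY = re.compile(r"\b[A-Z][a-z]+(?:\s[A-Z][a-z]+)+\b")


def topic_signature(thought, concept_phrases=CONCEPT_PHRASES):
    """Return the set of topic tokens found in a thought."""
    topics = set(UPPER_ID.findall(thought))
    lowered = thought.lower()
    topics.update(phrase for phrase in concept_phrases if phrase in lowered)
    topics.update(match.lower() for match in NAMED_ENTITY.findall(thought))
    return topics


signature = topic_signature("AAPL volume rose while the error rate at Acme Cloud stayed flat.")
print(sorted(signature))
```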

Concept Overlap Detection

For each new thought, the daemon computes overlap with the last 8 thoughts:

overlap = |new_topics & old_topics| / min(|new_topics|, |old_topics|)

Using min() as the denominator means that if a thought's entire topic set is contained in a recent thought, it scores as a duplicate even if the recent thought covered more topics.
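
The formula can be written directly with Python sets. In this sketch the function names are hypothetical, and returning 0.0 when either topic set is empty is an assumption made to avoid dividing by zero.

```python
def concept_overlap(new_topics, old_topics):
    """Shared topics divided by the size of the smaller topic set."""
    if not new_topics or not old_topics:
        return 0.0  # assumption: an empty signature never counts as a duplicate
    return len(new_topics & old_topics) / min(len(new_topics), len(old_topics))


def max_recent_overlap(new_topics, recent_signatures, window=8):
    """Highest overlap against the last `window` thought signatures."""
    recent = recent_signatures[-window:]
    return max((concept_overlap(new_topics, old) for old in recent), default=0.0)


new = {"AAPL", "volume"}
old = {"AAPL", "volume", "earnings", "guidance"}
score = concept_overlap(new, old)
print(score)
```

Here the new thought's two topics are both contained in the older thought, so the score is 2 / min(2, 4) = 1.0. A plain Jaccard index for the same pair would be 2 / 4 = 0.5, which is why the min() denominator catches subsets that Jaccard would rate lower.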

Thresholds:

  • >= 0.55: Thought is discarded as a duplicate
  • >= 0.50: Thought is discarded by the anti-rumination filter. Because this lower bound also discards, any overlap of 0.50 or more is in effect rejected
  • Quality scoring penalizes high overlap even below thresholds

Pattern Detection

Known rumination patterns are caught directly:

  • "You are thinking about..." (self-narration)
  • Repeated status descriptions without new insight
  • Listing known facts without analysis
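
The self-narration case lends itself to a simple regex check, sketched below. Only the "You are thinking about" pattern comes from this README; the second pattern and the `is_rumination` function are hypothetical. The other two categories (repeated status descriptions, fact listing) need more than a regex and are not covered here.

```python
import re

RUMINATION_PATTERNS = [
    # Self-narration pattern named in the list above.
    re.compile(r"^\s*you are thinking about\b", re.IGNORECASE),
    # Hypothetical first-person variant, added for illustration only.
    re.compile(r"^\s*i am thinking about\b", re.IGNORECASE),
]


def is_rumination(thought):
    """Return True if the thought opens with a known self-narration pattern."""
    return any(pattern.search(thought) for pattern in RUMINATION_PATTERNS)


flagged = is_rumination("You are thinking about the queue depth again.")
passed = is_rumination("Queue depth doubled after the 14:00 deploy.")
print(flagged, passed)
```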

Extending with Custom Context Sources

The daemon polls context sources for new events. To add your own:

  1. Create a context loader function that returns a text summary:
async def load_my_context() -> str:
    """Load current state for grounding thoughts in real data."""
    # Read your data sources
    # Return a text summary
    return "Current status: ..."
  2. Create an event poller that pushes stimuli to the queue:
async def poll_my_events(state: DaemonState, queue: PriorityQueue):
    """Check for new events and create stimuli."""
    # check_for_events() is a placeholder: replace it with a call to your own event source
    new_events = check_for_events()
    for event in new_events:
        queue.push(Stimulus(
            priority=1,  # 1=reactive, 2=scheduled, 3=deliberative, 4=deep
            layer="reactive",
            trigger=event.type,
            description=event.description,
        ))
  3. Register them in the main loop.

HTTP API

| Endpoint | Method | Description |
| --- | --- | --- |
| /health | GET | Uptime, thought counts |
| /state | GET | Full daemon state (paused, buffer, predictions) |
| /thoughts?last=N | GET | Last N thoughts from the stream |
| /briefing | GET | Latest rolling briefing |
| /pause | POST | Pause thinking |
| /resume | POST | Resume thinking |
| /session-context | POST | Inject context into the session buffer |
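
A small client for these endpoints could be sketched with the standard library as follows. The paths and methods are taken from the list above and the port is the documented default; the `localhost` host, the helper names, the empty body for `/pause` and `/resume`, and the JSON payload shape for `/session-context` are assumptions, since the request and response formats are not documented here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8768"  # assumes the default THINKING_DAEMON_PORT on localhost


def build_request(path, method="GET", payload=None):
    """Build a request for the daemon API; payload is sent as JSON if given."""
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def call(path, method="GET", payload=None, timeout=5):
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(build_request(path, method, payload), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Requests are only constructed here; nothing is sent until call() is used.
thoughts_req = build_request("/thoughts?last=5")
pause_req = build_request("/pause", method="POST")
context_req = build_request("/session-context", method="POST", payload={"context": "x"})
```

With the daemon running, `call("/health")` would return the health payload and `call("/pause", method="POST")` would pause thinking.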

Example System Prompts

General Observer

You are a thinking daemon observing a complex system.
Your job is to notice patterns, make predictions, and surface insights.
Keep thoughts between 20-80 words. Be specific. Reference real data.
Prefix important observations with SAVE: to persist them.
Prefix predictions with PREDICT: so they can be tracked.

Market Analyst

You are an autonomous market analysis daemon.
You observe price data, volume, and market structure.
Think about what is happening, why, and what might happen next.
Do not repeat yourself. Each thought must add something new.

System Monitor

You are an infrastructure thinking daemon.
You watch service health, error rates, and system metrics.
Think about root causes, correlations, and preventive actions.
Only SAVE observations that are actionable.

Thought Stream Format

Each line in the JSONL stream:

{
  "timestamp": "2024-03-15T10:30:00+00:00",
  "thought_num": 42,
  "layer": "deliberative",
  "trigger": "timer",
  "stimulus": "What patterns span the last 24 hours?",
  "thought": "The actual generated thought text...",
  "importance": 6,
  "thought_quality_self_score": 4,
  "model_used": "qwen2.5:32b",
  "endpoint_used": "http://localhost:11434/api/chat",
  "latency_ms": 3200,
  "token_count": 85
}
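
Because the stream is one JSON object per line, it can be consumed with a few lines of Python. The sketch below is one possible reader: the field names come from the record above, while the `read_thoughts` helper and the choice to skip lines that fail to parse (for example a line that is still being written) are assumptions.

```python
import io
import json


def read_thoughts(lines, min_importance=0):
    """Parse JSONL lines and keep records at or above an importance score."""
    thoughts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: skip lines that are incomplete or corrupt
        if record.get("importance", 0) >= min_importance:
            thoughts.append(record)
    return thoughts


# In-memory sample standing in for thought_stream.jsonl.
sample = io.StringIO(
    '{"thought_num": 41, "importance": 3, "thought": "a", "latency_ms": 1500}\n'
    '{"thought_num": 42, "importance": 6, "thought": "b", "latency_ms": 3200}\n'
)
important = read_thoughts(sample, min_importance=5)
print([t["thought_num"] for t in important])
```

For a real stream, pass an open file handle instead of the sample, for example `read_thoughts(open("thought_stream.jsonl"), min_importance=5)`.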

Running as a Service

macOS (launchd)

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.thinking-daemon</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/bin/python3</string>
        <string>/path/to/thinking_daemon.py</string>
        <string>--system-prompt</string>
        <string>/path/to/system_prompt.md</string>
    </array>
    <key>KeepAlive</key>
    <true/>
    <key>StandardOutPath</key>
    <string>/tmp/thinking-daemon.log</string>
    <key>StandardErrorPath</key>
    <string>/tmp/thinking-daemon.err</string>
</dict>
</plist>

Linux (systemd)

[Unit]
Description=AI Thinking Daemon
After=network.target

[Service]
ExecStart=/usr/bin/python3 /path/to/thinking_daemon.py --system-prompt /path/to/system_prompt.md
Restart=always
RestartSec=10
Environment=THINKING_DAEMON_LLM_URL=http://localhost:8800/v1/chat/completions

[Install]
WantedBy=multi-user.target

License

MIT

About

Autonomous AI thinking daemon — 3 fine-tuned brains, 7 cognitive modes, self-prediction, dialectical reasoning, anti-rumination, persistent memory. 2,300+ thoughts generated.
