RawLLM

Minimal POC of "dumb orchestrator – smart model". The LLM evolves itself by writing plugins (add_plugin / run_plugin). Inspired by MemPalace (96.6% on LongMemEval) and Claude Code’s TAOR loop. The core is immutable (<150 loc). HTTP transport is a plugin.

Idea

Instead of hard-coding complexity into the orchestrator, the model is given two tools:

add_plugin(name, code) — write and save a new plugin (or overwrite an existing one)
run_plugin(name, input_data) — execute an already-loaded plugin by name

The core (orchestrator) is immutable and deliberately "dumb" (~150 lines). All intelligence — creating new capabilities, coordinating agents, managing memory, parsing data — lives in the LLM’s reasoning and in the plugins it generates. The model decides when to write a plugin and can hot-reload them at any time.

Inspiration

MemPalace — the approach that dominated LongMemEval (96.6%) without complex RAG, simply by giving the model raw data and freedom to decide.
Claude Code — the TAOR (Think‑Act‑Observe‑Repeat) architecture and the "dumb orchestrator, smart model" principle.
Critique of over-engineered RAG pipelines — give the model a clean context and let it decide.

Status

✅ Core & HTTP Plugin — implemented, tested, CI passing.
✅ Sprint 1 (Multi-Agent Foundation) — advanced metrics, tool reranking, and rejection handling completed.
⏳ Sprint 2 (Context & Reflection) — in progress: context repository and self-correction loop.

Quick start

# 1. Install dependencies via Poetry or pip
poetry install
# or
pip install -e .

# 2. Create .env with your Anthropic API key
echo "ANTHROPIC_API_KEY=sk-ant-..." > .env

# 3. Start the orchestrator (HTTP server on port 8080)
poetry run python run.py
# or
python run.py

# 4. Send a request
curl -X POST http://localhost:8080/ \
     -H "Content-Type: application/json" \
     -d '{"prompt": "Write a plugin that returns the current time", "context": {}}'

The server port can be overridden via the HTTP_PORT environment variable.

Docker sandbox (WSL)

For more isolated plugin execution in WSL, use the Docker backend. In this mode, untrusted plugins run under a separate user and container filesystem boundary rather than sharing the orchestrator process privileges.

rawllm-core - orchestrator process user on host side
rawllm-plugin - plugin subprocess user inside sandbox container

1) Build sandbox image

docker build -t rawllm/plugin-sandbox:latest -f docker/sandbox/Dockerfile .

2) Enable docker backend

echo "SANDBOX_BACKEND=docker" >> .env
echo "SANDBOX_DOCKER_IMAGE=rawllm/plugin-sandbox:latest" >> .env

3) Run tests in WSL (Docker required)

pytest -q

When docker backend is enabled, plugin execution uses isolated volumes:

rawllm_workspace (rw)
rawllm_core_repo (ro snapshot)
rawllm_plugin_store (ro snapshot)

Running with Free / Lightweight LLMs

run.py supports any OpenAI-compatible provider via the LLM_PROVIDER environment variable (default: anthropic). All security and versioning features are automatically active.

Supported providers

`LLM_PROVIDER`	API key env var	Default model
`anthropic` (default)	`ANTHROPIC_API_KEY`	`claude-3-5-sonnet-20241022`
`groq`	`GROQ_API_KEY`	`llama3-70b-8192`
`gemini`	`GEMINI_API_KEY`	`gemini-2.0-flash`
`openrouter`	`OPEN_ROUTER_API_KEY`	`qwen/qwen3-coder:free`
`deepseek`	`DEEPSEEK_API_KEY`	`deepseek-chat`
`ollama`	(none required)	`llama3.2:3b`
`ollama-qwen-coder`	(none required)	`qwen2.5-coder:7b`

Override any default with LLM_MODEL and LLM_BASE_URL.

Groq (free tier)

echo "GROQ_API_KEY=gsk_..." >> .env
LLM_PROVIDER=groq python run.py

Google Gemini

echo "GEMINI_API_KEY=AIza..." >> .env
LLM_PROVIDER=gemini python run.py

OpenRouter (free models)

echo "OPEN_ROUTER_API_KEY=sk-or-..." >> .env
LLM_PROVIDER=openrouter python run.py
# Use a specific free model:
LLM_PROVIDER=openrouter LLM_MODEL=google/gemma-3-27b-it:free python run.py

DeepSeek

echo "DEEPSEEK_API_KEY=sk-..." >> .env
LLM_PROVIDER=deepseek python run.py

Ollama (fully local, no API key)

# 1. Install Ollama: https://ollama.com/
ollama pull llama3.2:3b   # or any model you prefer

# 2. Run
LLM_PROVIDER=ollama python run.py
# Custom model:
LLM_PROVIDER=ollama LLM_MODEL=mistral python run.py

Local Qwen Coder 7B for container testing

# 1. Pull the local coding model in WSL / host environment
ollama pull qwen2.5-coder:7b

# 2. Run RawLLM against the dedicated provider alias
LLM_PROVIDER=ollama-qwen-coder python run.py

If the orchestrator itself runs in a container and Ollama stays on the host, override the endpoint explicitly:

LLM_PROVIDER=ollama-qwen-coder \
LLM_BASE_URL=http://host.docker.internal:11434/v1 \
python run.py

CLI (`rawllm`)

Install the package in editable mode to get the rawllm command:

pip install -e .

Or invoke directly without installing:

python cli.py <command>

Orchestrator lifecycle

rawllm run                        # use default provider (anthropic)
rawllm run --provider groq        # use a specific provider

Plugin management

rawllm plugin list
rawllm plugin show my_plugin
rawllm plugin add my_plugin path/to/code.py
rawllm plugin rollback my_plugin

Plugin authoring contract

Use module-level docstring in every plugin as a prompt for RawLLM. The docstring should describe plugin role, input/output contract, operational constraints, and failure behavior. See plugins/http.py as a reference template.

Dependency approval

rawllm deps pending               # list modules awaiting approval
rawllm deps approve requests      # approve a module
rawllm deps reject requests       # reject a module

Metrics & analytics

rawllm metrics show                          # all plugins, table format
rawllm metrics show --plugin my_plugin       # one plugin
rawllm metrics show --format json            # JSON output
rawllm metrics evolution my_plugin           # chronological timeline
rawllm metrics trajectory <id>               # view specific execution trajectory
rawllm metrics success-rate                  # aggregate success scores

Configuration

rawllm config show
rawllm config set LLM_PROVIDER groq
rawllm config set ALLOWED_REQUIREMENTS "json,datetime,requests"

⚠️ Security Warning

The statement "plugins run with the same privileges as the orchestrator" applies to trusted in-process plugins and the legacy subprocess backend. When SANDBOX_BACKEND=docker is enabled, untrusted plugins run in a separate container with reduced privileges and isolated mounted volumes. Do not load plugins from untrusted sources in a production environment. This project is a research POC — run it only inside a hardened isolated environment (sandbox, Docker, VM), and review Docker runtime permissions.

Architecture

rawllm/
├── core/
│   ├── llm/                    # LLM abstraction subpackage
│   │   ├── protocol.py         # LLMClientProtocol structural Protocol
│   │   ├── registry.py         # LLM_PROVIDERS — single source of truth
│   │   ├── factory.py          # get_llm_client(provider) factory
│   │   └── clients/
│   │       ├── anthropic.py    # AnthropicClient
│   │       └── openai_compat.py# OpenAICompatibleClient (Groq, Gemini, Ollama, …)
│   ├── plugin_manager.py       # Plugin loading, hot-reload, versioning, sandbox
│   ├── tool_executor.py        # Tool-call routing + dependency gating
│   ├── taor_loop.py            # Think → Act → Observe → Repeat loop
│   ├── config.py               # Settings: trusted_plugins, allowed_requirements
│   ├── tool_management.py      # Tool reranking and rejection handling (Sprint 1)
│   ├── metrics.py              # Event logging with success_score and trajectory tracking
│   └── utils.py                # Shared utilities + extract_imports
├── plugins/
│   └── http.py                 # HTTP transport plugin (port set via HTTP_PORT)
├── plugins_store/              # Versioned plugin storage (created automatically)
│   ├── current/                # Symlinks to active versions
│   └── archive/{name}/         # Previous versions with metrics snapshots
├── cli.py                      # CLI entry point (rawllm)
├── system_prompt.txt           # LLM system prompt
└── run.py                      # Unified entry point (Anthropic / Groq / Gemini / Ollama / …)

License

MIT — use the ideas freely, fork, and improve.

└── HOLOBIONT_ROADMAP.md # Development roadmap and future phases

Name		Name	Last commit message	Last commit date
Latest commit History 57 Commits
.github		.github
core		core
docker/sandbox		docker/sandbox
plugins		plugins
scripts		scripts
tests		tests
workspace		workspace
.env.example		.env.example
.gitignore		.gitignore
.gitmessage.txt		.gitmessage.txt
CHANGELOG.md		CHANGELOG.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
HOLOBIONT_ROADMAP.md		HOLOBIONT_ROADMAP.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
SUPPORT.md		SUPPORT.md
cli.py		cli.py
pyproject.toml		pyproject.toml
run.py		run.py
system_prompt.txt		system_prompt.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

RawLLM

Idea

Inspiration

Status

Quick start

Docker sandbox (WSL)

1) Build sandbox image

2) Enable docker backend

3) Run tests in WSL (Docker required)

Running with Free / Lightweight LLMs

Supported providers

Groq (free tier)

Google Gemini

OpenRouter (free models)

DeepSeek

Ollama (fully local, no API key)

Local Qwen Coder 7B for container testing

CLI (`rawllm`)

Orchestrator lifecycle

Plugin management

Plugin authoring contract

Dependency approval

Metrics & analytics

Configuration

⚠️ Security Warning

Architecture

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

RawLLM

Idea

Inspiration

Status

Quick start

Docker sandbox (WSL)

1) Build sandbox image

2) Enable docker backend

3) Run tests in WSL (Docker required)

Running with Free / Lightweight LLMs

Supported providers

Groq (free tier)

Google Gemini

OpenRouter (free models)

DeepSeek

Ollama (fully local, no API key)

Local Qwen Coder 7B for container testing

CLI (rawllm)

Orchestrator lifecycle

Plugin management

Plugin authoring contract

Dependency approval

Metrics & analytics

Configuration

⚠️ Security Warning

Architecture

License

About

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

CLI (`rawllm`)

Packages