OwLLM

One file · ~30 MB · No admin required · Windows 10 / 11 x64



Your team of AI agents. Build them. Own them. Run them anywhere.

OwLLM is an open platform to build, deploy, and run custom AI agent teams — on your hardware, your VPS, or in a VM, 24/7. Bring your own models: local, cloud, or both. Fine-tune. Quantize. Abliterate. Red-team. Automate.


[!IMPORTANT] OwLLM Desktop currently ships for Windows 10/11 (x64) only. macOS (Apple Silicon + Intel) and Linux (x86_64) builds are coming via the already-configured cross-platform CI. Watch the repo for release notifications.


What makes OwLLM different

Most AI tools give you a chatbox. OwLLM gives you a workforce.

You compose teams of specialised agents — an orchestrator that plans, a coder that writes, a critic that reviews, a researcher that fact-checks — and they collaborate on real tasks in parallel. Each team is a graph of roles + prompts you define. The 18 teams shipped in this repo are starter samples, not the menu.

What OwLLM gives you that others don't
  • 🧩 Build your own teams: Compose agents from 8 base roles + custom prompts. Visual graph builder. Hot-updates through this repo: push a team JSON, it lands on every installed app.
  • ☁️ Cloud OR local, same teams: No 4090? Plug in Claude / GPT / Gemini / Kimi keys, teams work identically. Have a GPU? Run open-weight models locally and stop paying per token. Mix both in the same conversation.
  • 🎓 Fine-tune any model: Full LoRA + Unsloth + TRL pipeline. Drop a JSONL, watch loss curves, save adapters. Works on consumer GPUs (8 GB+).
  • 🔬 Abliterate for safety research: Orthogonalise weights against refusal directions. Generate adversarial datasets. Train better safety classifiers. The honest tools the field actually needs.
  • 🛠 GGUF + quantization built-in: Convert HF safetensors → GGUF, quantize Q4/Q5/Q6/Q8/F16. Ship custom models anyone with llama.cpp can run.
  • 🛡 Red-team capable: Compose adversarial agent teams whose job is to find vulnerabilities in models, code, and apps. Pair with fine-tuning to train defenders.
  • 🔒 OS-level isolation (Win/Mac/Linux): Flip it on and every tool your agents run (shell, file writes, edits, search, and the cloud CLIs Claude/Codex/Gemini/Kimi) runs inside a real Linux sandbox: WSL2 on Windows, a Lima VM on macOS, bubblewrap on Linux (Mac/Linux beta). The project lives on the sandbox filesystem, so a model that runs rm -rf or writes outside it cannot touch your real drive or home. Projects are isolated by default and the toolchain auto-installs. Your provider logins (the CLIs and every API key) auto-sync into the sandbox, so isolated cloud agents just work; the Accounts page tests each provider on both host and sandbox. Connect GitHub to clone/push private repos from inside; convert a project isolated↔not anytime. Code page, agentic teams, and the fine-tuning chat are all covered; the rest stays native.
  • 🔌 MCP-first tooling: Plug in any Model Context Protocol server (filesystem, git, browser, Postgres, GitHub…). Keyless DuckDuckGo web search is auto-installed on first run, with no API key and no card. Engine-agnostic: any search MCP you add is used automatically. Curated packs per team.
  • 🏠 Run anywhere: Desktop today. Headless on a $5/mo VPS (24/7) and containerised / VM deployment are on the roadmap. Your agents, your hardware, your terms.
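
To make the isolation model concrete, here is a minimal sketch of how a tool command might be routed to the sandbox named for each OS. The backend names come from the description above; the function names, the Lima instance name `default`, and the exact command prefixes are assumptions for illustration, not OwLLM's actual implementation (the application source is not public).

```python
import platform
from typing import List, Optional

# Backend per OS, as named in the isolation description above.
SANDBOX_BACKENDS = {
    "Windows": "wsl2",
    "Darwin": "lima",
    "Linux": "bubblewrap",
}


def pick_sandbox_backend(system: Optional[str] = None) -> str:
    """Return the sandbox backend for an OS name as reported by platform.system()."""
    system = system or platform.system()
    if system not in SANDBOX_BACKENDS:
        raise ValueError(f"no sandbox backend known for {system!r}")
    return SANDBOX_BACKENDS[system]


def wrap_command(backend: str, argv: List[str], project_dir: str) -> List[str]:
    """Prefix argv so it would run inside the sandbox (hypothetical layout)."""
    if backend == "wsl2":
        # Run inside the default WSL distribution, starting in the project dir.
        return ["wsl.exe", "--cd", project_dir, "--", *argv]
    if backend == "lima":
        # "default" is an assumed Lima instance name.
        return ["limactl", "shell", "--workdir", project_dir, "default", *argv]
    if backend == "bubblewrap":
        # Host filesystem read-only, project directory writable.
        return ["bwrap", "--ro-bind", "/", "/",
                "--bind", project_dir, project_dir,
                "--chdir", project_dir, *argv]
    raise ValueError(f"unknown backend {backend!r}")


if __name__ == "__main__":
    backend = pick_sandbox_backend("Linux")
    print(wrap_command(backend, ["ls", "-la"], "/work/demo"))
```

The point of the sketch is the shape of the design: the agent's command stays the same and only the prefix changes per platform.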

What teams can do

OwLLM organises teams into the ten categories below; eight of them ship with starter samples. All of them are forkable and remixable: they're templates, not the menu. The real product is the team builder.

| Category | What teams here do | Starter samples |
|---|---|---|
| 🛠 Code | Architect → code → critic → refactor; bug hunting; reviews | code_artisan, dev_squad, code_reviewer, bug_hunter |
| 🔬 Research | Multi-source synthesis with real citations, fact-checking | research_lab, learning_tutor |
| 📊 Data | SQL → notebook → viz → narrative | data_analyst |
| 🎨 Design | Product → UX → tech → critique | product_studio |
| ✍️ Writing | Outline → draft → edit → SEO → publish | writers_room, social_desk |
| 🤝 Ops | Triage → respond → schedule → digest | secretary, concierge, customer_support |
| 💼 Personal | Calendar, finance, health, home automation | finance, health_coach, smart_home |
| 🌐 Social | Outreach, support, community management | sales_outreach, n8n_workflow_builder |
| 🛡 Safety / Red-team | Adversarial dataset generation, jailbreak research, refusal probing | (build your own; see data/teams/SCHEMA.md) |
| 🎮 Gamify | Agent-vs-agent, achievements, arena | (in progress, Q4 2026) |

Browse the 18 starter teams → · Build your own →

Build your own team — 5-minute walkthrough

  1. Open Studio in the desktop app
  2. Drop in agents: orchestrator + 1..N specialists (coder, critic, researcher, brainstormer, devops, documentation, operator, …)
  3. Wire the dispatch graph (orchestrator → coder → critic → back to orchestrator)
  4. Write each agent's system prompt
  5. Save → team appears in your picker
  6. Publish to the community via PR against data/teams/ — your team becomes one-click installable for every other user
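
The walkthrough above amounts to defining a small directed graph of agents. The sketch below shows one way such a team could be represented and sanity-checked before saving. The field names (`agents`, `role`, `prompt`, `dispatch`) and the team name are hypothetical; the authoritative format is whatever data/teams/SCHEMA.md specifies.

```python
from collections import deque

# Hypothetical team layout; the real schema lives in data/teams/SCHEMA.md.
team = {
    "name": "example_squad",
    "agents": {
        "orchestrator": {"role": "orchestrator", "prompt": "Plan the work and delegate."},
        "coder": {"role": "coder", "prompt": "Write the code for each task."},
        "critic": {"role": "critic", "prompt": "Review the code and report issues."},
    },
    # orchestrator -> coder -> critic -> back to orchestrator
    "dispatch": [
        ("orchestrator", "coder"),
        ("coder", "critic"),
        ("critic", "orchestrator"),
    ],
}


def validate_team(team):
    """Return a list of problems; an empty list means the graph looks sane."""
    agents = team["agents"]
    problems = []

    starts = [name for name, a in agents.items() if a["role"] == "orchestrator"]
    if not starts:
        problems.append("team has no orchestrator")

    edges = {}
    for src, dst in team["dispatch"]:
        for name in (src, dst):
            if name not in agents:
                problems.append(f"dispatch references unknown agent {name!r}")
        edges.setdefault(src, []).append(dst)

    # Breadth-first search: every agent should be reachable from an orchestrator.
    seen = set(starts)
    queue = deque(starts)
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    for name in agents:
        if name not in seen:
            problems.append(f"agent {name!r} is unreachable from the orchestrator")
    return problems


if __name__ == "__main__":
    print(validate_team(team))  # prints []
```

A check like this catches the two most common wiring mistakes: an edge pointing at an agent that does not exist, and a specialist that nothing ever dispatches to.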

Power tools nobody else ships

Fine-tune any open-weight model

LoRA pipeline built on Unsloth, TRL, PEFT, and bitsandbytes. Works with Llama, Qwen, Mistral, Gemma, and other open-weight models on Hugging Face. Training shows live loss curves, a graceful Stop preserves checkpoints, and you can resume from either a checkpoint or a saved adapter. Runs on a 12 GB GPU.
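
Why LoRA fits on consumer GPUs comes down to simple arithmetic: instead of updating a full d_out × d_in weight matrix, it trains two low-rank factors with rank × (d_out + d_in) parameters in total. The sketch below works through one example; the 4096 × 4096 layer size and rank 16 are illustrative values, not OwLLM defaults.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    d_out, d_in, rank = 4096, 4096, 16  # illustrative sizes
    full = full_params(d_out, d_in)
    lora = lora_params(d_out, d_in, rank)
    print(full)                 # 16777216
    print(lora)                 # 131072
    print(f"{lora / full:.2%}")  # 0.78%
```

For this layer the adapter trains under 1% of the parameters of the original matrix, which is where most of the memory savings come from.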

Abliterate (refusal removal for safety research)

Orthogonalise weight matrices against refusal directions (the Labonne / Arditi technique, packaged). Use cases:

  • AI safety labs training refusal classifiers need cleanly-uncensored teacher models
  • Red teams need models that don't sandbag jailbreak tests
  • Academic research on alignment failure modes

The corpus prep + abliteration script ship together.
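
The core linear-algebra step is a projection: given a unit direction r in a layer's output space, replace W with W − r(rᵀW) so the layer can no longer write along r. The sketch below shows only that step on a toy matrix in pure Python. The direction used here is made up for the demo, and how OwLLM's script estimates directions or chooses layers is not described in this README.

```python
import math
from typing import List

Matrix = List[List[float]]


def normalize(v: List[float]) -> List[float]:
    """Scale v to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in v]


def orthogonalize(W: Matrix, direction: List[float]) -> Matrix:
    """Return W' = W - r (r^T W), where r is the unit vector along `direction`.

    W has shape (d_out, d_in); `direction` has length d_out.
    """
    r = normalize(direction)
    n_rows, n_cols = len(W), len(W[0])
    # r^T W: how strongly each input column writes along r.
    proj = [sum(r[i] * W[i][j] for i in range(n_rows)) for j in range(n_cols)]
    return [[W[i][j] - r[i] * proj[j] for j in range(n_cols)] for i in range(n_rows)]


if __name__ == "__main__":
    W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    W2 = orthogonalize(W, [1.0, 0.0, 0.0])  # toy direction along the first output axis
    print(W2)  # [[0.0, 0.0], [3.0, 4.0], [5.0, 6.0]]
```

After the projection, rᵀW′ is zero for every column, which is the property the technique relies on.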

GGUF creation + quantization

Convert HF safetensors → GGUF, then quantize to Q4_K_M / Q5_K_M / Q6_K / Q8_0 / F16. The result is a small, fast custom model that others can run with llama.cpp, Ollama, or LM Studio.
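
To give a feel for what a quantization level means, here is a simplified sketch of block-wise 8-bit quantization in the spirit of Q8_0: weights are split into blocks of 32, each block stores one scale plus 32 signed 8-bit integers. This is a teaching sketch under that assumption, not llama.cpp's code (which stores the scale as a 16-bit float), and the K-quant formats are more involved.

```python
from typing import List, Tuple

BLOCK_SIZE = 32  # weights per block


def quantize_block(values: List[float]) -> Tuple[float, List[int]]:
    """Map a block of floats to (scale, int8 values in [-127, 127])."""
    amax = max(abs(v) for v in values)
    scale = amax / 127.0
    if scale == 0.0:
        return 0.0, [0] * len(values)
    return scale, [round(v / scale) for v in values]


def dequantize_block(scale: float, q: List[int]) -> List[float]:
    """Recover approximate floats from a quantized block."""
    return [scale * x for x in q]


def bits_per_weight(block_size: int = BLOCK_SIZE, value_bits: int = 8,
                    scale_bits: int = 16) -> float:
    """Storage cost per weight, including the per-block scale."""
    return (block_size * value_bits + scale_bits) / block_size


if __name__ == "__main__":
    block = [0.5 * i - 8.0 for i in range(BLOCK_SIZE)]
    scale, q = quantize_block(block)
    restored = dequantize_block(scale, q)
    worst = max(abs(a - b) for a, b in zip(block, restored))
    print(worst <= scale / 2 + 1e-12)  # True
    print(bits_per_weight())           # 8.5
```

The rounding error per weight is at most half a scale step, and the per-block scale is why an "8-bit" format costs 8.5 bits per weight rather than 8.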

Adversarial dataset generation

Build a team whose role is to probe another model. The output is a labelled dataset of jailbreak attempts, refusal patterns, and edge cases, which you can offer to AI safety labs or use to train your own filters.

Cloud or local — same teams, your choice

You don't need a 4090. Many users will never have one.

  • Cloud-only: Plug in Claude / GPT / Gemini / Kimi API keys. Teams work identically. ~30 MB install, runs on any laptop.
  • Local + cloud mix: Have a 3060? Run Llama for the bulk, hand off to Claude for the hard parts in the same conversation. Save 90% on tokens.
  • Local-only: Have a 4090? Never touch a cloud API. Privacy by default. Stop paying per token forever.

Same teams. Same agent definitions. Same UI. The model layer is just plumbing.
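
The three modes above can be read as one routing rule with different inputs. The sketch below illustrates that idea; the `Backend` type, the `pick_backend` function, the model names, and the notion of a per-task `is_hard` flag are all hypothetical, since the README does not describe how OwLLM decides which model handles which step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Backend:
    name: str
    kind: str  # "local" or "cloud"


def pick_backend(is_hard: bool, local: Optional[Backend],
                 cloud: Optional[Backend]) -> Backend:
    """Choose a model backend for one task.

    Cloud-only and local-only setups use whatever is configured; a mixed
    setup keeps routine work local and hands hard tasks to the cloud model.
    """
    if local is None and cloud is None:
        raise ValueError("configure at least one model backend")
    if local is None:
        return cloud
    if cloud is None:
        return local
    return cloud if is_hard else local


if __name__ == "__main__":
    llama = Backend("llama-local", "local")   # illustrative names
    claude = Backend("claude-api", "cloud")
    print(pick_backend(False, llama, claude).name)  # llama-local
    print(pick_backend(True, llama, claude).name)   # claude-api
    print(pick_backend(True, llama, None).name)     # llama-local
```

Because the team definition never mentions a specific backend, the same team runs unchanged in all three setups.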

Run anywhere

| Mode | Status | Use case |
|---|---|---|
| Desktop (Windows) | ✅ shipped | Daily-driver AI workstation on your laptop |
| Desktop (macOS / Linux) | 🔜 Q3 2026 | Mac / Ubuntu users |
| Headless on VPS (24/7) | 🔜 Q4 2026 | Run your custom teams on a $5/mo box. Reach them via Telegram, web, API. Always-on agentic services. |
| Containerised / VM | 🔜 Q4 2026 | Drop OwLLM into your existing infra. |

The team definitions, role prompts, MCP configs, and model selections are all portable across deployment modes — build a team once, run it anywhere.

Install (Windows only — for now)

  1. Download OwLLM.Desktop.Setup.exe (~30 MB; one file, that's it)
  2. Run the downloaded installer. Windows SmartScreen may flag it the first time (the binary isn't EV-signed yet): click "More info" → "Run anyway".
  3. On first launch, a hardware-aware wizard opens. It detects your hardware and offers the modules that fit:
    • Local Inference (~33 MB CPU / ~32 MB Vulkan / ~285 MB CUDA) — only needed if you want local models
    • Audio / Speech-to-Text (~148 MB) — for voice messages, mic input
    • Fine-tuning (~12 GB) — only if you'll train models
    • MCP toolchain (~260 MB) — only if you want browser / git / postgres MCP servers

Cloud-only? Skip the wizard entirely and just enter your API keys in Settings. The shell alone is enough for cloud-model chat + agent orchestration.
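
To estimate disk and download cost before running the wizard, the sizes listed above can simply be added up. The sketch below does that arithmetic; the module keys are made-up identifiers, the sizes are the approximate figures from the list, and "~12 GB" is taken as 12 × 1024 MB.

```python
# Approximate sizes in MB, copied from the module list above.
SHELL_MB = 30
MODULES_MB = {
    "local_inference_cpu": 33,
    "local_inference_vulkan": 32,
    "local_inference_cuda": 285,
    "audio": 148,
    "fine_tuning": 12 * 1024,  # "~12 GB"
    "mcp_toolchain": 260,
}


def install_size_mb(selected):
    """Total approximate install size: the shell plus each selected module once."""
    unknown = [m for m in selected if m not in MODULES_MB]
    if unknown:
        raise KeyError(f"unknown modules: {unknown}")
    return SHELL_MB + sum(MODULES_MB[m] for m in set(selected))


if __name__ == "__main__":
    print(install_size_mb([]))                                         # 30
    print(install_size_mb(["local_inference_cuda", "mcp_toolchain"]))  # 575
```

A cloud-only setup stays at roughly 30 MB, while fine-tuning dominates everything else by more than an order of magnitude.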

How updates work

Three independent update streams — small, fast, no full reinstalls:

  • Shell auto-updates via Tauri's signed updater
  • Modules (llama backend, fine-tune env, audio, MCP) check + swap per-launch
  • Data layer (team templates, role prompts, model profiles, MCP recommendations) hot-pulls from data/ in this repo on launch. A new team you contribute today reaches every installed app within minutes — no rebuild.

That's why the data/ tree is open and community-driven even though the app binaries are closed-source.
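
One way to picture the data-layer hot-pull is as a diff between what the app has cached and what the repo currently publishes. The sketch below assumes a manifest that maps each file path to a content hash; that manifest, the file names, and the `plan_data_sync` function are hypothetical, since the README does not document the actual update mechanism.

```python
from typing import Dict, List, Tuple


def plan_data_sync(local: Dict[str, str],
                   remote: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Compare path -> hash manifests and return (to_fetch, to_delete).

    A file is fetched when it is new or its hash changed, and deleted
    when it no longer exists remotely.
    """
    to_fetch = sorted(p for p, h in remote.items() if local.get(p) != h)
    to_delete = sorted(p for p in local if p not in remote)
    return to_fetch, to_delete


if __name__ == "__main__":
    # Illustrative paths and hashes only.
    local = {
        "data/teams/dev_squad.json": "a1",
        "data/teams/old_team.json": "b2",
    }
    remote = {
        "data/teams/dev_squad.json": "a9",   # changed upstream
        "data/teams/new_team.json": "c3",    # newly contributed
    }
    print(plan_data_sync(local, remote))
    # (['data/teams/dev_squad.json', 'data/teams/new_team.json'], ['data/teams/old_team.json'])
```

Only changed files move over the wire, which is consistent with the "small, fast, no full reinstalls" goal stated above.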

Roadmap

  • Multi-agent dispatch with worktree isolation
  • Modular installer + hardware-aware wizard
  • MCP-first tool architecture
  • Fine-tuning + abliteration pipeline
  • GGUF / quantization pipeline
  • Telegram bridge
  • WSL tool isolation — agents run their tools inside Ubuntu, off your Windows drive
  • Cloud CLIs inside the sandbox — Claude/Codex/Gemini/Kimi run isolated too
  • Connect GitHub — isolated agents clone private repos + push from inside the sandbox
  • Auto login-sync — codex/claude/gemini/kimi + every API key mirrored into the sandbox
  • Convert projects isolated↔not from the header; Accounts tests host + sandbox
  • Mac/Linux isolation (beta): Lima VM (macOS) + bubblewrap (Linux), same model as WSL
  • Visual team builder — Q3 2026
  • macOS + Linux desktop — Q3 2026
  • 24/7 headless / VPS mode — Q4 2026
  • Container / VM deployment — Q4 2026
  • Gamification (agent-vs-agent arena, achievements) — Q4 2026 (in progress)
  • WhatsApp bridge — Q4 2026
  • Vision models (LLaVA / Pixtral) — Q4 2026
  • Voice output (TTS) — Q1 2027
  • Public team marketplace — Q1 2027

Track active work in Discussions → Roadmap.

Who's this for

  • Indie devs & founders — your AI workforce, not a SaaS subscription
  • AI safety researchers — abliteration, red-team teams, adversarial dataset gen
  • Model creators — fine-tune, quantize, ship GGUFs
  • Automation builders — replace n8n / Zapier with agents that understand meaning
  • Privacy-bound teams — legal, medical, defence, regulated industries
  • Agencies — run custom client agent teams 24/7 (when VPS mode lands)
  • Power users — anyone tired of generic chatboxes

License

Repository contents (agent teams, role definitions, registry, schemas, docs): MIT — fork freely, share team packs, build on it.

Application binaries via Releases: see EULA.md. Source for the application itself is not currently public.

Acknowledgements

Standing on the shoulders of: llama.cpp, whisper.cpp, Tauri, Unsloth, Model Context Protocol, and the open-weight model creators (Meta, Alibaba, Mistral, Google, DeepSeek, Anthropic for their safety research).

If you build something cool with OwLLM, share it in Discussions → Show & Tell. Stars are how this category proves itself worth investing in.

About

Local-first AI workstation. Run open-weight models, fine-tune, orchestrate multi-agent teams. No cloud required.
