┌──────────────────────┐
│  action ─────┐       │
│              ▼       │
│    ╭─────────────╮   │
│    │  ▓ gate ▓   │   │
│    ╰─────────────╯   │
│       ╱   │   ╲      │
│     ✓     ?     ✗    │
│   allow  ask   deny  │
└──────────────────────┘
bareguard
One chokepoint between your agent and the world. Bounds what the agent does, not what it says. Single audit log. Hard caps that halt with a human in the loop. ~1,000 lines, one production dep.
One chokepoint between your agent and the world. Every action the agent takes — a shell command, a file write, a network call, a spend — passes through one Gate and comes back allow, deny, or ask a human. You get a hard floor under a probabilistic agent, and a single audit log of everything it tried.
That floor is Axis A of a floor + harness model: you can't make a probabilistic agent deterministic, so you fence where the dice can do damage. Axis B (opt-in) is the complement — it reconciles what came back against what you asked for. Both are shown below.
Small on purpose: one Gate, three call sites (redact · check · record), thirteen primitives you can each read in a sitting. Embed it like the rest of the bare suite — no daemon, no SaaS, no telemetry.
What it isn't — bareguard owns one layer and is honest about the rest. It's not a content filter (toxicity / PII / schema → guardrails-ai), not a sandbox (containment → Docker / gVisor), and not auth (who the actor is → upstream; per-principal policy rides action._ctx). It decides the action; it never runs it. The deliberate non-goals are listed once in the NO-GO list.
npm install bareguard
Requires Node.js >= 20. One production dep: proper-lockfile. Ships with TypeScript types (generated from JSDoc) — import { Gate, type GateConfig } from "bareguard" works out of the box, no @types package needed.
import { Gate } from "bareguard";
const gate = new Gate({
  tools: { allowlist: ["bash", "read", "write", "fetch"] },
  bash: { allow: ["git", "ls"], denyPatterns: [/sudo/, /rm\s+-rf/] },
  fs: { writeScope: ["/tmp/agent"], readScope: ["/tmp"], deny: ["~/.ssh"] },
  budget: { maxCostUsd: 5.00, maxTokens: 100_000 },
  limits: { maxTurns: 50 },
  humanChannel: async (event) => {
    // event.kind: "ask" | "halt" (your UX decides: TUI, Slack, web, PIN)
    return { decision: "allow" }; // or "deny" / "topup" / "terminate"
  },
});
await gate.init();
// In your agent loop:
const decision = await gate.check(action); // audit auto-redacts if `secrets` is set
if (decision.outcome === "allow") {
  const result = await yourExecutor(action);
  await gate.record(action, result); // result.costUsd / result.tokens
}
// gate.check never returns "askHuman"; bareguard resolves that internally
// via humanChannel and gives you a terminal allow/deny.
The Core is three modules: bareagent drives the think→act loop, litectx supplies ranked context, and bareguard gates every action between them. A loop looks like:
const ctx = await memory.recall(goal); // litectx → ranked context
const action = await agent.next(goal, ctx); // bareagent → a proposed action
const decision = await gate.check(action); // bareguard → allow / deny / ask-a-human
if (decision.outcome === "allow") await run(action);
And it gates on meaning, not text: when the agent writes a memory, litectx emits the fact's source and bareguard's flags primitive decides on that field directly, via flags: { provenance: { web: "ask" }, injectionRisk: { high: "deny" } }, with no brittle regex over a serialized blob.
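To make "gating on meaning" concrete, here is a minimal sketch of how a flags-style rule table could be evaluated against structured fields. It is not bareguard's implementation: `evaluateFlags`, the severity ranking, and the assumption that `provenance` and `injectionRisk` sit at the top level of the action object are all hypothetical, chosen only for illustration.

```javascript
// Hypothetical sketch: decide on structured fields, not on serialized text.
// The rule table mirrors the config shape quoted above.
const flags = {
  provenance: { web: "ask" },
  injectionRisk: { high: "deny" },
};

// Assumed ranking for this sketch: the strictest matching outcome wins.
const SEVERITY = { allow: 0, ask: 1, deny: 2 };

function evaluateFlags(action, rules) {
  let outcome = "allow";
  for (const [field, byValue] of Object.entries(rules)) {
    const value = action[field];
    // Only string values that are own keys of the rule table count as a match.
    const matched =
      typeof value === "string" && Object.hasOwn(byValue, value)
        ? byValue[value]
        : undefined;
    if (matched !== undefined && SEVERITY[matched] > SEVERITY[outcome]) {
      outcome = matched;
    }
  }
  return outcome;
}

console.log(evaluateFlags({ type: "memory.write", provenance: "web" }, flags)); // "ask"
console.log(evaluateFlags({ type: "memory.write", provenance: "web", injectionRisk: "high" }, flags)); // "deny"
console.log(evaluateFlags({ type: "memory.write", provenance: "human" }, flags)); // "allow"
```

The lookup is a plain key comparison, so the text of the memory never influences the result; only the field the producer attached does.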
Wiring it into a real agent? Hand your AI assistant the integration guide and describe what you want:
Read bareguard.context.md from node_modules/bareguard/bareguard.context.md,
then wire a Gate into my agent. Here's my setup: <describe loop, tools, budget>.
That file has the humanChannel patterns, shared-budget-across-processes setup, eval order, audit format, and 10 wiring recipes.
Thirteen small files — each ~30–180 lines. The gate runs them in a fixed order (deny → ask → scope → default, first match wins) and they compose into harness bundles: tighten-only capability presets an agent picks at runtime, never load-bearing for safety — pick the wrong one and the floor still holds. (In code-mode, the agent writes a code body over a typed tool menu and the gate stays in the parent process; the agent never holds a raw tool.)
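The fixed order can be pictured with a small sketch. Only the phase order (deny → ask → scope → default) and the first-match-wins rule come from the text above; the rules inside each phase, the `evaluate` function, and the choice that the default phase allows are assumptions made for illustration, not bareguard's actual behavior.

```javascript
// Hypothetical sketch of a fixed-order, first-match-wins evaluation.
const ALLOWED_TOOLS = ["bash", "read", "write", "fetch"];

// Each phase returns an outcome, or null to pass the action to the next phase.
const phases = [
  { name: "deny", run: (a) => (/\bsudo\b/.test(a.command ?? "") ? "deny" : null) },
  { name: "ask", run: (a) => (/\brm\b/.test(a.command ?? "") ? "ask" : null) },
  { name: "scope", run: (a) => (ALLOWED_TOOLS.includes(a.type) ? null : "deny") },
  { name: "default", run: () => "allow" }, // assumed default for this sketch
];

function evaluate(action) {
  for (const phase of phases) {
    const outcome = phase.run(action);
    if (outcome !== null) return { outcome, phase: phase.name }; // first match wins
  }
  throw new Error("unreachable: the default phase always answers");
}

console.log(evaluate({ type: "bash", command: "sudo rm -rf /tmp/x" })); // { outcome: 'deny', phase: 'deny' }
console.log(evaluate({ type: "bash", command: "rm notes.txt" }));       // { outcome: 'ask', phase: 'ask' }
console.log(evaluate({ type: "email" }));                               // { outcome: 'deny', phase: 'scope' }
console.log(evaluate({ type: "bash", command: "ls" }));                 // { outcome: 'allow', phase: 'default' }
```

Because deny runs first, the `sudo rm -rf` command never reaches the ask phase, even though it would also match there.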
- Scope what runs: tools is a closed allowlist (deny-by-default); bash / fs / net bound which commands, paths, and domains are even reachable.
- Tier what's dangerous: bash.classify ranks a command safe → destructive → super-destructive across Linux / macOS / Windows and routes the severity to your human channel; content ships safe defaults (rm -rf /, DROP TABLE denied outright; destructive verbs ask).
- Bound what accumulates: budget caps spend, tokens, or any countable resource, and limits caps turns / children / depth. Both halt with a human in the loop, not silently, and the cap is shared across processes.
- Gate on meaning, not text: flags reads a structured field's value (a memory engine's provenance / injectionRisk) straight off the action, no regex; the same channel can also confirm before every call of a tool.
- Prove what happened: secrets auto-redacts every audit line, and one audit JSONL joins each request to its outcome and its approval, even when two actions look identical.
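As an illustration of "bound what accumulates", the sketch below shows the control flow of a hard cap that halts and hands the decision to a human instead of failing silently. The `Budget` class, `checkBudget`, and the `topupUsd` parameter are hypothetical; bareguard's cap is shared across processes, which this single-process, in-memory version does not attempt. The `"halt"` event kind and the `"topup"` / `"terminate"` answers follow the quick-start example.

```javascript
// Hypothetical in-memory sketch of a hard spend cap with a human in the loop.
class Budget {
  constructor(maxCostUsd) {
    this.maxCostUsd = maxCostUsd;
    this.spentUsd = 0;
  }
  // Add the cost of a finished action to the running total.
  record(costUsd) {
    this.spentUsd += costUsd;
  }
  exhausted() {
    return this.spentUsd >= this.maxCostUsd;
  }
}

// askHuman stands in for a human channel; it returns "topup" or "terminate".
function checkBudget(budget, askHuman, topupUsd = 1) {
  if (!budget.exhausted()) return "allow";
  const answer = askHuman({
    kind: "halt",
    spentUsd: budget.spentUsd,
    maxCostUsd: budget.maxCostUsd,
  });
  if (answer === "topup") {
    budget.maxCostUsd += topupUsd; // the human raised the cap
    return "allow";
  }
  return "deny"; // any other answer ends the run
}

const budget = new Budget(5);
budget.record(3);
console.log(checkBudget(budget, () => "terminate")); // "allow" (under the cap, nobody is asked)
budget.record(2.5);
console.log(checkBudget(budget, () => "terminate")); // "deny"
console.log(checkBudget(budget, () => "topup"));     // "allow" (cap raised to 6)
```

The point of the shape is that crossing the cap never resolves on its own: the only way forward is an explicit human answer.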
Full per-primitive reference lives in the Usage Guide and Integration Guide — not here.
Tested across Linux + macOS + Windows × Node 20 + 22: real-subprocess shared-budget contention, halt cascades, single-file audit atomicity, and family-tree stitching across a 3-deep spawn tree.
Axis A (everything above) gates what the agent is about to do. Axis B is the complement: after a result comes back, it carries a fact about whether that result honored the request, so a human approval shows independent facts instead of the agent's own summary. It's a detector, never an enforcer — it annotates an Axis-A stop, it never blocks alone.
The line bareguard holds is facts, not judgments. A fact is something you computed deterministically — a membership test, a number comparison (booking €400 > your €300 cap; this memory is agent-authored, you asked for human-only). bareguard carries facts; it never runs an LLM and never decides. A soft "this feels off-topic" is a non-fact — a judgment — and stays on your side of the line, because a model's guess must never auto-pass or auto-block an action.
// you compute the fact (a deterministic check); bareguard buffers it and rides the next ask
await gate.annotate({ surface: true, verdict: "broke", where: "you said under €300; the booking is €400" });
await gate.check({ type: "book", needsReview: "yes" }); // the fact surfaces as event.annotations
const facts = gate.drainAnnotations(); // and/or feed them back to the agent
You declare undoable action types via axisB: { reversible: [...] }; reversibility is read from the gated action's type, never the fact, the agent, or the model. The knob (strict default | relaxed) is pure noise control on the reversible path, never safety.
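Returning to the facts-versus-judgments line: a fact such as the booking example can come from an ordinary comparison or membership test on your side. The helpers below are a sketch under that assumption. `priceFact` and `authorFact` are hypothetical names, and the `memory.author` field is invented for the example; only the `{ surface, verdict, where }` shape is taken from the annotate call shown above.

```javascript
// Hypothetical helpers: each returns a fact from a deterministic check,
// or null when the result honored the request.

// Number comparison: did the price stay under the cap the user stated?
function priceFact(priceEur, capEur) {
  if (priceEur <= capEur) return null;
  return {
    surface: true,
    verdict: "broke",
    where: `you said under €${capEur}; the booking is €${priceEur}`,
  };
}

// Membership test: was the memory written by an allowed kind of author?
function authorFact(memory, allowedAuthors) {
  if (allowedAuthors.includes(memory.author)) return null;
  return {
    surface: true,
    verdict: "broke",
    where: `memory is ${memory.author}-authored; allowed: ${allowedAuthors.join(", ")}`,
  };
}

console.log(priceFact(400, 300));
// { surface: true, verdict: 'broke', where: 'you said under €300; the booking is €400' }
console.log(priceFact(250, 300)); // null
console.log(authorFact({ author: "agent" }, ["human"]).where);
// "memory is agent-authored; allowed: human"
```

A non-null result is the kind of object you would hand to gate.annotate; a model's impression that something "feels off-topic" has no place in either function.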
| Doc | What it covers |
|---|---|
| Integration Guide | LLM-optimized wiring — hand it to your AI assistant. |
| Usage Guide | Eval order, common gotchas, and 8 deployment recipes. |
| Harness cookbook | Vetted capability bundles — tighten-only presets over one floor. |
| PRD | Unified design spec + future-feature candidates. |
| Harness research | Problem space, the A2A intent-drift experiment, and identity/the gate (auth is upstream; per-principal policy via _ctx) — three merged. |
| NO-GO list | What bareguard deliberately won't do. |
| Decisions log · CHANGELOG | Design calls and release history. |
Local-first, composable agent infrastructure. Same API patterns throughout — mix and match, each module works standalone.
Core — the brain, the gate, the memory.
- bareagent — the think→act→observe loop. Goal in → coordinated actions out. Replaces LangChain, CrewAI, AutoGen.
- bareguard — the single gate every action passes through. Action in → allow / deny / ask-a-human out. Replaces hand-rolled allowlists and scattered policy code.
- litectx — tree-sitter code + memory graph with activation decay, plus lightweight context engineering (write · select · compress · isolate). Query in → ranked context out.
Optional reach — give the agent hands.
- barebrowse — a real browser for agents. URL in → pruned snapshot out. Replaces Playwright, Selenium, Puppeteer.
- baremobile — Android + iOS device control. Screen in → pruned snapshot out. Replaces Appium, Espresso, XCUITest.
- beeperbox — 50+ messaging networks via one MCP server (headless Beeper Desktop in Docker). Chat in → unified message stream out. Replaces Twilio, per-platform bot APIs.