Guardrailed agentic execution engine, so your agent stops when it should, not when your budget runs out.
A zero-dependency library for running autonomous AI agent loops with built-in guardrails: loop detection, token budgets, step limits, duplicate call prevention, and graceful degradation.
An AI agent spent $847 calling the same tool 2,847 times before I added loop detection.
The agent was stuck in a retry loop: same tool, same arguments, same error, 2,847 times in a row. It burned through the entire monthly token budget in 4 hours. The task? A simple "find and summarize" that should have cost $0.03.
Without guardrails, autonomous agents are token-eating machines. This library adds the brakes.
- Loop Detection: Detects repeated tool calls, argument cycles, and stuck states before they burn your budget
- Token Budget: Set hard token limits per task; the agent stops gracefully when the budget is exhausted
- Step Limits: Cap the number of reasoning + tool-call steps per execution
- Duplicate Prevention: Identical tool calls (same name + same args) are blocked and answered with cached results
- Graceful Degradation: When a limit is hit, the agent returns partial results instead of crashing
- Provider Agnostic: Works with any OpenAI-compatible API: MiMo, DeepSeek, OpenRouter, OpenAI, Anthropic
- MiMo Optimized: Special handling for MiMo thinking models (mimo-v2.5-pro) to prevent token runaway
- Observability: Built-in event emitter for monitoring every step, tool call, and guardrail trigger
- Zero Dependencies: Pure ESM, Node.js 18+, nothing extra
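
As a rough illustration of the loop-detection and duplicate-prevention bullets above, the sketch below keeps a sliding window of recent call signatures and counts repeats. It is not the library's implementation: `makeLoopDetector`, its option names, and its return shape are hypothetical, and it treats repeated identical calls as the "stuck" signal rather than repeated results.

```javascript
// Hypothetical sketch: not the library's actual implementation.
function makeLoopDetector({ window = 10, threshold = 3 } = {}) {
  const recent = []; // signatures of the most recent calls, oldest first

  return {
    check({ name, args }) {
      // Naive signature: tool name plus JSON-serialized arguments.
      const signature = `${name}:${JSON.stringify(args)}`;
      const priorCount = recent.filter((s) => s === signature).length;

      recent.push(signature);
      if (recent.length > window) recent.shift(); // drop the oldest entry

      const count = priorCount + 1;
      return {
        isDuplicate: priorCount > 0, // already seen inside the window
        isStuck: count >= threshold, // repeated often enough to call it a loop
        count,
      };
    },
  };
}

const detector = makeLoopDetector({ window: 10, threshold: 3 });
const call = { name: 'search', args: { query: 'test' } };
const first = detector.check(call);  // { isDuplicate: false, isStuck: false, count: 1 }
const second = detector.check(call); // { isDuplicate: true, isStuck: false, count: 2 }
const third = detector.check(call);  // { isDuplicate: true, isStuck: true, count: 3 }
```

A caller could return a cached result when `isDuplicate` is true and stop the run when `isStuck` is true.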
npm install ai-agent-loop

import { createAgentLoop } from 'ai-agent-loop';

const agent = createAgentLoop({
  provider: {
    baseUrl: 'https://token-plan-sgp.xiaomimimo.com/v1',
    apiKey: process.env.MIMO_API_KEY,
    model: 'mimo-v2.5',
  },
  guards: {
    maxSteps: 20,        // Max reasoning + tool steps
    maxTokens: 50_000,   // Hard token budget per task
    maxToolCalls: 50,    // Max tool invocations
    duplicateWindow: 10, // Block duplicate calls within last N
    stuckThreshold: 3,   // Detect stuck state after N same-result calls
  },
  tools: [
    {
      name: 'search_web',
      description: 'Search the web',
      parameters: { type: 'object', properties: { query: { type: 'string' } }, required: ['query'] },
      // URL-encode the query so spaces and special characters survive the request
      handler: async ({ query }) =>
        fetch(`https://api.search.example/${encodeURIComponent(query)}`).then(r => r.json()),
    },
  ],
});
const result = await agent.run('Find the top 3 news stories today and summarize each in one sentence.');
console.log(result.content); // Final response
console.log(result.steps); // Number of steps taken
console.log(result.tokensUsed); // Total tokens consumed
console.log(result.guardsHit); // Which guardrails triggered (if any)
console.log(result.cost); // Estimated cost in USD

┌──────────────────────────────────────────────────┐
│                  ai-agent-loop                   │
├──────────────────────────────────────────────────┤
│                                                  │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐      │
│  │   Goal   │   │   LLM    │   │   Tool   │      │
│  │  Parser  │──▶│   Call   │──▶│ Execute  │      │
│  └──────────┘   └──────────┘   └──────────┘      │
│       │              │              │            │
│       ▼              ▼              ▼            │
│  ┌───────────────────────────────────────────┐   │
│  │             GUARDRAIL ENGINE              │   │
│  │                                           │   │
│  │  ┌──────────┐ ┌──────────┐ ┌──────────┐   │   │
│  │  │   Loop   │ │  Token   │ │   Step   │   │   │
│  │  │  Detect  │ │  Budget  │ │  Limit   │   │   │
│  │  └──────────┘ └──────────┘ └──────────┘   │   │
│  │  ┌──────────┐ ┌──────────┐ ┌──────────┐   │   │
│  │  │   Dup    │ │  Stuck   │ │   Cost   │   │   │
│  │  │  Block   │ │  Detect  │ │  Track   │   │   │
│  │  └──────────┘ └──────────┘ └──────────┘   │   │
│  └───────────────────────────────────────────┘   │
│                      │                           │
│                      ▼                           │
│  ┌───────────────────────────────────────────┐   │
│  │           GRACEFUL DEGRADATION            │   │
│  │    (partial results + reason for stop)    │   │
│  └───────────────────────────────────────────┘   │
│                      │                           │
│       ┌──────────────┼──────────────┐            │
│       ▼              ▼              ▼            │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐      │
│  │  Xiaomi  │   │ DeepSeek │   │  OpenAI  │      │
│  │   MiMo   │   │          │   │          │      │
│  └──────────┘   └──────────┘   └──────────┘      │
└──────────────────────────────────────────────────┘
# Run an agent task with guardrails
ai-agent-loop run "Summarize today's HN top stories" --max-steps 10 --max-tokens 20000
# Dry run: show what would happen without executing
ai-agent-loop dry-run "Analyze this codebase" --max-steps 5
# Show guardrail stats from last run
ai-agent-loop stats
# Test loop detection with a deliberately stuck prompt
ai-agent-loop test-loop --iterations 100

`createAgentLoop(config)` creates a guardrailed agent loop instance.
Config options:
| Option | Type | Default | Description |
|---|---|---|---|
| `provider` | Object | required | `{ baseUrl, apiKey, model }` |
| `guards` | Object | `{}` | Guardrail configuration |
| `guards.maxSteps` | number | `30` | Max reasoning + tool steps |
| `guards.maxTokens` | number | `100_000` | Hard token budget per task |
| `guards.maxToolCalls` | number | `100` | Max tool invocations |
| `guards.duplicateWindow` | number | `10` | Block duplicates within last N calls |
| `guards.stuckThreshold` | number | `3` | Same result N times = stuck |
| `guards.maxRetries` | number | `3` | Max retries per failed tool call |
| `guards.timeoutMs` | number | `30_000` | Per-step timeout in milliseconds |
| `tools` | Array | `[]` | Tool definitions with handlers |
| `onStep` | Function | `null` | Callback for each step |
| `onGuard` | Function | `null` | Callback when a guardrail triggers |
| `onToolCall` | Function | `null` | Callback for each tool invocation |
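
To make the defaults above concrete, here is a minimal sketch of how guard options could be merged with them. The default values are copied from the table; `resolveGuards` and its validation rules are hypothetical and may differ from what the library actually does.

```javascript
// Defaults copied from the options table; the merge logic itself is an assumption.
const DEFAULT_GUARDS = Object.freeze({
  maxSteps: 30,
  maxTokens: 100_000,
  maxToolCalls: 100,
  duplicateWindow: 10,
  stuckThreshold: 3,
  maxRetries: 3,
  timeoutMs: 30_000,
});

// Hypothetical helper: fills in missing guard options and rejects bad values.
function resolveGuards(userGuards = {}) {
  const resolved = { ...DEFAULT_GUARDS, ...userGuards };
  for (const [key, value] of Object.entries(resolved)) {
    if (!(key in DEFAULT_GUARDS)) {
      throw new TypeError(`Unknown guard option: ${key}`);
    }
    if (!Number.isInteger(value) || value <= 0) {
      throw new RangeError(`guards.${key} must be a positive integer, got ${value}`);
    }
  }
  return resolved;
}

// Only the overridden options change; everything else keeps its default.
const guards = resolveGuards({ maxSteps: 20, maxTokens: 50_000 });
```

Rejecting unknown keys up front catches typos such as `maxStep`, which would otherwise silently leave the default in place.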
Returns:
| Method | Description |
|---|---|
| `.run(goal)` | Execute an agent task with full guardrails |
| `.dryRun(goal)` | Simulate execution, return plan without running |
| `.getStats()` | Stats from last run |
| `.on(event, handler)` | Subscribe to events |
| `.reset()` | Clear all state |
agent.on('step', ({ step, tokensUsed, toolCalls }) => { ... });
agent.on('tool_call', ({ name, args, cached }) => { ... });
agent.on('guard_triggered', ({ guard, detail }) => { ... });
agent.on('complete', ({ content, steps, tokensUsed, cost, degraded }) => { ... });
agent.on('error', ({ error, step }) => { ... });

import { createLoopDetector, createTokenBudget, createStepLimiter } from 'ai-agent-loop/guards';
// Use individually
const detector = createLoopDetector({ window: 10, threshold: 3 });
detector.check({ name: 'search', args: { query: 'test' } });
// → { isDuplicate: false, isStuck: false, count: 1 }
const budget = createTokenBudget({ limit: 50_000 });
budget.consume(1200);
budget.remaining(); // → 48_800
budget.isExhausted(); // → false

import { createBudgetTracker } from 'ai-agent-loop/budget';

const budget = createBudgetTracker({
  provider: 'xiaomi',
  model: 'mimo-v2.5',
  limit: 100_000,
  onExhausted: (stats) => console.log('Budget hit!', stats),
});
budget.log({ inputTokens: 500, outputTokens: 200 });
budget.getSummary();
// → { totalTokens: 700, limit: 100_000, used: '0.7%', estimatedCost: '$0.000175' }

Without a retry limit, an agent will call a failing tool forever. A tool that returns an error 100% of the time will consume your entire budget in minutes.
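
A minimal sketch of what a per-call retry cap could look like. `callWithRetryCap` is a hypothetical helper, not part of the library, and reading `maxRetries` as "retries after the first attempt" is an assumption.

```javascript
// Hypothetical helper: retry a failing tool handler a bounded number of times.
async function callWithRetryCap(handler, args, maxRetries = 3) {
  let lastError;
  // One initial attempt plus up to `maxRetries` retries.
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return { ok: true, value: await handler(args), attempts: attempt + 1 };
    } catch (error) {
      lastError = error;
    }
  }
  // Give up and report the failure instead of looping forever.
  return { ok: false, error: lastError, attempts: maxRetries + 1 };
}

// A tool that fails every time is tried 4 times (1 attempt + 3 retries), then abandoned.
const alwaysFails = async () => {
  throw new Error('upstream unavailable');
};
callWithRetryCap(alwaysFails, {}, 3).then((outcome) => {
  console.log(outcome.ok, outcome.attempts); // false 4
});
```

Returning a failure object rather than rethrowing lets the surrounding loop record the error and move on to a different action.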
// ❌ No guard: agent retries forever
const agent = createAgentLoop({ guards: {} });
// ✅ Guarded: stops after 3 retries per tool
const agent = createAgentLoop({
  guards: { maxRetries: 3, maxToolCalls: 50 },
});

MiMo v2.5-pro, DeepSeek Reasoner, and other thinking models consume 10x+ tokens on internal reasoning. An agent loop that's slightly stuck will burn through a budget 10x faster with a thinking model.
Always use non-thinking models for agent loops unless the task genuinely requires chain-of-thought reasoning.
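
The arithmetic behind that advice, as a small sketch. `stepsUntilExhausted` is a hypothetical helper, and the 500 tokens per step is an assumed figure chosen only to make the numbers easy to follow.

```javascript
// Hypothetical helper: how many steps fit in a token budget.
function stepsUntilExhausted(budgetTokens, tokensPerStep, reasoningMultiplier = 1) {
  return Math.floor(budgetTokens / (tokensPerStep * reasoningMultiplier));
}

const budgetTokens = 50_000; // token budget per task
const tokensPerStep = 500;   // assumed average for a non-thinking model

const plainSteps = stepsUntilExhausted(budgetTokens, tokensPerStep);        // 100
const thinkingSteps = stepsUntilExhausted(budgetTokens, tokensPerStep, 10); // 10
```

With a 10x reasoning overhead, the same budget covers a tenth of the steps, so a loop that goes unnoticed for ten iterations has already spent everything.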
// ❌ Thinking model: runaway token consumption
const agent = createAgentLoop({
  provider: { model: 'mimo-v2.5-pro', ... },
});
// ✅ Non-thinking: predictable token usage
const agent = createAgentLoop({
  provider: { model: 'mimo-v2.5', ... },
});

Agents often call the same tool with slightly different arguments that resolve to the same result. Example: { query: "weather NYC" } and { query: "weather New York City" }. A simple string comparison misses these.
The duplicate detector normalizes arguments before comparison, catching semantic duplicates.
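
One way such normalization could work, sketched under stated assumptions: `normalizeValue`, `callKey`, and the alias table are all hypothetical and not taken from the library. Case, whitespace, and key order are handled generically; the NYC example only matches because of the hand-written alias entry, so real semantic matching would need something richer.

```javascript
// Hypothetical sketch of argument normalization; not the library's code.
// The alias table is illustrative only.
const ALIASES = new Map([['nyc', 'new york city']]);

function normalizeValue(value) {
  if (typeof value === 'string') {
    return value
      .trim()
      .toLowerCase()
      .split(/\s+/)                             // collapse repeated whitespace
      .map((word) => ALIASES.get(word) ?? word) // expand known aliases
      .join(' ');
  }
  if (Array.isArray(value)) return value.map(normalizeValue);
  if (value !== null && typeof value === 'object') {
    // Sort keys so { a, b } and { b, a } serialize identically.
    return Object.fromEntries(
      Object.keys(value).sort().map((key) => [key, normalizeValue(value[key])])
    );
  }
  return value;
}

// Canonical key for a tool call: equal keys are treated as duplicates.
function callKey(name, args) {
  return `${name}:${JSON.stringify(normalizeValue(args))}`;
}

const a = callKey('search_web', { query: 'weather NYC' });
const b = callKey('search_web', { query: '  Weather   New York City ' });
console.log(a === b); // true
```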
When an agent's context window fills up, it forgets the original goal and starts repeating earlier steps. This manifests as a loop that's invisible to simple step counting.
Solution: Set maxSteps well below the context window limit. If your model has 128K context, don't let the agent run 200 steps.
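
A back-of-the-envelope way to pick that cap, as a sketch. `safeMaxSteps` is a hypothetical helper, and the per-step and reserved token figures are assumptions you would replace with measurements from your own workload.

```javascript
// Hypothetical helper: choose a step cap that keeps the transcript inside the context window.
function safeMaxSteps({ contextWindow, avgTokensPerStep, reservedTokens = 0, safetyFactor = 0.5 }) {
  const usable = (contextWindow - reservedTokens) * safetyFactor;
  return Math.max(1, Math.floor(usable / avgTokensPerStep));
}

const maxSteps = safeMaxSteps({
  contextWindow: 128_000,  // 128K-token model, as in the example above
  avgTokensPerStep: 2_000, // assumed context growth per step
  reservedTokens: 8_000,   // assumed system prompt + tool schemas
});
// (128_000 - 8_000) * 0.5 / 2_000 = 30
```

Under these assumptions the cap comes out at 30 steps, far below the 200 that would overflow the window.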
When a guardrail triggers, don't just kill the agent. Return whatever partial results it has collected, along with a clear reason for stopping. Users can decide whether to continue with a fresh budget.
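
A minimal sketch of a loop that degrades this way. `runWithDegradation`, the `nextStep` callback (standing in for one LLM + tool round trip), and the `stopReason` strings are hypothetical; only the `content`, `steps`, `tokensUsed`, `degraded`, and `stopReason` field names mirror the result object shown in this README.

```javascript
// Hypothetical sketch: a guarded loop that returns partial results when a limit trips.
async function runWithDegradation(nextStep, { maxSteps, maxTokens }) {
  const partial = []; // outputs collected so far
  let tokensUsed = 0;

  for (let step = 1; step <= maxSteps; step++) {
    const { output, tokens, done } = await nextStep(step);
    tokensUsed += tokens;
    partial.push(output);

    if (done) {
      return { content: partial.join('\n'), steps: step, tokensUsed, degraded: false, stopReason: null };
    }
    if (tokensUsed >= maxTokens) {
      // Budget exhausted: hand back what we have instead of throwing.
      return { content: partial.join('\n'), steps: step, tokensUsed, degraded: true, stopReason: 'maxTokens' };
    }
  }
  return { content: partial.join('\n'), steps: maxSteps, tokensUsed, degraded: true, stopReason: 'maxSteps' };
}

// A fake step that never finishes, so the step limit is what stops the run.
const neverDone = async (step) => ({ output: `finding ${step}`, tokens: 100, done: false });
runWithDegradation(neverDone, { maxSteps: 3, maxTokens: 10_000 }).then((result) => {
  console.log(result.degraded, result.stopReason); // true maxSteps
});
```
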
const result = await agent.run('Complex research task');
if (result.degraded) {
console.log(`Stopped early: ${result.stopReason}`);
console.log(`Partial results: ${result.content}`);
}

MIT License, Hijrah Assalam