# asillios-limiter

avoid surprise bills from the openai or anthropic apis when you offer free tiers.
token-based rate limiting for llm apps. tracks usage per user, supports openai and anthropic, includes cost tracking.
asillios-limiter is a typescript library for controlling and monitoring api usage in apps that use large language models like claude or gpt. when you build an app that lets users interact with an llm, you're paying for every token they use. this library wraps your llm calls and automatically tracks how many tokens each user consumes, letting you set limits and get alerts when users approach their quotas. it works with both anthropic and openai response formats out of the box.
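for reference, the two providers report token usage in different shapes (openai chat completions return `usage.prompt_tokens` / `completion_tokens` / `total_tokens`; anthropic messages return `usage.input_tokens` / `output_tokens`). normalizing them might look like this sketch — illustrative only, with simplified types, not the library's actual code:

```ts
// simplified shapes of the two providers' usage objects
type OpenAIStyle = { usage?: { prompt_tokens: number; completion_tokens: number; total_tokens: number } };
type AnthropicStyle = { usage?: { input_tokens: number; output_tokens: number } };

// return total tokens consumed, whichever shape we got
function extractTokens(res: OpenAIStyle | AnthropicStyle): number {
  const u = res.usage as any;
  if (!u) return 0;
  if (typeof u.total_tokens === "number") return u.total_tokens; // openai
  if (typeof u.input_tokens === "number") return u.input_tokens + u.output_tokens; // anthropic
  return 0;
}
```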
contributions are welcome; see the contributing guidelines (thanks to @Savy011 and @djsurt).
## install

```sh
npm install asillios-limiter
```

## quick start

```ts
import { createLimiter } from "asillios-limiter";

const limiter = createLimiter({
  limit: 100000, // tokens per window
  window: 60 * 60 * 1000, // 1 hour
  onThreshold: (userId, percent) => {
    console.log(`user ${userId} hit ${percent}%`);
  },
});

// wrap your llm calls - tokens tracked automatically
const response = await limiter.wrap("user-123", () =>
  openai.chat.completions.create({
    model: "gpt-4",
    messages: [{ role: "user", content: "hello" }],
  })
);

// check limits and get stats
const canProceed = await limiter.check("user-123");
const remaining = await limiter.getRemainingTokens("user-123");
const stats = await limiter.stats("user-123");
```

## rolling window

uses a rolling window instead of fixed reset times, for smoother limiting that's harder to game.
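the rolling-window idea can be sketched as follows (illustrative only, not the library's actual internals): keep timestamped token entries and sum only those inside the trailing window, so usage from an hour ago ages out gradually instead of all at once at a reset boundary.

```ts
type Entry = { at: number; tokens: number };

// minimal rolling-window token counter
class RollingWindow {
  private entries: Entry[] = [];
  constructor(private windowMs: number) {}

  // record tokens consumed at time `now` (ms since epoch)
  add(tokens: number, now: number): void {
    this.entries.push({ at: now, tokens });
  }

  // total tokens consumed within the trailing window ending at `now`
  used(now: number): number {
    this.entries = this.entries.filter((e) => now - e.at < this.windowMs);
    return this.entries.reduce((sum, e) => sum + e.tokens, 0);
  }
}
```

with a fixed reset, a user could burn the whole quota just before the reset and again just after; the rolling sum above never allows more than the limit in any one-window span.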
## multiple limits

enforce multiple limits simultaneously (e.g., per-minute AND per-day):

```ts
const limiter = createLimiter({
  limits: [
    { tokens: 10000, window: 60 * 1000 }, // 10k per minute
    { tokens: 100000, window: 24 * 60 * 60 * 1000 }, // 100k per day
  ],
});
```

## burst allowance

let users temporarily exceed limits:
```ts
const limiter = createLimiter({
  limit: 50000,
  window: 60 * 60 * 1000,
  burstPercent: 20, // allow 20% overage
});
```

## cost tracking

track estimated api costs alongside tokens:
```ts
const limiter = createLimiter({
  limit: 100000,
  window: 60 * 60 * 1000,
  trackCost: true,
  costLimit: 5.0, // $5 limit
});

// pass model name to calculate cost
const response = await limiter.wrap(
  "user-123",
  () => anthropic.messages.create({ ... }),
  { model: "claude-3-sonnet" }
);

const stats = await limiter.stats("user-123");
// { tokensUsed, remaining, costUsed, costRemaining, ... }
```

supported models: gpt-4, gpt-4-turbo, gpt-4o, gpt-4o-mini, gpt-3.5-turbo, claude-3-opus, claude-3-sonnet, claude-3-haiku, claude-sonnet-4
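cost estimation is essentially tokens × a per-model price. a minimal sketch of the idea, using made-up illustrative rates (these are assumptions, not current prices — always check your provider's pricing page):

```ts
// ILLUSTRATIVE per-1k-token prices in USD, not real rates
const PRICES: Record<string, { inPer1k: number; outPer1k: number }> = {
  "gpt-4": { inPer1k: 0.03, outPer1k: 0.06 },
  "claude-3-sonnet": { inPer1k: 0.003, outPer1k: 0.015 },
};

// estimate dollar cost from prompt and completion token counts
function estimateCost(model: string, promptTokens: number, completionTokens: number): number {
  const p = PRICES[model];
  if (!p) throw new Error(`unknown model: ${model}`);
  return (promptTokens / 1000) * p.inPer1k + (completionTokens / 1000) * p.outPer1k;
}
```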
## middleware

### express

```ts
import { createLimiter, expressMiddleware } from "asillios-limiter";

const limiter = createLimiter({ limit: 50000, window: 3600000 });

app.use(
  "/api/chat",
  expressMiddleware(limiter, (req) => req.user?.id ?? null)
);
```

### next.js

```ts
import { createLimiter, nextMiddleware } from "asillios-limiter";

const limiter = createLimiter({ limit: 50000, window: 3600000 });
const checkLimit = nextMiddleware(limiter, (req) => req.headers.get("x-user-id"));

export async function POST(req: Request) {
  const { allowed, response } = await checkLimit(req);
  if (!allowed) return response;
  // proceed with llm call...
}
```

## redis storage

built-in redis adapter (bring your own client):
```ts
import { createLimiter, RedisStorage } from "asillios-limiter";
import Redis from "ioredis";

const redis = new Redis();
const limiter = createLimiter({
  limit: 100000,
  window: 60 * 60 * 1000,
  storage: new RedisStorage(redis),
});
```

## config

```ts
interface LimiterConfig {
  // simple: single limit
  limit?: number;
  window?: number;

  // advanced: multiple limits
  limits?: { tokens: number; window: number }[];

  // burst allowance percentage
  burstPercent?: number;

  // cost tracking
  trackCost?: boolean;
  costLimit?: number;

  // storage and callbacks
  storage?: StorageAdapter;
  onThreshold?: (userId: string, percent: number) => void;
  thresholds?: number[]; // default: [80, 90, 100]
}
```

## api

### limiter.wrap()

wraps an async function and tracks tokens from its response.
```ts
await limiter.wrap("user-123", () => llmCall(), {
  throwOnLimit: true, // throw if over limit
  model: "gpt-4", // for cost calculation
});
```

### limiter.check()

returns true if the user is within all limits.
### limiter.getRemainingTokens()

returns remaining tokens for the primary limit.
### limiter.stats()

returns full usage stats:

```ts
{
  tokensUsed: number;
  remaining: number;
  resetAt: Date;
  percentUsed: number;
  costUsed?: number; // if trackCost enabled
  costRemaining?: number; // if costLimit set
}
```

the limiter also exposes methods to manually add tokens (useful for streaming) and to reset a user's usage.
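the `onThreshold` callback plus the `thresholds` list (default `[80, 90, 100]`) implies firing each alert once as usage crosses a threshold. a sketch of that crossing check — illustrative, not the library's internals:

```ts
// thresholds that were crossed when usage moved from prevPercent to newPercent
function crossedThresholds(
  prevPercent: number,
  newPercent: number,
  thresholds: number[] = [80, 90, 100]
): number[] {
  // a threshold fires when usage moves from below it to at-or-above it
  return thresholds.filter((t) => prevPercent < t && newPercent >= t);
}
```

e.g. a jump from 75% to 92% fires the 80 and 90 alerts (but not 100), and a later move from 92% to 95% fires nothing, so each alert is delivered at most once per window.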
## license

mit
