feat(deploy): Phase 3 — per-IP rate limit on canonical demo proxy by blove · Pull Request #315 · cacheplane/angular-agent-framework

blove · 2026-05-14T05:31:22Z

Summary

Phase 3 of the canonical-demo deployment plan. Caps anonymous OpenAI spend on `demo.cacheplane.ai` by limiting `POST /api/threads/*/runs/stream` to 10 requests per minute per IP. Backed by Neon Postgres (already provisioned for this team) — not Upstash.

Architecture

`scripts/rate-limit.ts` — `checkRateLimit(ip)` helper runs DELETE+INSERT+COUNT against a self-pruning `rate_limit_events` table. Uses `@neondatabase/serverless` HTTP driver (no connection pool needed for Vercel serverless).
`scripts/langgraph-proxy.ts` — new `ProxyConfig.checkRateLimit` hook. Gates only `POST /api/threads/*/runs/stream` (the path that costs OpenAI tokens). Other endpoints bypass.
`scripts/demo-middleware.ts` — wires the hook for the demo wrapper. `scripts/examples-middleware.ts` (cockpit-examples) intentionally unchanged.
`migrations/0001_rate_limit_events.sql` — idempotent `CREATE TABLE IF NOT EXISTS` + composite index on `(ip, ts DESC)`.

Fail-open: if `DATABASE_URL` is unset at module load or Postgres throws mid-request, the proxy logs a warning and allows the request through. Marketing demo > strict protection during a rare dependency outage.
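The helper and its fail-open behaviour can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/rate-limit.ts`: the `Sql` type, the `makeCheckRateLimit` factory, the result shape, and the exact SQL text are assumptions. Only the table name, the `ip` and `ts` columns, the limit of 10 per 60 seconds, and the DELETE+INSERT+COUNT order come from this PR.

```typescript
// Hypothetical type for a tagged-template SQL function (an assumption, modelled
// loosely on what an HTTP SQL driver exposes).
type Sql = (
  strings: TemplateStringsArray,
  ...values: unknown[]
) => Promise<Array<Record<string, unknown>>>;

interface RateLimitResult {
  allowed: boolean;
  count: number;
  retryAfterSeconds: number;
}

const LIMIT = 10;
const WINDOW_SECONDS = 60;
// Passed as a bound parameter and cast with ::interval, not embedded in a quoted literal.
const WINDOW_INTERVAL = `${WINDOW_SECONDS} seconds`;

// Factory so the SQL function can be swapped for a fake in tests (hypothetical design).
function makeCheckRateLimit(sql: Sql | null) {
  return async function checkRateLimit(ip: string): Promise<RateLimitResult> {
    const failOpen: RateLimitResult = { allowed: true, count: 0, retryAfterSeconds: 0 };
    if (!sql) return failOpen; // no database configured: behave as a no-op
    try {
      // 1. Prune this IP's events that fell out of the window.
      await sql`DELETE FROM rate_limit_events WHERE ip = ${ip} AND ts < now() - ${WINDOW_INTERVAL}::interval`;
      // 2. Record the current request.
      await sql`INSERT INTO rate_limit_events (ip, ts) VALUES (${ip}, now())`;
      // 3. Count events still inside the window, including the one just inserted.
      const rows = await sql`SELECT count(*)::int AS count FROM rate_limit_events WHERE ip = ${ip} AND ts > now() - ${WINDOW_INTERVAL}::interval`;
      const count = Number(rows[0]?.count ?? 0);
      return { allowed: count <= LIMIT, count, retryAfterSeconds: WINDOW_SECONDS };
    } catch (err) {
      // Fail open: a database error must not take the demo down.
      console.warn("rate limit check failed, allowing request", err);
      return failOpen;
    }
  };
}
```

In production the `sql` argument would presumably be the tagged-template function returned by `neon(process.env.DATABASE_URL)` from `@neondatabase/serverless`, with `null` passed when the variable is unset.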

External setup (already done)

✅ Neon DB provisioned (reuses the existing `cacheplane` Vercel-Neon integration store)
✅ `DATABASE_URL` auto-set on `cacheplane-demo` project by the integration
✅ Migration applied to the Neon DB via `@neondatabase/serverless`

Spec & Plan

`docs/superpowers/specs/2026-05-13-canonical-demo-rate-limit-design.md`
`docs/superpowers/plans/2026-05-13-canonical-demo-rate-limit.md`

Test plan

5 unit tests in `scripts/rate-limit.spec.ts` (no env → noop, under limit, at-limit boundary, over limit, fail-open on SQL throw)
3 new tests in `scripts/langgraph-proxy.spec.ts` (non-gated bypass, gated-allowed, gated-denied with 429+Retry-After)
Bundle includes `@neondatabase` + 3 `rate_limit_events` SQL references
After merge, fire 12 `/runs/stream` requests from one IP — expect first 10 = 200, last 2 = 429 with `Retry-After: 60`
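The post-merge check in the last item could be scripted along these lines. This is a sketch under assumptions: `fireRequests`, `matchesExpectation`, and the `FetchLike` type are hypothetical names, and the request body is a placeholder, since a real `/runs/stream` call needs a valid thread ID and payload that this PR does not show.

```typescript
// Minimal fetch-like signature so the sketch does not depend on DOM or Node typings.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ status: number; headers: { get(name: string): string | null } }>;

interface Attempt {
  status: number;
  retryAfter: string | null;
}

// Send `total` POST requests one after another and record status + Retry-After.
async function fireRequests(fetchFn: FetchLike, url: string, total: number): Promise<Attempt[]> {
  const attempts: Attempt[] = [];
  for (let i = 0; i < total; i++) {
    const res = await fetchFn(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: "{}", // placeholder payload (assumption)
    });
    attempts.push({ status: res.status, retryAfter: res.headers.get("Retry-After") });
  }
  return attempts;
}

// True when the first `limit` attempts returned 200 and the rest 429 with Retry-After: 60.
function matchesExpectation(attempts: Attempt[], limit: number): boolean {
  return attempts.every((a, i) =>
    i < limit ? a.status === 200 : a.status === 429 && a.retryAfter === "60",
  );
}
```

Passing the global `fetch` from Node 18 or later as `fetchFn` with `total = 12` and `limit = 10` would reproduce the manual check, provided all 12 requests land inside the same 60-second window.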

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Caps anonymous OpenAI spend on demo.cacheplane.ai by limiting POST /api/threads/*/runs/stream to 10 requests per minute per IP. Backed by Neon Postgres (already provisioned) instead of Upstash, which saves a vendor. Self-cleaning via inline DELETE+INSERT+COUNT in a single transaction. Fail-open if Postgres is unreachable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…gration

Foundation for Phase 3 per-IP rate limiting on the canonical demo proxy. The migration creates a self-pruning events table (composite index on (ip, ts DESC)). @neondatabase/serverless is the HTTP-fetch driver compatible with Vercel Node serverless functions, so no connection pool is required. The migration will be applied to the existing Neon database in a separate step controlled by the deploying engineer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds a small helper that runs DELETE+INSERT+COUNT against a rate_limit_events table to enforce a sliding-window per-IP rate limit. Uses @neondatabase/serverless (HTTP fetch driver) so it works in Vercel Node serverless functions without a connection pool.

Fail-open by design: if DATABASE_URL is unset at module init or a SQL call throws at runtime, allows the request through and logs a warning. Marketing demo > strict protection during a rare dep outage.

5 unit tests cover: missing env (no-op), below limit, at-limit boundary, over limit (429 + retry-after), SQL throw (fail-open).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds a checkRateLimit hook to ProxyConfig. When configured (only on the demo wrapper), the proxy gates POST /api/threads/{id}/runs/stream requests through the provided hook before forwarding. Denied requests get 429 + Retry-After header + JSON body. Non-gated requests (GET /api/info, POST /api/threads, etc.) bypass the hook entirely; protection lives only on the path that actually burns OpenAI tokens.

3 new unit tests cover: non-gated bypass, gated-allowed forwards, gated-denied returns 429 without calling fetch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
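The gating described in this commit might look roughly like the sketch below. `ProxyConfig.checkRateLimit` is named in the PR, but its signature here, the `gate` and `isGated` helpers, the `GateResponse` shape, and the JSON error body are all assumptions made for illustration.

```typescript
interface RateLimitResult {
  allowed: boolean;
  retryAfterSeconds: number;
}

// Only the hook is modelled; the real ProxyConfig has other fields not shown in this PR.
interface ProxyConfig {
  checkRateLimit?: (ip: string) => Promise<RateLimitResult>;
}

interface GateResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Matches /api/threads/{id}/runs/stream with exactly one path segment as the thread ID.
const GATED_PATH = /^\/api\/threads\/[^/]+\/runs\/stream$/;

function isGated(method: string, path: string): boolean {
  return method.toUpperCase() === "POST" && GATED_PATH.test(path);
}

// Returns a 429 response when the request is denied, or null when the proxy
// should forward it upstream as usual.
async function gate(
  config: ProxyConfig,
  method: string,
  path: string,
  ip: string,
): Promise<GateResponse | null> {
  if (!config.checkRateLimit || !isGated(method, path)) return null; // bypass the hook
  const result = await config.checkRateLimit(ip);
  if (result.allowed) return null;
  return {
    status: 429,
    headers: {
      "Content-Type": "application/json",
      "Retry-After": String(result.retryAfterSeconds),
    },
    body: JSON.stringify({ error: "rate_limited" }), // hypothetical body shape
  };
}
```

Returning `null` for the pass-through case keeps the decision separate from the forwarding logic, so a wrapper that leaves `checkRateLimit` unset never touches the limiter.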

The shared langgraph-proxy factory accepts an optional checkRateLimit hook (added in the previous commit). The demo wrapper now provides it; the examples wrapper stays unset so examples remains unrate-limited (separate decision).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel · 2026-05-14T05:31:27Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
cacheplane	Ready	Preview, Comment	May 14, 2026 5:33am

…lue, not inside a quoted literal (#316)

Phase 3 (#315) introduced a per-IP rate limit that was a silent no-op in production. Symptom: 12 streaming requests in a row all returned 200; rate_limit_events table had 0 rows.

Root cause: the SQL used `interval '${WINDOW_SECONDS} seconds'` inside a tagged-template literal. The @neondatabase/serverless driver substitutes `${...}` placeholders as $N parameters, but parameters cannot appear inside a Postgres string literal. The driver emitted `interval '$2 seconds'` and the planner rejected it with `invalid input syntax for type interval`. The proxy's fail-open catch then allowed the request through.

Fix: build `WINDOW_INTERVAL = '60 seconds'` at module load and splice it as a parameterized value cast to ::interval: `ts < now() - ${WINDOW_INTERVAL}::interval`. That emits `ts < now() - $2::interval`, which Postgres evaluates correctly.

Also added `AND ts > now() - ${WINDOW_INTERVAL}::interval` to the SELECT. The DELETE+SELECT now use the same window boundary so the count can't accidentally include rows that the DELETE didn't yet prune.

Smoke-tested against the live Neon DB:

Request 1: count=1, allowed=true
...
Request 10: count=10, allowed=true
Request 11: count=11, allowed=false ← rate limited
Request 12: count=12, allowed=false

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
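The root cause is easier to see with a toy tagged-template function that numbers its placeholders the way the commit message describes. `toParameterized` is a hypothetical stand-in written for this illustration, not the driver's actual implementation, and the example IP is made up.

```typescript
// Toy tag function: replaces each ${...} with $1, $2, ... and collects the values,
// mimicking how a parameterizing SQL driver builds its query text.
function toParameterized(
  strings: TemplateStringsArray,
  ...values: unknown[]
): { text: string; params: unknown[] } {
  const text = strings.reduce((acc, s, i) => acc + (i === 0 ? "" : `$${i}`) + s, "");
  return { text, params: values };
}

const ip = "203.0.113.7"; // example address from the documentation range
const WINDOW_SECONDS = 60;

// Broken: the placeholder lands inside a quoted SQL literal, so Postgres sees the
// literal text '$2 seconds' instead of a bound parameter.
const broken = toParameterized`DELETE FROM rate_limit_events WHERE ip = ${ip} AND ts < now() - interval '${WINDOW_SECONDS} seconds'`;
console.log(broken.text);
// DELETE FROM rate_limit_events WHERE ip = $1 AND ts < now() - interval '$2 seconds'

// Fixed: the whole interval string is one parameter, cast on the SQL side.
const WINDOW_INTERVAL = `${WINDOW_SECONDS} seconds`;
const fixed = toParameterized`DELETE FROM rate_limit_events WHERE ip = ${ip} AND ts < now() - ${WINDOW_INTERVAL}::interval`;
console.log(fixed.text);
// DELETE FROM rate_limit_events WHERE ip = $1 AND ts < now() - $2::interval
console.log(fixed.params);
// [ '203.0.113.7', '60 seconds' ]
```

Because the fail-open catch swallowed the resulting SQL error, the broken form never surfaced as a failed request, which is why the table stayed empty.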

blove and others added 6 commits May 13, 2026 21:23

docs: Phase 3 plan — per-IP rate limit via Neon

305318d

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel Bot deployed to Preview – cacheplane May 14, 2026 05:33

blove merged commit 6465a98 into main May 14, 2026
16 checks passed

blove mentioned this pull request May 14, 2026

fix(rate-limit): SQL interval parameterization — Phase 3 hotfix #316

Merged

3 tasks

blove merged 6 commits into main from claude/canonical-demo-rate-limit
