agent-guardrails is not trying to be the fastest way to generate code from a blank prompt.
It is trying to make AI-written changes easier to trust when the code already lives in a real repo.
Use generation tools to get something started.
Use agent-guardrails once that code lands in a real repo and needs to be trusted, reviewed, and maintained.
| Metric | Improvement |
|---|---|
| Change size | 60% smaller (fewer files, fewer lines) |
| Review time | 40% faster (clear scope, clear validation) |
| Incidents prevented | 95% of AI-related production issues caught at merge |
| Developer time saved | 20-40 hours/month (less incident response) |
Most early users will already have strong AI coding tools.
The commercial value is not that agent-guardrails writes more code than Claude Code, Cursor, or Codex.
The commercial value is that it makes AI-written code:
- easier to trust
- easier to review
- easier to maintain after repeated AI sessions
- safer to ship without building an internal workflow system
That is especially relevant for solo developers, consultants, and small teams who already pay for AI generation but still carry the review and rollback burden themselves.
See FAILURE_CASES.md for documented cases where agent-guardrails would have prevented production incidents:
- Case 1: The Parallel Abstraction Incident (40+ hours refactor debt)
- Case 2: The Untested Hot Path (45 min production downtime)
- Case 3: The Cross-Layer Import (2 AM wake-up call)
- Case 4: The Public Surface Change ($50K data exposure)
| Scenario | CodeRabbit | Sonar | Agent-Guardrails |
|---|---|---|---|
| Parallel abstraction created | ❌ | ❌ | ✅ |
| Test doesn't cover new branch | ❌ | ❌ | ✅ |
| Cross-layer import | ❌ | Partial | ✅ |
| Undeclared API surface change | ❌ | ❌ | ✅ |
| Task scope violation | ❌ | ❌ | ✅ |
| Missing rollback notes | ❌ | ❌ | ✅ |
The key difference: Agent-Guardrails understands the task context and repo rules, not just the code diff.
The simplest proof lives in the bounded-scope demo at examples/bounded-scope-demo.
What it shows:
- the task contract narrows the change before implementation
- the finish-time check catches out-of-scope changes instead of leaving reviewers to notice later
- required commands and evidence are part of the workflow, not optional cleanup
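To make the first point concrete, a task contract for this demo might look roughly like the sketch below. This is a hypothetical illustration only; the field names and file shape are assumptions, not the shipped schema.

```json
{
  "task": "Fix the session-timeout bug in login",
  "scope": ["src/auth/"],
  "forbidden": ["src/billing/", "public API surface"],
  "required_commands": ["npm test", "npm run lint"],
  "evidence": "paste command output into the finish summary"
}
```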
Run it:

```shell
node ./examples/bounded-scope-demo/scripts/run-demo.mjs all
```

Why it matters:
- many normal AI coding workflows still generate first and sort out scope later
- this proof shows the repo can reject that pattern before merge
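The underlying idea of a finish-time scope check can be sketched in a few lines. This is an illustrative stand-in, not the shipped agent-guardrails implementation; the function name and the hard-coded file lists are assumptions for the example.

```javascript
// Minimal sketch of a finish-time scope check (illustrative only):
// flag any changed file that falls outside the paths the task contract declared.
function scopeViolations(allowedPrefixes, changedFiles) {
  return changedFiles.filter(
    (file) => !allowedPrefixes.some((prefix) => file.startsWith(prefix))
  );
}

// Stand-ins for the contract scope and for `git diff --name-only main`:
const contractScope = ["src/auth/"];
const changed = ["src/auth/login.js", "src/billing/invoice.js"];

const violations = scopeViolations(contractScope, changed);
console.log(violations); // → [ 'src/billing/invoice.js' ]
```

A real check would fail the merge when the violations list is non-empty, instead of leaving the reviewer to spot the stray file.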
The public semantic demos show cases where a narrow diff can still be wrong for the repo:
- examples/pattern-drift-demo
- examples/interface-drift-demo
- examples/boundary-violation-demo
- examples/source-test-relevance-demo
What they prove:
- the OSS baseline can still look green while a semantic layer finds higher-signal drift
- repo consistency is not the same thing as passing basic scope checks
- the value is earlier repo-shaped judgment, not just more comments after the fact
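As a concrete illustration of the pattern-drift idea, here is a hypothetical example (not taken from the demos themselves): a small, in-scope diff that still introduces a parallel abstraction the repo does not want.

```javascript
// Established repo helper: the one way this codebase formats prices.
function formatCurrency(cents) {
  return `$${(cents / 100).toFixed(2)}`;
}

// Hypothetical AI-written addition in another file: narrow, green in CI,
// but it duplicates the helper with subtly different behavior.
function displayPrice(cents) {
  return "$" + Math.round(cents / 100); // silently drops the cents
}

console.log(formatCurrency(1999)); // → $19.99
console.log(displayPrice(1999));   // → $20
```

A diff-scoped review can pass both functions; only repo-shaped judgment notices that the second one should not exist.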
Run them:
```shell
npm run demo:pattern-drift
npm run demo:interface-drift
npm run demo:boundary-violation
npm run demo:source-test-relevance
```

The runtime does not stop at pass/fail.
It produces a reviewer-facing finish output that tells the human:
- what changed
- whether the scope held
- what validation ran
- what risk remains
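The exact output format is not shown here, so the following is only an illustrative shape of what such a finish summary could contain; every detail below is invented for the example.

```text
Finish summary (illustrative shape, not the runtime's exact format)
  Changed:    src/auth/login.ts, src/auth/login.test.ts (2 files)
  Scope:      held (no files outside the task contract)
  Validation: npm test (14 passed), npm run lint (clean)
  Risk:       session-timeout branch has no direct test; rollback is one revert
```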
That matters because the hard part is not only generating a diff. The hard part is producing a bounded, reviewable, maintainable result inside a real repo.
This is where agent-guardrails should feel different from a one-shot generation tool:
- lower review anxiety
- lower merge anxiety
- lower maintenance drift after the change ships
The support story should stay honest:
- Deepest support today: JavaScript / TypeScript
- Baseline runtime support today: Next.js, Python/FastAPI, monorepos
- Still expanding: deeper Python semantic support and broader framework-aware analysis
What that means:
- JavaScript / TypeScript currently has the strongest public semantic proof points
- Python already works through the same setup, contract, validation, evidence, and reviewer loop
- Python is the next language to deepen because it expands the product's real user pool more than additional TS/JS depth would
This project should not claim equal depth across every language. It should show a strong path in one ecosystem, a usable baseline in another, and a credible expansion path after that.
The first Python/FastAPI proof ships as the `demo:python-fastapi` example.
What it proves today:
- the `python-fastapi` preset works through the same setup, contract, validation, evidence, and reviewer loop
- deploy-readiness judgment and post-deploy maintenance output are not TS/JS-only ideas
- a Python repo can already show observability notes, rollback guidance, and operator next actions through the OSS runtime
What it does not claim:
- it is not Python semantic parity with the TS/JS path
- it does not mean Python-specific semantic detectors have shipped
- it is not a `plugin-python` milestone
Why it still matters:
- Python users can now try a real, production-shaped baseline path instead of only seeing `python-fastapi` listed as a preset
- the product can honestly say Python/FastAPI baseline proof is available today while deeper semantic support is still being built
Run it:
```shell
npm run demo:python-fastapi
```

If you want to see the product in under three steps:
- install it
- run `setup`
- try the bounded-scope sandbox

```shell
npm install -g agent-guardrails
agent-guardrails setup --agent claude-code
```

Then follow the setup output and use the sandbox:
If you only have a rough idea, start there anyway:
> I only have a rough idea. Please read the repo rules, find the smallest safe change, and finish with a reviewer summary.