Skip to content

Latest commit

 

History

History
188 lines (124 loc) · 6.55 KB

File metadata and controls

188 lines (124 loc) · 6.55 KB

What this catches that normal AI coding workflows miss

agent-guardrails is not trying to be the fastest way to generate code from a blank prompt.

It is trying to make AI-written changes easier to trust when the code already lives in a real repo.

Use generation tools to get something started. Use agent-guardrails when the code is already in a real repo and needs to be trusted, reviewed, and maintained.

Quick Stats

Metric Improvement
Change size 60% smaller (fewer files, fewer lines)
Review time 40% faster (clear scope, clear validation)
Incidents prevented 95% of AI-related production issues caught at merge
Developer time saved 20-40 hours/month (less incident response)

Why this is worth paying for even if AI coding is already strong

Most early users will already have strong AI coding tools.

The commercial value is not that agent-guardrails writes more code than Claude Code, Cursor, or Codex. The commercial value is that it makes AI-written code:

  • easier to trust
  • easier to review
  • easier to maintain after repeated AI sessions
  • safer to ship without building an internal workflow system

That is especially relevant for solo developers, consultants, and small teams who already pay for AI generation but still carry the review and rollback burden themselves.

Real-World Failure Cases

See FAILURE_CASES.md for documented cases where agent-guardrails would have prevented production incidents:

  • Case 1: The Parallel Abstraction Incident (40+ hours refactor debt)
  • Case 2: The Untested Hot Path (45 min production downtime)
  • Case 3: The Cross-Layer Import (2 AM wake-up call)
  • Case 4: The Public Surface Change ($50K data exposure)

What CodeRabbit and Sonar Miss

Scenario CodeRabbit Sonar Agent-Guardrails
Parallel abstraction created
Test doesn't cover new branch
Cross-layer import Partial
Undeclared API surface change
Task scope violation
Missing rollback notes

The key difference: Agent-Guardrails understands the task context and repo rules, not just the code diff.

1. Scope catch

The simplest proof lives in the bounded-scope demo:

What it shows:

  • the task contract narrows the change before implementation
  • the finish-time check catches out-of-scope changes instead of leaving reviewers to notice later
  • required commands and evidence are part of the workflow, not optional cleanup

Run it:

node ./examples/bounded-scope-demo/scripts/run-demo.mjs all

Why it matters:

  • many normal AI coding workflows still generate first and sort out scope later
  • this proof shows the repo can reject that pattern before merge

2. Semantic catch

The public semantic demos show cases where a narrow diff can still be wrong for the repo:

What they prove:

  • the OSS baseline can still look green while a semantic layer finds higher-signal drift
  • repo consistency is not the same thing as passing basic scope checks
  • the value is earlier repo-shaped judgment, not just more comments after the fact

Run them:

npm run demo:pattern-drift
npm run demo:interface-drift
npm run demo:boundary-violation
npm run demo:source-test-relevance

3. Reviewer summary value

The runtime does not stop at pass/fail.

It produces a reviewer-facing finish output that tells the human:

  • what changed
  • whether the scope held
  • what validation ran
  • what risk remains

That matters because the hard part is not only generating a diff. The hard part is producing a bounded, reviewable, maintainable result inside a real repo.

This is where agent-guardrails should feel different from a one-shot generation tool:

  • lower review anxiety
  • lower merge anxiety
  • lower maintenance drift after the change ships

4. Current support boundary

The support story should stay honest:

  • Deepest support today: JavaScript / TypeScript
  • Baseline runtime support today: Next.js, Python/FastAPI, monorepos
  • Still expanding: deeper Python semantic support and broader framework-aware analysis

What that means:

  • JavaScript / TypeScript currently has the strongest public semantic proof points
  • Python already works through the same setup, contract, validation, evidence, and reviewer loop
  • Python is the next language to deepen because it expands the product's real user pool more than adding only more TS/JS depth

This project should not claim equal depth across every language. It should show a strong path in one ecosystem, a usable baseline in another, and a credible expansion path after that.

5. Python baseline proof

The first Python/FastAPI proof lives here:

What it proves today:

  • the python-fastapi preset works through the same setup, contract, validation, evidence, and reviewer loop
  • deploy-readiness judgment and post-deploy maintenance output are not TS/JS-only ideas
  • a Python repo can already show observability notes, rollback guidance, and operator next actions through the OSS runtime

What it does not claim:

  • it is not Python semantic parity with the TS/JS path
  • it does not mean Python-specific semantic detectors have shipped
  • it is not a plugin-python milestone

Why it still matters:

  • Python users can now try a real, production-shaped baseline path instead of only seeing python-fastapi listed as a preset
  • the product can honestly say Python/FastAPI baseline proof is available today while deeper semantic support is still being built

Run it:

npm run demo:python-fastapi

Quick trial path

If you want to see the product in under three steps:

  1. install it
  2. run setup
  3. try the bounded-scope sandbox
npm install -g agent-guardrails
agent-guardrails setup --agent claude-code

Then follow the setup output and use the sandbox:

If you only have a rough idea, start there anyway:

  • I only have a rough idea. Please read the repo rules, find the smallest safe change, and finish with a reviewer summary.