🛡️ LedgerGuard

This project was made for the Cursor x Briefcase Hackathon 2026 as a robust, high-confidence AI harness designed to process and interpret financial data (like invoices and bank feeds) with maximum reliability. It features a sophisticated seven-layer safety and validation pipeline, ensuring that large language models can be safely deployed in high-stakes financial environments.

7-Layer Guard Pipeline

Invariant evaluation and execution graph that processes incoming requests through seven distinct layers of security, validation, and deterministic checking.

Layer	Mechanism	Type	Budget	Description
L1	HTML/Hidden-Text Sanitization	Deterministic	<10ms	Strips adversarial payloads concealed in documents.
L2	Llama Prompt Guard 2 (86M)	Local ML (CPU)	~100ms	Detects indirect prompt injections via state-of-the-art classifier.
L3	Pydantic AI Extraction	LLM + Retry	~1s	Strict structured JSON extraction with self-correction.
L4	Dual-LLM Cross-Verification	LLM	~2s	Employs a secondary LLM (e.g., Claude) to verify the primary's work.
L5	Deterministic Policy Checks	Deterministic	<10ms	Domain-specific logic (e.g., verifying if an entity is dissolved).
L6	OpenAI Guardrails	LLM	~500ms	Checks final actions for safety and alignment before execution.
L7	Confidence-Based Routing	Deterministic	<1ms	Decides whether to auto-execute or escalate to a human operator.

Technology Stack

Backend / Agent Framework: Python 3.11, FastAPI, Pydantic AI v1, LangGraph v1.
Frontend / Client: Next.js 16, TypeScript, Tailwind CSS v4, Vercel AI SDK 6, Framer Motion.
Safety & ML: HuggingFace Transformers (Prompt Guard), OpenAI API, Anthropic API.
Database: SQLModel + SQLite for zero-ops, immutable audit logging.
External Integrations: Mock Xero Ledger, UK Companies House REST client, Specter MCP.

Key Features

1. Dual-Pane Validation UI

A real-time, side-by-side comparison interface demonstrating "Guardrails OFF vs ON," where users can watch the seven-layer pipeline catch and neutralize adversarial inputs live.

2. Confidence-Aware Queue

A "Human-Out-Of-The-Loop" (HOOTL) inbox that automatically processes mundane financial transactions but flags edge cases or low-confidence predictions for human review.

3. Comprehensive Evaluation Harness

A built-in eval_runner capable of scoring the agent against happy paths, edge cases, and active adversarial attacks, generating actionable reliability metrics like Expected Calibration Error (ECE).

4. Zero-Ops Audit Trail

Every decision made by the agent, along with confidence scores and fired guard layers, is immutably logged to an SQLite database via a custom @audit_logged decorator.

Quick Start

Backend Environment

# Set up virtual environment
python -m venv .venv && source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Configure environment variables
cp .env.example .env

# Run the API server
python -m uvicorn app.main:app --reload --port 8000

Frontend Application

cd frontend
npm install
npm run dev

The application is served at http://localhost:3000.

/: The Dual-pane demonstration.
/queue: The confidence-aware operator queue.
/evals: Real-time evaluation scorecard.

Project Structure Highlights

agent/
├── core/                # Domain-invariant engine (Graph, Guards, Audit, Evals)
├── domain/              # Reference implementation: UK Invoice Processing
└── domain_bank_txn/     # Reference implementation: Bank Feed Categorization
app/                     # FastAPI App & Vercel UI Message Stream Protocol
frontend/                # Next.js 16 App Router UI

Name		Name	Last commit message	Last commit date
Latest commit History 34 Commits
.claude		.claude
.cursor		.cursor
agent		agent
app		app
data		data
docs		docs
frontend		frontend
mcp		mcp
scripts		scripts
src		src
tests		tests
.env.example		.env.example
.gitignore		.gitignore
.python-version		.python-version
AGENTS.md		AGENTS.md
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🛡️ LedgerGuard

7-Layer Guard Pipeline

Technology Stack

Key Features

1. Dual-Pane Validation UI

2. Confidence-Aware Queue

3. Comprehensive Evaluation Harness

4. Zero-Ops Audit Trail

Quick Start

Backend Environment

Frontend Application

Project Structure Highlights

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

🛡️ LedgerGuard

7-Layer Guard Pipeline

Technology Stack

Key Features

1. Dual-Pane Validation UI

2. Confidence-Aware Queue

3. Comprehensive Evaluation Harness

4. Zero-Ops Audit Trail

Quick Start

Backend Environment

Frontend Application

Project Structure Highlights

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages