TheGhulam/LedgerGuard

🛡️ LedgerGuard

License: MIT · Python 3.11+ · Next.js 16 · FastAPI · Pydantic · Agentic Engineering · RAG Architecture · LLM Guardrails · Pydantic Validation

LedgerGuard was built for the Cursor x Briefcase Hackathon 2026. It is an AI harness that processes and interprets financial data, such as invoices and bank feeds, with an emphasis on reliability and calibrated confidence. Every request passes through a seven-layer safety and validation pipeline, so that large language models can be deployed more safely in high-stakes financial environments.

7-Layer Guard Pipeline

The core is a domain-invariant evaluation and execution graph. It passes each incoming request through seven distinct layers of security, validation, and deterministic checking.

| Layer | Mechanism | Type | Budget | Description |
|-------|-----------|------|--------|-------------|
| L1 | HTML/Hidden-Text Sanitization | Deterministic | <10ms | Strips adversarial payloads concealed in documents. |
| L2 | Llama Prompt Guard 2 (86M) | Local ML (CPU) | ~100ms | Detects indirect prompt injections via state-of-the-art classifier. |
| L3 | Pydantic AI Extraction | LLM + Retry | ~1s | Strict structured JSON extraction with self-correction. |
| L4 | Dual-LLM Cross-Verification | LLM | ~2s | Employs a secondary LLM (e.g., Claude) to verify the primary's work. |
| L5 | Deterministic Policy Checks | Deterministic | <10ms | Domain-specific logic (e.g., verifying if an entity is dissolved). |
| L6 | OpenAI Guardrails | LLM | ~500ms | Checks final actions for safety and alignment before execution. |
| L7 | Confidence-Based Routing | Deterministic | <1ms | Decides whether to auto-execute or escalate to a human operator. |
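
The layered flow above can be pictured as a list of guard functions that run in order and stop at the first failure. The following is a minimal sketch and not the project's actual graph: `Request`, `GuardResult`, `sanitize_hidden_text`, `policy_check`, and `run_pipeline` are hypothetical names, and the two rules shown are simple stand-ins for L1 and L5 only.

```python
import re
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class GuardResult:
    layer: str
    passed: bool
    detail: str = ""


@dataclass
class Request:
    text: str
    # Every guard that ran, in order, for later auditing.
    fired: list[GuardResult] = field(default_factory=list)


# A guard inspects (and may rewrite) the request, then reports a result.
Guard = Callable[[Request], GuardResult]


def sanitize_hidden_text(req: Request) -> GuardResult:
    """L1-style step (assumed rules): drop HTML comments, then tags."""
    cleaned = re.sub(r"<!--.*?-->", "", req.text, flags=re.DOTALL)
    cleaned = re.sub(r"<[^>]+>", "", cleaned)
    changed = cleaned != req.text
    req.text = cleaned
    return GuardResult("L1", True, "stripped markup" if changed else "clean")


def policy_check(req: Request) -> GuardResult:
    """L5-style step (placeholder rule): block one known injection phrase."""
    blocked = "ignore previous instructions" in req.text.lower()
    return GuardResult("L5", not blocked, "blocked phrase" if blocked else "ok")


def run_pipeline(req: Request, guards: list[Guard]) -> bool:
    """Run guards in order; return False as soon as one fails."""
    for guard in guards:
        result = guard(req)
        req.fired.append(result)
        if not result.passed:
            return False
    return True
```

Under these assumptions, a payload hidden in an HTML comment is removed by the first guard before the policy check sees it, while the same phrase in visible text is blocked by the second guard.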

Technology Stack

  • Backend / Agent Framework: Python 3.11, FastAPI, Pydantic AI v1, LangGraph v1.
  • Frontend / Client: Next.js 16, TypeScript, Tailwind CSS v4, Vercel AI SDK 6, Framer Motion.
  • Safety & ML: HuggingFace Transformers (Prompt Guard), OpenAI API, Anthropic API.
  • Database: SQLModel + SQLite for zero-ops, immutable audit logging.
  • External Integrations: Mock Xero Ledger, UK Companies House REST client, Specter MCP.

Key Features

1. Dual-Pane Validation UI

A real-time, side-by-side comparison interface demonstrating "Guardrails OFF vs ON," where users can watch the seven-layer pipeline catch and neutralize adversarial inputs live.

2. Confidence-Aware Queue

A "Human-Out-Of-The-Loop" (HOOTL) inbox that automatically processes mundane financial transactions but flags edge cases or low-confidence predictions for human review.
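
As an illustration of the routing decision behind this queue, here is a minimal sketch. The `Decision` and `Route` types, the `route` function, and the 0.9 threshold are assumptions made for the example; the README does not state the project's actual threshold or data model.

```python
from dataclasses import dataclass
from enum import Enum


class Route(str, Enum):
    AUTO_EXECUTE = "auto_execute"
    HUMAN_REVIEW = "human_review"


@dataclass(frozen=True)
class Decision:
    confidence: float    # model confidence in [0, 1]
    guards_passed: bool  # True only if every earlier layer passed


def route(decision: Decision, threshold: float = 0.9) -> Route:
    """Auto-execute only when all guards passed and confidence is high enough."""
    if not 0.0 <= decision.confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if decision.guards_passed and decision.confidence >= threshold:
        return Route.AUTO_EXECUTE
    # Anything uncertain or flagged goes to the operator queue.
    return Route.HUMAN_REVIEW
```

Defaulting to human review means that a failed guard or a low score never results in automatic execution.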

3. Comprehensive Evaluation Harness

A built-in eval_runner capable of scoring the agent against happy paths, edge cases, and active adversarial attacks, generating actionable reliability metrics like Expected Calibration Error (ECE).
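
For reference, Expected Calibration Error groups predictions into confidence bins and takes the weighted average gap between each bin's accuracy and its mean confidence. The sketch below uses equal-width bins; the function name, signature, and default of 10 bins are assumptions and may differ from what `eval_runner` does.

```python
def expected_calibration_error(
    confidences: list[float],
    correct: list[bool],
    n_bins: int = 10,
) -> float:
    """ECE = sum over bins of (bin size / n) * |accuracy - mean confidence|."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    # Equal-width bins over [0, 1]; a confidence of exactly 1.0 goes in the last bin.
    bins: list[list[tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece
```

A value near 0 means the stated confidence tracks the observed accuracy closely; larger values indicate over- or under-confidence.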

4. Zero-Ops Audit Trail

Every decision made by the agent, along with confidence scores and fired guard layers, is immutably logged to an SQLite database via a custom @audit_logged decorator.
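
One way such a decorator could work is sketched below using only the standard library `sqlite3` module rather than SQLModel. The `audit_log` table, its columns, the `make_audit_logger` factory, and the append-only triggers are assumptions for illustration, not the project's actual schema or implementation.

```python
import functools
import json
import sqlite3
import time


def make_audit_logger(conn: sqlite3.Connection):
    """Create the audit table (hypothetical schema) and return a decorator."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS audit_log ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "ts REAL NOT NULL, action TEXT NOT NULL, payload TEXT NOT NULL)"
    )
    # Triggers reject UPDATE and DELETE, so rows can only be appended.
    for op in ("UPDATE", "DELETE"):
        conn.execute(
            f"CREATE TRIGGER IF NOT EXISTS audit_no_{op.lower()} "
            f"BEFORE {op} ON audit_log "
            "BEGIN SELECT RAISE(ABORT, 'audit_log is append-only'); END"
        )
    conn.commit()

    def audit_logged(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            # Store the decision as JSON next to the function name and a timestamp.
            conn.execute(
                "INSERT INTO audit_log (ts, action, payload) VALUES (?, ?, ?)",
                (time.time(), func.__name__, json.dumps(result, default=str)),
            )
            conn.commit()
            return result

        return wrapper

    return audit_logged
```

A decorated function would return its decision as usual, for example a dict with a confidence score and the list of fired layers, and each call would add one row to the log.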

Quick Start

Backend Environment

# Set up virtual environment
python -m venv .venv && source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Configure environment variables
cp .env.example .env

# Run the API server
python -m uvicorn app.main:app --reload --port 8000

Frontend Application

cd frontend
npm install
npm run dev

The application is served at http://localhost:3000.

  • /: The Dual-pane demonstration.
  • /queue: The confidence-aware operator queue.
  • /evals: Real-time evaluation scorecard.

Project Structure Highlights

agent/
├── core/                # Domain-invariant engine (Graph, Guards, Audit, Evals)
├── domain/              # Reference implementation: UK Invoice Processing
└── domain_bank_txn/     # Reference implementation: Bank Feed Categorization
app/                     # FastAPI App & Vercel UI Message Stream Protocol
frontend/                # Next.js 16 App Router UI
