lucianareynaud/guardspan

guardspan

LLM compliance checker for regulated financial environments. Built with LangGraph, OpenTelemetry, and FastAPI.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                         FastAPI Gateway                          │
│  POST /check  ·  GET /health  ·  GET /docs                      │
│  • Generates run_id before workflow execution                    │
│  • Produces minimal audit event for blocked/failed executions    │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│                      LangGraph Workflow                          │
│                                                                  │
│  Input Classifier                                                │
│       ↓                                                          │
│  Policy Enforcer ──[block]──→ END                                │
│       ↓ [allow]                                                  │
│  Model Router                                                    │
│       ↓                                                          │
│  LLM Executor ──[error]──→ END                                   │
│       ↓ [success]                                                │
│  Output Validator                                                │
│       ↓                                                          │
│  Audit Logger                                                    │
│       ↓                                                          │
│      END                                                         │
└─────────────────────────────────────────────────────────────────┘
                         │
          ┌──────────────┴──────────────┐
          ▼                             ▼
  Audit Persistence             OTel Export
  DynamoDB (if configured)      OTLP endpoint (if set)
  Console fallback              ConsoleSpanExporter fallback

What it does

Guardspan implements a compliance-aware LLM workflow for regulated financial environments. The system enforces deterministic policy checks before LLM execution, routes requests to cost-appropriate models based on semantic complexity, validates outputs for compliance violations and PII leakage, and maintains a correlated audit trail with OpenTelemetry observability.

The workflow operates as a six-node synchronous LangGraph pipeline. Policy enforcement blocks prohibited requests before they reach the LLM, preventing wasted API costs and compliance exposure. Model routing selects between gpt-4o-mini and gpt-4o based on input complexity and financial domain keywords, optimizing cost while maintaining response quality. Output validation applies deterministic rules to detect compliance violations (guaranteed returns, insider information), PII patterns (email, phone numbers), and quality issues (truncated responses).
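
To make the output validation step more concrete, here is a minimal sketch of rule-based checks for the three categories named above. The rule names, the regex patterns, the truncation heuristic, and the `validate_output` function are all assumptions made for illustration, not the project's actual rule set.

```python
import re

# Hypothetical rules for illustration; the project's real patterns may differ.
COMPLIANCE_RULES = {
    "guaranteed_returns": re.compile(r"\bguaranteed\s+(?:returns?|profits?)\b", re.IGNORECASE),
    "insider_information": re.compile(r"\binsider\s+(?:information|tips?)\b", re.IGNORECASE),
}
PII_RULES = {
    "pii_email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "pii_phone": re.compile(r"(?<!\d)(?:\(\d{2}\)|\d{2})\s?9?\d{4}-?\d{4}\b"),
}


def validate_output(text: str) -> dict:
    """Return a validation status plus the name of every rule that fired."""
    all_rules = {**COMPLIANCE_RULES, **PII_RULES}
    flags = [name for name, pattern in all_rules.items() if pattern.search(text)]
    # Assumed heuristic: a response that does not end in terminal
    # punctuation is treated as truncated.
    stripped = text.rstrip()
    if stripped and stripped[-1] not in ".!?)\"'":
        flags.append("truncated_response")
    return {
        "validation_status": "fail" if flags else "pass",
        "validation_flags": flags,
    }
```

Because every rule is a compiled pattern, the same input always produces the same flags, which is what makes the unit tests for this step deterministic.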

Every execution produces a correlated audit record regardless of outcome. Blocked and failed executions emit minimal audit events at the API boundary. Successful executions persist complete audit records with redacted user input, token usage, cost attribution, and validation results. All workflow nodes emit OpenTelemetry spans with a stable graphspan.* attribute namespace, enabling distributed tracing and cost analysis across executions.

Configuration

# Required
OPENAI_API_KEY=sk-...

# Optional — ConsoleSpanExporter if absent
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317

# Optional — console logging if absent
DYNAMODB_AUDIT_TABLE=guardspan-audit
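
As a rough illustration of how these variables could be resolved at startup, the sketch below reads them into a settings object. The `Settings` class, its field names, and the choice to raise on a missing key are assumptions for this example; only the variable names and fallback behavior come from the listing above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings holder; not the project's actual class."""

    openai_api_key: str
    otlp_endpoint: Optional[str]  # None means ConsoleSpanExporter fallback
    audit_table: Optional[str]    # None means console audit logging

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "Settings":
        key = env.get("OPENAI_API_KEY")
        if not key:
            # Assumed behavior: fail fast when the only required value is missing.
            raise RuntimeError("OPENAI_API_KEY is required")
        return cls(
            openai_api_key=key,
            # Empty strings are treated the same as unset variables.
            otlp_endpoint=env.get("OTEL_EXPORTER_OTLP_ENDPOINT") or None,
            audit_table=env.get("DYNAMODB_AUDIT_TABLE") or None,
        )
```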

Install and run

make install
make run

# POST http://localhost:8000/check
# GET  http://localhost:8000/health
# GET  http://localhost:8000/docs

Example request:

curl -X POST http://localhost:8000/check \
  -H "Content-Type: application/json" \
  -d '{
    "user_input": "Como funciona um CDB?",
    "advisor_verified": false
  }'

Example response:

{
  "run_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "policy_decision": "allow",
  "policy_reason": "Policy check passed",
  "model_used": "gpt-4o-mini",
  "estimated_cost_usd": 0.000123,
  "actual_cost_usd": 0.000098,
  "validation_status": "pass",
  "validation_flags": [],
  "audit_backend": "console",
  "error": null
}

Test

make test

Test suite includes:

  • 23 unit tests for deterministic logic (policy, validation, redaction, cost)
  • 22 node tests with OTel fail-open verification
  • 12 integration tests via FastAPI TestClient
  • 9 property-based tests with Hypothesis (max_examples=30)

All tests mock OpenAI API calls. No real API requests are made during testing.

OTel span schema

All workflow nodes emit spans with the graphspan.* attribute namespace:

| Attribute | Node | Type | Description |
|---|---|---|---|
| graphspan.node.name | All | string | Node identifier |
| graphspan.audit.run_id | All | string | UUID correlating execution |
| graphspan.complexity.level | Input Classifier | string | "low" or "high" |
| graphspan.policy.decision | Policy Enforcer | string | "allow" or "block" |
| graphspan.policy.reason | Policy Enforcer | string | Policy rule explanation |
| graphspan.model.name | Model Router, LLM Executor | string | "gpt-4o-mini" or "gpt-4o" |
| graphspan.cost.estimated_usd | Model Router | float | Pre-execution cost estimate |
| graphspan.cost.actual_usd | LLM Executor | float | Post-execution actual cost |
| graphspan.tokens.input | LLM Executor | int | Prompt tokens consumed |
| graphspan.tokens.output | LLM Executor | int | Completion tokens generated |
| graphspan.validation.status | Output Validator | string | "pass" or "fail" |
| graphspan.validation.flags_count | Output Validator | int | Number of validation flags |
| graphspan.audit.backend | Audit Logger | string | "dynamodb", "console", or "none" |
| graphspan.error | LLM Executor | string | Error message if execution failed |
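
The sketch below shows one way to build attributes in this namespace and emit them fail-open, meaning a telemetry failure is logged and swallowed instead of breaking the node. It deliberately avoids the OpenTelemetry SDK so it stays self-contained: `span_attributes`, `emit_fail_open`, and the `emit` callable are hypothetical stand-ins, not the project's helpers.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("guardspan.workflow")

NAMESPACE = "graphspan"


def span_attributes(node_name: str, run_id: str, **fields: Any) -> Dict[str, Any]:
    """Build namespaced attributes.

    A keyword such as `policy__decision` becomes `graphspan.policy.decision`.
    """
    attrs: Dict[str, Any] = {
        f"{NAMESPACE}.node.name": node_name,
        f"{NAMESPACE}.audit.run_id": run_id,
    }
    for key, value in fields.items():
        attrs[f"{NAMESPACE}.{key.replace('__', '.')}"] = value
    return attrs


def emit_fail_open(
    emit: Callable[[str, Dict[str, Any]], None],
    name: str,
    attrs: Dict[str, Any],
) -> bool:
    """Call the emitter; log and swallow any error so the node keeps running."""
    try:
        emit(name, attrs)
        return True
    except Exception:
        logger.warning("span emission failed for %s", name, exc_info=True)
        return False
```

The boolean return value exists only so that tests can assert that a broken exporter was tolerated; workflow code would normally ignore it.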

Cost model

| Model | Input (per 1K tokens) | Output (per 1K tokens) | Example cost (100-word request) |
|---|---|---|---|
| gpt-4o-mini | $0.00015 | $0.0006 | ~$0.0001 |
| gpt-4o | $0.0025 | $0.01 | ~$0.002 |

Cost estimation uses 1 token ≈ 4 characters and assumes output is 2× input tokens. Actual cost is calculated from provider-reported token usage.
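
The estimation rule can be written out as a short sketch. The prices are the ones listed in this section; the function names, the floor division, and the one-token minimum are assumptions made for the example.

```python
# Prices in USD per 1K tokens, as listed in the cost model above.
PRICES_PER_1K = {
    "gpt-4o-mini": {"input": 0.00015, "output": 0.0006},
    "gpt-4o": {"input": 0.0025, "output": 0.01},
}


def actual_cost_usd(input_tokens: int, output_tokens: int, model: str) -> float:
    """Post-execution cost from provider-reported token counts."""
    price = PRICES_PER_1K[model]
    return (
        input_tokens / 1000 * price["input"]
        + output_tokens / 1000 * price["output"]
    )


def estimate_cost_usd(user_input: str, model: str) -> float:
    """Pre-execution estimate: 1 token is about 4 characters, output is 2x input."""
    input_tokens = max(1, len(user_input) // 4)  # assumed rounding rule
    output_tokens = 2 * input_tokens
    return actual_cost_usd(input_tokens, output_tokens, model)
```

For a 400-character input this assumes 100 input tokens and 200 output tokens, which works out to about $0.000135 on gpt-4o-mini and $0.00225 on gpt-4o.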

Architecture decisions

Why LangGraph over a sequential chain

LangGraph provides explicit conditional edges for policy-based branching and error handling. The workflow has two conditional branches: Policy Enforcer routes to END on block, and LLM Executor routes to END on error. A sequential chain would require implicit exception handling or nested conditionals, obscuring the control flow. LangGraph makes the execution paths explicit in the graph structure, improving readability and testability.
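
To show what "execution paths explicit in the graph structure" means, here is a dependency-free model of the same idea: edges are data, and the two branches are small routing functions. This is not LangGraph's API (there the branches would be declared with `add_conditional_edges`), and the snake_case node names and the `run` helper are assumptions for illustration.

```python
from typing import Callable, Dict, List

END = "END"
State = Dict[str, object]

# Unconditional edges: node -> next node.
EDGES: Dict[str, str] = {
    "input_classifier": "policy_enforcer",
    "model_router": "llm_executor",
    "output_validator": "audit_logger",
    "audit_logger": END,
}

# Conditional edges: node -> function that picks the next node from the state.
CONDITIONAL_EDGES: Dict[str, Callable[[State], str]] = {
    "policy_enforcer": lambda s: END if s.get("policy_decision") == "block" else "model_router",
    "llm_executor": lambda s: END if s.get("error") else "output_validator",
}


def run(nodes: Dict[str, Callable[[State], State]], state: State) -> List[str]:
    """Walk the graph from the entry node and return the visited node names."""
    visited: List[str] = []
    current = "input_classifier"
    while current != END:
        state.update(nodes[current](state))  # each node returns a partial update
        visited.append(current)
        router = CONDITIONAL_EDGES.get(current)
        current = router(state) if router else EDGES[current]
    return visited
```

Each branch can be tested by swapping in a stub node and asserting on the visited path, with no exception handling involved in the routing itself.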

Why policy and validation are deterministic (no LLM)

Policy enforcement and output validation are compliance-critical control points. Using an LLM for these decisions introduces non-determinism, latency, cost, and the risk of prompt injection. Deterministic regex-based rules provide predictable behavior, negligible latency, no per-request API cost, and no exposure to prompt injection, because no model interprets the input at these checkpoints. The trade-off is reduced flexibility, since adding new rules requires code changes, but this is acceptable for a regulated environment where policy changes follow a formal review process.
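
A deterministic policy check of this kind might look like the sketch below. The rule names, the patterns, and the way `advisor_verified` gates personalized advice are assumptions for illustration; only the "allow"/"block" decisions and the "Policy check passed" reason appear elsewhere in this README.

```python
import re
from typing import NamedTuple


class PolicyResult(NamedTuple):
    decision: str  # "allow" or "block"
    reason: str


# Hypothetical rule set for illustration only.
PROHIBITED = [
    ("insider_trading", re.compile(r"\binsider\s+(?:information|trading|tips?)\b", re.IGNORECASE)),
    ("market_manipulation", re.compile(r"\bpump\s+and\s+dump\b", re.IGNORECASE)),
]
# Assumed rule: personalized buy/sell advice needs a verified advisor.
ADVISOR_ONLY = re.compile(r"\b(?:should I buy|should I sell|recommend)\b", re.IGNORECASE)


def check_policy(user_input: str, advisor_verified: bool) -> PolicyResult:
    """Apply the rules in a fixed order and return the first decision reached."""
    for rule_name, pattern in PROHIBITED:
        if pattern.search(user_input):
            return PolicyResult("block", f"Prohibited topic: {rule_name}")
    if ADVISOR_ONLY.search(user_input) and not advisor_verified:
        return PolicyResult("block", "Personalized advice requires a verified advisor")
    return PolicyResult("allow", "Policy check passed")
```

Since the function depends only on its two arguments, a blocked request never reaches the model, and the reason string can be recorded as-is in the audit trail.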

Why run_id is generated at the API boundary

The run_id is the identity of an execution, not the identity of an audit record. Generating it at the API boundary before workflow invocation ensures it is present in all responses (including blocked and failed executions), all OTel spans, and all audit records. If the Audit Logger generated it, blocked and failed executions would have no run_id in their responses, breaking correlation. The API boundary is the single source of truth for execution identity.
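
The ordering argument can be sketched without FastAPI. In the example below, `handle_check`, the `workflow` callable, and the in-memory `AUDIT_EVENTS` list are hypothetical stand-ins; the point is only that the identifier exists before the workflow runs, so every outcome can carry it.

```python
import uuid
from typing import Callable, Dict, List

Payload = Dict[str, object]

AUDIT_EVENTS: List[Payload] = []  # stand-in for the real audit sink


def handle_check(payload: Payload, workflow: Callable[[Payload], Payload]) -> Payload:
    """Create the run_id first, then run the workflow with it."""
    run_id = str(uuid.uuid4())
    try:
        result = dict(workflow({**payload, "run_id": run_id}))
    except Exception as exc:  # the workflow crashed before producing a result
        result = {"error": str(exc)}
    blocked = result.get("policy_decision") == "block"
    if blocked or result.get("error"):
        # The Audit Logger node never ran on this path, so a minimal
        # audit event is emitted here at the boundary.
        AUDIT_EVENTS.append(
            {"run_id": run_id, "outcome": "blocked" if blocked else "failed"}
        )
    return {**result, "run_id": run_id}
```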

Why audit_backend over a boolean

A boolean audit_persisted does not distinguish between "logged to console" and "not recorded anywhere", which are operationally different states for a regulated system. The three-value enum ("dynamodb", "console", "none") makes the persistence outcome explicit: durable storage succeeded, best-effort logging occurred, or no record was produced. This distinction is meaningful for troubleshooting and operational review.
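
A minimal sketch of how the three outcomes could be produced is shown below. The `persist_audit` function and the `dynamo_put` callable are hypothetical, and falling back to the console when a configured DynamoDB write fails is an assumption of this example.

```python
import json
import sys
from typing import Callable, Dict, Optional

Record = Dict[str, object]


def persist_audit(
    record: Record,
    dynamo_put: Optional[Callable[[Record], None]] = None,
    stream=sys.stdout,
) -> str:
    """Try durable storage, then console; report which one actually succeeded."""
    if dynamo_put is not None:
        try:
            dynamo_put(record)
            return "dynamodb"
        except Exception:
            pass  # fall through to best-effort console logging
    try:
        stream.write(json.dumps(record) + "\n")
        return "console"
    except Exception:
        return "none"
```

The return value maps directly onto the `audit_backend` response field, so a caller can see that a record only reached the console without inspecting logs.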

Why redacted storage for user_input

Guardspan operates in a regulated financial context subject to LGPD (Brazilian General Data Protection Law). Raw storage maximizes forensic utility but conflicts with Article 6 data minimization principles. Hashed storage eliminates forensic utility entirely. Redacted storage preserves semantic context for operational review while reducing data subject risk.

The redaction implementation uses deterministic regex patterns for CPF (Brazilian tax ID), CNPJ (Brazilian company ID), Brazilian phone numbers, and email addresses. Patterns are replaced with tokens like [CPF_REDACTED] and [EMAIL_REDACTED].
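
A simplified version of such a redactor is sketched below. The patterns are assumptions written for this example and only match punctuated CPF and CNPJ forms. `[CNPJ_REDACTED]` and `[PHONE_REDACTED]` are hypothetical token names that follow the pattern of the two tokens mentioned above.

```python
import re

# Order matters: CNPJ and CPF are replaced before the looser phone pattern runs.
REDACTION_PATTERNS = [
    # CNPJ, e.g. 12.345.678/0001-95
    (re.compile(r"\b\d{2}\.\d{3}\.\d{3}/\d{4}-\d{2}\b"), "[CNPJ_REDACTED]"),
    # CPF, e.g. 123.456.789-09
    (re.compile(r"\b\d{3}\.\d{3}\.\d{3}-\d{2}\b"), "[CPF_REDACTED]"),
    # Email address
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[EMAIL_REDACTED]"),
    # Brazilian phone, e.g. (11) 91234-5678 or +55 11 91234-5678
    (
        re.compile(r"(?<!\d)(?:\+55\s?)?(?:\(\d{2}\)|\d{2})\s?9?\d{4}-?\d{4}\b"),
        "[PHONE_REDACTED]",
    ),
]


def redact(text: str) -> str:
    """Replace each matched identifier with its fixed token."""
    for pattern, token in REDACTION_PATTERNS:
        text = pattern.sub(token, text)
    return text
```

The surrounding words are left untouched, which is what preserves the semantic context of the request for later review.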

Redaction scope and limitations

This redaction is illustrative and not production-grade DLP. The regex patterns are Brazilian-context only and do not cover:

  • US Social Security Numbers
  • Credit card numbers (PAN)
  • IBAN or other international banking identifiers
  • Addresses or geolocation data
  • Biometric data
  • Health information (HIPAA-regulated data)

Production deployments in regulated environments must integrate a dedicated DLP solution (e.g., AWS Macie, Google Cloud DLP API, Microsoft Purview) or a commercial PII detection library with broader pattern coverage and language support.

DynamoDB is optional. The local MVP requires no AWS resources. Audit records fall back to console logging if DYNAMODB_AUDIT_TABLE is not configured. This design supports local development and testing without cloud dependencies.

graphspan

The span attribute namespace graphspan.* used in this project is intended for future extraction as a standalone PyPI library. The namespace provides a stable telemetry contract for LangGraph workflows, independent of the application name.

Planned repository: github.com/lucianareynaud/graphspan

The graphspan library will provide:

  • Standardized span attribute names for LLM workflows
  • Fail-open span emission helpers
  • Cost attribution utilities
  • Audit correlation patterns

The tracer name guardspan.workflow identifies this application. The attribute namespace graphspan.* identifies the telemetry contract. These are deliberately separate to enable library extraction without breaking existing instrumentation.
