AlperNab/clause-extractor

⚖️ Clause Extractor

Contract and compliance review UX with clause grids, obligation tracking, redline planning, and risk controls.

Python · FastAPI · SQLite · Local LLM · Cloud LLM · No Fake Data

Domain: Legal / Contract Review · Suite: Legal & Compliance Suite · Accent: #c084fc

🚀 Quick Start · ✨ Features · 🎛️ Customization · 🧠 LLM Providers · 🧪 Testing · 🧯 Troubleshooting


🧭 What This Project Does

Clause Extractor is a standalone, browser-based AI workflow app for Legal / Contract Review. It turns structured inputs, uploaded files, and project-specific settings into reviewable outputs using a deterministic local engine plus optional local/cloud LLM enhancement.

Core job: Contract → clause extraction and risk analysis.

Designed for: Domain operators, business owners, analysts, or team members who need this workflow executed reliably.

Why use it:

  • 🧩 Standalone project folder: run this project by itself without depending on a central dashboard.
  • 🖥️ Elegant GUI: includes project-specific panels, structured forms, upload handling, output preview, and exports.
  • 🧠 Model-flexible: choose local models for privacy or cloud models for stronger reasoning.
  • 🧾 Auditable: every run is stored in SQLite with inputs, settings, result, and export history.
  • 🚫 No fake live data: external systems are only used when real API keys/connectors are configured.
  • 🛡️ Human review gates: sensitive legal, medical, hiring, finance, or security outputs are flagged for review.

✨ Features

  • clause taxonomy
  • obligation calendar
  • risk heatmap
  • negotiation fallback language
  • evidence highlights
  • missing clause detector
  • playbook comparison

🧱 Built-In Platform Capabilities

  • ⚡ FastAPI backend with documented JSON endpoints.
  • 🎨 Responsive web UI with dark, polished SaaS-style layout.
  • 📁 File upload and text extraction for common document/code formats.
  • 🗂️ Job history saved locally in data/*.sqlite3.
  • 🔐 Encrypted provider settings for API keys and local endpoints.
  • 📤 Exports to Markdown, JSON, DOCX, and PDF when dependencies are available.
  • 🔌 Provider routing for local and cloud LLMs.
  • 🧪 Local test file to verify the project runs.

🎨 UX/UI Design

UX profile: Legal Review Desk

Workflow layout: Document intake → clause map → risk heatmap → negotiation/actions

Empty state: Paste legal text or upload a document. This is decision support, not legal advice.

Main UI Components

  • Clause taxonomy grid
  • Risk heatmap
  • Evidence highlights
  • Missing-clause detector
  • Obligation calendar

Review / Workflow Lanes

  • Read
  • Classify
  • Risk-rate
  • Negotiate
  • Finalize

Metrics Shown in the Interface

  • Risk exposure
  • Missing clauses
  • Obligations found
  • Review readiness

Quick Actions

  • Extract key clauses
  • Risk-rate clauses
  • Create fallback language
  • Build negotiation checklist

🧩 Project Inputs

These are the main fields exposed by the GUI and /api/run. Required fields are enforced before execution.

| Field | Type | Required | Default | Purpose |
|---|---|---|---|---|
| contract (Contract) | text | Yes | (none) | The contract text to analyze. |
| work_brief (Work brief / source text / URL / instructions) | textarea | Yes | (none) | Paste the material, URL, description, or instruction needed for this project. |
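
As an illustration of the required-field rule, the following minimal sketch checks a set of inputs before a run. The field names `contract` and `work_brief` come from the table above; `REQUIRED_INPUTS` and `missing_required` are hypothetical names chosen for this example and are not taken from the project's code.

```python
# Hypothetical sketch: these names are illustrative, not the project's real code.
REQUIRED_INPUTS = ("contract", "work_brief")  # required fields from the inputs table


def missing_required(inputs: dict) -> list[str]:
    """Return the required field names that are absent or blank."""
    missing = []
    for name in REQUIRED_INPUTS:
        value = inputs.get(name)
        # Treat non-strings and whitespace-only strings as missing.
        if not isinstance(value, str) or not value.strip():
            missing.append(name)
    return missing


if __name__ == "__main__":
    print(missing_required({"contract": "Sample agreement text", "work_brief": "  "}))
```

A caller could refuse to submit a run whenever the returned list is non-empty, which mirrors the "enforced before execution" behavior described above.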

🎛️ Customization

This project is not a generic prompt box. The customization controls are connected to workflow behavior, validation, output shape, and export format.

| Field | Type | Required | Default | Purpose |
|---|---|---|---|---|
| execution_mode (Execution mode) | select | No | Production | Controls strictness, depth, and output format for this project workflow. |
| contract_type (contract type) | text | No | (none) | Affects customization: contract type. |
| jurisdiction (jurisdiction) | select | No | United States | Affects customization: jurisdiction. |
| party_role (party role) | select | No | buyer/client | Affects customization: party role. |
| risk_tolerance (risk tolerance) | slider | No | 50 | Affects customization: risk tolerance. |
| materiality_threshold (materiality threshold) | slider | No | 50 | Affects customization: materiality threshold. |
| preferred_terms (preferred terms) | text | No | (none) | Affects customization: preferred terms. |
| legal_language_level (legal language level) | select | No | English | Affects customization: legal language level. |
| output_format (output format) | select | No | Markdown | Affects customization: output format. |
| language (language) | select | No | English | Affects customization: language. |
| privacy_mode (privacy mode) | select | No | cloud allowed | Affects customization: privacy mode. |
| confidence_threshold (Confidence threshold) | slider | No | 75 | Items below this confidence are escalated to the human review queue. |
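
To make the `confidence_threshold` behavior concrete, here is a minimal sketch of splitting findings into accepted items and a human review queue. It assumes confidence is expressed on a 0-100 scale (matching the slider default of 75); the `Finding` class and `split_by_confidence` function are hypothetical and only illustrate the rule, not the project's engine.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical extracted item with a confidence score."""
    label: str
    confidence: float  # assumed 0-100 scale, like the slider


def split_by_confidence(findings, threshold=75.0):
    """Return (accepted, review_queue); items below the threshold go to review."""
    accepted = [f for f in findings if f.confidence >= threshold]
    review_queue = [f for f in findings if f.confidence < threshold]
    return accepted, review_queue


if __name__ == "__main__":
    sample = [Finding("Indemnity", 90), Finding("Termination", 60)]
    accepted, review_queue = split_by_confidence(sample)
    print([f.label for f in accepted], [f.label for f in review_queue])
```

In this sketch an item exactly at the threshold is accepted, since the description only escalates items below it.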

Select / Option Controls

  • Execution mode: Draft, Production, Audit / strict review, JSON/API output
  • jurisdiction: United States, United Kingdom, European Union, Egypt, UAE, Saudi Arabia, Custom
  • party role: buyer/client, seller/vendor, employer, employee, landlord, tenant, disclosing party, receiving party, mutual, custom
  • legal language level: English, Arabic, Egyptian Arabic, French, German, Spanish
  • output format: Markdown, JSON, CSV, PDF, DOCX, XLSX
  • language: English, Arabic, Egyptian Arabic, French, German, Spanish
  • privacy mode: cloud allowed, local only, redact sensitive data

🧠 LLM Providers

You can run the project with the local deterministic engine, or enhance the output with a configured LLM provider.

Supported Provider Types

| Provider Type | Examples | Best For |
|---|---|---|
| Local OpenAI-compatible | Ollama, LM Studio, vLLM | Private files, offline/local workflows, cost control |
| Cloud OpenAI-compatible | OpenAI, OpenRouter, custom gateway | General high-quality generation and structured output |
| Anthropic | Claude models | Long-context reasoning and document-heavy workflows |
| Google Gemini | Gemini models | Multimodal or Google ecosystem workflows |
| Mistral | Mistral API | Fast European cloud models |
| Azure OpenAI | Azure deployments | Enterprise-controlled cloud deployment |
| AWS Bedrock | Bedrock-hosted models | AWS enterprise environments |

Recommended Model Usage

| Use Case | Recommendation |
|---|---|
| Drafting | fast cloud or local instruct model |
| Reasoning | strong reasoning model |
| Private documents | local model via Ollama/LM Studio/vLLM |
| Vision/PDF pages | vision-capable model when image pages are used |

🚀 Quick Start

1) Clone or open this folder

cd clause-extractor

2) Run on macOS / Linux / WSL

chmod +x run_gui.sh
./run_gui.sh

3) Run on Windows PowerShell

Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
.\run_gui_windows.ps1

4) Open the GUI

http://127.0.0.1:9106

🛠️ Manual Installation

Use this when you want full control instead of the run scripts.

cd clause-extractor
python -m venv .venv
source .venv/bin/activate      # Windows: .venv\Scripts\activate
python -m pip install --upgrade pip
pip install -r requirements.txt
cp .env.example .env           # Windows: copy .env.example .env
uvicorn app.main:app --host 127.0.0.1 --port 9106

🔐 Environment Variables

The project can be configured through the GUI settings screen or .env/environment variables.

| Variable | Purpose |
|---|---|
| AI_SUITE_HOST | Host to bind the local app, usually 127.0.0.1. |
| AI_SUITE_PORT | Port for this project GUI, default 9106. |
| AI_SUITE_DB | SQLite database path for job history. |
| AI_SUITE_SECRET_KEY | Secret used for local encryption/signing. Set this in production. |
| OPENAI_API_KEY | Enables OpenAI-compatible cloud calls. |
| ANTHROPIC_API_KEY | Enables Anthropic/Claude calls. |
| GEMINI_API_KEY | Enables Google Gemini calls. |
| OPENROUTER_API_KEY | Enables OpenRouter model routing. |
| MISTRAL_API_KEY | Enables Mistral cloud models. |
| AZURE_OPENAI_ENDPOINT | Azure OpenAI endpoint URL. |
| AZURE_OPENAI_API_KEY | Azure OpenAI key. |
| AZURE_OPENAI_DEPLOYMENT | Azure deployment name. |
| OLLAMA_BASE_URL | Local Ollama OpenAI-compatible base URL. |
| LMSTUDIO_BASE_URL | Local LM Studio OpenAI-compatible base URL. |
| VLLM_BASE_URL | Local vLLM OpenAI-compatible base URL. |
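
For readers scripting around the app, the sketch below shows one way these variables could be read with the defaults stated in the table (host 127.0.0.1, port 9106). The `load_settings` function and the keys of the returned dictionary are hypothetical; the app's own configuration loader may behave differently, and no default is documented for `AI_SUITE_DB`.

```python
import os


def load_settings(env=None):
    """Read a few documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "host": env.get("AI_SUITE_HOST", "127.0.0.1"),
        "port": int(env.get("AI_SUITE_PORT", "9106")),
        # No default path is documented for AI_SUITE_DB, so None means "unset".
        "db_path": env.get("AI_SUITE_DB"),
        # Assumption: a provider counts as enabled only when its key is non-empty.
        "openai_enabled": bool(env.get("OPENAI_API_KEY", "").strip()),
        "anthropic_enabled": bool(env.get("ANTHROPIC_API_KEY", "").strip()),
    }


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests without touching the real environment.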

🖥️ How to Use the GUI

  1. Open the local URL.
  2. Review the project purpose and workflow lanes.
  3. Fill in the required input fields.
  4. Adjust only the project-related customization controls.
  5. Upload source files when needed.
  6. Choose Rule Engine for local deterministic output or select a configured LLM provider.
  7. Run the workflow.
  8. Review warnings, scorecards, and output sections.
  9. Export the result as Markdown, JSON, DOCX, or PDF.

🔄 Workflow

  • Contract
  • clause extraction and risk analysis

Analysis Modules

  • legal_clause_scan
  • obligation_calendar
  • risk_register

Output Sections

  • Clause table
  • Risk register
  • Obligations
  • Negotiation notes

Scorecards

  • Risk severity
  • Evidence strength
  • Negotiation impact
  • Obligation urgency
  • Missing protection risk

📤 Outputs & Exports

  • clause table
  • redline suggestions
  • risk memo
  • obligation calendar

The export system is designed for reviewable deliverables. For regulated or business-critical work, export drafts should be reviewed before sending to clients, customers, patients, employees, authorities, or production systems.


🔌 Real Integrations & Connector Policy

Configured integrations in this standalone folder:

  • File upload
  • REST API
  • Export download
  • Job history

Real Connector Requirements

  • Accounting/ERP connector for posting entries
  • tax/VAT rules source for jurisdiction-specific validation
  • human finance review before payment or filing
  • approved legal playbook or clause library
  • jurisdiction-specific review by qualified counsel
  • document management/e-sign connector if exporting final agreements

Important: this project does not simulate live data. If a workflow needs live Shopify, ATS, ERP, tax, customs, medical, security, market, map, analytics, or repository data, it must be connected with valid credentials and real API access. Missing connectors should produce clear setup errors rather than invented results.


🧯 Guardrails

  • Show uncertainty and confidence
  • Cite evidence from input when possible
  • Human review required for legal, medical, financial, hiring, or security decisions
  • Do not invent facts absent from input

Recommended operating rules:

  • ✅ Use local models for private or sensitive files.
  • ✅ Keep API keys out of Git.
  • ✅ Review low-confidence or high-impact outputs manually.
  • ✅ Keep source files and exported deliverables organized under data/.
  • ❌ Do not treat AI output as legal, medical, tax, hiring, trading, or security authority without expert review.

🧪 Testing

Run the local smoke test:

python tests/test_single_project.py

Run a health check after starting the server:

curl http://127.0.0.1:9106/api/health

Expected result: the API returns ok: true and identifies this project.
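
The same health check can be scripted from Python with the standard library. This is a minimal sketch that assumes the response body is a JSON object with a boolean `ok` key, as the expected result suggests; `is_healthy` and `check_health` are hypothetical helper names, and any other keys in the payload are not documented here.

```python
import json
import urllib.request


def is_healthy(payload: dict) -> bool:
    """True only when the health payload reports ok: true."""
    return payload.get("ok") is True


def check_health(base_url: str = "http://127.0.0.1:9106", timeout: float = 5.0) -> bool:
    """GET /api/health and evaluate the JSON body; requires a running server."""
    with urllib.request.urlopen(f"{base_url}/api/health", timeout=timeout) as resp:
        return is_healthy(json.loads(resp.read().decode("utf-8")))
```

Calling `check_health()` while the server is stopped raises `urllib.error.URLError`, which a monitoring script would likely want to catch and report as unhealthy.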


🧬 API Usage

| Method | Endpoint | Use |
|---|---|---|
| GET | / | Opens the browser GUI. |
| GET | /api/health | Health check for deployment and uptime monitoring. |
| GET | /api/projects | Returns the local project configuration. |
| GET | /api/projects/{slug} | Returns the project plugin metadata. |
| GET | /api/providers | Lists configured providers and local/cloud options. |
| POST | /api/providers | Saves provider settings/API keys. |
| POST | /api/upload | Uploads source files for extraction or context. |
| POST | /api/run | Runs the project workflow. |
| GET | /api/jobs | Lists previous runs and job history. |
| GET | /api/jobs/{job_id} | Reads one completed job. |
| GET | /api/jobs/{job_id}/export/{fmt} | Exports a job as md, json, docx, or pdf. |
| GET | /api/project-local-status | Verifies local project registration and implementation status. |
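
As a small illustration of the export endpoint, the sketch below builds an export URL and rejects formats other than the four listed for `/api/jobs/{job_id}/export/{fmt}`. The `export_url` helper is hypothetical, and treating `job_id` as a string is an assumption; the actual identifier type is not documented here.

```python
from urllib.parse import quote

EXPORT_FORMATS = ("md", "json", "docx", "pdf")  # formats listed for the export endpoint


def export_url(job_id: str, fmt: str, base_url: str = "http://127.0.0.1:9106") -> str:
    """Build the export URL for a job, validating the requested format first."""
    fmt = fmt.lower()
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt!r}")
    # Percent-encode the job id so unusual characters cannot alter the path.
    return f"{base_url}/api/jobs/{quote(job_id, safe='')}/export/{fmt}"


if __name__ == "__main__":
    print(export_url("42", "pdf"))
```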

Minimal Run Request

curl -X POST http://127.0.0.1:9106/api/run \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "contract": "Paste the contract text here",
      "work_brief": "Paste the source material or task details here"
    },
    "customization": {
      "execution_mode": "Production"
    },
    "provider": "rule_engine"
  }'
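
A Python equivalent of this request, using only the standard library, might look like the sketch below. It sends both inputs marked as required in the Project Inputs table (`contract` and `work_brief`). The helper names `build_run_request` and `run_workflow` are hypothetical, and the assumption that the endpoint answers with a JSON body is not confirmed by this README.

```python
import json
import urllib.request


def build_run_request(contract, work_brief, execution_mode="Production",
                      provider="rule_engine", base_url="http://127.0.0.1:9106"):
    """Build (but do not send) a POST /api/run request with a JSON body."""
    body = {
        "inputs": {"contract": contract, "work_brief": work_brief},
        "customization": {"execution_mode": execution_mode},
        "provider": provider,
    }
    return urllib.request.Request(
        f"{base_url}/api/run",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_workflow(request, timeout=60.0):
    """Send the request; assumes the server replies with JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect or log before anything reaches the server.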

📁 Folder Structure

clause-extractor/
├─ app/                         # FastAPI backend, schemas, DB, providers, exports
├─ static/                      # Browser GUI assets
├─ plugins/                     # Project plugin JSON metadata
├─ data/                        # SQLite DB, uploads, exports
├─ tests/                       # Smoke tests
├─ project_config.json          # Project-specific inputs, controls, UX, workflow
├─ PROJECT_IMPLEMENTATION.md    # Implementation details and domain notes
├─ requirements.txt             # Python dependencies
├─ run_gui.sh                   # macOS/Linux/WSL launcher
├─ run_gui_windows.ps1          # Windows PowerShell launcher
└─ README.md                    # This file

🚒 Deployment Notes

For local/private deployment, run with uvicorn behind a reverse proxy if needed. For production:

  • Set AI_SUITE_SECRET_KEY.
  • Use HTTPS.
  • Store provider keys in environment variables or a proper secret manager.
  • Restrict upload sizes and allowed file types.
  • Back up the SQLite database or move job storage to a managed database.
  • Add authentication before exposing beyond localhost.
  • Enable logging and monitoring.
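
The item about restricting upload sizes and file types could be enforced with a check along the following lines. The allow-list and the 10 MiB cap are placeholder assumptions, not values taken from this project, and `upload_rejection_reason` is a hypothetical helper rather than part of the app.

```python
from pathlib import Path
from typing import Optional

ALLOWED_SUFFIXES = {".pdf", ".docx", ".txt", ".md"}  # assumed allow-list
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumed 10 MiB cap


def upload_rejection_reason(filename: str, size_bytes: int) -> Optional[str]:
    """Return a human-readable reason to reject the upload, or None if acceptable."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        return f"file type {suffix or '(none)'} is not allowed"
    if size_bytes > MAX_UPLOAD_BYTES:
        return f"file exceeds {MAX_UPLOAD_BYTES} bytes"
    return None
```

A check based on the file extension alone is easy to bypass, so a production deployment would probably also inspect the file content.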

Example production-style command:

AI_SUITE_HOST=0.0.0.0 AI_SUITE_PORT=9106 uvicorn app.main:app --host 0.0.0.0 --port 9106

🧯 Troubleshooting

| Problem | Fix |
|---|---|
| python not found | Install Python 3.10+ and ensure it is on PATH. |
| PowerShell blocks the script | Run Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass. |
| Port already in use | Set another port: AI_SUITE_PORT=9200 ./run_gui.sh. |
| Provider fails | Verify API key, base URL, selected model, and account quota. |
| Local model fails | Start Ollama/LM Studio/vLLM before running the workflow. |
| PDF/DOCX export fails | Reinstall requirements and confirm that the optional export dependencies are installed. |
| Upload extraction is incomplete | Use cleaner source files or paste the important text into work_brief. |

🧭 Extension Points

You can extend this project by editing:

  • project_config.json for inputs, settings, output sections, UX metadata, and workflow labels.
  • plugins/clause-extractor.json for plugin metadata.
  • app/domain_engine.py for deterministic business logic.
  • app/llm_gateway.py for provider integrations.
  • static/app.js and static/styles.css for GUI behavior and component design.
  • tests/test_single_project.py for stronger project-specific tests.

✅ Final Implementation Status

| Area | Status |
|---|---|
| Standalone folder GUI | ✅ Implemented |
| FastAPI backend | ✅ Implemented |
| Project-specific config | ✅ Implemented |
| Local deterministic workflow | ✅ Implemented |
| Local/cloud LLM routing | ✅ Implemented |
| Uploads and exports | ✅ Implemented |
| Job history | ✅ Implemented |
| Real external connectors | ⚠️ Requires valid credentials/API setup |
| Fake/simulated live data | ❌ Not allowed |

📜 License

Use the license included in this folder. If no explicit license is present, treat the code as private until you choose one.
