Documentify

Documentify is a local AI SaaS platform for analyzing large documents such as legal contracts, research papers, financial reports, and policy documents. It runs entirely on open-source tooling with Flutter on the frontend and FastAPI on the backend, using Ollama, SentenceTransformers, and ChromaDB for local-first RAG.

What it does

  • Upload and process PDF, DOCX, and TXT documents
  • Generate executive summaries, key insights, and document metrics
  • Detect contract and policy risks with section-linked references
  • Run grounded Q&A over a single document or an entire knowledge base
  • Search semantically across multiple uploaded documents
  • Compare two documents for structural differences
  • Generate research study packs: simple explanations, key contributions, slides, flashcards, and quizzes
  • Run an autonomous document agent to synthesize multi-document reports
  • Export analysis reports as Markdown, text, or PDF

System Architecture

See [architecture.md](docs/architecture.md) for the diagram.

```mermaid
flowchart TD
    A[Flutter Web + Mobile] --> B[FastAPI API]
    B --> C[Document Parsing]
    C --> D[Chunking]
    D --> E[SentenceTransformers]
    E --> F[ChromaDB]
    F --> G[Retriever]
    G --> H[Ollama Local LLM]
    H --> I[Insights + Q&A + Research + Agent Reports]
```
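The retriever step in the pipeline can be illustrated with a minimal pure-Python sketch. The real project stores SentenceTransformers vectors in ChromaDB; here, brute-force cosine similarity over an in-memory list of `(chunk, vector)` pairs stands in for that, and the function names are illustrative:

```python
import math

def cosine(a, b):
    # cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, index, top_k=3):
    # index: list of (chunk_text, vector) pairs; return the top_k closest chunks
    scored = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in scored[:top_k]]
```

The retrieved chunks are what gets packed into the Ollama prompt, which is why chunking quality directly affects answer grounding.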

Folder Structure

```
Documentify/
├── lib/
│   ├── core/
│   ├── models/
│   ├── screens/
│   ├── services/
│   └── widgets/
├── backend/
│   ├── app/
│   │   ├── agent/
│   │   ├── core/
│   │   ├── document_parser/
│   │   ├── embeddings/
│   │   ├── rag/
│   │   ├── research_mode/
│   │   ├── risk_analysis/
│   │   ├── routes/
│   │   ├── schemas/
│   │   └── services/
│   ├── data/
│   ├── Dockerfile
│   └── requirements.txt
├── docs/
│   └── architecture.md
├── docker-compose.yml
└── Dockerfile.web
```

Backend Implementation

The FastAPI backend provides a local-first document intelligence pipeline:

  • Parsing: PyMuPDF, pdfplumber, python-docx, and optional local OCR via pytesseract
  • Chunking: section-aware semantic chunking with overlap
  • Embeddings: SentenceTransformers using all-MiniLM-L6-v2
  • Vector retrieval: persistent ChromaDB
  • LLM inference: Ollama with open-source models such as mistral
  • Risk analysis: domain heuristics for contracts, financial reports, and policy documents
  • Research mode: study materials and presentation outputs
  • Autonomous agent: retrieval + synthesis workflow for cross-document reporting
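The chunking-with-overlap idea can be sketched in a few lines. This is a toy word-window version; the window size, overlap, and splitting granularity here are illustrative, not the project's actual parameters:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    # Split into windows of ~chunk_size words; each window overlaps the
    # previous one by `overlap` words so context survives chunk boundaries.
    # Assumes overlap < chunk_size.
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        if window:
            chunks.append(" ".join(window))
        if start + chunk_size >= len(words):
            break
    return chunks
```

Overlap trades a little index size for robustness: a sentence that straddles a boundary still appears intact in at least one chunk.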

API Endpoints

  • POST /upload-document
  • POST /process-document
  • GET /documents
  • GET /document-summary
  • GET /key-insights
  • GET /risk-analysis
  • POST /chat
  • GET /search
  • POST /research-mode
  • POST /agent-analysis
  • POST /compare-documents
  • GET /export-report
  • GET /health
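A minimal client-side sketch of how these endpoints might be addressed. The base URL matches the default local backend; the query parameter names are hypothetical, not confirmed from the API schemas:

```python
from urllib.parse import urlencode

API_BASE_URL = "http://127.0.0.1:8000"  # default local backend

def endpoint_url(path, **params):
    # Build a full URL for one of the endpoints above, e.g. GET /search.
    query = f"?{urlencode(params)}" if params else ""
    return f"{API_BASE_URL}{path}{query}"
```

For example, `endpoint_url("/search", q="indemnity")` yields `http://127.0.0.1:8000/search?q=indemnity`.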

Flutter Frontend Implementation

The Flutter app is designed as a modern AI SaaS dashboard with Material 3 and a dark visual theme:

  • 3-panel desktop layout: sidebar, semantic document viewer, AI insights rail
  • Responsive mobile/tablet experience with modal insights and dedicated chat
  • Startup-style login and onboarding screen
  • Upload workflow with processing pipeline states
  • Research mode with slides, flashcards, and quiz UI
  • Semantic viewer with section highlights and jump-to-reference behavior
  • Integrated chat for grounded document Q&A

API Integration

The frontend calls the local FastAPI backend through lib/services/api_service.dart. If the backend is not running yet, the app falls back to rich demo data so the interface still behaves like a polished product during development.

Set the backend URL at build/run time if needed:

```sh
flutter run --dart-define=API_BASE_URL=http://127.0.0.1:8000
```

Local Setup

1. Start Ollama

Install Ollama locally, then pull an open-source model:

```sh
ollama serve
ollama pull mistral
```

2. Run the backend

```sh
cd backend
python3 -m venv .venv
.venv/bin/python -m pip install -r requirements.txt
./run_dev.sh
```

Optional: enable OCR for scanned PDFs

The parser is OCR-ready and will automatically fall back to local OCR for low-text, image-heavy PDF pages when Tesseract is available.
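A sketch of what such a fallback decision can look like. The character threshold and function name are illustrative, not the parser's actual values:

```python
def should_ocr_page(extracted_text, min_chars=40):
    # A page whose text layer yields almost nothing is likely a scanned
    # image, so route it to Tesseract rather than trusting the text layer.
    return len(extracted_text.strip()) < min_chars
```

Pages that pass the threshold keep their fast native text extraction; only the image-heavy outliers pay the OCR cost.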

macOS:

```sh
brew install tesseract
```

Linux:

```sh
sudo apt-get install tesseract-ocr tesseract-ocr-eng
```

Optional environment variables:

```sh
export OCR_ENABLED=1
export OCR_LANGUAGE=eng
export OCR_RENDER_DPI=220
export OCR_MIN_CONFIDENCE=35
```

If Tesseract is installed in a non-standard location, set:

```sh
export TESSERACT_CMD=/full/path/to/tesseract
```
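One way the backend might read these variables is sketched below. The defaults mirror the values shown above, but treat the function name and the disabled-by-default assumption as illustrative rather than the project's actual config loading:

```python
import os

def load_ocr_config(env=None):
    # Read OCR settings from the environment, with the defaults shown above.
    # OCR_ENABLED defaults to off here (an assumption, not a documented fact).
    env = os.environ if env is None else env
    return {
        "enabled": env.get("OCR_ENABLED", "0") == "1",
        "language": env.get("OCR_LANGUAGE", "eng"),
        "render_dpi": int(env.get("OCR_RENDER_DPI", "220")),
        "min_confidence": int(env.get("OCR_MIN_CONFIDENCE", "35")),
        "tesseract_cmd": env.get("TESSERACT_CMD"),  # None -> use PATH lookup
    }
```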

3. Run the Flutter app

```sh
flutter pub get
flutter run -d chrome --dart-define=API_BASE_URL=http://127.0.0.1:8000
```

Docker Configuration

The repo includes:

  • [backend/Dockerfile](backend/Dockerfile)
  • [Dockerfile.web](Dockerfile.web)
  • [docker-compose.yml](docker-compose.yml)

Bring the full stack up with:

```sh
docker compose up --build
```

Services:

  • Frontend web app: http://localhost:8080
  • FastAPI backend: http://localhost:8000
  • Ollama: http://localhost:11434
  • Backend container includes tesseract-ocr for scanned-PDF OCR fallback

Notes

  • All models and libraries in this project are free and open source.
  • The backend is designed to run locally with no paid API dependency.
  • The frontend includes demo fallback data to keep the product presentable before the full local model stack is installed.
