Documentify is a local AI SaaS platform for analyzing large documents such as legal contracts, research papers, financial reports, and policy documents. It runs entirely on open-source tooling with Flutter on the frontend and FastAPI on the backend, using Ollama, SentenceTransformers, and ChromaDB for local-first RAG.
- Upload and process `PDF`, `DOCX`, and `TXT` documents
- Generate executive summaries, key insights, and document metrics
- Detect contract and policy risks with section-linked references
- Run grounded Q&A over a single document or an entire knowledge base
- Search semantically across multiple uploaded documents
- Compare two documents for structural differences
- Generate research study packs: simple explanations, key contributions, slides, flashcards, and quizzes
- Run an autonomous document agent to synthesize multi-document reports
- Export analysis reports as Markdown, text, or PDF
See [architecture.md](docs/architecture.md) for the diagram. In Mermaid notation, the pipeline is:

    flowchart TD
        A[Flutter Web + Mobile] --> B[FastAPI API]
        B --> C[Document Parsing]
        C --> D[Chunking]
        D --> E[SentenceTransformers]
        E --> F[ChromaDB]
        F --> G[Retriever]
        G --> H[Ollama Local LLM]
        H --> I[Insights + Q&A + Research + Agent Reports]

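The retrieval half of this flow (embed, rank, hand the best chunks to the LLM) can be sketched with the standard library alone. This is a minimal illustration, not the project's code: the word-count `embed` function is a toy stand-in for SentenceTransformers, the in-memory list stands in for ChromaDB, and the names `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Rank chunks by similarity to the query and keep the best top_k."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble a grounded prompt; the wording is illustrative only."""
    joined = "\n---\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


chunks = [
    "Either party may terminate this agreement with thirty days notice.",
    "Payment is due within sixty days of the invoice date.",
    "The supplier shall maintain liability insurance.",
]
top = retrieve("When is payment due?", chunks, top_k=1)
prompt = build_prompt("When is payment due?", top)
```

In a real deployment the ranking step would be delegated to the vector store and the prompt would be sent to the local model; only the shape of the data flow is meant to carry over.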

The repository is laid out as follows:

    Documentify/
    ├── lib/
    │   ├── core/
    │   ├── models/
    │   ├── screens/
    │   ├── services/
    │   └── widgets/
    ├── backend/
    │   ├── app/
    │   │   ├── agent/
    │   │   ├── core/
    │   │   ├── document_parser/
    │   │   ├── embeddings/
    │   │   ├── rag/
    │   │   ├── research_mode/
    │   │   ├── risk_analysis/
    │   │   ├── routes/
    │   │   ├── schemas/
    │   │   └── services/
    │   ├── data/
    │   ├── Dockerfile
    │   └── requirements.txt
    ├── docs/
    │   └── architecture.md
    ├── docker-compose.yml
    └── Dockerfile.web

The FastAPI backend provides a local-first document intelligence pipeline:
- Parsing: `PyMuPDF`, `pdfplumber`, `python-docx`, and optional local OCR via `pytesseract`
- Chunking: section-aware semantic chunking with overlap
- Embeddings: `SentenceTransformers` using `all-MiniLM-L6-v2`
- Vector retrieval: persistent `ChromaDB`
- LLM inference: `Ollama` with open-source models such as `mistral`
- Risk analysis: domain heuristics for contracts, financial reports, and policy documents
- Research mode: study materials and presentation outputs
- Autonomous agent: retrieval + synthesis workflow for cross-document reporting
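
The chunking step listed above can be illustrated with a short sketch. It assumes that sections are separated by blank lines and that chunk size and overlap are measured in words; the backend's actual section detection and sizing may well differ, and the names `chunk_words` and `chunk_document` are hypothetical.

```python
def chunk_words(words: list[str], size: int, overlap: int) -> list[str]:
    """Slide a window of `size` words, stepping by `size - overlap`."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the section
    return chunks


def chunk_document(text: str, size: int = 200, overlap: int = 40) -> list[dict]:
    """Chunk each section separately so no chunk straddles a section boundary."""
    results = []
    # Assumption: a blank line marks a section boundary.
    sections = [s for s in text.split("\n\n") if s.strip()]
    for index, section in enumerate(sections):
        for piece in chunk_words(section.split(), size, overlap):
            results.append({"section": index, "text": piece})
    return results


sample = "one two three four five six seven\n\nalpha beta"
pieces = chunk_document(sample, size=4, overlap=2)
```

Keeping the section index next to each chunk is what would let later stages link an answer or a risk finding back to the section it came from.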

The API exposes the following endpoints:

- `POST /upload-document`
- `POST /process-document`
- `GET /documents`
- `GET /document-summary`
- `GET /key-insights`
- `GET /risk-analysis`
- `POST /chat`
- `GET /search`
- `POST /research-mode`
- `POST /agent-analysis`
- `POST /compare-documents`
- `GET /export-report`
- `GET /health`
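
As a rough sketch of how a client might talk to these endpoints, the snippet below builds requests for `GET /health` and `POST /chat` with the standard library. The paths and the `http://127.0.0.1:8000` base URL come from this README; the JSON field names `question` and `document_id` and the assumption that responses are JSON are guesses made for illustration, so check the backend schemas before relying on them.

```python
import json
import urllib.request

API_BASE_URL = "http://127.0.0.1:8000"  # backend address used elsewhere in this README


def health_request(base_url: str = API_BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a GET /health request."""
    return urllib.request.Request(f"{base_url}/health", method="GET")


def chat_request(question: str, document_id: str,
                 base_url: str = API_BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /chat request."""
    # Hypothetical body: the real field names are defined by the backend schemas.
    body = json.dumps({"question": question, "document_id": document_id})
    return urllib.request.Request(
        f"{base_url}/chat",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send a request and decode the response, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


req = chat_request("What is the termination clause?", "doc-123")
```

With the backend running, `send(health_request())` is a quick way to confirm the API is reachable before trying the other endpoints.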
The Flutter app is designed as a modern AI SaaS dashboard with Material 3 and a dark visual system:
- 3-panel desktop layout: sidebar, semantic document viewer, AI insights rail
- Responsive mobile/tablet experience with modal insights and dedicated chat
- Startup-style login and onboarding screen
- Upload workflow with processing pipeline states
- Research mode with slides, flashcards, and quiz UI
- Semantic viewer with section highlights and jump-to-reference behavior
- Integrated chat for grounded document Q&A
The frontend calls the local FastAPI backend through `lib/services/api_service.dart`. If the backend is not running yet, the app falls back to rich demo data so the interface still behaves like a polished product during development.
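
The fallback behaviour described above follows a common pattern: try the live backend, and serve bundled demo data if the call fails. The real logic lives in the Dart service; the sketch below only illustrates the pattern in Python, and `DEMO_DOCUMENTS`, `load_documents`, and the returned `is_demo` flag are hypothetical.

```python
from typing import Callable

# Hypothetical bundled sample data shown when the backend is unreachable.
DEMO_DOCUMENTS = [{"id": "demo-1", "title": "Sample contract (demo data)"}]


def load_documents(fetch: Callable[[], list]) -> tuple[list, bool]:
    """Return (documents, is_demo), falling back to demo data on network errors."""
    try:
        return fetch(), False
    except OSError:
        # urllib.error.URLError, ConnectionRefusedError, and socket timeouts
        # are all subclasses of OSError, so one handler covers them.
        return DEMO_DOCUMENTS, True


def _backend_down() -> list:
    """Simulate a backend that is not running."""
    raise ConnectionRefusedError("backend not running")


offline_docs, offline_is_demo = load_documents(_backend_down)
online_docs, online_is_demo = load_documents(lambda: [{"id": "1", "title": "Lease"}])
```

Returning the `is_demo` flag alongside the data lets the UI label demo content so that it is not mistaken for real analysis results.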
Set the backend URL at build/run time if needed:

    flutter run --dart-define=API_BASE_URL=http://127.0.0.1:8000

Install Ollama locally, start the server, and pull an open-source model from a second terminal:

    ollama serve
    ollama pull mistral

Then set up the backend in a virtual environment and launch it with the dev script:

    cd backend
    python3 -m venv .venv
    .venv/bin/python -m pip install -r requirements.txt
    ./run_dev.sh

The parser is OCR-ready and automatically falls back to local OCR for low-text, image-heavy PDF pages when Tesseract is available.
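
A minimal sketch of that fallback decision is shown below. It assumes that a page counts as "low-text" when it yields fewer than `min_chars` characters and contains at least one image; the threshold, the function names, and the exact rule are hypothetical, and the real parser may use different signals. `TESSERACT_CMD` is the optional override documented in this README.

```python
import os
import shutil


def tesseract_available() -> bool:
    """True if the Tesseract binary can be located, honouring TESSERACT_CMD."""
    command = os.environ.get("TESSERACT_CMD", "tesseract")
    return shutil.which(command) is not None


def needs_ocr(page_text: str, image_count: int, min_chars: int = 50) -> bool:
    """Heuristic: little extractable text on a page that contains images."""
    return len(page_text.strip()) < min_chars and image_count > 0


def choose_extractor(page_text: str, image_count: int, ocr_ready: bool) -> str:
    """Pick "ocr" only when the page looks scanned and OCR can actually run."""
    if ocr_ready and needs_ocr(page_text, image_count):
        return "ocr"
    return "text"


scanned_page = choose_extractor("", image_count=2, ocr_ready=True)
no_tesseract = choose_extractor("", image_count=2, ocr_ready=False)
normal_page = choose_extractor("x" * 500, image_count=2, ocr_ready=True)
```

In practice `ocr_ready` would be computed once at startup with `tesseract_available()`, so that a missing binary degrades to plain text extraction instead of raising an error on every page.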
macOS:

    brew install tesseract

Linux (Debian/Ubuntu):

    sudo apt-get install tesseract-ocr tesseract-ocr-eng

Optional environment variables:

    export OCR_ENABLED=1
    export OCR_LANGUAGE=eng
    export OCR_RENDER_DPI=220
    export OCR_MIN_CONFIDENCE=35

If Tesseract is installed in a non-standard location, set:

    export TESSERACT_CMD=/full/path/to/tesseract

To run the Flutter frontend, fetch the dependencies and launch it in Chrome:

    flutter pub get
    flutter run -d chrome --dart-define=API_BASE_URL=http://127.0.0.1:8000

The repo includes:
- [backend/Dockerfile](backend/Dockerfile)
- [Dockerfile.web](Dockerfile.web)
- [docker-compose.yml](docker-compose.yml)
Bring the full stack up with:

    docker compose up --build

Services:
- Frontend web app: `http://localhost:8080`
- FastAPI backend: `http://localhost:8000`
- Ollama: `http://localhost:11434`
- Backend container includes `tesseract-ocr` for scanned-PDF OCR fallback
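
Once the stack is up, a small script can confirm that each service answers. The sketch below probes the three addresses listed above, using the backend's `GET /health` endpoint; the names `SERVICES`, `is_up`, and `report` are hypothetical, and it assumes the frontend and Ollama return a 2xx status at their root URLs.

```python
import urllib.request

SERVICES = {
    "frontend": "http://localhost:8080",
    "backend": "http://localhost:8000/health",
    "ollama": "http://localhost:11434",
}


def is_up(url: str, opener=urllib.request.urlopen, timeout: float = 3.0) -> bool:
    """True if the URL answers with a 2xx status; False on any network error."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses,
        # since urllib's URLError and HTTPError both derive from OSError.
        return False


def report(opener=urllib.request.urlopen) -> dict:
    """Map each service name to whether it responded."""
    return {name: is_up(url, opener) for name, url in SERVICES.items()}
```

Calling `report()` after `docker compose up --build` returns a dictionary such as `{"frontend": True, "backend": True, "ollama": True}` when everything is reachable. The `opener` parameter exists only so the check can be exercised without a live network.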
- All models and libraries in this project are free and open source.
- The backend is designed to run locally with no paid API dependency.
- The frontend includes demo fallback data to keep the product presentable before the full local model stack is installed.