A desktop application that scans local directories, extracts content from files, and uses AI to generate summaries, describe images, and transcribe media—helping you understand and organize your filesystem.
FileVyasa is a personal AI assistant that scans, understands, and organizes local files. It proposes clear folder hierarchies and safely moves or renames files. The goal is to make finding and organizing files as effortless as conversing with an intelligent librarian who knows your work.
Current Status: v1.1 — Scan, extract, and summarize files with AI. Future versions will add clustering, folder planning, and safe file reorganization.
- Smart Folder Monitoring — Add folders to monitor, auto-detect changes on re-sync
- Content Extraction — Supports 30+ file types including documents, images, media, code, and archives
- AI Summarization — Generate concise summaries for documents using local (Ollama) or cloud LLMs (100+ providers via LiteLLM)
- Image Description — AI-powered descriptions for photos and images via Ollama llava
- Media Transcription — Audio/video transcription via faster-whisper (local, no API needed)
- OCR Support — Extract text from image-based PDFs using python-doctr
- Google Workspace — Extract content from Google Docs and Sheets (service account required)
- Desktop App — Native cross-platform Tauri app with file browser, search, and filtering
- Parallel Processing — Configurable concurrent extraction and AI processing for fast scans
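The parallel processing described above can be sketched with Python's standard library. This is an illustrative sketch only, not FileVyasa's actual implementation: `extract_stub`, `scan_parallel`, and the worker count are hypothetical names chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def extract_stub(path: Path) -> tuple[str, int]:
    """Placeholder extractor: returns the file name and its size in bytes."""
    return path.name, path.stat().st_size

def scan_parallel(paths: list[Path], max_workers: int = 4) -> list[tuple[str, int]]:
    """Extract content from many files concurrently.

    Extraction is I/O-bound, so a thread pool speeds it up even under
    the GIL; the pool size plays the role of a configurable concurrency limit.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(extract_stub, paths))
```

A thread pool (rather than processes) is a reasonable choice here because reading files and waiting on AI APIs spends most of its time blocked on I/O.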
```
file-vyasa/
├── backend/                    # Python FastAPI backend (v1.1.0)
├── frontend/                   # Tauri 2.x + React 19 desktop app
├── agentic_development_docs/   # Design docs and roadmap
├── handle_file_types/          # File type handling utilities
└── sample_data/                # Test data for development
```
- Python 3.11+ with uv
- Node.js 18+ with pnpm
- Rust (for Tauri desktop app)
- Ollama (recommended for local AI) or cloud LLM API key
Backend:

```bash
cd backend
uv sync
uv run python run.py
```

Frontend:

```bash
cd frontend
pnpm install
pnpm tauri:dev
```

Both services must run concurrently. The frontend connects to the backend at http://127.0.0.1:8000. Visit /docs for Swagger API documentation.
FileVyasa uses LiteLLM which supports 100+ LLM providers.
```bash
ollama pull llama3.2   # for document summaries
ollama pull llava      # for image descriptions (optional)
```

No API key required—runs entirely on your machine. Best for privacy-sensitive use cases.
Access larger models (like gpt-oss:120b) running on Ollama's high-end GPU infrastructure when your local hardware can't handle them.
- Sign up at ollama.com and get an API key from settings/keys
- Select "Ollama Cloud" as provider in Settings
- Enter your API key
Available models: gpt-oss:120b, gpt-oss:20b, qwen3:8b, qwen3:4b, llama3.1:70b, gemma3:27b, deepseek-r1:70b
```bash
export FILEVYASA_LLM_API_KEY=your-key-here
```

Configure the provider/model in the Settings panel or backend/config/settings.yaml.
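A settings.yaml for a local Ollama setup might look like the sketch below. The key names are assumptions made for illustration; consult backend/config/settings.yaml for the real schema.

```yaml
# Illustrative sketch only: key names are assumptions, not FileVyasa's actual schema.
llm:
  provider: ollama          # any LiteLLM-supported provider name
  model: llama3.2           # summarization model
  vision_model: llava       # image-description model (optional)
  api_key_env: FILEVYASA_LLM_API_KEY
processing:
  max_concurrent_extractions: 4
```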
| Category | Extensions |
|---|---|
| Documents | PDF (with OCR), DOCX, XLSX, PPTX, TXT, MD, RTF |
| Images | JPG, PNG, GIF, WEBP, HEIC, BMP, TIFF |
| Media | MP3, MP4, WAV, FLAC, M4A, MOV, AVI, MKV, WEBM, OGG |
| Code | PY, JS, TS, HTML, CSS, JSON, YAML, XML, and more |
| Notebooks | IPYNB (Jupyter) |
| Web | HTML, HTM, XML |
| Archives | ZIP, TAR, GZ, RAR, 7Z (metadata only) |
| Google Workspace | Google Docs, Google Sheets (via service account) |
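Routing a file to the right handler for the categories above reduces to an extension-to-category lookup. The sketch below mirrors an abridged slice of the table; it is an illustration, not FileVyasa's actual dispatch code.

```python
from pathlib import Path

# Extension-to-category map mirroring (an abridged slice of) the table above.
CATEGORY_BY_EXT = {
    ".pdf": "document", ".docx": "document", ".txt": "document", ".md": "document",
    ".jpg": "image", ".png": "image", ".webp": "image",
    ".mp3": "media", ".mp4": "media", ".wav": "media",
    ".py": "code", ".js": "code", ".json": "code",
    ".ipynb": "notebook",
    ".zip": "archive", ".tar": "archive",
}

def categorize(path: str) -> str:
    """Return a coarse file category, or 'unknown' for unsupported extensions."""
    return CATEGORY_BY_EXT.get(Path(path).suffix.lower(), "unknown")
```

Lower-casing the suffix makes the lookup case-insensitive, so `report.PDF` and `report.pdf` resolve to the same category.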
- v1.1 (Current) — Scan, extract, summarize with AI
- v1.2 — Structured file objects, detail panel, filters
- v1.3 — Clustering via Constella, folder planning board (preview)
- v1.4 — Action planning with approval state (no execution)
- v1.5 — Safe executor with rollback, duplicate detection
See design docs for detailed plans.
- Backend README — API endpoints, configuration, extractors, development
- Frontend README — UI architecture, components, building, development

