Omni-Doc is a robust local RAG (Retrieval-Augmented Generation) system engineered to handle massive, high-complexity technical documentation.
While standard AI loaders often fail on large-scale PDFs (3,000+ pages) due to dense tables, JSON schemas, and non-standard formatting, Omni-Doc uses a "Mechanical Split" architecture to guarantee 100% stability.
- Orchestration: LangGraph (State-machine reasoning)
- Data Framework: LlamaIndex (Data indexing & retrieval)
- LLM: Gemma 3:4B via Ollama
- Embeddings: Nomic-Embed-Text via Ollama
- Schema: Pydantic for structured, cited responses
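As a sketch of what a Pydantic schema for structured, cited responses might look like (the class and field names here are illustrative assumptions, not Omni-Doc's actual schema):

```python
from pydantic import BaseModel, Field


class Citation(BaseModel):
    """A pointer back into the source documentation."""
    source_file: str
    chunk_id: int
    quote: str = Field(description="Verbatim text supporting the answer")


class CitedAnswer(BaseModel):
    """An answer the agent may only emit together with its citations."""
    answer: str
    citations: list[Citation]
```

Validating the LLM's output against a schema like this rejects responses that omit citations before they ever reach the user.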
Omni-Doc is built to bypass the hard constraints of local embedding servers:
- Deterministic Splitting: We ignore unreliable "semantic" breaks in favor of strict, fixed-length character chunks. This ensures no single payload ever exceeds the API context limit.
- Purified Nodes: By stripping all metadata during the embedding phase, we maximize the available token space for actual content.
- Stateful Verification: Using a graph-based workflow, the agent must prove its answers exist within the documentation before responding.
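The deterministic-splitting step above can be sketched in a few lines; the chunk size and overlap values here are illustrative, not Omni-Doc's actual configuration:

```python
def mechanical_split(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into fixed-length character chunks with a small overlap.

    No chunk can exceed chunk_size, so no single embedding payload can
    exceed the server's context limit, regardless of document structure.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

Because the split ignores semantic boundaries, the overlap keeps a sentence that straddles a chunk edge fully visible in at least one chunk.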
Ensure Ollama is installed and the models are pulled:
```shell
# Start the Ollama container (substitute your container name)
sudo docker container start {name of container}

# Pull the required models
ollama pull gemma3:4b
ollama pull nomic-extract-text
```