A local RAG pipeline for semantic document search and question answering, built with Ollama LLMs, a ChromaDB vector database, and a Streamlit interface.
The application lets users upload a PDF or image file and ask questions directly about the document's content.
Built for learning and experimenting with:
- RAG pipelines
- Local LLM inference
- Vector search systems

Features:
- Upload PDF / image files
- Embedding with `qwen3-embedding:0.6b`
- Vector storage using ChromaDB (persistent)
- LLM inference with `qwen3:1.7b`, `deepseek-r1`, or `llama3.2:3b`
- Web UI built with Streamlit

Tech stack:
- Language: Python 3.10+
- Text splitting: LangChain
- Data processing: pandas, kagglehub
- Vector Database: ChromaDB (persistent)
- LLM & Embedding: Ollama (local inference)
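
To make the split, embed, and search steps listed above concrete, here is a minimal standard-library sketch. It is not the app's actual code: the function names (`split_text`, `embed`, `cosine`, `top_k`) are hypothetical, the bag-of-words `embed` is only a stand-in for an embedding model such as `qwen3-embedding:0.6b`, and the in-memory list of chunks stands in for a ChromaDB collection. The fixed-size splitter is a simplified substitute for LangChain's text splitters.

```python
import math
import re
from collections import Counter


def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap by `overlap` chars."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question, best match first."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    doc = (
        "ChromaDB stores embeddings on disk. "
        "Streamlit renders the web UI. "
        "Ollama runs models locally."
    )
    pieces = split_text(doc, chunk_size=40, overlap=10)
    print(top_k("Where are embeddings stored?", pieces, k=1))
```

In a real pipeline the retrieved chunks would be placed into the prompt sent to the LLM. The overlap between chunks reduces the chance that a sentence relevant to the question is cut in half at a chunk boundary.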

Setup:
- Clone the repository:
  ```
  git clone https://github.com/your-username/rag-document-reader.git
  cd rag-document-reader
  ```
- Create and activate a virtual environment (the activation command shown is for Windows; on macOS/Linux use `source .venv/bin/activate`):
  ```
  python -m venv .venv
  .venv\Scripts\activate
  ```
- Install dependencies:
  ```
  pip install -r requirements.txt
  ```
- Install Ollama and pull the models:
  - Download Ollama: https://ollama.com/download
  - Pull the embedding and LLM models, then start the Ollama server if it is not already running:
    ```
    ollama pull qwen3-embedding:0.6b
    ollama pull qwen3:1.7b
    ollama pull deepseek-r1
    ollama pull llama3.2:3b
    ollama serve
    ```
- Run the chatbot with Streamlit:
  ```
  streamlit run app.py
  ```
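
Before launching the app, it can help to confirm that the Ollama server is reachable and that the models above have been pulled. The following is a hedged sketch rather than part of the repository: `missing_models` and `installed_models` are hypothetical helper names, and the script assumes Ollama is listening on its default address `http://localhost:11434` and queries its `/api/tags` endpoint, which lists locally available models.

```python
import json
import urllib.request

# Models named in this README; adjust if you use a different set.
REQUIRED_MODELS = ["qwen3-embedding:0.6b", "qwen3:1.7b", "deepseek-r1", "llama3.2:3b"]


def missing_models(required: list[str], installed: list[str]) -> list[str]:
    """Return the required models that are not in the installed list.

    A required name without a tag (e.g. "deepseek-r1") matches any installed
    tag of that model, such as "deepseek-r1:latest".
    """
    missing = []
    for name in required:
        if ":" in name:
            found = name in installed
        else:
            found = any(item.split(":", 1)[0] == name for item in installed)
        if not found:
            missing.append(name)
    return missing


def installed_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Ask a running Ollama server which models are available locally."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        payload = json.load(resp)
    return [model["name"] for model in payload.get("models", [])]


if __name__ == "__main__":
    try:
        gaps = missing_models(REQUIRED_MODELS, installed_models())
    except (OSError, ValueError) as exc:
        # URLError and timeouts are OSError subclasses; ValueError covers bad JSON.
        print(f"Could not query Ollama: {exc}")
    else:
        print("Missing models:", gaps or "none")
```

If the script reports missing models, run the corresponding `ollama pull` command and try again.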
