A document-aware, hallucination-resistant chatbot built with OpenAI, LangChain, Retrieval-Augmented Generation (RAG), and embeddings. It answers your questions from uploaded notes (PDF, DOCX, or TXT), and if the answer isn't present in your notes, it falls back to Wikipedia with a disclaimer.
Features
Upload notes (PDF, DOCX, or TXT) and ask questions about them directly.
Uses Retrieval-Augmented Generation (RAG) to fetch answers from your notes.
Falls back to a Wikipedia search if the answer is not in the provided documents.
Beautiful Streamlit UI with custom chat bubbles and Lottie animations.
Mitigates LLM hallucinations by grounding answers in your notes and making the source of each answer transparent.
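The notes-first, Wikipedia-second behaviour described above can be sketched as a small routing function. This is a minimal illustration, not the project's actual code: `answer_with_fallback`, the two callables it receives, and the wording of `DISCLAIMER` are all hypothetical names chosen for this example.

```python
from typing import Callable, Optional

# Hypothetical disclaimer text; the real app may word this differently.
DISCLAIMER = "Note: this answer was not found in your notes; it comes from Wikipedia."


def answer_with_fallback(
    question: str,
    answer_from_notes: Callable[[str], Optional[str]],
    answer_from_wikipedia: Callable[[str], Optional[str]],
) -> dict:
    """Prefer an answer from the notes; otherwise use Wikipedia and label it."""
    notes_answer = answer_from_notes(question)
    if notes_answer:
        return {"source": "notes", "text": notes_answer}

    wiki_answer = answer_from_wikipedia(question)
    if wiki_answer:
        # Prepend the disclaimer so the user can see the answer is not from their notes.
        return {"source": "wikipedia", "text": f"{DISCLAIMER}\n\n{wiki_answer}"}

    return {"source": "none", "text": "No answer found in your notes or on Wikipedia."}
```

Returning the source alongside the text lets the UI show where each answer came from, which is what keeps the fallback transparent.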
Tech Stack
Language Model: GPT-3.5 Turbo (OpenAI)
Frameworks & Libraries:
LangChain: Document retrieval & chains
FAISS: Vector similarity search
PyPDF2: Extract text from PDFs
python-docx: Extract text from DOCX files
Streamlit: Interactive UI
dotenv: API key management
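As a rough illustration of the API key management mentioned above, the key can be kept in a `.env` file and read at startup. This sketch assumes the variable is named `OPENAI_API_KEY` (the name is not stated here) and uses a hypothetical helper `get_openai_api_key`; the `try`/`except` only keeps the example runnable when the python-dotenv package is not installed.

```python
import os

try:
    from dotenv import load_dotenv  # provided by the python-dotenv package
except ImportError:
    # Stand-in so the sketch still runs without python-dotenv installed.
    def load_dotenv(*args, **kwargs):
        return False


def get_openai_api_key() -> str:
    """Return the OpenAI key from the environment, loading a .env file first if present."""
    load_dotenv()  # by default, existing environment variables are not overridden
    key = os.getenv("OPENAI_API_KEY", "")  # assumed variable name
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env or the environment.")
    return key
```

Keeping the key in `.env` (and out of version control) avoids hard-coding it in the app.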
How It Works
Upload your notes (PDF, DOCX, or TXT).
Notes are split into chunks using RecursiveCharacterTextSplitter.
Chunks are converted into embeddings using OpenAIEmbeddings.
Each user query is embedded and compared against the stored chunks using FAISS similarity search to find the most relevant ones.
The chatbot generates a response using ChatOpenAI with a custom prompt.
If no relevant context is found, the bot fetches a short summary from Wikipedia instead.
Stylized chat bubbles display the user query and bot response in a beautiful Streamlit UI.
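The chunking and retrieval steps above can be illustrated with a dependency-free sketch. It is a deliberately simplified stand-in, not the app's code: fixed-size overlapping windows replace RecursiveCharacterTextSplitter, a bag-of-words counter replaces OpenAIEmbeddings, and a brute-force cosine ranking replaces the FAISS index. The function names, the chunk sizes, and the `min_score` threshold are assumptions made for the example.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Cut text into fixed-size character windows that overlap their neighbours."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]


def embed(text: str) -> Counter:
    """Toy embedding: word counts. The real app uses a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_chunks(query: str, chunks: List[str], k: int = 2,
               min_score: float = 0.1) -> List[Tuple[float, str]]:
    """Return up to k (score, chunk) pairs, best first, dropping weak matches."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(score, chunk) for score, chunk in scored[:k] if score >= min_score]


if __name__ == "__main__":
    notes = [
        "Photosynthesis converts light energy into chemical energy in plants.",
        "The mitochondria is the site of cellular respiration.",
    ]
    for score, chunk in top_chunks("What does photosynthesis convert?", notes):
        print(f"{score:.2f}  {chunk}")
```

In this sketch, an empty list from `top_chunks` plays the role of "no relevant context found", which is the point at which the Wikipedia fallback would take over.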