💬 Chatbot with Memory + PDF Q&A

An AI chatbot that remembers the conversation and answers questions about your PDF documents using Retrieval-Augmented Generation (RAG).

🎯 Overview

This project demonstrates a modern Conversational AI application that combines:

Conversational Memory - Remembers previous interactions
PDF Processing - Extracts and analyzes document content
Semantic Search - Finds relevant information using embeddings
RAG (Retrieval-Augmented Generation) - Answers questions using document context

Perfect for building intelligent assistants that can answer questions about your documents!

✨ Key Features

Feature	Description
💬 Memory System	Maintains conversation history for context-aware responses
📄 PDF Support	Upload and process PDF documents instantly
🔍 Smart Search	Semantic search using HuggingFace embeddings
🤖 AI Powered	Google Gemini API for intelligent responses
⚡ Fast Retrieval	FAISS vector database for efficient document search
🎨 Modern UI	Clean and intuitive Streamlit interface
🆓 Free to Use	Leverages free Gemini API tier

🛠️ Technology Stack

Frontend & Framework
├── Streamlit           → Interactive web interface
├── LangChain           → AI/ML orchestration framework
└── Python              → Core programming language

AI & NLP
├── Google Gemini API   → Large Language Model
├── HuggingFace         → Embeddings (sentence-transformers)
└── LangChain Core      → LLM chains and prompts

Data Storage
├── FAISS               → Vector similarity search
├── PyPDF               → PDF parsing
└── Session State       → Conversation memory

📋 Requirements

Python 3.8 or higher
Google API Key (free tier available)
Internet connection

🚀 Quick Start

1️⃣ Clone the Repository

git clone https://github.com/YOUR_USERNAME/chatbot_with_memory.git
cd chatbot_with_memory

2️⃣ Install Dependencies

pip install -r requirements.txt

3️⃣ Set Up Environment Variables

Create a .env file in the root directory:

GOOGLE_API_KEY=your_gemini_api_key_here

Get your free API key:

Visit Google AI Studio
Click "Create API Key"
Copy the key and paste it into your .env file
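For reference, here is a minimal sketch of how the key might be read at startup. It assumes python-dotenv's `load_dotenv()` when the package is installed and falls back to variables already set in the shell otherwise. The helper name `get_api_key` is hypothetical and is not taken from app.py.

```python
import os

try:
    # python-dotenv is listed in requirements.txt; load_dotenv() copies .env entries into os.environ.
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    # Without python-dotenv, only variables already exported in the shell are visible.
    pass


def get_api_key(env=None):
    """Return GOOGLE_API_KEY or raise a clear error (hypothetical helper)."""
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    # Treat the placeholder from the template as "not set".
    if not key or key == "your_gemini_api_key_here":
        raise RuntimeError("GOOGLE_API_KEY not found: add it to your .env file")
    return key
```

Failing early with a clear message tends to be easier to debug than an authentication error raised later by the first Gemini call.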

4️⃣ Run the Application

streamlit run app.py

The app will open at http://localhost:8501 🎉

💡 How It Works

Architecture Flow

User Input
    ↓
[PDF Processing] ← Upload PDF
    ↓
[Text Splitting] → Break into chunks
    ↓
[Embeddings] → Convert to vectors (HuggingFace)
    ↓
[FAISS Index] → Store vectors for fast search
    ↓
[Query Retrieval] → Find relevant documents
    ↓
[LLM + Context] → Gemini generates answer
    ↓
[Output] → Display to user + Sources
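The flow above can be sketched end to end in plain Python. This is a toy illustration only: word counts stand in for the HuggingFace embeddings and a plain list stands in for the FAISS index, so every function name here is hypothetical and none of it reflects the actual code in app.py.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for sentence-transformers)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(chunks):
    """Store each chunk next to its vector (a stand-in for the FAISS index)."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index, query, k=4):
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


chunks = [
    "FAISS stores vectors for fast similarity search.",
    "Streamlit renders the chat interface.",
    "Gemini generates the final answer from retrieved context.",
]
index = build_index(chunks)
top = retrieve(index, "Which component stores vectors?", k=1)
print(top)
```

The real pipeline swaps in dense sentence embeddings, which match on meaning rather than on shared words, but the index-then-rank shape is the same.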

Step-by-Step Usage

Upload a PDF
- Click "Choose File" in the sidebar
- Select your PDF document
- Click "Process PDF"
Ask Questions
- Type your question in the chat input
- The chatbot retrieves relevant content from the PDF
- Gemini generates an intelligent answer
View Sources
- Expand the "Sources" section to see which PDF content was used

📁 Project Structure

chatbot_with_memory/
│
├── app.py                 # Main Streamlit application
├── requirements.txt       # Python dependencies
├── .env.example          # Environment variables template
├── .gitignore            # Git ignore rules
├── README.md             # This file
│
└── vectorstore/          # FAISS vector database (auto-created)
    └── index.faiss

📦 Dependencies

All required packages are in requirements.txt:

streamlit              # Web framework
langchain              # AI orchestration
langchain-google-genai # Gemini integration
langchain-community    # Additional tools
langchain-core         # Core utilities
langchain-text-splitters # Text chunking
python-dotenv          # Environment management
faiss-cpu              # Vector search
sentence-transformers  # Embeddings
pypdf                  # PDF parsing

🎓 Example Usage

# The app handles everything through the UI, but here's the flow:

# 1. Upload PDF → creates embeddings
# 2. User asks: "What is the main topic?"
# 3. System retrieves: Top 4 relevant document chunks
# 4. Gemini generates: Context-aware answer
# 5. Display: Answer + Source references
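Steps 3 and 4 come down to assembling one prompt from the retrieved chunks, the chat history, and the new question. The sketch below shows one plausible way to do that. The function name `build_prompt` and the prompt wording are assumptions for illustration, not the prompt used in app.py.

```python
def build_prompt(question, context_chunks, history):
    """Combine chat history, retrieved chunks, and the new question into one prompt string."""
    history_text = "\n".join(f"{role}: {text}" for role, text in history)
    # Number each chunk so the answer can be traced back to a source.
    context_text = "\n\n".join(
        f"[Source {i}] {chunk}" for i, chunk in enumerate(context_chunks, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Conversation so far:\n{history_text or '(none)'}\n\n"
        f"Context:\n{context_text or '(no documents retrieved)'}\n\n"
        f"Question: {question}"
    )


prompt = build_prompt(
    "What is the main topic?",
    ["Chunk about topic A.", "Chunk about topic B."],
    [("user", "Hi"), ("assistant", "Hello! Upload a PDF to begin.")],
)
print(prompt)
```

Numbering the chunks is what makes a "Sources" panel possible: the same list that feeds the prompt can be displayed next to the answer.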

⚙️ Configuration

Customize Retrieval

Edit in app.py:

retriever = vectorstore.as_retriever(search_kwargs={"k": 4})  # Change k for more/fewer results

Change LLM Model

Update in app.py:

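from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",  # Or another Gemini model, e.g. gemini-2.5-pro
    google_api_key=GOOGLE_API_KEY,  # Loaded from .env
    temperature=0.3  # Lower = more focused, higher = more varied
)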
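As general background on what temperature does (not a description of Gemini's internals), sampling temperature rescales candidate scores before they are turned into probabilities. The sketch below uses made-up scores for three candidate tokens to show that a lower temperature concentrates probability on the top candidate.

```python
import math


def softmax_with_temperature(scores, temperature):
    """Convert raw scores to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
focused = softmax_with_temperature(scores, 0.3)
varied = softmax_with_temperature(scores, 1.0)
print([round(p, 3) for p in focused])
print([round(p, 3) for p in varied])
```

For document Q&A a low value such as 0.3 is a reasonable default, since answers should stay close to the retrieved text.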

Adjust Text Chunk Size

from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,      # Maximum characters per chunk
    chunk_overlap=200     # Characters shared between neighboring chunks
)
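To see how the two settings interact, here is a simplified sliding-window splitter. It is a sketch, not the library's algorithm: RecursiveCharacterTextSplitter also tries to break on separators such as paragraphs and line breaks, which this hypothetical `chunk_text` function ignores.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size windows; each window repeats the tail of the previous one."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


pieces = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2)
print(pieces)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

With the defaults above and k=4 in the retriever, each question sends at most about 4 × 1000 = 4000 characters of document context to the model.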

🔧 Troubleshooting

Issue	Solution
`ModuleNotFoundError`	Run `pip install -r requirements.txt`
`GOOGLE_API_KEY not found`	Check `.env` file and verify key is added
`No module named 'faiss'`	Install: `pip install faiss-cpu`
Slow PDF processing	Increase chunk_size in `RecursiveCharacterTextSplitter` so fewer chunks need to be embedded
Poor answer quality	Increase the k value in the retriever to give the model more context
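When an import error appears, it can help to check every dependency at once. The script below is a sketch that uses the standard library to list modules that cannot be found. The import names are the usual ones for the packages in requirements.txt, and the function name `missing_modules` is hypothetical.

```python
import importlib.util

# Import names for the packages in requirements.txt. Some differ from the pip name,
# e.g. faiss-cpu installs the module "faiss" and python-dotenv installs "dotenv".
REQUIRED_MODULES = [
    "streamlit", "langchain", "langchain_google_genai", "langchain_community",
    "langchain_core", "langchain_text_splitters", "dotenv", "faiss",
    "sentence_transformers", "pypdf",
]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Run it with the same Python interpreter that launches Streamlit; a mismatch between virtual environments is a common cause of `ModuleNotFoundError`.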

🌟 Tips for Best Results

✅ Upload clear, well-formatted PDFs
✅ Ask specific questions - More specific = better answers
✅ Use PDF reports, manuals, or books - Works great with technical docs
✅ Ask follow-up questions - Leverage conversation memory
✅ Check sources - Verify the answer against referenced documents

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🤝 Contributing

Contributions are welcome! Feel free to:

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

📧 Contact & Support

Issues? Open a GitHub issue
Ideas? Discussions are welcome
Questions? Check existing issues first

🙏 Acknowledgments

LangChain - AI framework
Google Gemini API - LLM
Streamlit - Web framework
FAISS - Vector search
HuggingFace - Embeddings

⭐ If you find this helpful, please give it a star!

Made with ❤️ for the AI community

📸 Features Overview

Chat Memory

The chatbot remembers previous conversation history using LangChain memory.
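As an illustration of the idea, conversation memory can be as simple as a bounded list of messages that is replayed with each new question. The class below is a hypothetical sketch; the app itself keeps its history in Streamlit session state and LangChain, not in this class.

```python
class ChatMemory:
    """Keep the most recent messages so follow-up questions have context."""

    def __init__(self, max_messages=10):
        if max_messages < 1:
            raise ValueError("max_messages must be at least 1")
        self.max_messages = max_messages
        self.messages = []

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})
        # Drop the oldest messages once the window is full.
        self.messages = self.messages[-self.max_messages:]

    def as_text(self):
        """Render the history as 'role: content' lines for inclusion in a prompt."""
        return "\n".join(f"{m['role']}: {m['content']}" for m in self.messages)


memory = ChatMemory(max_messages=2)
memory.add("user", "What is the report about?")
memory.add("assistant", "It covers quarterly sales.")
memory.add("user", "Which region grew fastest?")
print(memory.as_text())
```

Capping the window keeps the prompt size predictable in long conversations, at the cost of forgetting the earliest turns.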

PDF Question Answering

Users can upload PDFs and ask contextual questions from the document.

Semantic Search

FAISS retrieves the most relevant document chunks using vector embeddings.

☁️ Deployment

Recommended platforms:

Hugging Face Spaces
Streamlit Community Cloud

🧪 Example Use Cases

AI document assistant
Resume analyzer
Research paper Q&A
Company knowledge chatbot
Educational assistant
Legal document search

📈 Future Improvements

Multi-PDF support
Persistent database memory
Voice input/output
Authentication system
Chat export
Image understanding
Local LLM support with Ollama

👨‍💻 Author

umran666

GitHub: https://github.com/your-username
