code-quad3/ChatWithPdf

Fullstack Gen AI PDF Context-Aware Chatbot

Overview

This is a full-stack Gen AI application that functions as a PDF context-aware chatbot. Users can upload PDF documents and ask questions related to their content. The application leverages a Large Language Model (LLM) to provide relevant answers based on the PDF context.

Setup

For Backend:

  1. Navigate to the backend directory:
    cd backend
  2. Create a virtual environment:
    python -m venv venv
  3. Activate the virtual environment:
    • Windows:
      venv\Scripts\activate
    • Linux/macOS:
      source venv/bin/activate
  4. Install the required dependencies:
    pip install -r requirements.txt
  5. Run the FastAPI application:
    uvicorn main:app --reload --host 0.0.0.0 --port 8000

For Frontend:

  1. Navigate to the frontend directory:
    cd frontend
  2. Install the npm dependencies:
    npm install
  3. Start the development server:
    npm run dev

API Endpoints

/upload (POST)

  • Receives files via a POST request.
  • Extracts text content from the uploaded PDF using PyMuPDF.
  • Divides the extracted text into smaller chunks.
  • Generates vector embeddings for each chunk using LangChain.
  • Runs text extraction and embedding generation in separate threads so that large files and CPU-intensive work are handled efficiently.
  • Stores file metadata (e.g., filename, upload timestamp) in a SQLite database.
  • Saves the generated vector embeddings in the FAISS vector index.
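
The chunking and threading steps above can be sketched with the standard library alone. This is a minimal illustration, not the repository's actual code: the names `chunk_text` and `process_upload`, the chunk size, and the overlap are all assumptions, and the project may well use a LangChain text splitter and a different threading mechanism. It requires Python 3.9 or later for `asyncio.to_thread`.

```python
import asyncio


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap slightly.

    Hypothetical helper: sizes are illustrative, not taken from the project.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


async def process_upload(text: str) -> list[str]:
    """Run the blocking chunking step in a worker thread.

    asyncio.to_thread hands the call to a thread so the event loop
    stays free to serve other requests while the work runs.
    """
    return await asyncio.to_thread(chunk_text, text)


if __name__ == "__main__":
    chunks = asyncio.run(process_upload("lorem ipsum " * 200))
    print(len(chunks), "chunks")
```

The overlap keeps a sentence that straddles a chunk boundary partly visible in both neighbouring chunks, which tends to help retrieval later on.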

/ask (POST)

  • Receives the user's question via the POST request body.
  • Retrieves the most relevant chunks from the FAISS index based on the user's question.
  • Combines the retrieved context (from the PDF) and the user's question.
  • Sends this combined information to the LLM.
  • Returns the LLM's response to the user.
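
The retrieval and prompt-assembly steps above can be illustrated as follows. This is a sketch under stated assumptions: a brute-force cosine-similarity search stands in for the FAISS index, the vectors are toy values rather than real embeddings, and the names `cosine`, `top_k`, and `build_prompt` as well as the prompt wording are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float],
          indexed: list[tuple[str, list[float]]],
          k: int = 2) -> list[str]:
    """Return the k chunks whose vectors are most similar to the query.

    Stand-in for a vector-index lookup; `indexed` holds (chunk, vector) pairs.
    """
    ranked = sorted(indexed, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, chunks: list[str]) -> str:
    """Combine retrieved context and the user's question into one prompt."""
    context = "\n\n".join(chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    index = [
        ("Invoices are due in 30 days.", [1.0, 0.0]),
        ("The office is closed on Sundays.", [0.0, 1.0]),
    ]
    chunks = top_k([0.9, 0.1], index, k=1)
    print(build_prompt("When are invoices due?", chunks))
```

The string returned by `build_prompt` is what would be sent to the LLM; the model's reply is then returned to the caller.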

Tools and Frameworks

  • Frontend:
    • Vite: A fast build tool and development server.
    • Tailwind CSS: A utility-first CSS framework.
    • React: A JavaScript library for building user interfaces.
  • Backend:
    • FastAPI: A modern, high-performance web framework for building APIs with Python.
    • PyMuPDF: A Python library for working with PDF and XPS documents.
    • LangChain: A framework for developing applications powered by language models.
  • Database:
    • FAISS: A library for efficient similarity search over vectors, used here as the vector store.
    • SQLite: A lightweight, disk-based database for storing metadata.
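
As an illustration of the metadata storage that /upload performs, the sketch below records a filename and upload timestamp with Python's built-in `sqlite3` module. The table name `documents`, its columns, and the helper names `init_db` and `record_upload` are assumptions; the repository's actual schema is not shown here.

```python
import sqlite3
from datetime import datetime, timezone


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) metadata table if it does not exist yet."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS documents (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            filename TEXT NOT NULL,
            uploaded_at TEXT NOT NULL
        )
        """
    )


def record_upload(conn: sqlite3.Connection, filename: str) -> int:
    """Insert one metadata row and return its id."""
    uploaded_at = datetime.now(timezone.utc).isoformat()
    # Using the connection as a context manager commits on success
    # and rolls back if the insert raises.
    with conn:
        cur = conn.execute(
            "INSERT INTO documents (filename, uploaded_at) VALUES (?, ?)",
            (filename, uploaded_at),
        )
    return cur.lastrowid


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # use a file path for persistent storage
    init_db(conn)
    print("stored document id:", record_upload(conn, "example.pdf"))
```

Parameterized queries (the `?` placeholders) keep user-supplied filenames from being interpreted as SQL.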

Demo Video

AppDemo.mp4
