Multi-Agent Reasoning System for PDF Analysis using OpenAI Agents SDK with RAG pipeline, autonomous intent detection, and interactive Streamlit UI for intelligent document Q&A.

shwetam19/PDF-Rag-Agent


📚 Multi-Agent PDF Analysis System

Advanced document analysis system using OpenAI Agents SDK with autonomous multi-agent orchestration, RAG pipeline, and interactive UI.

🎯 Overview

This system implements a multi-agent architecture on the OpenAI Agents SDK (v0.6.1) for PDF document analysis. It combines autonomous intent detection, retrieval-augmented generation (RAG), specialized reasoning agents, and an interactive Streamlit interface with citation highlighting.

🏗️ System Architecture

Multi-Agent Framework

User Query → Planner Agent (Intent Detection)
              ↓
        Appropriate Agent Chain
              ↓
    RAG Agent (Retrieval + Generation)
              ↓
    Specialized Reasoning Agent
              ↓
    Response with Cited Evidence

6 Specialized Agents

  • Planner Agent - Autonomous orchestrator using handoffs
  • RAG Agent - Retrieval-augmented generation with FAISS
  • Summarization Agent - Full-document summarization
  • Comparator Agent - Cross-document comparison analysis
  • Timeline Builder Agent - Chronological event organization
  • Aggregator Agent - Multi-source information synthesis

🔧 Technical Stack

| Component | Technology |
|---|---|
| Agent Framework | OpenAI Agents SDK v0.6.1 |
| LLM | OpenAI (provider-agnostic) |
| Vector Database | FAISS (IndexFlatIP) |
| Embeddings | sentence-transformers (384-dim) |
| PDF Processing | pdfplumber + PyMuPDF |
| UI Framework | Streamlit |
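
FAISS's IndexFlatIP performs an exhaustive inner-product search over all stored vectors. The sketch below reproduces that idea in plain Python, to show why inner product works as a similarity score. It assumes the embeddings are L2-normalized before indexing (so inner product equals cosine similarity), which this README does not state. The helper names `normalize`, `inner_product`, and `search` are illustrative and not taken from the project.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def inner_product(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def search(index_vectors, query, top_k=5):
    """Score every stored vector against the query; return (score, position) pairs, best first."""
    q = normalize(query)
    scored = [(inner_product(q, v), i) for i, v in enumerate(index_vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Toy 3-dimensional "embeddings"; the stack above uses 384 dimensions.
index_vectors = [normalize(v) for v in ([1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 0.0, 1.0])]
results = search(index_vectors, [1.0, 0.1, 0.0], top_k=2)
print([position for _, position in results])
```

A flat index trades speed for exactness: every query touches every vector, which is usually acceptable for the handful of PDFs a single session loads.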

📋 Features

Core Capabilities

✅ Autonomous Intent Detection - No manual mode selection
✅ RAG Pipeline - Semantic search with grounded responses
✅ Multi-Document Analysis - Cross-document retrieval
✅ Citation Tracking - Every answer includes ranked evidence
✅ Interactive PDF Viewer - Click-to-navigate with highlighting
✅ Agent Orchestration - Dynamic agent chaining

Advanced Features

✅ Tool Calling - Agents call Python functions (@function_tool)
✅ Autonomous Handoffs - LLM-driven delegation (no manual routing)
✅ Global State Management - Tools access shared Vector Store
✅ Evidence Highlighting - Yellow highlights on cited passages
✅ Execution Tracing - Transparent agent workflow via Runner logs

🚀 Quick Start

1. Prerequisites

  • Python 3.9+
  • OpenAI API key or a Gemini API key

2. Installation

# Clone repository
git clone <repository-url>
cd pdf_agent_system

# Install dependencies
pip install -r requirements.txt

3. Configuration

# Copy environment template
cp .env.example .env

# Edit .env and add your API key
OPENAI_API_KEY=your_key_here

4. Run Application

streamlit run app.py

Access the application at: http://localhost:8501

📁 Project Structure

pdf_agent_system/
├── agents/
│   ├── __init__.py
│   ├── tools.py                    # Standalone tools for SDK agents
│   ├── rag_agent.py                # RAG Agent definition
│   ├── summarization_agent.py      # Summarization Agent definition
│   ├── specialized_agents.py       # Reasoning Agents (Comparator, Timeline, etc.)
│   └── planner_agent.py            # Orchestrator with Handoffs
├── utils/
│   ├── __init__.py
│   ├── state.py                    # Singleton for tool access
│   ├── pdf_processor.py            # PDF extraction + chunking
│   └── vector_store.py             # FAISS vector database
├── config/
│   ├── __init__.py
│   └── settings.py                 # Configuration
├── app.py                          # Streamlit UI
├── requirements.txt                # Dependencies
├── .env.example                    # Configuration template
├── .gitignore                      # Git ignore rules
└── README.md                       # This file

🎓 How It Works

1. OpenAI Agents SDK Integration

We use the native Agent and Runner primitives:

from agents import Agent, Runner

# Agents invoke tools and hand off to others
result = Runner.run_sync(planner_agent, user_query)
print(result.final_output)

2. Tool Functions

Tools are defined using the @function_tool decorator and access shared state:

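from agents import function_tool
from utils.state import global_state  # assumed import path for the shared-state singleton

@function_tool
def retrieve_documents(query: str):
    """Retrieve the chunks most relevant to the query from the shared vector store."""
    return global_state.vector_store.search(query)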
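
To make the shared-state pattern concrete, here is a minimal, dependency-free sketch of a singleton that a tool function reads from. Everything in it is an assumption for illustration. `InMemoryVectorStore` is a hypothetical stand-in for the FAISS-backed store, it ranks by keyword overlap instead of embeddings, and the `GlobalState` layout is a guess at what utils/state.py might look like. The `@function_tool` decorator is left off so the block runs without the SDK installed.

```python
class InMemoryVectorStore:
    """Hypothetical stand-in for the FAISS-backed store in utils/vector_store.py."""

    def __init__(self):
        self.chunks = []

    def add(self, text, source, page):
        self.chunks.append({"text": text, "source": source, "page": page})

    def search(self, query, top_k=5):
        # Naive keyword overlap instead of embedding similarity, to stay dependency-free.
        terms = set(query.lower().split())
        scored = [(len(terms & set(c["text"].lower().split())), c) for c in self.chunks]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [chunk for _, chunk in scored[:top_k]]


class GlobalState:
    """Process-wide holder so tools can reach the store without extra arguments."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.vector_store = None
        return cls._instance


global_state = GlobalState()


def retrieve_documents(query: str) -> str:
    """Return matching chunks as numbered citation lines (plain function, no decorator)."""
    if global_state.vector_store is None:
        return "No documents have been indexed yet."
    hits = global_state.vector_store.search(query)
    return "\n".join(
        f"[{i}] {h['source']} p.{h['page']}: {h['text']}" for i, h in enumerate(hits, 1)
    )


store = InMemoryVectorStore()
store.add("The study reports three main findings", "paper.pdf", 4)
store.add("Funding was provided by a grant", "paper.pdf", 9)
global_state.vector_store = store
output = retrieve_documents("main findings")
print(output)
```

A singleton keeps the tool signature down to the arguments the LLM must supply (`query`), at the cost of hidden global state that has to be populated before the first tool call.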

3. Autonomous Orchestration

The Planner Agent uses instructions and the handoffs list to route dynamically:

planner_agent = Agent(
    name="Planner",
    instructions="Route queries to the correct specialist...",
    handoffs=[rag_agent, summarization_agent, comparator_agent]
)

💡 Usage Examples

Example 1: Question Answering

User: "What are the main findings in the research paper?"

System Flow:

  1. Planner delegates to RAG Agent
  2. RAG Agent calls 'retrieve_documents' tool
  3. Agent generates answer with citations

Output:

Answer: "The research identifies three main findings: [1] X, [2] Y, [3] Z"

Example 2: Comparative Analysis

User: "Compare the methodologies across these papers"

System Flow:

  1. Planner delegates to RAG Agent
  2. RAG Agent retrieves methodology sections
  3. RAG Agent hands off to Comparator Agent
  4. Comparator Agent analyzes differences

Output:

Structured comparison with specific examples

⚙️ Configuration

Environment Variables

# Required
OPENAI_API_KEY=sk-your-key-here

# Optional (defaults shown)
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
TOP_K_RETRIEVAL=5
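
As a rough illustration of how these settings interact, the sketch below reads the three optional variables with their documented defaults and applies them to a simple sliding-window chunker. It assumes `CHUNK_SIZE` and `CHUNK_OVERLAP` are measured in characters, which this README does not specify. `load_settings` and `chunk_text` are hypothetical names, not the project's actual functions.

```python
import os

def load_settings(env=None):
    """Read optional settings, falling back to the defaults listed above."""
    env = os.environ if env is None else env
    return {
        "chunk_size": int(env.get("CHUNK_SIZE", 1000)),
        "chunk_overlap": int(env.get("CHUNK_OVERLAP", 200)),
        "top_k": int(env.get("TOP_K_RETRIEVAL", 5)),
    }

def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

settings = load_settings(env={})  # empty mapping, so only the defaults apply
chunks = chunk_text("x" * 2500, settings["chunk_size"], settings["chunk_overlap"])
print([len(c) for c in chunks])
```

With the defaults, each new chunk starts 800 characters after the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.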

💰 Cost Considerations

OpenAI Pricing

| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| gpt-4o-mini | $0.150 | $0.600 |
| gpt-4o | $2.50 | $10.00 |

Typical Usage

  • Per Query: ~2,000 input + 500 output tokens = ~$0.0006 (at gpt-4o-mini rates)
  • Per Session: ~10 queries = ~$0.006 (at gpt-4o-mini rates)
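
The per-query and per-session figures can be checked with a few lines of arithmetic. The sketch below hard-codes the prices from the table above, which may change over time. `estimate_cost` is an illustrative helper, not part of the project.

```python
# Prices in USD per 1M tokens, copied from the table above.
PRICES_PER_MILLION = {
    "gpt-4o-mini": {"input": 0.150, "output": 0.600},
    "gpt-4o": {"input": 2.50, "output": 10.00},
}

def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated USD cost of one request for the given token counts."""
    price = PRICES_PER_MILLION[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000

per_query = estimate_cost("gpt-4o-mini", 2000, 500)
per_session = 10 * per_query
print(f"per query: ${per_query:.4f}, per session: ${per_session:.3f}")
```

The same token counts on gpt-4o come to about $0.01 per query, roughly 17 times the gpt-4o-mini figure.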

🎉 Acknowledgments

  • OpenAI - Agents SDK framework
  • Facebook Research - FAISS vector search
  • Sentence Transformers - Embedding models
  • Streamlit - Interactive UI framework

✨ Built with OpenAI Agents SDK v0.6.1 | Multi-Agent Architecture ✨
