PostgreSQL + pgvector Setup Guide

Current Status ✅

What's Working:

✅ PostgreSQL 15.15 installed and running
✅ Database agentic_rag exists and is accessible
✅ PostgreSQL stores documents, collections, conversations
✅ SQLite stores embeddings persistently (backend/embeddings.db)
✅ Full application functionality working

Optional Enhancement:

⚠️ pgvector extension NOT installed (see below for installation)

Quick Start: Install pgvector (Optional)

To enable PostgreSQL-based vector storage, run ONE command:

./build_pgvector_for_pg15.sh

This automated script will:

Clone pgvector v0.8.1 from GitHub
Compile it for your PostgreSQL 15 installation
Install the extension (requires sudo password once)
Enable it in the agentic_rag database
Test functionality and clean up

Time required: 2-3 minutes

After installation, restart the backend to use PostgreSQL storage instead of SQLite.

Why PostgreSQL with pgvector?

Currently, the system uses a SQLite fallback for embeddings:

✅ Embeddings persist across restarts in backend/embeddings.db
✅ All features work correctly
⚠️ Not as scalable as PostgreSQL pgvector for very large datasets

With PostgreSQL + pgvector (optional upgrade):

✅ Unified storage (everything in PostgreSQL)
✅ Better performance for large-scale vector operations
✅ Native database support for similarity search
✅ Production-ready for enterprise deployments

Installation

macOS (using Homebrew)

# Install PostgreSQL
brew install postgresql@15

# Start PostgreSQL service
brew services start postgresql@15

# Install pgvector extension
brew install pgvector

Ubuntu/Debian

# Install PostgreSQL
sudo apt update
sudo apt install postgresql postgresql-contrib

# Install pgvector
sudo apt install postgresql-15-pgvector

Docker (easiest option)

# Run PostgreSQL with pgvector in Docker
docker run -d \
  --name postgres-pgvector \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=agentic_rag \
  -p 5432:5432 \
  pgvector/pgvector:pg15

Database Setup

Option 1: Automated Setup Script

Run the provided setup script:

./setup_database.sh

This script will:

Check if PostgreSQL is running
Create the agentic_rag database
Enable the pgvector extension
Display the connection string

Option 2: Manual Setup

# Connect to PostgreSQL
psql -U postgres

# Create database
CREATE DATABASE agentic_rag;

# Connect to the new database
\c agentic_rag

# Enable pgvector extension
CREATE EXTENSION vector;

# Verify extension is enabled
\dx

Configuration

The backend is already configured to use PostgreSQL. The connection settings are in backend/.env:

DATABASE_URL=postgresql+asyncpg://postgres:postgres@localhost:5432/agentic_rag
DATABASE_SYNC_URL=postgresql://postgres:postgres@localhost:5432/agentic_rag

Update these values if you use different credentials:

Username: Change postgres to your username
Password: Change postgres to your password
Host: Change localhost if database is on a different server
Port: Change 5432 if using a different port
Database: Change agentic_rag to your database name

Testing the Setup

Start the backend server:

cd backend
source venv/bin/activate
uvicorn main:app --reload

Check the logs for:

INFO:__main__:PostgreSQL connection successful
INFO:__main__:pgvector extension is available
INFO:__main__:Database initialized successfully

Upload a text document (PDF, TXT, Markdown)
Restart the backend server
Query the document - it should still be searchable (embeddings persisted!)

Troubleshooting

PostgreSQL not running

Error: PostgreSQL not available - using in-memory storage fallback

Solution: Start PostgreSQL:

# macOS (Homebrew)
brew services start postgresql@15

# Linux (systemd)
sudo systemctl start postgresql

# Docker
docker start postgres-pgvector

pgvector extension not found

Error: pgvector extension not found

Solution: Install pgvector extension:

# macOS
brew install pgvector

# Ubuntu/Debian
sudo apt install postgresql-15-pgvector

# Or use Docker image with pgvector pre-installed

Connection refused

Error: Database connection test failed

Solution: Check your connection settings in backend/.env:

Verify the host, port, username, and password
Ensure PostgreSQL is running: pg_isready
Test connection manually: psql -h localhost -U postgres -d agentic_rag

Permission denied

Error: permission denied for database

Solution: Grant permissions to your user:

-- Connect as superuser
psql -U postgres

-- Grant all privileges
GRANT ALL PRIVILEGES ON DATABASE agentic_rag TO your_username;

Fallback Mode

If PostgreSQL is unavailable, the system automatically falls back to in-memory storage:

The application continues to work normally
Embeddings are stored in memory (lost on restart)
A warning is logged: Using in-memory fallback storage

Once PostgreSQL becomes available, restart the backend to use persistent storage.

Database Schema

The system creates the following table:

CREATE TABLE document_embeddings (
    id SERIAL PRIMARY KEY,
    document_id VARCHAR(255) NOT NULL,
    chunk_id VARCHAR(255) NOT NULL UNIQUE,
    text TEXT NOT NULL,
    embedding VECTOR,  -- pgvector type, supports any dimension
    metadata JSONB,
    INDEX ix_document_embeddings_document_id (document_id),
    INDEX ix_document_embeddings_embedding_vector (embedding) USING ivfflat WITH (lists = 100)
);

The embedding column uses pgvector's VECTOR type for efficient similarity search.

Performance Tips

Index Tuning: The default IVFFlat index uses 100 lists. For larger datasets (>100k vectors), increase this:

CREATE INDEX ON document_embeddings USING ivfflat (embedding vector_cosine_ops) WITH (lists = 1000);

Connection Pooling: The system uses SQLAlchemy's connection pooling (pool_size=5, max_overflow=10). Adjust in backend/core/database.py if needed.
Monitoring: Check active connections:

SELECT count(*) FROM pg_stat_activity WHERE datname = 'agentic_rag';

Migration from In-Memory Storage

If you have documents uploaded before this update:

Re-upload your documents after setting up PostgreSQL
The new embeddings will be stored persistently
Old in-memory embeddings will be lost (they were temporary anyway)

Next Steps

Configure your OpenAI API key in the UI (Settings)
Upload documents and test vector search
Embeddings now persist across restarts!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PostgreSQL + pgvector Setup Guide

Current Status ✅

Quick Start: Install pgvector (Optional)

Why PostgreSQL with pgvector?

Installation

macOS (using Homebrew)

Ubuntu/Debian

Docker (easiest option)

Database Setup

Option 1: Automated Setup Script

Option 2: Manual Setup

Configuration

Testing the Setup

Troubleshooting

PostgreSQL not running

pgvector extension not found

Connection refused

Permission denied

Fallback Mode

Database Schema

Performance Tips

Migration from In-Memory Storage

Next Steps

FilesExpand file tree

POSTGRESQL_SETUP.md

Latest commit

History

POSTGRESQL_SETUP.md

File metadata and controls

PostgreSQL + pgvector Setup Guide

Current Status ✅

Quick Start: Install pgvector (Optional)

Why PostgreSQL with pgvector?

Installation

macOS (using Homebrew)

Ubuntu/Debian

Docker (easiest option)

Database Setup

Option 1: Automated Setup Script

Option 2: Manual Setup

Configuration

Testing the Setup

Troubleshooting

PostgreSQL not running

pgvector extension not found

Connection refused

Permission denied

Fallback Mode

Database Schema

Performance Tips

Migration from In-Memory Storage

Next Steps