📘 RAG System Design Guide

The single-page reference for designing, evaluating, and operating production RAG systems.


Table of Contents

  1. RAG Ecosystem
  2. The Problem
  3. What It Does
  4. Demo
  5. Built On
  6. Quickstart
  7. Architecture
  8. Run Locally
  9. Deploy Your Own
  10. Why This Is Different

RAG Ecosystem

This repo is part of a broader RAG toolkit:

| Repo | What it covers |
| --- | --- |
| rag-auditor | Evaluate your RAG pipeline |
| multi-llm-rag-agent-chat | Build a production RAG chatbot with multi-LLM routing |
| rag-system-design-guide ← you are here | Design reference: architecture patterns and trade-offs |

Start with the design guide, build with the chatbot, evaluate with the auditor.


The Problem

Most RAG explainers stop at isolated concepts.

You can find plenty of material on chunking, embeddings, or vector search — but almost nothing that connects those decisions to evaluation, observability, security, and production operations. When you're designing a real system, that missing connective tissue is exactly what matters.

This guide puts the full picture in one place.


What It Does

Input:  A team or individual planning, reviewing, or debugging a RAG architecture
Output: A decision-oriented guide covering what to build, what to avoid, and how to run it in production

| Part | Focus | Highlights |
| --- | --- | --- |
| Part I | Foundations | Foundation models, LLM pitfalls, how RAG works, RAG vs. prompt engineering vs. fine-tuning |
| Part II | System Design | Problem framing, failure scenarios, ingestion, chunking, embeddings, search, retrieval, reranking, prompting, generation, hallucination reduction |
| Part III | Operations & Architecture | Evaluation metrics, observability, scaling, Kubernetes, security, enterprise RAG architecture |
| Part IV | Advanced Topics | RAG vs. MCP vs. AI agents, HyDE, CRAG, Self-RAG, Adaptive RAG, GraphRAG, multi-modal RAG, guardrails, agentic RAG |
| Appendices | Practical Reference | The 2026 RAG Developer Stack and recommended tools |

Demo

Open the site and you land on a single-page reference with:

Part I   → Foundations
Part II  → System Design
Part III → Operations & Architecture
Part IV  → Advanced Topics

Plus:
- Design pitfalls & best practices
- The 2026 RAG Developer Stack
- Recommended tools & technologies

Built On

| Technology | Role |
| --- | --- |
| Markdown | Keeps the guide easy to edit and version |
| MkDocs Material | Gives the site a clean docs layout, navigation, and search |
| GitHub Pages | Hosts the published documentation site |
| GitHub Actions | Deploys the site automatically on every push to main |

Site config lives in mkdocs.yml. The deploy workflow is .github/workflows/deploy-pages.yml. The docs/ directory contains symlinks back to the root Markdown files so content stays in one place.
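
If the symlinks in docs/ ever go missing (for example after copying the folder with a tool that does not preserve them), they can be recreated by hand. The sketch below is a rough illustration, not part of the repo: the `link_docs` helper is hypothetical, and the mapping of README.md to docs/index.md and of the Q&A guide to docs/guide.md is an assumption read off the Architecture diagram. It runs against a throwaway directory so a real checkout is never touched.

```shell
#!/usr/bin/env bash
set -euo pipefail

# link_docs ROOT: (re)create docs/index.md and docs/guide.md as relative
# symlinks to the Markdown files in ROOT.
# Hypothetical helper; the file mapping is assumed from the Architecture diagram.
link_docs() {
  local root="$1"
  mkdir -p "$root/docs"
  # Targets are relative to docs/, so the links survive moving the checkout.
  ln -sfn "../README.md" "$root/docs/index.md"
  ln -sfn "../RAG System Design - Complete Q&A Guide.md" "$root/docs/guide.md"
}

# Demo in a temporary directory with placeholder content.
demo="$(mktemp -d)"
printf '# Home\n' > "$demo/README.md"
printf '# Guide\n' > "$demo/RAG System Design - Complete Q&A Guide.md"
link_docs "$demo"
ls -l "$demo/docs"
```

In a real checkout, `link_docs .` from the repository root would do the same thing. Relative targets are used so the links keep working wherever the repository is cloned.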


Quickstart

git clone https://github.com/amitgambhir/rag-system-design-guide.git
cd rag-system-design-guide
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
mkdocs serve

Then open http://127.0.0.1:8000/.


Architecture

README.md + RAG System Design - Complete Q&A Guide.md   ← source of truth
                        │
                        ▼
              docs/index.md + docs/guide.md              ← symlinks, not copies
                        │
                        ▼
                    mkdocs.yml                           ← site config
                        │
                        ▼
            GitHub Actions on push to main               ← CI/CD
                        │
                        ▼
                gh-pages branch deploy
                        │
                        ▼
              Published GitHub Pages site

Run Locally

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
mkdocs serve

Deploy Your Own

  1. Create a new GitHub repository.
  2. Push this folder's contents to the main branch.
  3. In GitHub, open Settings → Pages.
  4. Set the source to Deploy from a branch, then choose the gh-pages branch and the / (root) folder.
  5. Push to main — the workflow in .github/workflows/deploy-pages.yml will publish the site.
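
The gh-pages branch can only be selected in step 4 once it exists on the remote. A quick way to check is `git ls-remote`, which exits with a non-zero status under `--exit-code` when no matching ref is found. The sketch below wraps that in a hypothetical `has_branch` helper and, so that it runs anywhere, demonstrates it against a throwaway local bare repository rather than GitHub.

```shell
#!/usr/bin/env bash
set -euo pipefail

# has_branch REMOTE BRANCH: succeed if BRANCH exists on REMOTE.
# Hypothetical helper built on git ls-remote --exit-code.
has_branch() {
  git ls-remote --exit-code --heads "$1" "refs/heads/$2" >/dev/null 2>&1
}

# Demo: a local bare repository stands in for the GitHub remote.
work="$(mktemp -d)"
git init --quiet --bare "$work/remote.git"
git init --quiet "$work/site"
git -C "$work/site" -c user.name=demo -c user.email=demo@example.com \
  commit --quiet --allow-empty -m "publish"
# Push the single commit as gh-pages, mimicking what a deploy would create.
git -C "$work/site" push --quiet "$work/remote.git" HEAD:refs/heads/gh-pages

if has_branch "$work/remote.git" gh-pages; then
  echo "gh-pages exists and can be selected under Settings → Pages"
fi
```

Against a real repository, `has_branch origin gh-pages` performs the same check from inside the clone.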

Why This Is Different

  • Focuses on system design trade-offs, not just definitions.
  • Connects retrieval, generation, and operations in one continuous reference.
  • Covers both foundations and production realities: evals, observability, security, and scaling.

License

Released under the MIT License.

I wrote this as the reference I wish I had when turning RAG ideas into production architecture.
