Nik — Technical Support Agent Backend

Backend for an AI-powered technical support agent that answers field technician questions from 89+ product manuals.

Overview

Solar energy installers working on rooftops with inverters, battery storage, and EV chargers need fast, reliable answers from product documentation. The current support process fails them: call the hotline, wait on hold, get told to email photos, call back the next day, reach a different agent, and start the explanation over. Meanwhile, the installer is still on the roof.

This backend replaces that loop with a single conversation. A technician types a question — or sends a photo of an error code — and Nik (named after Nikola Tesla), a ReAct agent, searches the complete documentation library, reasons about what it found, and delivers a precise answer with document name and page number citations. The agent is not a keyword search engine. It decides autonomously which tools to call, how many times to search, when to retrieve a full document for deeper context, and when to ask the technician a clarifying question before answering. All responses are generated in German, the working language of the installers.

The system currently indexes 89 product documents, split into 700+ searchable chunks, and monitors the health of its four external services (PostgreSQL, MinIO, ChromaDB, Cohere).

Tech Stack

Layer	Technology
AI orchestration	LangGraph (ReAct agent, checkpointer, interrupt/resume)
Generation + vision	OpenAI GPT-4o
Embeddings	OpenAI `text-embedding-3-large` (3072 dimensions)
Reranking	Cohere `rerank-v4.0-pro` cross-encoder
API framework	FastAPI (fully async)
Streaming	SSE via sse-starlette
Relational DB	PostgreSQL 16 (asyncpg + SQLAlchemy async)
Object storage	MinIO (S3-compatible)
Vector DB	ChromaDB (cosine similarity)
Migrations	Alembic (async PostgreSQL)
Configuration	pydantic-settings, 12-factor env vars
Authentication	fastapi-users v15 (JWT bearer, email verification)
Admin panel	SQLAdmin (web-based database management)
Transactional email	AWS SES via boto3 (console fallback for dev)

Key Capabilities

Capability	Description
Agent Intelligence
ReAct agent loop	LangGraph-based autonomous agent that reasons, selects tools, observes results, and iterates — not a fixed pipeline
Interrupt/resume clarification	Agent pauses mid-execution to ask the user a question, then resumes exactly where it left off on reply
Multimodal image analysis	Users upload photos of error codes, displays, or wiring; GPT-4o analyzes them inline with the conversation
Search and Retrieval
Semantic search + reranking	ChromaDB vector search retrieves candidates; Cohere cross-encoder reranks to the most relevant chunks
Full document retrieval	Agent can pull a complete document with all sections, tables, image descriptions, and a PDF download link
Source citations	Every answer cites document name and page number; when the documentation does not contain the answer, the agent says so instead of fabricating one
Real-Time Interaction
SSE streaming	Token-by-token response streaming with live tool-call notifications via Server-Sent Events
Stateless one-shot mode	Separate endpoint for single questions without session overhead
Authentication and Access Control
JWT authentication	fastapi-users with Bearer token; email verification required for login
Code-based verification	8-digit verification codes sent by email (AWS SES) as an alternative to token links
Role-based access control	Three tiers — active user, verified user, superuser — each protecting different endpoint groups
Admin panel	Web-based SQLAdmin UI and REST API for user and data management (superuser only)
Data and Persistence
Multi-turn chat sessions	Dual storage: LangGraph checkpointer for agent memory, PostgreSQL for app records and message history
Image attachments	Per-session image uploads stored in MinIO with ownership-verified presigned URL access
Operations
Per-user rate limiting	Sliding-window in-memory limiter on chat and upload endpoints (configurable per route)
Health monitoring	Dependency checks for PostgreSQL, MinIO, ChromaDB, and Cohere with a Kubernetes readiness probe

Architecture

Agent Decision Flow

flowchart TD
    A[User sends question] --> B[ReAct Agent]
    B --> C{Reason}
    C -->|search docs| D[search_knowledge]
    C -->|need full context| E[get_full_document]
    C -->|ambiguous query| F[ask_user_clarification]
    C -->|ready to answer| G[Generate answer with citations]
    D --> H{Sufficient?}
    E --> H
    H -->|no| C
    H -->|yes| G
    F -->|user replies| B
    G --> I[Persist + respond]
    I --> J[JSON or SSE stream]

    style A fill:#dbeafe,stroke:#93c5fd,color:#1e3a5f
    style B fill:#f8fafc,stroke:#cbd5e1,color:#334155
    style C fill:#f8fafc,stroke:#cbd5e1,color:#334155
    style D fill:#fef3c7,stroke:#fcd34d,color:#713f12
    style E fill:#fef3c7,stroke:#fcd34d,color:#713f12
    style F fill:#ffedd5,stroke:#fdba74,color:#7c2d12
    style G fill:#d1fae5,stroke:#6ee7b7,color:#064e3b
    style H fill:#f8fafc,stroke:#cbd5e1,color:#334155
    style I fill:#f8fafc,stroke:#cbd5e1,color:#334155
    style J fill:#d1fae5,stroke:#6ee7b7,color:#064e3b

Data Architecture

flowchart LR
    API[FastAPI Backend] --> M[MinIO]
    API --> C[ChromaDB]
    API --> P[PostgreSQL 16]

    M --- M1[PDF documents]
    M --- M2[Parsed metadata JSON]
    M --- M3[User photo uploads]

    C --- C1[Vector embeddings]
    C --- C2[700+ searchable chunks]

    P --- P1[Users + auth + sessions + messages]
    P --- P2[Image references + feedback]
    P --- P3[Agent checkpoints]
    P --- P4[Verification codes]

    style API fill:#f8fafc,stroke:#cbd5e1,color:#334155
    style M fill:#fef3c7,stroke:#fcd34d,color:#713f12
    style C fill:#fef3c7,stroke:#fcd34d,color:#713f12
    style P fill:#dbeafe,stroke:#93c5fd,color:#1e3a5f

Each layer serves a distinct purpose. PostgreSQL handles relational queries and ACID transactions, including user authentication, verification codes, and LangGraph agent state (the last of these via a dedicated psycopg3 connection pool). MinIO stores large binary files with S3-compatible presigned URL access. ChromaDB provides vector similarity search over document embeddings.

Streaming Protocol

The POST /chat/stream endpoint delivers the agent's response progressively via Server-Sent Events.

Event	Payload	When
`metadata`	`{"session_id": "..."}`	Once, after session setup
`tool_start`	`{"tool": "search_knowledge"}`	Agent calls a tool
`token`	`{"content": "Die Installation..."}`	Each LLM response token
`sources`	`[{"filename": "...", "page_number": 8}]`	After answer completes
`done`	`{"success": true, "needs_clarification": false}`	Terminal event
`error`	`{"message": "..."}`	On failure
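
A client can consume this stream by splitting it on blank lines and decoding each event's `data` field as JSON. The sketch below is a minimal parser for the event names in the table above. `parse_sse` and `collect_answer` are hypothetical helpers, not part of this repository, and the parser handles only the `event:` and `data:` fields of the SSE format.

```python
import json
from typing import Any, Iterator


def parse_sse(raw: str) -> Iterator[tuple[str, Any]]:
    """Yield (event_name, decoded_payload) pairs from raw SSE text."""
    # Normalize line endings, then split into blank-line-separated events.
    for block in raw.replace("\r\n", "\n").split("\n\n"):
        event = "message"  # SSE default when no event field is present
        data_lines: list[str] = []
        for line in block.split("\n"):
            if not line or line.startswith(":"):
                continue  # empty line or comment / keep-alive
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # the format strips one leading space
            if field == "event":
                event = value
            elif field == "data":
                data_lines.append(value)
        if data_lines:
            yield event, json.loads("\n".join(data_lines))


def collect_answer(raw: str) -> str:
    """Concatenate all `token` payloads into the full answer text."""
    return "".join(
        payload["content"] for event, payload in parse_sse(raw) if event == "token"
    )
```

In a real client the same logic would run incrementally over the response body, stopping at the terminal `done` or `error` event.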

API Reference

Method	Endpoint	Auth	Description
	Auth
`POST`	`/auth/register`	--	Register a new user account
`POST`	`/auth/login`	--	Login and receive a JWT access token
`POST`	`/auth/logout`	Bearer	Invalidate the current token
`POST`	`/auth/request-verify-token`	--	Request an email verification token
`POST`	`/auth/verify`	--	Verify email with a JWT token
`POST`	`/auth/verify-code`	--	Verify email with an 8-digit code
`POST`	`/auth/forgot-password`	--	Request a password reset email
`POST`	`/auth/reset-password`	--	Reset password with a JWT token
`POST`	`/auth/reset-password-code`	--	Reset password with an 8-digit code
`GET`	`/auth/me`	Bearer	Get current user profile
`PATCH`	`/auth/me`	Bearer	Update current user profile
	Chat
`POST`	`/chat`	Verified	Send a message and get an AI response
`POST`	`/chat/stream`	Verified	Stream the AI response via SSE
`POST`	`/chat/upload`	Verified	Upload images for a chat session
`GET`	`/chat/{session_id}/messages`	Verified	Get conversation history with images
`PATCH`	`/chat/{session_id}`	Verified	Update session title or pinned status
`DELETE`	`/chat/{session_id}`	Verified	Soft-delete a session
`GET`	`/chat/images/{object_key}`	Verified	Fetch image via presigned URL redirect
	Users
`GET`	`/users/me/sessions`	Verified	List chat sessions for the authenticated user
	Agent
`POST`	`/agent/ask`	Active	One-shot AI answer — no session, no persistence
	Search and Documents
`GET`	`/search`	Active	Semantic search across document chunks
`GET`	`/documents`	Active	List all documents with metadata
`GET`	`/documents/{id}`	Active	Document details with presigned download URLs
	Admin (User Management)
`GET`	`/admin/users`	Superuser	List all users with session counts
`GET`	`/admin/users/{user_id}`	Superuser	Get user details
`PATCH`	`/admin/users/{user_id}`	Superuser	Update user flags or display name
`DELETE`	`/admin/users/{user_id}`	Superuser	Delete a user and all associated data
	Admin (Embedding)
`POST`	`/admin/embed`	Superuser	Trigger embedding pipeline for all documents
`POST`	`/admin/embed/{id}`	Superuser	Embed a single document
`GET`	`/admin/embed/status`	Superuser	Embedding statistics
	Admin (Web Panel)
--	`/admin`	Session	SQLAdmin web UI (superuser login)
	Health
`GET`	`/health`	--	System health (PostgreSQL, MinIO, ChromaDB, Cohere)
`GET`	`/health/ready`	--	Kubernetes readiness probe

Auth levels: -- = public, Active = any active user, Verified = active + email verified, Superuser = active + verified + superuser, Session = superuser session cookie (web UI), Bearer = any authenticated user.
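
The legend above can be read as a cumulative set of flag checks. The following sketch expresses that reading in plain Python. `UserFlags`, `REQUIRED_FLAGS`, and `is_allowed` are hypothetical names chosen for illustration, and the cookie-based `Session` level is left out because it is handled by the admin web panel rather than by bearer tokens.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UserFlags:
    """Hypothetical stand-in for the flags stored on a user record."""
    is_active: bool = False
    is_verified: bool = False
    is_superuser: bool = False


# Flags each level requires, following the legend above.
REQUIRED_FLAGS: dict[str, tuple[str, ...]] = {
    "Active": ("is_active",),
    "Verified": ("is_active", "is_verified"),
    "Superuser": ("is_active", "is_verified", "is_superuser"),
}


def is_allowed(level: str, user: Optional[UserFlags]) -> bool:
    """Return True if `user` satisfies the given auth level."""
    if level == "--":
        return True  # public endpoint, no token needed
    if user is None:
        return False  # every other level needs an authenticated user
    if level == "Bearer":
        return True  # any authenticated user
    return all(getattr(user, flag) for flag in REQUIRED_FLAGS[level])
```

Under this reading, a registered but unverified technician can call `/agent/ask` and `/search` but not `/chat`.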

Example: Full Chat Flow

# 1. Register a new account
curl -X POST "http://localhost:8000/auth/register" \
  -H "Content-Type: application/json" \
  -d '{"email": "techniker@example.com", "display_name": "Max Mustermann", "password": "SecurePass123!"}'

# 2. Verify email (use the 8-digit code from the verification email)
curl -X POST "http://localhost:8000/auth/verify-code" \
  -H "Content-Type: application/json" \
  -d '{"email": "techniker@example.com", "code": "12345678"}'

# 3. Login to get a JWT token
curl -X POST "http://localhost:8000/auth/login" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "username=techniker@example.com&password=SecurePass123!"
# Response: {"access_token": "eyJ...", "token_type": "bearer"}

# 4. Send a message (starts a new session automatically)
curl -X POST "http://localhost:8000/chat" \
  -H "Authorization: Bearer eyJ..." \
  -H "Content-Type: application/json" \
  -d '{"question": "Wie installiere ich den X3-Hybrid G4?"}'

# 5. Get conversation history
curl "http://localhost:8000/chat/SESSION_UUID/messages" \
  -H "Authorization: Bearer eyJ..."
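
The login and chat steps above can also be scripted. This is a minimal sketch using only the Python standard library; `login_request`, `chat_request`, and `send` are hypothetical helper names, and the request shapes are copied from the curl commands rather than from a client library.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"


def login_request(email: str, password: str) -> Request:
    """Build the form-encoded login request (step 3)."""
    body = urlencode({"username": email, "password": password}).encode()
    return Request(
        f"{BASE_URL}/auth/login",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def chat_request(token: str, question: str) -> Request:
    """Build the JSON chat request (step 4)."""
    body = json.dumps({"question": question}).encode("utf-8")
    return Request(
        f"{BASE_URL}/chat",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: Request) -> dict:
    """Send a request and decode the JSON response body."""
    with urlopen(request) as response:
        return json.load(response)
```

With the API running locally, `send(login_request(...))["access_token"]` would supply the token to pass to `chat_request`. Note that the login body is form-encoded while the chat body is JSON, matching the curl example.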

Project Structure

📁 app/
├── config/settings.py          # pydantic-settings configuration
├── main.py                     # Application entry point and lifespan management
│
├── auth/                       # JWT authentication (fastapi-users)
│   ├── router.py               # Route aggregation (login, register, verify, reset, me)
│   ├── code_router.py          # Code-based verify-code and reset-password-code endpoints
│   ├── backend.py              # JWT strategy and Bearer transport configuration
│   ├── manager.py              # UserManager with email hooks (verification, password reset)
│   ├── dependencies.py         # Auth deps: current_active_user, verified_user, superuser
│   ├── email.py                # AWS SES email sending with console fallback
│   ├── verification_codes.py   # 8-digit code generation and redemption logic
│   └── schemas.py              # UserRead, UserCreate, UserUpdate
│
├── admin/                      # Admin panel and user management
│   ├── router.py               # REST API: list/get/update/delete users (superuser only)
│   ├── service.py              # AdminService with user CRUD and cascade deletion
│   ├── setup.py                # SQLAdmin mounting and view registration
│   ├── sqladmin_auth.py        # Session-based superuser authentication for web panel
│   ├── views.py                # ModelView definitions for all models
│   └── schemas.py              # AdminUserSummary, AdminUserDetail, AdminUserUpdateRequest
│
├── agent/                      # ReAct agent (LangGraph)
│   ├── graph.py                # AgentState, dual graph compilation (chat + stateless)
│   ├── tools.py                # search_knowledge, get_full_document, ask_user_clarification
│   ├── prompts.py              # German system prompt and fallback messages
│   ├── checkpointer.py         # AsyncPostgresSaver lifecycle (psycopg3 pool)
│   ├── client.py               # Cohere reranking client
│   ├── utils.py                # Source extraction from tool messages
│   ├── schemas.py
│   └── router.py
│
├── chat/                       # Persistent conversations
│   ├── service.py              # Orchestration: prepare, invoke/stream, persist
│   ├── streaming.py            # SSE event protocol over graph.astream()
│   ├── schemas.py
│   └── router.py
│
├── middleware/
│   └── rate_limit.py           # Per-user sliding-window limiter
│
├── users/
│   ├── service.py              # Session listing with search and pinning
│   ├── schemas.py
│   └── router.py               # GET /users/me/sessions (authenticated)
│
├── postgres/
│   ├── client.py               # Async engine and session factory (asyncpg)
│   ├── base.py                 # SQLAlchemy Base, UUID and Timestamp mixins
│   └── models.py               # User, Session, Message, MessageImage, Feedback, VerificationCode
│
├── minio/client.py             # S3-compatible file operations and presigned URLs
├── chroma/client.py            # ChromaDB collection management
│
├── vectorstore/
│   ├── chunking.py             # Title-based document sectioning
│   ├── service.py              # Embedding and semantic search
│   └── router.py
│
├── documents/                  # Document listing and retrieval
└── health/                     # Health check endpoints

📁 alembic/
├── env.py                      # Async migration runner
└── versions/                   # 0001-0004: schema, pinned_at, auth fields, verification codes

📁 scripts/
├── seed_minio.py               # Load product documents into MinIO
└── seed_chromadb.py            # Generate and store embeddings

How the Agent Works

1. Search

The agent calls search_knowledge, which embeds the user's question with text-embedding-3-large (3072 dimensions) and queries ChromaDB with the resulting vector. ChromaDB returns candidates ranked by cosine similarity, then Cohere's rerank-v4.0-pro cross-encoder rescores them to surface the most relevant chunks.
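
The two-stage pattern (a cheap vector search followed by a more precise rescoring step) can be illustrated without any external service. The sketch below is an assumption-laden toy: `retrieve` scans an in-memory list instead of ChromaDB, and `token_overlap` is a stand-in scorer, not the Cohere cross-encoder. All function names are hypothetical.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: Vector, index: list[tuple[str, Vector]], k: int) -> list[str]:
    """Stage 1: return the k chunk texts closest to the query embedding."""
    ranked = sorted(
        index, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True
    )
    return [text for text, _ in ranked[:k]]


def token_overlap(query: str, text: str) -> float:
    """Toy relevance scorer standing in for a cross-encoder."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def rerank(
    query: str,
    candidates: list[str],
    top_n: int,
    score: Callable[[str, str], float] = token_overlap,
) -> list[str]:
    """Stage 2: rescore each candidate against the query text and keep top_n."""
    return sorted(candidates, key=lambda text: score(query, text), reverse=True)[:top_n]
```

The design point is that stage 1 compares precomputed vectors and is fast over the whole index, while stage 2 looks at the query and each candidate together and is only affordable on a short candidate list.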

2. Reason

The ReAct loop evaluates the search results and decides autonomously what to do next. If the chunks lack sufficient detail, the agent calls get_full_document to retrieve the complete document structure — all sections, tables, image descriptions, and a presigned PDF download link. If the query is ambiguous (e.g., multiple product models match), the agent calls ask_user_clarification, which triggers a LangGraph interrupt() to pause execution and wait for the user's reply before resuming.
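
The pause-and-resume behavior can be pictured with a plain Python generator. This is only an analogy under stated assumptions: it does not use LangGraph or its `interrupt()` call, state lives in memory rather than in a checkpointer, and `agent_run` and `drive` are hypothetical names.

```python
from typing import Generator


def agent_run(question: str, known_models: list[str]) -> Generator[str, str, str]:
    """Toy agent that pauses with a question when the product model is ambiguous."""
    matches = [m for m in known_models if m.lower() in question.lower()]
    if len(matches) != 1:
        # Pause here; the caller resumes by sending the user's reply.
        reply = yield f"Which model do you mean? Options: {', '.join(known_models)}"
        matches = [m for m in known_models if m.lower() == reply.strip().lower()]
    model = matches[0] if matches else "unknown"
    return f"Answer for {model}"


def drive(question: str, known_models: list[str], replies: list[str]) -> str:
    """Run the agent to completion, feeding queued replies at each pause."""
    run = agent_run(question, known_models)
    pending = iter(replies)
    try:
        next(run)  # start; runs until the first pause or until it finishes
        while True:
            run.send(next(pending, ""))  # resume exactly where it paused
    except StopIteration as done:
        return done.value
```

As in the real flow, the local variables of the paused run survive until the reply arrives, so the agent continues from the point of the question instead of starting over.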

3. Answer

Once the agent has sufficient context, GPT-4o generates a precise answer in German, citing every source by document name and page number. If the documentation does not contain the answer, the agent states this explicitly rather than fabricating information.
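
A client that receives source entries shaped like the `sources` payload of the streaming protocol (`filename` and `page_number`) might render them as a compact citation list. The sketch below is one possible way to do so; `format_citations` is a hypothetical helper and the `p.` output format is an illustrative choice, not the format the agent itself emits.

```python
def format_citations(sources: list[dict]) -> list[str]:
    """Deduplicate sources and group sorted page numbers per document."""
    pages_by_file: dict[str, set[int]] = {}
    for source in sources:
        pages_by_file.setdefault(source["filename"], set()).add(source["page_number"])
    # Dicts keep insertion order, so documents appear in first-cited order.
    return [
        f"{filename}, p. {', '.join(str(page) for page in sorted(pages))}"
        for filename, pages in pages_by_file.items()
    ]
```

An empty result is a useful signal on the client side: it corresponds to the case where the agent reports that the documentation does not contain the answer.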

Getting Started

Prerequisites

Python 3.10+
Docker and Docker Compose
OpenAI API key (embeddings + generation)
Cohere API key (reranking — free tier available)

Setup

# 1. Clone and configure
cp .env.example .env
# Edit .env — add OPENAI_API_KEY, COHERE_API_KEY, and set JWT_SECRET
# (Optional: configure AWS SES credentials for email verification)

# 2. Create virtual environment
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# 3. Start infrastructure (MinIO + PostgreSQL)
docker compose up -d
# MinIO Console: http://localhost:9001 (minioadmin/minioadmin)
# PostgreSQL: localhost:5432 (postgres/postgres)

# 4. Run database migrations
alembic upgrade head

# 5. Seed product documents
python scripts/seed_minio.py

# 6. Start the API
uvicorn app.main:app --reload
# API: http://localhost:8000
# Docs: http://localhost:8000/docs
# Admin Panel: http://localhost:8000/admin (requires superuser account)

# 7. Generate embeddings
python scripts/seed_chromadb.py

# 8. Verify health
curl http://localhost:8000/health

Development Approach

Built in focused, mergeable increments — each PR self-contained and deployable:

PR	Title	What was built
#1	MinIO storage	S3-compatible blob storage with FastAPI scaffold
#2	Vector embeddings	ChromaDB integration with semantic search endpoint
#3	PostgreSQL + Alembic	Relational models and async migrations
#4	RAG pipeline	LangGraph pipeline with Cohere reranking
#5	Chat persistence	Sessions and messages connected to the agent pipeline
#6	ReAct agent	Autonomous agent replacing the fixed 4-stage pipeline
#7-8	Interrupt/resume	`ask_user_clarification` tool with LangGraph interrupt flow
#9	Image uploads	Multimodal support with base64 content blocks
#10	Production hardening	SSE streaming, per-user rate limiting, secure image proxy
#18	Session management	Pinned sessions, session update/delete, title search
#19	Authentication	JWT auth with fastapi-users, email verification, code-based flows, AWS SES
--	Admin panel	SQLAdmin web UI, REST user management API, superuser session auth

The codebase uses async Python throughout with type hints on all functions. Request/response validation uses Pydantic v2, modules follow a domain-driven structure (app/{domain}/router.py, service.py, schemas.py), and services use the singleton pattern. All external dependencies have health checks, AI prompts are externalized for maintainability, and configuration follows the 12-factor app methodology via environment variables.

Configuration

Key environment variables (see .env.example for the full list):

Variable	Description
`OPENAI_API_KEY`	Required — embeddings and answer generation
`OPENAI_CHAT_MODEL`	LLM model (default: `gpt-4o`)
`COHERE_API_KEY`	Required — relevance reranking
`POSTGRES_HOST` / `POSTGRES_DB`	PostgreSQL connection
`MINIO_ENDPOINT`	MinIO server address (default: `localhost:9000`)
`CHROMA_PERSIST_DIRECTORY`	ChromaDB storage path (default: `./chroma_data`)
`RATE_LIMIT_CHAT`	Chat rate limit (default: `10/minute`)
`RATE_LIMIT_UPLOAD`	Upload rate limit (default: `20/minute`)
`IMAGE_PRESIGN_EXPIRES_MINUTES`	Image URL expiration (default: `5`)
`JWT_SECRET`	Required — JWT signing key (change in production)
`JWT_LIFETIME_SECONDS`	Token expiry (default: `3600`)
`AWS_ACCESS_KEY_ID`	Optional — AWS credentials for SES email
`AWS_SECRET_ACCESS_KEY`	Optional — AWS credentials for SES email
`AWS_REGION`	AWS region for SES (default: `eu-central-1`)
`SES_FROM_EMAIL`	Sender email for verification emails
`VERIFICATION_CODE_LIFETIME_SECONDS`	Code expiry (default: `3600`)
`SQLADMIN_SECRET`	Session cookie key for admin panel

Released under the MIT License.
