HelpmateAI is a grounded long-document QA system for PDFs and DOCX files. It indexes documents locally, runs hybrid retrieval with reranking, and returns citation-aware answers with visible evidence. The product is now moving to a Next.js + FastAPI surface while the Python retrieval and generation core stays modular and benchmark-driven.

The current system:
- uploads long-form PDF and DOCX documents
- builds or reuses a persisted local Chroma index keyed by document fingerprint
- runs hybrid retrieval with dense search, lexical search, fusion, and optional reranking
- infers document structure, section kinds, clause metadata, and content-type hints
- uses a planned retrieval stack with:
  - `chunk_first` for exact factual and clause-style questions
  - `synopsis_first` for broader narrative or synthesis questions
  - `hybrid_both` for mixed or distributed evidence questions
- builds section synopses and lightweight topology edges for document-aware retrieval control
- repairs low-confidence journal-style section maps at indexing time with a small bounded model pass
- uses a lightweight LLM-assisted route refinement only when deterministic planning is low-confidence
- uses deterministic structural fallback for weak-evidence cases instead of LLM query rewriting
- uses a dedicated `global_summary_first` route for broad paper-summary questions
- short-circuits obviously irrelevant questions with retrieval guardrails before generation
- runs a bounded post-rerank evidence selector that can promote a lower-ranked chunk when it is more direct than rank 1
- generates grounded answers with citations, evidence panels, and explicit supported/unsupported status
- evaluates retrieval quality with a layered benchmark stack:
  - custom retrieval hit-rate and MRR
  - structure-aware retrieval metrics
  - abstention checks
  - Vectara as the primary external retrieval baseline
  - OpenAI File Search as a historical/reference retrieval baseline
  - `ragas` as the main answer-quality metric
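The hybrid retrieval step above fuses dense and lexical rankings. The fusion method is not named here, but reciprocal rank fusion (RRF) is a common choice for this setup and can be sketched as follows; the function name and the `k` smoothing constant are illustrative, not taken from the repo:

```python
def rrf_fuse(dense_ranked, lexical_ranked, k=60):
    """Fuse two ranked lists of chunk ids with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) per chunk; chunks found by both
    retrievers accumulate score from both lists and rise to the top.
    """
    scores = {}
    for ranking in (dense_ranked, lexical_ranked):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fused = rrf_fuse(["c3", "c1", "c7"], ["c1", "c9", "c3"])
# "c1" and "c3" appear in both lists, so they outrank single-list hits
```

RRF is attractive here because it needs no score calibration between the dense and lexical retrievers, only their rank orders.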
The repo is no longer a notebook demo. It is a real app-shaped project with:
- `frontend/` as the evolving Next.js product UI
- `backend/` as the FastAPI boundary over the Python core
- `app.py` as the retained Streamlit research and benchmark shell
- `Dockerfile` as the backend deployment image
- `Dockerfile.streamlit` as the retained internal Streamlit image
- `src/` for reusable ingestion, retrieval, generation, cache, and UI logic
- `src/structure/`, `src/query_analysis/`, `src/sections/`, and `src/query_router.py` for the document-intelligence and routing layer
- `tests/` for focused fast checks around the core logic
- `docs/` for architecture, evaluation policy, roadmap, and history
The original notebook remains only as a historical reference.
The RAG core is already in a strong position:
- it generalizes across policy documents, theses, and research papers
- it competes well against external retrieval baselines
- it now uses document-topology guidance without sacrificing chunk-grounded answers
- it uses indexing-time structure repair only for suspicious low-confidence PDFs instead of pushing more model work into the live query path
- it can rescue rank-order mistakes with a bounded evidence-selection layer instead of more free-form LLM planning
- it now has a cleaner global-summary route for broad paper and thesis questions while keeping the factual path stable
- it now has a cleaner evaluation story:
  - Vectara for external retrieval comparison
  - `ragas` for answer-quality comparison
- it has reached the point where frontend/product polish is a better next investment than another large retrieval-core rewrite
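The bounded evidence-selection layer mentioned above can be illustrated with a minimal sketch. The real selector's "directness" signal is not specified here, so this version uses query-term overlap with a promotion margin; the function name, window size, and threshold are all assumptions:

```python
def select_evidence(query, ranked_chunks, window=3, margin=0.15):
    """Bounded post-rerank selection: promote a chunk from the top `window`
    when its query-term overlap beats rank 1 by at least `margin`.

    Keeps the reranker's choice by default, so promotion is a rescue path,
    not a second free-form ranking pass.
    """
    terms = set(query.lower().split())

    def overlap(chunk):
        words = set(chunk.lower().split())
        return len(terms & words) / max(len(terms), 1)

    best = ranked_chunks[0]
    best_score = overlap(best)
    for chunk in ranked_chunks[1:window]:
        score = overlap(chunk)
        if score >= best_score + margin:
            best, best_score = chunk, score
    return best
```

The margin and fixed window are what make this "bounded": a lower-ranked chunk must be clearly more direct to displace rank 1, and only a few candidates are ever reconsidered.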
HelpmateAI is at the start of a new phase:
- the backend retrieval system is stable enough to keep
- the product shell is shifting to Next.js + FastAPI
- Streamlit remains useful for fast iteration, demos, and benchmark visibility, but it is now a secondary shell rather than the main product direction
Tech stack:

- Next.js
- FastAPI
- Streamlit
- ChromaDB
- OpenAI
- scikit-learn
- sentence-transformers
- `uv` for project and dependency management
- Install Python dependencies with `uv` and frontend dependencies with `npm install` in [frontend](frontend).
- Set `OPENAI_API_KEY` in `.env` if you want live answer generation and evaluation.
- Run the backend: `uv run uvicorn backend.main:app --reload --port 8001`.
- Run the frontend: `npm run dev` in [frontend](frontend).
- Optionally run `streamlit run app.py` for the internal benchmark/debug shell.
`pyproject.toml` and `uv.lock` are the dependency source of truth.
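Once the backend is up, it can be exercised from Python as well as from the frontend. A minimal sketch, assuming a hypothetical `/ask` endpoint and request fields (`question`, `document_id`) that may differ from the actual FastAPI routes:

```python
import json
import urllib.request

API_BASE = "http://localhost:8001"  # matches the --port 8001 quickstart command


def build_ask_payload(question, document_id):
    """Request body for a hypothetical /ask route; field names are assumptions."""
    return {"question": question, "document_id": document_id}


def ask(question, document_id):
    """POST a question to the backend and return the parsed JSON answer."""
    payload = json.dumps(build_ask_payload(question, document_id)).encode()
    req = urllib.request.Request(
        f"{API_BASE}/ask",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(ask("What is the notice period?", "doc-123"))
```

Check the FastAPI app's interactive docs at `http://localhost:8001/docs` for the real route names and schemas before relying on this shape.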
Recommended production split:
- marketing site: Framer on `www`
- product UI: Next.js on `app`
- API: FastAPI on `api`
Example:
- `www.helpmate.ai` -> Framer
- `app.helpmate.ai` -> Vercel project rooted at [frontend](frontend)
- `api.helpmate.ai` -> FastAPI service using [Dockerfile](Dockerfile)
Important runtime notes:
- the frontend defaults to same-origin `/api`
- production rewrites are controlled by `API_REWRITE_TARGET`
- backend storage paths can be overridden with `HELPMATE_DATA_DIR`, `HELPMATE_UPLOADS_DIR`, `HELPMATE_INDEXES_DIR`, and `HELPMATE_CACHE_DIR`
- backend CORS is controlled by `HELPMATE_CORS_ORIGINS`
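The storage overrides can be resolved with a small helper. Only the variable names come from this README; the default locations below are assumptions, so check the backend's own config for the real defaults:

```python
import os
from pathlib import Path


def storage_paths(env=os.environ):
    """Resolve backend storage directories from the documented env overrides.

    Per-directory overrides win; otherwise each directory nests under the
    (possibly overridden) data directory. Default names are illustrative.
    """
    data = Path(env.get("HELPMATE_DATA_DIR", "data"))
    return {
        "data": data,
        "uploads": Path(env.get("HELPMATE_UPLOADS_DIR", data / "uploads")),
        "indexes": Path(env.get("HELPMATE_INDEXES_DIR", data / "indexes")),
        "cache": Path(env.get("HELPMATE_CACHE_DIR", data / "cache")),
    }
```

This layering means setting `HELPMATE_DATA_DIR` alone relocates everything, while the per-directory variables carve out exceptions.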
See [docs/deployment.md](docs/deployment.md) for the step-by-step deployment plan.
Documentation:

- docs/architecture.md
- docs/architecture-flow.md
- docs/deployment.md
- docs/evals/README.md
- docs/evals/benchmark_summary.md
- docs/frontend-reference.md
- docs/implementation-history.md
- docs/adr/README.md
- ROADMAP.md
- DEVLOG.md
In scope for the current phase:

- supported document types: `.pdf`, `.docx`
- retrieval-first long-document QA
- local-first indexing and caching
- dual-path retrieval with heuristic plus lightweight LLM routing
- deterministic weak-evidence expansion instead of model-based query rewriting
- topology-aware planning plus synopsis retrieval
- low-confidence indexing-time structure repair for noisy journal PDFs
- dedicated global-summary routing for broad paper-summary questions
- bounded post-rerank evidence selection before final answer generation
- benchmark-aware product surface in both the retained Streamlit shell and the newer frontend/backend app flow
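The fingerprint-keyed, local-first indexing above can be sketched as a content hash that names a persisted index directory, so re-uploading an unchanged document reuses its existing Chroma collection instead of re-embedding. This is an illustrative scheme, not necessarily the repo's exact one:

```python
import hashlib
from pathlib import Path


def document_fingerprint(path):
    """Stable content hash for a document; identical bytes -> identical key."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):  # 1 MiB chunks
            h.update(block)
    return h.hexdigest()[:16]  # truncated for a readable directory name


def index_dir(base, doc_path):
    """Directory for the persisted index of this document's content."""
    return Path(base) / document_fingerprint(doc_path)
```

Keying on content rather than filename means renamed copies still hit the cache, and any edit to the file naturally invalidates it.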
Out of scope for the current phase:
- auth and quotas
- hosted user persistence
- paraphrasing/document-rewrite workflows
- full production hardening for multi-user hosted deployment