AI HR Copilot is a compact Streamlit MVP for evidence-based candidate prescreening. It is a decision-support tool for homework and portfolio use, not an ATS, CRM, or autonomous hiring system.
Recruiters and hiring managers often need a fast first-pass candidate review, but a plain keyword matcher is too shallow and a full ATS is too heavy for a compact MVP.
This project evaluates one candidate against one vacancy through a structured workflow that surfaces evidence, gaps, risks, and interview questions instead of hiding everything behind a black-box score.
vacancy -> scorecard -> resume parsing -> evidence extraction -> fit matrix -> heuristic_score + llm_score -> final_score -> decision -> interview questions -> history
- The LLM is instructed to extract evidence, not invent fit.
- The app distinguishes must-have, nice-to-have, responsibilities, and red flags.
- Missing evidence is treated as unconfirmed, not automatically false.
- heuristic_score, llm_score, and final_score are stored separately.
- The final recommendation is explainable through the fit matrix and deterministic rules.
- a fast first-pass review without reducing the candidate to a keyword match;
- separate heuristic_score, llm_score, and final_score instead of one opaque number;
- interview questions, risks, and missing evidence that can be discussed with the candidate;
- local history and candidate comparison for repeated screening sessions.
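The local history and comparison mentioned above could be sketched roughly as follows. This is a minimal illustration, not the project's storage.py: the function names load_history, append_record, and compare are hypothetical, and the record shape (candidate, final_score) is an assumption.

```python
import json
import tempfile
from pathlib import Path


def load_history(path: Path) -> list[dict]:
    """Return saved records, or an empty list if the file is missing or corrupt."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return []
    return data if isinstance(data, list) else []


def append_record(path: Path, record: dict) -> None:
    """Append one analysis record and rewrite the whole history file."""
    history = load_history(path)
    history.append(record)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(history, ensure_ascii=False, indent=2), encoding="utf-8")


def compare(history: list[dict]) -> list[dict]:
    """Rank records by final_score, highest first; empty unless there are two or more."""
    if len(history) < 2:
        return []
    return sorted(history, key=lambda r: r.get("final_score", 0), reverse=True)


# Usage: two saved analyses are enough to produce a comparison.
with tempfile.TemporaryDirectory() as tmp:
    history_path = Path(tmp) / "data" / "history.json"
    append_record(history_path, {"candidate": "A", "final_score": 62})
    append_record(history_path, {"candidate": "B", "final_score": 78})
    ranking = compare(load_history(history_path))
    print([r["candidate"] for r in ranking])
```

Rewriting the whole file on every append is acceptable for a single-user demo but would not be safe with concurrent writers.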
vacancy text + candidate name + PDF or manual resume text
-> resume preparation and truncation
-> vacancy scorecard + evidence extraction via OpenAI JSON analysis
-> deterministic local scoring
-> heuristic_score + llm_score -> final_score
-> decision + confidence + interview questions
-> JSON download + local history record
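The last scoring steps of this pipeline might look like the sketch below. The source does not state how the two scores are combined or which thresholds drive the decision, so the linear blend, the llm_weight parameter, the 70/40 cut-offs, and the decision labels are all assumptions chosen for illustration.

```python
def blend_scores(heuristic_score: float, llm_score: float, llm_weight: float = 0.5) -> float:
    """Combine both scores into final_score on a 0-100 scale (assumed linear blend)."""
    if not 0.0 <= llm_weight <= 1.0:
        raise ValueError("llm_weight must be between 0 and 1")
    final = (1.0 - llm_weight) * heuristic_score + llm_weight * llm_score
    return round(max(0.0, min(100.0, final)), 1)


def decide(final_score: float, unconfirmed_must_haves: int) -> str:
    """Deterministic rule with hypothetical thresholds and labels."""
    if final_score >= 70 and unconfirmed_must_haves == 0:
        return "proceed"
    if final_score >= 40:
        return "clarify"
    return "decline"


final_score = blend_scores(heuristic_score=60, llm_score=80)
print(final_score, decide(final_score, unconfirmed_must_haves=1))
```

Keeping the rule as a plain function means the same inputs always give the same recommendation, which is what makes the decision explainable.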
- Streamlit renders separate metrics for final_score, heuristic_score, and llm_score.
- The app stores history records and shows candidate comparison when there are at least two saved analyses.
- JSON download is built directly from the final result model.
- The app must not fail before analysis if OPENAI_API_KEY is missing; this behavior is documented and guarded in the runtime flow.
- The repository includes pytest coverage for app import smoke, GPT request building, parsing, scoring, models, and storage.
- Streamlit UI
- vacancy input
- candidate name input
- PDF resume upload via pdfplumber
- manual resume text fallback
- resume cleanup and truncation via MAX_RESUME_CHARS
- OpenAI JSON analysis
- deterministic local scoring for hard skills, experience, and soft skills
- adjustable score weights in UI
- local JSON history
- candidate comparison view
- JSON report download
- Docker support
- pytest coverage for parser, models, scoring, storage, and app import smoke
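Deterministic local scoring with adjustable weights could be sketched as a normalized weighted average. The function name weighted_score, the 0-100 subscore scale, and the sample values are assumptions; the actual logic in scoring.py may differ.

```python
def weighted_score(subscores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of subscores; weights are normalized, so they need not sum to 1."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    weighted_sum = sum(subscores[name] * weight for name, weight in weights.items())
    return round(weighted_sum / total_weight, 1)


# Hypothetical subscores on a 0-100 scale and weights as they might come from UI sliders.
subscores = {"hard_skills": 80, "experience": 60, "soft_skills": 50}
weights = {"hard_skills": 5, "experience": 3, "soft_skills": 2}
print(weighted_score(subscores, weights))
```

Normalizing by the weight total means a user can move one slider without having to rebalance the others.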
.
├── app.py
├── config.py
├── parser.py
├── prompts.py
├── gpt_service.py
├── scoring.py
├── models.py
├── storage.py
├── requirements.txt
├── Dockerfile
├── docker-compose.yml
├── data/
│   └── .gitkeep
└── tests/
    ├── conftest.py
    ├── test_app_smoke.py
    ├── test_models.py
    ├── test_parser.py
    ├── test_scoring.py
    └── test_storage.py
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
streamlit run app.py
docker compose up --build
If .env does not exist yet:
cp .env.example .env
docker compose up --build
The app is exposed on http://localhost:8501.
Defined in .env.example:
- OPENAI_API_KEY
- OPENAI_MODEL
- MAX_RESUME_CHARS
- HISTORY_PATH
- LOG_LEVEL
- HARD_SKILLS_WEIGHT
- EXPERIENCE_WEIGHT
- SOFT_SKILLS_WEIGHT
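One way these variables might be read and applied is sketched below. The variable names come from .env.example; the Settings class, the helper names, and every default value (12000 characters, INFO, and so on) are assumptions, not the contents of config.py.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    openai_model: str
    max_resume_chars: int
    history_path: str
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment; all defaults here are placeholders."""
    return Settings(
        openai_api_key=env.get("OPENAI_API_KEY", ""),
        openai_model=env.get("OPENAI_MODEL", ""),
        max_resume_chars=int(env.get("MAX_RESUME_CHARS", "12000")),
        history_path=env.get("HISTORY_PATH", "data/history.json"),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )


def prepare_resume(text: str, max_chars: int) -> str:
    """Collapse runs of whitespace, then cut the text to at most max_chars characters."""
    cleaned = " ".join(text.split())
    return cleaned[:max_chars]


settings = load_settings({"MAX_RESUME_CHARS": "10"})
print(prepare_resume("Python   developer\n\nwith 5 years", settings.max_resume_chars))
```

An empty OPENAI_API_KEY is allowed at load time on purpose, so that importing the configuration never fails on its own.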
Important runtime behavior:
- the app must not crash without OPENAI_API_KEY before the user clicks Analyze
- .env must not be committed
- data/history.json must not be committed
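The first rule above amounts to checking the key lazily, at analysis time rather than at import time. A minimal sketch, assuming a hypothetical run_analysis entry point and a custom MissingApiKeyError, with the actual OpenAI call left out:

```python
import os
from collections.abc import Mapping
from typing import Optional


class MissingApiKeyError(RuntimeError):
    """Raised only when analysis is requested without a configured key."""


def get_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Look up the key on demand so that importing the app never fails."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise MissingApiKeyError("OPENAI_API_KEY is not set; add it to .env before running analysis.")
    return key


def run_analysis(vacancy: str, resume: str, env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical handler for the Analyze button: report the problem instead of crashing."""
    try:
        get_api_key(env)
    except MissingApiKeyError as exc:
        return {"ok": False, "error": str(exc)}
    # The real request to the model would be made here; it is omitted in this sketch.
    return {"ok": True, "error": None}


print(run_analysis("vacancy text", "resume text", env={}))
```

In a Streamlit app the error string could be shown to the user as a message, while the rest of the page keeps rendering.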
Required fields always present:
- score
- strengths
- weaknesses
- missing_skills
- summary
Extended fields with graceful fallback:
- decision
- confidence
- vacancy_scorecard
- fit_matrix
- risks
- interview_questions
- heuristic_score
- llm_score
- final_score
vacancy_scorecard contains:
- must_have
- nice_to_have
- responsibilities
- soft_skills
- red_flags
Each fit_matrix item contains:
- criterion
- type
- found
- evidence
- criterion_score
- confidence
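The contract above, with required fields always present and extended fields falling back gracefully, could be enforced with a small normalizer like the one below. The field names are taken from the lists above; the function name parse_llm_result and the specific default values are assumptions rather than the behavior of models.py.

```python
import copy
import json

# Assumed defaults: required fields get empty values, extended fields get None or empty containers.
REQUIRED_DEFAULTS = {"score": 0, "strengths": [], "weaknesses": [], "missing_skills": [], "summary": ""}
EXTENDED_DEFAULTS = {
    "decision": None,
    "confidence": None,
    "vacancy_scorecard": {},
    "fit_matrix": [],
    "risks": [],
    "interview_questions": [],
    "heuristic_score": None,
    "llm_score": None,
    "final_score": None,
}


def parse_llm_result(raw: str) -> dict:
    """Parse the model's JSON reply; invalid or partial JSON still yields every field."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        data = {}
    if not isinstance(data, dict):
        data = {}
    result = {}
    for key, default in {**REQUIRED_DEFAULTS, **EXTENDED_DEFAULTS}.items():
        value = data.get(key)
        # Copy defaults so that callers never share one mutable list or dict.
        result[key] = copy.deepcopy(default) if value is None else value
    return result


partial = parse_llm_result('{"score": 72, "summary": "Solid backend profile"}')
print(partial["score"], partial["fit_matrix"], partial["decision"])
```

Unknown keys in the reply are dropped, which keeps the downloaded JSON report limited to the documented fields.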
pytest
streamlit run app.py
docker compose up --build
- The LLM may still return invalid or partial JSON; the app falls back safely but cannot guarantee perfect analysis quality.
- PDF parsing can be noisy on badly formatted resumes.
- Heuristic scoring is intentionally simple and explainable, not production-grade ranking logic.
- Local JSON history is suitable for a demo MVP, not for multi-user production use.
- SQLite or PostgreSQL instead of local JSON
- DOCX support
- OCR for scanned resumes
- hh.ru integration
- PDF or Markdown export
- model comparison across multiple OpenAI models
This MVP covers the assignment requirements through:
- Streamlit interface
- vacancy input
- candidate name input
- PDF parsing with pdfplumber
- manual text fallback
- .env-based OpenAI integration
- strict JSON response contract
- mandatory output fields
- local scoring with editable weights
- JSON history and candidate comparison
- JSON report download
- Dockerfile and docker-compose.yml
- portfolio-grade README
- minimal pytest suite