Merged
13 changes: 8 additions & 5 deletions .devcontainer/devcontainer.json
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
{
"name": "Python 3",
// Or use a Dockerfile or Docker Compose file. More info: https://containers.dev/guide/dockerfile
"image": "mcr.microsoft.com/devcontainers/python",
"name": "Immermatch Dev",
"image": "mcr.microsoft.com/devcontainers/python:3.13",
"customizations": {
"codespaces": {
"openFiles": [
@@ -13,11 +12,15 @@
"settings": {},
"extensions": [
"ms-python.python",
"ms-python.vscode-pylance"
"ms-python.vscode-pylance",
"charliermarsh.ruff",
"tamasfe.even-better-toml",
"GitHub.copilot",
"GitHub.copilot-chat"
]
}
},
"updateContentCommand": "[ -f packages.txt ] && sudo apt update && sudo apt upgrade -y && sudo xargs apt install -y <packages.txt; [ -f requirements.txt ] && pip3 install --user -r requirements.txt; pip3 install --user streamlit; echo '✅ Packages installed and Requirements met'",
"postCreateCommand": "pip install -e '.[dev,test]' && pre-commit install --hook-type pre-commit --hook-type pre-push && cp -n .env.example .env 2>/dev/null || true",
"postAttachCommand": {
"server": "streamlit run immermatch/app.py --server.enableCORS false --server.enableXsrfProtection false"
},
18 changes: 18 additions & 0 deletions .editorconfig
@@ -0,0 +1,18 @@
root = true

[*]
indent_style = space
indent_size = 4
end_of_line = lf
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = true

[*.{yml,yaml,json,jsonc,toml}]
indent_size = 2

[*.md]
trim_trailing_whitespace = false

[Makefile]
indent_style = tab
33 changes: 33 additions & 0 deletions .github/prompts/pr-review.prompt.md
@@ -0,0 +1,33 @@
Fetch and address review comments from the most recent PR on the current branch.

## Execution policy

- Run all `gh` commands (or equivalent GitHub MCP calls) immediately without asking for confirmation.
- Do **not** start code edits until after presenting a full comment assessment and getting explicit user confirmation.

## Workflow

1. **Get the PR number:** `gh pr view --json number --jq .number`
2. **Get review comments:**
Use the GitHub MCP to fetch PR review comments:
```bash
gh pr view --json reviews --jq '.reviews[] | {author: .author.login, body, state}'
gh pr view --json comments --jq '.comments[] | {author: .author.login, body, path, line}'
gh api repos/{owner}/{repo}/pulls/{number}/comments --jq '.[] | {path, line, body, user: .user.login}'
```
3. **List all comments first (no edits yet):**
- Produce a complete checklist of every review comment.
- For each item include:
- **Assessment:** valid / duplicate / not applicable
- **Suggestion:** exact fix you plan to apply (or why you will skip)
4. **Ask for confirmation:**
- Share the full checklist with the user.
- Ask for explicit confirmation before implementing any code/document changes.
5. **After confirmation, implement valid changes**, then run the check suite:
```bash
source .venv/bin/activate && pytest tests/ -x -q && ruff check --fix . && ruff format --check . && mypy .
```
6. **Commit strategy:**
- **Trivial fixes** (typos, naming, small refactors): `git add -A && git commit --amend --no-edit && git push --force-with-lease`
- **Substantive changes** (new tests, logic changes, API modifications): `git add -A && git commit -m "fix: address PR feedback — <summary>" && git push`
7. **NEVER** force-push to `main` or any shared branch — only the current feature branch
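The checklist step above (fold every review comment into an assessed list before editing) can be sketched as follows. This is a minimal illustration, not part of the prompt file: the field names follow the jq filters shown in step 2, and the `build_checklist` function name plus the TODO placeholders are assumptions for demonstration.

```python
import json

def build_checklist(raw: str) -> list[dict]:
    """Fold jq output (one JSON object per line) into checklist items."""
    items = []
    for line in raw.splitlines():
        if not line.strip():
            continue
        c = json.loads(line)
        items.append({
            "author": c.get("author") or c.get("user"),
            "location": f"{c.get('path', '?')}:{c.get('line', '?')}",
            "body": c["body"],
            "assessment": "TODO: valid / duplicate / not applicable",
            "suggestion": "TODO: exact fix to apply, or reason to skip",
        })
    return items

raw = '{"author": "alice", "body": "Rename this variable", "path": "immermatch/app.py", "line": 12}'
checklist = build_checklist(raw)
print(checklist[0]["location"])  # immermatch/app.py:12
```

The checklist is then presented to the user verbatim before any edit is made, per the execution policy.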
8 changes: 7 additions & 1 deletion .gitignore
@@ -12,6 +12,10 @@ __pycache__/
*.so
.Python
.coverage
.mypy_cache/
.ruff_cache/
.pytest_cache/
htmlcov/
build/
develop-eggs/
dist/
@@ -31,7 +35,9 @@ immermatch.egg-info/

# IDE
.idea/
.vscode/
.vscode/*
!.vscode/extensions.json
!.vscode/launch.json
*.swp
*.swo

10 changes: 10 additions & 0 deletions .vscode/extensions.json
@@ -0,0 +1,10 @@
{
"recommendations": [
"ms-python.python",
"ms-python.vscode-pylance",
"charliermarsh.ruff",
"tamasfe.even-better-toml",
"GitHub.copilot",
"GitHub.copilot-chat"
]
}
41 changes: 41 additions & 0 deletions .vscode/launch.json
@@ -0,0 +1,41 @@
{
"version": "0.2.0",
"configurations": [
{
"name": "Streamlit App",
"type": "debugpy",
"request": "launch",
"module": "streamlit",
"args": ["run", "immermatch/app.py"],
"cwd": "${workspaceFolder}",
"env": {},
"justMyCode": true
},
{
"name": "pytest: Current File",
"type": "debugpy",
"request": "launch",
"module": "pytest",
"args": ["${file}", "-x", "-v"],
"cwd": "${workspaceFolder}",
"justMyCode": false
},
{
"name": "pytest: All Tests",
"type": "debugpy",
"request": "launch",
"module": "pytest",
"args": ["tests/", "-x", "-v"],
"cwd": "${workspaceFolder}",
"justMyCode": false
},
{
"name": "Daily Task",
"type": "debugpy",
"request": "launch",
"program": "${workspaceFolder}/daily_task.py",
"cwd": "${workspaceFolder}",
"justMyCode": true
}
]
}
79 changes: 28 additions & 51 deletions AGENTS.md
@@ -14,7 +14,7 @@ This document defines the persona, context, and instruction sets for the AI agen
**Input:** Raw text extracted from a CV (PDF, DOCX, Markdown, or plain text).
**Output:** A structured JSON summary of the candidate.

**System Prompt:**
**System Prompt:** *(source of truth: `immermatch/search_agent.py:PROFILER_SYSTEM_PROMPT`)*
> You are an expert technical recruiter with deep knowledge of European job markets.
> You will be given the raw text of a candidate's CV. Extract a comprehensive profile.
>
@@ -73,7 +73,7 @@ The system prompt is selected based on the active **SearchProvider**:

Used when `provider.name == "Bundesagentur für Arbeit"`. Generates keyword-only queries (no location tokens) because the BA API has a dedicated `wo` parameter for location filtering.

**System Prompt:**
**System Prompt:** *(source of truth: `immermatch/search_agent.py:BA_HEADHUNTER_SYSTEM_PROMPT`)*
> You are a Search Specialist generating keyword queries for the German Federal Employment Agency job search API (Bundesagentur für Arbeit).
>
> Based on the candidate's profile, generate distinct keyword queries to find relevant job openings. The API searches across German job listings and handles location filtering separately.
@@ -94,7 +94,7 @@ Used when `provider.name == "Bundesagentur für Arbeit"`. Generates keyword-only

Used when `provider.name != "Bundesagentur für Arbeit"` (e.g., SerpApiProvider for non-German markets). Generates location-enriched queries optimised for Google Jobs.

**System Prompt:**
**System Prompt:** *(source of truth: `immermatch/search_agent.py:HEADHUNTER_SYSTEM_PROMPT`)*
> You are a Search Specialist. Based on the candidate's profile and location, generate 20 distinct search queries to find relevant job openings.
>
> IMPORTANT: Keep queries SHORT and SIMPLE (1-3 words). Google Jobs works best with simple, broad queries.
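The provider-based prompt selection described in this section can be sketched as follows. The prompt strings here are placeholders for the quoted prompts above, and the function name is illustrative; `immermatch/search_agent.py` holds the actual dispatch.

```python
# Placeholder prompt texts — the real ones are defined in
# immermatch/search_agent.py.
BA_HEADHUNTER_SYSTEM_PROMPT = "keyword-only queries for the BA API"
HEADHUNTER_SYSTEM_PROMPT = "location-enriched queries for Google Jobs"

def select_headhunter_prompt(provider_name: str) -> str:
    # The BA API handles location via its dedicated `wo` parameter,
    # so its queries must stay keyword-only (no location tokens).
    if provider_name == "Bundesagentur für Arbeit":
        return BA_HEADHUNTER_SYSTEM_PROMPT
    return HEADHUNTER_SYSTEM_PROMPT

print(select_headhunter_prompt("SerpApi"))
```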
@@ -139,7 +139,7 @@ class SearchProvider(Protocol):

**Output:** A JSON object with score, reasoning, and missing skills.

**System Prompt:**
**System Prompt:** *(source of truth: `immermatch/evaluator_agent.py:SCREENER_SYSTEM_PROMPT`)*
> You are a strict Hiring Manager. Evaluate if the candidate is a fit for this specific job.
>
> **Scoring Rubric (0-100):**
@@ -162,7 +162,7 @@ class SearchProvider(Protocol):
- Temperature: 0.2 (low for consistent scoring)
- Max tokens: 8192

**Execution:** Jobs are evaluated in parallel using `ThreadPoolExecutor(max_workers=10)` with thread-safe progress tracking. On API errors, a fallback score of 50 is assigned.
**Execution:** Jobs are evaluated in parallel using `ThreadPoolExecutor(max_workers=30)` with thread-safe progress tracking. On API errors, a fallback score of 50 is assigned.
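A minimal sketch of this fan-out pattern, assuming a per-job evaluate callable; the real implementation lives in `immermatch/evaluator_agent.py`, and the progress tracking here is simplified to a locked counter.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import threading

FALLBACK_SCORE = 50  # assigned when the API call errors out

def evaluate_all(jobs, evaluate_one, max_workers=30):
    """Evaluate jobs in parallel, degrading gracefully on per-job errors."""
    results = {}
    done = 0
    lock = threading.Lock()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(evaluate_one, job): job for job in jobs}
        for fut in as_completed(futures):
            job = futures[fut]
            try:
                results[job] = fut.result()
            except Exception:
                results[job] = FALLBACK_SCORE  # fallback on API errors
            with lock:  # thread-safe progress counter
                done += 1
    return results

def fake_eval(job):
    if job == "broken":
        raise RuntimeError("API error")
    return 80

out = evaluate_all(["a", "broken"], fake_eval)
print(out["a"], out["broken"])  # 80 50
```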

---

@@ -175,7 +175,7 @@ class SearchProvider(Protocol):

**Output:** A markdown-formatted career summary string (not JSON).

**System Prompt:**
**System Prompt:** *(source of truth: `immermatch/evaluator_agent.py:ADVISOR_SYSTEM_PROMPT`)*
> You are a career advisor. Given a candidate profile and their evaluated job matches, write a very brief summary. Use a friendly and encouraging tone, but be honest about the fit. Focus on actionable insights. Use emojis to make it more engaging.
>
> Structure your response in plain text with these sections:
@@ -231,7 +231,7 @@ SERPAPI_PARAMS = {

### Blocked Job Portals (SerpApi only)

Jobs from the following portals are discarded during search result parsing (see `search_agent.py:_BLOCKED_PORTALS`):
Jobs from the following portals are discarded during search result parsing (see `immermatch/serpapi_provider.py:BLOCKED_PORTALS`):

> bebee, trabajo, jooble, adzuna, jobrapido, neuvoo, mitula, trovit, jobomas, jobijoba, talent, jobatus, jobsora, studysmarter, jobilize, learn4good, grabjobs, jobtensor, zycto, terra.do, jobzmall, simplyhired

@@ -241,6 +241,8 @@ Listings with no remaining apply links after filtering are skipped entirely.

## 6. Pydantic Schemas

*(Source of truth: `immermatch/models.py` — keep this section in sync when fields change.)*

All models use `Field()` with descriptions and defaults where appropriate.

```python
@@ -559,42 +561,30 @@ Immermatch is **free to self-host** (bring your own API keys). The official host

## 14. Development Workflow & Agent Instructions

This section documents the development process and conventions for both human and AI agents working on this codebase. `CLAUDE.md` is a lightweight quick-reference version of these instructions that Claude Code loads automatically. It points agents here for full architecture context.
This section documents the development process and conventions for both human and AI agents working on this codebase.

### Quick Reference (for AI agents)
### Agent instruction files

```bash
# Activate the virtual environment first — ALWAYS required:
source .venv/bin/activate

# Test: pytest tests/ -x -q
# Lint: ruff check --fix . && ruff format --check .
# Types: mypy .
# Run app: streamlit run immermatch/app.py
# All: ruff check --fix . && mypy . && pytest tests/ -x -q
```
| File | Consumed by | Purpose |
|---|---|---|
| `CLAUDE.md` | Claude Code | Lightweight quick-reference: env setup, check suite, rules, architecture table |
| `.github/copilot-instructions.md` | GitHub Copilot Chat | Same rules as CLAUDE.md, tuned for Copilot context |
| `.github/copilot/*.prompt.md` | Copilot Chat (reusable prompts) | `write-tests`, `new-db-function`, `new-pydantic-model`, `pr-review` |
| `AGENTS.md` (this file) | All agents + humans | Full architecture docs — the single source of truth |

### Makefile

Common tasks are wrapped in a `Makefile` at the repo root:

**IMPORTANT:** After every code change, run the check suite **without asking for permission** — just do it:
```bash
source .venv/bin/activate && pytest tests/ -x -q && ruff check --fix . && ruff format --check . && mypy .
make check # pytest + ruff lint + ruff format + mypy (the full gate)
make test # pytest only
make lint # ruff check --fix + ruff format --check
# Copilot AI (Mar 2, 2026): The `make lint` description here says it runs
# both `ruff check --fix` and `ruff format --check`, but the newly added
# Makefile's `lint` target only runs `ruff check --fix`. Update this table
# or the Makefile so they match.
# Suggested change:
# - make lint       # ruff check --fix + ruff format --check
# + make lint       # ruff check --fix
make typecheck # mypy
make run # streamlit run
make coverage # pytest --cov with term-missing report
make clean # remove caches and build artifacts
```
Do not ask the user "Shall I run the tests?" — always run them automatically.

### Conventions for AI agents

- **Always activate the virtual environment** (`source .venv/bin/activate`) before running any command (`pytest`, `ruff`, `mypy`, `streamlit`, etc.). The project's dependencies are installed only in `.venv`.
- Use `google-genai` package, NOT the deprecated `google.generativeai`
- Gemini model: `gemini-3-flash-preview`
- Pydantic models live in `immermatch/models.py` — follow existing patterns
- All external services (Gemini, SerpAPI, Supabase, Resend) must be mocked in tests — no API keys needed to run `pytest`
- Shared test fixtures in `tests/conftest.py`: `sample_profile`, `sample_job`, `sample_evaluation`, `sample_evaluated_job`
- Test fixture files (sample CVs, etc.) live in `tests/fixtures/`
- All DB writes use the admin client (`get_admin_client()`), never the anon client
- Log subscriber UUIDs, never email addresses
- All `st.error()` calls must show generic messages; real exceptions go to `logger.exception()`
- Follow the test file naming convention: `tests/test_<module>.py` for `immermatch/<module>.py`
- After implementing changes, always run `pytest tests/ -x -q` to verify nothing is broken
- Prefer external libraries and built-in functions over custom code wherever possible (e.g., for date parsing, string manipulation); this increases reliability and reduces bugs
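For the mocking rule above, a minimal sketch of what a test might look like. The `evaluate_job` function and its client call shape are hypothetical stand-ins for illustration; match the real import path and signatures in `immermatch/evaluator_agent.py` when writing actual tests.

```python
from unittest.mock import MagicMock

# Hypothetical evaluator call — the real `google-genai` client wiring
# lives in immermatch/evaluator_agent.py.
def evaluate_job(client, job_text: str) -> int:
    response = client.models.generate_content(
        model="gemini-3-flash-preview", contents=job_text
    )
    return int(response.text)

def test_evaluate_job_with_mocked_client():
    client = MagicMock()  # no API key needed — the service is fully stubbed
    client.models.generate_content.return_value.text = "85"
    assert evaluate_job(client, "Senior Python Engineer posting") == 85
    client.models.generate_content.assert_called_once()

test_evaluate_job_with_mocked_client()
```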

### Development workflow

@@ -634,16 +624,3 @@ The recommended workflow for implementing tasks/issues:
Already configured in `.pre-commit-config.yaml`:
- **On commit:** trailing whitespace, YAML/TOML/JSON checks, large file check, merge conflict detection, private key detection, secrets scanning, ruff lint+format, mypy
- **On push:** full test suite (`pytest tests/ -x -q --tb=short`)

---

# Open Issues
- How to deal with many API requests for SerpAPI? It's quite expensive at scale.
- Make UI more engaging and personalized (use first name?).
- Some jobs don't exist anymore, but are still found by SerpAPI through job aggregators. Can we detect and filter these better?
- Let the user also personalize the search/evaluation by editing the generated queries, their profile, or having an extra "preferences" input (e.g., "I want to work in fintech", "I want a remote job", "I don't want to work for big corporations")?
- Let the user upload multiple CVs (e.g., one for software engineering, one for data science) and route them to different job searches?
- Let the user update their daily digest preferences (e.g., "only send me jobs with score > 80", "send me a weekly digest instead of daily")?
- Integrate Stripe for paid newsletter subscriptions (Phase 2).
- Write issue templates for the public repo.
- The SerpAPI query and the job evaluation are currently separate steps. Can we combine them to save API calls? For example, can we ask Gemini to generate the search queries AND evaluate the jobs in one go? Or can we at least evaluate each job as we parse it, instead of collecting them all and then evaluating? This might increase speed.
1 change: 1 addition & 0 deletions CLAUDE.md
@@ -13,6 +13,7 @@ source .venv/bin/activate # ALWAYS required before any command

```bash
source .venv/bin/activate && pytest tests/ -x -q && ruff check --fix . && ruff format --check . && mypy .
# Or simply: make check
```

## Rules
46 changes: 46 additions & 0 deletions Makefile
@@ -0,0 +1,46 @@
.PHONY: check test lint coverage format typecheck run install install-dev clean

SHELL := /bin/bash

## Run the full check suite (test + lint + format + typecheck)
check:
source .venv/bin/activate && pytest tests/ -x -q && ruff check --fix . && ruff format --check . && mypy .

## Run tests only
test:
source .venv/bin/activate && pytest tests/ -x -q

## Run tests with coverage
coverage:
source .venv/bin/activate && pytest tests/ --cov=immermatch --cov-report=term-missing

## Lint with ruff (auto-fix)
lint:
source .venv/bin/activate && ruff check --fix .
Copilot AI (Mar 2, 2026): The `lint` target only runs `ruff check --fix .`, but the docs elsewhere in this PR describe `make lint` as also running `ruff format --check`. Either update the lint recipe to include the format check, or adjust the documentation so they stay consistent.
Suggested change:
- source .venv/bin/activate && ruff check --fix .
+ source .venv/bin/activate && ruff check --fix . && ruff format --check .

## Format with ruff
format:
source .venv/bin/activate && ruff format .

## Type check with mypy
typecheck:
source .venv/bin/activate && mypy .

## Run the Streamlit app locally
run:
source .venv/bin/activate && streamlit run immermatch/app.py

## Install runtime dependencies
install:
python -m venv .venv
source .venv/bin/activate && pip install -e .

## Install all dependencies (runtime + test + dev + pre-commit hooks)
install-dev:
python -m venv .venv
source .venv/bin/activate && pip install -e ".[dev,test]" && pre-commit install --hook-type pre-commit --hook-type pre-push

## Remove build artifacts and caches
clean:
rm -rf .mypy_cache .ruff_cache .pytest_cache htmlcov .coverage build dist *.egg-info __pycache__
find . -type d -name __pycache__ -exec rm -rf {} + 2>/dev/null || true