feat(scoring): local LLM hybrid mode — Claude calibrates, Qwen2.5 scores subsequent runs

## Summary

Extend the hybrid scoring mode to use a local LLM (llama.cpp + Qwen2.5-Coder-1.5B) for subsequent runs after Claude has calibrated the scoring profile on the first pass. Claude acts as teacher on run 1, generating scored examples and reasoning. The local model acts as student on all subsequent runs, pattern-matching against those anchors.

## Motivation

The current hybrid mode transitions from LLM scoring → static rule-based scoring after the first run. Static rules miss nuance — they can't adapt to new job types or edge cases. A local LLM with few-shot examples from Claude is richer signal: near-zero cost, no API dependency, and more flexible than hand-crafted rules.

Hardware constraint: VPS limits local model to 1.5B parameters. The few-shot anchoring approach makes this viable — Qwen2.5-Coder-1.5B doesn't need to reason from scratch, it pattern-matches against Claude's established scores.

## Design

### Run 1 — Claude (teacher)
- Scores the first batch of jobs for a given CV as today
- Additionally generates a set of **scored examples with reasoning** and saves them to `scoring_profiles/[cv_name].json`:
  ```json
  {
    "criteria": ["senior PM title", "Paris or remote", "data/AI domain"],
    "red_flags": ["engineering-only role", "outside France"],
    "examples": [
      {"title": "...", "company": "...", "score": 88, "reasoning": "Strong data platform background, Paris, PM title confirmed"},
      {"title": "...", "company": "...", "score": 52, "reasoning": "Engineering role, no product ownership scope"}
    ]
  }
  ```

### Run 2+ — Qwen2.5-Coder (student)
- Loads profile + few-shot examples from `scoring_profiles/[cv_name].json`
- Prompt: "Here is a CV profile and N scored examples. Score the following job using the same logic. Return JSON with score and one-line reasoning."
- Qwen pattern-matches against Claude's anchors
- Jobs within `uncertainty_band` escalate back to Claude for re-scoring

### New scoring mode
Add `local_llm` as a valid value for `scoring.mode`, or extend `hybrid` to support a `local_provider` config key:

```yaml
scoring:
  mode: hybrid
  local_provider: llamacpp         # used for subsequent runs instead of static rules
  uncertainty_band: [65, 85]       # escalate to cloud LLM if local score falls here
```

### Integration
- `llama.cpp` in server mode exposes an OpenAI-compatible API at `localhost:8080/v1`
- New `LlamaCppProvider` follows the same pattern as `openai_provider.py` with custom `base_url`
- No API key required — local inference only

## What this replaces

Replaces the static rule-based second pass in hybrid mode. Static scoring stays available as a fallback if the local server isn't running.

## Related

- Extends the existing hybrid scorer (`providers/scoring/llm_scorer.py`, `scoring_profiles/`)
- Related to `feat/kimi-k2-provider` branch — `LlamaCppProvider` shares the OpenAI-compatible base_url pattern already prototyped there
- Related to the local LLM idea in `wiki/projects/ajsaa/ideas/unimplemented/local_llm_for_mundane_tasks.md`

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(scoring): local LLM hybrid mode — Claude calibrates, Qwen2.5 scores subsequent runs #54

Summary

Motivation

Design

Run 1 — Claude (teacher)

Run 2+ — Qwen2.5-Coder (student)

New scoring mode

Integration

What this replaces

Related

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

feat(scoring): local LLM hybrid mode — Claude calibrates, Qwen2.5 scores subsequent runs #54

Description

Summary

Motivation

Design

Run 1 — Claude (teacher)

Run 2+ — Qwen2.5-Coder (student)

New scoring mode

Integration

What this replaces

Related

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions