Word-lesson audio cache ignores CEFR level → mixed-level word lessons #636

## Problem

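A daily **word lesson** can mix CEFR levels across its word segments. Example from prod (lesson `863`, a `three_words_lesson`): the three segments carry `['A1', 'A1', 'A2']`, so at least one word has example sentences written for a different level than the rest of the lesson. Nothing in the lookup bounds the gap, so an A1 learner can also get a word whose example sentences were written for B2.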

## Root cause

`AudioLessonMeaning.find()` keys the cache on `meaning` + `teacher_language` only — `difficulty_level` is **not** part of the lookup:

```python
# zeeguu/core/model/audio_lesson_meaning.py:69
@classmethod
def find(cls, meaning, teacher_language=None):
    """Find a non-deprecated audio lesson for a specific meaning and teacher language."""
    query = cls.query.filter_by(meaning=meaning).filter(cls.deprecated_at.is_(None))
    if teacher_language:
        query = query.filter_by(teacher_language_id=teacher_language.id)
    return query.first()
```

So the **first row ever generated** for a `(meaning, teacher_language)` pair wins and is reused for every later user, regardless of their level. The `difficulty_level` column is written at generation time but never consulted on lookup.

The meaning-audio script is genuinely level-dependent — the prompt says *"the lesson is for somebody who is CEFR level {cefr_level} so ensure that sentences are of the appropriate difficulty."* So a reused row really does carry the wrong difficulty for the new learner, not just a mislabeled tag.
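The reuse behavior can be reproduced outside the app. The sketch below is a minimal illustration, not the real model: it uses an in-memory SQLite table with hypothetical, simplified columns (`meaning_id`, `teacher_language_id`, `difficulty_level`, `deprecated_at`) and a hypothetical `find_level_blind()` helper that mirrors the filter shown above. The `ORDER BY id` is added here only to make the result deterministic; the original query has no explicit ordering.

```python
import sqlite3

# Hypothetical, simplified stand-in for the AudioLessonMeaning table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE audio_lesson_meaning (
        id INTEGER PRIMARY KEY,
        meaning_id INTEGER NOT NULL,
        teacher_language_id INTEGER NOT NULL,
        difficulty_level TEXT NOT NULL,
        deprecated_at TEXT
    )
    """
)

# Assume the first learner to need meaning 42 was at B2, so a B2 script is stored.
conn.execute(
    "INSERT INTO audio_lesson_meaning "
    "(meaning_id, teacher_language_id, difficulty_level) VALUES (?, ?, ?)",
    (42, 1, "B2"),
)


def find_level_blind(meaning_id, teacher_language_id):
    """Mirror of the current lookup: difficulty_level is not part of the filter."""
    return conn.execute(
        "SELECT id, difficulty_level FROM audio_lesson_meaning "
        "WHERE meaning_id = ? AND teacher_language_id = ? "
        "AND deprecated_at IS NULL "
        "ORDER BY id LIMIT 1",
        (meaning_id, teacher_language_id),
    ).fetchone()


# A later A1 learner asks for the same meaning and is handed the B2 row.
row = find_level_blind(42, 1)
print(row)  # (1, 'B2')
```

The caller never passes its own level, so the stored `difficulty_level` is returned as-is rather than compared against anything.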

## Asymmetry worth noting

Dialogue lessons (topic/situation) **do** filter the cache by level — `AudioLessonDialogue.find_unheard(...)` includes `difficulty_level=cefr_level`. So dialogue lessons are level-correct; only meaning-audio reuse is level-blind. The fix would bring meaning-audio in line with how dialogues already work.

## Why now

Surfaced while adding a per-lesson `DailyAudioLesson.cefr_level()` to the API responses (for the shared-lesson link preview). Because segments can disagree, that method currently reports the **most common** level across segments to mask the inconsistency — a workaround that would be unnecessary if word-audio were cached per level.
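For reference, a "most common level" rule could look like the sketch below. The actual body of `DailyAudioLesson.cefr_level()` is not shown in this issue, so the helper name `most_common_level` and its tie and empty-input behavior are assumptions made for illustration.

```python
from collections import Counter


def most_common_level(segment_levels):
    """Return the level that occurs most often, or None for an empty lesson.

    On a tie, Counter.most_common keeps first-seen order, so the level
    that appears first among the tied ones is returned.
    """
    if not segment_levels:
        return None
    return Counter(segment_levels).most_common(1)[0][0]


print(most_common_level(["A1", "A1", "A2"]))  # A1
```

For lesson `863` this reports `A1` and hides the `A2` segment, which is exactly the masking described above.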

## Options

1. **Cache per level**: add `difficulty_level` to the `AudioLessonMeaning.find()` filter, matching the dialogue behavior. This costs more generation and reduces cross-user reuse, but each learner gets level-appropriate examples.
2. **Reuse across adjacent levels only** (e.g. share within A1–A2 and within B1–B2) to keep some reuse while bounding the mismatch.
3. **Accept & document**: the headword and translation are identical regardless of level; only example-sentence complexity differs. If that is deemed acceptable, stop treating the per-segment level as meaningful and define the lesson level as the level of the user the lesson was generated for.
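One way to compare the three options is to look at the cache key each one implies. The sketch below is illustrative only: `cache_key`, the strategy names, and the `LEVEL_BANDS` mapping are hypothetical, and the C1/C2 band is an assumption, since the issue only mentions A1–A2 and B1–B2.

```python
# Hypothetical banding for option 2. Only the A and B bands come from the
# issue text; grouping C1 with C2 is an assumption.
LEVEL_BANDS = {
    "A1": "A", "A2": "A",
    "B1": "B", "B2": "B",
    "C1": "C", "C2": "C",
}


def cache_key(meaning_id, teacher_language_id, cefr_level, strategy):
    """Build the lookup key that a given caching strategy would use."""
    if strategy == "per_level":      # option 1
        return (meaning_id, teacher_language_id, cefr_level)
    if strategy == "banded":         # option 2
        return (meaning_id, teacher_language_id, LEVEL_BANDS[cefr_level])
    if strategy == "level_blind":    # option 3 (current behavior)
        return (meaning_id, teacher_language_id)
    raise ValueError(f"unknown strategy: {strategy}")


# A1 and A2 learners share a row under banding, but not under per-level caching.
print(cache_key(42, 1, "A1", "banded") == cache_key(42, 1, "A2", "banded"))        # True
print(cache_key(42, 1, "A1", "per_level") == cache_key(42, 1, "A2", "per_level"))  # False
```

Under this framing, option 1 yields up to six rows per `(meaning, teacher_language)` pair, option 2 up to three, and option 3 exactly one.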

## Acceptance

- Decide the caching key for meaning-audio.
- If changing: update `AudioLessonMeaning.find()` + the regeneration path in `daily_lesson_generator.generate_audio_lesson_meaning()`, and consider whether existing rows need backfill/deprecation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
