WS3 — enrichment evaluation harness #8

A test bench for measuring the **quality** of the enrichment system. Unlike the mechanical validator, which is a per-run integrity gate, this harness measures whether summaries, topic assignments and overviews are *good*, so that configurations, rubrics and executors can be compared and regressions caught.

### Scope
- Layered evaluation: item summaries, topic assignment, topic-page overviews.
- A gold set produced by a frontier model + an LLM-as-judge scorer.
- An action loop that feeds eval findings back into the declarative rubrics.
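
To make the layered design concrete, here is a minimal sketch of how per-layer scoring against a gold set might look. Everything named here is hypothetical and not taken from the project: the layer identifiers, `GoldExample`, `evaluate`, `regressions`, and the `tolerance` default are illustrative assumptions. The judge is modelled as a plain callable so a real LLM-as-judge scorer could be swapped in; `overlap_judge` is only a deterministic token-overlap stand-in, not an LLM.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical layer names mirroring the three layers in scope.
LAYERS = ("item_summary", "topic_assignment", "topic_overview")


@dataclass(frozen=True)
class GoldExample:
    layer: str      # one of LAYERS
    source: str     # input handed to the enrichment system
    reference: str  # gold output (assumed to come from a frontier model)


# judge(layer, candidate, reference) -> score in [0, 1]
Judge = Callable[[str, str, str], float]
# produce(layer, source) -> candidate output of the configuration under test
Producer = Callable[[str, str], str]


def evaluate(gold: List[GoldExample], produce: Producer, judge: Judge) -> Dict[str, float]:
    """Return the mean judge score per layer, skipping layers with no examples."""
    scores: Dict[str, List[float]] = {layer: [] for layer in LAYERS}
    for ex in gold:
        candidate = produce(ex.layer, ex.source)
        scores[ex.layer].append(judge(ex.layer, candidate, ex.reference))
    return {layer: mean(vals) for layer, vals in scores.items() if vals}


def regressions(baseline: Dict[str, float], candidate: Dict[str, float],
                tolerance: float = 0.05) -> List[str]:
    """List layers where the candidate scores more than `tolerance` below baseline."""
    return [
        layer for layer in baseline
        if layer in candidate and candidate[layer] < baseline[layer] - tolerance
    ]


def overlap_judge(layer: str, candidate: str, reference: str) -> float:
    """Stand-in judge: fraction of reference tokens present in the candidate."""
    ref = set(reference.lower().split())
    cand = set(candidate.lower().split())
    return len(ref & cand) / len(ref) if ref else 0.0
```

Under these assumptions, running `evaluate` once per configuration and passing the two result dictionaries to `regressions` would flag the layers whose quality dropped. Those flagged layers are the kind of finding the action loop could feed back into the rubrics.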

### Status
Designed, not yet planned. To be built once the WS2 enrichment layers exist, which they now do.
