feat(evaluator): expose two-pass extraction in UI and docs #38
Open
DeryFerd wants to merge 1 commit into
Co-authored-by: Cursor <cursoragent@cursor.com>
jankric pushed a commit to jankric/evonic that referenced this pull request on May 16, 2026
…prevent reassignment bug
The function declaration at original line 1354 could reassign showTab at runtime, overwriting the first override at line 1216. Moving the declaration to before both overrides ensures the wrapping chain works correctly: showTab -> wrapper2 -> wrapper1 -> original
Summary
Evonic already runs two-pass evaluation under the hood (Pass 1 answers the prompt, Pass 2 extracts a clean final value before scoring), but that behavior was mostly invisible: no in-app docs, no way to turn it off without editing `.env`, and no clear signal on the evaluators page that `two_pass` uses Pass 2.

This PR exposes that workflow for maintainers and benchmark operators.

Settings API: `GET/PUT /api/settings/two-pass-enabled` stores the preference in the app settings table. `AnswerExtractor.is_enabled()` checks the DB first, then falls back to `TWO_PASS_ENABLED` from the environment, so toggling in the UI applies on the next evaluation run without a server restart.

Evaluators UI (`/evaluate/evaluators`): A panel at the top explains what Pass 2 does and includes an enable/disable toggle. Built-in evaluators that use Pass 2 (e.g. `two_pass`) show a small Pass 2 badge in the list.

Documentation: Adds `docs/two-pass-evaluation.md` (flow, config, result fields, how to disable) and serves it in-app at `/evaluate/docs/two-pass` with a link from the evaluators page.

No change to extraction prompts or scoring logic; this is documentation and operator controls only.
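For reviewers, the DB-first, environment-fallback lookup described under Settings API can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `settings` table layout, the `two_pass_enabled` key, the accepted truthy strings, and the default of enabled when nothing is configured are all assumptions. Only the `TWO_PASS_ENABLED` variable name and the lookup order come from the description above.

```python
import os
import sqlite3

# Hypothetical key name and table layout; the real schema may differ.
SETTING_KEY = "two_pass_enabled"
_TRUTHY = {"1", "true", "yes", "on"}


def _parse_bool(raw):
    """Interpret a stored or environment string as a boolean (assumed format)."""
    return raw.strip().lower() in _TRUTHY


def create_settings_table(conn):
    """Create a simple key/value settings table for the sketch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS settings (key TEXT PRIMARY KEY, value TEXT)"
    )


def set_two_pass_enabled(conn, enabled):
    """Persist the preference, as a PUT on the settings endpoint might."""
    conn.execute(
        "INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)",
        (SETTING_KEY, "true" if enabled else "false"),
    )
    conn.commit()


def is_two_pass_enabled(conn, environ=os.environ, default=True):
    """Check the DB first, then TWO_PASS_ENABLED, then an assumed default."""
    row = conn.execute(
        "SELECT value FROM settings WHERE key = ?", (SETTING_KEY,)
    ).fetchone()
    if row is not None and row[0] is not None:
        return _parse_bool(str(row[0]))
    raw = environ.get("TWO_PASS_ENABLED")
    if raw is not None:
        return _parse_bool(raw)
    return default
```

Because the value is read on each call rather than cached at import time, a change written through the settings table is visible to the next evaluation run, which matches the "no server restart" behavior described above.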
Validation
- `python -m pytest tests/test_answer_extractor.py::TestTwoPassEnabled tests/test_evaluators.py::TestTwoPassEvaluator -q` (4 passed)
- Open `/evaluate/evaluators`, toggle two-pass off/on, confirm `PUT /api/settings/two-pass-enabled` succeeds
- Open `/evaluate/docs/two-pass` and confirm the guide renders
- `pass2` metadata still appears in results when enabled
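The manual toggle check above could also be scripted. The sketch below only builds the `PUT` request with the standard library and does not send it. The base URL and the JSON body shape (`{"enabled": ...}`) are assumptions for illustration; the actual payload expected by `/api/settings/two-pass-enabled` is not stated in this PR description.

```python
import json
import urllib.request

# Path taken from the PR description; everything else here is assumed.
SETTINGS_PATH = "/api/settings/two-pass-enabled"


def build_toggle_request(base_url, enabled):
    """Build (but do not send) a PUT request that sets the two-pass flag.

    The {"enabled": bool} body is a hypothetical payload shape.
    """
    body = json.dumps({"enabled": bool(enabled)}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + SETTINGS_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Against a running instance, passing the returned request to `urllib.request.urlopen` and checking the response status would cover the "confirm PUT succeeds" step, assuming the payload shape matches what the server accepts.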