fix: Anchored scoring + rule-based Frontmatter checks + TEHC blind-spot coverage #6
Open
MuseFantasy wants to merge 2 commits into
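P0: Anchored scoring replaces unanchored judging
- Added an "anchored scoring protocol": the LLM compares the input against anchor examples at 3 tiers instead of freely assigning a 1-10 score
- All Phase 1/2 scoring steps now use anchor comparison
- Added constraint rule alchaincyf#8: anchored scoring is mandatory (temperature=0, thinking disabled)
- Confidence exit: a low confidence automatically triggers 2-model cross-validation
- Rationale: Hashemi et al., ACL 2024 (anchored comparison deviates by ≤3 points vs 8-15 points for unanchored judging)

P1: Dimension 1 Frontmatter made rule-based
- name format / description quality / version + license are now a deterministic checklist
- Each item has concrete score tiers (3/2/1/0); the LLM is only a fallback

P2: TEHC blind spots filled
- H quality: Dimension 3 adds anti-pattern warnings + an exception-specificity check
- C auto-verification: Dimension 4 adds programmatically verifiable completion conditions (exit code / lint / file existence)
- Negative triggers: Dimension 1 adds a "when not to trigger" check
- Added a TEHC four-component coverage mapping table

New file: references/anchor-library/dimension-anchors.md (anchor examples for all 8 dimensions + TEHC mapping)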
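Fixes
P0: Anchored scoring replaces unanchored LLM judging (fixes inconsistent scores)
The original Darwin Phase 1 uses free-form scoring: "rate each dimension from 1 to 10".
Fix: adopt the LLM-Rubric anchored-comparison method (Hashemi et al., ACL 2024).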
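As a rough illustration of the anchored-comparison flow with the low-confidence exit, here is a minimal Python sketch. It is not the code in this PR: `Anchor`, `Verdict`, `Judge`, and `anchored_score` are hypothetical names, the tier scores are placeholders, and the rule for resolving a disagreement between the two models (take the lower score) is an assumption, since the PR only says that low confidence triggers 2-model cross-validation. The judges are plain callables; in practice each would wrap an LLM call made with temperature=0 and thinking disabled.

```python
from dataclasses import dataclass
from typing import Callable, Literal

Confidence = Literal["high", "low"]


@dataclass(frozen=True)
class Anchor:
    label: str    # tier name, e.g. "weak" / "adequate" / "strong" (hypothetical)
    score: int    # score attached to the tier (placeholder values)
    example: str  # anchor example text shown to the judge


@dataclass(frozen=True)
class Verdict:
    label: str              # label of the anchor the candidate is closest to
    confidence: Confidence  # the judge's self-reported confidence


# A judge picks the closest anchor for a candidate (hypothetical interface).
Judge = Callable[[str, list[Anchor]], Verdict]


def anchored_score(
    candidate: str,
    anchors: list[Anchor],
    primary: Judge,
    secondary: Judge,
) -> tuple[int, bool]:
    """Return (score, cross_validated).

    The primary judge chooses the closest anchor. Only when it reports
    low confidence is the secondary judge consulted.
    """
    score_of = {a.label: a.score for a in anchors}
    first = primary(candidate, anchors)
    if first.confidence != "low":
        return score_of[first.label], False
    second = secondary(candidate, anchors)
    # Assumption: on disagreement, keep the more conservative (lower) score.
    return min(score_of[first.label], score_of[second.label]), True
```

Because the judge never emits a number, only an anchor label, the score can take only one of the three tier values.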
P1: Dimension 1 Frontmatter made rule-based (removes LLM randomness)
Dimension 1 (Frontmatter quality, weight 8) is now a deterministic checklist.
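A deterministic checklist of this kind could look roughly like the sketch below. The three checked items (name format, description quality, version + license) and the 3/2/1/0 tiers come from this PR; the concrete criteria for each tier are illustrative assumptions, as are all function names. The input is an already-parsed frontmatter dict, and mapping the raw total onto the dimension's weight of 8 is left out.

```python
import re

# Assumed formats; the PR does not spell out the exact patterns.
KEBAB = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
SEMVER = re.compile(r"\d+\.\d+\.\d+")


def score_name(name):
    """3: kebab-case; 2: kebab-case after trivial normalization; 1: other; 0: missing."""
    if not name or not name.strip():
        return 0
    name = name.strip()
    if KEBAB.fullmatch(name):
        return 3
    normalized = re.sub(r"[\s_]+", "-", name.lower())
    return 2 if KEBAB.fullmatch(normalized) else 1


def score_description(description):
    """One point each for length, a trigger cue, and a negative-trigger cue."""
    if not description or not description.strip():
        return 0
    text = description.strip().lower()
    checks = [
        len(text) >= 30,                                        # long enough to be informative
        "use when" in text,                                     # says when to trigger
        any(cue in text for cue in ("do not use", "not for")),  # says when not to trigger
    ]
    return sum(checks)


def score_version_license(frontmatter):
    """3: both present and version is x.y.z; 2: both present; 1: one present; 0: none."""
    version = str(frontmatter.get("version") or "").strip()
    license_ = str(frontmatter.get("license") or "").strip()
    present = bool(version) + bool(license_)
    if present == 2:
        return 3 if SEMVER.fullmatch(version) else 2
    return present


def score_frontmatter(frontmatter):
    """Score each checklist item and return the per-item scores plus their sum."""
    scores = {
        "name": score_name(frontmatter.get("name")),
        "description": score_description(frontmatter.get("description")),
        "version_license": score_version_license(frontmatter),
    }
    scores["total"] = sum(scores.values())
    return scores
```

The same input always yields the same score, so the LLM is needed only for cases the rules cannot decide.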
P2: Filling the TEHC four-component blind spots
Added references/anchor-library/dimension-anchors.md: anchor examples for all 8 dimensions + the TEHC coverage mapping table
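For the programmatically verifiable completion conditions mentioned for Dimension 4 (exit code / lint / file existence), a minimal Python sketch follows. The helper names `command_succeeds` and `unmet_conditions` are hypothetical and not part of this PR. A lint step is treated as just another command whose exit code must be 0.

```python
import subprocess
import sys
from pathlib import Path


def command_succeeds(argv, cwd="."):
    """True if the command runs and exits with code 0."""
    try:
        result = subprocess.run(argv, cwd=cwd, capture_output=True)
    except OSError:
        # The executable is missing or cannot be started.
        return False
    return result.returncode == 0


def unmet_conditions(commands, required_files, cwd="."):
    """Return a list of failure messages; an empty list means 'done'."""
    failures = []
    for argv in commands:
        if not command_succeeds(argv, cwd):
            failures.append("command failed: " + " ".join(argv))
    for rel_path in required_files:
        if not (Path(cwd) / rel_path).is_file():
            failures.append(f"missing file: {rel_path}")
    return failures


if __name__ == "__main__":
    # Example: one command that must exit 0 and one file that must exist.
    problems = unmet_conditions(
        commands=[[sys.executable, "-c", "raise SystemExit(0)"]],
        required_files=["SKILL.md"],
    )
    print(problems or "all completion conditions met")
```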