fix(#182): BM25 search uses pagerank to break binary scores #193

Merged
gjczone merged 1 commit into main from fix/issue-182 on Jun 3, 2026

gjczone commented on Jun 3, 2026

Related Issue

closes #182

Changes

  • src/search.py: final_score = bm25 * (1 + pagerank * 10), rounded to 4 decimal places
  • tests/test_issue_182.py: 2 regression tests
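
A minimal sketch of the scoring formula above, to show how the pagerank multiplier separates documents whose BM25 scores are tied. The function signature and the `boost` parameter are assumptions for illustration; the actual code in src/search.py is not shown in this PR description.

```python
def final_score(bm25: float, pagerank: float, boost: float = 10.0) -> float:
    """Scale a BM25 score by (1 + pagerank * boost), rounded to 4 decimals.

    Hypothetical helper: only the formula itself comes from the PR text.
    """
    return round(bm25 * (1 + pagerank * boost), 4)


# Two documents tied at bm25 = 0.5 now receive different final scores.
low = final_score(0.5, 0.01)
high = final_score(0.5, 0.03)
print(low, high)
```

Because the multiplier is 1 when pagerank is 0, a document with no pagerank keeps its raw BM25 score, and a document with a BM25 score of 0 stays at 0 regardless of pagerank.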

Verification

  • unittest: 470 passed
  • pytest: 182 passed
  • ruff + mypy: all green

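- search / fallback_search: final_score = bm25 * (1 + pagerank * 10), rounded to 4 decimal places
- Store the _pageranks array for use during scoring
- Fixes short queries (e.g. "websocket handler") producing only two score levels, 0.5 and 1.0
- 2 regression tests: pagerank tiebreaker, and at least 3 distinct scores on a medium-sized corpus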
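
The two regression tests described above might look roughly like the following sketch. The corpus, file names, and the `rank` helper are hypothetical stand-ins; the real tests in tests/test_issue_182.py presumably exercise the project's own search functions instead.

```python
import unittest


def final_score(bm25: float, pagerank: float) -> float:
    # Formula from the commit message; everything else here is assumed.
    return round(bm25 * (1 + pagerank * 10), 4)


def rank(docs):
    """Take (name, bm25, pagerank) tuples; return (name, score), best first."""
    scored = [(name, final_score(bm25, pr)) for name, bm25, pr in docs]
    return sorted(scored, key=lambda item: item[1], reverse=True)


class TestIssue182Sketch(unittest.TestCase):
    # Hypothetical corpus: raw BM25 scores fall into only two buckets.
    CORPUS = [
        ("a.py", 1.0, 0.02),
        ("b.py", 1.0, 0.05),
        ("c.py", 0.5, 0.01),
        ("d.py", 0.5, 0.04),
    ]

    def test_pagerank_breaks_ties(self):
        # b.py and a.py tie on BM25; the higher pagerank should rank first.
        ranked = [name for name, _ in rank(self.CORPUS)]
        self.assertLess(ranked.index("b.py"), ranked.index("a.py"))

    def test_at_least_three_distinct_scores(self):
        scores = {score for _, score in rank(self.CORPUS)}
        self.assertGreaterEqual(len(scores), 3)
```

Saved to a file, the sketch can be run with `python -m unittest <file>`.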

closes #182
gjczone merged commit 5446d02 into main on Jun 3, 2026
gjczone deleted the fix/issue-182 branch on June 3, 2026 10:42

Development

Successfully merging this pull request may close these issues.

BM25 search scores lack granularity (binary 0.5 / 1.0)
