Merged
10 changes: 10 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,15 @@
# Changelog

## [0.15.12] — 2026-06-04

### Changed
- **`OverallBestEntry` model** (`app/models.py`): added `agent_path: str` and `commit_hash: str` to each per-spec best record. `GET /leaderboard/overall` and `GET /rounds/{id}/leaderboard` now carry the data needed to deep-link from an agent's profile straight to the exact GitHub commit of the code that achieved each score. This closes a flywheel gap: the dashboard's AgentDetailPage shows per-problem results but had no way to surface the code that earned them, so a contributor who wanted to fork an entry had to browse the repo manually. The queries in `app/routes/leaderboard.py` and `app/routes/rounds.py` now SELECT `s.agent_path, s.commit_hash` and pass them through to the model.
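
A minimal sketch of how a client might turn the two new fields into a deep link. The helper name `build_commit_url`, the repository URL `https://github.com/example-org/example-repo`, and the sample hash are assumptions for illustration; only the field names `agent_path` and `commit_hash` come from this change. It relies on GitHub's `/tree/<commit>/<path>` URL form.

```python
from urllib.parse import quote


def build_commit_url(repo_url: str, commit_hash: str, agent_path: str) -> str:
    """Return a GitHub tree URL pinned to one commit and one directory.

    Hypothetical helper: the dashboard's real link-building code is not
    shown in this change.
    """
    base = repo_url.rstrip("/")
    # quote() leaves "/" unescaped by default, so path separators survive.
    path = quote(agent_path.strip("/"))
    return f"{base}/tree/{quote(commit_hash)}/{path}"


# Sample record shaped like the two new OverallBestEntry fields (values made up).
entry = {"agent_path": "agents/foo/v1", "commit_hash": "0123abc"}
url = build_commit_url(
    "https://github.com/example-org/example-repo",  # assumed repo URL
    entry["commit_hash"],
    entry["agent_path"],
)
print(url)
```

Pinning the link to the commit hash rather than a branch name means the link keeps pointing at the code that earned the score even after the agent directory changes.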

### Tests
- The existing 24 tests in `test_overall_leaderboard.py` and `test_rounds.py` continue to pass because the new fields are populated from columns the test fixtures already provide.

---

## [0.15.11] — 2026-06-04

### Changed
2 changes: 2 additions & 0 deletions app/models.py
@@ -140,6 +140,8 @@ class OverallBestEntry(BaseModel):
normalized_score: float # rank / (N+1) where N = entries for this spec; 1.0 = not entered
submission_id: str
submitted_at: datetime
agent_path: str # repo path to this submission's agent dir (e.g. "agents/foo/v1")
commit_hash: str # commit hash for deep-linking to the exact code on GitHub


class OverallLeaderboardEntry(BaseModel):
3 changes: 3 additions & 0 deletions app/routes/leaderboard.py
@@ -97,6 +97,7 @@ async def _compute_overall_leaderboard() -> OverallLeaderboard:
SELECT s.id as submission_id, s.spec_id, s.contributor,
s.mass_grams, COALESCE(s.score, s.mass_grams) as score,
COALESCE(s.score_metric, 'mass_grams') as score_metric,
s.agent_path, s.commit_hash,
s.submitted_at
FROM submissions s
WHERE s.spec_id IN ({spec_id_placeholders}) AND s.passed = 1
@@ -152,6 +153,8 @@ async def _compute_overall_leaderboard() -> OverallLeaderboard:
normalized_score=normalized,
submission_id=row["submission_id"],
submitted_at=row["submitted_at"],
agent_path=row["agent_path"],
commit_hash=row["commit_hash"],
)
)
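
The pass-through pattern in the two hunks above can be sketched in isolation as follows. This is an illustration under stated assumptions, not the project's code: it uses the standard-library `sqlite3` module with a simplified `submissions` table, a dataclass `BestEntry` standing in for the Pydantic `OverallBestEntry`, and a hypothetical function `fetch_passing_entries`. Only the column names and the `passed = 1` filter are taken from the diff.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class BestEntry:
    """Hypothetical stand-in for OverallBestEntry (subset of fields)."""
    submission_id: str
    agent_path: str
    commit_hash: str


def fetch_passing_entries(conn: sqlite3.Connection, spec_ids: list[str]) -> list[BestEntry]:
    # One "?" per spec id keeps the IN (...) clause parameterized.
    placeholders = ",".join("?" for _ in spec_ids)
    rows = conn.execute(
        f"""
        SELECT s.id AS submission_id, s.agent_path, s.commit_hash
        FROM submissions s
        WHERE s.spec_id IN ({placeholders}) AND s.passed = 1
        ORDER BY s.id
        """,
        spec_ids,
    ).fetchall()
    # Selected columns are copied straight into the model, as in the diff.
    return [
        BestEntry(
            submission_id=row["submission_id"],
            agent_path=row["agent_path"],
            commit_hash=row["commit_hash"],
        )
        for row in rows
    ]


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows row["column"] access
conn.execute(
    "CREATE TABLE submissions "
    "(id TEXT, spec_id TEXT, passed INTEGER, agent_path TEXT, commit_hash TEXT)"
)
conn.executemany(
    "INSERT INTO submissions VALUES (?, ?, ?, ?, ?)",
    [
        ("sub-1", "spec-a", 1, "agents/foo/v1", "0123abc"),  # passing
        ("sub-2", "spec-a", 0, "agents/bar/v2", "4567def"),  # failing, filtered out
    ],
)
entries = fetch_passing_entries(conn, ["spec-a"])
print(entries)
```

Because both fields are declared as required `str` on the model, a row with a NULL `agent_path` or `commit_hash` would fail validation in the real Pydantic model; whether the production schema allows NULLs in those columns is not shown in this change.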

3 changes: 3 additions & 0 deletions app/routes/rounds.py
@@ -168,6 +168,7 @@ async def get_round_leaderboard(round_id: str):
SELECT s.id as submission_id, s.spec_id, s.contributor,
s.mass_grams, COALESCE(s.score, s.mass_grams) as score,
COALESCE(s.score_metric, 'mass_grams') as score_metric,
s.agent_path, s.commit_hash,
s.submitted_at
FROM submissions s
WHERE s.spec_id IN ({placeholders}) AND s.passed = 1
@@ -216,6 +217,8 @@ async def get_round_leaderboard(round_id: str):
normalized_score=normalized,
submission_id=row["submission_id"],
submitted_at=row["submitted_at"],
agent_path=row["agent_path"],
commit_hash=row["commit_hash"],
)
)
