
bump autoevals version for patch release #171

Merged
Caitlin Pinn (cpinn) merged 1 commit into main from caitlin/bump-version
Feb 12, 2026

Conversation

@cpinn Caitlin Pinn (cpinn) commented Feb 12, 2026

Need to update the patch version for the planned bug fix release.

github-actions bot commented Feb 12, 2026

Braintrust eval report

Autoevals (caitlin/bump-version-1770919263)

| Score | Average | Improvements | Regressions |
| --- | --- | --- | --- |
| NumericDiff | 75.3% (+17pp) | 41 🟢 | - |
| Time_to_first_token | 1.42tok (-1.01tok) | 116 🟢 | 3 🔴 |
| Llm_calls | 1.55 (+0) | - | - |
| Tool_calls | 0 (+0) | - | - |
| Errors | 0 (+0) | - | - |
| Llm_errors | 0 (+0) | - | - |
| Tool_errors | 0 (+0) | - | - |
| Prompt_tokens | 279.25tok (+2.63tok) | - | 60 🔴 |
| Prompt_cached_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_tokens | 0tok (+0tok) | - | - |
| Completion_tokens | 18.6tok (+0.03tok) | - | 1 🔴 |
| Completion_reasoning_tokens | 0tok (+0tok) | - | - |
| Total_tokens | 297.85tok (+2.66tok) | - | 61 🔴 |
| Estimated_cost | 0$ (0$) | 59 🟢 | - |
| Duration | 4s (+0.31s) | 99 🟢 | 120 🔴 |
| Llm_duration | 3.43s (-0.13s) | 66 🟢 | 53 🔴 |


@cpinn Caitlin Pinn (cpinn) merged commit 93f22c1 into main Feb 12, 2026
7 checks passed
github-actions bot commented Feb 12, 2026

Braintrust eval report

Autoevals (main-1770920105)

| Score | Average | Improvements | Regressions |
| --- | --- | --- | --- |
| NumericDiff | 73.5% (-2pp) | - | 4 🔴 |
| Time_to_first_token | 1.43tok (+0.01tok) | 45 🟢 | 74 🔴 |
| Llm_calls | 1.55 (+0) | - | - |
| Tool_calls | 0 (+0) | - | - |
| Errors | 0 (+0) | - | - |
| Llm_errors | 0 (+0) | - | - |
| Tool_errors | 0 (+0) | - | - |
| Prompt_tokens | 279.25tok (+0tok) | - | - |
| Prompt_cached_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_tokens | 0tok (+0tok) | - | - |
| Completion_tokens | 18.6tok (+0tok) | - | - |
| Completion_reasoning_tokens | 0tok (+0tok) | - | - |
| Total_tokens | 297.85tok (+0tok) | - | - |
| Estimated_cost | 0$ (0$) | 60 🟢 | - |
| Duration | 3.5s (-0.5s) | 148 🟢 | 71 🔴 |
| Llm_duration | 2.9s (-0.53s) | 78 🟢 | 41 🔴 |

@Qard Stephen Belanger (Qard) deleted the caitlin/bump-version branch February 12, 2026 18:16