Log selected model in Agent.Model() for alloy observability#2134
Log selected model in Agent.Model() for alloy observability#2134derekmisler wants to merge 2 commits intodocker:mainfrom
Conversation
Add slog.Info to Model() so the chosen provider/model ID and pool size are emitted on every model selection. This surfaces in the verbose log artifact, making it possible to see which model handled each agent turn without needing API provider dashboard access.
There was a problem hiding this comment.
Review Summary
Assessment: 🔴 CRITICAL
This PR adds valuable observability logging to track model selection, but introduces a critical bug that can cause runtime panics.
Critical Issues
1 HIGH severity bug found:
- Missing validation for empty models slice can cause panic
2 MEDIUM severity issues found:
- Incorrect pool_size logging when overrides are active
- Insufficient test coverage for override scenarios
Recommendation
The logging feature is useful for alloy observability, but the panic risk must be addressed before merging. The pool_size logging issue should also be fixed to ensure accurate telemetry.
This review focuses only on code added or modified in this PR.
When model overrides are active, pool_size now reports the override count instead of always reporting len(a.models). Also extends the test to cover the override scenario.
|
/review |
There was a problem hiding this comment.
Review Summary
Assessment: 🔴 CRITICAL
Summary
Found 1 CRITICAL issue in the changed code that requires attention before merging.
The PR adds valuable observability by logging model selection details. However, the refactoring exposes a pre-existing panic risk when the models slice is empty.
Critical Issues
- Panic risk:
rand.Intn(0)will panic ifa.modelsis empty (no models configured and no overrides)
Recommendation
Add defensive checks in the Model() method to handle empty model slices gracefully, or ensure validation at agent construction time prevents this scenario entirely.
Summary
slog.Info("Model selected", ...)toAgent.Model()so the chosenprovider/modelID and pool size are logged on every model selectionpool_size > 1indicates alloy was in playMotivation
Model usage metrics (model name, tokens, cost) are already tracked via telemetry events through the Marlin → Snowflake → Sigma pipeline, but that data is aggregate — useful for dashboarding trends, not for debugging a specific run.
This change surfaces model selection inline with the agent conversation in the verbose log artifact, making it possible to see exactly which model handled each agent turn during a particular workflow run. Different use case: debugging a specific run vs. dashboarding trends.
Test plan
TestModelOverrideandTestModelOverride_ConcurrentAccesspassTestModel_LogsSelectionverifies the log line contains agent name, "Model selected" message, and correct pool sizeCloses https://github.com/docker/gordon/issues/236