Description
Hey! I'm not sure if this is currently possible, so I'm opening this issue to ask.
I'd like my scorer to return rich metadata alongside the score. Currently, it seems a scorer can only return a primitive (Integer, Float) or a Hash that gets converted to a float score. What I'd love to be able to do is return a structured result like this:
return {
  score: 0.5,
  metadata: {
    failure_type: "missing_argument_keys",
    reason: "Correct tool \"#{expected_name}\" but missing #{missing_keys.length} expected key(s): #{missing_keys.join(", ")}",
    expected_name: expected_name,
    actual_name: actual_name,
    mismatched_arguments: ...
  }
}
I would like this to be accessible on the Experiment interface.
This would be really useful for debugging and analysis. Attaching structured context (failure reasons, mismatched values, etc.) directly to a score makes it much easier to understand why a score is what it is, and to create filters in the Experiment to inspect a subset of failures.
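To make the request more concrete, here is a minimal sketch in plain Ruby of the kind of scorer and filtering I have in mind. It does not use any SDK API: `score_tool_call`, the `wrong_tool` failure type, the `missing_keys` field, and the `results` array are hypothetical names made up for illustration, and the input shape (`{ name:, arguments: }`) is an assumption.

```ruby
# Hypothetical scorer, not part of any SDK: compares an expected tool call
# with the actual one. Both arguments are assumed to be plain hashes of the
# form { name: String, arguments: Hash }.
def score_tool_call(expected, actual)
  expected_name = expected[:name]
  actual_name = actual[:name]

  # Wrong tool entirely: score 0.0 with a reason attached.
  if expected_name != actual_name
    return {
      score: 0.0,
      metadata: {
        failure_type: "wrong_tool",
        reason: "Expected tool \"#{expected_name}\" but got \"#{actual_name}\"",
        expected_name: expected_name,
        actual_name: actual_name
      }
    }
  end

  # Right tool: check whether any expected argument keys are missing.
  missing_keys = expected[:arguments].keys - actual[:arguments].keys
  return { score: 1.0, metadata: {} } if missing_keys.empty?

  {
    score: 0.5,
    metadata: {
      failure_type: "missing_argument_keys",
      reason: "Correct tool \"#{expected_name}\" but missing #{missing_keys.length} expected key(s): #{missing_keys.join(", ")}",
      expected_name: expected_name,
      actual_name: actual_name,
      missing_keys: missing_keys
    }
  }
end

# Hypothetical batch of results, standing in for what an Experiment would hold.
expected = { name: "search", arguments: { query: "ruby", limit: 5 } }
results = [
  score_tool_call(expected, { name: "search", arguments: { query: "ruby", limit: 5 } }),
  score_tool_call(expected, { name: "search", arguments: { query: "ruby" } }),
  score_tool_call(expected, { name: "lookup", arguments: {} })
]

# The kind of filter I'd like to build in the Experiment view:
# only the failures caused by missing argument keys.
missing_key_failures = results.select do |result|
  result[:metadata][:failure_type] == "missing_argument_keys"
end

missing_key_failures.each { |result| puts result[:metadata][:reason] }
```

With structured metadata like this, the same filter could in principle be expressed on any metadata field (for example `failure_type` or `expected_name`) rather than only on the numeric score.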
Is there a way to achieve this today that I'm missing? If not, would you be open to supporting this kind of structured return value from Scorers?
Thanks!