We should explore adding support for automatically evaluating the outputs of our evals. Some basic ideas:
Unit-test-like features:
- Contains
- Equals
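A minimal sketch of what the two checks above could look like, assuming each check is a plain function from an output string to a pass/fail boolean. All names here (`contains`, `equals`, `run_checks`, and the `case_sensitive` and `strip` options) are hypothetical and only illustrate the idea; nothing like this exists in the project yet.

```python
from typing import Callable, Dict

# Hypothetical type: a check takes the eval output and returns pass/fail.
Check = Callable[[str], bool]


def contains(expected: str, case_sensitive: bool = True) -> Check:
    """Build a check that passes when the output includes `expected`."""
    def check(output: str) -> bool:
        if case_sensitive:
            return expected in output
        return expected.lower() in output.lower()
    return check


def equals(expected: str, strip: bool = True) -> Check:
    """Build a check that passes when the output matches `expected` exactly.

    Surrounding whitespace is ignored by default (an assumption, since
    model outputs often carry a trailing newline).
    """
    def check(output: str) -> bool:
        if strip:
            return output.strip() == expected.strip()
        return output == expected
    return check


def run_checks(output: str, checks: Dict[str, Check]) -> Dict[str, bool]:
    """Run every named check against one output and collect the results."""
    return {name: check(output) for name, check in checks.items()}


results = run_checks(
    "The capital of France is Paris.\n",
    {
        "mentions_paris": contains("paris", case_sensitive=False),
        "exact_answer": equals("Paris"),
    },
)
print(results)
```

Keeping each check as a small closure means new rule types could be added later without changing how results are collected.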
Model evaluators: maybe we could also have a parent act as the evaluator. This could be useful for outputs that people don't want to evaluate by hand, or that don't have simple rules to check against.
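One way a model evaluator could be shaped, as a rough sketch only. It assumes the caller supplies an `ask_model` function that sends a prompt to whichever model acts as the evaluator and returns its text reply. `ask_model`, `model_evaluator`, the prompt wording, and the PASS/FAIL convention are all hypothetical choices made for illustration.

```python
from typing import Callable


def model_evaluator(
    ask_model: Callable[[str], str],
    criterion: str,
) -> Callable[[str], bool]:
    """Build a check that asks a model whether the output meets `criterion`.

    `ask_model` is a hypothetical hook: it takes a prompt string and
    returns the evaluator model's reply as a string.
    """
    def check(output: str) -> bool:
        prompt = (
            "You are grading a response against a criterion.\n"
            f"Criterion: {criterion}\n"
            f"Response: {output}\n"
            "Answer with exactly PASS or FAIL."
        )
        verdict = ask_model(prompt).strip().upper()
        # Anything that does not start with PASS is treated as a failure.
        return verdict.startswith("PASS")
    return check


# Stand-in for a real model call, used here only to show the flow.
def fake_model(prompt: str) -> str:
    return "PASS" if "thank you" in prompt.lower() else "FAIL"


is_polite = model_evaluator(fake_model, "The response is polite.")
print(is_polite("Thank you for waiting."))
print(is_polite("Go away."))
```

Because the result is the same output-to-boolean shape as a rule-based check, model evaluators and simple rules could share one reporting path.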