(feat) Automatic Evals

We should explore adding support for automatically evaluating the results of our evals. Some basic ideas:

Unit-test-like features:
1. Contains: the output includes an expected substring
2. Equals: the output exactly matches an expected string
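
A rough sketch of what these two checks could look like, just to make the idea concrete. All names here (`CheckResult`, `contains`, `equals`, `run_checks`) are hypothetical and not part of any existing API, and ignoring surrounding whitespace in `equals` is an assumption, not a decision.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CheckResult:
    """Outcome of one automatic check (hypothetical shape)."""
    name: str
    passed: bool


# A check takes the model output and returns a result.
Check = Callable[[str], CheckResult]


def contains(expected: str) -> Check:
    """Pass when the output includes `expected` as a substring."""
    def check(output: str) -> CheckResult:
        return CheckResult(f"contains({expected!r})", expected in output)
    return check


def equals(expected: str) -> Check:
    """Pass when the output matches `expected`.

    Assumption: leading/trailing whitespace is ignored on both sides.
    """
    def check(output: str) -> CheckResult:
        return CheckResult(f"equals({expected!r})", output.strip() == expected.strip())
    return check


def run_checks(output: str, checks: List[Check]) -> List[CheckResult]:
    """Run every check against a single output, in order."""
    return [c(output) for c in checks]


results = run_checks("The capital of France is Paris.\n", [
    contains("Paris"),
    equals("Paris"),
])
for r in results:
    print(r.name, "PASS" if r.passed else "FAIL")
```

Returning a result object rather than a bare boolean would let us report which check failed when an eval has several of them.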

Model Evaluators: We could also have a parent act as the evaluator. This may be useful for outputs that people don't want to evaluate by hand, or that don't have simple pass/fail rules.
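
One possible shape for a model-based evaluator, sketched under heavy assumptions: the judge is passed in as a plain callable (`JudgeFn`, hypothetical) so nothing here depends on a specific model API, and the PASS/FAIL prompt format is made up for illustration. `fake_judge` is a stand-in so the example runs without calling a real model.

```python
from typing import Callable

# Hypothetical: any function that takes a prompt and returns the judge's text reply.
JudgeFn = Callable[[str], str]


def model_evaluator(judge: JudgeFn, criteria: str) -> Callable[[str], bool]:
    """Build a check that asks `judge` whether an output meets `criteria`."""
    def check(output: str) -> bool:
        # Assumed prompt format; a real one would need iteration.
        prompt = (
            f"Criteria: {criteria}\n"
            f"Output: {output}\n"
            "Answer PASS or FAIL."
        )
        verdict = judge(prompt)
        # Treat anything that does not start with PASS as a failure.
        return verdict.strip().upper().startswith("PASS")
    return check


def fake_judge(prompt: str) -> str:
    """Stand-in for a real model call: passes any prompt containing 'thank you'."""
    return "PASS" if "thank you" in prompt.lower() else "FAIL"


is_polite = model_evaluator(fake_judge, "The reply is polite.")
print(is_polite("Thank you for waiting."))
print(is_polite("Go away."))
```

Keeping the judge as an injected callable means the same evaluator could sit next to the Contains/Equals checks and be swapped for a fake in tests.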