LabCraft-Eval: a stochastic Inspect AI environment for evaluating AI agents on benign molecular-microbiology protocols, with deterministic four-axis trajectory scoring.
microbiology molecular-biology ai-safety safeguards biosafety agent-evaluation inspect-ai protocol-evaluation
Updated Jun 2, 2026 - Python