Hello,
Some NCAR colleagues pointed me to this project after I saw that the Earth2MIP project was made inactive/stable.
In any case, I think it would be valuable to consider submitting a BoF to the TPC26 conference to discuss a wider multi-institutional and international effort for this work. See https://tpc26.org/agenda-2026/. The organizers plan to finalize sessions by mid-April, and submissions will be reviewed in the order they are received. With AIMIP, this would likely fall under the parallel track...
- Open Suite for Evaluating Model Skills, Knowledge, Reasoning, and Safety (EVAL)
Develop an open suite of tools, methods, and benchmarks for evaluating the scientific skills, knowledge, reasoning, agentic capabilities and safety/security of frontier models and AI systems.
I plan to attend TPC and would be interested in coordinating such a BoF meeting. Would the AIMIP community be interested in pursuing this opportunity?