

---
title: Schedule
nav: true
---

<style> .schedule-table { width: 100%; border-collapse: separate; border-spacing: 0; margin-bottom: 20px; box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1); border-radius: 8px; overflow: hidden; } .schedule-table th, .schedule-table td { border-right: 1px solid #e0e0e0; border-bottom: 1px solid #e0e0e0; padding: 12px; text-align: left; } .schedule-table th:last-child, .schedule-table td:last-child { border-right: none; } .schedule-table tr:last-child td { border-bottom: none; } .schedule-table th { background-color: #f0f0f0; font-weight: bold; } .schedule-table tr:nth-child(even) { background-color: #f8f9fa; } .schedule-table tr:hover { background-color: #e9ecef; } .time-column { white-space: nowrap; font-weight: bold; } .session-column { font-weight: bold; } .description-column ul { margin: 0; padding-left: 20px; } @media (max-width: 768px) { .schedule-table { box-shadow: none; border-radius: 0; overflow: visible; } .schedule-table, .schedule-table tbody, .schedule-table tr, .schedule-table td { display: block; } .schedule-table thead { display: none; } .schedule-table tr { margin-bottom: 15px; border: 1px solid #e0e0e0; border-radius: 8px; overflow: hidden; box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1); } .schedule-table td { border: none; position: relative; padding-left: 50%; } .schedule-table td:before { content: attr(data-label); position: absolute; left: 6px; width: 45%; padding-right: 10px; white-space: nowrap; font-weight: bold; } .time-column, .session-column { background-color: #f0f0f0; } .schedule-table td:empty { display: none; } } </style>

All times are in Pacific Time (Vancouver, BC local time).

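<table class="schedule-table">
  <thead>
    <tr>
      <th>Time</th>
      <th>Session</th>
      <th>Description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td class="time-column">9:00 - 9:15 AM</td>
      <td class="session-column">☕ Coffee</td>
      <td class="description-column">⏰😴📢⬆</td>
    </tr>
    <tr>
      <td class="time-column">9:15 - 9:30 AM</td>
      <td class="session-column">👋 Welcome and Introduction</td>
      <td class="description-column">
        <ul>
          <li>Opening Remarks</li>
          <li>Overview of Workshop Structure and Objectives</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td class="time-column">9:30 - 10:30 AM</td>
      <td class="session-column">🎤 Opening Panel: Reflections on the Landscape</td>
      <td class="description-column">
        <ul>
          <li>Panel Discussion on AI Evaluation Challenges</li>
          <li>Panelists: Abeba Birhane, Su Lin Blodgett, Abigail Jacobs, Lee Wan Sie</li>
          <li>Topics:
            <ul>
              <li>Underlying frameworks and incentive structures</li>
              <li>Defining robust evaluations and contextual challenges</li>
              <li>Multimodal evaluation needs (text, images, audio, video)</li>
            </ul>
          </li>
          <li>Q&amp;A</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td class="time-column">10:30 - 11:30 AM</td>
      <td class="session-column">💭 Oral Session 1: Provocations and Ethics in AI Evaluation</td>
      <td class="description-column">
        <ul>
          <li>Presentations (25 min):
            <ul>
              <li>"Provocation: Who benefits from 'inclusion' in Generative AI?"</li>
              <li>"(Mis)use of nude images in machine learning research"</li>
              <li>"Evaluating Refusal"</li>
            </ul>
          </li>
          <li>Breakout (35 min):
            <ul>
              <li>Group Discussion (20 min): Ethics and Bias in Evaluation Design, Refusal and Boundary Setting, Research Ethics and Data Usage</li>
              <li>Report Back (15 min)</li>
            </ul>
          </li>
        </ul>
      </td>
    </tr>
    <tr>
      <td class="time-column">11:30 AM - 12:30 PM</td>
      <td class="session-column">🌏 Oral Session 2: Multimodal and Cross-Cultural Evaluation Methods</td>
      <td class="description-column">
        <ul>
          <li>Presentations (25 min):
            <ul>
              <li>"JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark"</li>
              <li>"Critical human-AI use scenarios and interaction modes for societal impact evaluations"</li>
              <li>"Cascaded to End-to-End: New Safety, Security, and Evaluation Questions for Audio Language Models"</li>
            </ul>
          </li>
          <li>Breakout (35 min):
            <ul>
              <li>Group Discussion (20 min): Language, Image, Audio, Video, Cross-Culture</li>
              <li>Report Back (15 min)</li>
            </ul>
          </li>
        </ul>
      </td>
    </tr>
    <tr>
      <td class="time-column">12:30 - 2:30 PM</td>
      <td class="session-column">🍽️ Lunch and Poster Session</td>
      <td class="description-column">
        <ul>
          <li>12:30 - 1:15 PM: Lunch and Networking</li>
          <li>1:15 - 2:30 PM: Poster Presentations</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td class="time-column">2:30 - 3:00 PM</td>
      <td class="session-column">📊 Oral Session 3: Systematic Approaches to AI Impact Assessment</td>
      <td class="description-column">
        <ul>
          <li>Presentations:
            <ul>
              <li>"GenAI Evaluation Maturity Framework (GEMF)"</li>
              <li>"AIR-Bench 2024: Safety Evaluation Based on Risk Categories"</li>
              <li>"Evaluating Generative AI Systems is a Social Science Measurement Challenge"</li>
            </ul>
          </li>
        </ul>
      </td>
    </tr>
    <tr>
      <td class="time-column">3:00 - 3:30 PM</td>
      <td class="session-column">🔄 Break</td>
      <td class="description-column"></td>
    </tr>
    <tr>
      <td class="time-column">3:30 - 4:05 PM</td>
      <td class="session-column">💡 Oral Session 3 Breakout</td>
      <td class="description-column">
        <ul>
          <li>Group Discussion (20 min):
            <ul>
              <li>Choosing Evaluations: Selecting relevant evaluations from a large repository</li>
              <li>Reviewing Tools and Datasets: Assessment of current tools and gaps</li>
              <li>Evaluating Reliability and Validity: Exploring construct validity and ranking methods</li>
            </ul>
          </li>
          <li>Report Back (15 min)</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td class="time-column">4:05 - 5:00 PM</td>
      <td class="session-column">🤝 What's Next? Coalition Development</td>
      <td class="description-column">
        <ul>
          <li>Recap and Teasers (15 min): Overview of coalition groups</li>
          <li>Interactive Discussion (40 min):
            <ul>
              <li>Measurement Modeling: Developing Criteria for Evaluating Evaluations</li>
              <li>Documentation: Creating Proposed Documentation Standards</li>
              <li>Eval Repository: Building Out Resource Repositories</li>
              <li>Scorecard/Checklist: Conducting Reviews and Publishing Annual Scorecards</li>
            </ul>
          </li>
        </ul>
      </td>
    </tr>
    <tr>
      <td class="time-column">5:00 - 5:30 PM</td>
      <td class="session-column">👋 Closing Session</td>
      <td class="description-column">Summary of Key Insights and Next Steps</td>
    </tr>
  </tbody>
</table>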

<script>
document.addEventListener('DOMContentLoaded', (event) => {
  const table = document.querySelector('.schedule-table');
  const headers = table.querySelectorAll('th');
  const headerTexts = Array.from(headers).map(header => header.textContent);
  table.querySelectorAll('tbody tr').forEach(row => {
    row.querySelectorAll('td').forEach((cell, index) => {
      cell.setAttribute('data-label', headerTexts[index]);
      // Remove the data-label attribute for empty cells
      if (cell.textContent.trim() === '') {
        cell.removeAttribute('data-label');
      }
    });
  });
});
</script>