<div dir="ltr"><div dir="ltr"><div><div class="gmail_default"><font face="georgia, serif" color="#000000"><font style="vertical-align:inherit"><font style="vertical-align:inherit"><b>When:</b>    </font></font><font style="vertical-align:inherit"><font style="vertical-align:inherit">    Monday, April 7th at <b style="background-color:rgb(255,255,0)">11:30am CT</b><b> </b></font></font></font></div><div class="gmail_default"><font face="georgia, serif" color="#000000"><font style="vertical-align:inherit"><font style="vertical-align:inherit"><b><br></b></font></font></font></div><div class="gmail_default"><font face="georgia, serif" color="#000000"></font></div><div class="gmail_default"><font face="georgia, serif" color="#000000"><b>Where:       </b>Talk will be given <font style="font-weight:bold"><u>live, in-person</u></font><font style="font-weight:bold"> </font>at</font></div><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="georgia, serif" color="#000000">                       TTIC, 6045 S. 
Kenwood Avenue</font></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font color="#000000" face="georgia, serif">                       5th Floor, Room 530<b> </b></font></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><b><font face="georgia, serif" color="#000000"><br></font></b></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="georgia, serif" color="#000000"><b style="letter-spacing:0.2px">Virtually:</b><span style="letter-spacing:0.2px">  via Panopto </span>(<a href="https://uchicago.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=b0137928-cd56-44de-911e-b28e01151565" target="_blank">livestream</a><span style="letter-spacing:0.2px">)</span><br></font></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="georgia, serif" color="#000000"><br></font></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="georgia, serif" color="#000000"><font style="vertical-align:inherit"><font style="vertical-align:inherit"><b>Who: </b>         </font></font><span class="gmail_default"></span>Sanmi Koyejo, Stanford</font></p><p class="MsoNormal" style="margin:0in 0in 
0.0001pt;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"></p><div><p style="letter-spacing:0.2px"><font face="georgia, serif" color="#000000"><span style="letter-spacing:normal"><b>Title:</b>          </span><span style="letter-spacing:normal"><span class="gmail_default"></span></span><span style="letter-spacing:normal">Beyond Benchmarks: Building a Science of AI Measurement</span></font></p><div><font face="georgia, serif" color="#000000"><b>Abstract:  </b><span class="gmail_default"></span>The widespread deployment of AI systems in critical domains demands more rigorous approaches to evaluating their capabilities and safety. While current evaluation practices rely on static benchmarks, these methods face fundamental challenges in efficiency, reliability, and real-world relevance. This talk presents a path toward a measurement framework that bridges established psychometric principles with modern AI evaluation needs. We demonstrate how techniques from Item Response Theory, amortized computation, and predictability analysis can substantially improve the rigor and efficiency of AI evaluation. Through case studies, we show how this approach can enable more reliable, scalable, and meaningful evaluation of AI systems. This work points toward a broader vision: evolving AI evaluation from a collection of benchmarks into a rigorous measurement science that can effectively guide research, deployment, and policy decisions.</font></div><div><font face="georgia, serif" color="#000000"><br></font></div><div><font face="georgia, serif" color="#000000"><b>Short Bio</b>: <span class="gmail_default"></span>Sanmi Koyejo is an assistant professor in the Department of Computer Science at Stanford University and a co-founder of Virtue AI. 
At Stanford, Koyejo leads the Stanford Trustworthy Artificial Intelligence Research (STAIR) lab, which works to develop the principles and practice of safe and secure AI. Koyejo has received several awards, including a Skip Ellis Early Career Award, a Sloan Fellowship, and a PECASE. Koyejo serves on the Neural Information Processing Systems Foundation Board and as president of Black in AI.</font></div><div><font face="georgia, serif" color="#000000"><br></font></div></div><div><font face="georgia, serif" color="#000000"><br><b>Hosts:<span class="gmail_default"> <a href="mailto:mwalter@ttic.edu" target="_blank">Matt Walter</a> &amp; <a href="mailto:jingyanw@ttic.edu" target="_blank">Jingyan Wang</a></span></b></font></div><br clear="all"></div><div><br></div><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><b style="background-color:rgb(255,255,255)"><font color="#3d85c6">Brandie Jones </font></b><div><div><div><font color="#3d85c6"><b><i>Executive </i></b></font><b style="color:rgb(61,133,198)"><i>Administrative Assistant</i></b></div></div><div><span style="background-color:rgb(255,255,255)"><font color="#3d85c6">Toyota Technological Institute</font></span></div><div><span style="background-color:rgb(255,255,255)"><font color="#3d85c6">6045 S. Kenwood Avenue</font></span></div><div><span style="background-color:rgb(255,255,255)"><font color="#3d85c6">Chicago, IL  60637</font></span></div></div><div><span style="background-color:rgb(255,255,255)"><font color="#3d85c6"><a href="http://www.ttic.edu" target="_blank">www.ttic.edu</a> </font></span></div><div><span style="background-color:rgb(255,255,255)"><font color="#3d85c6"><br></font></span></div></div></div></div>
</div>