[CS] Rosa Zhou Candidacy Exam/Feb 17, 2025

Wed Feb 12 10:57:18 CST 2025

This is an announcement of Rosa Zhou's Candidacy Exam.
===============================================
Candidate: Rosa Zhou

Date: Monday, February 17, 2025

Time: 11 am CST

Location: JCL 298

Title: Explainability-Driven AI: Improving Model Robustness and Supporting Scientific Discovery

Abstract: The development of explainable artificial intelligence (AI) has become a crucial step in enhancing model transparency, robustness, and usability. This thesis explores the intersection of explainability and AI robustness, proposing novel approaches to improve model performance and support scientific discovery. We investigate how explainable AI techniques can be leveraged to improve out-of-distribution generalization and model decision-making. By incorporating natural language explanations and rationale-based models, we aim to address challenges in model interpretability and resilience, particularly in the face of adversarial attacks and misleading inputs. Additionally, we propose using mechanistic interpretability to understand what models have learned, particularly in scenarios where they exhibit superhuman performance, thereby providing insights into the internal workings of these models and aiding in the generation of novel scientific hypotheses. This research contributes to advancing both the theoretical understanding of AI systems and their practical application in fields requiring high levels of reliability and transparency, such as scientific research and critical decision-making.

Advisor: Chenhao Tan

Committee Members: Chenhao Tan, Xuezhi Wang, and Ari Holtzman