[CS] Updated: Chacha Chen Dissertation Defense/Jun 11, 2025

via cs <cs at mailman.cs.uchicago.edu>
Mon Jun 9 15:34:58 CDT 2025


This is an announcement of Chacha Chen's Dissertation Defense.
===============================================
Candidate: Chacha Chen

Date: Wednesday, June 11, 2025

Time: 2 pm CDT

Location: JCL 346

Remote Location: https://urldefense.com/v3/__https://uchicago.zoom.us/j/9535186672?pwd=aXNDdEUzcCtHSGFjcTl3Q0xNcS9udz09__;!!BpyFHLRN4TMTrA!6ZO8SvPkZX3iK3ou4Sm0g2h_cd2D_GsiVkp4pLgV3S2tMk8COKxgjYe9R0XeY78IS82KP9h_wYqlYiw_n_lpSA$

Title: Human-AI Decision Making with Case Studies in Radiology

Abstract: With the explosive progress of machine learning (ML), and especially of recent foundation models, these advanced systems are increasingly reshaping daily workflows across domains. This makes human-centered AI research critically important: its aim is to build AI models that better support human tasks and improve decision-making. My PhD work focuses on improving human-AI collaboration through both behavioral studies and the construction of more effective AI systems. We begin with a theoretical analysis of the interaction between machine learning models and human decisions, which highlights a key insight: human intuition plays a critical role in effective human-AI collaboration. Using prostate cancer diagnosis with MRI as a real-world test bed, we conducted user studies with domain experts to investigate how advanced, human-level ML models are perceived and used in clinical decision-making. Our findings show that experts are often hesitant to adopt AI tools, and even when they do, they struggle to rely on AI appropriately. Importantly, by applying a theoretical framework of human-AI reliance, we identified actionable strategies that help ensure complementary performance, i.e., human+AI performance that exceeds either alone. In parallel, we explored multimodal large language models for radiology. Starting with an evaluation of the out-of-the-box performance of current LLMs (e.g., GPT-4o, Llama) on chest X-ray reporting, we found that, although impressive in general domains, current LLMs perform poorly on specialized medical tasks. Our analysis reliably identified visual understanding as the primary performance bottleneck. Additionally, we proposed a fine-grained, tabular evaluation method built on expert-curated, high-quality data. This benchmark not only enhances the rigor of current evaluation practices but also holds promise for guiding future model development. My work contributes to broader efforts to adapt foundation models to high-stakes, domain-specific applications and, more broadly, to the growing understanding of how AI is evolving from a simple tool into a sophisticated collaborator in knowledge work and specialized fields.

Advisor: Chenhao Tan

Committee members: Chenhao Tan, Yuxin Chen, Aritrick Chatterjee, James Evans


