[CS] Wenxin Ding Candidacy Exam/May 23, 2025



This is an announcement of Wenxin Ding's Candidacy Exam.
===============================================
Candidate: Wenxin Ding

Date: Friday, May 23, 2025

Time: 12:30 pm CDT

Remote Location: https://uchicago.zoom.us/j/99775731213?pwd=quOul8ELfeahMYt6PKJGQhofaqGZhQ.1

Location: JCL 298

Title: Rethinking Model Robustness via Minimal Data Modification

Abstract: Recent advances in machine learning (ML) have dramatically increased the size and complexity of models: generative models are now trained on billions of samples and feature billions of parameters. Given the massive amount of training data, many assume that these large models are inherently robust to data attacks, because affecting a model’s behavior would seem to require altering a significant portion of its training data. My research challenges this prevailing assumption and asks a different question: “Is it possible to manipulate or affect a model’s security behavior by introducing minimal yet strategically optimized changes to its training data?” If so, even entities with limited compute resources could introduce significant changes to model behavior.

My recent results address this question by analytically verifying and empirically validating its feasibility in the context of poisoning attacks against large text-to-image generative models. In this talk, I will present my work showing that large text-to-image models (e.g., those trained on billions of samples) are surprisingly vulnerable to low-volume poisoning attacks. By injecting only hundreds of poisoned samples, an attacker can effectively mislead the model into generating incorrect images when prompted with a specific keyword, e.g., generating cat images when prompted with “dog”. More importantly, a group of parallel attacks (e.g., 300), each targeting a different keyword, can break the model’s ability to generate any meaningful images at all. Beyond designing efficient poisoning strategies to achieve this effect, my research also analytically identifies and verifies the root cause of the vulnerability, which lies in the model’s architectural design. We model the behavior of the cross-attention mechanism as an abstract problem of “supervised graph alignment” and formally quantify the impact of training data by the hardness of alignment. Finally, this line of work has been turned into a practical protection tool (Nightshade) for human creatives: artists can add “poisons” to their digital art before posting it online to prevent unauthorized training on their data.
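
To make the attack setup concrete, below is a minimal Python sketch of the basic “dirty-label” form of this idea: a few hundred poisoned text-image pairs whose captions mention the target keyword but whose images depict a different concept, mixed into a much larger clean training set. All names, constants, and the data loader here are hypothetical placeholders, and the attacks discussed in the talk rely on optimized poison samples rather than a simple caption swap.

import random

TARGET_KEYWORD = "dog"       # keyword whose prompts the attacker wants to mislead
DESTINATION_CONCEPT = "cat"  # concept the poisoned model should produce instead
NUM_POISON = 300             # hundreds of samples, versus billions of clean ones

def load_images_of(concept, n):
    # Placeholder data source (hypothetical): in practice these would be real
    # images depicting `concept`, drawn from the attacker's own collection.
    return [f"<image of a {concept} #{i}>" for i in range(n)]

def make_poison_set():
    # Dirty-label poisoning: pair captions that mention the target keyword
    # ("dog") with images of the destination concept ("cat"). Trained on
    # enough such pairs, the model links the word "dog" to cat imagery.
    images = load_images_of(DESTINATION_CONCEPT, NUM_POISON)
    return [(img, f"a photo of a {TARGET_KEYWORD}") for img in images]

def inject(clean_dataset, poison_set):
    # The poison set is a vanishingly small fraction of the full training data.
    mixed = list(clean_dataset) + list(poison_set)
    random.shuffle(mixed)
    return mixed

if __name__ == "__main__":
    clean = [(f"<image {i}>", "an unrelated caption") for i in range(10_000)]
    training_data = inject(clean, make_poison_set())
    print(f"{NUM_POISON} poisons among {len(training_data)} samples "
          f"({NUM_POISON / len(training_data):.2%})")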


Advisors: Ben Zhao, Heather Zheng

Committee: Heather Zheng, Ben Zhao, Yuxin Chen, Grant Ho


