<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">


<head>


<meta http-equiv="Content-Type" content="text/html; charset=utf-8">


<meta name="Generator" content="Microsoft Word 15 (filtered medium)">


<style><!--


/* Font Definitions */


@font-face


        {font-family:SimSun;


        panose-1:2 1 6 0 3 1 1 1 1 1;}


@font-face


        {font-family:"Cambria Math";


        panose-1:2 4 5 3 5 4 6 3 2 4;}


@font-face


        {font-family:Aptos;


        panose-1:2 11 0 4 2 2 2 2 2 4;}


@font-face


        {font-family:"\@SimSun";


        panose-1:2 1 6 0 3 1 1 1 1 1;}


/* Style Definitions */


p.MsoNormal, li.MsoNormal, div.MsoNormal


        {margin:0in;


        font-size:10.0pt;


        font-family:"Aptos",sans-serif;}


a:link, span.MsoHyperlink


        {mso-style-priority:99;


        color:blue;


        text-decoration:underline;}


span.EmailStyle19


        {mso-style-type:personal-reply;


        font-family:"Aptos",sans-serif;


        color:windowtext;}


.MsoChpDefault


        {mso-style-type:export-only;


        font-size:10.0pt;


        mso-ligatures:none;}


@page WordSection1


        {size:8.5in 11.0in;


        margin:1.0in 1.0in 1.0in 1.0in;}


div.WordSection1


        {page:WordSection1;}


--></style>


</head>


<body lang="EN-US" link="blue" vlink="purple" style="word-wrap:break-word">


<div class="WordSection1">


<p class="MsoNormal"><span style="font-size:11.0pt">Hello,<o:p></o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt">I would like to unsubscribe from this email list. However, I am not a CS student, so I don’t have a


</span><span style="font-size:11.0pt"><a href="mailto:cnetid@cs.uchicago.edu">cnetid@cs.uchicago.edu</a> email address.<o:p></o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt">Thank you!</span><span style="font-size:11.0pt"><o:p></o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>


<div id="mail-editor-reference-message-container">


<div>


<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">


<p class="MsoNormal" style="margin-bottom:12.0pt"><b><span style="font-size:12.0pt;color:black">From:


</span></b><span style="font-size:12.0pt;color:black">cs &lt;cs-bounces+ybai189464=cs.uchicago.edu@mailman.cs.uchicago.edu&gt; on behalf of via cs &lt;cs@mailman.cs.uchicago.edu&gt;<br>


<b>Date: </b>Friday, November 8, 2024 at 1:56</span><span style="font-size:12.0pt;font-family:'Arial',sans-serif;color:black"> </span><span style="font-size:12.0pt;color:black">PM<br>


<b>To: </b>cs@cs.uchicago.edu &lt;cs@cs.uchicago.edu&gt;, colloquium@cs.uchicago.edu &lt;colloquium@cs.uchicago.edu&gt;<br>


<b>Subject: </b>[CS] REMINDER: Chaoqi Wang Candidacy Exam/Nov. 20th<o:p></o:p></span></p>


</div>


<div>


<p class="MsoNormal"><span style="font-size:11.0pt">This is an announcement of Chaoqi Wang's Candidacy Exam.<br>


===============================================<br>


Candidate: Chaoqi Wang<br>


<br>


Date: Wednesday, November 20<br>


<br>


Time: 4PM - 5PM CST<br>


<br>


Location: JCL 390<br>


<br>


Remote Location: <a href="https://urldefense.com/v3/__https:/www.google.com/url?q=https:**Auchicago.zoom.us*j*4959891834*pwd*3DTXRsbmtGUkJhNWpvZk9aVHBDYjFHdz09&sa=D&source=calendar&usd=2&usg=AOvVaw2M-iGX3eKWItqxa1PldJ-m__;Ly8vLz8l!!BpyFHLRN4TMTrA!-M2i6naQUvTWJCQBKMUe3hpkOr_y-2A_KyqByiTraK9BF-tcMPYyO8C4CvrWSNxMF79G1bC9iOCVK8r3JClTjA$">


https://urldefense.com/v3/__https://www.google.com/url?q=https:**Auchicago.zoom.us*j*4959891834*pwd*3DTXRsbmtGUkJhNWpvZk9aVHBDYjFHdz09&sa=D&source=calendar&usd=2&usg=AOvVaw2M-iGX3eKWItqxa1PldJ-m__;Ly8vLz8l!!BpyFHLRN4TMTrA!-M2i6naQUvTWJCQBKMUe3hpkOr_y-2A_KyqByiTraK9BF-tcMPYyO8C4CvrWSNxMF79G1bC9iOCVK8r3JClTjA$</a><br>


<br>


Title: Towards Robust Alignment of Language Models with Human Preferences<br>


<br>


Abstract: The rapid advancement of large language models (LLMs) has improved natural language understanding and generation, yet aligning these models with human preferences remains challenging due to safety concerns, training complexities, and biases from spurious


 correlations. This thesis introduces new optimization techniques and bias mitigation strategies to improve LLM alignment with human values.<br>


<br>


We first present <b><i>f</i>-Direct Preference Optimization (<i>f</i>-DPO)</b>, an extension of Direct Preference Optimization that uses various <i>f</i>-divergences to simplify the relationship between the reward function and optimal policy, eliminating the need for


 normalizing constants. Empirical results show <i>f</i>-DPO balances alignment performance and generation diversity, surpassing traditional Proximal Policy Optimization methods in divergence efficiency. Next, we address the limitations of single-sample comparisons


 by proposing <b>Multi-sample Direct Preference Optimization (mDPO)</b> and <b>Multi-sample Identity Preference Optimization (mIPO)</b>. These methods use group-wise preferences to optimize collective characteristics, enhancing diversity and reducing bias


 more effectively than single-sample approaches, especially in noisy label environments. Finally, we incorporate causal inference into the alignment process with <b>causal reward modeling</b>, enforcing counterfactual invariance to reduce biases such as length,


 sycophancy, concept, and discrimination bias. This approach ensures more reliable and fair alignment of LLMs with human preferences.<br>


<br>


Overall, the optimization frameworks and bias mitigation strategies in this thesis offer practical improvements to alignment workflows, contributing to the development of trustworthy AI systems that adhere to ethical standards and reflect human preferences.<br>


<br>


Advisor: Yuxin Chen<br>


<br>


Committee: Yuxin Chen, Ari Holtzman and Haifeng Xu<br>


<br>


When unsubscribing, use your cnetid@cs.uchicago.edu address if your cnetid@uchicago.edu does not work.<br>


<br>


cs mailing list  -  cs@mailman.cs.uchicago.edu<br>


Edit Options and/or Unsubscribe: <a href="https://mailman.cs.uchicago.edu/mailman/listinfo/cs">


https://mailman.cs.uchicago.edu/mailman/listinfo/cs</a><br>


More information here: <a href="https://howto.cs.uchicago.edu/techstaff:mailinglist">


https://howto.cs.uchicago.edu/techstaff:mailinglist</a><o:p></o:p></span></p>


</div>


</div>


</div>


</div>


</body>


</html>