[CS] UPDATED: Ziyu Ye MS Presentation/Dec.6th
via cs
cs at mailman.cs.uchicago.edu
Thu Dec 5 14:15:31 CST 2024
This is an announcement of Ziyu Ye's MS Presentation
===============================================
Candidate: Ziyu Ye
Date: Friday, December 6, 2024
Time: 12:00 PM – 1:00 PM CST
Location: JCL 346
Remote Location: meet.google.com/pei-fgws-yii
Title: Towards Scalable and Self-Improving Artificial Intelligence: Learning to Reason and Align via Self-Play
Abstract: This thesis explores advancements in developing scalable methods for self-improving artificial intelligence through two primary areas: alignment and mathematical reasoning. First, we introduce Evolving Alignment via Asymmetric Self-Play (eva), a new framework that casts reinforcement learning from human feedback (RLHF) as an asymmetric game between a creator and a solver. Unlike conventional RLHF methods that rely on static prompt distributions, eva enables the generation of progressively informative prompts and solver improvements, resulting in scalable alignment and state-of-the-art performance across benchmarks, without any additional human-crafted prompts.
Next, we propose Reasoning in Reasoning (RiR), a hierarchical framework for neural theorem proving. RiR integrates decomposition and search-based reasoning through a planner-actor game, breaking down complex theorems into sub-goals to improve generalizability and search space efficiency. Empirical results on theorem proving datasets, such as LeanDojo and miniF2F, show that RiR achieves significant performance gains and operates nearly three times faster than existing baselines. We also provide information-theoretic insights into the principles behind RiR's effectiveness.
Together, these contributions push the boundaries of scalable, self-improving artificial intelligence capable of performing complex sequential decision making to achieve goals in the open world, bridging theoretical insights and practical advancements in training large language models.
Advisor: Yuxin Chen
Committee Members: Yuxin Chen, Haifeng Xu, Kaiyu Yang, Yuan Liu