[CS] UPDATED: Ziyu Ye MS Presentation/Dec.6th

Thu Dec 5 14:15:31 CST 2024

This is an announcement of Ziyu Ye's MS Presentation
===============================================
Candidate: Ziyu Ye

Date: Friday, December 6, 2024

Time: 12:00 PM – 1:00 PM CST

Location: JCL 346

Remote Location: meet.google.com/pei-fgws-yii

Title: Towards Scalable and Self-Improving Artificial Intelligence: Learning to Reason and Align via Self-Play

Abstract: This thesis explores advancements in developing scalable methods for self-improving artificial intelligence through two primary areas: alignment and mathematical reasoning. First, we introduce Evolving Alignment via Asymmetric Self-Play (eva), a new framework that casts reinforcement learning from human feedback (RLHF) as an asymmetric game between a creator and a solver. Unlike conventional RLHF methods that rely on static prompt distributions, eva enables the generation of progressively informative prompts and solver improvements, resulting in scalable alignment and state-of-the-art performance across benchmarks, without any additional human-crafted prompts.

Next, we propose Reasoning in Reasoning (RiR), a hierarchical framework for neural theorem proving. RiR integrates decomposition and search-based reasoning through a planner-actor game, breaking down complex theorems into sub-goals to improve generalizability and search-space efficiency. Empirical results on theorem-proving datasets, such as LeanDojo and miniF2F, show that RiR achieves significant performance gains and operates nearly three times faster than existing baselines. We also provide information-theoretic insights into the principles behind RiR's effectiveness.

Together, these contributions push the boundaries of scalable, self-improving artificial intelligence capable of performing complex sequential decision making to achieve goals in the open world, bridging theoretical insights and practical advancements in training large language models.

Advisor: Yuxin Chen

Committee Members: Yuxin Chen, Haifeng Xu, Kaiyu Yang, Yuan Liu