[CS] Ruoxi Jiang Dissertation Defense/Apr 17, 2025

Wed Apr 16 16:20:46 CDT 2025

This is an announcement of Ruoxi Jiang's Dissertation Defense.
===============================================
Candidate: Ruoxi Jiang

Date: Thursday, April 17, 2025

Time: 4 pm CDT

Location: JCL 390

Title: Embed, preserve, generate: advancing surrogate models for scientific modeling

Abstract: Simulators are fundamental tools for studying complex dynamical systems, such as those in climate modeling, fluid dynamics, and molecular dynamics, and for accelerating scientific discovery. Yet their computational cost and development complexity have spurred interest in data-driven machine learning surrogates as efficient alternatives. A central challenge lies in decoding the intricate interactions of high-dimensional dynamics, where sensitivity to initial conditions in chaotic systems exacerbates errors in long-horizon forecasting. This thesis advances surrogate modeling through novel algorithm designs that address accuracy, interpretability, and stability.

First, for inverse problems in scientific inference, we present Embed and Emulate (E&E), a new simulation-based inference (SBI) method for parameter inference that fits physical models to real observational data. E&E jointly learns a low-dimensional latent embedding of observational data (serving as a summary statistic) and trains a fast emulator within this latent space. This eliminates the need for costly simulations or high-dimensional emulation during inference, enabling efficient parameter estimation and uncertainty quantification.

Next, we tackle chaotic dynamics by integrating representation learning with physical constraints. Our method learns latent structures that preserve the statistical properties of the system across diverse environments, making predictions from noisy observations robust in multi-scenario forecasting. We then extend this strategy and propose a novel hierarchical generative model that iteratively generates semantic latent representations. Each model in this series is conditioned on the output of the preceding higher-level models, and the final stage generates the image, enabling coherent generation of high-fidelity images through structured latent space exploration.

Finally, we introduce multistep look-ahead denoising, a generative autoregressive strategy that dynamically rebalances attention between historical states and noisy recent predictions during rollouts. By progressively prioritizing context from earlier stable states while correcting for accumulating errors, this approach enhances long-term stability in chaotic systems. The implications extend to diverse domains where reliable long-term forecasting is crucial, from climate science to engineering design.

Advisor: Rebecca Willett

Committee: Rebecca Willett, Michael Maire, Yuxin Chen