[Colloquium] Truong Son Hy Dissertation Defense/May 31, 2022

Wed May 18 08:45:50 CDT 2022

This is an announcement of Truong Son Hy's Dissertation Defense.
===============================================
Candidate: Truong Son Hy

Date: Tuesday, May 31, 2022

Time: 1 pm CDT

Remote Location: https://uchicago.zoom.us/j/99991588564?pwd=ay9nYzViQndnL2hIU0tlc0dacHFGZz09  Meeting ID: 999 9158 8564 Passcode: 140479

Title: Graph Representation Learning, Deep Generative Models on Graphs, Group Equivariant Molecular Neural Networks and Multiresolution Machine Learning

Abstract: Graph neural networks (GNNs), which generalize the concept of convolution to graphs in various ways, have been widely applied to many learning tasks, including modeling physical systems and learning molecular representations that approximate quantum chemical computations. Most existing GNNs address permutation invariance by conceiving of the network as a message passing scheme, where each node sums the feature vectors coming from its neighbors. We argue that this scheme limits the representation power of GNNs, because each node loses its identity once its features are aggregated by summation. Thus, we propose a new general architecture called Covariant Compositional Networks (CCNs), in which each node's features are represented by higher order tensors that transform covariantly/equivariantly according to a specific representation of the symmetry group of the node's receptive field. Experiments show that CCNs can outperform competing methods on standard graph learning benchmarks and on estimating the molecular properties calculated by computationally expensive Density Functional Theory (DFT). This novel machine learning approach allows scientists to efficiently extract chemical knowledge and explore the ever-growing body of chemical data.
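
As a rough illustration of the sum-based message passing scheme described above (not the CCN architecture itself), the following minimal sketch runs one aggregation round on a toy path graph. The function name `message_passing_step` and the feature values are hypothetical and chosen only for this example. It shows that swapping the features of two neighbors leaves the aggregate at the central node unchanged, which is one way to read the claim that summation discards the identity of the contributing nodes.

```python
def message_passing_step(adjacency, features):
    """One round of sum aggregation: each node adds up its neighbors' feature vectors."""
    aggregated = []
    for node, neighbors in enumerate(adjacency):
        dim = len(features[node])
        total = [0.0] * dim
        for nb in neighbors:
            for k in range(dim):
                total[k] += features[nb][k]
        aggregated.append(total)
    return aggregated


# Toy path graph 0 - 1 - 2, given as adjacency lists (hypothetical data).
adjacency = [[1], [0, 2], [1]]
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 3.0]]

out = message_passing_step(adjacency, features)

# Swap the features carried by nodes 0 and 2, the two neighbors of node 1.
swapped = [features[2], features[1], features[0]]
out_swapped = message_passing_step(adjacency, swapped)

# Node 1 receives the same sum in both cases, so it cannot tell
# which neighbor contributed which feature vector.
print(out[1], out_swapped[1])
```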

Understanding graphs from a multiscale perspective is essential for capturing the large-scale structure of molecules, proteins, genomes, etc. For this reason, we introduce Multiresolution Equivariant Graph Variational Autoencoder (MGVAE), the first hierarchical generative model to learn and generate graphs in a multiresolution and equivariant manner. MGVAE is built upon Multiresolution Graph Network (MGN), an architecture which explicitly learns a multilevel hard clustering of the vertices, leading to a true multiresolution hierarchy. MGVAE then employs the hierarchical variational autoencoder model to stochastically generate a graph at multiple resolution levels given the hierarchy of latent distributions. Our proposed framework achieves competitive results on several generative tasks including general graph generation, molecule generation, unsupervised molecular representation learning, link prediction on citation graphs, and graph-based image generation. Future applications of MGVAE range from lead optimization, which enhances the most promising compounds in drug discovery, to finding stable crystal structures in materials science.

In general, we want to learn on molecular data specified by a set of charge-position pairs, one for each atom. This problem is invariant to rotations and translations. We use covariant activations to "bake in" these symmetries while retaining local geometric information. We propose Covariant Molecular Neural Networks (Cormorant), a rotationally covariant neural network architecture for learning the behavior and properties of complex many-body physical systems. We apply these networks to molecular systems with two goals: learning atomic potential energy surfaces for use in Molecular Dynamics simulations, and learning ground state properties of molecules calculated by Density Functional Theory. Two key features of our network are that (a) each neuron explicitly corresponds to a subset of atoms, and (b) the activation of each neuron is covariant to rotations, ensuring that overall the network is fully rotationally invariant. Furthermore, the non-linearity in our network is based upon tensor products and the Clebsch-Gordan decomposition, allowing the network to operate entirely in Fourier space. Cormorant significantly outperforms competing algorithms in learning molecular Potential Energy Surfaces from conformational geometries in the MD-17 dataset, and is competitive with other methods at learning geometric, energetic, electronic, and thermodynamic properties of molecules on the GDB-9 dataset.
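
The rotation and translation invariance mentioned above can be checked numerically on a toy example. The sketch below is not Cormorant and uses no covariant activations. It only shows that interatomic distances, the kind of quantity an invariant model may depend on, are unchanged when all atoms are rotated and shifted together. The coordinates, the rotation angle, and the helper names are made up for illustration.

```python
import math


def pairwise_distances(positions):
    """Return the full matrix of Euclidean distances between atoms."""
    n = len(positions)
    return [[math.dist(positions[i], positions[j]) for j in range(n)]
            for i in range(n)]


def rotate_z_then_translate(positions, angle, shift):
    """Rotate every position about the z axis by `angle` radians, then add `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in positions]


# Hypothetical three-atom geometry (arbitrary units, not a real molecule).
atoms = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
moved = rotate_z_then_translate(atoms, 0.7, (1.0, -2.0, 0.5))

before = pairwise_distances(atoms)
after = pairwise_distances(moved)

# Largest change in any pairwise distance; only floating-point noise is expected.
max_diff = max(abs(a - b)
               for row_a, row_b in zip(before, after)
               for a, b in zip(row_a, row_b))
print(max_diff)
```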

Multiresolution Matrix Factorization (MMF) is unusual amongst fast matrix factorization algorithms in that it does not make a low rank assumption. This makes MMF especially well suited to modeling certain types of graphs with complex multiscale or hierarchical structure. While MMF promises to yield a useful wavelet basis, finding the factorization itself is hard, and existing greedy methods tend to be brittle. Therefore, we propose a "learnable" version of MMF that carefully optimizes the factorization with a combination of Reinforcement Learning and Stiefel manifold optimization through back-propagating errors. Based on the wavelet basis produced by MMF when factorizing the normalized graph Laplacian, we construct a wavelet network that learns on graphs in the spectral domain, with the graph convolution defined via the sparse wavelet transform. We show that the wavelet basis resulting from our learnable MMF far outperforms prior MMF algorithms, and that the corresponding wavelet networks yield state-of-the-art results on standard node classification on citation graphs and on molecular graph classification. This is a promising direction for understanding and visualizing complex hierarchical structures such as social networks and biological data.
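
For readers unfamiliar with the matrix that MMF factorizes here, the sketch below builds a normalized graph Laplacian for a small undirected graph. It assumes the common symmetric normalization L = I - D^{-1/2} A D^{-1/2}, since the abstract does not say which normalization is used, and it does not implement MMF or any wavelet transform. The function name and the toy graph are hypothetical.

```python
import math


def normalized_laplacian(adjacency_matrix):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.

    Assumes an undirected graph with no isolated vertices, given as a
    dense 0/1 adjacency matrix (list of lists).
    """
    n = len(adjacency_matrix)
    degrees = [sum(row) for row in adjacency_matrix]
    laplacian = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            identity = 1.0 if i == j else 0.0
            scaled = adjacency_matrix[i][j] / math.sqrt(degrees[i] * degrees[j])
            laplacian[i][j] = identity - scaled
    return laplacian


# Toy path graph 0 - 1 - 2 with degrees 1, 2, 1.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
L = normalized_laplacian(A)
for row in L:
    print(row)
```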

Advisor: Risi Kondor

Committee Members: Risi Kondor, Yuxin Chen, and Eric Jonas

http://people.cs.uchicago.edu/~hytruongson/PhD-Thesis.pdf