[Colloquium] Jacob Williams Candidacy Exam/May 6, 2024

meganwoodward at uchicago.edu
Mon Apr 22 10:01:58 CDT 2024


This is an announcement of Jacob Williams's Candidacy Exam.
===============================================
Candidate: Jacob Williams

Date: Monday, May 06, 2024

Time: 10 am CT

Remote Location: https://uchicago.zoom.us/j/93007490832?pwd=K1BZRHdKS0RqQ09qc1l2T2JQUHMxdz09 (Meeting ID: 930 0749 0832, Passcode: 815521)

Location: JCL 390

Title: Machine Learning and NMR Spectroscopy

Abstract: The recent introduction of AlphaFold (AF) has fundamentally changed our ability to predict the structure of proteins from their primary sequence of amino acids. The AlphaFold Protein Structure Database currently contains over 200 million predicted structures. As AI protein structure prediction continues to advance, we examine the potential of hybrid techniques that combine experiment and computation, which may yield more accurate structures than AI alone while significantly reducing the experimental burden. We present a heuristic that compares N-edited NOESY spectra with AlphaFold-predicted structures to determine whether a predicted structure reasonably describes the true structure of the protein. Using verified structures from the Protein Data Bank (PDB) and the corresponding NMR data from the Biological Magnetic Resonance Data Bank (BMRB), we seek a simple way to compare AlphaFold predictions with measured NMR data, one that informs our understanding of the quality of those predictions. Our heuristics are similar in nature to distance restraints, in that they use the peaks from the NOESY spectra to determine which residues in a protein are likely to be close together in physical space. The heuristics are fed into a small machine learning model, which learns by comparing the PDB and AlphaFold structures for the same protein, yielding a system that can validate predicted structures against the data. We will also consider ways to refine AlphaFold predictions by updating the structures themselves to produce configurations that are more likely given the observed NMR spectra. This refinement will likely rely on deep learning models such as energy-based models. The future of structure elucidation for proteins and other large molecules likely lies with hybrid methods that combine fast and accurate AI and ML models with a small set of experiments that validate and refine structures.

Advisors: Rebecca Willett

Committee Members: Ian Foster, Risi Kondor, and Rebecca Willett


