[Colloquium] 10/7 Research at TTIC: Jinbo Xu, TTIC

Mary Marre via Colloquium colloquium at mailman.cs.uchicago.edu
Fri Sep 30 16:26:49 CDT 2016


When:     Friday, October 7th at noon

Where:    TTIC, 6045 S Kenwood Avenue, 5th Floor, Room 526

Who:      Jinbo Xu, TTIC

Title:      Solving Protein Folding by Big Data and Deep Learning





Abstract:

Ab initio protein folding is one of the most challenging problems in
computational biology. Recently, protein contact prediction and
contact-assisted folding that exploit residue co-variation have made some
progress, but this approach is not effective on proteins without a large
number (>1000) of sequence homologs. This talk will present a deep learning
method that predicts contacts by integrating both residue co-variation and
conservation information through an ultra-deep neural network formed by two
deep residual networks. This deep network can learn very complex
sequence-contact relationships, as well as long-range contact correlations,
from the very large protein sequence databases and the relatively small
structure databases, and thus yields much more accurate contact prediction
and, accordingly, contact-assisted folding for proteins without many
sequence homologs.

Tested on three datasets of 579 proteins, the top L long-range prediction
accuracy of our method (where L is the sequence length of a protein) is
0.47, much better than that of two representative methods, CCMpred and
MetaPSICOV, which have accuracies of only 0.21 and 0.30, respectively. In
terms of the top L/10 long-range accuracy, our method achieves 0.77, while
CCMpred and MetaPSICOV achieve 0.47 and 0.59, respectively. Ab initio
folding using our predicted contacts as restraints can yield correct folds
for 203 test proteins, while folding using MetaPSICOV- and CCMpred-predicted
contacts can do so for only 79 and 62 proteins, respectively. Our
contact-assisted folding also outperforms homology modeling. In particular,
we can fold (ab initio) 208 of the 398 membrane proteins, while homology
modeling can do so for only 10 of them. One interesting finding is that even
though we do not train our deep learning models on any membrane proteins,
they work equally well on membrane proteins. Finally, in three weeks of
blind testing with the live benchmark CAMEO, our fully automated contact
prediction web server predicted correct folds for three hard targets with a
new fold: a mainly-beta protein of 182 residues with only 250 sequence
homologs, an alpha+beta protein of 125 residues with only 180 sequence
homologs, and an alpha protein of 140 residues with 330 sequence homologs.
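
For readers unfamiliar with the "top L" and "top L/10" long-range accuracy
figures quoted above, the following is a minimal sketch of how such a metric
is commonly computed. It is not taken from the talk or the paper. The
function name, the input layout, and the sequence-separation threshold of 24
residues for "long-range" are assumptions made for illustration; the
abstract itself does not state the threshold.

```python
def top_k_long_range_accuracy(pred, true_contacts, k_divisor=1, min_sep=24):
    """Fraction of the top L/k_divisor predicted long-range pairs that are
    true contacts.

    pred: L x L nested list of predicted contact scores (hypothetical layout).
    true_contacts: set of (i, j) pairs with i < j that are native contacts.
    min_sep: minimum |i - j| for a pair to count as long-range
             (24 is an assumed convention, not stated in the abstract).
    """
    L = len(pred)
    # Collect every long-range residue pair (i < j) with its predicted score.
    candidates = [
        (pred[i][j], i, j)
        for i in range(L)
        for j in range(i + min_sep, L)
    ]
    # Rank by score, highest first, and keep the top L / k_divisor pairs.
    candidates.sort(key=lambda t: t[0], reverse=True)
    n_top = max(1, L // k_divisor)
    top = candidates[:n_top]
    if not top:
        return 0.0
    # Accuracy is the share of the selected pairs that are native contacts.
    hits = sum(1 for _, i, j in top if (i, j) in true_contacts)
    return hits / len(top)
```

With k_divisor=1 this gives the top L accuracy and with k_divisor=10 the top
L/10 accuracy. Short-range pairs are ignored entirely, so a high score on a
nearby residue pair does not affect the result.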



Availability:

The contact prediction server implementing our method is available at
http://raptorx.uchicago.edu/ContactMap/. See
http://biorxiv.org/content/early/2016/09/16/073239 for technical details
and results.

************************************************************


*Research at TTIC Seminar Series*

TTIC is hosting a weekly seminar series presenting the research currently
underway at the Institute. Every week a different TTIC faculty member will
present their research. The lectures are intended both for students
seeking research topics and advisers, and for the general TTIC and
University of Chicago communities interested in hearing what their
colleagues are up to.

To receive announcements about the seminar series, please subscribe to the
mailing list: https://groups.google.com/a/ttic.edu/group/talks/subscribe

Speaker details can be found at: http://www.ttic.edu/tticseminar.php.

For additional questions, please contact Nathan Srebro at nati at ttic.edu




Mary C. Marre
Administrative Assistant
*Toyota Technological Institute*
*6045 S. Kenwood Avenue*
*Room 504*
*Chicago, IL  60637*
*p: (773) 834-1757*
*f: (773) 357-6970*
*mmarre at ttic.edu*