[Colloquium] Seminar Announcement: A naive Bayesian classifier of bacterial phenotypes from genotypes

Ninfa Mayorga ninfa at uchicago.edu
Tue Apr 22 14:47:46 CDT 2014


Computation Institute Presentation - Data Lunch Seminar (DLS)

Speaker: Dr. Ric Colasanti, Computation Institute, University of Chicago
Host: Tanu Malik
Date: April 25, 2014
Time: 12:00 PM - 1:00 PM
Location: University of Chicago, Searle 240A, 5735 S. Ellis Ave.

A naive Bayesian classifier of bacterial phenotypes from genotypes

Abstract:
Recent advances in genome sequencing have given us access to such vast quantities of data on the genetic codes of bacteria that it is increasingly difficult to analyse them and derive hypotheses. This, together with the growing scope of 'omics' technologies (genomics, transcriptomics, proteomics, metabolomics, epigenomics and metagenomics), means that biological hypothesis formation must become more automated. Metagenomics is an interesting example: much of modern microbial ecology has moved from the Petri dish and growth media as a means of identification to the sequencer. In fact, the term 'Ecosystomics' has, regrettably, been coined.

We present a small step in the quest for automated hypothesis formation. We set out to predict the phenotypic behaviour of a bacterium from a list of its functional roles. We have created a naive Bayesian classifier from data sets in the U.S. Department of Energy Systems Biology Knowledgebase (KBase). The classifier predicts the Gram stain phenotype of bacteria from the functional roles of the enzymes encoded by the bacterial genome. The classifier was trained on 86 known Gram-positive bacteria and 252 Gram-negative bacteria. When tested on 112 unseen bacteria, it achieved an average accuracy of 0.97 and a balanced accuracy of 0.96.
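
To make the idea concrete, here is a minimal sketch of how a naive Bayes classifier over functional roles might look. It is not the speaker's implementation: the abstract does not say which naive Bayes variant was used, so a Bernoulli model (each role is either present or absent in a genome) with Laplace smoothing is assumed. The role names and the tiny training set are invented for illustration and are not taken from KBase. Balanced accuracy is computed as the mean of per-class recall, the usual definition, which matters here because the training classes are unequal in size (86 versus 252).

```python
import math
from collections import defaultdict


def train_bernoulli_nb(samples, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model.

    samples: list of sets of functional-role names (one set per genome).
    labels:  list of class labels, same length as samples.
    alpha:   Laplace smoothing constant (assumed value, not from the abstract).
    """
    vocab = sorted(set().union(*samples))
    class_counts = defaultdict(int)
    role_counts = defaultdict(lambda: defaultdict(int))
    for roles, label in zip(samples, labels):
        class_counts[label] += 1
        for role in roles:
            role_counts[label][role] += 1

    total = len(labels)
    model = {"vocab": vocab, "log_prior": {}, "p_role": {}}
    for label, n in class_counts.items():
        model["log_prior"][label] = math.log(n / total)
        # Smoothed probability that a genome of this class carries the role.
        model["p_role"][label] = {
            role: (role_counts[label][role] + alpha) / (n + 2 * alpha)
            for role in vocab
        }
    return model


def predict(model, roles):
    """Return the class with the highest log posterior for one genome."""
    best_label, best_score = None, -math.inf
    for label, log_prior in model["log_prior"].items():
        score = log_prior
        for role in model["vocab"]:
            p = model["p_role"][label][role]
            # Present roles contribute log p, absent roles log(1 - p).
            score += math.log(p) if role in roles else math.log(1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; robust to unequal class sizes."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical toy data: role names are illustrative only.
train_samples = [
    {"lps_biosynthesis", "dna_polymerase"},
    {"lps_biosynthesis", "outer_membrane_assembly", "dna_polymerase"},
    {"teichoic_acid_biosynthesis", "dna_polymerase"},
    {"teichoic_acid_biosynthesis", "sortase", "dna_polymerase"},
]
train_labels = ["negative", "negative", "positive", "positive"]

model = train_bernoulli_nb(train_samples, train_labels)
print(predict(model, {"lps_biosynthesis", "dna_polymerase"}))
print(predict(model, {"sortase", "dna_polymerase"}))
```

On this toy data the first genome is classified as Gram negative and the second as Gram positive. A role shared by every genome, such as the placeholder "dna_polymerase", carries no information about the class, which is the behaviour one would want from a classifier built on thousands of mostly uninformative roles.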

Bio:
I was lucky enough to discover that I really enjoy computer programming, so on the back of that I got a Ph.D. from the University of Sheffield, and my partner Diane and I have spent the last 10 years doing postdoc placements in the US and Australia. I am currently working at the University of Chicago, which means I get to live in one of the best cities in the world, with miles of lakefront bike paths to ride on. Diane and I have kept a base in the seaside town of Barry Island in Wales, though, and we get back as often as we can for a proper cup of tea.

I am a researcher with expertise in computer simulations, in particular cellular automata and agent-based models. Having worked with a number of large data sets, I have become very excited about the possibilities of data mining. This inspired me to obtain an MSc in computing from Cardiff University. I have since successfully applied data mining techniques both to predicting the phenotype of bacteria from their genotype and to the diagnosis of respiratory behaviour from breathing patterns.


Information: Lunch will be provided




