[Colloquium] TTIC Colloquium: Sayan Mukherjee, Duke University

Liv Leader lleader at ttic.edu
Mon May 9 12:12:28 CDT 2011


When:     *Tuesday, May 11 @ 3 p.m.*

Where:    *TTIC Conference Room #530*, 6045 S. Kenwood Ave, 5th Floor

Who:       *Sayan Mukherjee*, Duke University

Title:        *Integration of Genomic Data for Functional Genomics*

I will develop two instances in functional genomics where different genomic
data are combined.

The first example introduces a statistical tool that integrates genetic and
gene expression evidence into genome-wide association analysis of gene sets.
As single-variant or single-gene analyses generally account for only a small
proportion of the phenotypic variation in complex traits, gene set or pathway
association analyses are playing an increasingly important role in
uncovering genetic architectures of complex traits. The two dominant
paradigms for gene set analyses are association analyses based on SNP
genotypes and those based on gene expression profiles. However, gene-disease
association can manifest in many ways, such as alterations of gene
expression, genotype and copy number; thus an integrative approach combining
multiple forms of evidence can more accurately and comprehensively capture
pathway associations. We have developed a single statistical framework, Gene
Set Association Analysis (GSAA), that simultaneously measures genome-wide
patterns of genetic variation and gene expression variation to identify sets
of genes enriched for differential expression and/or trait-associated
genetic markers. Simulation studies illustrate that joint analyses of
genomic data increase the power to detect real associations when compared to
gene set methods that use only one genomic data type. The analyses of two
human diseases, glioblastoma and Crohn’s disease, detected abnormalities in
previously identified disease-associated pathways, such as pathways related
to PI3K signaling, DNA damage response, and activation of NF-κB. In
addition, GSAA revealed novel pathway associations, for example differential
genetic and expression characteristics in genes from the ABC transporter
family in glioblastoma and from the HLA system in Crohn’s disease.

The second example is cERMIT ((conserved) evidence ranked motif
identification), a computationally efficient motif discovery tool based on
analyzing genome-wide quantitative regulatory evidence. Instead of
pre-selecting promising candidate sequences, it utilizes information across
all sequence regions to search for high-scoring motifs. We apply cERMIT to a
range of direct binding and overexpression datasets; it substantially
outperforms state-of-the-art approaches on curated ChIP-chip datasets, and
easily scales to current mammalian ChIP-seq experiments with data on
thousands of non-coding regions. I will also touch on how it can be adapted
to discover motifs corresponding to the binding sites of cellular
RNA-binding proteins (RBPs) and microRNA-containing
ribonucleoprotein complexes (miRNPs).

-- 
Liv Leader
Faculty Services

Toyota Technological Institute
6045 S Kenwood Ave, #504
Chicago, IL 60637
Phone- (773) 834-2567
Fax-     (773) 834-9881
Email-  lleader at ttic.edu <jam at ttic.edu>
Web-   www.ttic.edu