[Colloquium] Turcu/MS Presentation/January 11, 2008

Margaret Jaffey margaret at cs.uchicago.edu
Wed Jan 9 10:14:20 CST 2008


The time and location of Gabri's MS Presentation on Friday has changed  
to 9:00 a.m. in Ry 277.  The title of her MS Paper has also changed  
slightly.

---------------------------------
Date:  Friday, January 11, 2008

Time:  9:00 a.m. (note the new time)

Place:  Ryerson 277 (note the new location)

M.S. Candidate:  Gabriela Turcu

M.S. Paper Title:  Efficient Data Warehouse Maintenance in  
Bioinformatics

Abstract:
Motivated by the challenge of integrating biological information, we  
explore various existing approaches. Based on lessons from the
GADU/GNARE system, we adopt a solution based on defining a common data
model for the integrated sources and materialize the knowledge as a  
relational warehouse. We integrate information sources related to  
proteins together with results of computational tools applied to this  
data. We keep a full history of our warehouse by means of timestamps  
attached to every tuple in the warehouse. This allows us to recreate  
any past version of our warehouse and also aids in provenance tracking  
of our data. We maintain the warehouse incrementally for both
relations derived by applying the standard relational operators and,
where possible, relations created through external computations. We
propose a new approach to incrementally maintaining sequence alignments
produced by the BLAST bioinformatics tool. BLAST has become the
standard tool used in biological sequence alignment. Although it is a
heuristic algorithm that attempts to reduce the search space of the
traditional quadratic dynamic programming algorithms used for
performing sequence alignments, it is still computationally expensive.
The BLAST computation often turns out to be the bottleneck in the
maintenance process of integrated bioinformatics analysis
environments. Assuming a growing alignment search space, and based
on information about the new sequences to be aligned, we adjust the
statistical values produced by BLAST for the alignments already
present before the update. To these we append the alignments that can be
accounted for by the presence of new sequences in both the query and
the target sequence sets.
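
As a rough illustration of the kind of statistical adjustment the abstract describes (not the method from the paper, which this announcement does not detail), the sketch below assumes the standard Karlin-Altschul relation E = m * n * 2^(-S'), where m is the query length, n is the total database length, and S' is the bit score. Under that assumption, the E-value of an existing alignment scales linearly with the database length when new target sequences are added. The function names are hypothetical, and real BLAST uses effective lengths, so this is only an approximation.

```python
def evalue_from_bit_score(bit_score: float, query_len: int, db_len: int) -> float:
    """E-value under the assumed relation E = m * n * 2^(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


def rescale_evalue(old_evalue: float, old_db_len: int, new_db_len: int) -> float:
    """Adjust an existing alignment's E-value after the database grows.

    Assumption: E is proportional to the total database length, so the
    old value is multiplied by the growth ratio instead of rerunning the
    alignment. Hypothetical helper, not part of any BLAST distribution.
    """
    if old_db_len <= 0:
        raise ValueError("old_db_len must be positive")
    if new_db_len < old_db_len:
        raise ValueError("this sketch assumes the search space only grows")
    return old_evalue * (new_db_len / old_db_len)


# Example: the database doubles from 1,000 to 2,000 residues.
old_e = evalue_from_bit_score(50.0, query_len=100, db_len=1000)
new_e = rescale_evalue(old_e, old_db_len=1000, new_db_len=2000)
```

Alignments that involve the newly added sequences would still need to be computed and appended separately; the rescaling only covers alignments that existed before the update.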

Advisor:  Prof. Ian Foster

A draft copy of Gabri Turcu's MS Paper is available in Ry 156.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Margaret P. Jaffey                             margaret at cs.uchicago.edu
Department of Computer Science
Student Support Rep (Ry 156)        (773) 702-6011
The University of Chicago                  http://www.cs.uchicago.edu
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=




