[Colloquium] Turcu/MS Presentation/January 11, 2008

Margaret Jaffey margaret at cs.uchicago.edu
Tue Dec 18 09:41:22 CST 2007


This is an announcement of Gabriela Turcu's MS Presentation.

---------------------------------
Date:  Friday, January 11, 2008

Time:  12:30 p.m.

Place:  Ryerson 251

M.S. Candidate:  Gabriela Turcu

M.S. Paper Title:  Efficient Biological Data Warehouse Maintenance

Abstract:
Motivated by the challenge of integrating biological information, we  
explore various existing approaches. Based on lessons from the
GADU/GNARE system, we adopt a solution based on defining a common data
model for the integrated sources and materialize the knowledge as a  
relational warehouse. We integrate information sources related to  
proteins together with results of computational tools applied to this  
data. We keep a full history of our warehouse by means of timestamps  
attached to every tuple in the warehouse. This allows us to recreate  
any past version of our warehouse and also aids in provenance  
tracking of our data. We maintain the warehouse incrementally for  
both relations derived by applying the standard relational operators  
and, where possible, relations created through external computations.
We propose a new approach to incrementally maintain sequence
alignments produced by the BLAST bioinformatics tool. BLAST has  
become the standard tool used in biological sequence alignment.
Although it is a heuristic algorithm that attempts to reduce the search
space of the traditional quadratic dynamic programming algorithms  
used for performing sequence alignments, it is still computationally  
expensive. The BLAST computation oftentimes turns out to be the  
bottleneck in the maintenance process of integrated bioinformatics  
analysis environments. Assuming a growing alignment search space
and using information about the new sequences to be aligned, we
adjust the statistical values produced by BLAST for the alignments
already present before the update. To these we append the alignments
that can be accounted for by the presence of new sequences in both  
the query and the target sequence sets.
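
As a rough illustration of the kind of statistical adjustment the
abstract describes (not the method presented in the paper), the sketch
below assumes the standard Karlin-Altschul relation
E = K * m * n * exp(-lambda * S). Under that relation, an existing
alignment's E-value scales linearly with the size of the search space,
provided the raw score S and the parameters K and lambda stay fixed.
The function names and the numeric parameter values are hypothetical
placeholders, and real BLAST also applies effective-length corrections
that are ignored here.

```python
import math


def evalue(raw_score, query_len, db_len, k=0.041, lam=0.267):
    """Expected number of chance hits: E = K * m * n * exp(-lambda * S).

    k and lam are illustrative placeholder values, not taken from the paper.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


def rescale_evalue(old_evalue, old_db_len, new_db_len):
    """Rescale a stored E-value after the target set grows.

    Assumes the raw score, K, and lambda are unchanged, so only the
    search-space size n differs between the old and new computation.
    """
    if old_db_len <= 0:
        raise ValueError("old_db_len must be positive")
    return old_evalue * (new_db_len / old_db_len)


# An alignment stored before the update, then adjusted without re-running BLAST.
old = evalue(raw_score=100, query_len=250, db_len=1_000_000)
adjusted = rescale_evalue(old, old_db_len=1_000_000, new_db_len=1_500_000)
```

Under these assumptions the adjusted value matches what a full
recomputation against the larger target set would report, so only the
alignments involving new sequences would need to be computed from
scratch.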

Advisor:  Prof. Ian Foster

A draft copy of Gabri Turcu's MS Paper is available in Ry 156.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Margaret P. Jaffey                             margaret at cs.uchicago.edu
Department of Computer Science
Student Support Rep (Ry 156)        (773) 702-6011
The University of Chicago                  http://www.cs.uchicago.edu
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=




