[Colloquium] Turcu/MS Presentation/January 11, 2008
Margaret Jaffey
margaret at cs.uchicago.edu
Tue Dec 18 09:41:22 CST 2007
This is an announcement of Gabriela Turcu's MS Presentation.
---------------------------------
Date: Friday, January 11, 2008
Time: 12:30 p.m.
Place: Ryerson 251
M.S. Candidate: Gabriela Turcu
M.S. Paper Title: Efficient Biological Data Warehouse Maintenance
Abstract:
Motivated by the challenge of integrating biological information, we
explore various existing approaches. Based on lessons from the GADU/
GNARE system, we adopt a solution based on defining a common data
model for the integrated sources and materialize the knowledge as a
relational warehouse. We integrate information sources related to
proteins together with results of computational tools applied to this
data. We keep a full history of our warehouse by means of timestamps
attached to every tuple in the warehouse. This allows us to recreate
any past version of our warehouse and also aids in provenance
tracking of our data. We maintain the warehouse incrementally, both
for relations derived by applying the standard relational operators
and, where possible, for relations created through external computations.
We propose a new approach to incrementally maintain sequence
alignments produced by the BLAST bioinformatics tool. BLAST has
become the standard tool used in biological sequence alignment.
Although it is a heuristic algorithm that attempts to reduce the search
space of the traditional quadratic dynamic programming algorithms
used for performing sequence alignments, it is still computationally
expensive. The BLAST computation oftentimes turns out to be the
bottleneck in the maintenance process of integrated bioinformatics
analysis environments. Assuming a growing alignment search space
and using information about the new sequences to be aligned, we
adjust the statistical values produced by BLAST for the alignments
already present before the update. To these we append the alignments
that can be accounted for by the presence of new sequences in both
the query and the target sequence sets.
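
The adjustment of statistical values described in the abstract can be
illustrated with a minimal sketch. It is not the method from the paper.
It only assumes the standard property that a BLAST E-value grows
linearly with the size of the search space, so an E-value computed
against the old database can be rescaled by the ratio of new to old
database size (ignoring effective-length corrections). The function
names, the hit tuple layout, and the threshold parameter are all
hypothetical.

```python
def rescale_evalue(old_evalue, old_db_size, new_db_size):
    """Rescale an E-value after the database grows.

    Assumption: E is proportional to search-space size, so for a fixed
    query the E-value scales with new_db_size / old_db_size.
    """
    if old_db_size <= 0 or new_db_size < old_db_size:
        raise ValueError("database size must be positive and non-decreasing")
    return old_evalue * (new_db_size / old_db_size)


def incremental_update(old_hits, new_hits, old_db_size, new_db_size, threshold):
    """Combine rescaled old hits with hits involving new sequences.

    Hits are hypothetical (query_id, target_id, evalue) tuples.
    new_hits are assumed to be already computed against the enlarged
    database, so only old_hits are rescaled.
    """
    rescaled = [
        (q, t, rescale_evalue(e, old_db_size, new_db_size))
        for q, t, e in old_hits
    ]
    # Hits whose rescaled E-value exceeds the cutoff are dropped,
    # as they would not be reported by a fresh run at this threshold.
    return [hit for hit in rescaled + list(new_hits) if hit[2] <= threshold]
```

For example, doubling the database size doubles every stored E-value,
so a hit that sat just under the reporting cutoff before the update may
no longer pass it afterwards.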
Advisor: Prof. Ian Foster
A draft copy of Gabri Turcu's MS Paper is available in Ry 156.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Margaret P. Jaffey margaret at cs.uchicago.edu
Department of Computer Science
Student Support Rep (Ry 156) (773) 702-6011
The University of Chicago http://www.cs.uchicago.edu
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=