[Colloquium] Turcu/MS Presentation/January 11, 2008
Margaret Jaffey
margaret at cs.uchicago.edu
Wed Jan 9 10:14:20 CST 2008
The time and location of Gabri's MS Presentation on Friday has changed
to 9:00 a.m. in Ry 277. The title of her MS Paper has also changed
slightly.
---------------------------------
Date: Friday, January 11, 2008
Time: 9:00 a.m. (note the new time)
Place: Ryerson 277 (note the new location)
M.S. Candidate: Gabriela Turcu
M.S. Paper Title: Efficient Data Warehouse Maintenance in
Bioinformatics
Abstract:
Motivated by the challenge of integrating biological information, we
explore various existing approaches. Drawing on lessons from the
GADU/GNARE system, we adopt a solution that defines a common data
model for the integrated sources and materializes the knowledge as a
relational warehouse. We integrate information sources related to
proteins together with results of computational tools applied to this
data. We keep a full history of our warehouse by means of timestamps
attached to every tuple in the warehouse. This allows us to recreate
any past version of our warehouse and also aids in provenance tracking
of our data. We maintain the warehouse incrementally for both
relations derived by applying the standard relational operators and,
where possible, relations created through external computations. We
propose a new approach to incrementally maintain sequence alignments
produced by the BLAST bioinformatics tool. BLAST has become the
standard tool used in biological sequence alignment. Although it is a
heuristic algorithm that attempts to reduce the search space of the
traditional quadratic dynamic programming algorithms used for
performing sequence alignments, it is still computationally expensive.
The BLAST computation oftentimes turns out to be the bottleneck in the
maintenance process of integrated bioinformatics analysis
environments. Assuming a growing alignment search space, and based
on information about the new sequences to be aligned, we adjust the
statistical values produced by BLAST for the alignments already
present before the update. To these we append the alignments that can
be accounted for by the presence of new sequences in both the query
and the target sequence sets.
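
The adjustment step described at the end of the abstract can be
illustrated with a minimal sketch. It is not taken from the MS paper.
It assumes the standard Karlin-Altschul relation E = K*m*n*exp(-lambda*S),
under which an E-value is proportional to the search-space size, so an
existing hit can be rescaled by the ratio of new to old search space
instead of being recomputed. The names Hit, rescale_evalue and
update_hits are hypothetical, and the sketch ignores the
effective-length corrections that BLAST applies.

```python
from dataclasses import dataclass, replace
from typing import Iterable, List


@dataclass(frozen=True)
class Hit:
    """One stored alignment (hypothetical record layout)."""
    query_id: str
    target_id: str
    evalue: float


def rescale_evalue(old_evalue: float, old_space: float, new_space: float) -> float:
    """Rescale an E-value for a changed search-space size.

    Assumes E is proportional to the search space, as in
    E = K*m*n*exp(-lambda*S) with the raw score S unchanged.
    """
    if old_space <= 0 or new_space <= 0:
        raise ValueError("search-space sizes must be positive")
    return old_evalue * (new_space / old_space)


def update_hits(old_hits: Iterable[Hit], new_hits: Iterable[Hit],
                old_space: float, new_space: float) -> List[Hit]:
    """Rescale the hits present before the update, then append the
    hits attributable to the newly added sequences."""
    rescaled = [
        replace(h, evalue=rescale_evalue(h.evalue, old_space, new_space))
        for h in old_hits
    ]
    return rescaled + list(new_hits)
```

In this sketch only the hits involving new sequences would require an
actual BLAST run. The stored hits are updated arithmetically.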
Advisor: Prof. Ian Foster
A draft copy of Gabri Turcu's MS Paper is available in Ry 156.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Margaret P. Jaffey margaret at cs.uchicago.edu
Department of Computer Science
Student Support Rep (Ry 156) (773) 702-6011
The University of Chicago http://www.cs.uchicago.edu
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=