[Colloquium] Reminder - Martin Putra MS Presentation/Nov 3, 2022
Megan Woodward
meganwoodward at uchicago.edu
Thu Nov 3 08:34:04 CDT 2022
This is an announcement of Martin Putra's MS Presentation
===============================================
Candidate: Martin Putra
Date: Thursday, November 03, 2022
Time: 11 am CDT
Remote Location: https://uchicago.zoom.us/j/8080588315?pwd=bHJHMWxaVS8rbFI5WUczdVlhMkhGUT09
Location: JCL 298
M.S. Paper Title: HYBRID-CLOUD SCHEDULING FOR LONG-RUNNING BIOINFORMATICS WORKFLOWS UNDER TIME CONSTRAINTS
Abstract: The amount of genomics data is increasing at a rapid pace, so high-throughput data processing is needed to mine insights from it efficiently. However, bioinformatics workflows/jobs typically take large input files (e.g., hundreds of GB), which leads to long execution times.
This thesis seeks to answer the following research question: "How to ensure X% deadline satisfaction when scheduling bioinformatics workflows in hybrid clouds?" To address this problem, the thesis presents a novel hybrid cloud scheduler that significantly reduces the deadline miss rate at no or minimal extra cost. The new scheduler consists of two main components: 1) an accurate execution time predictor for bioinformatics workflows and 2) a novel scheduling policy that uses spot lifetime distributions and job duration predictions to provide a high statistical guarantee that jobs placed on spot instances run to completion.
In this thesis, I developed a high-fidelity simulator that closely reproduces the behavior of bioinformatics workflows in hybrid cloud environments composed of on-premises and cloud machines. The proposed scheduler was modeled on top of this simulator and evaluated using large-scale, real-world production traces from one of the leading genomics research centers. The evaluation results show that the new scheduler reduces deadline miss rates and/or cost compared to several baseline scheduling policies. Finally, the thesis outlines future research directions for refining and implementing the scheduler on production-level genomics processing systems.
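As a rough illustration of the second component named in the abstract (placing jobs on spot instances using spot lifetime distributions and job duration prediction), the idea might be sketched as follows. This is a minimal sketch under stated assumptions, not the policy from the thesis: the function names, the empirical-survival estimate, the `safety_factor` padding, and the 95% default target are all hypothetical.

```python
from bisect import bisect_left


def survival_probability(lifetimes_h, duration_h):
    """Empirical estimate of P(spot lifetime >= duration_h).

    lifetimes_h: observed spot instance lifetimes in hours (hypothetical data).
    """
    if not lifetimes_h:
        raise ValueError("need at least one observed spot lifetime")
    ordered = sorted(lifetimes_h)
    # Index of the first observed lifetime that is >= duration_h.
    first_surviving = bisect_left(ordered, duration_h)
    return (len(ordered) - first_surviving) / len(ordered)


def choose_placement(lifetimes_h, predicted_duration_h,
                     target=0.95, safety_factor=1.0):
    """Pick 'spot' only if the job is likely enough to finish before revocation.

    The predicted duration is padded by safety_factor (an assumption) to
    absorb prediction error; otherwise the job falls back to 'on_demand'.
    """
    padded = predicted_duration_h * safety_factor
    p = survival_probability(lifetimes_h, padded)
    return ("spot" if p >= target else "on_demand"), p


if __name__ == "__main__":
    # Made-up lifetimes purely for demonstration.
    observed = [1, 2, 4, 8, 16, 24, 24, 24, 24, 24]
    print(choose_placement(observed, predicted_duration_h=2, target=0.9))
    print(choose_placement(observed, predicted_duration_h=10, target=0.9))
```

Here the target plays the role of the "X%" in the research question: a job goes to a spot instance only when the estimated chance of outliving its predicted runtime meets that target.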
Advisors: Robert Grossman and Haryadi Gunawi
Committee Members: Robert Grossman, Haryadi Gunawi, and In Kee Kim
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Master_Thesis_Martin_Putra-v1.pdf
Type: application/pdf
Size: 2685012 bytes
Desc: Master_Thesis_Martin_Putra-v1.pdf
URL: <http://mailman.cs.uchicago.edu/pipermail/colloquium/attachments/20221103/c85e1196/attachment-0001.pdf>