[Colloquium] Martin Putra MS Presentation/Nov 3, 2022

Megan Woodward meganwoodward at uchicago.edu
Wed Nov 2 10:00:37 CDT 2022


This is an announcement of Martin Putra's MS Presentation
===============================================
Candidate: Martin Putra

Date: Thursday, November 03, 2022

Time: 11 am CDT

Remote Location: https://uchicago.zoom.us/j/8080588315?pwd=bHJHMWxaVS8rbFI5WUczdVlhMkhGUT09

Location: JCL 298

M.S. Paper Title: HYBRID-CLOUD SCHEDULING FOR LONG-RUNNING BIOINFORMATICS WORKFLOWS UNDER TIME CONSTRAINTS

Abstract: The amount of genomics data is increasing at a rapid pace, so high-throughput data processing is needed to mine insights from it efficiently. However, bioinformatics workflows/jobs typically take large input files (e.g., hundreds of GB), which gives them long execution times.

This thesis seeks to answer the following research question: "How can we ensure X% deadline satisfaction when scheduling bioinformatics workflows in hybrid clouds?" To address this problem, the thesis presents a novel hybrid-cloud scheduler that significantly reduces the deadline miss rate at no or minimal extra cost. The new scheduler consists of two main components: 1) an accurate execution time predictor for bioinformatics workflows and 2) a novel scheduling policy that uses spot lifetime distributions and job duration predictions to give a high statistical guarantee that jobs placed on spot instances run to completion.
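As a rough illustration of the idea behind the second component, and not the thesis's actual policy, a placement decision could compare the predicted job duration against an empirical spot lifetime distribution. Everything in this sketch is an assumption made for illustration: the function names, the "reliable" label for on-premise or on-demand machines, the 0.95 target, and the sample lifetimes.

```python
from bisect import bisect_right


def survival_probability(lifetimes_hours, duration_hours):
    """Empirical P(spot lifetime > duration) from observed lifetimes."""
    if not lifetimes_hours:
        raise ValueError("need at least one observed lifetime")
    ordered = sorted(lifetimes_hours)
    # Number of observed instances that were reclaimed within the duration.
    ended = bisect_right(ordered, duration_hours)
    return 1.0 - ended / len(ordered)


def choose_placement(predicted_hours, hours_to_deadline, lifetimes_hours,
                     target=0.95):
    """Pick 'spot' only if the job is likely to finish before preemption.

    Hypothetical rule: a job whose predicted duration exceeds the time
    left is 'infeasible'; otherwise it goes to a spot instance when the
    empirical survival probability meets the target, else to a
    'reliable' (on-premise or on-demand) machine.
    """
    if predicted_hours > hours_to_deadline:
        return "infeasible"
    if survival_probability(lifetimes_hours, predicted_hours) >= target:
        return "spot"
    return "reliable"


# Made-up lifetimes (hours) standing in for a real spot lifetime trace.
observed = [2, 4, 6, 8, 10, 12, 24, 48, 72, 96]
print(choose_placement(5, 10, observed, target=0.75))
print(choose_placement(5, 10, observed, target=0.90))
```

With these sample lifetimes, 8 of the 10 instances outlive a 5-hour job, so the job qualifies for a spot instance at a 0.75 target but not at 0.90.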

In this thesis, I developed a high-fidelity simulator that closely reproduces the behavior of bioinformatics workflows in hybrid cloud environments composed of on-premise and cloud machines. The proposed scheduler was modeled on top of this simulator and evaluated using large-scale, real-world production traces from one of the leading genomics research centers. The evaluation results show that the new scheduler reduces deadline miss rates and/or cost compared to several baseline scheduling policies. Finally, the thesis outlines future research directions for refining the scheduler and implementing it on production-level genomics processing systems.

Advisors: Robert Grossman and Haryadi Gunawi

Committee Members: Robert Grossman, Haryadi Gunawi, and In Kee Kim
