[Colloquium] [defense] Farrell/Dissertation Defense/Jul 23, 2020

Margaret Jaffey margaret at cs.uchicago.edu
Wed Jul 8 07:57:25 CDT 2020


Zoom link:
https://uchicago.zoom.us/j/95335489752?pwd=d0hGR0VDc2RnZzBhRGVUaGxxYVVFdz09
Meeting ID: 953 3548 9752 Password: 815788

       Department of Computer Science/The University of Chicago

                     *** Dissertation Defense ***


Candidate:  Anne Farrell

Date:  Thursday, July 23, 2020

Time:  3:00 PM

Place:  Via Zoom

Title: Trading Accuracy for Overall System Performance in
ML-for-Storage Solutions

Abstract:
The widespread adoption of SSDs has made ensuring stable performance
difficult due to their high tail latencies, which are amplified in
large systems. A promising approach to improving tail tolerance
involves using machine learning to predict when requests will be
slower than the acceptable threshold and retrying these slow requests
on a replica. This makes it possible to respond more quickly to tail
latencies without having to wait for the request to reach a timeout,
and it avoids wasting resources on duplicating requests that are
likely to finish quickly. Interestingly, while prediction accuracy is
the dominant metric in machine learning, it may not guarantee the best
results in terms of overall performance for ML-for-Storage solutions.
In this dissertation, we explore how to predict IO latencies in the
most useful manner for improving overall system performance, even if
this means reducing prediction accuracy.
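
(For illustration only; this sketch is not from the dissertation.) In
Python, the failover decision described above might look like the
following, with the threshold, device names, and predicted latency
all invented for the example:

    THRESHOLD_US = 20_000  # hypothetical latency threshold (microseconds)

    def choose_target(predicted_latency_us, primary, replicas):
        # Fail over up front when the model predicts a slow request,
        # rather than waiting for the request to reach a timeout.
        if predicted_latency_us > THRESHOLD_US and replicas:
            return replicas[0]
        return primary  # likely-fast requests are not duplicated

    # A request predicted at 35 ms goes straight to a replica:
    print(choose_target(35_000, "primary-ssd", ["replica-ssd-1"]))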

We create a biased classifier whose loss function accounts for the
differing costs of false positives and false negatives, trading
prediction accuracy for overall performance. We explore how varying
the amount of bias affects the
predictions and overall performance and explain how to select the
right amount of bias for a given system. We show how the speedup that
can be achieved by the biased classifier is affected by factors such
as the cost of failing over to a replica and the number of replicas
available.
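
As a concrete illustration (a minimal sketch, not the dissertation's
actual loss function), such a biased classifier could be trained with
a weighted binary cross-entropy; the weights, and the direction of
the bias shown here, which penalizes missed slow IOs more heavily
than needless hedges, are assumptions made for the example:

    import numpy as np

    def biased_bce(y_true, y_pred, fn_weight=4.0, fp_weight=1.0, eps=1e-7):
        # y_true: 1 if the IO was actually slow, 0 otherwise.
        # y_pred: predicted probability that the IO is slow.
        y_pred = np.clip(y_pred, eps, 1 - eps)
        fn_term = -fn_weight * y_true * np.log(y_pred)            # missed slow IOs
        fp_term = -fp_weight * (1 - y_true) * np.log(1 - y_pred)  # needless hedges
        return float(np.mean(fn_term + fp_term))

Raising fn_weight pushes the classifier toward predicting "slow",
lowering raw accuracy while, in a setting like this one, potentially
improving end-to-end latency.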

To improve flexibility, we similarly create a biased regressor with a
loss function that better reflects the different costs of
overestimates and underestimates, and we find that it can achieve
similar speedups to the biased classifier. Unlike biased
classification, biased regression allows its predictions to be used
with different failover thresholds without retraining the neural
network.
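
A minimal sketch of the analogous idea for regression, assuming
(purely for illustration) that underestimates cost four times as much
as overestimates:

    import numpy as np

    def biased_mse(y_true_us, y_pred_us, under_weight=4.0, over_weight=1.0):
        # Squared error with asymmetric penalties: err < 0 means the
        # latency was underestimated (a slow IO predicted fast).
        err = y_pred_us - y_true_us
        weights = np.where(err < 0.0, under_weight, over_weight)
        return float(np.mean(weights * err ** 2))

    # Because the regressor outputs a latency estimate, the failover
    # rule is a plain comparison, so the threshold can change at
    # serving time without retraining:
    def should_failover(predicted_latency_us, threshold_us):
        return predicted_latency_us > threshold_us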

The asymmetric costs of false positives and false negatives in
ML-for-Storage solutions make it possible for biased predictions to
outperform perfectly accurate (unbiased) predictions, indicating that
a perfect predictor does not necessarily
imply a perfect system; a deep understanding of the system is still
essential to achieving the best results.

Anne's advisor is Prof. Henry Hoffmann.

Login to the Computer Science Department website for details,
including a draft copy of the dissertation:

 https://newtraell.cs.uchicago.edu/phd/phd_announcements#amfarrell

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Margaret P. Jaffey            margaret at cs.uchicago.edu
Department of Computer Science
Student Support Rep (JCL 350)              (773) 702-6011
The University of Chicago      http://www.cs.uchicago.edu
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

