<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

</head>

<body>

<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">

<div class="PlainText">This is an announcement of Cesar Andres Stuardo's Dissertation Defense.<br>

===============================================<br>

Candidate: Cesar Andres Stuardo<br>

<br>

Date: Monday, November 07, 2022<br>

<br>

Time:  1 pm CST<br>

<br>

Remote Location: <a href="https://uchicago.zoom.us/j/92561990906?pwd=Y3BQWU1CdlNkQW1EM3hHS0krTUU4dz09">

https://uchicago.zoom.us/j/92561990906?pwd=Y3BQWU1CdlNkQW1EM3hHS0krTUU4dz09</a><br>

<br>

Location: JCL 298<br>

<br>

Title: Towards Scale-Checkable Systems<br>

<br>

Abstract: In this document, we present our approaches for understanding and discovering scalability faults, i.e., faults whose symptoms appear at larger scales but are not visible at smaller scales.

<br>

<br>

First, we present a study of over 350 scalability faults collected from the repositories of 10 popular open-source distributed systems. We analyze the symptoms they produce, the scenarios in which they manifest, their root causes, the effectiveness of existing testing tools in detecting them, and the solutions and effort involved in tackling them.<br>

<br>

Then, we present ScaleCheck, an emulation-based approach for discovering scalability faults in large-scale distributed storage systems. ScaleCheck employs a set of black-box and white-box techniques that allow developers to “deploy” a cluster on a single machine and accurately observe the behavior of their systems as if they were deployed on multiple machines. Moreover, ScaleCheck includes a collection-tracking mechanism that allows developers to discover potentially harmful code paths affected by an increase in the number of nodes in the cluster. We integrated this approach into 4 popular distributed storage systems and accurately reproduced the symptoms of 10 known scalability faults using a single machine.<br>

<br>

Finally, we present SView, a framework for identifying and analyzing potential scalability faults in large-scale distributed systems. SView combines instrumentation and statistical concepts to identify dimensional code fragments (DCFs), i.e., pieces of code whose number of executions (e.g., # loop iterations, # method executions) is positively correlated with increases in the size of one or more system dimensions (e.g., # files, # clients, # requests), with static analysis modules that detect faulty code patterns involving the DCFs. SView's lightweight approach does not require modifications to the system under test, is portable without effort across different versions of the same system, and focuses on the root causes of scalability faults rather than the symptoms they produce. We evaluate SView on 15 different versions of 4 popular distributed systems and use our analysis modules to detect known and unknown scalability faults.<br>

<br>

Advisor: Haryadi Gunawi<br>

<br>

Committee Members: Haryadi Gunawi, Shan Lu, and Cindy Rubio-González<br>

<br>

<br>

</div>

</span></font></div>


</body>

</html>