[Colloquium] CS Seminar today at 3:30 pm: Haryadi Gunawi, University of Chicago

Sandra Wallace via Colloquium colloquium at mailman.cs.uchicago.edu
Thu Oct 4 09:09:15 CDT 2018


UNIVERSITY OF CHICAGO
DEPARTMENT OF COMPUTER SCIENCE
PRESENTS



Haryadi Gunawi
University of Chicago
	

Thursday, October 4, 2018 at 3:30 pm
Crerar 390


Title:  Faults at Scale: What New Bugs Live in the Cloud and How to Exterminate Them

Abstract:
As more data and computation move from local to cloud environments, datacenter distributed systems have become a dominant backbone for many modern applications. However, the complexity of cloud-scale hardware and software ecosystems has outpaced existing testing, debugging, and verification tools.

I will describe several classes of new bugs that surface in large-scale datacenter distributed systems: (1) distributed concurrency bugs, caused by non-deterministic timings of distributed events such as message arrivals as well as multiple crashes and reboots; (2) tail-performance faults, which surface in the presence of "limping" hardware or heavy contention and can cause cascades of performance failures; and (3) scalability faults, latent scale-dependent faults that typically surface only in large-scale deployments (100+ nodes) and not necessarily in small- or medium-scale deployments. These findings are based on our long, large-scale cloud bug and outage studies (3000+ bugs and 500+ outages).

I will present our various approaches to combating these bugs and faults, such as highly scalable, semantic-aware software model checkers for discovering distributed concurrency bugs and tail-tolerant operating-system support for circumventing millisecond performance faults.

 
Bio:
Haryadi S. Gunawi is a Neubauer Family Assistant Professor in the Department of Computer Science at the University of Chicago, where he leads the UCARE research group (UChicago systems research on Availability, Reliability, and Efficiency). He received his Ph.D. in Computer Science from the University of Wisconsin, Madison in 2009. He was a postdoctoral fellow at the University of California, Berkeley from 2010 to 2012. His current research focuses on cloud computing reliability and new storage technology. He has won numerous awards, including an NSF CAREER Award, an NSF Computing Innovation Fellowship, a Google Faculty Research Award, NetApp Faculty Fellowships, and an Honorable Mention for the 2009 ACM Doctoral Dissertation Award.


