[Colloquium] Reminder: Li/Dissertation Defense/July 17, 2020 - Today at 12PM via Zoom

Rene Noyola rnoyola at uchicago.edu
Fri Jul 17 09:30:09 CDT 2020


    This is a reminder announcement about Huaicheng Li's defense, today at 12PM via Zoom.
    
    Here is the Zoom link to participate:
    
    
    https://uchicago.zoom.us/j/92952579383?pwd=dkJoUVUxZW9FTzdycXBrTUhyTy9PUT09 
    
    Password: 701903
    
    One tap mobile:
        +13126266799,,92952579383# US (Chicago)
        +16465588656,,92952579383# US (New York)
    
    Dial by your location:
        +1 312 626 6799 US (Chicago)
        +1 646 558 8656 US (New York)
        +1 301 715 8592 US (Germantown)
        +1 346 248 7799 US (Houston)
        +1 669 900 9128 US (San Jose)
        +1 253 215 8782 US (Tacoma)
        888 788 0099 US Toll-free
        877 853 5247 US Toll-free
    
    Meeting ID: 929 5257 9383
    Password: 701903
    
    
    
        Department of Computer Science/The University of Chicago
    
                      *** Dissertation Defense ***
    
    
    Candidate:  Huaicheng Li
    
    Date:  Friday, July 17, 2020
    
    Time:  12:00 PM
    
    Place:  remotely via Zoom
    
    Title: Evolving Cloud Storage Stack for Predictability and Efficiency
    
    Abstract:
    With the exponential growth of data, which is expected to reach 175
    zettabytes by 2025, cloud storage is increasingly becoming the central
    hub for data management and processing. Among the many benefits cloud
    platforms promise, predictable performance and cost-efficiency are two
    fundamental factors driving the success of modern cloud storage.
    However, as modern cloud storage infrastructure changes rapidly in
    both software and hardware, new challenges emerge in achieving
    predictable performance efficiently.
    
    Modern data-intensive applications and a new wave of computing
    paradigms (e.g., data analytics, ML, serverless) are driving the
    storage stack toward a radical shift to more feature-rich software
    designs on top of increasingly heterogeneous architectures. As a
    result, today's cloud storage stack is extremely heavyweight and
    complex, burning 10-20% of data center CPU cycles and introducing
    severe performance non-determinism (i.e., long tail latencies).
    Unfortunately, the deployment of new acceleration hardware (e.g., NVMe
    SSDs and I/O co-processors) only partially addresses the problem.
    Due to the intrinsic complexities and idiosyncrasies of the hardware
    (e.g., NAND flash management) and the lack of system-level support, it
    remains a challenge to design performant and cost-efficient cloud
    storage systems. In particular, achieving sub-millisecond latency
    predictability in a cost-efficient manner is the new battlefield.
    
    Rooted in a deep understanding and analysis of the existing
    software/hardware stack, this dissertation focuses on building new
    abstractions, interfaces, and end-to-end storage systems that achieve
    predictable performance and cost-efficiency through a software/hardware
    co-design approach. By revisiting the challenges across different
    layers in a holistic manner, the co-design approach opens up simple
    yet powerful system-level policy designs that opportunistically exploit
    hardware idiosyncrasies and heterogeneity. The systems we build can
    reduce latency spikes by orders of magnitude and deliver up to 20x
    cost savings.
    
    To address the challenge of predictable performance in modern flash
    storage systems, we present TeaFA, a tail-evading flash array design
    that delivers deterministic performance. TeaFA uniquely combines a
    simple yet powerful host-SSD interface, a time-window mechanism, and
    data redundancy to proactively and deterministically reconstruct late
    requests, with only minor changes to the host software and device
    firmware. Evaluation results across 9 data center storage traces
    and several real storage workloads (e.g., FileBench, YCSB/RocksDB)
    show that TeaFA improves on the baseline by orders of magnitude and is
    only 1.1x to 2.1x slower than an ideal case with no tail latencies
    induced by background operations. Compared to other state-of-the-art
    work focused on improving I/O performance (e.g., the Proactive
    approach, Preemptive GC, P/E Suspension, Flash-on-Rails, and Harmonia),
    TeaFA is more deterministic and more effective at cutting tail
    latencies while being less intrusive and easier to deploy.
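    
    As a rough illustration of the idea, the sketch below (in C, with
    hypothetical names and constants; it is not TeaFA's actual code)
    reconstructs any stripe unit whose drive misses a latency window from
    the XOR parity of the remaining drives, rather than waiting for the
    slow drive:
    
    /* A minimal sketch of window-based proactive reconstruction
     * (hypothetical names and constants; not TeaFA's actual code). */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    
    #define NDATA      3         /* data drives per stripe (assumed) */
    #define UNIT_SZ    4096      /* bytes per stripe unit */
    #define WINDOW_US  500       /* assumed latency window for a read */
    
    /* Stub: read one unit from a drive and return its latency in
     * microseconds.  A real array issues these reads in parallel. */
    static uint64_t drive_read(int drive, uint64_t lba, uint8_t *buf)
    {
        (void)lba;
        memset(buf, 'A' + drive, UNIT_SZ);    /* fake payload */
        return (drive == 1) ? 3000 : 100;     /* pretend drive 1 is in GC */
    }
    
    /* Rebuild one late unit by XORing the parity with all on-time units. */
    static void rebuild(uint8_t *out, uint8_t units[NDATA][UNIT_SZ],
                        const uint8_t *parity, int late)
    {
        memcpy(out, parity, UNIT_SZ);
        for (int d = 0; d < NDATA; d++)
            if (d != late)
                for (size_t i = 0; i < UNIT_SZ; i++)
                    out[i] ^= units[d][i];
    }
    
    int main(void)
    {
        static uint8_t units[NDATA][UNIT_SZ], parity[UNIT_SZ];
        uint64_t lat[NDATA];
    
        /* Parity precomputed to match the stub payloads 'A', 'B', 'C'. */
        for (size_t i = 0; i < UNIT_SZ; i++)
            parity[i] = 'A' ^ 'B' ^ 'C';
    
        for (int d = 0; d < NDATA; d++)
            lat[d] = drive_read(d, 0, units[d]);
    
        /* Any unit slower than the window (e.g., its drive is busy with
         * garbage collection) is served from parity instead of waiting. */
        for (int d = 0; d < NDATA; d++)
            if (lat[d] > WINDOW_US) {
                printf("drive %d late (%llu us): reconstructing its unit\n",
                       d, (unsigned long long)lat[d]);
                rebuild(units[d], units, parity, d);
            }
    
        printf("unit 1, byte 0 = %c\n", units[1][0]);   /* prints 'B' */
        return 0;
    }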
    
    Although TeaFA effectively reduces tail latencies, a significant share
    of CPU cycles is needed to perform the reconstruction computations.
    Worse, at large scale, the "storage tax" that cloud providers pay
    consumes up to 10-20% of datacenter CPU cycles, making it challenging
    to achieve cost- and resource-efficiency in modern cloud storage stack
    designs. One opportunity is to use modern I/O accelerators for
    cost-efficient storage offloading; yet the complex cloud storage stack
    is not fully offload-ready for today's I/O accelerators. To tackle the
    cost-efficiency challenge, we present LeapIO, a next-generation cloud
    storage stack that leverages ARM-based co-processors to offload
    complex storage services. LeapIO addresses many deployment challenges,
    such as hardware fungibility, software portability, virtualizability,
    composability, and efficiency. It employs a set of OS/software
    techniques and new hardware properties to provide a uniform address
    space across the x86 and ARM cores, minimizing data copies and
    directly exposing virtual NVMe storage to unmodified guest VMs. At its
    core, the LeapIO runtime enables agile storage service development in
    user space. Storage services on LeapIO are "offload ready": they can
    run portably on the ARM SoC or on the host x86 in a trusted VM. The
    software overhead amounts to only a 2-5% throughput reduction compared
    to bare-metal performance (still delivering the peak bandwidth of 0.65
    million IOPS on a datacenter SSD). Our current SoC prototype also
    delivers acceptable performance, a further 5% reduction on the server
    side (and up to 30% on the client side), but with more than 20x cost
    savings. Overall, LeapIO helps cloud providers cut the storage tax and
    improve utilization without sacrificing performance.
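    
    The sketch below (C, with hypothetical types and helpers; not LeapIO's
    actual runtime API) illustrates what an "offload-ready" user-space
    storage service can look like: it polls an NVMe-style submission queue
    and posts completions through plain pointers, so the same loop can run
    on the host x86 or on an ARM SoC as long as the runtime maps the
    queues and guest buffers into one shared address space:
    
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    
    #define QDEPTH 64
    #define BLKSZ  512
    
    struct sqe {                 /* NVMe-style submission queue entry */
        uint8_t  valid;          /* set by the submitter, cleared when consumed */
        uint8_t  opcode;         /* 0x01 = write, 0x02 = read (illustrative) */
        uint64_t lba;
        uint32_t nblk;
        uint64_t buf_addr;       /* guest buffer offset in the shared region */
        uint64_t cid;            /* command id echoed in the completion */
    };
    
    struct cqe { uint64_t cid; uint16_t status; };
    
    struct queue_pair {
        struct sqe sq[QDEPTH];
        struct cqe cq[QDEPTH];
        uint32_t sq_head, cq_tail;
    };
    
    /* Stand-in for the uniform x86/ARM address space: guest buffers live
     * in one region that the runtime would map on both sides. */
    static uint8_t guest_mem[1 << 20];
    
    static void *map_guest_addr(uint64_t gpa, size_t len)
    {
        return (gpa + len <= sizeof(guest_mem)) ? &guest_mem[gpa] : NULL;
    }
    
    /* Backend hook: could be a local NVMe SSD, a remote replica, or another
     * composed service; here it just fills read buffers with a pattern. */
    static int backend_submit(uint8_t opcode, uint64_t lba, uint32_t nblk,
                              void *buf)
    {
        if (opcode == 0x02)                         /* "read" */
            memset(buf, (int)(lba & 0xff), (size_t)nblk * BLKSZ);
        return 0;
    }
    
    /* Consume at most one submission.  The same loop body could run on the
     * host x86 (in a trusted VM) or on the ARM SoC. */
    static int service_poll_once(struct queue_pair *qp)
    {
        struct sqe *e = &qp->sq[qp->sq_head];
        if (!e->valid)
            return 0;                               /* nothing new */
    
        void *buf = map_guest_addr(e->buf_addr, (size_t)e->nblk * BLKSZ);
        int rc = buf ? backend_submit(e->opcode, e->lba, e->nblk, buf) : -1;
    
        qp->cq[qp->cq_tail].cid    = e->cid;        /* post the completion */
        qp->cq[qp->cq_tail].status = rc ? 1 : 0;
        qp->cq_tail = (qp->cq_tail + 1) % QDEPTH;
    
        e->valid    = 0;
        qp->sq_head = (qp->sq_head + 1) % QDEPTH;
        return 1;
    }
    
    int main(void)
    {
        static struct queue_pair qp;
        qp.sq[0] = (struct sqe){ .valid = 1, .opcode = 0x02, .lba = 7,
                                 .nblk = 1, .buf_addr = 0, .cid = 42 };
        service_poll_once(&qp);
        printf("cid=%llu status=%u data[0]=%u\n",
               (unsigned long long)qp.cq[0].cid,
               (unsigned)qp.cq[0].status, (unsigned)guest_mem[0]);
        return 0;
    }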
    
    Finally, we discuss the importance of scalable and extensible research
    platforms for fostering future full-stack software/hardware storage
    research. Existing software platforms (e.g., SSD/SoC simulators or
    emulators) are limited in the types of research they support,
    outdated, and not scalable, while hardware platforms suffer from
    wear-out issues and are difficult to use; neither is a good choice for
    exploring new ideas in the early phase. We argue that it is a critical
    time for the storage research community to have a new software-based
    full-system SSD emulator. To this end, we built FEMU, a software
    (QEMU-based) NVMe flash emulator. FEMU is cheap (open source),
    relatively accurate (0.5-38% variance as a drop-in replacement for an
    OpenChannel SSD), scalable (supporting 32 parallel channels/chips),
    and extensible (supporting both internal-only and split-level SSD
    research). FEMU has been used by researchers at tens of institutions
    and in courses, demonstrating both the need for such a research
    platform and its success.
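    
    For a flavor of how a software flash emulator can produce realistic
    latencies, the sketch below (C, with assumed constants and names; not
    FEMU's actual code) keeps a "next free" timestamp per channel and per
    chip and completes each page read at the later of its arrival time and
    the chip's availability, plus the cell-read and transfer latencies:
    
    #include <stdint.h>
    #include <stdio.h>
    
    #define NCHANNELS    32      /* matches the scalability claim above */
    #define CHIPS_PER_CH  8
    #define READ_LAT_US  50      /* assumed NAND page read latency */
    #define XFER_LAT_US  10      /* assumed channel transfer time per page */
    
    struct flash_timing {
        uint64_t ch_free_us[NCHANNELS];
        uint64_t chip_free_us[NCHANNELS][CHIPS_PER_CH];
    };
    
    /* Return the emulated completion time (microseconds) of one page read
     * arriving at now_us on (ch, chip), and advance the bookkeeping. */
    static uint64_t emulate_page_read(struct flash_timing *t, uint64_t now_us,
                                      int ch, int chip)
    {
        uint64_t start = now_us;
        if (t->chip_free_us[ch][chip] > start)
            start = t->chip_free_us[ch][chip];        /* wait for the chip */
        uint64_t cell_done = start + READ_LAT_US;
    
        uint64_t xfer_start = cell_done;
        if (t->ch_free_us[ch] > xfer_start)
            xfer_start = t->ch_free_us[ch];           /* wait for the channel */
        uint64_t done = xfer_start + XFER_LAT_US;
    
        t->chip_free_us[ch][chip] = cell_done;  /* chip busy during cell read */
        t->ch_free_us[ch]         = done;       /* channel busy during transfer */
        return done;   /* the emulator delays the NVMe completion until here */
    }
    
    int main(void)
    {
        struct flash_timing t = {0};
        /* Two back-to-back reads to the same chip: the second one queues
         * behind the first, which is how emulated tail latency arises. */
        printf("%llu us\n", (unsigned long long)emulate_page_read(&t, 0, 0, 0));
        printf("%llu us\n", (unsigned long long)emulate_page_read(&t, 0, 0, 0));
        return 0;
    }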
    
    
    
    Huaicheng's advisor is Prof. Haryadi Gunawi.
    
    Login to the Computer Science Department website for details,
    including a draft copy of the dissertation:
    
    https://newtraell.cs.uchicago.edu/phd/phd_announcements#huaicheng
    
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    Margaret P. Jaffey            margaret at cs.uchicago.edu
    Department of Computer Science
    Student Support Rep (JCL 350)              (773) 702-6011
    The University of Chicago      http://www.cs.uchicago.edu
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    _______________________________________________
    Colloquium mailing list  -  Colloquium at mailman.cs.uchicago.edu
    https://mailman.cs.uchicago.edu/mailman/listinfo/colloquium


