[Colloquium] Chen Zou Dissertation Defense/Dec 7, 2022

nitayack at cs.uchicago.edu
Mon Nov 28 14:55:06 CST 2022


This is an announcement of Chen Zou's Dissertation Defense.
===============================================
Candidate: Chen Zou

Date: Wednesday, December 07, 2022

Time:  2 pm CST


Location: Crear 298

Title:  High-Performance Architectures for Data Center Computational Storage

Abstract: The dramatic growth in the importance of large-scale data analytics has driven the transformation of data center storage from hard disk drives to solid-state drives (SSDs).
Central to this increased capability is the rapid growth in SSD/flash bandwidth and
the associated compute requirements, resurrecting the question of how to distribute
compute across CPUs and storage resources.

In this dissertation, we address three fundamental questions in architecting
general-purpose computational storage.
First, what are the key opportunities for computational storage acceleration?
Second, what SSD architecture provides the most efficient support for computational storage?
Third, what computational storage processor architecture delivers performance that can keep pace with the continued rapid improvement in flash bandwidth?

Based on a broad survey of computational storage proposals and research,
we show that the function properties of vectorizability, data size change,
and offload direction determine the system efficiency and performance benefits
of computational SSD offloads, and thus determine offloading priority.
The first-priority functions share common properties, including streaming access
and variable-width values, that call for architectural support.
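
As a rough illustration only, the toy C sketch below scores hypothetical offload
candidates by those three properties; the field names, weights, and example
functions are assumptions made for exposition and are not taken from the
dissertation's actual survey methodology.

    /* Toy priority scoring of computational SSD offload candidates.
       All weights and example numbers are illustrative assumptions. */
    #include <stdio.h>

    typedef struct {
        const char *name;
        int vectorizable;   /* 1 if the function maps well onto SIMD lanes */
        double size_ratio;  /* output bytes / input bytes; < 1 means data reduction */
        int toward_host;    /* 1 if results flow SSD -> host (read-side offload) */
    } offload_candidate;

    /* Higher score = higher offload priority in this toy model: vectorizable,
       data-reducing, read-side functions save the most interconnect bandwidth. */
    static double priority(const offload_candidate *c) {
        double s = 0.0;
        if (c->vectorizable) s += 1.0;
        if (c->size_ratio < 1.0) s += 1.0 - c->size_ratio;
        if (c->toward_host) s += 0.5;
        return s;
    }

    int main(void) {
        offload_candidate scan = {"scan+filter", 1, 0.05, 1};
        offload_candidate unzip = {"decompress", 1, 3.0, 1};
        printf("%s: %.2f\n", scan.name, priority(&scan));
        printf("%s: %.2f\n", unzip.name, priority(&unzip));
        return 0;
    }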

Existing computational SSD architectures suffer from poor cost scaling.
This is exacerbated by the continued improvement in flash array bandwidth,
which creates an SSD DRAM bottleneck.
We propose the ASSASIN SSD architecture, which places a unified set of compute engines
between SSD DRAM and the flash array, eliminating the bottleneck by computing directly
on flash data streams through streambuffers.
ASSASIN thus delivers 1.5x - 2.4x speedup for various computational SSD offloads,
along with 2.0x better power efficiency and 3.2x better area efficiency.
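
In software terms, the stream-style computation described above can be sketched
as follows: each flash page is filtered out of a small buffer as it arrives, so
the data is never staged in SSD DRAM. This minimal C sketch assumes a 4 KB page,
a simulated flash source, and an equality filter; none of these are ASSASIN's
actual design parameters.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define PAGE_BYTES 4096
    #define NUM_PAGES  4

    /* Stand-in for the flash channel: in a real device pages would stream in
       from the flash array; here they are faked so the sketch runs. */
    static uint8_t flash[NUM_PAGES][PAGE_BYTES];

    static int next_flash_page(uint8_t page[PAGE_BYTES], size_t *cursor) {
        if (*cursor >= NUM_PAGES) return 0;
        memcpy(page, flash[(*cursor)++], PAGE_BYTES);
        return 1;
    }

    /* Count 32-bit records equal to 'key' as the data streams past; only one
       page is ever buffered, so the scanned object never touches SSD DRAM. */
    static size_t stream_filter_count(uint32_t key) {
        uint8_t page[PAGE_BYTES];
        size_t cursor = 0, hits = 0;
        while (next_flash_page(page, &cursor)) {
            uint32_t rec;
            for (size_t off = 0; off + sizeof rec <= PAGE_BYTES; off += sizeof rec) {
                memcpy(&rec, page + off, sizeof rec);
                if (rec == key) hits++;
            }
        }
        return hits;
    }

    int main(void) {
        uint32_t planted = 42;
        memset(flash, 0, sizeof flash);
        memcpy(flash[1] + 7 * sizeof planted, &planted, sizeof planted); /* one match */
        printf("matches: %zu\n", stream_filter_count(42));
        return 0;
    }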

Existing processor architectures suffer from low datapath efficiency when computing
on variable-width values because each value must be padded to 32/64 bits.
Variable-width values are central to coding and storage efficiency,
and thus critical for computational storage.
We propose VarVE, a vector instruction set architecture extension that provides
native vector support for variable-width values, computing on them directly without padding.
VarVE delivers 1.3x - 5.4x speedup over ARM's best current vector extension,
the Scalable Vector Extension (SVE), on popular file system and
database computational SSD kernels by achieving higher datapath efficiency.
VarVE builds on the vector-length agnostic (VLA) approach,
which is gaining widespread adoption.  As a result, 
VarVE has broad potential impact as a general SIMD extension for all processors,
increasing datapath efficiency and mitigating the SIMD instruction count explosion.
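
The scalar C sketch below illustrates the data-layout problem VarVE targets:
values narrower than a machine word (12 bits here, an assumed example width) can
be consumed straight from a bit-packed buffer rather than first being padded out
to 32/64-bit elements. VarVE does this in vector hardware; the sketch only shows
the packed access pattern, not the ISA extension itself.

    #include <stdint.h>
    #include <stdio.h>

    /* Sum 'count' unsigned 12-bit values packed back to back in 'buf'. The caller
       must keep buf readable through the 3-byte window covering the last value
       (the pad byte in the example below covers this). */
    static uint64_t sum_packed12(const uint8_t *buf, size_t count) {
        uint64_t sum = 0;
        for (size_t i = 0; i < count; i++) {
            size_t bit = i * 12;
            size_t byte = bit / 8;
            unsigned shift = (unsigned)(bit % 8);
            /* Read the 3 bytes covering the value, then mask to 12 bits. */
            uint32_t window = buf[byte]
                            | ((uint32_t)buf[byte + 1] << 8)
                            | ((uint32_t)buf[byte + 2] << 16);
            sum += (window >> shift) & 0xFFF;
        }
        return sum;
    }

    int main(void) {
        /* Two 12-bit values, 0x123 and 0x456, packed LSB-first, plus a pad byte. */
        uint8_t buf[4] = {0x23, 0x61, 0x45, 0x00};
        printf("sum = %llu\n", (unsigned long long)sum_packed12(buf, 2));
        return 0;
    }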

Advisor: Andrew Chien


