[Colloquium] Fang/MS Presentation/Nov 3, 2015

Mon Oct 19 14:59:14 CDT 2015

This is an announcement of Yuanwei Kevin Fang's MS Presentation.

------------------------------------------------------------------------------
Date:  Tuesday, November 3, 2015

Time:  8:30 AM

Place:  Ryerson 276

M.S. Candidate:  Yuanwei Fang

M.S. Paper Title: EFFCLIP + UAP: UNIFIED, EFFICIENT REPRESENTATION AND
ARCHITECTURE FOR AUTOMATA PROCESSING

Abstract:
Finite automata models are a fundamental computing abstraction for
algorithms and solutions to a growing range of problems
(bioinformatics, text search, ranking, compression, and network
monitoring). Consequently there have been two vibrant threads of
invention - new finite automata models and novel implementations - but
so far there is no accepted software-hardware interface, a computer
architecture, to separate and thereby accelerate progress in both
software and hardware.

We propose the Unified Automata Processor (UAP), a new architecture
that provides general and efficient support for finite automata (FA).
We also propose a novel automata transition packing approach used by
UAP,“efficient coupled-linear packing” (EffCLiP), that optimizes both
finite-automata size and performance. EffCLiP + UAP enables high
performance flexible automata processing. They enable new automata to
be created and deployed as software exploiting integration of high
performance automata processing on FA accelerated CPU, instead of
requiring a new ASIC/FPGA. This technique elevates novel automata
innovation for future hardware platform. EffCLiP employs a novel
transition representation that enables a simple addressing operator
(integer addition) while providing flexibility to support transition
packing. With EffCLiP, the UAP supports a wide range of existing
finite automata models (DFAs, NFAs, A-DFAs, JFAs, counting-DFAs, and
counting-NFAs), and future novel FA models. Evaluation on realistic
workloads shows that UAP implements all models efficiently, achieving
line rates of 94.5% of ideal. A single UAP lane delivers line rates
50x greater than software approaches on CPUs and GPUs. Scaling UAP to
64 lanes achieves FA transition throughputs as high as 295 Gbps, more
than 100x higher than CPUs and GPUs, and even exceeding ASIC
approaches such as IBM’s RegX by 6x. With efficient support for
multiple input streams, UAP achieves throughputs that saturate even
high speed stacked DRAM memory systems.

Evaluation across a variety of widely-used finite automata workloads
and models show that EffCLiP achieves compression rates as high as
8.4x and packing densities of 77% for DFA, 2.24x and >99% for A-DFA,
3.9x and >99% for NFA while supporting high speed and simple
implementation. We implemented the UAP design with EffCLiP, whose
implementation is efficient in instructions executed, memory
references, and energy. UAP achieves a 1.2 Ghz clock rate (32nm TSMC
CMOS), and a 64-lane UAP requires only 2.2 mm2 and 563 mW. At these
levels of performance and energy-efficiency, UAP is a promising
candidate for integration into general-purpose computing
architectures.

Yuanwei's advisor is Prof. Andrew Chien

Login to the Computer Science Department website for details:
 https://www.cs.uchicago.edu/phd/ms_announcements#fywkevin

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Margaret P. Jaffey            margaret at cs.uchicago.edu
Department of Computer Science
Student Support Rep (Ry 156)               (773) 702-6011
The University of Chicago      http://www.cs.uchicago.edu
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=