[Colloquium] Re: Jin Jin Zhao MS Presentation/May 25, 2022

Jin Jin Zhao j2zhao at uchicago.edu
Wed May 25 10:49:40 CDT 2022


Here is the Zoom link for the presentation: https://uchicago.zoom.us/j/94379397186?pwd=aWxyVHgvVnBtK2ZkTzY1WTZPa2NxQT09

Thank you to everyone who asked!

Jinjin

From: cs <cs-bounces+j2zhao=cs.uchicago.edu at mailman.cs.uchicago.edu> on behalf of Megan Woodward <meganwoodward at uchicago.edu>
Date: Tuesday, May 24, 2022 at 9:03 AM
To: cs at cs.uchicago.edu <cs at cs.uchicago.edu>, colloquium at cs.uchicago.edu <colloquium at cs.uchicago.edu>
Subject: [CS] Jin Jin Zhao MS Presentation/May 25, 2022
This is an announcement of Jin Jin Zhao's MS Presentation
===============================================
Candidate: Jin Jin Zhao

Date: Wednesday, May 25, 2022

Time: 3 pm CST

Location: JCL 298

M.S. Paper Title: AUTOMATED PROVENANCE CAPTURE IN ARRAY-PROGRAMMING FRAMEWORKS

Abstract: This paper presents DSLog, a system that efficiently captures and represents fine-grained data
provenance in array-programming frameworks for black-box functions. It uses a technique
called annotated execution to capture “physical” provenance automatically, without user
specification. We describe a low-level implementation that handles arrays of
100 million cells and more. This implementation also improves capture performance by up to 34x over a high-level
baseline. Additionally, we contribute a new compression algorithm, named ProvRC, that compresses the captured
provenance relations. We show that ProvRC yields a significant storage reduction for functions
with simple spatial regularity, beating alternative baselines by many orders of magnitude.
Finally, we present the concepts of dimensional and generalized views over these compressed
relational representations, which allow DSLog to recognize previously seen functions (given
only input array dimension information, and no input array information, respectively) and
re-use pre-existing materialized provenance views. We demonstrate that these views cover
92% and 73%, respectively, of 136 tested numpy functions, and preliminary results show that
using the views yields a marked improvement over purely naive annotated execution.

Advisor: Sanjay Krishnan

Committee Members: Sanjay Krishnan, Raul Castro Fernandez, Blase Ur, and Nick Feamster

