[Colloquium] Reminder - Adrian Lehmann MS Presentation/Mar 31, 2023

Megan Woodward meganwoodward at uchicago.edu
Fri Mar 31 08:50:50 CDT 2023


This is an announcement of Adrian Lehmann's MS Presentation
===============================================
Candidate: Adrian Lehmann

Date: Friday, March 31, 2023

Time: 2 pm CDT

Location: JCL 390

M.S. Paper Title: Automatically parallelizing Diderot programs on CUDA targets

Abstract: Diderot is a domain-specific language for scientific visualization.
Its programs largely follow the bulk-synchronous parallel (BSP) model. In this pattern, multiple strands (often also called threads) each run one update step in isolation, followed by a single global reduction step (similar to MapReduce).
Currently, a compiler exists that transforms Diderot programs, along with the domain-specific operations, into C++.
The compiler supports targeting both sequential and parallel CPU execution models.
However, given the programming model's parallel nature, adding GPU support to Diderot's compiler is a natural step.
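The pattern described above can be sketched in plain C++ as follows. This is only an illustration of the update-then-reduce structure, not code produced by the Diderot compiler; the Strand type, its fields, and the functions update, super_step, and run_bsp are hypothetical names chosen for this sketch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical strand state: a value that is halved until it drops below 1.
struct Strand {
    double value;
    bool active;
};

// Update step: a strand touches only its own state (runs "in isolation").
void update(Strand& s) {
    s.value *= 0.5;
    if (s.value < 1.0) {
        s.active = false;
    }
}

// One super-step: update every active strand, then do one global reduction.
// Returns the number of strands that are still active afterwards.
std::size_t super_step(std::vector<Strand>& strands) {
    // On a parallel target, each iteration of this loop would run concurrently.
    for (Strand& s : strands) {
        if (s.active) {
            update(s);
        }
    }
    // Global reduction: count the remaining active strands.
    std::size_t active = 0;
    for (const Strand& s : strands) {
        if (s.active) {
            ++active;
        }
    }
    return active;
}

// Repeat super-steps until no strand is active; returns the number of steps.
int run_bsp(std::vector<Strand>& strands) {
    int steps = 0;
    std::size_t active = 0;
    do {
        active = super_step(strands);
        ++steps;
    } while (active > 0);
    return steps;
}
```

Because the update loop has no dependencies between iterations, it is the part that maps naturally onto parallel hardware; the reduction is the only point where strands synchronize.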

Our work fills this gap.
We add support for automatically parallelizing Diderot applications by modifying the compiler to be able to generate CUDA code.
We propose three strategies to schedule CUDA threads: one that closely follows the BSP model, one that runs strands to completion (assuming no reduction steps are needed), and one that builds on a work queue.
We also propose a permutation mechanism for stochastic load distribution to mitigate strand divergence.
In addition, we create variants that use CUDA unified memory, a managed-memory feature that migrates memory pages between system and GPU memory on demand.
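One possible reading of the permutation idea is sketched below in C++: strand indices are shuffled once, and worker i then processes strand perm[i], so that expensive strands that are neighbors in the original order are likely to be spread across different groups of threads. The abstract does not describe the actual mechanism, so this is an assumption; the function name make_strand_permutation and the seed parameter are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Build a random permutation of the strand indices 0..n-1.
// Worker i would then process strand perm[i] instead of strand i.
std::vector<std::size_t> make_strand_permutation(std::size_t n, unsigned seed) {
    std::vector<std::size_t> perm(n);
    std::iota(perm.begin(), perm.end(), std::size_t{0});  // 0, 1, ..., n-1
    std::mt19937 rng(seed);                               // seeded for repeatability
    std::shuffle(perm.begin(), perm.end(), rng);
    return perm;
}
```

Whether such a shuffle helps depends on the workload: it can balance load when costly strands are clustered, but it can also break up groups of similar strands that would otherwise execute without divergence.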

In benchmarks, we see speedups of 60-500x, with the queue-based approach outperforming the others.
Further, the relative performance of our approaches varies between benchmarks.
We observe that the performance of the permutation mechanism depends strongly on the benchmark structure and the homogeneity of strand execution.
Furthermore, we conclude that in our tests, CUDA unified memory leads to a significant performance penalty for benchmarks with fewer strands while greatly simplifying the generated code.

Advisor: John Reppy

Committee Members: John Reppy, Hank Hoffmann, and Ravi Chugh

