[CS] Andronicus Samsundar Rajasukumar Candidacy Exam/Oct 22, 2025

Tue Oct 7 09:19:59 CDT 2025

This is an announcement of Andronicus Samsundar Rajasukumar's Candidacy Exam.
===============================================
Candidate: Andronicus Samsundar Rajasukumar

Date: Wednesday, October 22, 2025

Time:  8:30am CDT

Remote Location:  https://uchicago.zoom.us/j/98393108953?pwd=12XSAbaoaNx956b3O8BhrfGwE95BjW.1   Meeting ID: 983 9310 8953, Passcode: 340952

Location: JCL 298

Title: UpDown: An Effective, efficient and scalable Processing Near Memory architecture for irregular applications

Abstract: The significant growth in the scale of data processing has made the large on-chip caches (∼0.5 GB) in modern CPUs ineffective at preventing long-latency off-chip DRAM accesses and easing data-movement bottlenecks. Consequently, there has been a push for rapid innovation to find alternatives to the traditional Von Neumann architecture. Processing Near Memory (PNM) architectures push compute logic into proximity with DRAM. Their cores have direct access to data in DRAM sub-arrays (ranks, banks, channels) without traversing the CPU caches. This allows these cores to utilize the higher parallelism (and bandwidth) available at the bank/rank/channel interfaces. Some application-specific PNMs have shown benefits in narrow domains where bulk memory operations are accelerated on dense, regular data.

However, for the broader class of non-regular, non-dense applications, PNMs face a number of issues. First, they lack access to all memory due to local address spaces. Second, there is a lack of interaction with coherence and consistency mechanisms, leading to inefficient workarounds. Third, there is a lack of efficient communication mechanisms for synchronization and coordination between all the PNM cores in a system. These issues make data movement and synchronization ineffective and inefficient on PNMs, resulting in poor performance scaling, especially on irregular applications. In addition, local address spaces, explicit data movement, and the lack of interaction with coherence make programming PNMs extremely challenging.

We propose UpDown: a general-purpose Processing Near Memory architecture that attempts to address these issues with four novel features: Scalable-Memory Level Parallelism, Software Controlled Locality, Scalable Message-Driven Synchronization, and Fine-Grained Computations. In this proposal, we describe the innovative architectural mechanisms that enable these features. We also describe how these mechanisms can be implemented in a simple in-order core to create an efficient and scalable building block for large-scale PNMs. We outline our research plan to evaluate the impact of these mechanisms and features using a set of irregular and regular applications, and to compare the performance and cost (area, energy, and power) with traditional CPU architectures and current PNMs. We will demonstrate that across these applications, UpDown can achieve efficient, scalable data movement and synchronization. With flexible and general programmability, we also expect to show that UpDown can achieve overall performance benefits on these applications compared to traditional multi-core CPUs (> 100x) and current PNMs (> 4x).

Advisors: Andrew Chien

Committee Members: Andrew Chien, Valerie E. Taylor and Fred Chong