[Colloquium] Gabriel Mersy MS Presentation/Apr 29, 2024

Devin Davis devind at uchicago.edu
Fri Apr 26 08:59:52 CDT 2024


This is an announcement of Gabriel Mersy's MS Presentation
===============================================
Candidate: Gabriel Mersy

Date: Monday, April 29, 2024

Time: 10:30 am CT

Location: JCL 298

Title: Optimizing Collections of Bloom Filters within a Space Budget

Abstract: With a single Bloom filter, one can approximately answer set membership queries within a space budget. Practical systems often use collections of Bloom filters to facilitate applications such as data skipping, sideways information passing, and network filtering. While the optimal space-to-accuracy allocation is well-understood for a single filter, jointly optimizing how space is used across a collection of filters is yet to be studied. We pose this problem in the following way: (1) let's assume that each Bloom filter has some likelihood of being queried, and (2) given knowledge of this likelihood, how do we allocate space to minimize the expected false positive rate? In other words, “hot” filters are allocated more space, and “cold” filters are allocated less space. In this paper, we show how to solve this optimization problem. We first develop the concept of a “truncated” Bloom filter and theoretically analyze its false positive rate. We then formulate an optimization problem for a collection of truncated Bloom filters that minimizes the false positive rate across a utility distribution while meeting a strict space budget. Next, we show that the problem is convex and find a fast relaxation. Lastly, we apply our method to data skipping and full-text search, demonstrating its effectiveness across the range of possible space budgets when compared to the state of the art.

Advisors: Sanjay Krishnan

Committee Members: Aaron Elmore, Sanjay Krishnan, and Raul Castro Fernandez

_______________________________________________
Colloquium mailing list  -  Colloquium at mailman.cs.uchicago.edu<mailto:Colloquium at mailman.cs.uchicago.edu>
https://mailman.cs.uchicago.edu/mailman/listinfo/colloquium
_______________________________________________
staff mailing list  -  staff at mailman.cs.uchicago.edu<mailto:staff at mailman.cs.uchicago.edu>
https://mailman.cs.uchicago.edu/mailman/listinfo/staff
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.cs.uchicago.edu/pipermail/colloquium/attachments/20240426/c29f96c1/attachment.html>


More information about the Colloquium mailing list