<div dir="ltr"><div><div><b><font size="4">Michael Cafarella</font></b></div><div><b>Associate Professor of Computer Science and Engineering</b></div><div><b>University of Michigan</b></div></div><div><b><br></b></div><div><b style="color:rgb(33,33,33);font-size:large">Data-Intensive Systems for the Social Sciences</b><b><br></b></div><div><b><br></b></div><div><div><b>Friday, May 3, 2019 @ 12:00pm</b></div><div><b><a href="https://goo.gl/maps/5h7saQhVerqW8gYA9">John Crerar Library</a>, Room 390</b></div></div><div><b>Lunch Provided</b></div><div><div><br></div><div><b>Abstract:</b></div><div style="font-size:13px"><span style="color:rgb(33,33,33)">The social sciences are crucial for deciding billions in spending, and yet are often starved for data and badly underserved by modern computational tools. Building data-intensive systems for social science workloads holds the promise of enabling exciting discoveries in both computational and domain-specific fields, while also making an outsized real-world impact.</span></div><div style="font-size:13px"><br></div><div style="font-size:13px"><span style="color:rgb(33,33,33)">This talk will describe two data systems for the social sciences. The first is RaccoonDB, a declarative nowcasting data management system, which enables users to predict real-world time-series phenomena from social media signals. RaccoonDB’s novel query optimization methods allow it to generate useful social science predictions 123 times faster than competing systems, using just 10% of the computational resources. When applied to unemployment phenomena, the system yields predictions with accuracy that is comparable to predictions from real-world economists.</span></div><div style="font-size:13px"><br></div><div style="font-size:13px"><span style="color:rgb(33,33,33)">The second system is an information extraction system designed to analyze online text and help law enforcement officers identify potential human trafficking victims. 
This system has been successfully applied to real-world cases. In addition, the resulting extracted dataset enables several novel social science findings about behavior in an illicit and often opaque market.</span></div><div><br></div><div><b>Bio:</b></div><div style="font-size:13px"><span style="color:rgb(33,33,33)">Michael Cafarella is an Associate Professor of Computer Science and Engineering at the University of Michigan. His research interests include databases, information extraction, data integration, and data mining. He has published extensively in venues such as SIGMOD and VLDB. Mike received his PhD from the University of Washington in 2009 with advisors Oren Etzioni and Dan Suciu. His academic awards include the NSF CAREER award, the Sloan Research Fellowship, and the VLDB Test of Time Award. In addition to his academic work, Mike cofounded (with Doug Cutting) the Hadoop open-source project. In 2015 he cofounded (with Chris Re and Feng Niu) Lattice Data, Inc., which is now part of Apple.</span></div></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><i style="font-size:12.8px">Rob Mitchum</i></div><div dir="ltr"><i>Associate Director of Communications for Data Science and Computing<br></i><div style="font-size:12.8px"><i style="font-size:12.8px">University of Chicago</i><br></div><div style="font-size:12.8px"><i style="font-size:12.8px"><a href="mailto:rmitchum@ci.uchicago.edu" target="_blank">rmitchum@uchicago.edu</a></i><br></div><div style="font-size:12.8px"><i style="font-size:12.8px">773-484-9890</i><br></div></div></div></div></div></div></div></div></div>