[Colloquium] Towards Cost-Efficient Use of Pre-trained Models

Tracey Walton tywalton615 at uchicago.edu
Wed Nov 1 14:30:41 CDT 2023


Department of Computer Sciences Seminar

Associate Professor Alan Ritter
College of Computing at Georgia Tech


Friday, November 3

11:00 am - 12:00 pm

In-Person: TTIC 501


Title:

Towards Cost-Efficient Use of Pre-trained Models


Abstract:

Large language models are leading to many exciting breakthroughs, but this comes at a significant cost in terms of both computational and data labeling expenses. Training state-of-the-art models requires access to high-end GPUs for pre-training and inference, in addition to labeled data for fine-tuning. In this talk I will examine the tradeoff between these costs, with the goal of supporting better decisions. Conventional wisdom holds that annotating data is expensive, so computational methods that use unlabeled data to improve performance can present an economical alternative. I will examine this assumption in the context of pretraining-based adaptation, which requires significant computation for each new domain. As a second example where the tradeoff between computation and annotation arises, I will show that training and then distilling large models can be an economical strategy for improving performance. Finally, I will discuss applications to chemical synthesis protocols, and show a demo of a system that can help chemists more efficiently find experimental conditions described in the literature. I will also present a new approach to extracting data from tables in scientific articles where the only supervision provided to the model is a database schema, eliminating the need for labeled data or custom data extraction pipelines.


Bio:

Alan Ritter is an associate professor in the College of Computing at Georgia Tech. His research on natural language processing aims to solve technical challenges that help machines read the web and engage in safe and helpful dialogue with people. In a recent project, covered by WIRED (https://www.wired.com/story/machine-learning-tweets-critical-security-flaws/), Alan's group built a system that reads millions of online messages for mentions of new software vulnerabilities. He completed his Ph.D. at the University of Washington and was a postdoctoral fellow in the Machine Learning Department at Carnegie Mellon. Alan is the recipient of an NSF CAREER award and an Amazon Research Award.



Alan will have some time to meet with faculty and students virtually after the talk. Please sign up here if you are interested in meeting him:

Alan Ritter UChicago/TTIC Seminar Schedule <https://docs.google.com/spreadsheets/d/1jdBq2e1caPlMsz8QABs0yoGjPNGY28xSRs6Aw0X3xlk/edit?usp=sharing>


Please join us in TTIC 501 from 11:00 am to 12:00 pm on Friday (you can also join on Zoom <https://uchicago.zoom.us/j/98297764499?pwd=ajNQSTZnMHRmMENkd1hjdjlNeW1xdz09>).





Tracey Walton
Business Assistant
University of Chicago
5730 South Ellis JCL 212
Tywalton615 at uchicago.edu
(773)702-0723
