[Colloquia] Talk by Anne Rogers, AT&T Labs - Thursday, November 29th

Margery Ishmael marge at cs.uchicago.edu
Wed Nov 14 11:39:36 CST 2001


Computer Science Department - Colloquium Announcement

Date: Thursday, November 29th
Time: 3:00 p.m.
Place: Ryerson Hall 251

Title: "Analyzing Transaction Streams with Hancock"

Speaker: Anne Rogers, AT&T Labs-Research

Abstract: A transaction data stream is a sequence of records that log
interactions between entities.  For example, a stream of stock market
transactions consists of buy/sell orders for particular companies from
individual investors.  Likewise, a stream of credit card transactions
contains records of purchases by consumers from merchants.  While
historically such data have been collected for billing or security
purposes, they are now being used by businesses to discover how
clients use the underlying services.  Storing transaction data in a
warehouse for later analysis is a standard approach to this problem,
but unfortunately, the sheer volume of data can make it difficult to
identify which entities are "interesting" for a given application.
"Where should data analysts choose to focus their attention?" is the
essential question.

For several years, statisticians at AT&T have computed evolving
profiles (called signatures) of the entities in transaction streams
using handwritten C code.  The signature for each entity captures the
salient features of the entity's transactions through time.  These
programs were carefully hand-optimized to ensure that the data could
be processed in a timely fashion.  They achieved the necessary
performance but at the expense of readability, which led to programs
that were difficult to verify and maintain.

Hancock is a domain-specific language created to analyze transaction
streams efficiently without sacrificing readability.  By design, the
language makes time- and space-efficient analysis programs easy to read
and write, independent of the quantity of data involved.  Because
Hancock manages the scaling issues, it allows data analysts to develop
new applications quickly.  In this talk, I will describe the obstacles
to computing with large streams and explain how Hancock addresses
these problems.  http://www.research.att.com/info/amr

Hancock is joint work with Corinna Cortes, Kathleen Fisher, Karin
Hogstedt, Daryl Pregibon, and Fred Smith.

Host: Stuart Kurtz

*The talk will be followed by refreshments in Ryerson 255*
Persons with disabilities who may need assistance, please call 773.834.8977

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Margery Ishmael
Secretary to the Chairman, Department of Computer Science
The University of Chicago
tel. 773.834.8977  fax. 773.702.8487



