[Colloquium] [statseminars] DATE CHANGE! Aryeh Kontorovich (Weizmann Institute), TTI-C Talk

Wed May 21 12:00:33 CDT 2008

***DATE CHANGE:  This talk has been rescheduled to next Wednesday May 28
@11:00am.***

  _____  

When:             Thursday, May 22 @ 11:00am

                        Rescheduled to Wednesday, May 28 @11:00am

Where:            TTI-C Conference Room, 1427 E. 60th St., 2nd Floor

Who:                Aryeh Kontorovich, Weizmann Institute of Science

Topic:              A Universal Kernel for Learning Regular Languages

We develop a novel approach to the classical problem of learning regular
languages from labeled samples. Rather than attempting to construct small
consistent automata (long known to be a very difficult computational
problem), we embed the strings in a Hilbert space and compute a
maximum-margin hyperplane, which becomes our classifier for new strings. We
accomplish this via a universal kernel that renders all regular languages
linearly separable. Under this kernel, the image of every regular language
is linearly separable from its complement in some finite-dimensional space
with a strictly positive margin. Thus, we are able to efficiently (in sample
size) compute a maximum-margin separating hyperplane (via SVM, for example)
and use margin bounds to control the generalization error.

A brute-force computation of this universal kernel has super-exponential
complexity. We conjecture that this problem is intractable (a likely
candidate for #P-complete). However, we propose a simple randomized scheme
for efficiently obtaining an $\eps$-approximation to our universal kernel.
We show that the approximate kernel preserves the distances and margins with
low distortion, and therefore may be used as a surrogate for the original
one.

To our knowledge, the framework we propose is the only one capable of
inducing unrestricted regular languages from labeled samples (modulo
standard cryptographic limitations). Along the way, we touch upon several
fundamental questions in complexity, automata, and machine learning.

Join work with Boaz Nadler.

Contact:          Nati Srebro, TTI-C         nati at tti-c.org
834-7493

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.cs.uchicago.edu/pipermail/colloquium/attachments/20080521/3b0088f6/attachment-0003.html