[Colloquium] Monday: Yu Hu/Dissertation Defense/3-26-07
Margaret Jaffey
margaret at cs.uchicago.edu
Fri Mar 23 13:34:54 CDT 2007
This is a reminder about Yu Hu's dissertation defense that will be
held on Monday.
Department of Computer Science/The University of Chicago
*** Dissertation Defense ***
Candidate: Yu Hu
Date: Monday, March 26, 2007
Time and Location: 2:30 p.m. in Ryerson 276
Title: Topics in Unsupervised Language Learning
Abstract:
Language learning is one of the most complex and challenging
problems in Artificial Intelligence. Natural languages have several
distinct components, each of which presents a difficult learning
challenge. These include phonetics and phonology, morphology, syntax,
semantics, pragmatics, and discourse. The focus of this dissertation
is natural language morphology, the study of the internal structure
of words, and in particular methods for automatically learning
morphological structure. Much as sentences are composed of structured
sequences of words, words are composed of structured sequences of
morphemes. In some languages, the morphological structure is complex;
in others, it is relatively simple.
Based on how much explicit human analysis is integrated into the
learning procedure, language learning can be categorized as supervised
or unsupervised. Unsupervised learning is the focus of this thesis:
the language parameters in our models are learned with little or no
active human participation and are instead induced by a system that is
language-independent. Whatever is different about two languages must be
inferred by the learning algorithm.
The principal problems addressed in this dissertation are the
following: automatic identification of the morphemes of a language on
the basis of a simple text; finding the best analysis of words into
morphemes in a language-independent way; automatic discovery of
related forms of a morpheme (known as the problem of “allomorphy”);
and how the task of machine translation from one language to another
can be improved by knowledge of the morphologies of the source and
target languages, and how machine translation can in turn improve the
analysis of morphology.
Much of the work in this dissertation uses Minimum Description Length
analysis, which involves finding the least complex way to describe a
finite-state automaton that generates the observed data and assigns it
a (relatively) high probability.
Candidate's Advisor: Prof. John Goldsmith
A draft copy of Mr. Hu's dissertation is available in Ry 161A.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Margaret P. Jaffey margaret at cs.uchicago.edu
Department of Computer Science
Student Support Rep (Ry 161A) (773) 702-6011
The University of Chicago http://www.cs.uchicago.edu
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=