Marie Roch Colloquium, Nov. 13

Fri Nov 10 10:49:31 CST 2000

Department of Computer Science Sound Seminar Series
Monday, November 13, 2000, 2:30 PM, Ryerson 276

        Robust hidden Markov model classification schemes for
              speaker recognition using integral decode

                              Marie Roch
                   Florida International University

Text-independent speaker identification is the task of automatically
determining a speaker's identity from a short segment of their speech,
regardless of its content.  Due to issues of scalability, most
state-of-the-art speaker identification systems construct an
individual model for each speaker and then perform identification by
evaluating the likelihood of test speech against each model.  The
class decision is based upon the maximum a posteriori (MAP) decision
rule.
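
As an illustration of the per-speaker scoring and MAP decision just
described, the following is a minimal sketch only.  It assumes each
speaker is modeled by a single diagonal-covariance Gaussian rather
than a hidden Markov model, and the speaker names, feature values, and
function names are hypothetical.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def identify_speaker(frames, models, log_priors=None):
    """Return the speaker maximizing log prior + log likelihood (MAP rule)."""
    scores = {}
    for speaker, (mean, var) in models.items():
        # Frames are treated as independent, so log likelihoods add.
        log_lik = sum(log_gaussian(x, mean, var) for x in frames)
        # With no priors supplied, all speakers are assumed equally likely.
        prior = 0.0 if log_priors is None else log_priors[speaker]
        scores[speaker] = prior + log_lik
    return max(scores, key=scores.get), scores

# Hypothetical two-speaker example with 2-dimensional feature vectors.
models = {
    "alice": ([0.0, 0.0], [1.0, 1.0]),
    "bob": ([3.0, 3.0], [1.0, 1.0]),
}
frames = [[0.2, -0.1], [0.5, 0.3], [-0.4, 0.1]]
best, scores = identify_speaker(frames, models)
```

With equal priors the rule reduces to picking the model with the
highest likelihood for the test frames.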

The MAP decision rule is an optimal decision rule when accurate models
of the distribution are known and the measurements of the utterance
being classified are accurate.  Unfortunately, for most types of
speaker recognition, neither is the case.  The measured feature
vectors are subject to corruption from transducer, channel, and
quantization effects as well as environmental noise.  Any of these can
contribute to error in both model estimation and testing.  In addition
to the effects of measurement error and environmental noise, there are
likely to be inaccuracies in the model due to insufficient (or even
absent) training data for certain phones, transient speaker conditions
such as nasal congestion, and long-term evolution of the speaker's
voice.

In this talk, a method of integration about local neighborhoods in the
feature space is proposed to counterbalance the observation errors,
resulting in a reduction of the classification error rate.  As the
integration occurs in a high dimensional space, techniques for
optimization are also discussed.
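
The idea of scoring a neighborhood rather than a single point can be
sketched as follows.  This is not the speaker's method, whose details
the abstract does not give.  It assumes diagonal-covariance Gaussian
speaker models and an axis-aligned cube of hypothetical half-width
`radius` around each observation, in which case the integral
factorizes into one-dimensional terms with a closed form.

```python
import math

def log_neighborhood_mass(x, mean, var, radius):
    """Log probability mass of a diagonal Gaussian inside the axis-aligned
    cube of half-width `radius` centered on the observation x."""
    total = 0.0
    for xi, m, v in zip(x, mean, var):
        scale = math.sqrt(2.0 * v)
        upper = math.erf((xi + radius - m) / scale)
        lower = math.erf((xi - radius - m) / scale)
        mass = 0.5 * (upper - lower)  # Gaussian mass on [xi - r, xi + r]
        total += math.log(max(mass, 1e-300))  # floor avoids log(0) in the tails
    return total

def identify_speaker_integrated(frames, models, radius):
    """Pick the speaker whose model puts the most mass near the frames."""
    scores = {
        speaker: sum(log_neighborhood_mass(x, mean, var, radius) for x in frames)
        for speaker, (mean, var) in models.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical two-speaker example with 2-dimensional feature vectors.
models = {
    "alice": ([0.0, 0.0], [1.0, 1.0]),
    "bob": ([3.0, 3.0], [1.0, 1.0]),
}
frames = [[0.2, -0.1], [0.5, 0.3], [1.0, 0.8]]
best, scores = identify_speaker_integrated(frames, models, radius=0.5)
```

Because every model is integrated over the same cube, the cube's
volume is a constant across speakers and does not affect the decision.
For models whose integral does not factorize, a numerical scheme would
be needed, and its cost grows quickly with the dimension.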