[Colloquium] Surendran/Dissertation Defense/11-7-07

Margaret Jaffey margaret at cs.uchicago.edu
Mon Oct 22 09:27:54 CDT 2007


		Department of Computer Science/The University of Chicago

				***  Dissertation Defense ***

Candidate:  Dinoj R. Surendran

Date:  Wednesday, November 7, 2007

Time and Location:  11:30 a.m. in Ryerson 277

Title:  Analysis and Recognition of Tones in Mandarin Chinese

Abstract:
In tonal languages, words are not simply defined by their phonemic
sequence, but also by the intonational pattern with which they are spoken.
In Mandarin Chinese, each word is a sequence of syllables, and each
syllable is a sequence of phonemes plus an intonational component
called a tone. Syllables can have one of five tones: high, rising,
low, falling, and neutral. The first four tones have distinct ideal
shapes, while the neutral tone is more of a 'none of the above' tone
and is notoriously difficult to recognize.

We first tackle the question of how important it is to recognize tones
in Mandarin Chinese.  We propose an information-theoretic measure to
compare the relative importance of phonological contrasts in any
language, and use it to show that tones are at least as important as
vowels in conveying information in Mandarin.

With the importance of the problem settled, we move on to a large and
thorough investigation of possible acoustic features to recognize
tones. We carry out hundreds of experiments, each of which involves
classifying over a hundred thousand syllables. This is at least an
order of magnitude larger than similar previous experiments.

Traditionally, features for Mandarin tone recognition have been based
on the pitch, duration, and overall intensity of a syllable, and we do
indeed find a set of features based on these that achieves an overall
syllable classification rate of 58.9%. This figure increases to 60.4%
when we add the effect of local acoustic context, and is a useful
baseline.

We investigate a fourth source of features: voice quality. We first
determine, using a small experiment with twenty possible voice quality
measures, that features based on band energy consistently work better
for tone recognition than those based on more complicated methods like
harmonic-amplitude differences and glottal flow experiments. We then
investigate band energy features using several large experiments
to find a set of features that improves classification accuracy to
63.7%. As we had hoped, most of the improvement is for neutral and low
tones; for example, the F score for Neutral Tone increases from 0.345
without band energy to 0.619 with it. This opens up a host of new
features for future speech researchers in industry and academia to
investigate and use.

We investigate making additional use of context: if we know the tones
of the surrounding syllables, we can increase classification accuracy
to 67.2%. (This provides a useful upper bound for our experiments, and
further underlines the significance of our improvements in accuracy.)
While we do not have such ideal contextual information, we can use
estimates of it to increase accuracy to 65.0%.

Finally, we investigate the hypothesis that syllables that are better
articulated are easier to recognize. We verify this on a small corpus
of lab speech from Xu (1999), where syllables in focused words are
recognized with over 99% accuracy, and we are able to use this to
improve classification accuracy of all syllables. However, in news
broadcast speech, we find that while
stronger syllables are recognized better, the difference is not enough
to suggest an algorithm that makes use of the difference.

Candidate's Advisor:  Prof. Gina-Anne Levow

A draft copy of Mr. Surendran's dissertation will be available soon
in Ry 156.


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Margaret P. Jaffey                             margaret at cs.uchicago.edu
Department of Computer Science
Student Support Rep (Ry 156)        (773) 702-6011
The University of Chicago                  http://www.cs.uchicago.edu
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=




