[Colloquium] Xie/Dissertation Defense/Sept. 12, 2006
Margaret Jaffey
margaret at cs.uchicago.edu
Tue Aug 29 11:54:59 CDT 2006
Department of Computer Science / The University of Chicago
*** Dissertation Defense ***
Candidate: Zhimin Xie
Date: Tuesday, September 12, 2006
Time and Location: 2:00 p.m. in Ryerson 251
Title: Acoustic-Based Broad Class Phoneme Detection
Abstract:
We categorize American English phonemes into several groups: vowel,
semi-vowel, nasal, whisper, fricative/affricate, closure/stop, silence,
and some special phonemes (/q/ and /dx/), among which five main groups
(vowel, semi-vowel, nasal, fricative, stop) are further examined.
We then construct several detectors based on acoustic features for
each phoneme group and compare them with HMM-based systems by testing
on the TIMIT continuous speech corpus.
To detect vowels, a compact vowel detector based on only two acoustic
features, periodicity and energy, is implemented. It achieves 86.8%
accuracy with a 22.4% total error rate, and it remains stable even in
some adverse environments. To detect fricatives, several SVM-based
detectors using different acoustic features are constructed; a typical
one achieves 90.6% accuracy with a 24.8% total error rate. For stops,
total energy, energy above 3 kHz, and Wiener entropy are used as SVM
features, and the detector achieves 93.2% accuracy with a 19.6% total
error rate. All of these results are comparable to or better than
those of HMM-based systems.
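As a rough illustration only (not taken from the dissertation), the
three stop features named in the abstract could be computed for a
single frame along the following lines. The function names, the 16 kHz
sampling rate, the frame length, and the choice of Wiener entropy as
the log ratio of geometric to arithmetic mean of the power spectrum are
all assumptions made for this sketch.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def stop_features(frame, sample_rate=16000, cutoff_hz=3000.0, eps=1e-12):
    """Return (total energy, energy above cutoff, Wiener entropy) for a frame.

    Hypothetical helper: the exact feature definitions used in the
    dissertation are not given in the abstract.
    """
    power = power_spectrum(frame)
    bin_hz = sample_rate / len(frame)  # frequency spacing between DFT bins

    total_energy = sum(power)
    high_energy = sum(p for k, p in enumerate(power) if k * bin_hz > cutoff_hz)

    # Wiener entropy (assumed definition): log(geometric mean / arithmetic
    # mean). It is near 0 for a flat, noise-like spectrum and strongly
    # negative for a spectrum dominated by a few peaks.
    log_geometric = sum(math.log(p + eps) for p in power) / len(power)
    arithmetic = total_energy / len(power)
    wiener_entropy = log_geometric - math.log(arithmetic + eps)

    return total_energy, high_energy, wiener_entropy
```

In a setup like the one described, such a feature triple would be
computed per frame and passed to an SVM; the classifier itself is
omitted here.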
However, detectors based on static acoustic features for nasals and
semi-vowels do not perform as well as expected. Examining the errors in
detail reveals the underlying detection problems and motivates a new
approach to detection.
Hence, we propose a combination of HMMs and SVMs for detecting
phoneme groups and obtain satisfactory results. We believe this method
can also be extended to more general speech recognition applications.
Candidate's Advisor: Prof. Partha Niyogi
A draft copy of Mr. Xie's dissertation is available in Ry 161A.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Margaret P. Jaffey margaret at cs.uchicago.edu
Department of Computer Science
Student Support Rep (Ry 161A) (773) 702-6011
The University of Chicago http://www.cs.uchicago.edu
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=