[Colloquium] This afternoon: Xie/Dissertation Defense/Sept. 12, 2006

Margaret Jaffey margaret at cs.uchicago.edu
Tue Sep 12 13:48:32 CDT 2006


This is a reminder about Zhimin Xie's dissertation defense at 2:00  
this afternoon.  Please note: the title of his dissertation has been  
slightly revised.  It is now Broad Class Phoneme Detection (formerly  
Acoustic-Based Broad Class Phoneme Detection).

------
		Department of Computer Science / The University of Chicago

					*** Dissertation Defense ***


Candidate:  Zhimin Xie

Date:  Tuesday, September 12, 2006

Time and Location:  2:00 p.m. in Ryerson 251

Title:  Broad Class Phoneme Detection

Abstract:
We categorize American English phonemes into several groups: vowel,
semi-vowel, nasal, whisper, fricative/affricate, closure/stop, silence,
and some special phonemes (/q/ and /dx/). Of these, we examine five main
groups (vowel, semi-vowel, nasal, fricative, stop) in detail. We then
construct several detectors based on acoustic features for each phoneme
group and compare them with HMM-based systems by testing on the TIMIT
continuous speech corpus.

To detect vowels, we implement a compact vowel detector based on only two
acoustic features, periodicity and energy. It achieves 86.8% accuracy with
a 22.4% total error rate, and it remains stable even in some adverse
environments. To detect fricatives, we construct several SVM-based
detectors using different acoustic features; a typical one achieves 90.6%
accuracy with a 24.8% total error rate. For stops, total energy, energy
above 3 kHz, and Wiener entropy are used as SVM features, and the detector
achieves 93.2% accuracy with a 19.6% total error rate. All of these
results are comparable to or better than those of HMM-based systems.

However, detectors based on static acoustic features for nasals and
semi-vowels do not perform as well as expected. A detailed examination of
the errors reveals the underlying detection problems, which in turn
inspires a new approach to detection.

Hence, we propose a combination of HMMs and SVMs for detection of
phoneme groups and obtain satisfactory results. We believe this method
can also be extended to more general speech recognition applications.

Candidate's Advisor: Prof. Partha Niyogi

A draft copy of Mr. Xie's dissertation is available in Ry 161A.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Margaret P. Jaffey                             margaret at cs.uchicago.edu
Department of Computer Science
Student Support Rep (Ry 161A)        (773) 702-6011
The University of Chicago                  http://www.cs.uchicago.edu
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=



