Plenary Lecture
Robust Speech Representations in Noisy Environments
Professor Tetsuya Shimamura
Saitama University
Japan
E-mail: shima@sie.ics.saitama-u.ac.jp
Abstract: Speech recognition is used in many
systems. In noise-free environments, high recognition
accuracy is achieved. In noisy environments, however,
performance commonly degrades. To improve the
performance of recognition systems in noisy
environments, we need a noise-robust speech
representation. Many methods have been proposed from
this point of view, but a truly robust speech
representation has yet to be found. Recently, Shannon and Paliwal
proposed the Autocorrelation Mel-frequency Cepstral
Coefficients (AMFCC) method, which uses the higher-lag
autocorrelation sequence as the input to Mel-frequency
filter bank analysis to obtain a Mel-frequency Cepstral
Coefficients (MFCC) spectral representation.
Recognizers that use this kind of spectral
representation performed better than those that apply
typical MFCC analysis directly to the speech signal.
In this plenary lecture, several robust speech
representations are discussed, including the MFCC
and AMFCC spectral representations. It is shown
that correlation functions and their modifications
serve as robust speech features. In addition, to
improve the performance of speech recognition in noisy
environments, two methods are derived that use the
autocorrelation and double autocorrelation sequences as
the input to Mel-frequency filter bank analysis to
obtain MFCC spectral features. A word recognition
experiment confirms that both proposed methods achieve
better results than conventional MFCC spectral
analysis of the input speech signal.
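The pipeline described in the abstract (autocorrelation sequence, then
Mel-frequency filter bank analysis, then cepstral coefficients) might be
sketched roughly as follows. This is an illustrative sketch only, not the
speaker's implementation. The function names, the frame length, the
cut-off lag (min_lag), the Hamming window, the filter count, and the
FFT size are all assumptions. In particular, "double autocorrelation" is
assumed here to mean the autocorrelation of the autocorrelation sequence
(passes=2), which the abstract does not spell out.

```python
import numpy as np

def autocorrelation(x):
    """Biased one-sided autocorrelation r[k] for k = 0 .. len(x) - 1."""
    n = len(x)
    return np.correlate(x, x, mode="full")[n - 1:] / n

def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_points = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, centre, right = bins[i], bins[i + 1], bins[i + 2]
        for b in range(left, centre):      # rising edge of the triangle
            bank[i, b] = (b - left) / max(centre - left, 1)
        for b in range(centre, right):     # falling edge of the triangle
            bank[i, b] = (right - b) / max(right - centre, 1)
    return bank

def dct_ii(v, n_coeffs):
    """First n_coeffs coefficients of an unnormalised DCT-II."""
    n = len(v)
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n)[None, :]
    return np.cos(np.pi * k * (2 * m + 1) / (2 * n)) @ v

def autocorr_mfcc(frame, sample_rate, min_lag=16, passes=1,
                  n_fft=512, n_filters=20, n_coeffs=13):
    """Cepstral features computed from an autocorrelation sequence.

    passes=1 uses the autocorrelation of the frame; passes=2 uses the
    autocorrelation of that sequence (assumed meaning of "double").
    Lags below min_lag are discarded (assumed cut-off value).
    """
    seq = np.asarray(frame, dtype=float)
    for _ in range(passes):
        seq = autocorrelation(seq)
    higher = seq[min_lag:] * np.hamming(len(seq) - min_lag)
    # Magnitude spectrum of the higher-lag sequence stands in for the
    # power spectrum that conventional MFCC takes from the signal itself.
    spectrum = np.abs(np.fft.rfft(higher, n_fft))
    energies = mel_filter_bank(n_filters, n_fft, sample_rate) @ spectrum
    return dct_ii(np.log(energies + 1e-10), n_coeffs)

# Hypothetical usage on a synthetic noisy tone (32 ms frame at 8 kHz).
rng = np.random.default_rng(0)
t = np.arange(256) / 8000.0
frame = np.sin(2 * np.pi * 440.0 * t) + 0.1 * rng.standard_normal(256)
features = autocorr_mfcc(frame, sample_rate=8000)
features_double = autocorr_mfcc(frame, sample_rate=8000, passes=2)
```

Discarding the lowest lags reflects the idea that broadband noise
contributes mostly near lag zero of the autocorrelation sequence, whereas
periodic speech components persist at higher lags. The particular cut-off
used here is a placeholder, not a value taken from the lecture.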
Brief Biography of the Speaker:
Tetsuya Shimamura received the B.E., M.E., and Ph.D.
degrees in electrical engineering from Keio University,
Yokohama, Japan, in 1986, 1988, and 1991, respectively.
In 1991, he joined Saitama University, Saitama City,
Japan, where he is currently a Professor. During this
period, he was a visiting Professor at Loughborough
University, UK, and The Queen’s University of Belfast,
UK, in 1995 and 1996, respectively. He is an author
or co-author of six books and a member of the organizing
committees of several international conferences. His
interests are in digital signal processing and its
applications to speech, image and communication systems.