Plenary Lecture
Facial Expression Recognition for Speaker Using Thermal
Image Processing and Speech Recognition System
Professor Yasunari Yoshitomi
Information Communication System Lab.
Graduate School of Life and Environmental Sciences
Kyoto Prefectural University
JAPAN
E-mail: yoshitomi@kpu.ac.jp
Abstract: The goal of our research is to develop a
robot which can perceive human feelings or mental
states. The robot should be able to interact in a
friendly manner with a human. For example, it could
encourage a person who looks sad, advise a person to
stop working and rest for a while when the individual
looks tired, or take care of an elderly person.
The presented investigation concerns the first stage of
development, wherein a robot acquires vision with the
ability to detect human feelings or inner mental states.
Although the mechanism for recognizing facial
expressions, as one of the main visible expressions of
feeling, has received considerable attention in computer
vision research, its present stage still falls far short
of human capability, especially from the viewpoint of
robustness under widely ranging lighting conditions. One
of the reasons is that nuances of shade, reflection, and
local darkness influence the accuracy of facial
expression recognition through the inevitable change of
gray levels. In order to avoid this problem and to
develop a robust method for facial expression
recognition applicable under widely varied lighting
conditions, we have used an image registered by infrared
rays (IR), which describes the thermal distribution of
the face. Although a human cannot detect IR, a robot can
process the information around it using thermal images
created by IR. Therefore, as a new mode of robot vision,
thermal image processing is a practical method viable
under natural conditions.
The timing of recognizing facial expressions is also
important for a robot, because the processing involved
can be time-consuming. We have adopted the utterance as
the cue for expressing human feelings or mental states,
because humans tend to say something to express their
feelings.
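The idea of tying frame extraction to utterance timing can be sketched as follows. This is my own minimal illustration, not the authors' implementation; the 0.1 s pre-utterance offset, the frame rate, and the timing inputs are all illustrative assumptions:

```python
# Hypothetical frame selection: given per-frame timestamps of a thermal
# video and vowel timings reported by a speech recognizer, pick the three
# frames used for recognition: just before speaking, at the first vowel,
# and at the last vowel of the utterance.
def select_frames(frame_times, utterance_start, first_vowel_t, last_vowel_t):
    """Return indices of the frames nearest to each timing position."""
    def nearest(t):
        return min(range(len(frame_times)),
                   key=lambda i: abs(frame_times[i] - t))
    return (nearest(utterance_start - 0.1),  # shortly before speech onset
            nearest(first_vowel_t),
            nearest(last_vowel_t))

# A 30 fps thermal video; utterance begins at 1.0 s, with the first and
# last vowels recognized at 1.2 s and 2.5 s (assumed values).
times = [i / 30 for i in range(120)]
print(select_frames(times, 1.0, 1.2, 2.5))  # (27, 36, 75)
```

In practice the vowel timings would come from the forced-alignment output of the speech recognition system rather than being supplied by hand.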
In this talk, I will present our method of facial
expression recognition for a speaker, which exploits a
new technique for deciding the timing positions for
extracting frames from the dynamic thermal image during
an utterance, using a speech recognition system. For
facial expression recognition, we extract three images:
(i) just before speaking, and while speaking (ii) the
first and (iii) the last vowels of an utterance. The
face direction is also estimated, using thermal image
processing, in order to select front-view faces as
targets of facial expression recognition. A
two-dimensional discrete cosine transform is applied to
each block in the selected face parts of the image,
converting the gray-scale values into frequency
components, which are used to generate feature vectors.
With this method, the facial expressions are
discriminated with good recognition accuracy when the
speaker exhibits one of the intentional facial
expressions "angry", "happy", "neutral", "sad", and
"surprise".
Brief Biography of the Speaker:
Yasunari Yoshitomi received his B.E., M.E. and Dr. Eng.
degrees in Applied Mathematics and Physics from Kyoto
University in 1980, 1982, and 1991, respectively. He
worked at Nippon Steel Corporation from 1982 to 1995,
where he was engaged in image analysis applications and
the development of soft magnetic materials. From 1995 to
2001, he was an associate professor at the Department of
Computer Science and Systems Engineering, Miyazaki
University. From 2001 to 2008, he was a professor at the
Department of Environmental Informatics, Kyoto
Prefectural University. Since 2008, he has been a
professor at the Environmental Information System
Subdivision, Division of Environmental Sciences,
Graduate School of Life and Environmental Sciences,
Kyoto Prefectural University. He is a member of IEEE,
IPSJ, IEICE, JSIAM, ORSJ, HIS, SSJ and IIEEJ. He
received a Best Paper Award from IEEE International
Workshop on Robot and Human Communication in 1998, and a
Best Paper Award from IEEE International Workshop on
Robot and Human Interactive Communication in 2000. He
has published more than 100 papers, two reviews, two
books, and more than 200 patents. He has been listed in
the 2010 Edition of Marquis Who's Who in the World. His
current research interests are communication between
humans and computers, media information processing,
watermarking and biometric authentication for digital
content, stochastic programming problems, and simulation
of emission trading of greenhouse gases.