Plenary Lecture

Facial Expression Recognition for Speaker Using Thermal Image Processing and Speech Recognition System


Professor Yasunari Yoshitomi
Information Communication System Lab.
Graduate School of Life and Environmental Sciences
Kyoto Prefectural University
JAPAN
E-mail: yoshitomi@kpu.ac.jp


Abstract: The goal of our research is to develop a robot that can perceive human feelings or mental states. The robot should be able to interact with a human in a friendly manner. For example, it could encourage a person who looks sad, advise someone who looks tired to stop working and rest for a while, or help care for an elderly person.
The presented investigation concerns the first stage of development, wherein a robot acquires vision with the ability to detect human feelings or inner mental states. Although the mechanism for recognizing facial expressions, as one of the main visible expressions of feeling, has received considerable attention in computer vision research, its present stage still falls far short of human capability, especially in terms of robustness under widely ranging lighting conditions. One reason is that nuances of shade, reflection, and local darkness influence the accuracy of facial expression recognition through the inevitable change of gray levels. To avoid this problem and to develop a robust method for facial expression recognition applicable under widely varied lighting conditions, we have used images registered by infrared rays (IR), which describe the thermal distribution of the face. Although a human cannot detect IR, a robot can process information about its surroundings using thermal images created by IR. Therefore, as a new mode of robot vision, thermal image processing is a practical method viable under natural conditions.
The timing of recognizing facial expressions is also important for a robot, because the processing may be time-consuming. We have adopted the utterance as the cue for expressing human feelings or mental states, because humans tend to say something when expressing their feelings.
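As an illustration of keying frame selection to an utterance, the following is a minimal sketch, not the authors' implementation. It assumes a speech recognizer that reports phoneme labels with start and end times in seconds, and a fixed thermal-camera frame rate; the 0.1 s pre-utterance margin and the vowel set are assumptions for the example.

```python
# Hypothetical sketch of utterance-timed frame selection.
# Assumed input: phoneme segments (label, start_sec, end_sec) from a
# speech recognizer, plus the utterance start time and the camera fps.

VOWELS = set("aiueo")  # the five Japanese vowels (assumption)

def frame_indices(phonemes, utterance_start, fps=30.0):
    """Return thermal-video frame indices for three moments:
    (i) just before speaking, and the temporal centers of
    (ii) the first and (iii) the last vowels of the utterance."""
    vowel_centers = [(s + e) / 2.0 for lab, s, e in phonemes
                     if lab in VOWELS]
    # 0.1 s margin before the utterance onset (assumed value).
    before = max(0, round((utterance_start - 0.1) * fps))
    first = round(vowel_centers[0] * fps)
    last = round(vowel_centers[-1] * fps)
    return before, first, last
```

With segments such as ("k", 0.5, 0.6), ("a", 0.6, 0.8), ("s", 0.8, 0.9), ("o", 0.9, 1.1) at 30 fps, the function picks one frame shortly before speech onset and one frame at the center of each of the first and last vowels.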
In this talk, I will present our method for facial expression recognition for a speaker, which exploits a new technique for deciding the timing positions for extracting frames from the thermal dynamic image during an utterance, using a speech recognition system. For facial expression recognition, we pick up three images: (i) just before speaking, and while speaking (ii) the first and (iii) the last vowels of the utterance. The face direction is also estimated, using thermal image processing, in order to select front-view faces as targets of facial expression recognition. A two-dimensional discrete cosine transformation transforms the gray-scale values of each block in the focused face parts of the image into their frequency components, which are used to generate feature vectors. With this method, the facial expressions are discriminated with good recognition accuracy when the speaker exhibits one of the intentional facial expressions of "angry", "happy", "neutral", "sad", and "surprise".
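The block-wise 2-D DCT feature extraction described above can be sketched as follows. This is an illustrative sketch under assumed parameters, not the authors' exact procedure: the 8x8 block size, the number of retained coefficients, and the row-major (rather than zig-zag) coefficient ordering are all assumptions.

```python
import numpy as np
from scipy.fft import dctn  # n-dimensional type-II DCT

def block_dct_features(face_part, block=8, n_coeffs=10):
    """Divide a gray-scale face-part image into block x block tiles,
    apply a 2-D DCT to each tile, and keep the lowest-index frequency
    coefficients of each tile as the feature vector.
    Block size and coefficient count are illustrative assumptions."""
    h, w = face_part.shape
    feats = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            tile = face_part[y:y + block, x:x + block].astype(float)
            coeffs = dctn(tile, norm='ortho')
            # Row-major truncation for brevity; a zig-zag scan would
            # order coefficients strictly by spatial frequency.
            feats.extend(coeffs.flat[:n_coeffs])
    return np.array(feats)
```

For a 16x16 face-part image this yields four 8x8 tiles and therefore a 40-dimensional feature vector; such vectors would then feed a classifier for the five expression categories.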

Brief Biography of the Speaker:
Yasunari Yoshitomi received his B.E., M.E. and Dr. Eng. degrees in Applied Mathematics and Physics from Kyoto University in 1980, 1982, and 1991, respectively. He worked at Nippon Steel Corporation from 1982 to 1995, where he was engaged in image analysis applications and the development of soft magnetic materials. From 1995 to 2001, he was an associate professor at the Department of Computer Science and Systems Engineering, Miyazaki University. From 2001 to 2008, he was a professor at the Department of Environmental Informatics, Kyoto Prefectural University. Since 2008, he has been a professor at the Environmental Information System Subdivision, Division of Environmental Sciences, Graduate School of Life and Environmental Sciences, Kyoto Prefectural University. He is a member of IEEE, IPSJ, IEICE, JSIAM, ORSJ, HIS, SSJ and IIEEJ. He received a Best Paper Award from the IEEE International Workshop on Robot and Human Communication in 1998, and a Best Paper Award from the IEEE International Workshop on Robot and Human Interactive Communication in 2000. He has published more than 100 papers, two reviews, two books, and more than 200 patents. He has been listed in the 2010 Edition of Marquis Who's Who in the World. His current research interests are communication between human and computer, media information processing, watermarking and biometric authentication of digital content, stochastic programming problems, and simulation of emission trading of greenhouse gases.
