Plenary Lecture: Audio Interaction with Multimedia Information

Professor Mario Malcangi
Università degli Studi di Milano
DICo – Dipartimento di Informatica e Comunicazione
Via Comelico 39, 20135 Milano, Italy
E-mail: malcangi@dico.unimi.it

Abstract: Interacting with multimedia information stored in systems or on the web highlights several difficulties inherent in the signal nature of such information. These difficulties are especially evident when palmtop devices are used for this purpose. Developing and integrating a set of algorithms designed for extracting audio information is a primary step toward providing user-friendly access to multimedia information and toward developing powerful communication interfaces. Audio has several advantages over other communication media, including hands-free operation; unattended interaction; and simple, cheap devices for capture and playback. A set of algorithms and processes for extracting semantic and syntactic information from audio signals, including voice, has been defined. The extracted information is used to access information in multimedia databases, as well as to index them. More extensive higher-level information needs to be extracted from the audio signal, such as audio-source identification (speaker identification) and genre in musical audio. A primary task involves transforming audio into symbols (e.g., music transformed into score, speech transformed into text) and transcribing symbols into audio (e.g., score transformed into musical audio, text transformed into speech). The purpose is to search for and access any kind of multimedia information by means of audio. To attain these results, digital audio processing, digital speech processing, and soft-computing methods need to be integrated.

Brief Biography of the Speaker: M. Malcangi graduated in Computer Engineering from the Politecnico di Milano in 1981. His research is in the areas of speech processing and digital audio processing.
He teaches Digital Signal Processing and Digital Audio Processing at the Università degli Studi di Milano. He has published several papers on topics in digital audio and speech processing. His current research efforts focus primarily on applying soft-computing methodologies (neural networks and fuzzy logic) to speech synthesis, speech recognition, and speaker identification, with deeply embedded systems as the platform that supports the application processing.