Plenary Lecture
Audio Interaction with Multimedia Information
Professor Mario Malcangi
Università degli Studi di Milano
DICo – Dipartimento di Informatica e Comunicazione
Via Comelico 39, 20135 Milano
Italy
E-mail: malcangi@dico.unimi.it
Abstract: Interacting with multimedia information stored in systems or on the
web points up several difficulties inherent in the signal nature of such
information. These difficulties are especially evident when palmtop devices
are used for this purpose. Developing and integrating a set of algorithms
designed for extracting audio information is a primary step toward providing
user-friendly access to multimedia information and toward developing powerful
communication interfaces. Audio has several advantages over other
communication media, including hands-free operation, unattended interaction,
and simple, cheap devices for capture and playback.
A set of algorithms and processes for extracting semantic and syntactic
information from audio signals, including voice, has been defined. The
extracted information is used to access information in multimedia databases,
as well as to index them. More extensive higher-level information needs to be
extracted from the audio signal, such as audio-source identification
(speaker identification) and genre in musical audio. A primary task involves
transforming audio into symbols (e.g. music transcribed into a score, speech
transcribed into text) and transcribing symbols into audio (e.g. a score
rendered as musical audio, text converted into speech). The purpose is to
search for and access any kind of multimedia information by means of audio.
To attain these results, digital audio processing, digital speech processing,
and soft-computing methods need to be integrated.
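As a purely illustrative aside, the audio-to-symbol transformations mentioned
above typically start from a low-level feature representation of the signal.
The following minimal sketch (not part of the lecture material) shows one
common such front end, MFCC extraction, assuming the librosa library; the
file name and parameter values are hypothetical.

# Illustrative sketch: load an audio file and extract MFCC features, a common
# low-level representation used by speech-recognition and speaker-identification
# systems. The file name and parameter values are hypothetical examples.
import librosa
import numpy as np

def extract_mfcc(path: str, sr: int = 16000, n_mfcc: int = 13) -> np.ndarray:
    """Return a (frames, n_mfcc) matrix of MFCC feature vectors."""
    audio, sr = librosa.load(path, sr=sr)              # resample to a uniform rate
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)
    return mfcc.T                                      # one feature vector per frame

if __name__ == "__main__":
    features = extract_mfcc("utterance.wav")           # hypothetical input file
    print(features.shape)                              # (number of frames, 13)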
Brief Biography of the Speaker:
M. Malcangi graduated in Computer Engineering from the Politecnico di Milano
in 1981. His research is in the areas of speech processing and digital audio
processing. He teaches Digital Signal Processing and Digital Audio
Processing at the Università degli Studi di Milano. He has published several
papers on topics in digital audio and speech processing. His current
research efforts focus primarily on applying soft-computing methodologies
(neural networks and fuzzy logic) to speech synthesis, speech recognition,
and speaker identification, with deeply embedded systems as the platform that
supports the application processing.