BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Memento EPFL//
BEGIN:VEVENT
SUMMARY:Multimodal Feature Extraction and Fusion for Audio-Visual Speech R
 ecognition
DTSTART:20090116T173000
DTSTAMP:20260603T024246Z
UID:a4e586cbfb3784b54ddf2b8a1ef75e5a520a7a71ab0889e9a7063a00
CATEGORIES:Thesis defenses
DESCRIPTION:Mihai Gurban\nMultimodal signal processing leads to the extrac
 tion of higher-quality\nand more reliable information than would be o
 btained from\nsingle-modality signals. We are focusing on two main challen
 ges in\nthis field\, feature extraction and multimodal fusion\, and we are
 \napplying our proposed solutions to audio-visual speech recognition.\nFir
 st\, we show how informative features can be extracted from the\nvisual mo
 dality\, using an information-theoretic framework which gives\nus a quanti
 tative measure of the relevance of individual features. We\nalso prove tha
 t reducing redundancy between these features is\nimportant for avoiding th
 e curse of dimensionality and improving\nrecognition results. Second\, we 
 present a method of multimodal fusion\nat the level of intermediate decisi
 ons using a weight for each of the\nmonomodal streams. The weights are ada
 ptive\, changing according to the\nestimated reliability of each stream.
LOCATION:ELA1
STATUS:CONFIRMED
END:VEVENT
END:VCALENDAR