Probabilistic Approaches to Robot Task Benchmarking

Event details
Date | 31.08.2016 |
Hour | 15:00 |
Speaker | Dr. Mathew Doss, IDIAP, Martigny |
Location | |
Category | Conferences - Seminars |
One of the crucial steps in the development of speech technologies is to train statistical models that capture the relationship between the observed speech signal and the classes representing the high-level information that we are interested in. Traditionally, the development of such statistical models is divided into two steps, namely,
(a) extraction of "hand-crafted" features using signal processing techniques and prior knowledge about speech production and perception, and
(b) training a classifier with the extracted features as input. For example, in automatic speech recognition systems, short-term spectrum-based features are typically extracted first and then modeled by Gaussian mixture models or artificial neural networks to estimate phone class likelihoods or posterior probabilities.
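As an illustration of step (a), the following is a minimal sketch of short-term spectral feature extraction: the signal is cut into overlapping frames, each frame is Hamming-windowed, and its log power spectrum is computed with a naive DFT. The helper names (`frame_signal`, `log_power_spectrum`), the toy sine input, and the frame and hop sizes are assumptions chosen for illustration, not details taken from the talk.

```python
import cmath
import math

def frame_signal(signal, frame_len, hop):
    """Split a 1-D sequence into overlapping frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def log_power_spectrum(frame):
    """Hamming-window one frame and return its log power spectrum (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)))
                for k, x in enumerate(frame)]
    feats = []
    for b in range(n // 2 + 1):
        # Naive DFT for bin b; a real system would use an FFT.
        coeff = sum(x * cmath.exp(-2j * math.pi * b * k / n)
                    for k, x in enumerate(windowed))
        # Small constant avoids log(0) for empty bins.
        feats.append(math.log(abs(coeff) ** 2 + 1e-10))
    return feats

# Toy input (hypothetical): a 1 kHz sine sampled at 8 kHz, 0.1 s long.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(800)]

# 25 ms frames (200 samples) with a 10 ms hop (80 samples).
frames = frame_signal(signal, frame_len=200, hop=80)
features = [log_power_spectrum(f) for f in frames]

# Index of the strongest frequency bin in the first frame.
peak_bin = max(range(len(features[0])), key=lambda b: features[0][b])
```

Each entry of `features` is one feature vector per frame; in the traditional pipeline, vectors of this kind (usually further processed, for example into cepstral coefficients) would be the classifier input in step (b).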
In this talk, I will present a novel approach, originally developed at Idiap, where the relevant features and the classifier are jointly learned from the raw speech signal using convolutional neural networks (CNNs). The talk will demonstrate the potential of the proposed approach through two speech processing studies, namely, an automatic speech recognition study and an anti-spoofing study (in the context of automatic speaker verification). Specifically, I will show how, with minimal prior knowledge or assumptions, the proposed CNN-based approach learns to transform the speech signal and model the relevant information to yield better systems.
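To make the contrast with hand-crafted features concrete, here is a minimal sketch of the forward pass of a first convolutional stage applied directly to raw samples: 1-D convolution, ReLU, then max pooling. It is not the architecture presented in the talk; the filter count, filter length, stride, and pooling width are hypothetical, and the weights are random where a trained system would learn them.

```python
import random

def conv1d(signal, kernel, stride):
    """Valid 1-D cross-correlation of a raw waveform with one filter."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(0, len(signal) - k + 1, stride)]

def relu(xs):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in xs]

def max_pool(xs, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(xs[i:i + size]) for i in range(0, len(xs) - size + 1, size)]

random.seed(0)
# Stand-in for raw speech samples (random values, for illustration only).
raw = [random.uniform(-1.0, 1.0) for _ in range(400)]

# Hypothetical sizes: 4 filters of 30 samples each, stride 10, pooling width 2.
# The weights are random here; during training they would be updated by
# backpropagation together with the classifier that consumes the feature maps.
filters = [[random.gauss(0.0, 0.1) for _ in range(30)] for _ in range(4)]

feature_maps = [max_pool(relu(conv1d(raw, f, stride=10)), size=2)
                for f in filters]
```

The filters play the role that the fixed spectral analysis plays in the traditional pipeline, except that their shape is determined by the training objective rather than by prior knowledge about speech.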
Bio: Dr. Mathew Magimai Doss received the Bachelor of Engineering (B.E.) in Instrumentation and Control Engineering from the University of Madras, India in 1996; the Master of Science (M.S.) by Research in Computer Science and Engineering from the Indian Institute of Technology, Madras, India in 1999; and the PreDoctoral diploma and the Docteur ès Sciences (Ph.D.) from Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland in 2000 and 2005, respectively. He was a postdoctoral fellow at the International Computer Science Institute (ICSI), Berkeley, USA from April 2006 until March 2007.
Since April 2007, he has been working as a Researcher in the Speech and Audio Processing group at Idiap Research Institute, Martigny, Switzerland. He is also a lecturer at EPFL. He is an associate editor of the IEEE Signal Processing Letters. His current research interests include signal processing, statistical pattern recognition, artificial neural networks and computational linguistics with applications to automatic speech recognition, automatic speaker recognition, objective speech assessment, spoken language processing and automatic sign language recognition and assessment.
Practical information
- General public
- Free
Organizer
- NCCR Robotics
Contact
- Lirot Mayra <[email protected]>