The Speech Processing Lexicon

Neurocognitive and Behavioural Approaches

Ed. by Lahiri, Aditi / Kotzor, Sandra

Series: Phonology and Phonetics [PP] 22

eBook (PDF)
Publication Date: April 2017
Copyright year: 2017
ISBN: 978-3-11-042265-8

Automatic speech recognition: What phonology can offer

Arora, Vipul / Reetz, Henning

Abstract

This chapter proposes phonological features, rather than the phones (or phonemes) typically used, as the underlying representation of speech for automatic speech recognition (ASR). Phonological features offer several advantages. First, they can efficiently handle the pronunciation variability found in languages. Second, they form natural classes that represent speech universally, so they offer better ways to transfer the various models involved in ASR across languages and dialects. Moreover, neurolinguistic experiments and language studies on different languages of the world support the view that the perceptual properties of phonological features are ubiquitous. Phonological features can therefore provide a principled basis for ASR and reduce the amount of training data and computational resources required.

The main challenge is to develop mathematical models that reliably detect these features in the speech signal and to incorporate them into ASR systems. To this end, we describe some of our implementations. First, we present a digit recognition system that detects the features with neural networks and maps them to phonemes with a rule-based feature-to-phoneme mapping. Second, we describe a method based on deep neural networks that extracts the features from speech signals; the use of deep learning improves detection accuracy. Third, we present a deep-neural-network-based ASR system that detects features and maps them to phonemes using statistical models. This system performs on par with state-of-the-art ASR systems on the task of phoneme recognition.
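
As an illustration only, the idea of detecting features and then mapping them to phonemes by rule might be sketched as follows. The tiny feature inventory, the 0.5 threshold, and the function names `detect_features`, `map_to_phonemes` and `natural_class` are assumptions made for this sketch, not the chapter's actual feature set, rules, or models; the detector scores simply stand in for neural network outputs.

```python
# Hypothetical, heavily simplified feature inventory (an assumption for
# illustration; not the feature set used in the chapter).
PHONEME_FEATURES = {
    "p": frozenset({"consonantal", "labial", "plosive"}),
    "b": frozenset({"consonantal", "labial", "plosive", "voiced"}),
    "t": frozenset({"consonantal", "coronal", "plosive"}),
    "d": frozenset({"consonantal", "coronal", "plosive", "voiced"}),
    "m": frozenset({"consonantal", "labial", "nasal", "voiced"}),
    "n": frozenset({"consonantal", "coronal", "nasal", "voiced"}),
}


def detect_features(scores, threshold=0.5):
    """Keep every feature whose detector score reaches the threshold.

    `scores` stands in for per-feature outputs of a trained detector.
    """
    return frozenset(f for f, s in scores.items() if s >= threshold)


def map_to_phonemes(features):
    """Rule: return the phonemes whose feature sets differ least from `features`.

    The mismatch is the size of the symmetric difference, so both missing
    and spurious features count against a candidate. Ties are all returned.
    """
    def mismatch(phoneme):
        return len(PHONEME_FEATURES[phoneme] ^ features)

    best = min(mismatch(p) for p in PHONEME_FEATURES)
    return sorted(p for p in PHONEME_FEATURES if mismatch(p) == best)


def natural_class(features):
    """Return all phonemes that carry every feature in `features`."""
    wanted = frozenset(features)
    return sorted(p for p, fs in PHONEME_FEATURES.items() if wanted <= fs)


if __name__ == "__main__":
    scores = {"consonantal": 0.9, "labial": 0.8, "plosive": 0.7,
              "voiced": 0.2, "nasal": 0.1}
    print(map_to_phonemes(detect_features(scores)))
    print(natural_class({"nasal"}))
```

In this sketch, a feature bundle that leaves place of articulation unspecified matches several phonemes at once, which is one simple way to picture how a feature-based representation can tolerate pronunciation variability.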

Citation Information

The Speech Processing Lexicon

Neurocognitive and Behavioural Approaches

Edited by Lahiri, Aditi / Kotzor, Sandra

De Gruyter

2017

Pages: 211–235

ISBN (Online): 9783110422658

DOI (Chapter): https://doi.org/10.1515/9783110422658-011

DOI (Book): https://doi.org/10.1515/9783110422658
