The Speech Processing Lexicon

Neurocognitive and Behavioural Approaches

Ed. by Lahiri, Aditi / Kotzor, Sandra

Series: Phonology and Phonetics [PP] 22

eBook (PDF)
Publication Date:
April 2017

Automatic speech recognition: What phonology can offer

Arora, Vipul / Reetz, Henning


This chapter proposes phonological features, rather than the phones (or phonemes) typically used, as the underlying representation of speech for automatic speech recognition (ASR). Phonological features offer several advantages. First, they can efficiently handle the pronunciation variability found in languages. Second, they form natural classes that represent speech universally, and so offer better ways to transfer the various models involved in ASR across languages and dialects. Moreover, the ubiquity of the perceptual properties of phonological features is supported by neurolinguistic experiments and language studies across different languages of the world. Phonological features can thus provide a principled approach to ASR, reducing the amount of training data and computational resources required.

The main challenge is to develop mathematical models that reliably detect these features from the speech signal, and to incorporate them into ASR systems. To this end, we describe some of our implementations. First, we present a digit recognition system that detects the features with neural networks and applies a rule-based feature-to-phoneme mapping. Second, we describe a method based on deep neural networks for extracting the features from speech signals; the use of deep learning improves the detection accuracy. Third, we present a deep-neural-network-based ASR system that detects features and maps them to phonemes using statistical models. This system performs on par with state-of-the-art ASR systems on the task of phoneme recognition.
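
As a rough illustration of what a feature-to-phoneme mapping can look like, the following Python sketch thresholds per-feature detector scores and picks the phoneme whose feature bundle overlaps most with the detected set. It is not the chapter's implementation: the feature names, the six-phoneme inventory, the 0.5 threshold, and the overlap (Jaccard) rule are all assumptions chosen for the example.

```python
from typing import Dict, FrozenSet

# Hypothetical toy inventory: feature names and phoneme entries are
# illustrative assumptions, not the feature system used in the chapter.
PHONEME_FEATURES: Dict[str, FrozenSet[str]] = {
    "p": frozenset({"consonantal", "labial", "plosive"}),
    "b": frozenset({"consonantal", "labial", "plosive", "voiced"}),
    "m": frozenset({"consonantal", "labial", "nasal", "voiced"}),
    "t": frozenset({"consonantal", "coronal", "plosive"}),
    "d": frozenset({"consonantal", "coronal", "plosive", "voiced"}),
    "s": frozenset({"consonantal", "coronal", "fricative"}),
}


def active_features(scores: Dict[str, float], threshold: float = 0.5) -> FrozenSet[str]:
    """Keep the features whose detector score reaches the (assumed) threshold."""
    return frozenset(name for name, score in scores.items() if score >= threshold)


def map_to_phoneme(detected: FrozenSet[str]) -> str:
    """Return the phoneme whose feature bundle has the highest Jaccard overlap
    with the detected features (ties go to the first inventory entry)."""

    def overlap(entry) -> float:
        _, bundle = entry
        union = detected | bundle
        return len(detected & bundle) / len(union) if union else 0.0

    return max(PHONEME_FEATURES.items(), key=overlap)[0]


# Example: scores as a feature detector (e.g. a neural network) might emit them.
frame_scores = {"consonantal": 0.9, "labial": 0.8, "plosive": 0.7, "voiced": 0.2}
print(map_to_phoneme(active_features(frame_scores)))
```

Because the mapping works on feature bundles rather than whole phones, one spuriously detected or missed feature still leaves the closest bundle recoverable in this toy setup; a statistical mapping, as in the third system described above, would replace the fixed overlap rule with learned models.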

Citation Information

Vipul Arora, Henning Reetz (2017). Automatic speech recognition: What phonology can offer. In Aditi Lahiri, Sandra Kotzor (Eds.), The Speech Processing Lexicon: Neurocognitive and Behavioural Approaches (pp. 211–235). Berlin, Boston: De Gruyter. https://doi.org/10.1515/9783110422658-011

Book DOI: https://doi.org/10.1515/9783110422658

Online ISBN: 9783110422658

© 2017 Walter de Gruyter GmbH, Berlin/Munich/Boston
