Jump to ContentJump to Main Navigation
Show Summary Details
More options …

Electrical, Control and Communication Engineering

The Journal of Riga Technical University

2 Issues per year

Open Access
See all formats and pricing
More options …

Multi-Stage Recognition of Speech Emotion Using Sequential Forward Feature Selection

Tatjana Liogienė / Gintautas Tamulevičius
Published Online: 2017-01-18 | DOI: https://doi.org/10.1515/ecce-2016-0005


The intensive research of speech emotion recognition introduced a huge collection of speech emotion features. Large feature sets complicate the speech emotion recognition task. Among various feature selection and transformation techniques for one-stage classification, multiple classifier systems were proposed. The main idea of multiple classifiers is to arrange the emotion classification process in stages. Besides parallel and serial cases, the hierarchical arrangement of multi-stage classification is most widely used for speech emotion recognition. In this paper, we present a sequential-forward-feature-selection-based multi-stage classification scheme. The Sequential Forward Selection (SFS) and Sequential Floating Forward Selection (SFFS) techniques were employed for every stage of the multi-stage classification scheme. Experimental testing of the proposed scheme was performed using the German and Lithuanian emotional speech datasets. Sequential-feature-selection-based multi-stage classification outperformed the single-stage scheme by 12–42 % for different emotion sets. The multi-stage scheme has shown higher robustness to the growth of emotion set. The decrease in recognition rate with the increase in emotion set for multi-stage scheme was lower by 10–20 % in comparison with the single-stage case. Differences in SFS and SFFS employment for feature selection were negligible.

Keywords: Classification algorithms; Emotion recognition; Human voice


  • [1] S. Ramakrishnan and I. M. M. El Emary, “Speech emotion recognition approaches in human computer interaction,” Telecommun. Systems, vol. 52, issue 3, pp. 1467–1478, Mar. 2013. https://doi.org/10.1007/s11235-011-9624-zCrossref

  • [2] S. G. Koolagudi and K. S. Rao, “Emotion recognition from speech: a review,” Int. J. of Speech Technology, vol. 15, issue 2, pp. 99–117, June 2012. https://doi.org/10.1007/s10772-011-9125-1Crossref

  • [3] Z. Xiao, E. Dellandrea, L. Chen and W. Dou, “Recognition of emotions in speech by a hierarchical approach,” in 2009 3rd Int. Conf. on Affective Computing and Intelligent Interaction and Workshops, Amsterdam, 2009, pp. 1–8. https://doi.org/10.1109/acii.2009.5349587Crossref

  • [4] P. Giannoulis and G. Potamianos, “A hierarchical approach with feature selection for emotion recognition from speech,” in Proc. of the Eighth Int. Conf. on Language Resources and Evaluation, 2012, pp. 1203–1206.Google Scholar

  • [5] B. Schuller, B. Vlasenko, F. Eyben, G. Rigoll and A. Wendemuth, “Acoustic Emotion Recognition: A Benchmark Comparison of Performances,” in 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, Merano, 2009, pp. 552–557. https://doi.org/10.1109/asru.2009.5372886Crossref

  • [6] A. Origlia, V. Galatà and B. Ludusan, “Automatic classification of emotions via global and local prosodic features on a multilingual emotional database,” in Proc. of Speech Prosody, 2010.Google Scholar

  • [7] M. Lugger, M.-E. Janoir and B. Yang, “Combining classifiers with diverse feature sets for robust speaker independent emotion recognition,” in 2009 17th European Signal Processing Conf., Glasgow, 2009, pp. 1225–1229.Google Scholar

  • [8] H. Peng, F. Long and C. Ding, “Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy,” IEEE Trans. on Pattern Analysis and Machine Intelligence, pp. 1226–1238, Aug. 2005. https://doi.org/10.1109/TPAMI.2005.159Crossref

  • [9] A. Mencattini, E. Martinelli, G. Costantini, M. Todisco, B. Basile, M. Bozzali and N. Di Corrado, “Speech emotion recognition using amplitude modulation parameters and a combined feature selection procedure,” Knowledge-Based Systems, vol. 63, pp. 68–81, June 2014. https://doi.org/10.1016/j.knosys.2014.03.019Crossref

  • [10] A. Milton and S. Tamil Selvi, “Class-specific multiple classifiers scheme to recognize emotions from speech signals,” Comput. Speech and Language, vol. 28, issue 3, pp. 727–742, May 2014. https://doi.org/10.1016/j.csl.2013.08.004Crossref

  • [11] L. Chen, X. Mao, Y. Xue and L. L. Cheng, “Speech emotion recognition: Features and classification models,” Digital Signal Processing, pp. 1154–1160, Dec. 2012. https://doi.org/10.1016/j.dsp.2012.05.007Crossref

  • [12] W.-J. Yoon and K.-S. Park, “Building robust emotion recognition system on heterogeneous speech databases,” in 2011 IEEE Int. Conf. on Consumer Electronics (ICCE), Las Vegas, NV, 2011, pp. 825–826. https://doi.org/10.1109/ICCE.2011.5722886Crossref

  • [13] J. Liu, C. Chen, J. Bu, M. You and J. Tao, “Speech Emotion Recognition using an Enhanced Co-Training Algorithm,” in 2007 IEEE Int. Conf. on Multimedia and Expo, Beijing, 2007, pp. 999–1002. https://doi.org/10.1109/ICME.2007.4284821Crossref

  • [14] M. Kotti and F. Paternò, “Speaker-independent emotion recognition exploiting a psychologically-inspired binary cascade classification schema,” Int. J. of Speech Technology, vol. 15, issue 2, pp. 131–150, June 2012. https://doi.org/10.1007/s10772-012-9127-7Crossref

  • [15] G. Tamulevicius and T. Liogiene, “Low-order multi-level features for speech emotion recognition,” Baltic J. of Modern Computing, vol. 3, no. 4, pp. 234–247, 2015.Google Scholar

  • [16] T. Liogiene and G. Tamulevicius, “Minimal cross-correlation criterion for speech emotion multi-level feature selection,” in Proc. of the Open Conf. of Electrical, Electronic and Information Sciences (eStream), Vilnius, 2015, pp. 1–4. https://doi.org/10.1109/estream.2015.7119492Crossref

  • [17] F. Burkhardt, A. Paeschke, M. Rolfes, W. Sendlmeier and B. Weiss, “A database of German emotional speech,” in Proc. of Interspeech, Lissabon, 2005, pp. 1517–1520.Google Scholar

  • [18] J. Matuzas, T. Tišina, G. Drabavičius and L. Markevičiūtė, “Lithuanian Spoken Language Emotions Database,” Baltic Institute of Advanced Language, 2015. [Online]. Available: http://datasets.bpti.lt/lithuanian-spoken-language-emotions-database/

  • [19] F. Eyben, M. Wollmer and B. Schuller, “OpenEAR – Introducing the munich open-source emotion and affect recognition toolkit,” in 2009 3rd Int. Conf. on Affective Computing and Intelligent Interaction and Workshops, Amsterdam, 2009, pp. 1–6. https://doi.org/10.1109/acii.2009.5349350Crossref

About the article

Published Online: 2017-01-18

Published in Print: 2016-07-01

Citation Information: Electrical, Control and Communication Engineering, Volume 10, Issue 1, Pages 35–41, ISSN (Online) 2255-9159, DOI: https://doi.org/10.1515/ecce-2016-0005.

Export Citation

© 2016 Riga Technical University. This work is licensed under the Creative Commons Attribution 4.0 Public License. BY 4.0

Comments (0)

Please log in or register to comment.
Log in