Multi-Stage Recognition of Speech Emotion Using Sequential Forward Feature Selection

Authors

  • Tatjana Liogienė, Doctoral Student, Vilnius University Institute of Mathematics and Informatics
  • Gintautas Tamulevičius, Associate Professor, Vilnius Gediminas Technical University

DOI:

https://doi.org/10.1515/ecce-2016-0005

Keywords:

Classification algorithms, Emotion recognition, Human voice

Abstract

Intensive research in speech emotion recognition has produced a vast collection of speech emotion features, and such large feature sets complicate the recognition task. In addition to various feature selection and transformation techniques for single-stage classification, multiple classifier systems have been proposed. The main idea of multiple classifiers is to arrange the emotion classification process in stages. Besides parallel and serial arrangements, the hierarchical arrangement of multi-stage classification is the most widely used for speech emotion recognition. In this paper, we present a multi-stage classification scheme based on sequential forward feature selection. The Sequential Forward Selection (SFS) and Sequential Floating Forward Selection (SFFS) techniques were employed at every stage of the multi-stage classification scheme. Experimental testing of the proposed scheme was performed using the German and Lithuanian emotional speech datasets. The sequential-feature-selection-based multi-stage classification outperformed the single-stage scheme by 12–42 % for different emotion sets. The multi-stage scheme also showed higher robustness to growth of the emotion set: as the emotion set was enlarged, the decrease in recognition rate for the multi-stage scheme was 10–20 % lower than in the single-stage case. Differences between SFS and SFFS for feature selection were negligible.
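For readers unfamiliar with the selection step named in the abstract, the following is a minimal sketch of generic Sequential Forward Selection (SFS): starting from an empty subset, the single feature that most improves a scoring function is added greedily at each iteration. In the proposed scheme such a selection would be run separately for each stage of the hierarchical classifier; the classifier (a linear SVM), the 5-fold cross-validation score, and the fixed subset size used here are illustrative assumptions, not the authors' experimental setup.

```python
# Minimal SFS sketch (assumed setup: scikit-learn linear SVM, 5-fold CV accuracy).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def sequential_forward_selection(X, y, n_features, estimator=None):
    """Greedily pick feature indices, adding the best-scoring candidate each round."""
    estimator = estimator or SVC(kernel="linear")
    selected = []                      # indices chosen so far
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < n_features:
        # Score every candidate feature when appended to the current subset.
        scores = [
            (np.mean(cross_val_score(estimator, X[:, selected + [f]], y, cv=5)), f)
            for f in remaining
        ]
        best_score, best_f = max(scores)
        selected.append(best_f)
        remaining.remove(best_f)
    return selected
```

SFFS differs only in that, after each forward step, it also tries removing previously selected features (a backward "floating" step) and keeps the removal if the score improves.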



Published

2016-07-01

How to Cite

Liogienė, T., & Tamulevičius, G. (2016). Multi-Stage Recognition of Speech Emotion Using Sequential Forward Feature Selection. Electrical, Control and Communication Engineering, 10(1), 35–41. https://doi.org/10.1515/ecce-2016-0005