Speech Emotion Recognition Based on an Improved Supervised Manifold Learning Algorithm

Zhang Shi-Qing; Li Le-Min; Zhao Zhi-Jin

doi:10.3724/SP.J.1146.2009.01430

Volume 32 Issue 11

Dec. 2010

Citation:

Zhang Shi-Qing, Li Le-Min, Zhao Zhi-Jin. Speech Emotion Recognition Based on an Improved Supervised Manifold Learning Algorithm[J]. Journal of Electronics & Information Technology, 2010, 32(11): 2724-2729. doi: 10.3724/SP.J.1146.2009.01430

Received Date: 2009-11-06
Rev Recd Date: 2010-04-13
Publish Date: 2010-11-19

Abstract

To improve the performance of speech emotion recognition, nonlinear dimensionality reduction is needed for speech feature data lying on a nonlinear manifold embedded in a high-dimensional acoustic space. Supervised Locally Linear Embedding (SLLE) is a typical supervised manifold learning algorithm for nonlinear dimensionality reduction. To address the drawbacks of SLLE, this paper proposes an improved version that enhances the discriminating power of the low-dimensional embedded data and possesses optimal generalization ability. The proposed algorithm is applied to 48-dimensional speech emotional feature data, comprising prosody and voice quality features, to extract low-dimensional embedded discriminating features for recognizing four emotions: anger, joy, sadness and neutral. Experimental results on a natural speech emotional database show that the proposed algorithm achieves the highest accuracy, 90.78%, with fewer than 9 embedded features, a 15.65% improvement over SLLE. The proposed algorithm can therefore significantly improve speech emotion recognition results when used to reduce the dimensionality of speech emotional feature data.
- Speech emotion recognition
- Nonlinear dimensionality reduction
- Manifold learning
- Supervised locally linear embedding
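
For orientation, the baseline SLLE that the abstract compares against can be sketched as follows. This is a minimal illustration of the standard SLLE formulation of De Ridder et al., in which distances between samples of different classes are inflated before neighbours are chosen; it is not the improved algorithm proposed in the paper, whose details the abstract does not give. The function name `slle_embed` and the parameters `alpha`, `n_neighbors`, `n_components` and `reg` are assumptions made for this sketch.

```python
import numpy as np


def slle_embed(X, y, n_neighbors=5, n_components=2, alpha=0.5, reg=1e-3):
    """Baseline SLLE sketch: LLE with class-aware neighbour selection.

    alpha = 0 reduces to unsupervised LLE; alpha = 1 is fully supervised.
    Returns the training embedding and the reconstruction weight matrix.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = X.shape[0]

    # Pairwise Euclidean distances between all samples.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Supervised step: add alpha * max(dist) to every between-class distance.
    different = (y[:, None] != y[None, :]).astype(float)
    dist_sup = dist + alpha * dist.max() * different
    np.fill_diagonal(dist_sup, np.inf)  # a sample is never its own neighbour

    # Reconstruction weights, as in ordinary LLE.
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(dist_sup[i])[:n_neighbors]
        Z = X[nbrs] - X[i]                       # neighbours centred on x_i
        G = Z @ Z.T                              # local Gram matrix
        G += reg * (np.trace(G) + 1e-12) * np.eye(n_neighbors)  # regularise
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, nbrs] = w / w.sum()                 # weights sum to one

    # Embedding: eigenvectors of M = (I - W)^T (I - W) with the smallest
    # eigenvalues, skipping the first one.
    I_W = np.eye(n) - W
    M = I_W.T @ I_W
    _, eigvecs = np.linalg.eigh(M)               # eigenvalues in ascending order
    return eigvecs[:, 1:n_components + 1], W
```

Called on an array of 48-dimensional feature vectors with four class labels, `slle_embed(X, y, n_components=9)` would return a 9-dimensional embedding of the training samples. Mapping unseen test utterances into that embedding requires an additional out-of-sample step, which this sketch leaves out.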


