Volume 32 Issue 11
Dec.  2010
Citation: Zhang Shi-Qing, Li Le-Min, Zhao Zhi-Jin. Speech Emotion Recognition Based on an Improved Supervised Manifold Learning Algorithm[J]. Journal of Electronics & Information Technology, 2010, 32(11): 2724-2729. doi: 10.3724/SP.J.1146.2009.01430

Speech Emotion Recognition Based on an Improved Supervised Manifold Learning Algorithm

doi: 10.3724/SP.J.1146.2009.01430
  • Received Date: 2009-11-06
  • Revised Date: 2010-04-13
  • Publish Date: 2010-11-19
  • Abstract: To improve the performance of speech emotion recognition, nonlinear dimensionality reduction is needed for speech feature data that lie on a nonlinear manifold embedded in a high-dimensional acoustic space. Supervised Locally Linear Embedding (SLLE) is a typical supervised manifold learning algorithm for nonlinear dimensionality reduction. To address the drawbacks of SLLE, this paper proposes an improved version that enhances the discriminating power of the low-dimensional embedded data and offers better generalization ability. The proposed algorithm is used to reduce the dimensionality of 48-dimensional speech emotional feature data, comprising prosody and voice quality features, and to extract low-dimensional embedded discriminating features for recognizing four emotions: anger, joy, sadness, and neutral. Experimental results on a natural emotional speech database show that the proposed algorithm achieves the highest accuracy of 90.78% with only 9 embedded features, a 15.65% improvement over SLLE. The proposed algorithm can therefore significantly improve speech emotion recognition when applied to reduce the dimensionality of speech emotional feature data. Illustrative sketches of the feature extraction and the embedding step are given after this list.
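The 48-dimensional emotional feature vector combines prosody and voice quality measurements. The exact features are not enumerated here, so the sketch below only illustrates the general idea with a handful of utterance-level prosody statistics (pitch and energy) computed with librosa; the function name extract_prosody_features and the chosen statistics are assumptions for illustration, not the authors' feature set.

```python
import numpy as np
import librosa

def extract_prosody_features(wav_path):
    """Illustrative utterance-level prosody statistics (not the paper's 48-D set)."""
    y, sr = librosa.load(wav_path, sr=None)

    # Fundamental frequency (pitch) contour via the pYIN estimator.
    f0, voiced_flag, _ = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
    )
    f0 = f0[~np.isnan(f0)]          # keep voiced frames only

    # Short-time energy (RMS) contour.
    rms = librosa.feature.rms(y=y)[0]

    # Simple statistics per contour; a real front end would add formants,
    # jitter, shimmer, speaking rate, etc. to reach 48 dimensions.
    stats = lambda c: [np.mean(c), np.std(c), np.min(c), np.max(c)]
    return np.array(stats(f0) + stats(rms))
```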

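Baseline SLLE (de Ridder et al.) biases LLE's neighbor search with class labels by inflating distances between samples of different classes before the embedding is computed. The NumPy sketch below implements that baseline idea under stated assumptions; the function supervised_lle, its parameter defaults, and the simple regularization are illustrative, and the paper's specific improvement to SLLE is not reproduced here.

```python
import numpy as np

def supervised_lle(X, y, n_components=9, n_neighbors=10, alpha=0.5, reg=1e-3):
    """Sketch of baseline Supervised LLE.

    X     : (n_samples, n_features) feature matrix, e.g. 48-D emotional features
    y     : (n_samples,) integer emotion labels
    alpha : in [0, 1]; 0 gives unsupervised LLE, 1 fully label-driven neighbors
    """
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    n = X.shape[0]

    # Pairwise Euclidean distances.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # SLLE modification: add a penalty wherever the class labels differ.
    Ds = D + alpha * D.max() * (y[:, None] != y[None, :])

    # k nearest neighbors under the supervised distance (exclude self).
    np.fill_diagonal(Ds, np.inf)
    neighbors = np.argsort(Ds, axis=1)[:, :n_neighbors]

    # Reconstruction weights: each sample as an affine combination of its neighbors.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]                      # centred neighbor block
        C = Z @ Z.T
        C += reg * np.trace(C) * np.eye(n_neighbors)    # regularize local Gram matrix
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, neighbors[i]] = w / w.sum()

    # Embedding: bottom eigenvectors of (I - W)^T (I - W), skipping the constant one.
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    _, eigvecs = np.linalg.eigh(M)
    return eigvecs[:, 1:n_components + 1]
```

For the four-emotion task, the low-dimensional coordinates (e.g. 9 per utterance) would then be passed to a conventional classifier such as k-nearest neighbors. Like standard LLE, this sketch only embeds the training set; mapping unseen utterances into the learned space requires an additional out-of-sample step.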
  • Cited by

    Citing articles in periodicals (19)

    1. 张家豪, 章昭辉, 严琦, 王鹏伟. An emotion recognition method based on speech rhythm differences. Computer Science, 2024(04): 262-269.
    2. 徐胜超. A new dynamic neighborhood selection method for manifold learning dimensionality reduction algorithms. Computer Technology and Development, 2022(01): 85-90.
    3. 张石清, 刘瑞欣, 赵小明. Research progress on cross-corpus speech emotion recognition. Computer Systems & Applications, 2022(11): 31-48.
    4. 董寅冬, 任福继, 李春彬. EEG emotion recognition based on linear kernel principal component analysis and XGBoost. Opto-Electronic Engineering, 2021(02): 15-23.
    5. 刘天宝, 张凌涛, 于文涛, 魏东川, 范轶军. Audio-visual emotion recognition based on hierarchical LSTM with an embedded attention mechanism. Laser & Optoelectronics Progress, 2021(02): 183-190.
    6. 魏金太, 高穹. Emotion recognition from variable-length speech segments based on deep learning. Journal of Chengde Petroleum College, 2021(06): 51-56.
    7. 田祥宏. A speech recognition method combining locally linear embedding and support vector machines. Video Engineering, 2019(02): 61-65.
    8. 杜弘彦, 王士同, 李滔. A nearest feature space embedding method based on combining nonlinear distance and included angle. Computer Engineering & Science, 2018(05): 888-897.
    9. 谢湘, 唐刚, 肖泽苹, 李通. Recognition of flight pilots' response modes. Transactions of Beijing Institute of Technology, 2017(07): 744-747.
    10. 杜弘彦, 王士同. An improved nearest-neighbor feature space embedding method with nonlinear distance. Journal of Frontiers of Computer Science and Technology, 2017(09): 1461-1473.
    11. 李善, 谭继文, 俞昆. Ball screw fault diagnosis based on the SLLE algorithm and manifold clustering analysis. Modular Machine Tool & Automatic Manufacturing Technique, 2016(12): 96-99.
    12. 徐照松, 元昌安, 覃晓, 元建, 李双. Research on a prosodic feature extraction algorithm for speech emotion based on association rules. Computer Applications and Software, 2015(09): 42-45+77.
    13. 王小虎, 张石清, 曹恒瑞. Speech emotion recognition based on multi-classifier ensemble. Microelectronics & Computer, 2015(07): 38-41+45.
    14. 李强, 皮智谋. Research on rotor system fault diagnosis based on FastICA-SLLE. Modular Machine Tool & Automatic Manufacturing Technique, 2014(08): 105-107+118.
    15. 张石清, 李乐民, 赵知劲. Research progress on speech emotion recognition in human-computer interaction. Journal of Circuits and Systems, 2013(02): 440-451+434.
    16. 周夕良. Development and prospects of speech emotion recognition. Information Technology, 2013(11): 19-22+25.
    17. 李杰, 周萍. Research progress on feature parameters in speech emotion recognition. Transducer and Microsystem Technologies, 2012(02): 4-7.
    18. 徐玉龙, 王金明, 吴文, 陈志伟. A speaker recognition method based on manifolds and feature fusion. Military Communications Technology, 2012(03): 7-11.
    19. 李缨, 于谦. Pulmonary nodule classification using supervised manifold learning based on class sets and class pairs. Bulletin of Science and Technology, 2012(08): 29-32.

    Other citing sources (22)


    Article Metrics

    Article views: 4182    PDF downloads: 851