Yao Zhi-qiang, Zhou Xi, Dai Bei-qian. Improved Model-Based PCA Transformation for GMM in Speaker Identification[J]. Journal of Electronics & Information Technology, 2007, 29(2): 469-472. doi: 10.3724/SP.J.1146.2005.00749

Improved Model-Based PCA Transformation for GMM in Speaker Identification

doi: 10.3724/SP.J.1146.2005.00749
  • Received Date: 2005-06-27
  • Revision Received Date: 2006-01-03
  • Publish Date: 2007-02-19
  • Abstract: Using a Gaussian mixture model (GMM) for text-independent speaker identification requires choosing a form for the covariance matrix. A diagonal covariance matrix is usually chosen, because a full covariance matrix has too many parameters and a high computational cost. The diagonal form, however, carries the strong assumption that the elements of the feature vector are independent, and in most applications this assumption is not reasonable. To make feature vectors better suited to diagonal-covariance modeling, features are usually de-correlated in either the feature space or the model space. This paper presents an improved model-based PCA transformation algorithm for de-correlating the elements of feature vectors. The algorithm applies principal component analysis (PCA) directly to the covariance matrices of the Gaussians, and it reduces the number of parameters by tying the PCA transformation across Gaussians. Experiments on the MSRA Mandarin task show that the algorithm can achieve an identification error reduction of more than 35% relative to the best diagonal covariance models.
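The abstract's central step, applying PCA directly to each Gaussian's covariance and then tying the transform across Gaussians, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the tied transform is estimated, so the sketch assumes it is taken from the eigenvectors of the weight-averaged covariance. The helper names (`jacobi_eigh`, `rotate_cov`, `tied_transform`) and the toy 3-dimensional covariances are hypothetical.

```python
import math


def jacobi_eigh(sym, sweeps=50, tol=1e-24):
    """Eigen-decompose a small symmetric matrix with cyclic Jacobi rotations.

    Returns (eigenvalues, eigenvectors); eigenvectors[k] is the unit
    eigenvector for eigenvalues[k], sorted by decreasing eigenvalue.
    """
    n = len(sym)
    a = [list(row) for row in sym]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes a[p][q], kept within +/- pi/4.
                y, x = 2.0 * a[p][q], a[q][q] - a[p][p]
                if x < 0.0:
                    y, x = -y, -x
                theta = 0.5 * math.atan2(y, x)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # a <- a J, v <- v J (columns p and q)
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
                for k in range(n):  # a <- J^T a (rows p and q)
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    order = sorted(range(n), key=lambda i: -a[i][i])
    return [a[i][i] for i in order], [[v[k][i] for k in range(n)] for i in order]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rotate_cov(t, cov):
    """Covariance of y = t x when x has covariance cov, i.e. t cov t^T."""
    t_transposed = [list(col) for col in zip(*t)]
    return matmul(matmul(t, cov), t_transposed)


def max_offdiag(m):
    return max(abs(m[i][j]) for i in range(len(m)) for j in range(len(m)) if i != j)


def tied_transform(weights, covs):
    """ASSUMPTION: one shared PCA transform from the weight-averaged covariance."""
    n, total = len(covs[0]), sum(weights)
    avg = [[sum(w * c[i][j] for w, c in zip(weights, covs)) / total
            for j in range(n)] for i in range(n)]
    return jacobi_eigh(avg)[1]


# Toy mixture: two full-covariance Gaussians (values are illustrative only).
weights = [0.6, 0.4]
covs = [
    [[2.0, 0.8, 0.3], [0.8, 1.0, 0.2], [0.3, 0.2, 0.5]],
    [[1.5, 0.6, 0.1], [0.6, 1.2, 0.3], [0.1, 0.3, 0.7]],
]

# Per-Gaussian PCA: rows of each transform are that Gaussian's eigenvectors.
per_gauss = [rotate_cov(jacobi_eigh(c)[1], c) for c in covs]

# Tied PCA: every Gaussian is rotated by the same matrix.
tied = tied_transform(weights, covs)
tied_rot = [rotate_cov(tied, c) for c in covs]

for k, c in enumerate(covs):
    print(f"Gaussian {k}: off-diagonal max original={max_offdiag(c):.3f} "
          f"per-Gaussian={max_offdiag(per_gauss[k]):.1e} tied={max_offdiag(tied_rot[k]):.3f}")
```

With a per-Gaussian transform the rotated covariance is diagonal up to rounding, so a diagonal model loses nothing. With a single tied transform, some off-diagonal correlation generally remains; that is the cost of storing one matrix instead of one per Gaussian.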
