Improved Model-Based PCA Transformation for GMM in Speaker Identification

Yao Zhi-qiang; Zhou Xi; Dai Bei-qian

doi:10.3724/SP.J.1146.2005.00749

Volume 29 Issue 2

Jan. 2011



Yao Zhi-qiang, Zhou Xi, Dai Bei-qian. Improved Model-Based PCA Transformation for GMM in Speaker Identification[J]. Journal of Electronics & Information Technology, 2007, 29(2): 469-472. doi: 10.3724/SP.J.1146.2005.00749


Received Date: 2005-06-27
Rev Recd Date: 2006-01-03
Publish Date: 2007-02-19

Abstract


Text-independent speaker identification with a Gaussian mixture model (GMM) requires a basic choice of covariance matrix form. A diagonal covariance matrix is generally chosen because a full covariance matrix needs too many parameters and too much computation, but the diagonal form carries the strong assumption that the elements of the feature vector are independent. Unfortunately, this assumption is not reasonable in most applications. To make feature vectors better suited to modeling with diagonal covariance, the features are usually de-correlated in feature space or model space. This paper presents an improved model-based PCA transformation algorithm that de-correlates the elements of the feature vectors. In this algorithm, principal component analysis is applied directly to the covariance matrices of the Gaussians, and the number of parameters is reduced by tying the PCA transformation across Gaussians. Experiments on the MSRA Mandarin task show that the algorithm reduces identification error by more than 35% compared with the best diagonal covariance models.
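The idea of applying PCA to the Gaussians' covariances and sharing one transformation among several Gaussians can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state how the tied transformation is estimated, so the sketch assumes it is taken from the eigenvectors of the weight-averaged full covariance of the tied group. The function names `tied_pca_transform` and `rotate_gaussians` and the toy data are hypothetical.

```python
import numpy as np


def tied_pca_transform(covs, weights):
    """Return an orthogonal matrix whose columns are the principal axes
    of the weight-averaged covariance of a group of Gaussians.

    ASSUMPTION: tying is done by pooling the group's full covariances
    with the mixture weights; the abstract does not specify this.
    """
    covs = np.asarray(covs, dtype=float)        # shape (K, d, d)
    weights = np.asarray(weights, dtype=float)  # shape (K,)
    pooled = np.einsum("k,kij->ij", weights / weights.sum(), covs)
    eigvals, eigvecs = np.linalg.eigh(pooled)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # largest variance first
    return eigvecs[:, order]


def rotate_gaussians(means, covs, U):
    """Express each Gaussian in the rotated space y = U^T x."""
    means = np.asarray(means, dtype=float)      # shape (K, d), one mean per row
    covs = np.asarray(covs, dtype=float)
    rot_means = means @ U                       # rows are (U^T mu_k)^T
    rot_covs = np.einsum("ia,kij,jb->kab", U, covs, U)  # U^T Sigma_k U
    diag_vars = np.einsum("kaa->ka", rot_covs)  # variances kept by a diagonal model
    return rot_means, rot_covs, diag_vars


# Toy data (hypothetical): two 2-D Gaussians that share the same principal axes.
theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
covs = np.stack([R @ np.diag([4.0, 1.0]) @ R.T,
                 R @ np.diag([9.0, 0.5]) @ R.T])
means = np.array([[0.0, 0.0], [2.0, 1.0]])
weights = np.array([0.5, 0.5])

U = tied_pca_transform(covs, weights)
rot_means, rot_covs, diag_vars = rotate_gaussians(means, covs, U)
```

In this toy case both Gaussians share the same axes, so the single tied rotation diagonalizes both covariances exactly. When the tied Gaussians have different axes, the rotated covariances are only approximately diagonal, which is the trade-off that tying accepts in exchange for fewer parameters.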


