Speaker recognition combining the fuzzy C-means (FCM) clustering algorithm with the vector quantization (VQ) algorithm
-
Abstract: This paper proposes a speaker recognition method that combines vector quantization (VQ) with the fuzzy C-means (FCM) clustering algorithm. Twelfth-order LPC (linear predictive coding) cepstral coefficients are extracted from the speech signal and used as the 12 features of each sample to be classified. VQ is first used to compute, for each speaker, a codebook that represents that speaker's feature parameters, and these codebooks serve as the cluster centers of the fuzzy clustering algorithm. Finally, the feature vectors to be recognized are clustered around these codebook centers to identify the speaker. The algorithm uses relatively few feature parameters and is computationally simple, yet achieves a higher recognition rate than the VQ algorithm alone.
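The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the codebooks are trained or how the final decision is made, so the sketch assumes plain k-means as a stand-in for VQ codebook training, the standard FCM membership formula with fuzzifier `m = 2`, and a decision rule that picks the speaker whose codewords collect the largest total membership. The function names (`train_codebook`, `fcm_memberships`, `identify`) and all parameter values are hypothetical, and the LPC cepstrum extraction step is not shown; inputs are assumed to be arrays of 12-dimensional feature vectors.

```python
import numpy as np

def train_codebook(features, n_codewords, n_iter=20, seed=0):
    """Build one speaker's codebook with plain k-means.

    Assumption: k-means stands in for the VQ training step, which the
    abstract does not specify. `features` has shape (n_frames, n_dims).
    """
    rng = np.random.default_rng(seed)
    start = rng.choice(len(features), n_codewords, replace=False)
    codebook = features[start].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every frame to every codeword.
        d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = d2.argmin(axis=1)
        for k in range(n_codewords):
            members = features[nearest == k]
            if len(members):  # leave an empty cell's codeword unchanged
                codebook[k] = members.mean(axis=0)
    return codebook

def fcm_memberships(frames, centers, m=2.0, eps=1e-12):
    """Standard FCM membership of each frame in each fixed center.

    u[i, k] = d_ik^(-2/(m-1)) / sum_j d_ij^(-2/(m-1)); each row sums to 1.
    """
    d2 = ((frames[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + eps
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def identify(frames, codebooks, m=2.0):
    """Return the speaker whose codewords gather the most membership.

    Assumption: this summed-membership rule is one plausible decision
    rule; the abstract does not state the one actually used.
    `codebooks` maps speaker name -> array of shape (n_codewords, n_dims).
    """
    names = list(codebooks)
    centers = np.vstack([codebooks[n] for n in names])
    owner = np.concatenate(
        [np.full(len(codebooks[n]), i) for i, n in enumerate(names)]
    )
    u = fcm_memberships(frames, centers, m)
    scores = np.array([u[:, owner == i].sum() for i in range(len(names))])
    return names[int(scores.argmax())]
```

In use, one would call `train_codebook` once per enrolled speaker on that speaker's training feature vectors, collect the results in a dictionary, and pass the feature vectors of an unknown utterance to `identify`. Keeping the centers fixed at the VQ codewords means no iterative FCM center update is needed at recognition time, which fits the abstract's claim of simple computation.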