Research on Text-Independent Speaker Recognition Methods Using Wavelet Neural Networks
Abstract: Neural-network-based speaker recognition can emulate the function of the human brain to some degree and is one of the main techniques in speaker recognition. However, it is usually difficult to determine the number of hidden-layer units, convergence is slow, and training easily falls into local minima. This paper studies a wavelet neural network model for speaker recognition and gives its network structure and learning algorithm. Using Mel-frequency cepstral coefficients (MFCCs) as the feature parameters for text-independent speaker recognition, a recognition experiment with 5 speakers was carried out with this model, achieving a recognition rate of 99.5%. The experimental results show that, compared with the traditional BP network, the wavelet network substantially improves both training speed and recognition rate. The approach has good application prospects and merits further research.
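The abstract does not reproduce the network structure, so the following is only an illustrative sketch of how a generic wavelet neural network scores a feature vector, not the paper's model. It assumes a Morlet-type mother wavelet, one hidden wavelet layer with per-unit shift and scale parameters, and a linear output layer with one score per speaker. All names (`morlet`, `wavelet_layer`, `identify`, the `params` keys) and all numeric values are hypothetical.

```python
import math

def morlet(t):
    """Morlet-type mother wavelet (an assumed choice, common in wavelet networks)."""
    return math.cos(1.75 * t) * math.exp(-0.5 * t * t)

def wavelet_layer(x, w_in, shift, scale):
    """Each hidden unit applies a shifted, scaled wavelet to a weighted input sum."""
    hidden = []
    for w_row, b, a in zip(w_in, shift, scale):
        net = sum(w * xi for w, xi in zip(w_row, x))
        hidden.append(morlet((net - b) / a))
    return hidden

def speaker_scores(x, params):
    """Linear output layer: one score per enrolled speaker."""
    hidden = wavelet_layer(x, params["w_in"], params["shift"], params["scale"])
    return [sum(w * h for w, h in zip(row, hidden)) for row in params["w_out"]]

def identify(x, params):
    """Return the index of the speaker with the highest score."""
    scores = speaker_scores(x, params)
    return max(range(len(scores)), key=lambda k: scores[k])

# Hypothetical toy parameters: 3 input features (standing in for an MFCC
# vector), 2 wavelet units, 5 speakers. Real values would come from training.
params = {
    "w_in": [[0.5, -0.2, 0.1], [0.3, 0.4, -0.6]],
    "shift": [0.0, 0.0],
    "scale": [1.0, 1.0],
    "w_out": [[0.1, 0.2], [0.5, 0.4], [-0.3, 0.1], [0.0, 0.0], [0.2, 0.2]],
}

features = [0.0, 0.0, 0.0]
print(identify(features, params))
```

In a trained network of this kind, the shift and scale of each wavelet unit are learned together with the connection weights, typically by gradient descent on the output error.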