A Two-step Criterion Algorithm of Speaker Segmentation

Yang Ji-Chen, He Qian-Hua, Li Yan-Xiong, Wang Wei-Ning

Citation: Yang Ji-Chen, He Qian-Hua, Li Yan-Xiong, Wang Wei-Ning. A Two-step Criterion Algorithm of Speaker Segmentation[J]. Journal of Electronics & Information Technology, 2010, 32(8): 2006-2009. doi: 10.3724/SP.J.1146.2009.01072


doi: 10.3724/SP.J.1146.2009.01072
Funding: Supported by the National Natural Science Foundation of China (60972132, 60602014)


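  • Abstract: To improve the accuracy of speaker segmentation (SS), this paper considers the roles of both silence information and gender information in SS and proposes a two-step decision SS algorithm. After the speech segments have been separated from the audio stream, SS is carried out in two steps. The first step relies mainly on pitch (fundamental frequency) information, with gender models as a supplement, and detects the speaker changes between adjacent speakers whose pitch differs greatly. The second step applies a gender-based improved T² criterion to segment adjacent speakers of the same gender whose pitch differs only slightly; for this purpose, the paper proposes a block-based algorithm for detecting potential speaker change points. Experimental results show that the proposed algorithm improves segmentation accuracy, with an F1 measure of up to 85.14%. For SS of short (2 s) speech segments, the algorithm reduces the missed detection rate by 16% compared with the conventional Bayesian information criterion (BIC) algorithm.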
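The two-step decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the pitch gap of 40 Hz, and the T² threshold of 50 are hypothetical, the gender models used as a supplement in step 1 are omitted, and step 2 uses a plain Hotelling-style T² statistic with a diagonal pooled covariance rather than the paper's gender-based improved criterion, whose exact form is not given on this page.

```python
from statistics import mean


def t2_diagonal(left, right):
    """Hotelling-style T^2 between two blocks of feature frames.

    Simplification (assumption): the pooled covariance is taken as
    diagonal, so each feature dimension contributes independently.
    left, right: non-empty lists of equal-length feature vectors.
    """
    a, b = len(left), len(right)
    total = 0.0
    for d in range(len(left[0])):
        xs = [frame[d] for frame in left]
        ys = [frame[d] for frame in right]
        mx, my = mean(xs), mean(ys)
        # Pooled variance of dimension d over both blocks.
        ss = sum((x - mx) ** 2 for x in xs) + sum((y - my) ** 2 for y in ys)
        pooled_var = max(ss / (a + b - 2), 1e-12)  # guard against zero variance
        total += (mx - my) ** 2 / pooled_var
    return a * b / (a + b) * total


def is_speaker_change(pitch_left, pitch_right, feats_left, feats_right,
                      pitch_gap_hz=40.0, t2_threshold=50.0):
    """Return (changed, step), where step names the rule that fired.

    Both thresholds are hypothetical placeholders, not values from the paper.
    """
    # Step 1: a large gap in mean pitch signals a change directly.
    if abs(mean(pitch_left) - mean(pitch_right)) > pitch_gap_hz:
        return True, "pitch"
    # Step 2: similar pitch, so compare the spectral feature blocks.
    if t2_diagonal(feats_left, feats_right) > t2_threshold:
        return True, "t2"
    return False, "none"
```

Ordering the checks this way means the cheaper pitch comparison handles the easy boundaries, and the T² comparison is only evaluated for adjacent segments whose pitch is similar, which is the case the abstract assigns to the second step.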
Publication history
  • Received: 2009-08-10
  • Revised: 2009-12-01
  • Published: 2010-08-19
