Automatic Segmentation and Classification Algorithms for Broadcast News Speech Recognition

Lü Ping (吕萍); Yan Yonghong (颜永红)

Publication history
- Received: 2005-04-06
- Revised: 2005-09-20
- Published: 2006-12-19

Audio Segmentation and Classification in a Broadcast News Task

Abstract

This paper describes automatic segmentation and classification algorithms for a Chinese broadcast news recognition task. A three-phase automatic segmentation system is proposed: coarse segmentation, fine segmentation, and smoothing cut the audio stream into audio segments (or sentences) that are easy to recognize. Two fine segmentation algorithms are presented: a dynamic noise tracking segmentation algorithm and a segmentation algorithm based on mono-phone decoding. Following methods used in speaker identification, a classification algorithm based on Gaussian mixture models is proposed; it assigns each audio segment to one of four groups (noise, music, male, and female) and handles the multi-class decision problem well. Experiments on Xin Wen Lian Bo broadcast news test data show that the performance of the proposed automatic segmentation and classification is almost equivalent to that of manual segmentation and classification.

Keywords: speech recognition; automatic segmentation; automatic classification
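
The abstract names a Gaussian-mixture classifier with four classes but gives no model details, so the following is only a minimal sketch of the general idea borrowed from speaker identification: train one mixture per class, then label a segment with the class whose model gives the highest average frame log-likelihood. The feature vectors here are synthetic 2-D points, and the function names, the number of mixture components, the variance floor, and the plain EM training loop are all illustrative assumptions, not the authors' configuration.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def frame_log_likelihood(x, gmm):
    """Log p(x | class) for a mixture given as (weights, means, variances)."""
    weights, means, variances = gmm
    return logsumexp([math.log(w) + log_gauss_diag(x, m, v)
                      for w, m, v in zip(weights, means, variances)])


def train_gmm(frames, n_components=2, n_iter=20, seed=0, var_floor=1e-3):
    """Fit a diagonal-covariance GMM with plain EM (hypothetical settings)."""
    rng = random.Random(seed)
    dim = len(frames[0])
    means = [list(f) for f in rng.sample(frames, n_components)]
    variances = [[1.0] * dim for _ in range(n_components)]
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each frame.
        resp = []
        for x in frames:
            logs = [math.log(w) + log_gauss_diag(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            norm = logsumexp(logs)
            resp.append([math.exp(value - norm) for value in logs])
        # M-step: re-estimate weights, means, and floored variances.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-10
            weights[k] = nk / len(frames)
            means[k] = [sum(r[k] * x[d] for r, x in zip(resp, frames)) / nk
                        for d in range(dim)]
            variances[k] = [max(var_floor,
                                sum(r[k] * (x[d] - means[k][d]) ** 2
                                    for r, x in zip(resp, frames)) / nk)
                            for d in range(dim)]
    return weights, means, variances


def classify_segment(frames, models):
    """Pick the class whose GMM has the highest average frame log-likelihood."""
    scores = {label: sum(frame_log_likelihood(x, gmm) for x in frames) / len(frames)
              for label, gmm in models.items()}
    return max(scores, key=scores.get), scores


def synthetic_frames(center, n, rng, spread=0.5):
    """Stand-in for real acoustic features: Gaussian points around a center."""
    return [[rng.gauss(c, spread) for c in center] for _ in range(n)]


# Toy data: one well-separated cluster per class (an assumption for the demo).
rng = random.Random(1)
centers = {"noise": (0.0, 0.0), "music": (4.0, 0.0),
           "male": (0.0, 4.0), "female": (4.0, 4.0)}
models = {label: train_gmm(synthetic_frames(center, 200, rng))
          for label, center in centers.items()}

test_segment = synthetic_frames(centers["female"], 50, rng)
label, scores = classify_segment(test_segment, models)
print(label, scores)
```

Averaging the log-likelihood over frames keeps scores comparable across segments of different lengths, which matters when the segmenter produces segments of varying duration. In a real system the frames would be acoustic feature vectors extracted from each segment; the abstract does not say which features were used.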
