Audio Segmentation and Classification in a Broadcast News Task

Lü Ping; Yan Yong-hong

Volume 28 Issue 12

Aug. 2010

Lü Ping, Yan Yong-hong. Audio Segmentation and Classification in a Broadcast News Task[J]. Journal of Electronics & Information Technology, 2006, 28(12): 2292-2295.

Received Date: 2005-04-06
Rev Recd Date: 2005-09-20
Publish Date: 2006-12-19

Abstract

This paper describes the development of an audio segmentation and classification system for a Chinese-language broadcast news task. A three-phase automatic audio segmentation algorithm is presented: the audio stream is cut into audio segments (or sentences) by simple segmentation, fine segmentation, and smoothing. Two fine segmentation algorithms are given: a dynamic noise tracking segmentation algorithm and a segmentation algorithm based on a mono-phone decoder. A classifier based on Gaussian mixture models assigns each audio segment to one of four groups: noise, music, male, and female. Experiments on Xin Wen Lian Bo broadcast news show that the performance of automatic segmentation and classification is almost equivalent to that of manual segmentation and classification.
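The classification step described above can be illustrated with a minimal sketch. It assumes one diagonal-covariance Gaussian mixture model per class and picks the class whose model gives the highest average frame log-likelihood for the segment. The abstract does not specify the features, the number of mixture components, or the decision rule, so all of these, along with the names `gmm_log_likelihood`, `classify_segment`, and `toy_models`, are hypothetical choices made for illustration and are not the authors' implementation.

```python
import numpy as np

# The four groups named in the abstract.
CLASSES = ("noise", "music", "male", "female")


def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D) feature vectors; weights: (K,); means, variances: (K, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                      # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)  # (T, K)
    weighted = log_comp + np.log(weights)                              # (T, K)
    # Numerically stable log-sum-exp over the mixture components.
    peak = weighted.max(axis=1, keepdims=True)
    per_frame = peak[:, 0] + np.log(np.exp(weighted - peak).sum(axis=1))
    return float(per_frame.mean())


def classify_segment(frames, models):
    """Return the best-scoring class label and the score of every class."""
    scores = {
        label: gmm_log_likelihood(frames, *params)
        for label, params in models.items()
    }
    return max(scores, key=scores.get), scores


def toy_models(dim=2):
    """Hypothetical two-component models with well-separated means.

    Real models would be trained on labelled audio features; these
    hand-set parameters exist only to make the sketch runnable.
    """
    models = {}
    for i, label in enumerate(CLASSES):
        center = 5.0 * i
        weights = np.array([0.5, 0.5])
        means = np.array([[center - 0.5] * dim, [center + 0.5] * dim])
        variances = np.ones((2, dim))
        models[label] = (weights, means, variances)
    return models


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    segment = rng.normal(loc=10.0, scale=1.0, size=(200, 2))
    label, scores = classify_segment(segment, toy_models())
    print(label, scores)
```

In this sketch a segment is scored as a whole by averaging over its frames, so the decision does not depend on segment length. Whether the original system scores per segment or per frame is not stated in the abstract.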
