A Predictive Phase Vector Quantization Method in WI Speech Coding
doi: 10.3724/SP.J.1146.2006.00608
Abstract: Traditional low bit-rate speech coders often discard phase information on the grounds that the human ear is insensitive to it, which makes the decoded speech rough and harsh and can even alter its pitch. A high-quality vocoder therefore cannot ignore the phase of speech. Building on the dispersion phase vector quantization method, this paper removes further phase redundancy: within the waveform interpolation (WI) coding model, the difference between the phase spectra of the slowly evolving waveforms (SEWs) of adjacent frames is quantized with predictive vector quantization. Experiments show that the method markedly improves the reconstructed speech, with clearly higher naturalness and intelligibility. Subjective A/B listening tests indicate that, with 4 to 6 bits of phase quantization, the synthesized speech quality is significantly better than with the fixed-phase method, and that, compared with the dispersion phase vector quantization method, synthesis quality improves somewhat for female speakers.
Keywords: speech coding; waveform interpolation; vector quantization
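The abstract describes quantizing the frame-to-frame difference of the SEW phase spectrum with predictive vector quantization. A minimal sketch of that general idea follows. It is not the paper's implementation: the first-order predictor, its coefficient `alpha`, the vector dimension, the wrapped squared-error distance, and the random codebook built by `make_demo_codebook` are all assumptions introduced here for illustration (a real coder would train the codebook and might apply perceptual weighting).

```python
import math
import random


def wrap(angle):
    """Wrap an angle in radians into the principal range around zero."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def make_demo_codebook(bits, dim, seed=0):
    """Hypothetical stand-in for a trained codebook: 2**bits random vectors."""
    rng = random.Random(seed)
    return [[rng.uniform(-math.pi / 4, math.pi / 4) for _ in range(dim)]
            for _ in range(2 ** bits)]


def nearest_codeword(vector, codebook):
    """Index of the codeword with the smallest wrapped squared error."""
    def dist(codeword):
        return sum(wrap(v - c) ** 2 for v, c in zip(vector, codeword))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def reconstruct_step(idx, prev_phase, prev_diff, codebook, alpha):
    """Shared encoder/decoder update: prediction plus codeword gives the new state."""
    pred = [alpha * d for d in prev_diff]            # assumed first-order predictor
    q_diff = [wrap(p + c) for p, c in zip(pred, codebook[idx])]
    phase = [wrap(q + d) for q, d in zip(prev_phase, q_diff)]
    return phase, q_diff


def encode(frames, codebook, alpha=0.8):
    """Quantize each frame's phase vector; return indices and reconstructions."""
    dim = len(codebook[0])
    prev_phase, prev_diff = [0.0] * dim, [0.0] * dim
    indices, recon = [], []
    for phase in frames:
        # Phase-spectrum difference against the previously reconstructed frame.
        diff = [wrap(p - q) for p, q in zip(phase, prev_phase)]
        pred = [alpha * d for d in prev_diff]
        residual = [wrap(d - e) for d, e in zip(diff, pred)]
        idx = nearest_codeword(residual, codebook)
        prev_phase, prev_diff = reconstruct_step(
            idx, prev_phase, prev_diff, codebook, alpha)
        indices.append(idx)
        recon.append(prev_phase)
    return indices, recon


def decode(indices, codebook, alpha=0.8):
    """Rebuild the phase vectors from the transmitted indices alone."""
    dim = len(codebook[0])
    prev_phase, prev_diff = [0.0] * dim, [0.0] * dim
    recon = []
    for idx in indices:
        prev_phase, prev_diff = reconstruct_step(
            idx, prev_phase, prev_diff, codebook, alpha)
        recon.append(prev_phase)
    return recon
```

With `bits` set between 4 and 6, each frame costs one index of that width, matching the bit range quoted in the abstract. The encoder predicts from its own reconstructed state rather than from the original phases, so the decoder can track it exactly from the indices.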