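A Chinese Pinyin-to-Character Conversion Method That Incorporates Part-of-Speech Information into a Trigram Statistical Model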

梅勇; 王群生; 徐秉铮

Publication history
- Received: 1997-01-27
- Revised: 1997-12-10
- Published: 1998-09-19

A KIND OF CHINESE TRANSITION METHOD FROM SPELLING TO CHARACTER TAKING INTO ACCOUNT POS INFORMATION IN A TRIGRAM-BASED STATISTICAL MODEL

Abstract

Abstract: This paper presents a combined language model for Chinese that incorporates part-of-speech (POS) information into a trigram model. Theoretical analysis and experiments both show that the combined model not only has lower perplexity (PP) than the trigram model, but also depends less on the domain of the test text.

Keywords: trigram model; POS trigram model; combined model; perplexity; robustness
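
The two quantities the abstract compares can be illustrated with a minimal sketch. It assumes the word-trigram and POS-trigram estimates are combined by linear interpolation; the abstract does not give the paper's actual combination formula, so `combined_prob`, its parameter names, and the weight `lam` are hypothetical. Perplexity is computed in the standard way, as 2 raised to the average negative log2 probability per token.

```python
import math


def perplexity(probs):
    """Perplexity of a test sequence from its per-token model probabilities."""
    log_sum = sum(math.log2(p) for p in probs)
    return 2 ** (-log_sum / len(probs))


def combined_prob(p_word_trigram, p_class_trigram, p_word_given_class, lam):
    """Assumed combination: linear interpolation of two estimates.

    p_word_trigram     -- P(w3 | w1, w2) from the word trigram model
    p_class_trigram    -- P(c3 | c1, c2) from the POS trigram model
    p_word_given_class -- P(w3 | c3)
    lam                -- interpolation weight in [0, 1] (hypothetical)
    """
    return lam * p_word_trigram + (1 - lam) * p_class_trigram * p_word_given_class


# A word trigram unseen in training (probability 0) still receives
# probability mass through the POS path.
p_unseen = combined_prob(0.0, 0.5, 0.2, lam=0.5)

# Four tokens, each assigned probability 0.25 by the model.
pp_uniform = perplexity([0.25] * 4)
```

Under this assumption, a word sequence that never occurred in the training text is still scored through its POS sequence, which is one plausible reading of why a combined model would depend less on the test domain.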
