Comparison of Web-Based Unsupervised Translation Disambiguation Word Model and N-gram Model

Liu Peng-yuan, Zhao Tie-jun

Citation: Liu Peng-yuan, Zhao Tie-jun. Comparison of Web-Based Unsupervised Translation Disambiguation Word Model and N-gram Model[J]. Journal of Electronics & Information Technology, 2009, 31(12): 2969-2974. doi: 10.3724/SP.J.1146.2008.01624

doi: 10.3724/SP.J.1146.2008.01624
Funding: Supported by the National Key Basic Research Development Program of China (2004CB318102)

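  • Abstract: This paper proposes two Web-based unsupervised translation disambiguation methods, a word model and an N-gram model, and compares them under conditions kept as similar as possible. Both methods use the page counts that a search engine returns for different query fragments on the Web as their main disambiguation information. The word model defines a bilingual Web relatedness between Chinese words and English words, and disambiguates according to the relatedness between the Chinese context words and each English translation. The N-gram model starts from the assumption that the N-gram sequences of a polysemous word behave differently under different senses, so the N-gram sequences that the words of each sense class form within an instance can be counted and analyzed for disambiguation. Both models outperform the best comparable unsupervised system on task #5 of the international semantic evaluation SemEval-2007. An experimental comparison shows that the N-gram model outperforms the word model, and also indicates that combining the results of the two models has the potential to further improve disambiguation performance.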
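The two models described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything below is an assumption: the relatedness measure is a generic PMI-style score, the N-gram score is a plain sum of phrase page counts, and the page counts, the example word 材料 with its translations "material" and "data", and all function names are hypothetical. A dictionary lookup stands in for the search engine.

```python
import math
from typing import Callable, Dict, List, Sequence

PageCount = Callable[[str], int]

# Hypothetical page counts standing in for search-engine results.
# None of these numbers come from the paper.
TOTAL_PAGES = 1_000_000
TOY_COUNTS: Dict[str, int] = {
    "钢铁": 5000, "建筑": 9000, "material": 20000, "data": 30000,
    "钢铁 material": 90, "钢铁 data": 20,
    "建筑 material": 120, "建筑 data": 40,
    '"建筑原料市场"': 150, '"建筑资料市场"': 4,
}


def toy_count(query: str) -> int:
    """Mimic a search engine page count; unseen queries return 0."""
    return TOY_COUNTS.get(query, 0)


def web_relatedness(zh: str, en: str, count: PageCount, total: int) -> float:
    """Assumed PMI-style relatedness between a Chinese and an English word."""
    joint = count(f"{zh} {en}") + 0.5  # additive smoothing avoids log(0)
    return math.log(joint * total / ((count(zh) + 0.5) * (count(en) + 0.5)))


def word_model(context: Sequence[str], translations: Sequence[str],
               count: PageCount, total: int) -> str:
    """Pick the translation most related to the Chinese context words."""
    return max(
        translations,
        key=lambda en: sum(web_relatedness(zh, en, count, total) for zh in context),
    )


def ngram_model(left: str, right: str, sense_words: Dict[str, List[str]],
                count: PageCount) -> str:
    """Pick the sense whose class words, substituted for the target word
    in the instance's N-gram, yield the most exact-phrase page hits."""
    def score(sense: str) -> int:
        return sum(count(f'"{left}{w}{right}"') for w in sense_words[sense])
    return max(sense_words, key=score)


# Toy instance: the ambiguous word 材料 appears in "建筑材料市场".
word_choice = word_model(["钢铁", "建筑"], ["material", "data"], toy_count, TOTAL_PAGES)
ngram_choice = ngram_model("建筑", "市场",
                           {"material": ["原料"], "data": ["资料"]}, toy_count)
print(word_choice, ngram_choice)
```

Passing the page-count function as a parameter keeps the scoring logic separate from the data source, so the toy lookup could be swapped for a real search-engine client without touching either model.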
Publication history
  • Received:  2008-12-05
  • Revised:  2009-05-07
  • Published:  2009-12-19