Query Expansion Based on a Statistical Machine Translation Model

Li Wei-jiang, Zhao Tie-jun, Wang Xian-gang

Citation: Li Wei-jiang, Zhao Tie-jun, Wang Xian-gang. A SMT-based Approach for Query Expansion in Information Retrieval[J]. Journal of Electronics & Information Technology, 2008, 30(3): 725-729. doi: 10.3724/SP.J.1146.2006.01382

doi: 10.3724/SP.J.1146.2006.01382
Funding: Supported by the Key Program of the National Natural Science Foundation of China (60435020) and a Microsoft Research Asia project.

A SMT-based Approach for Query Expansion in Information Retrieval

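  • Abstract: In practical information retrieval applications such as search engines, the queries that users submit usually contain only a few keywords. This causes a term mismatch between relevant documents and the user's query, which seriously degrades retrieval performance. Building on an analysis of the query generation model, this paper proposes a new query expansion method based on statistical machine translation. A statistical machine translation model is used to extract terms from the document collection that are associated with the query terms, and these terms are then used to expand the query. Experiments on TREC data sets show that the SMT-based query expansion method consistently improves on the unexpanded language modeling approach by 12% to 17%, and achieves average precision comparable to that of pseudo-relevance feedback, a popular query expansion method.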
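The abstract describes extracting terms associated with the query terms by means of a statistical translation model, but it does not say which model or training data are used. As a rough illustration only, the sketch below assumes an IBM Model 1 style model without a NULL word, trained by EM on hypothetical (query, document) token pairs. The names `train_model1`, `expand_query`, and the toy `pairs` data are inventions for this example and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical toy training data: (query tokens, document tokens).
pairs = [
    (["car", "repair"], ["automobile", "mechanic", "garage"]),
    (["car", "price"], ["automobile", "dealer", "cost"]),
    (["flight", "price"], ["airline", "ticket", "cost"]),
]

def train_model1(pairs, iterations=10):
    """Estimate t[(q, w)], the probability that document word w
    'translates' into query word q, with IBM Model 1 style EM."""
    query_vocab = {q for query, _ in pairs for q in query}
    uniform = 1.0 / len(query_vocab)
    t = defaultdict(lambda: uniform)  # uniform start for every (q, w)
    for _ in range(iterations):
        count = defaultdict(float)  # expected (q, w) alignment counts
        total = defaultdict(float)  # expected alignment counts per w
        for query, doc in pairs:
            for q in query:
                # E-step: spread q over the document words in proportion to t.
                norm = sum(t[(q, w)] for w in doc)
                for w in doc:
                    frac = t[(q, w)] / norm
                    count[(q, w)] += frac
                    total[w] += frac
        # M-step: renormalize so that sum over q of t[(q, w)] equals 1.
        t = defaultdict(float, {(q, w): c / total[w]
                                for (q, w), c in count.items()})
    return t

def expand_query(query, t, k=3):
    """Append the k document words with the highest summed translation
    probability into the query terms (an assumed scoring rule)."""
    scores = defaultdict(float)
    for (q, w), p in t.items():
        if q in query and w not in query:
            scores[w] += p
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return list(query) + ranked[:k]

t = train_model1(pairs)
expanded = expand_query(["car"], t, k=1)
print(expanded)
```

The alphabetical tie-break only serves to make the ranking deterministic. How expansion terms should be weighted against the original query terms is left open here, since the abstract does not specify it.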
Publication history
  • Received:  2006-09-26
  • Revised:  2007-01-26
  • Published:  2008-03-19