Word Similarity Measurement Based on Concept Primitive

CHI Zhejie, ZHANG Quan

Citation: CHI Zhejie, ZHANG Quan. Word Similarity Measurement Based on Concept Primitive[J]. Journal of Electronics & Information Technology, 2017, 39(1): 150-158. doi: 10.11999/JEIT160176


doi: 10.11999/JEIT160176
Funds: The Twelfth Five-Year Project of National 863 Program of China (2012AA011102), The State Language Commission Twelfth Five-Year Research Project (YB125-53)

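  • Abstract: Word similarity computation plays an important role in many fields, including machine translation and information retrieval. Using the concept primitive symbol system of Hierarchical Network of Concepts (HNC) theory as the semantic resource, and guided by the idea of contrasting commonality with difference, this paper proposes a multi-dimensional word similarity measure that covers hierarchy, network structure, contrast and duality properties, attachment properties, and five-tuple information. In the node depth and node distance measures, weights are introduced to increase the discrimination between different levels. Experiments on a manually scored test set show that the computed similarity agrees well with human judgment: the compatibility, correlation coefficient, and ordered-pair agreement reach 0.812, 0.786, and 0.775, respectively. A correlation test likewise shows that the computed values are significantly correlated with the human scores.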
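The abstract does not give the formulas, so the following is only a rough sketch of what "weighting node depth to separate levels" could look like in a concept hierarchy. It is a Wu-Palmer-style ratio with a geometric edge weight swapped in, not the authors' measure; the toy hierarchy, the `decay` parameter, and all function names are hypothetical and are not taken from the HNC symbol system.

```python
from typing import Dict, List, Optional

# Hypothetical toy hierarchy (child -> parent); not HNC concept primitives.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "animal": "root",
    "plant": "root",
    "dog": "animal",
    "cat": "animal",
    "tree": "plant",
}


def path_to_root(node: str) -> List[str]:
    """Return the chain of nodes from `node` up to the root (node first)."""
    path = []
    cur: Optional[str] = node
    while cur is not None:
        path.append(cur)
        cur = PARENT[cur]
    return path


def weighted_depth(node: str, decay: float = 0.5) -> float:
    """Sum of edge weights from the root down to `node`.

    Assumption: the edge entering a node at depth d (root = 0) weighs
    decay ** (d - 1), so edges near the root count more than deep ones.
    """
    depth = len(path_to_root(node)) - 1
    return sum(decay ** (d - 1) for d in range(1, depth + 1))


def lowest_common_ancestor(a: str, b: str) -> str:
    """Deepest node that lies on both root paths."""
    ancestors_a = set(path_to_root(a))
    for candidate in path_to_root(b):
        if candidate in ancestors_a:
            return candidate
    raise ValueError("nodes do not share a root")


def similarity(a: str, b: str, decay: float = 0.5) -> float:
    """Wu-Palmer-style ratio computed on weighted depths, in [0, 1]."""
    shared = weighted_depth(lowest_common_ancestor(a, b), decay)
    total = weighted_depth(a, decay) + weighted_depth(b, decay)
    if total == 0:  # both nodes are the root
        return 1.0
    return 2 * shared / total


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "animal"), ("dog", "tree")]:
        print(pair, round(similarity(*pair), 3))
```

With `decay = 0.5`, siblings under a shared parent score higher than they would with unit edge weights (2/3 instead of 1/2 for "dog" and "cat"), because the shared upper-level edge carries more weight than the edges that separate them. Any real use would need weights fitted to the resource and validated against human scores, as the paper does with its test set.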
Publication history
  • Received:  2016-02-25
  • Revised:  2016-09-14
  • Published:  2017-01-19
