高级搜索

留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

基于整数线性规划重构抽象语义图结构的语义摘要算法

陈鸿昶 明拓思宇 刘树新 高超

陈鸿昶, 明拓思宇, 刘树新, 高超. 基于整数线性规划重构抽象语义图结构的语义摘要算法[J]. 电子与信息学报, 2019, 41(7): 1674-1681. doi: 10.11999/JEIT180720
引用本文: 陈鸿昶, 明拓思宇, 刘树新, 高超. 基于整数线性规划重构抽象语义图结构的语义摘要算法[J]. 电子与信息学报, 2019, 41(7): 1674-1681. doi: 10.11999/JEIT180720
Hongchang CHEN, Tuosiyu MING, Shuxin LIU, Chao GAO. Semantic Summarization of Reconstructed Abstract Meaning Representation Graph Structure Based on Integer Linear Pragramming[J]. Journal of Electronics & Information Technology, 2019, 41(7): 1674-1681. doi: 10.11999/JEIT180720
Citation: Hongchang CHEN, Tuosiyu MING, Shuxin LIU, Chao GAO. Semantic Summarization of Reconstructed Abstract Meaning Representation Graph Structure Based on Integer Linear Pragramming[J]. Journal of Electronics & Information Technology, 2019, 41(7): 1674-1681. doi: 10.11999/JEIT180720

基于整数线性规划重构抽象语义图结构的语义摘要算法

doi: 10.11999/JEIT180720
基金项目: 国家自然科学基金(61521003),国家自然科学基金青年科学基金(61601513)
详细信息
    作者简介:

    陈鸿昶:男,1964年生,教授,博士生导师,研究方向为通信与信息工程、网络大数据

    明拓思宇:男,1994年生,硕士生,研究方向为网络大数据、文本摘要

    刘树新:男,1987年生,助理研究员,研究方向为网络大数据、复杂网络

    高超:男,1982年生,助理研究员,研究方向为网络大数据、计算机视觉

    通讯作者:

    明拓思宇 1139446336@qq.com

  • 中图分类号: TP391.1

Semantic Summarization of Reconstructed Abstract Meaning Representation Graph Structure Based on Integer Linear Pragramming

Funds: The National Natural Science Foundation of China (61521003), The National Natural Science Foundation of China Youth Science Fund (61601513)
  • 摘要: 针对利用抽象语义(AMR)图来预测摘要子图存在的语义结构不完整问题,该文提出一种基于整数线性规划(ILP)重构AMR图结构的语义摘要算法。首先将数据预处理生成一个AMR总图;然后基于统计特征从AMR总图中抽取出摘要子图重要节点信息;最后利用ILP的方法来对摘要子图中节点关系进行重构,利用完整的摘要子图恢复生成语义摘要。实验结果表明,相比其他语义摘要方法,所提方法的ROUGE值和Smatch值都有显著提高,最多分别提高了9%和14%,该方法有利于提高语义摘要的质量。
  • 图  1  算法框架图

    图  2  英文句“I saw Joe’s dog, which was running in the garden”的AMR图表示

    图  3  AMR图合并生成AMR总图示意图

    图  4  实验结果AMR图与标准摘要AMR图的对比

    图  5  L值对摘要质量各指标的影响

    表  1  摘要子图节点和边预测正确率(%)

    PRF1
    节点71.482.576.5
    45.660.151.9
    下载: 导出CSV

    表  2  不同语义摘要算法的性能对比

    算法ROUGE-1ROUGE-2ROUGE-WSmatch
    外部语义资源20.45.614.317.8
    语义聚类21.26.015.219.1
    潜在语义分析22.86.814.920.5
    TextRank算法25.78.116.824.6
    PAS语义图26.59.618.628.9
    本文方法29.310.419.632.1
    下载: 导出CSV

    表  3  使用ILP和未使用ILP摘要质量对比

    ROUGE-1ROUGE-2ROUGE-WSmatch
    未使用ILP29.19.818.729.7
    使用ILP29.310.419.632.1
    结果提升0.20.60.92.4
    下载: 导出CSV

    表  4  与深度学习算法的性能对比

    方法ROUGE-1ROUGE-2ROUGE-WSmatch
    本文方法29.310.419.632.1
    深度学习33.413.624.826.7
    下载: 导出CSV
  • LYNN H M, CHOI C, and KIM P. An improved method of automatic text summarization for web contents using lexical chain with semantic-related terms[J]. Soft Computing, 2018, 22(12): 4013–4023. doi: 10.1007/s00500-017-2612-9
    SHETTY K and KALLIMANI J S. Automatic extractive text summarization using K-means clustering[C]. International Conference on Electrical, Electronics, Communication, Computer, and Optimization Techniques (ICEECCOT), Mysuru, India, 2017: 1–9.
    YU Shanshan, SU Jindian, LI Pengfei, et al. Towards high performance text mining: A TextRank-based method for automatic text summarization[J]. International Journal of Grid and High Performance Computing (IJGHPC) , 2016, 8(2): 58–75. doi: 10.4018/IJGHPC.2016040104
    NGUYEN-HOANG T A, NGUYEN K, and TRAN Q V. TSGVi: A graph-based summarization system for Vietnamese documents[J]. Journal of Ambient Intelligence and Humanized Computing, 2012, 3(4): 305–313. doi: 10.1007/s12652-012-0143-x
    KHAN A, SALIM N, FARMAN H, et al. Abstractive text summarization based on improved semantic graph approach[J]. International Journal of Parallel Programming, 2018: 1–25. doi: 10.1007/s10766-018-0560-3
    BANARESU L, BONIAL C, CAI S, et al. Abstract meaning representation for sembanking[C]. Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, Sofia, Bulgaria, 2013: 178–186.
    LIU Fei, FLANIGAN J, THOMSON S, et al. Toward abstractive summarization using semantic representations[C]. Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, USA, 2015: 1077–1086.
    SONG Linfeng, PENG Xiaochang, ZHANG Yue, et al. AMR-to-text generation with synchronous node replacement grammar[C]. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada, 2017: 7–13.
    KONSTAS I, IYER S, YATSKAR M, et al. Neural AMR: Sequence-to-sequence models for parsing and generation[C]. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada, 2017: 146–157.
    明拓思宇, 陈鸿昶, 黄瑞阳, 等. 一种基于加权AMR图的语义子图预测摘要算法[J]. 计算机工程, 2018, 44(10): 292–297. doi: 10.19678/j.issn.1000-3428.0050770

    MING Tuosiyu, CHEN Hongchang, HUANG Ruiyang, et al. A semantic subgraph predictive summary algorithm based on improved AMR graph[J]. Computer Engineering, 2018, 44(10): 292–297. doi: 10.19678/j.issn.1000-3428.0050770
    COLLINS M. Discriminative training methods for hidden markov models: Theory and experiments with perceptron algorithms[C]. Proceedings of the ACL-02 conference on Empirical Methods in Natural Language Processing, Philadelphia, USA, 2002, 10: 1–8.
    HERMANN K M, KOČISKÝ T, GREFENSTETTE E, et al. Teaching machines to read and comprehend[C]. Proceeding NIPS’15 Proceedings of the 28th International Conference on Neural Information Processing Systems, Montreal, Canada, 2015, 1: 1693–1701.
    LIN Chinyew. ROUGE: A package for automatic evaluation of summaries[C]. Text Summarization Branches Out: Proceedings of the ACL-04 Workshop, Barcelona, Spain, 2004, 10: 74–81.
    CAI Shu and KNIGHT K. Smatch: An evaluation metric for semantic feature structures[C]. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Sofia, Bulgaria, 2013, 2: 748–752.
    SEE A, LIU P J, and MANNING C D. Get to the point: Summarization with pointer-generator networks[C]. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada, 2017, 1: 1073–1083.
    TAN Jiwei, WAN Xiaojun, and XIAO Jianguo. Abstractive document summarization with a graph-based attentional neural model[C]. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada, 2017, 1: 1171–1181.
  • 加载中
图(5) / 表(4)
计量
  • 文章访问数:  2424
  • HTML全文浏览量:  916
  • PDF下载量:  99
  • 被引次数: 0
出版历程
  • 收稿日期:  2018-07-18
  • 修回日期:  2018-10-26
  • 网络出版日期:  2018-11-19
  • 刊出日期:  2019-07-01

目录

    /

    返回文章
    返回