An Interactive Graph Attention Networks Model for Aspect-level Sentiment Analysis
Abstract: Aspect-level sentiment analysis currently relies mainly on combining attention mechanisms with traditional neural networks to model aspects and their context words. These methods ignore the syntactic dependency information and the position information between aspects and context words in a sentence, which leads to an unreasonable allocation of attention weights. To address this, an Interactive Graph ATtention networks (IGATs) model for aspect-level sentiment analysis is proposed. The model first uses a Bidirectional Long Short-Term Memory (BiLSTM) network to learn the semantic feature representation of the sentence and combines it with position information to generate a new sentence feature representation. A graph attention network is then built on the newly generated representation to capture syntactic dependency information, and an interactive attention mechanism models the semantic relations between the aspect and the context words. Finally, softmax is used for classification. Experimental results on three public datasets show that IGATs significantly improves accuracy and macro-averaged F1 compared with other existing models.
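The middle of the pipeline described in the abstract (position-aware features followed by graph attention over dependency edges) can be illustrated with a minimal sketch. It is not the authors' implementation. The linear-decay `position_weights` scheme, the single-head attention form (the standard GAT formulation), the toy feature vectors, and the hand-written dependency edges are all assumptions made for illustration, since the abstract does not give the exact formulas.

```python
import math

def position_weights(n, a_start, a_end):
    """Hypothetical linear-decay weights: 1.0 on the aspect span,
    decreasing with token distance from it (an assumed scheme)."""
    weights = []
    for i in range(n):
        if i < a_start:
            dist = a_start - i
        elif i > a_end:
            dist = i - a_end
        else:
            dist = 0
        weights.append(1.0 - dist / n)
    return weights

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, h):
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def gat_layer(H, adj, W, a):
    """Single-head graph attention (standard GAT form, assumed here).

    H   : n node feature vectors (e.g. position-weighted BiLSTM states)
    adj : n x n 0/1 adjacency matrix built from the dependency tree
    W   : d_out x d_in projection matrix
    a   : attention vector of length 2 * d_out
    Returns the updated node features and the n x n attention matrix.
    """
    Z = [matvec(W, h) for h in H]
    n = len(Z)
    out, attn = [], []
    for i in range(n):
        # Attend only to syntactic neighbours; a self-loop is always kept.
        nbrs = [j for j in range(n) if adj[i][j] or j == i]
        scores = [leaky_relu(sum(p * q for p, q in zip(a, Z[i] + Z[j])))
                  for j in nbrs]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        alphas = [e / total for e in exps]
        row = [0.0] * n
        for j, alpha in zip(nbrs, alphas):
            row[j] = alpha
        attn.append(row)
        out.append([sum(alpha * Z[j][k] for j, alpha in zip(nbrs, alphas))
                    for k in range(len(Z[i]))])
    return out, attn

# Toy sentence "the food was great" with aspect "food" (token index 1).
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]  # stand-in for BiLSTM output
weights = position_weights(4, 1, 1)
H = [[w * x for x in h] for w, h in zip(weights, states)]

edges = [(0, 1), (1, 3), (2, 3)]  # hand-written dependency edges, illustrative only
adj = [[0] * 4 for _ in range(4)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1

W = [[1.0, 0.0], [0.0, 1.0]]   # identity projection for readability
a = [0.5, -0.5, 0.25, 0.25]    # arbitrary attention parameters
out, attn = gat_layer(H, adj, W, a)
```

Each row of `attn` is a probability distribution restricted to a token's syntactic neighbours, which is how a dependency graph can keep attention away from syntactically unrelated words. In a trained model, `W` and `a` would be learned and the layer would feed the interactive attention step.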
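Table 1 Dataset statistics

| Dataset | Positive | Neutral | Negative |
|---|---|---|---|
| Twitter-train | 1561 | 3127 | 1560 |
| Twitter-test | 173 | 346 | 173 |
| Laptop-train | 994 | 464 | 870 |
| Laptop-test | 341 | 169 | 128 |
| Restaurant-train | 2164 | 637 | 807 |
| Restaurant-test | 728 | 196 | 196 |

Table 2 Experimental platform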
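
| Environment | Details |
|---|---|
| Operating system | Windows 10 Education |
| CPU | Intel(R) Core(TM) i7-7700 CPU @ 3.60 GHz |
| Memory | 16.0 GB |
| GPU | GTX 1080 |
| GPU memory | 8.0 GB |

Table 3 Hyperparameter settings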
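
| Hyperparameter | Value |
|---|---|
| Word embedding dimension | 300 |
| Hidden state dimension | 300 |
| Batch size | 16 |
| Training epochs | 100 |
| Optimizer | Adam |
| Learning rate | 0.001 |
| Dropout rate | 0.3 |
| L2 regularization coefficient | 0.00001 |

Table 4 Performance comparison of the models (%)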
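
| Model | Twitter Acc | Twitter Macro-F1 | Laptop Acc | Laptop Macro-F1 | Restaurant Acc | Restaurant Macro-F1 |
|---|---|---|---|---|---|---|
| SVM | 63.40 | 63.30 | 70.49 | N/A | 80.16 | N/A |
| LSTM | 69.56 | 67.70 | 69.28 | 63.09 | 78.13 | 67.47 |
| MemNet | 71.48 | 69.90 | 70.64 | 65.17 | 79.61 | 69.64 |
| IAN | 72.50 | 70.81 | 72.05 | 67.38 | 79.26 | 70.09 |
| AOA | 72.30 | 70.20 | 72.62 | 67.52 | 79.97 | 70.42 |
| AOA-MultiACIA | 72.40 | 69.40 | 75.27 | 70.24 | 82.59 | 72.13 |
| ASGCN | 72.15 | 70.40 | 75.55 | 71.05 | 80.77 | 72.02 |
| GATs | 73.12 | 71.25 | 74.61 | 70.51 | 80.63 | 70.41 |
| IGATs | 75.29 | 73.40 | 76.02 | 72.05 | 82.32 | 73.99 |

Table 5 Number of trainable parameters of each model (M)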

| Model | Trainable parameters (M) |
|---|---|
| SVM | – |
| LSTM | 0.72 |
| MemNet | 0.36 |
| IAN | 2.17 |
| AOA | 2.10 |
| ASGCN | 2.17 |
| GATs | 1.81 |
| IGATs | 1.81 |

Table 6 Ablation study (%)

| Model | Twitter Acc | Twitter Macro-F1 | Laptop Acc | Laptop Macro-F1 | Restaurant Acc | Restaurant Macro-F1 |
|---|---|---|---|---|---|---|
| BiLSTM+IAtt | 74.13 | 72.86 | 75.08 | 70.82 | 81.25 | 72.14 |
| BiLSTM+GAT+IAtt | 74.86 | 72.98 | 74.92 | 71.08 | 82.05 | 73.45 |
| BiLSTM+PE+IAtt | 74.42 | 72.35 | 76.65 | 72.75 | 82.23 | 74.01 |
| IGATs | 75.29 | 73.40 | 76.02 | 72.05 | 82.32 | 73.99 |
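
Tables 4 and 6 report accuracy and macro-averaged F1 over the three polarity classes. The source does not spell out how the metrics are computed, so the sketch below assumes the standard definitions: accuracy is the fraction of correct predictions, and macro-F1 is the unweighted mean of the per-class F1 scores. The function names and the toy labels are illustrative only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred, labels=("positive", "neutral", "negative")):
    """Unweighted mean of per-class F1, so rare classes count as much as frequent ones."""
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy example with four sentences (labels invented for illustration).
gold = ["positive", "positive", "neutral", "negative"]
pred = ["positive", "neutral", "neutral", "negative"]
acc = accuracy(gold, pred)   # 3 of 4 correct
f1 = macro_f1(gold, pred)    # mean of 2/3, 2/3 and 1
```

Because macro-F1 weights the classes equally, it is more sensitive than accuracy to minority-class performance, which matters for the imbalanced class distributions in Table 1.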
