Multi-interaction Graph Convolutional Networks for Aspect-level Sentiment Analysis
Abstract: Aspect-level sentiment analysis is a fine-grained sentiment analysis task that aims to identify the sentiment polarity of a specific aspect in a sentence. Traditional attention-based methods perform only a single semantic interaction between words and do not model the syntactic interaction between aspect words and context words, so aspect words may wrongly attend to context words that are syntactically unrelated to them. In addition, the positional distance and the syntactic distance of a word reflect its position in the linear sentence and in the syntactic dependency tree, respectively, yet methods that process syntactic information with graph convolutional networks ignore these distance features, which lets irrelevant information far from the aspect words interfere with their sentiment analysis. To address these problems, a Multi-Interaction Graph Convolutional Network (MIGCN) is proposed. First, the positional distance features of the context words are fed into each layer of the graph convolutional network, while the adjacency matrix of the graph convolutional network is weighted by the syntactic distance of the context words in the dependency tree. Finally, semantic interaction and syntactic interaction are designed to process the semantic and syntactic information between words, respectively. Experimental results on public datasets show that the proposed model outperforms the baseline models in both accuracy and macro-F1.
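The syntactic weighting of the adjacency matrix mentioned in the abstract, listed step by step as Algorithm 1 in Table 1, can be illustrated with a minimal Python sketch. Several details here are assumptions and are not given in the source: the dependency tree is represented as a list of head indices (`heads`, with `-1` for the root), the aspect is a single token index, and `default_weight` is a hypothetical stand-in for the Weight(d) function, whose actual form is not specified in this section. Step (4) is read literally, so each node is linked to every node of its subtree.

```python
from collections import deque
from typing import Callable, List


def default_weight(d: int) -> float:
    # Hypothetical choice (not from the paper): weight decays with tree distance.
    return 1.0 / (d + 1)


def syntax_weighted_adjacency(
    heads: List[int],
    aspect: int,
    weight: Callable[[int], float] = default_weight,
) -> List[List[float]]:
    """Build an N x N adjacency matrix following the steps of Algorithm 1.

    heads[i] is the index of the head of token i, or -1 for the root
    (an assumed encoding of the dependency tree T).
    """
    n = len(heads)
    children: List[List[int]] = [[] for _ in range(n)]
    root = -1
    for tok, head in enumerate(heads):
        if head < 0:
            root = tok
        else:
            children[head].append(tok)

    # Syntactic distance d: path length in the (undirected) tree to the aspect.
    dist = [-1] * n
    dist[aspect] = 0
    queue = deque([aspect])
    while queue:
        u = queue.popleft()
        neighbours = list(children[u])
        if heads[u] >= 0:
            neighbours.append(heads[u])
        for v in neighbours:
            if dist[v] < 0:
                dist[v] = dist[u] + 1
                queue.append(v)

    # Step (1): all-zero matrix.
    adj = [[0.0] * n for _ in range(n)]

    # Steps (2)-(5): visit nodes from the root; link each node to itself
    # and to every node in the subtree rooted at it.
    order = deque([root])
    while order:
        i = order.popleft()
        adj[i][i] = 1.0
        stack = list(children[i])
        while stack:
            j = stack.pop()
            adj[i][j] = adj[j][i] = 1.0
            stack.extend(children[j])
        order.extend(children[i])

    # Steps (6)-(7): overwrite the aspect row and column with Weight(d).
    for i in range(n):
        w = weight(dist[i])
        adj[i][aspect] = adj[aspect][i] = w
    return adj


# Toy example with an assumed parse of "The food was great":
# "The" -> "food", "food" -> "was", "was" is the root, "great" -> "was".
adj = syntax_weighted_adjacency([1, 2, -1, 2], aspect=1)
```

The weighting is applied in a separate pass after the structural links are set, so the result does not depend on the traversal order. Swapping in another `weight` callable changes only the aspect row and column.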
Table 1 Syntax-weighted adjacency matrix algorithm (Algorithm 1)
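Input: $T$: dependency tree of the sentence; $a$: aspect word in the sentence; $N$: length of the sentence sequence;
Output: $\boldsymbol{A}$: syntax-weighted adjacency matrix;
(1) Initialize all elements of $\boldsymbol{A} \in \boldsymbol{R}^{N \times N}$ to 0;
(2) Starting from the root of the dependency tree $T$, traverse each node $i$:
(3)   Set the main-diagonal element $A_{ii} = 1$;
(4)   Traverse all nodes $j$ in the subtree rooted at node $i$;
(5)     Set $A_{ij} = 1$ and $A_{ji} = 1$;
(6)   Compute the syntactic distance $d$ between node $i$ and the aspect word $a$ in the sentence;
(7)   Set $A_{i a_i} = {\rm Weight}(d)$ and $A_{a_i i} = {\rm Weight}(d)$.

Table 2 Dataset statistics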
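| Dataset | Positive (train) | Positive (test) | Neutral (train) | Neutral (test) | Negative (train) | Negative (test) |
|---|---|---|---|---|---|---|
| Lap14 | 994 | 341 | 464 | 169 | 870 | 128 |
| Twitter | 1561 | 173 | 3127 | 346 | 1560 | 173 |
| Rest14 | 2164 | 728 | 637 | 196 | 807 | 196 |
| Rest15 | 912 | 326 | 36 | 34 | 256 | 182 |
| Rest16 | 1260 | 469 | 69 | 30 | 439 | 117 |

Table 3 Comparison of results across models (%)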
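| Category | Model | Twitter Acc. | Twitter Macro-F1 | Lap14 Acc. | Lap14 Macro-F1 | Rest14 Acc. | Rest14 Macro-F1 | Rest15 Acc. | Rest15 Macro-F1 | Rest16 Acc. | Rest16 Macro-F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Baselines | SVM | 63.40 | 63.30 | 70.49 | – | 80.16 | – | – | – | – | – |
| | LSTM[2] | 69.56 | 67.70 | 69.28 | 63.09 | 78.13 | 67.47 | 77.37 | 55.17 | 86.80 | 63.88 |
| Interaction models | IAN[3] | 72.50 | 70.81 | 72.05 | 67.38 | 79.26 | 70.09 | 78.54 | 52.65 | 84.74 | 55.21 |
| | MGAN[4] | 72.54 | 70.81 | 75.27 | 70.81 | 81.25 | 71.94 | – | – | – | – |
| | AOA[17] | 72.30 | 70.20 | 76.62 | 67.52 | 79.97 | 70.42 | 78.17 | 57.02 | 87.50 | 66.21 |
| | AEN-GloVe[10] | 72.83 | 69.81 | 73.51 | 69.04 | 80.98 | 72.14 | – | – | – | – |
| GCN models | ASGCN[6] | 72.15 | 70.40 | 75.55 | 71.05 | 80.77 | 72.02 | 79.89 | 61.89 | 88.99 | 67.48 |
| | TD-GAT[7] | 72.20 | 70.45 | 75.63 | 70.74 | 81.32 | 71.72 | 80.38 | 60.50 | 87.71 | 67.87 |
| | BiGCN[18] | 74.16 | 73.35 | 74.59 | 71.84 | 81.97 | 73.48 | 81.16 | 64.79 | 88.96 | 70.84 |
| | kumaGCN[19] | 72.45 | 70.77 | 76.12 | 72.42 | 81.43 | 73.64 | 80.69 | 65.99 | 89.39 | 73.19 |
| Proposed model | MIGCN | 73.31 | 72.12 | 76.59 | 72.44 | 82.32 | 74.31 | 80.81 | 64.21 | 89.50 | 71.97 |

Table 4 Ablation study results (%)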
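| Model | Twitter Acc. | Twitter Macro-F1 | Lap14 Acc. | Lap14 Macro-F1 | Rest14 Acc. | Rest14 Macro-F1 | Rest15 Acc. | Rest15 Macro-F1 | Rest16 Acc. | Rest16 Macro-F1 |
|---|---|---|---|---|---|---|---|---|---|---|
| MIGCN | 73.31 | 72.12 | 76.59 | 72.44 | 82.32 | 74.31 | 80.81 | 64.21 | 89.50 | 71.97 |
| w/o se | 71.34 | 69.54 | 74.97 | 70.89 | 80.21 | 71.85 | 78.60 | 59.01 | 88.20 | 69.20 |
| w/o sy | 72.45 | 70.64 | 75.34 | 71.03 | 81.55 | 73.29 | 79.40 | 62.22 | 89.02 | 66.72 |
| w/o we | 72.88 | 70.88 | 76.49 | 72.28 | 81.73 | 73.64 | 79.95 | 64.00 | 88.58 | 71.12 |
| w/o ga | 72.93 | 71.45 | 75.91 | 71.85 | 81.85 | 73.53 | 79.52 | 63.92 | 88.58 | 68.98 |
| w/o sy + ga | 72.98 | 71.40 | 75.08 | 70.84 | 81.19 | 72.74 | 78.17 | 58.24 | 88.26 | 68.21 |
| w/o se + ga | 72.16 | 70.33 | 74.71 | 70.54 | 79.82 | 71.09 | 79.46 | 61.55 | 88.80 | 67.97 |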
