基于循环卷积多任务学习的多领域文本分类方法

谢金宝; 李嘉辉; 康守强; 王庆岩; 王玉静

doi:10.11999/JEIT200869

基于循环卷积多任务学习的多领域文本分类方法

doi: 10.11999/JEIT200869 cstr: 32379.14.JEIT200869

1.
广东科学技术职业学院机器人学院珠海 519090
2.
哈尔滨理工大学电气与电子工程学院哈尔滨 150000

基金项目: 基于工业互联网的协作式智能机器人产教融合创新应用平台(2020CJPT004)，黑龙江省自然科学基金(LH2019E058)，智能机器人湖北省重点实验室开放基金(HBIR202004)，黑龙江省普通高校基本科研业务费专项资金(LGYC2018JC027)

详细信息

作者简介:
谢金宝：男，1980年生，副教授，研究方向为自然语言处理、人工智能

李嘉辉：男，1995年生，硕士生，研究方向为自然语言处理、人工智能

康守强：男，1980年生，教授，研究方向为智能诊断、人工智能

王庆岩：男，1984年生，副教授，研究方向为智能诊断、人工智能、智能图像处理

王玉静：女，1983年生，副教授，研究方向为智能诊断、人工智能

通讯作者:
李嘉辉　maillijiahui@163.com

中图分类号: TP391.1
计量
- 文章访问数: 1564
- HTML全文浏览量: 966
- PDF下载量: 162
- 被引次数: 0
出版历程
- 收稿日期: 2020-10-09
- 修回日期: 2021-02-03
- 网络出版日期: 2021-03-01
- 刊出日期: 2021-08-10

A Multi-domain Text Classification Method Based on Recurrent Convolution Multi-task Learning

1.
School of Robotic, Guangdong Polytechnic of Science and Technology, Zhuhai 519090, China
2.
School of Electrical and Electronic Engineering, Harbin University of Science and Technology, Harbin 150000, China

Funds: The collaborative intelligent robot production and education integrates innovative application platform based on the industrial Internet (2020CJPT004), The Natural Science Foundation of Heilongjiang Province (LH2019E058), The open fund projects of Hubei Key Laboratory of Intelligent Robot (Wuhan Institute of Technology) (HBIR 202004), The Fundamental Research Fundation for Universities of Heilongjiang Province (LGYC2018JC027)

摘要

摘要: 文本分类任务中，不同领域的文本很多表达相似，具有相关性的特点，可以解决有标签训练数据不足的问题。采用多任务学习的方法联合学习能够将不同领域的文本利用起来，提升模型的训练准确率和速度。该文提出循环卷积多任务学习(MTL-RC)模型用于文本多分类，将多个任务的文本共同建模，分别利用多任务学习、循环神经网络(RNN)和卷积神经网络(CNN)模型的优势获取多领域文本间的相关性、文本长期依赖关系、提取文本的局部特征。基于多领域文本分类数据集进行丰富的实验，该文提出的循环卷积多任务学习模型(MTL-LC)不同领域的文本分类平均准确率达到90.1%，比单任务学习模型循环卷积单任务学习模型(STL-LC)提升了6.5%，与当前热门的多任务学习模型完全共享多任务学习模型(FS-MTL)、对抗多任务学习模型(ASP-MTL)、间接交流多任务学习框架(IC-MTL)相比分别提升了5.4%, 4%和2.8%。
- 多领域文本分类 /
- 多任务学习 /
- 循环神经网络 /
- 卷积神经网络
Abstract: In the text classification task, many texts in different domains are similarly expressed and have the characteristics of correlation, which can solve the problem of insufficient training data with labels. The text of different fields can be combined with the multi-task learning method, and the training accuracy and speed of the model can be improved. A Recurrent Convolution Multi-Task Learning (MTL-RC) model for text multi-classification is proposed, jointly modeling the text of multiple tasks, and taking advantage of multi-task learning, Recurrent Neural Network(RNN) and Convolutional Neural Network(CNN) models to obtain the correlation between multi-domain texts, long-term dependence of text. Local features of text are extracted. Rich experiments are carried out based on multi-domain text classification datasets, the Recurrent Convolution Multi-Task Learning(MTL-LC) proposed in this paper has an average accuracy of 90.1% for text classification in different fields, which is 6.5% higher than the single-task learning model STL-LC. Compared with mainstream multi-tasking learning models Full Shared Multi-Task Learning(FS-MTL), Adversarial Multi-Task Learninng(ASP-MTL), and Indirect Communciation for Multi-Task Learning(IC-MTL) have increased by 5.4%, 4%, and 2.8%, respectively.
- Multi-domain text classification /
- Multi-task learning /
- Recurrent Neural Netword(RNN) /
- Convolutional Neural Network(CNN)

HTML全文

图 1 MTL-RC多任务学习模型

下载: 全尺寸图片幻灯片

图 2 共享LSTM层

下载: 全尺寸图片幻灯片

图 3 共享LSTM和CNN层

下载: 全尺寸图片幻灯片

图 4 MTL-RC与STL-LC模型每个领域分类准确率的对比

下载: 全尺寸图片幻灯片

图 5 MTL-RC与MTL-LSTM模型每个领域分类准确率的对比

下载: 全尺寸图片幻灯片

图 6 不同领域数量下模型的准确率

下载: 全尺寸图片幻灯片

表 1 参数设置

超参数	取值	选择
隐藏层状态维数	50/100/128	100
卷积核大小	1/2/3/4/5	1/2/3
过滤器个数	50/64/100/128/256	100
dropout	0.3/0.4/0.5/0.6/0.7/0.8	0.7
训练次数	10/20/30/40/50	40
批次	8/16/32	16
学习率	0.1/0.01/0.001/0.0005	0.0005

下载: 导出CSV

表 2 与其它模型准确率对比(%)

任务	LSTM	CNN	MTL-DNN	MTL-CNN	FS-MTL	ASP-MTL	IC-MTL	MTL-RC
books	77.8	79.8	82.2	84.5	82.5	84.0	86.2	89.0
electronics	79.8	80.3	81.7	83.2	85.7	86.8	88.5	92.8
DVD	78.0	77.8	84.2	84.0	83.5	85.5	88.0	89.5
kitchen	81.8	78.5	80.7	83.2	86.0	86.2	88.2	91.5
apparel	82.8	82.0	85.0	83.7	84.5	87.0	87.5	89.8
camera	82.5	83.0	86.2	86.0	86.5	89.2	89.0	93.0
health	83.3	84.3	85.7	87.2	88.0	88.2	89.5	92.0
music	77.0	77.8	84.7	83.7	81.2	82.5	85.7	88.0
toys	82.8	80.5	87.7	89.2	84.5	88.0	89.2	91.3
video	85.5	81.8	85.0	81.5	83.7	84.5	86.0	91.0
baby	84.8	81.0	88.0	87.7	88.0	88.2	88.7	91.8
magazines	90.5	86.5	89.5	87.7	92.5	92.2	92.2	93.3
software	84.0	81.5	85.7	86.5	86.2	87.2	87.2	93.3
sports	80.8	81.8	83.2	84.0	85.5	85.7	76.7	91.5
IMDB	76.3	77.5	83.2	86.2	82.5	85.5	86.5	89.0
MR	70.8	67.0	75.5	74.5	74.7	76.7	78.0	75.3
平均数	81.2	80.1	84.3	84.5	84.7	86.1	87.3	90.1

下载: 导出CSV

表 3 MTL-LC与STL-LC模型准确率与时间比较

方法	STL-LC	MTL-RC
平均准确率(%)	83.6	90.1
平均1次训练的时间(s)	483.4	270.3

下载: 导出CSV

表 4 MTL-RC模型使用不同卷积核的准确率对比

卷积核	1	2	3	4	5	(3, 4, 5)	(2, 3, 4)	(1, 2, 3)
准确率(%)	88.6	89.5	89.2	89.2	89.4	89.3	89.5	90.1

下载: 导出CSV

参考文献(18)

[1]	谢金宝, 侯永进, 康守强, 等. 基于语义理解注意力神经网络的多元特征融合中文文本分类[J]. 电子与信息学报, 2018, 40(5): 1258–1265. doi: 10.11999/JEIT170815 XIE Jinbao, HOU Yongjin, KANG Shouqiang, et al. Multi-feature fusion based on semantic understanding attention neural network for Chinese text categorization[J]. Journal of Electronics &Information Technology, 2018, 40(5): 1258–1265. doi: 10.11999/JEIT170815
[2]	KIM Y. Convolutional neural networks for sentence classification[C]. 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 2014: 1746–1751. doi: 10.3115/v1/D14-1181.
[3]	BLITZER J, DREDZE M, and PEREIRA F. Biographies, Bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification[C]. The 45th Annual Meeting of the Association of Computational Linguistics (ACL), Prague, Czech Republic, 2007: 440–447.
[4]	CARUANA R. Multitask learning[J]. Machine Learning, 1997, 28(1): 41–75. doi: 10.1023/A:1007379606734
[5]	LIU Xiaodong, GAO Jianfeng, HE Xiaodong, et al. Representation learning using multi-task deep neural networks for semantic classification and information retrieval[C]. The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL), Denver, USA, 2015: 912–921.
[6]	LIU Pengfei, QIU Xipeng, and HUANG Xuanjing. Recurrent neural network for text classification with multi-task learning[C]. The Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI), New York, USA, 2016: 2873–2879.
[7]	LIU Pengfei, QIU Xipeng, and HUANG Xuanjing. Adversarial multi-task learning for text classification[C]. The 55th Annual Meeting of the Association for Computational Linguistics (ACL), Vancouver, Canada, 2017: 1–10.
[8]	HOCHREITER S and SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735–1780. doi: 10.1162/neco.1997.9.8.1735
[9]	王鑫, 李可, 宁晨, 等. 基于深度卷积神经网络和多核学习的遥感图像分类方法[J]. 电子与信息学报, 2019, 41(5): 1098–1105. doi: 10.11999/JEIT180628 WANG Xin, LI Ke, NING Chen, et al. Remote sensing image classification method based on deep convolution neural network and multi-kernel learning[J]. Journal of Electronics &Information Technology, 2019, 41(5): 1098–1105. doi: 10.11999/JEIT180628
[10]	KALCHBRENNER N, GREFENSTETTE E, and BLUNSOM P. A convolutional neural network for modelling sentences[C]. The 52nd Annual Meeting of the Association for Computational Linguistics (ACL), Baltimore, USA, 2014: 655–665.
[11]	刘宗林, 张梅山, 甄冉冉, 等. 融入罪名关键词的法律判决预测多任务学习模型[J]. 清华大学学报: 自然科学版, 2019, 59(7): 497–504. doi: 10.16511/j.cnki.qhdxxb.2019.21.020 LIU Zonglin, ZHANG Meishan, ZHEN Ranran, et al. Multi-task learning model for legal judgment predictions with charge keywords[J]. Journal of Tsinghua University:Science and Technology, 2019, 59(7): 497–504. doi: 10.16511/j.cnki.qhdxxb.2019.21.020
[12]	COLLOBERT R and WESTON J. A unified architecture for natural language processing: Deep neural networks with multitask learning[C]. The 25th International Conference on Machine Learning (ICML), Helsinki, Finland, 2008: 160–167.
[13]	LI Shoushan, HUANG Churen, and ZONG Chengqing. Multi-domain sentiment classification with classifier combination[J]. Journal of Computer Science and Technology, 2011, 26(1): 25–33. doi: 10.1007/s11390-011-9412-y
[14]	LIU Pengfei, QIU Xipeng, and HUANG Xuanjing. Deep multi-task learning with shared memory for text classification[C]. 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP), Austin, USA, 2016: 118–127.
[15]	LIU Pengfei, FU Jie, DONG Yue, et al. Learning multi-task communication with message passing for sequence learning[C]. AAAI Conference on Artificial Intelligence (AAAI), Palo Alto, USA, 2019: 4360–4367.
[16]	YUAN Zhigang, WU Sixing, WU Fangzhao, et al. Domain attention model for multi-domain sentiment classification[J]. Knowledge-Based Systems, 2018, 155: 1–10. doi: 10.1016/j.knosys.2018.05.004
[17]	MAAS A L, DALY R E, PHAM P T, et al. Learning word vectors for sentiment analysis[C]. The 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL), Portland, USA, 2011: 142–150.
[18]	PANG Bo and LEE L. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales[C]. The 43rd Annual Meeting on Association for Computational Linguistics (ACL), Ann Arbor, USA, 2005: 115–124.