Citation: ZHANG Jun, LAI Zhipeng, LI Xue, NING Gengxin, YANG Cui. Cross-domain Chinese Word Segmentation Based on New Word Discovery[J]. Journal of Electronics & Information Technology, 2022, 44(9): 3241-3248. doi: 10.11999/JEIT210675
[1] CHEN Ping, LIU Xiaoxia, and LI Yajun. Chinese word segmentation based on dictionary and statistics[J]. Computer Engineering and Applications, 2008, 44(10): 144–146. doi: 10.3778/j.issn.1002-8331.2008.10.042
[2] WU Andi and JIANG Zixin. Word segmentation in sentence analysis[C]. 1998 International Conference on Chinese Information Processing, Beijing, China, 1998: 169–180.
[3] ZHU Conghui, ZHAO Tiejun, and ZHENG Dequan. Joint Chinese word segmentation and POS tagging system with undirected graphical models[J]. Journal of Electronics & Information Technology, 2010, 32(3): 700–704. doi: 10.3724/SP.J.1146.2009.00214
[4] YUAN Zheng, LIU Yuanhao, YIN Qiuyang, et al. Unsupervised multi-granular Chinese word segmentation and term discovery via graph partition[J]. Journal of Biomedical Informatics, 2020, 110: 103542. doi: 10.1016/j.jbi.2020.103542
[5] DU Jinlian, MI Wei, and DU Xiaolin. Chinese word segmentation in electronic medical record text via graph neural network-bidirectional LSTM-CRF model[C]. 2020 IEEE International Conference on Bioinformatics and Biomedicine, Seoul, Korea, 2020: 985–989.
[6] WANG Qi, ZHOU Yangming, RUAN Tong, et al. Incorporating dictionaries into deep neural networks for the Chinese clinical named entity recognition[J]. Journal of Biomedical Informatics, 2019, 92: 103133. doi: 10.1016/j.jbi.2019.103133
[7] XU Jingjing, MA Shuming, ZHANG Yi, et al. Transfer deep learning for low-resource Chinese word segmentation with a novel neural network[C]. The 6th National CCF Conference on Natural Language Processing and Chinese Computing, Dalian, China, 2017: 721–730.
[8] BELLEGARDA J R. Statistical language model adaptation: Review and perspectives[J]. Speech Communication, 2004, 42(1): 93–108. doi: 10.1016/j.specom.2003.08.002
[9] LIU Weitong, LIU Peiyu, LIU Wenfeng, et al. New word discovery algorithm based on mutual information and branch entropy[J]. Application Research of Computers, 2019, 36(5): 1293–1296. doi: 10.19734/j.issn.1001-3695.2017.11.0745
[10] LUO Guiqiong, FEI Hongxiao, and DAI Yi. Research of Chinese segmentation based on converse segmentation dictionary[J]. Computer Technology and Development, 2008, 18(1): 80–83.
[11] YAO Yushi and HUANG Zheng. Bi-directional LSTM recurrent neural network for Chinese word segmentation[C]. The 23rd International Conference on Neural Information Processing, Kyoto, Japan, 2016: 345–353.
[12] LIU Liyuan, SHANG Jingbo, REN Xiang, et al. Empower sequence labeling with task-aware neural language model[C]. The Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, United States, 2018.
[13] KAN Zhigang, QIAO Linbo, YANG Sen, et al. Event arguments extraction via dilate gated convolutional neural network with enhanced local features[J]. IEEE Access, 2020, 8: 123483–123491. doi: 10.1109/ACCESS.2020.3004378
[14] MIKOLOV T, CHEN Kai, CORRADO G, et al. Efficient estimation of word representations in vector space[C]. The 1st International Conference on Learning Representations, Scottsdale, Arizona, 2013.
[15] KIM Y. Convolutional neural networks for sentence classification[C]. The 2014 Conference on Empirical Methods in Natural Language Processing, Doha, Qatar, 2014: 1746–1751.
[16] Beijing University, City University of Hong Kong, CKIP, et al. The second international Chinese word segmentation bakeoff data[EB/OL]. http://sighan.cs.uchicago.edu/bakeoff2005/, 2005.