Volume 29 Issue 9
Jan.  2011
Cao Hai-long, Zhao Tie-jun, Li Sheng. Parsing Chinese Based on Lexicalized Model[J]. Journal of Electronics & Information Technology, 2007, 29(9): 2082-2085. doi: 10.3724/SP.J.1146.2006.00119

Parsing Chinese Based on Lexicalized Model

doi: 10.3724/SP.J.1146.2006.00119
  • Received Date: 2006-01-23
  • Rev Recd Date: 2006-07-20
  • Publish Date: 2007-09-19
  • To process large-scale real-world text, a method for building a Chinese parser based on a lexicalized model is proposed. First, a unified approach to word segmentation and part-of-speech (POS) tagging is proposed, based on the hidden Markov model (HMM). The approach preserves the simplicity and efficiency of the HMM while improving tagging accuracy. A head-driven model is then used to recognize phrases. The head-driven model is a well-known English parsing model; we combine it with the segmentation and POS tagging model to build a Chinese parser that operates at the character level. Evaluated on the standard test set, the parser achieves 77.57% precision and 74.96% recall, significantly outperforming the only previous comparable work.
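The abstract gives no details of the unified segmentation and tagging model beyond its HMM basis. One common way to cast both tasks as a single HMM is to label every character with a word-boundary flag plus a POS tag and decode with the Viterbi algorithm. The Python sketch below illustrates that formulation under stated assumptions; it is not the authors' model. The state set, the `B-`/`I-` labeling scheme, the function names, and all probabilities are hypothetical toy values chosen for illustration.

```python
import math

# Hypothetical toy model (not from the paper). Each hidden state combines a
# word-boundary flag with a POS tag: "B-" begins a word, "I-" continues it.
STATES = ["B-NN", "I-NN", "B-VV", "I-VV"]
FLOOR = 1e-6  # assumed smoothing probability for unseen characters

START = {"B-NN": 0.6, "B-VV": 0.4}  # a sentence can only begin a word

# Missing entries are impossible transitions (e.g. B-NN -> I-VV).
TRANS = {
    "B-NN": {"I-NN": 0.6, "B-NN": 0.1, "B-VV": 0.3},
    "I-NN": {"I-NN": 0.1, "B-NN": 0.3, "B-VV": 0.6},
    "B-VV": {"I-VV": 0.5, "B-NN": 0.4, "B-VV": 0.1},
    "I-VV": {"I-VV": 0.1, "B-NN": 0.7, "B-VV": 0.2},
}

EMIT = {
    "B-NN": {"老": 0.45, "中": 0.45},
    "I-NN": {"师": 0.45, "文": 0.45},
    "B-VV": {"学": 0.9},
    "I-VV": {"习": 0.9},
}


def logp(table, key, floor=0.0):
    """Log-probability lookup; zero probability maps to -inf."""
    p = table.get(key, floor)
    return math.log(p) if p > 0 else float("-inf")


def viterbi(chars):
    """Return the most probable state sequence for a list of characters."""
    if not chars:
        return []
    # Each column maps state -> (best log-probability, backpointer).
    cols = [{s: (logp(START, s) + logp(EMIT[s], chars[0], FLOOR), None)
             for s in STATES}]
    for ch in chars[1:]:
        prev = cols[-1]
        col = {}
        for s in STATES:
            score, arg = max((prev[r][0] + logp(TRANS[r], s), r) for r in STATES)
            col[s] = (score + logp(EMIT[s], ch, FLOOR), arg)
        cols.append(col)
    # Follow backpointers from the best final state.
    state = max(STATES, key=lambda s: cols[-1][s][0])
    path = [state]
    for col in reversed(cols[1:]):
        state = col[state][1]
        path.append(state)
    return path[::-1]


def to_words(chars, path):
    """Merge per-character labels into (word, POS) pairs."""
    words = []
    for ch, state in zip(chars, path):
        flag, pos = state.split("-")
        if flag == "B" or not words:
            words.append([ch, pos])
        else:
            words[-1][0] += ch
    return [(w, p) for w, p in words]


if __name__ == "__main__":
    sentence = list("老师学习中文")
    print(to_words(sentence, viterbi(sentence)))
    # [('老师', 'NN'), ('学习', 'VV'), ('中文', 'NN')]
```

Because word boundaries and tags share one state sequence, a single Viterbi pass yields both the segmentation and the POS tags. In a full pipeline of the kind the abstract describes, the resulting tagged words would be the input to the head-driven phrase model.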
