Applying Title Category Semantic Recognition for Text Categorization

Wang Qiang; Guan Yi; Wang Xiao-long

doi:10.3724/SP.J.1146.2006.00507

Volume 29 Issue 12

Jan. 2011

Wang Qiang, Guan Yi, Wang Xiao-long. Applying Title Category Semantic Recognition for Text Categorization[J]. Journal of Electronics & Information Technology, 2007, 29(12): 2885-2890. doi: 10.3724/SP.J.1146.2006.00507

Received Date: 2006-04-17
Revision Received Date: 2006-09-26
Publish Date: 2007-12-19

Abstract

This paper presents a new text categorization algorithm based on recognizing the category semantics of a document's title. The algorithm builds a feature space for each category, extracts category semantic words from the title to produce candidate categories, and then classifies the document among those candidates only. Experimental results show that the new algorithm achieves higher precision with fewer candidate categories. A further analysis introduces a measure called category space representation efficiency to verify the validity of the algorithm, and shows that the algorithm substantially improves text representation.
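The three steps named in the abstract (per-category feature space, candidate categories from title words, classification restricted to the candidates) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how category semantic words are selected or how documents are scored, so the rule used here (a word counts as a category semantic word if it occurs in exactly one category's training text), the relative-frequency score, the function names `train`, `candidate_categories` and `classify`, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy training set: (tokens, category) pairs.
TRAIN = [
    (["stock", "market", "shares", "bank"], "finance"),
    (["bank", "loan", "interest", "market"], "finance"),
    (["match", "goal", "team", "league"], "sports"),
    (["team", "coach", "season", "goal"], "sports"),
    (["virus", "vaccine", "hospital", "team"], "health"),
]

def train(docs):
    """Build per-category word counts and a word -> category map."""
    counts = defaultdict(Counter)
    for tokens, category in docs:
        counts[category].update(tokens)
    owners = defaultdict(set)
    for category, counter in counts.items():
        for word in counter:
            owners[word].add(category)
    # Assumption: a "category semantic word" occurs in exactly one category.
    semantic = {w: next(iter(cs)) for w, cs in owners.items() if len(cs) == 1}
    return dict(counts), semantic

def candidate_categories(title, counts, semantic):
    """Categories signalled by title words; fall back to all categories."""
    hits = {semantic[w] for w in title if w in semantic}
    return hits or set(counts)

def classify(title, body, counts, semantic):
    """Pick the best-scoring category among the title's candidates only."""
    candidates = candidate_categories(title, counts, semantic)

    def score(category):
        # Assumed score: share of the category's training tokens matched.
        counter = counts[category]
        return sum(counter[w] for w in title + body) / sum(counter.values())

    # Sorting makes tie-breaking deterministic.
    return max(sorted(candidates), key=score)

counts, semantic = train(TRAIN)
print(candidate_categories(["bank", "raises", "interest"], counts, semantic))
print(classify(["league", "final"], ["the", "team", "scored", "a", "goal"],
               counts, semantic))
```

In this sketch, a title containing a semantic word narrows the search to one or a few categories before any scoring happens, which is the sense in which the abstract speaks of classifying with fewer candidates. A title with no semantic words falls back to scoring every category.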

