Jin Jian-ming, Wang Hua, Ding Xiao-qing. Uyghur, Chinese and English Multilingual Document Recognition[J]. Journal of Electronics & Information Technology, 2006, 28(7): 1188-1191.
The Uyghur, Chinese and English scripts differ markedly in their characteristics. A Uyghur, Chinese and English multilingual document recognition system is implemented for the first time, based on the multilingual OCR system design principle, which combines multi-layer character language estimation with suitable adjustment. First, the language of each text block is estimated from the characteristics of the Uyghur, Chinese and English scripts. Language-oriented character segmentation algorithms are then applied to the text blocks, and the character recognition confidence is used to judge whether the character segmentation and the language estimate for a text block are correct. Experimental results show that the recognition accuracy on Uyghur, Chinese and English multilingual documents reaches 96.4% or higher.
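The estimate-then-verify loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `recognize_block`, `BlockResult` and `pipelines`, the use of mean per-character confidence as the acceptance test, and the 0.8 threshold are all assumptions made for illustration. Each entry in `pipelines` stands in for a language-oriented segmentation and recognition routine that returns (character, confidence) pairs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# (character, confidence) pairs from a language-specific recognizer (hypothetical).
Recognized = List[Tuple[str, float]]


@dataclass
class BlockResult:
    language: str
    text: str
    confidence: float


def mean_confidence(chars: Recognized) -> float:
    """Average per-character confidence; 0.0 for an empty result."""
    if not chars:
        return 0.0
    return sum(conf for _, conf in chars) / len(chars)


def recognize_block(
    block: object,
    ranked_languages: Sequence[str],
    pipelines: Dict[str, Callable[[object], Recognized]],
    threshold: float = 0.8,  # assumed value, not taken from the paper
) -> BlockResult:
    """Try languages in estimated order and accept the first confident result.

    If no language clears the threshold, return the most confident attempt.
    """
    if not ranked_languages:
        raise ValueError("at least one candidate language is required")
    best: Optional[BlockResult] = None
    for lang in ranked_languages:
        # Language-oriented segmentation and recognition for this candidate.
        chars = pipelines[lang](block)
        conf = mean_confidence(chars)
        result = BlockResult(lang, "".join(ch for ch, _ in chars), conf)
        if conf >= threshold:
            return result  # confidence confirms the language estimate
        if best is None or conf > best.confidence:
            best = result  # remember the strongest rejected hypothesis
    assert best is not None
    return best
```

In this sketch a low confidence is treated as evidence that either the language estimate or the segmentation was wrong, so the block is retried under the next candidate language, which corresponds to the adjustment step mentioned in the abstract.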