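Character Modeling Based on Shared Structure Information for Segmentation-free Uyghur Document Recognition under Low Data Resource Conditions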

姜志威; 丁晓青; 彭良瑞; 刘长松

doi:10.11999/JEIT150019

cstr: 32379.14.JEIT150019

Funding:

National Natural Science Foundation of China (61032008) and the National Basic Research Program of China (973 Program, 2013CB329403)

Publication history
- Received: 2015-01-06
- Revised: 2016-03-25
- Published: 2015-09-19

Uyghur Character Models with Shared Structure Information for Segmentation-free Recognition under Low Data Resource Conditions

Abstract: Segmentation-free Uyghur document recognition effectively avoids character segmentation errors, but for new sample types with low data resources, the original models often fail to achieve high recognition performance. To address this, this paper proposes sharing the character structure information that remains relatively stable across commonly used Uyghur fonts, and using the Bootstrap method to make more efficient use of the available samples. Experiments on real book samples show that, using new-type samples amounting to only about 1/5 of the original training set, the method reaches an average character recognition accuracy of 95.05% on the test set. Compared with the commonly used Maximum A Posteriori (MAP) estimation method, it reduces the recognition error rate by a relative 55.76% to 63.84%. The method therefore effectively solves the Uyghur character modeling problem under low data resource conditions and achieves high-performance recognition of new sample types.

Keywords:
- Character recognition
- Hidden Markov Model (HMM)
- Statistical learning
- Uyghur
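
The relative error rate reduction quoted in the abstract is the drop in error rate divided by the baseline error rate, not a difference in accuracy points. A minimal Python sketch of that calculation follows; the function name and the example numbers are illustrative assumptions and are not taken from the paper.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the fraction of the baseline error rate removed by the new method."""
    if not 0.0 < baseline_error <= 1.0:
        raise ValueError("baseline_error must be in (0, 1]")
    if not 0.0 <= new_error <= 1.0:
        raise ValueError("new_error must be in [0, 1]")
    return (baseline_error - new_error) / baseline_error


# Hypothetical numbers, NOT results from the paper:
# a baseline at 90% accuracy (10% error) versus a method at 96% accuracy (4% error).
reduction = relative_error_reduction(baseline_error=0.10, new_error=0.04)
print(f"relative error reduction: {reduction:.2%}")
```

In this hypothetical case a 6-point accuracy gain corresponds to a 60% relative error reduction, which is why relative figures look much larger than the underlying accuracy differences.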
