A frame line detection and removal algorithm for form document recognition
Abstract: This paper proposes a frame line detection algorithm based on a structural image element, the Directional Single-Connected Chain (DSCC). By exploiting the global statistical properties of DSCC edges together with the local positional relations between DSCCs, the algorithm accurately extracts frame lines from scanned form images. It is robust to skew, to broken lines, and to character strokes inside the form cells that touch or overlap the frame lines. Building on this detector, the paper presents a frame line removal algorithm that separates overlapping characters from lines, so that frame lines are removed without damaging the strokes that touch them. The method has been successfully applied in a practical form recognition system.
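The abstract does not define the DSCC, so the following is only a minimal sketch under a stated assumption: a horizontal DSCC is taken to be a chain of vertical black runs in adjacent columns, where each run overlaps exactly one run in the next column and that run overlaps exactly one run back. The function names (`vertical_runs`, `horizontal_chains`) and the list-of-lists image format are hypothetical, and the sketch covers only chain construction, not the statistical edge analysis or the line removal step.

```python
def vertical_runs(img):
    """Per column, list the black runs as (top_row, bottom_row), inclusive."""
    h, w = len(img), len(img[0])
    cols = []
    for x in range(w):
        runs, y = [], 0
        while y < h:
            if img[y][x]:
                top = y
                while y < h and img[y][x]:
                    y += 1
                runs.append((top, y - 1))
            else:
                y += 1
        cols.append(runs)
    return cols


def overlaps(a, b):
    """True if two inclusive row intervals share at least one row."""
    return a[0] <= b[1] and b[0] <= a[1]


def horizontal_chains(img):
    """Group vertical runs into chains (assumed DSCC definition, see lead-in).

    Each chain is a list of (column, top_row, bottom_row) tuples.
    """
    cols = vertical_runs(img)
    w = len(cols)

    def neighbours(x_to, run):
        # Indices of runs in column x_to that overlap the given run.
        return [j for j, r in enumerate(cols[x_to]) if overlaps(run, r)]

    succ = {}         # (x, i) -> (x + 1, j) when the link is unique both ways
    has_pred = set()  # runs that are not the start of a chain
    for x in range(w - 1):
        for i, run in enumerate(cols[x]):
            right = neighbours(x + 1, run)
            if len(right) != 1:
                continue  # fork or dead end: the chain stops here
            j = right[0]
            if len(neighbours(x, cols[x + 1][j])) == 1:
                succ[(x, i)] = (x + 1, j)
                has_pred.add((x + 1, j))

    chains = []
    for x in range(w):
        for i in range(len(cols[x])):
            if (x, i) in has_pred:
                continue
            chain, node = [], (x, i)
            while node is not None:
                cx, ci = node
                chain.append((cx, *cols[cx][ci]))
                node = succ.get(node)
            chains.append(chain)
    return chains
```

Under this assumption, long chains are candidates for frame line segments, while short chains mostly come from character strokes. A real detector would still need the edge statistics and inter-chain relations mentioned in the abstract to merge broken segments and handle skew.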