Multilingual Abstract

Optical character recognition (OCR), which converts text images into data that can be stored and used, remains an active area of research. However, OCR alone is limited in classifying and recognizing attribute values such as address, name, and phone number when they appear in semi-structured documents, such as a handwritten parcel delivery invoice (PDI). In this study, we therefore propose a handwritten parcel delivery invoice understanding (HPDIU) model for automated parcel delivery reception. The proposed HPDIU model consists of two steps: region detection of the parcel delivery invoice (RD-PDI) and information extraction from the PDI (IE-PDI). The first step, RD-PDI, minimizes resolution adjustment by detecting only the necessary area of the image, which contains the sender and recipient information. The second step, IE-PDI, is an end-to-end framework that does not rely on OCR; it uses a document understanding transformer to integrate character detection, recognition, and understanding.
In other words, the proposed model addresses the limitations of OCR because it integrates classification and recognition by attribute. To validate the proposed model, we evaluated character-level accuracy for the address, name, and phone number attributes on a dataset of 500 handwritten PDIs. The model achieved an overall average character-level accuracy of 91.67%, which supports the validity of the proposed HPDIU model.
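The abstract reports accuracy in character units for each attribute but does not define the metric. The sketch below shows one plausible reading, under the assumption that character accuracy is 1 minus the edit distance normalized by the reference length, averaged per attribute and then across attributes. The function names and the toy records are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def char_accuracy(reference: str, predicted: str) -> float:
    """Assumed metric: 1 - normalized edit distance, clipped at 0."""
    if not reference:
        return 1.0 if not predicted else 0.0
    return max(0.0, 1.0 - edit_distance(reference, predicted) / len(reference))


def accuracy_by_attribute(samples):
    """samples: iterable of (reference_dict, predicted_dict) pairs.

    Returns (per-attribute mean accuracy, mean over attributes).
    """
    scores = defaultdict(list)
    for ref, pred in samples:
        for attr, ref_text in ref.items():
            # A missing predicted attribute is scored against an empty string.
            scores[attr].append(char_accuracy(ref_text, pred.get(attr, "")))
    if not scores:
        raise ValueError("no samples to evaluate")
    per_attr = {attr: sum(v) / len(v) for attr, v in scores.items()}
    overall = sum(per_attr.values()) / len(per_attr)
    return per_attr, overall


# Hypothetical toy record: one wrong digit in the phone number.
toy_samples = [
    ({"name": "Kim", "phone": "0101234567"},
     {"name": "Kim", "phone": "0101234568"}),
]
per_attr, overall = accuracy_by_attribute(toy_samples)
print(per_attr, overall)
```

Averaging over attributes rather than over all characters gives each attribute equal weight regardless of its length; whether the reported 91.67% was computed this way is not stated in the abstract.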
