A Connected Component-Based Deep Learning Model for Multi-type Struck-Out Component Classification
- Publisher: Springer
- Publication Type: Conference Proceeding
- Citation: Document Analysis and Recognition – ICDAR 2021 Workshops, 2021, 12917 LNCS, pp. 158-173
- Issue Date: 2021-01-01
Closed Access
Filename | Description | Size
---|---|---
Shivakumara2021_Chapter_AConnectedComponent-BasedDeepL.pdf | Published version | 3.15 MB
This item is closed access and not available.
Struck-out handwritten words in document images degrade the performance of methods for several important applications, such as handwriting recognition, writer identification, gender identification, fraudulent document identification, document age estimation, writer age estimation, analysis of normal/abnormal behavior of a person, and descriptive answer evaluation. This work proposes a new method that combines connected component analysis for text component detection with deep learning for classifying struck-out and non-struck-out words. For text component detection, the proposed method uses stroke width to detect the edges of text in images and then performs smoothing operations to remove noise. Morphological operations are then applied to the smoothed images, and connected components are labeled as text by fixing bounding boxes around them. Inspired by the success of deep learning models, we explore DenseNet for classifying struck-out and non-struck-out handwritten components, taking the text components as input. Experimental results on our dataset show that the proposed method outperforms existing methods in terms of classification rate.
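The component-detection stage described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes an already binarized image given as a nested list, replaces the morphological step with a plain square dilation, and omits the stroke-width edge detection, the smoothing, and the DenseNet classifier entirely. The function names `dilate` and `component_boxes` and the toy image are hypothetical and chosen only for illustration.

```python
from collections import deque


def dilate(grid, radius=1):
    """Binary dilation with a square (2*radius + 1) structuring element."""
    h, w = len(grid), len(grid[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if grid[y][x]:
                # Switch on every pixel inside the square window around (y, x).
                for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[ny][nx] = 1
    return out


def component_boxes(grid):
    """Return inclusive (top, left, bottom, right) boxes of 8-connected components."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not grid[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if grid[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy binary "image": two strokes separated by background columns.
image = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]

boxes = component_boxes(image)                      # one box per stroke
merged = component_boxes(dilate(image, radius=2))   # strokes merged into one component
```

In this toy case the dilation radius controls whether nearby strokes are grouped into a single component. In a full pipeline, each resulting box would be cropped from the image and passed to a classifier; the choice of radius here is arbitrary and not taken from the paper.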