A Connected Component-Based Deep Learning Model for Multi-type Struck-Out Component Classification

Shivakumara, P; Jain, T; Surana, N; Pal, U; Lu, T; Blumenstein, M; Chanda, S

Publisher: Springer
Publication Type: Conference Proceeding
Citation: Document Analysis and Recognition – ICDAR 2021 Workshops, 2021, 12917 LNCS, pp. 158-173
Issue Date: 2021-01-01

Full metadata record

Field	Value	Language
dc.contributor.author	Shivakumara, P
dc.contributor.author	Jain, T
dc.contributor.author	Surana, N
dc.contributor.author	Pal, U
dc.contributor.author	Lu, T
dc.contributor.author	Blumenstein, M https://orcid.org/0000-0002-9908-3744
dc.contributor.author	Chanda, S
dc.date	2021-09-05
dc.date.accessioned	2022-06-27T08:05:20Z
dc.date.available	2022-06-27T08:05:20Z
dc.date.issued	2021-01-01
dc.identifier.citation	Document Analysis and Recognition – ICDAR 2021 Workshops, 2021, 12917 LNCS, pp. 158-173
dc.identifier.isbn	9783030861582
dc.identifier.issn	0302-9743
dc.identifier.issn	1611-3349
dc.identifier.uri	http://hdl.handle.net/10453/158404
dc.description.abstract	Struck-out handwritten words in document images degrade the performance of methods for several important applications, such as handwriting recognition, writer identification, gender identification, fraudulent document identification, document age estimation, writer age estimation, analysis of normal/abnormal behavior of a person, and descriptive answer evaluation. This work proposes a new method that combines connected component analysis for text component detection with deep learning for classifying struck-out and non-struck-out words. For text component detection, the proposed method uses stroke width to detect the edges of text in images and then performs smoothing operations to remove noise. Morphological operations are then performed on the smoothed images to label connected components as text by fixing bounding boxes. Inspired by the success of deep learning models, we explore DenseNet for classifying struck-out and non-struck-out handwritten components, taking the text components as input. Experimental results on our dataset demonstrate that the proposed method outperforms existing methods in terms of classification rate.
dc.language	en
dc.publisher	Springer
dc.relation.ispartof	Document Analysis and Recognition – ICDAR 2021 Workshops
dc.relation.ispartof	International Conference on Document Analysis and Recognition Workshops
dc.relation.ispartofseries	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.relation.isbasedon	10.1007/978-3-030-86159-9_11
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject.classification	Artificial Intelligence & Image Processing
dc.title	A Connected Component-Based Deep Learning Model for Multi-type Struck-Out Component Classification
dc.type	Conference Proceeding
utslib.citation.volume	12917 LNCS
utslib.location.activity	Lausanne, Switzerland
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
pubs.organisational-group	/University of Technology Sydney/Strength - QSI - Centre for Quantum Software and Information
utslib.copyright.status	closed_access	*
pubs.consider-herdc	false
dc.date.updated	2022-06-27T08:05:18Z
pubs.finish-date	2021-09-10
pubs.place-of-publication	Switzerland
pubs.publication-status	Published
pubs.start-date	2021-09-05
pubs.volume	12917 LNCS
dc.location	Switzerland

Abstract:

Struck-out handwritten words in document images degrade the performance of methods for several important applications, such as handwriting recognition, writer identification, gender identification, fraudulent document identification, document age estimation, writer age estimation, analysis of normal/abnormal behavior of a person, and descriptive answer evaluation. This work proposes a new method that combines connected component analysis for text component detection with deep learning for classifying struck-out and non-struck-out words. For text component detection, the proposed method uses stroke width to detect the edges of text in images and then performs smoothing operations to remove noise. Morphological operations are then performed on the smoothed images to label connected components as text by fixing bounding boxes. Inspired by the success of deep learning models, we explore DenseNet for classifying struck-out and non-struck-out handwritten components, taking the text components as input. Experimental results on our dataset demonstrate that the proposed method outperforms existing methods in terms of classification rate.
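
The morphology and connected-component step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the stroke-width edge detection, the smoothing step, and the DenseNet classifier are omitted, and the function names (`dilate`, `component_boxes`, `text_component_boxes`), the square structuring element, and the dilation radius are all assumptions chosen for illustration. The sketch only shows how dilating a binary image can merge nearby strokes so that connected component labeling yields one bounding box per text component.

```python
from collections import deque


def dilate(grid, radius=1):
    """Binary dilation with a (2*radius+1) x (2*radius+1) square element (assumed shape)."""
    h, w = len(grid), len(grid[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if grid[y][x]:
                for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[ny][nx] = 1
    return out


def component_boxes(grid):
    """Bounding boxes (top, left, bottom, right), inclusive, of 8-connected components."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not grid[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if grid[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def text_component_boxes(binary_image, radius=1):
    """Dilate first so nearby strokes merge, then box each connected component."""
    return component_boxes(dilate(binary_image, radius))


# Toy binary image: two strokes close together and one stroke farther away.
image = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

print(component_boxes(image))       # three separate strokes
print(text_component_boxes(image))  # the two nearby strokes merge into one box
```

In a full pipeline along the lines the abstract describes, each bounding box would be cropped from the image and passed to a classifier that labels it as struck-out or non-struck-out; that stage is outside the scope of this sketch.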

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/158404