Fractals based multi-oriented text detection system for recognition in mobile video images

Shivakumara, P; Wu, L; Lu, T; Tan, CL; Blumenstein, M; Anami, BS

Publication Type: Journal Article
Citation: Pattern Recognition, 2017, 68, pp. 158-174
Issue Date: 2017-08-01

Closed Access

Filename	Description	Size	Format
Fractals based multi-oriented text detection system for recognition in mobile video images.pdf	Published Version	4.36 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Shivakumara, P	en_US
dc.contributor.author	Wu, L	en_US
dc.contributor.author	Lu, T	en_US
dc.contributor.author	Tan, CL	en_US
dc.contributor.author	Blumenstein, M https://orcid.org/0000-0002-9908-3744	en_US
dc.contributor.author	Anami, BS	en_US
dc.date.issued	2017-08-01	en_US
dc.identifier.citation	Pattern Recognition, 2017, 68 pp. 158 - 174	en_US
dc.identifier.issn	0031-3203	en_US
dc.identifier.uri	http://hdl.handle.net/10453/114756
dc.description.abstract	© 2017 Elsevier Ltd. Text detection in mobile video is challenging due to poor quality, complex background, arbitrary orientation and text movement. In this work, we introduce fractals for text detection in video captured by mobile cameras. We first use fractal properties such as self-similarity in a novel way in the gradient domain for enhancing low-resolution mobile video. We then propose to use k-means clustering for separating text components from non-text ones. To make the method independent of font size, fractal expansion is further explored in the wavelet domain in a pyramid structure for text components in the text cluster to identify text candidates. Next, potential text candidates are obtained by studying the optical flow property of text candidates. Direction-guided boundary growing is finally proposed to extract multi-oriented texts. The method is tested on different datasets, which include low-resolution video captured by mobile, benchmark ICDAR 2013 video, YouTube Video Text (YVT) data, ICDAR 2013, Microsoft, and MSRA arbitrary orientation natural scene datasets, to evaluate the performance of the proposed method in terms of recall, precision, F-measure and misdetection rate. To show the effectiveness of the proposed method, the results are compared with state-of-the-art methods.	en_US
dc.relation.ispartof	Pattern Recognition	en_US
dc.relation.isbasedon	10.1016/j.patcog.2017.03.018	en_US
dc.subject.classification	Artificial Intelligence & Image Processing	en_US
dc.title	Fractals based multi-oriented text detection system for recognition in mobile video images	en_US
dc.type	Journal Article
utslib.citation.volume	68	en_US
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
utslib.for	0806 Information Systems	en_US
utslib.for	0906 Electrical and Electronic Engineering	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
pubs.organisational-group	/University of Technology Sydney/Strength - QSI - Centre for Quantum Software and Information
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US
pubs.volume	68	en_US

Abstract:

© 2017 Elsevier Ltd. Text detection in mobile video is challenging due to poor quality, complex background, arbitrary orientation and text movement. In this work, we introduce fractals for text detection in video captured by mobile cameras. We first use fractal properties such as self-similarity in a novel way in the gradient domain for enhancing low-resolution mobile video. We then propose to use k-means clustering for separating text components from non-text ones. To make the method independent of font size, fractal expansion is further explored in the wavelet domain in a pyramid structure for text components in the text cluster to identify text candidates. Next, potential text candidates are obtained by studying the optical flow property of text candidates. Direction-guided boundary growing is finally proposed to extract multi-oriented texts. The method is tested on different datasets, which include low-resolution video captured by mobile, benchmark ICDAR 2013 video, YouTube Video Text (YVT) data, ICDAR 2013, Microsoft, and MSRA arbitrary orientation natural scene datasets, to evaluate the performance of the proposed method in terms of recall, precision, F-measure and misdetection rate. To show the effectiveness of the proposed method, the results are compared with state-of-the-art methods.
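
The abstract names recall, precision and F-measure as evaluation measures but does not define them in this record. The following is a minimal sketch of the standard definitions, assuming detections have already been matched against ground truth. The matching criterion and the `DetectionCounts` type are hypothetical and not taken from the paper, and the misdetection rate is left out because this record does not define it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionCounts:
    """Hypothetical container for matched detection counts (not from the paper)."""
    true_positives: int   # detected text blocks that match ground truth
    false_positives: int  # detected blocks with no ground-truth match
    false_negatives: int  # ground-truth text blocks that were missed


def recall(c: DetectionCounts) -> float:
    """Fraction of ground-truth text blocks that were detected."""
    total = c.true_positives + c.false_negatives
    return c.true_positives / total if total else 0.0


def precision(c: DetectionCounts) -> float:
    """Fraction of detected blocks that are actual text."""
    total = c.true_positives + c.false_positives
    return c.true_positives / total if total else 0.0


def f_measure(c: DetectionCounts) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the paper.
    counts = DetectionCounts(true_positives=6, false_positives=2, false_negatives=4)
    print(f"recall={recall(counts):.3f}")
    print(f"precision={precision(counts):.3f}")
    print(f"f_measure={f_measure(counts):.3f}")
```

Returning 0.0 when a denominator is zero is a convention chosen for this sketch so that empty frames do not raise an error; an evaluation protocol may handle that case differently.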

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/114756