A new method for arbitrarily-oriented text detection in video

Sharma, N; Shivakumara, P; Pal, U; Blumenstein, M; Tan, CL

Publication Type: Conference Proceeding
Citation: Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012, 2012, pp. 74-78
Issue Date: 2012-05-24

Closed Access

Filename	Description	Size	Format
2013007801OK.pdf	Published version	449.01 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Sharma, N https://orcid.org/0000-0003-0841-1245	en_US
dc.contributor.author	Shivakumara, P	en_US
dc.contributor.author	Pal, U	en_US
dc.contributor.author	Blumenstein, M https://orcid.org/0000-0002-9908-3744	en_US
dc.contributor.author	Tan, CL	en_US
dc.date.issued	2012-05-24	en_US
dc.identifier.citation	Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012, 2012, pp. 74 - 78	en_US
dc.identifier.isbn	9780769546612	en_US
dc.identifier.uri	http://hdl.handle.net/10453/121005
dc.relation.ispartof	Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012	en_US
dc.relation.isbasedon	10.1109/DAS.2012.6	en_US
dc.title	A new method for arbitrarily-oriented text detection in video	en_US
dc.type	Conference Proceeding
utslib.for	0803 Computer Software	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
pubs.organisational-group	/University of Technology Sydney/Strength - QSI - Centre for Quantum Software and Information
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US

Abstract:

Text detection in video frames plays a vital role in enhancing the performance of information extraction systems, because the text in video frames helps in indexing and retrieving video efficiently and accurately. This paper presents a new method for arbitrarily-oriented text detection in video, based on dominant text pixel selection, text representatives and region growing. The method uses the gradient direction and magnitude at the Sobel edge pixels of the input frame to obtain dominant text pixels. Edge components in the Sobel edge map that correspond to dominant text pixels are then extracted; we call these text representatives. We eliminate broken segments of each text representative to obtain candidate text representatives. The perimeter of each candidate text representative is then grown along the text direction in the Sobel edge map to group neighboring text components, which we call word patches. The word patches are used to find the direction of text lines, and are then expanded in that direction in the Sobel edge map to group neighboring word patches and to restore missing text information. This results in the extraction of arbitrarily-oriented text from the video frame. To evaluate the method, we considered arbitrarily-oriented data, non-horizontal data, horizontal data, Hua's data and ICDAR-2003 competition data (camera images). The experimental results show that the proposed method outperforms the existing method in terms of recall and F-measure. © 2012 IEEE.
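
As a rough illustration of the first step in the abstract, the sketch below computes Sobel gradient magnitude and direction for a grayscale frame in plain Python. It is not the authors' implementation. The abstract does not state how dominant text pixels are chosen from these values, so `strong_gradient_pixels` and its `keep_fraction` parameter are hypothetical stand-ins that simply keep the pixels with the largest gradient magnitude.

```python
import math

# Standard 3x3 Sobel kernels for horizontal (x) and vertical (y) intensity change.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_gradients(frame):
    """Return (magnitude, direction) grids for a grayscale frame.

    `frame` is a list of equal-length rows of intensities.
    Border pixels are left at zero to keep the sketch short.
    Direction is in radians, as returned by math.atan2(gy, gx).
    """
    height, width = len(frame), len(frame[0])
    magnitude = [[0.0] * width for _ in range(height)]
    direction = [[0.0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = gy = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    value = frame[r + dr][c + dc]
                    gx += SOBEL_X[dr + 1][dc + 1] * value
                    gy += SOBEL_Y[dr + 1][dc + 1] * value
            magnitude[r][c] = math.hypot(gx, gy)
            direction[r][c] = math.atan2(gy, gx)
    return magnitude, direction


def strong_gradient_pixels(magnitude, keep_fraction=0.5):
    """Hypothetical selection rule (an assumption, not from the paper):
    keep pixels whose magnitude is at least keep_fraction of the maximum."""
    peak = max(max(row) for row in magnitude)
    if peak == 0:
        return set()
    threshold = keep_fraction * peak
    return {
        (r, c)
        for r, row in enumerate(magnitude)
        for c, m in enumerate(row)
        if m >= threshold
    }
```

For a frame containing a single vertical step edge, for example `[[0, 0, 0, 255, 255, 255] for _ in range(5)]`, the selected pixels are the interior pixels on either side of the step, and their gradient direction is horizontal. The later stages described in the abstract (text representatives, word patches, and growing along the text direction) are not covered by this sketch.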

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/121005