Mask assisted object coding with deep learning for object retrieval in surveillance videos

Teng, K; Wang, J; Xu, M; Lu, H

Publication Type: Conference Proceeding
Citation: MM 2014 - Proceedings of the 2014 ACM Conference on Multimedia, 2014, pp. 1109 - 1112
Issue Date: 2014-01-01

Closed Access

	Filename	Description	Size	Format
	p1109-teng.pdf	Published version	3.11 MB	Adobe PDF

Full metadata record

Field	Value	Language
dc.contributor.author	Teng, K	en_US
dc.contributor.author	Wang, J	en_US
dc.contributor.author	Xu, M https://orcid.org/0000-0001-9581-8849	en_US
dc.contributor.author	Lu, H	en_US
dc.date.issued	2014-01-01	en_US
dc.identifier.citation	MM 2014 - Proceedings of the 2014 ACM Conference on Multimedia, 2014, pp. 1109 - 1112	en_US
dc.identifier.isbn	9781450330633	en_US
dc.identifier.uri	http://hdl.handle.net/10453/121619
dc.description.abstract	Retrieving visual objects from a large-scale video dataset is a focus of multimedia research, but it remains a challenging task because of imprecise object extraction and partial occlusion. This paper presents a novel approach to efficiently encode and retrieve visual objects, which addresses some practical complications in surveillance videos. Specifically, we take advantage of mask information to assist object representation, and develop an encoding method that uses a highly nonlinear mapping learned by a deep neural network. Furthermore, we add occlusion noise to the learning process to improve robustness to background noise and partial occlusions. A real-life surveillance video dataset containing over 10 million objects is built to evaluate the proposed approach. Experimental results show that our approach significantly outperforms state-of-the-art solutions for object retrieval in large-scale video datasets.	en_US
dc.relation.ispartof	MM 2014 - Proceedings of the 2014 ACM Conference on Multimedia	en_US
dc.relation.isbasedon	10.1145/2647868.2654981	en_US
dc.rights	info:eu-repo/semantics/closedAccess
dc.title	Mask assisted object coding with deep learning for object retrieval in surveillance videos	en_US
dc.type	Conference Proceeding
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Electrical and Data Engineering
pubs.organisational-group	/University of Technology Sydney/Strength - GBDTC - Global Big Data Technologies
pubs.organisational-group	/University of Technology Sydney/Strength - INEXT - Innovation in IT Services and Applications
utslib.copyright.status	closed_access	*
pubs.publication-status	Published	en_US

Abstract:

Retrieving visual objects from a large-scale video dataset is a focus of multimedia research, but it remains a challenging task because of imprecise object extraction and partial occlusion. This paper presents a novel approach to efficiently encode and retrieve visual objects, which addresses some practical complications in surveillance videos. Specifically, we take advantage of mask information to assist object representation, and develop an encoding method that uses a highly nonlinear mapping learned by a deep neural network. Furthermore, we add occlusion noise to the learning process to improve robustness to background noise and partial occlusions. A real-life surveillance video dataset containing over 10 million objects is built to evaluate the proposed approach. Experimental results show that our approach significantly outperforms state-of-the-art solutions for object retrieval in large-scale video datasets.
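
The abstract names two ingredients, mask information and occlusion noise added during learning, without giving their details. The sketch below is only an illustration of how such inputs might be prepared and is not the authors' implementation: the function names `apply_mask` and `add_occlusion_noise`, the rectangular occluder, the `max_frac` parameter, and the zero fill value are all assumptions made for this example. Images are plain nested lists so the code needs only the standard library.

```python
import random


def apply_mask(image, mask):
    """Keep image[r][c] where mask[r][c] is truthy; set other pixels to 0.0."""
    return [[px if m else 0.0 for px, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


def add_occlusion_noise(image, max_frac=0.5, fill=0.0, rng=None):
    """Return a copy of image with one random rectangle overwritten by fill.

    Hypothetical stand-in for the occlusion noise mentioned in the abstract;
    the paper may use a different noise model.
    """
    rng = rng or random.Random()
    h, w = len(image), len(image[0])
    # Occluder size: at least 1 pixel, at most max_frac of each dimension.
    bh = rng.randint(1, max(1, int(h * max_frac)))
    bw = rng.randint(1, max(1, int(w * max_frac)))
    top = rng.randint(0, h - bh)
    left = rng.randint(0, w - bw)
    out = [row[:] for row in image]  # copy rows so the input is not modified
    for r in range(top, top + bh):
        for c in range(left, left + bw):
            out[r][c] = fill
    return out


if __name__ == "__main__":
    image = [[1.0] * 4 for _ in range(4)]
    mask = [[0, 1, 1, 0]] * 4  # toy foreground mask: the two middle columns
    foreground = apply_mask(image, mask)
    occluded = add_occlusion_noise(foreground, rng=random.Random(0))
    for row in occluded:
        print(row)
```

In a training setup of this kind, the occluded copy would typically be the network input while the clean masked object serves as the target, so that the learned code becomes less sensitive to missing regions. Whether the paper follows exactly this scheme is not stated in the abstract.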

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/121619