Anomaly3D: Video anomaly detection based on 3D-normality clusters

Asad, M; Yang, J; Tu, E; Chen, L; He, X

Publisher: Elsevier BV
Publication Type: Journal Article
Citation: Journal of Visual Communication and Image Representation, 2021, 75, pp. 103047
Issue Date: 2021-02-01

Closed Access

Filename	Description	Size	Format
1-s2.0-S1047320321000201-main.pdf	Published version	3.04 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Asad, M
dc.contributor.author	Yang, J
dc.contributor.author	Tu, E
dc.contributor.author	Chen, L
dc.contributor.author	He, X https://orcid.org/0000-0001-8962-540X
dc.date.accessioned	2022-03-12T22:14:43Z
dc.date.available	2022-03-12T22:14:43Z
dc.date.issued	2021-02-01
dc.identifier.citation	Journal of Visual Communication and Image Representation, 2021, 75, pp. 103047
dc.identifier.issn	1047-3203
dc.identifier.issn	1095-9076
dc.identifier.uri	http://hdl.handle.net/10453/155176
dc.description.abstract	Abnormal behavior detection in surveillance videos is necessary for public monitoring and safety. In human-based surveillance systems, this requires continuous human attention and observation, which is difficult to sustain. The autonomous detection of such events is therefore essential. However, due to the scarcity of labeled data and the low occurrence probability of these events, abnormal event detection is a challenging vision problem. In this paper, we introduce a novel two-stage architecture for detecting anomalous behavior in videos. In the first stage, we propose a 3D Convolutional Autoencoder (3D-CAE) architecture to extract spatio-temporal features from normal event training videos. In 3D-CAE, the encoder and decoder architectures are based on 3D convolutions, which can learn both appearance and motion features effectively in an unsupervised manner. In the second stage, we group the 3D spatio-temporal features into different normality clusters, and then remove the sparse clusters to represent a stronger pattern of normality. From these clusters, a one-class SVM classifier is used to distinguish between normal and abnormal events based on the normality scores. Experimental results on four different benchmark datasets show significant performance improvement compared to state-of-the-art approaches while providing results in real time.
dc.language	en
dc.publisher	Elsevier BV
dc.relation.ispartof	Journal of Visual Communication and Image Representation
dc.relation.isbasedon	10.1016/j.jvcir.2021.103047
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0801 Artificial Intelligence and Image Processing, 1203 Design Practice and Management, 1905 Visual Arts and Crafts
dc.subject.classification	Artificial Intelligence & Image Processing
dc.title	Anomaly3D: Video anomaly detection based on 3D-normality clusters
dc.type	Journal Article
utslib.citation.volume	75
utslib.for	0801 Artificial Intelligence and Image Processing
utslib.for	1203 Design Practice and Management
utslib.for	1905 Visual Arts and Crafts
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - CRIN - Realtime Information Networks
pubs.organisational-group	/University of Technology Sydney/Strength - GBDTC - Global Big Data Technologies
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Electrical and Data Engineering
utslib.copyright.status	closed_access	*
dc.date.updated	2022-03-12T22:14:41Z
pubs.publication-status	Published
pubs.volume	75

Abstract:

Abnormal behavior detection in surveillance videos is necessary for public monitoring and safety. In human-based surveillance systems, this requires continuous human attention and observation, which is difficult to sustain. The autonomous detection of such events is therefore essential. However, due to the scarcity of labeled data and the low occurrence probability of these events, abnormal event detection is a challenging vision problem. In this paper, we introduce a novel two-stage architecture for detecting anomalous behavior in videos. In the first stage, we propose a 3D Convolutional Autoencoder (3D-CAE) architecture to extract spatio-temporal features from normal event training videos. In 3D-CAE, the encoder and decoder architectures are based on 3D convolutions, which can learn both appearance and motion features effectively in an unsupervised manner. In the second stage, we group the 3D spatio-temporal features into different normality clusters, and then remove the sparse clusters to represent a stronger pattern of normality. From these clusters, a one-class SVM classifier is used to distinguish between normal and abnormal events based on the normality scores. Experimental results on four different benchmark datasets show significant performance improvement compared to state-of-the-art approaches while providing results in real time.
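
Illustrative sketch (not from the paper): the second stage described in the abstract, grouping features into normality clusters and discarding the sparse ones, might look roughly like the following. Everything here is an assumption made for illustration: the 2-D tuples stand in for 3D-CAE features, k-means is one possible clustering method, the `min_fraction` cutoff and the `1.5` slack factor are hypothetical, and a nearest-centroid distance threshold is used as a simple stand-in for the one-class SVM that the abstract names.

```python
import math

def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))

    def assign():
        # Label each point with the index of its nearest centroid.
        return [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                for p in points]

    for _ in range(iters):
        labels = assign()
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return centroids, assign()

def dense_cluster_ids(labels, k, min_fraction=0.1):
    """Ids of clusters holding at least min_fraction of the training features.
    min_fraction is a hypothetical cutoff, not a value from the paper."""
    return {c for c in range(k) if labels.count(c) >= min_fraction * len(labels)}

def nearest_distance(feature, centroids):
    """Distance to the closest retained centroid; larger means less normal."""
    return min(math.dist(feature, c) for c in centroids)

def is_anomalous(feature, centroids, threshold):
    """Stand-in decision rule (assumption): the abstract uses a one-class SVM."""
    return nearest_distance(feature, centroids) > threshold

def blob(cx, cy):
    """A 3x3 grid of toy features starting at (cx, cy)."""
    return [(cx + 0.1 * i, cy + 0.1 * j) for i in range(3) for j in range(3)]

# Toy "normal" training features: two dense groups plus one stray point.
train = blob(0.0, 0.0) + blob(5.0, 5.0) + [(10.0, 0.0)]
centroids, labels = kmeans(train, k=3)

# Drop sparse clusters so only strong normality patterns remain.
keep = dense_cluster_ids(labels, k=3)
kept = [centroids[c] for c in sorted(keep)]

# Calibrate a threshold from the retained training features (slack is assumed).
retained = [p for p, l in zip(train, labels) if l in keep]
threshold = 1.5 * max(nearest_distance(p, kept) for p in retained)

for f in [(0.15, 0.05), (2.5, 2.5), (10.0, 0.0)]:
    print(f, "anomalous" if is_anomalous(f, kept, threshold) else "normal")
```

In this toy setup the single stray training point forms its own cluster, which is pruned as sparse, so a test feature at the same location is flagged even though it appeared in the training set. In practice the features would come from the trained 3D-CAE encoder, and the distance rule would be replaced by a one-class SVM fitted on the retained clusters.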

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/155176