Convolutional Neural Network for Accurate Crowd Counting and Destiny Estimation

Amirgholipour Kasmani, Saeed

Publication Type: Thesis
Issue Date: 2021

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Amirgholipour Kasmani, Saeed
dc.date.accessioned	2021-10-07T22:48:17Z
dc.date.available	2021-10-07T22:48:17Z
dc.date.issued	2021
dc.identifier.uri	http://hdl.handle.net/10453/150908
dc.description	University of Technology Sydney. Faculty of Engineering and Information Technology.	en_US.UTF-8
dc.format	Thesis (PhD)
dc.language.iso	en_US	en_US.UTF-8
dc.relation	https://opus.lib.uts.edu.au/bitstream/10453/150908/2/02whole.pdf
dc.rights	The author owns the copyright in this thesis including all reproduction and reuse rights for the work. The work may not be altered without the permission of the copyright owner. Attribution is essential when quoting or paraphrasing from this thesis.
dc.rights	au.edu.uts.lib/ppc
dc.rights	info:eu-repo/semantics/openAccess
dc.title	Convolutional Neural Network for Accurate Crowd Counting and Destiny Estimation	en_US.UTF-8
dc.type	Thesis
utslib.copyright.status	open_access	*

Abstract:

Crowd and object counting has become an important task in a variety of applications, such as traffic control, public safety, urban planning, and video surveillance. It is also a crucial component of high-level monitoring systems for video surveillance and crowd analysis. In these settings, dynamic crowd monitoring and analysis are extremely important for crowd control and public safety. Like other computer vision problems, crowd counting and density estimation face challenges such as heavy clutter, occlusion, non-uniform distributions of objects or people, and intra-scene and inter-scene variations in appearance. Researchers and industry partners have developed many sophisticated models to address these issues. In recent years especially, the volume of research on crowd counting has grown rapidly as deep learning and Convolutional Neural Network (CNN) based models have come to dominate computer vision tasks. In this thesis, we revisit crowd counting and propose several novel solutions to the problem. First, we propose an Adaptive Counting Convolutional Neural Network (A-CCNN) that adaptively accounts for the scale variation of objects in a frame to improve counting accuracy. The method takes advantage of contextual information to produce more accurate and adaptive density maps and crowd counts for a scene. Second, we focus on CNN pruning to make crowd counting models better suited to real-time applications and to improve the performance of the CCNN model. We propose a new pruning strategy that considers the contribution of each filter to the final result. The filters in the original CCNN model are grouped into positive, negative, and irrelevant types. We prune the irrelevant filters, whose feature maps contain little information, and the negative filters, which are identified by a mask learned from the training dataset. This improves the results of the counting model without fine-tuning or retraining the pruned model. Finally, we propose a novel Pyramid Density-Aware Attention-based network (PDANet), which combines attention, pyramid scale feature, and two-branch decoder modules for density-aware crowd counting. PDANet uses these modules to extract features at different scales, focus on relevant information, and suppress misleading information. Extensive evaluations on challenging benchmark datasets demonstrate that the proposed models outperform well-known state-of-the-art approaches in both counting accuracy and the quality of the generated density maps.
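
The filter grouping described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the mean-absolute-activation test for "little information", the threshold value, and the hand-written mask are all assumptions made for illustration, and the way the mask is learned from training data is not modelled.

```python
from statistics import mean


def classify_filters(feature_maps, negative_mask, info_threshold=1e-3):
    """Label each filter as 'irrelevant', 'negative', or 'positive'.

    feature_maps: one flat list of activations per filter (hypothetical input).
    negative_mask: one 0/1 flag per filter; 1 marks a filter that the mask
        treats as harmful. Learning the mask is outside this sketch.
    info_threshold: assumed cut-off on the mean absolute activation.
    """
    if len(feature_maps) != len(negative_mask):
        raise ValueError("need exactly one mask entry per filter")
    labels = []
    for activations, is_negative in zip(feature_maps, negative_mask):
        # A feature map that is almost all zeros carries little information.
        if mean(abs(a) for a in activations) < info_threshold:
            labels.append("irrelevant")
        elif is_negative:
            labels.append("negative")
        else:
            labels.append("positive")
    return labels


def kept_filter_indices(labels):
    """Indices of the filters that survive pruning (the positive ones)."""
    return [i for i, label in enumerate(labels) if label == "positive"]


# Toy data: three filters with four activations each.
feature_maps = [
    [0.8, 0.1, 0.5, 0.9],  # informative and not flagged by the mask
    [0.0, 0.0, 0.0, 0.0],  # silent filter
    [0.4, 0.7, 0.2, 0.6],  # informative but flagged by the mask
]
negative_mask = [0, 0, 1]

labels = classify_filters(feature_maps, negative_mask)
kept = kept_filter_indices(labels)
print(labels)  # ['positive', 'irrelevant', 'negative']
print(kept)    # [0]
```

In this toy setting only the first filter is kept. Because pruning here only drops filters and leaves the remaining weights unchanged, no fine-tuning step appears in the sketch, which matches the abstract's claim in spirit but says nothing about how the thesis achieves it.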

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/150908