Shakeout: A new regularized deep neural network training scheme

Kang, G; Li, J; Tao, D

Publication Type: Conference Proceeding
Citation: 30th AAAI Conference on Artificial Intelligence, AAAI 2016, 2016, pp. 1751-1757
Issue Date: 2016-01-01

Closed Access


Full metadata record

Field	Value	Language
dc.contributor.author	Kang, G	en_US
dc.contributor.author	Li, J https://orcid.org/0000-0002-1336-2241	en_US
dc.contributor.author	Tao, D https://orcid.org/0000-0001-7225-5449	en_US
dc.date.issued	2016-01-01	en_US
dc.identifier.citation	30th AAAI Conference on Artificial Intelligence, AAAI 2016, 2016, pp. 1751 - 1757	en_US
dc.identifier.isbn	9781577357605	en_US
dc.identifier.uri	http://hdl.handle.net/10453/122948
dc.description.abstract	© Copyright 2016, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. Recent years have witnessed the success of deep neural networks in dealing with a wide range of practical problems. The invention of effective training techniques has contributed greatly to this success. The so-called "Dropout" training scheme is one of the most powerful tools for reducing over-fitting. From a statistical point of view, Dropout works by implicitly imposing an L2 regularizer on the weights. In this paper, we present a new training scheme: Shakeout. Instead of randomly discarding units as Dropout does at the training stage, our method randomly chooses to enhance or reverse the contribution of each unit to the next layer. We show that our scheme leads to a combination of L1 and L2 regularization imposed on the weights, which Elastic Net models have shown to be effective in practice. We have empirically evaluated the Shakeout scheme and demonstrated that sparse network weights are obtained via Shakeout training. Our classification experiments on the real-life image datasets MNIST and CIFAR-10 show that Shakeout deals with over-fitting effectively.	en_US
dc.relation	http://purl.org/au-research/grants/arc/DP140102164
dc.relation.ispartof	30th AAAI Conference on Artificial Intelligence, AAAI 2016	en_US
dc.title	Shakeout: A new regularized deep neural network training scheme	en_US
dc.type	Conference Proceeding
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
pubs.organisational-group	/University of Technology Sydney/Students
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US

Abstract:

© Copyright 2016, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. Recent years have witnessed the success of deep neural networks in dealing with a wide range of practical problems. The invention of effective training techniques has contributed greatly to this success. The so-called "Dropout" training scheme is one of the most powerful tools for reducing over-fitting. From a statistical point of view, Dropout works by implicitly imposing an L2 regularizer on the weights. In this paper, we present a new training scheme: Shakeout. Instead of randomly discarding units as Dropout does at the training stage, our method randomly chooses to enhance or reverse the contribution of each unit to the next layer. We show that our scheme leads to a combination of L1 and L2 regularization imposed on the weights, which Elastic Net models have shown to be effective in practice. We have empirically evaluated the Shakeout scheme and demonstrated that sparse network weights are obtained via Shakeout training. Our classification experiments on the real-life image datasets MNIST and CIFAR-10 show that Shakeout deals with over-fitting effectively.
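
The abstract describes the mechanism only in words, so the following is a minimal sketch of what "randomly enhance or reverse the contribution of each unit" could look like, not necessarily the authors' exact formulation. The update rule, the parameter names `tau` (probability of reversing a unit) and `c` (strength of the sign term), and both function names are assumptions introduced for illustration. The rule is chosen so that the perturbed weights equal the original weights in expectation, and so that setting `c = 0` reduces to ordinary inverted Dropout.

```python
import numpy as np


def shakeout_weights(W, tau=0.5, c=0.1, rng=None):
    """Sample perturbed weights for one training step (illustrative rule).

    W has shape (n_out, n_in). One Bernoulli draw is made per input unit j:
      kept (prob 1 - tau): W_ij -> (W_ij + c * tau * sign(W_ij)) / (1 - tau)
      reversed (prob tau): W_ij -> -c * sign(W_ij)
    """
    rng = np.random.default_rng() if rng is None else rng
    s = np.sign(W)
    # keep[j] is True with probability 1 - tau
    keep = rng.random(W.shape[1]) >= tau
    enhanced = (W + c * tau * s) / (1.0 - tau)  # contribution is amplified
    reversed_ = -c * s                          # contribution changes sign
    return np.where(keep[np.newaxis, :], enhanced, reversed_)


def expected_shakeout_weights(W, tau=0.5, c=0.1):
    """Exact expectation of shakeout_weights over the Bernoulli draws."""
    s = np.sign(W)
    enhanced = (W + c * tau * s) / (1.0 - tau)
    reversed_ = -c * s
    return (1.0 - tau) * enhanced + tau * reversed_
```

Under this assumed rule the sign term `c * sign(W_ij)` is what would push small weights toward zero, which is in the spirit of the L1 component mentioned in the abstract, while the `1 / (1 - tau)` rescaling is the familiar Dropout part. The derivation of the actual regularizer is in the paper itself and is not reproduced in this record.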

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/122948