Centered Weight Normalization in Accelerating Training of Deep Neural Networks

Publication Type:
Conference Proceeding
Citation:
Proceedings of the IEEE International Conference on Computer Vision (ICCV), vol. 2017-October, 2017, pp. 2822-2830
Issue Date:
2017-12-22
Abstract:
© 2017 IEEE. Training deep neural networks is difficult due to the pathological curvature of the loss surface. Re-parameterization is an effective way to relieve this problem, either by approximately learning the curvature or by constraining the weights to solutions with properties that favor optimization. This paper proposes to re-parameterize the input weight of each neuron in a deep neural network by normalizing it to zero mean and unit norm, followed by a learnable scalar parameter that adjusts the norm of the weight. This technique implicitly stabilizes the distribution of each neuron's pre-activations. In addition, it improves the conditioning of the optimization problem and thus accelerates training. In practice, the method can be wrapped as a linear module and plugged into any architecture in place of the standard linear module. We highlight the benefits of our method on both multi-layer perceptrons and convolutional neural networks, and demonstrate its scalability and efficiency on the SVHN, CIFAR-10, CIFAR-100 and ImageNet datasets.
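
The following is a minimal PyTorch sketch of the re-parameterization the abstract describes: each neuron's input weight vector is centered to zero mean, scaled to unit norm, and multiplied by a learnable scalar gain. The class name CWNLinear, the initialization constants, and the eps term are illustrative assumptions, not the authors' reference implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CWNLinear(nn.Module):
    """Linear layer with centered weight normalization: the effective
    weight of each neuron is g * (v - mean(v)) / ||v - mean(v)||."""
    def __init__(self, in_features, out_features, bias=True, eps=1e-8):
        super().__init__()
        # Unconstrained parameter v; the effective weight is derived from it.
        self.v = nn.Parameter(torch.randn(out_features, in_features) * 0.05)
        # Learnable scalar gain per neuron, adjusting the weight norm.
        self.g = nn.Parameter(torch.ones(out_features))
        self.bias = nn.Parameter(torch.zeros(out_features)) if bias else None
        self.eps = eps  # numerical-stability constant (our own assumption)

    def forward(self, x):
        # Center each row (one neuron's input weight) to zero mean ...
        v_centered = self.v - self.v.mean(dim=1, keepdim=True)
        # ... normalize it to unit norm, then rescale by the gain g.
        norm = v_centered.norm(dim=1, keepdim=True) + self.eps
        w = self.g.unsqueeze(1) * v_centered / norm
        return F.linear(x, w, self.bias)

Used as a drop-in replacement for nn.Linear, e.g. y = CWNLinear(256, 128)(torch.randn(32, 256)); each row of the effective weight then has zero mean and norm close to |g|, which is what yields the stable pre-activation distribution and improved conditioning claimed above.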