Patch reordering: A novel way to achieve rotation and translation invariance in convolutional neural networks

Shen, X; Tian, X; Sun, S; Tao, D

Publication Type: Conference Proceeding
Citation: 31st AAAI Conference on Artificial Intelligence, AAAI 2017, 2017, pp. 2534 - 2540
Issue Date: 2017-01-01

Closed Access

Filename	Description	Size	Format
14674-66929-1-PB.pdf	Published version	1.12 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Shen, X	en_US
dc.contributor.author	Tian, X	en_US
dc.contributor.author	Sun, S	en_US
dc.contributor.author	Tao, D https://orcid.org/0000-0001-7225-5449	en_US
dc.date.issued	2017-01-01	en_US
dc.identifier.citation	31st AAAI Conference on Artificial Intelligence, AAAI 2017, 2017, pp. 2534 - 2540	en_US
dc.identifier.uri	http://hdl.handle.net/10453/125899
dc.description.abstract	Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance on many visual recognition tasks. However, the combination of convolution and pooling operations only shows invariance to small local location changes in meaningful objects in input. Sometimes, such networks are trained using data augmentation to encode this invariance into the parameters, which restricts the capacity of the model to learn the content of these objects. A more efficient use of the parameter budget is to encode rotation or translation invariance into the model architecture, which relieves the model from the need to learn them. To enable the model to focus on learning the content of objects other than their locations, we propose to conduct patch ranking of the feature maps before feeding them into the next layer. When patch ranking is combined with convolution and pooling operations, we obtain consistent representations despite the location of meaningful objects in input. We show that the patch ranking module improves the performance of the CNN on many benchmark tasks, including MNIST digit recognition, large-scale image recognition, and image retrieval.	en_US
dc.relation.ispartof	31st AAAI Conference on Artificial Intelligence, AAAI 2017	en_US
dc.title	Patch reordering: A novel way to achieve rotation and translation invariance in convolutional neural networks	en_US
dc.type	Conference Proceeding
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US

Abstract:

Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance on many visual recognition tasks. However, the combination of convolution and pooling operations is invariant only to small local changes in the location of meaningful objects in the input. Sometimes, such networks are trained using data augmentation to encode this invariance into the parameters, which restricts the capacity of the model to learn the content of these objects. A more efficient use of the parameter budget is to encode rotation or translation invariance into the model architecture, which relieves the model of the need to learn it. To enable the model to focus on learning the content of objects rather than their locations, we propose to conduct patch ranking of the feature maps before feeding them into the next layer. When patch ranking is combined with convolution and pooling operations, we obtain consistent representations regardless of the location of meaningful objects in the input. We show that the patch ranking module improves the performance of the CNN on many benchmark tasks, including MNIST digit recognition, large-scale image recognition, and image retrieval.
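
The patch ranking idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state how patches are scored or laid out, so the sketch assumes non-overlapping square patches, a score equal to the sum of absolute activations, and a row-major layout of the ranked patches. The names `split_into_patches`, `patch_energy`, and `reorder_patches` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def split_into_patches(fmap: Matrix, patch: int) -> List[Matrix]:
    """Cut a 2D feature map into non-overlapping patch x patch blocks, row-major."""
    h, w = len(fmap), len(fmap[0])
    if h % patch or w % patch:
        raise ValueError("feature map size must be a multiple of the patch size")
    return [
        [row[c:c + patch] for row in fmap[r:r + patch]]
        for r in range(0, h, patch)
        for c in range(0, w, patch)
    ]


def patch_energy(p: Matrix) -> float:
    """Assumed ranking score: sum of absolute activations in the patch."""
    return sum(abs(v) for row in p for v in row)


def reorder_patches(fmap: Matrix, patch: int) -> Matrix:
    """Place patches back on the grid in descending order of their score."""
    h, w = len(fmap), len(fmap[0])
    ranked = sorted(split_into_patches(fmap, patch), key=patch_energy, reverse=True)
    per_row = w // patch  # number of patches in each grid row
    out = [[0.0] * w for _ in range(h)]
    for i, p in enumerate(ranked):
        r0, c0 = (i // per_row) * patch, (i % per_row) * patch
        for dr, row in enumerate(p):
            out[r0 + dr][c0:c0 + patch] = row
    return out


# The same 2x2 "object" placed in two different corners of a 4x4 map.
top_left = [[1, 2, 0, 0], [3, 4, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
bottom_right = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 2], [0, 0, 3, 4]]
same = reorder_patches(top_left, 2) == reorder_patches(bottom_right, 2)
```

Under these assumptions, both inputs map to the same reordered feature map, so a following layer sees one representation for either object position. The sketch only covers shifts aligned to the patch grid; it does not demonstrate the rotation case described in the abstract.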

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/125899