Shakeout: A new regularized deep neural network training scheme

Publication Type: Conference Proceeding
Citation: 30th AAAI Conference on Artificial Intelligence, AAAI 2016, 2016, pp. 1751-1757
© Copyright 2016, Association for the Advancement of Artificial Intelligence. All rights reserved.

Recent years have witnessed the success of deep neural networks in dealing with many practical problems. The invention of effective training techniques has contributed largely to this success. The so-called "Dropout" training scheme is one of the most powerful tools for reducing over-fitting. From a statistical point of view, Dropout works by implicitly imposing an L2 regularizer on the weights. In this paper, we present a new training scheme: Shakeout. Instead of randomly discarding units as Dropout does at the training stage, our method randomly chooses to enhance or reverse the contribution of each unit to the next layer. We show that our scheme leads to a combination of L1 and L2 regularization imposed on the weights, a combination that Elastic Net models have proved effective in practice. We have empirically evaluated the Shakeout scheme and demonstrated that sparse network weights are obtained via Shakeout training. Our classification experiments on the real-life image datasets MNIST and CIFAR-10 show that Shakeout deals with over-fitting effectively.
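
The "enhance or reverse" idea described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact update rule, so the parameterization below is an assumption: each input unit is kept with probability 1 - tau, a constant c sets the strength of the sign-based term, and the scaling is chosen so that the perturbed weights equal the original weights in expectation. The function names and the parameters tau and c are hypothetical and are not taken from the published version.

```python
import numpy as np


def shakeout_weights(W, tau=0.5, c=0.1, rng=None):
    """Sample one randomly perturbed copy of W (shape: n_out x n_in).

    Assumed rule (illustrative, not confirmed by the abstract):
      unit j kept    (prob. 1 - tau): W_ij -> (W_ij + c * tau * sign(W_ij)) / (1 - tau)
      unit j flipped (prob. tau):     W_ij -> -c * sign(W_ij)
    With c = 0 this reduces to ordinary inverted Dropout.
    """
    rng = np.random.default_rng() if rng is None else rng
    # One Bernoulli draw per input unit, shared by all of its outgoing weights.
    keep = rng.random(W.shape[1]) >= tau
    s = np.sign(W)
    enhanced = (W + c * tau * s) / (1.0 - tau)  # contribution is amplified
    reversed_ = -c * s                          # contribution changes sign
    return np.where(keep[None, :], enhanced, reversed_)


def shakeout_forward(x, W, tau=0.5, c=0.1, rng=None):
    """Training-time pre-activations of a linear layer with perturbed weights."""
    return shakeout_weights(W, tau, c, rng) @ x
```

Under these assumptions, averaging many sampled weight matrices recovers W, so the unperturbed weights can be used at test time, as with Dropout. The sign-dependent term is what pushes small weights toward zero; the sketch does not attempt to reproduce the paper's regularization analysis or experiments.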