Adversarial deep learning models with multiple adversaries

Publication Type:
Journal Article
Citation:
IEEE Transactions on Knowledge and Data Engineering, 2019, 31 (6), pp. 1066 - 1079
Issue Date:
2019-06-01
Filename:
B26C4B0D-471C-46BA-B8F9-80D7E7E09A am.pdf
Description:
Accepted Manuscript Version
Size:
5.31 MB (Adobe PDF)
© 1989-2012 IEEE. We develop an adversarial learning algorithm for supervised classification in general and Convolutional Neural Networks (CNNs) in particular. The algorithm's objective is to produce small changes to the data distribution defined over positive and negative class labels so that the resulting data distribution is misclassified by the CNN. The theoretical goal is to determine a manipulation of the input data that locates learner decision boundaries where many positive labels become negative labels. We then propose a CNN that is secure against such unforeseen changes in the data. The algorithm generates adversarial manipulations by formulating a multiplayer stochastic game targeting the classification performance of the CNN. The multiplayer stochastic game is expressed in terms of multiple two-player sequential games. Each game consists of interactions between two players, an intelligent adversary and the learner CNN, such that a player's payoff function increases with interactions. Following the convergence of a sequential noncooperative Stackelberg game, each two-player game is solved for the Nash equilibrium. The Nash equilibrium is a pair of strategies (learner weights and evolutionary operations) from which neither the learner nor the adversary has an incentive to deviate. We then retrain the learner on all the adversarial manipulations generated by the multiple players to obtain a secure CNN that is robust to subsequent adversarial data manipulations. The adversarial data and the corresponding CNN performance are evaluated on the MNIST handwritten digits data. The results suggest that game theory and evolutionary algorithms are effective in securing deep learning models against performance vulnerabilities simulated as attack scenarios from multiple adversaries.
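The loop described in the abstract (the adversary searches for a manipulation, the learner retrains, and a final learner is retrained on the manipulations from several adversaries) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a logistic-regression learner in place of the CNN, synthetic 2-D points in place of MNIST, a mutation-and-selection hill climb in place of the paper's evolutionary operators, and a fixed number of rounds in place of a convergence test. All function names and parameters (make_data, evolve, play_game, cost, sigma and so on) are hypothetical.

```python
import math
import random

def make_data(rng, n=60):
    """Toy 2-D data: label 1 around (+2, +2), label 0 around (-2, -2)."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        c = 2.0 if y == 1 else -2.0
        data.append(((rng.gauss(c, 1.0), rng.gauss(c, 1.0)), y))
    return data

def train(data, epochs=200, lr=0.1):
    """Learner move: fit logistic-regression weights by batch gradient descent."""
    w0 = w1 = b = 0.0
    n = len(data)
    for _ in range(epochs):
        g0 = g1 = gb = 0.0
        for (x0, x1), y in data:
            z = max(-30.0, min(30.0, w0 * x0 + w1 * x1 + b))  # avoid overflow
            err = 1.0 / (1.0 + math.exp(-z)) - y
            g0 += err * x0
            g1 += err * x1
            gb += err
        w0 -= lr * g0 / n
        w1 -= lr * g1 / n
        b -= lr * gb / n
    return (w0, w1, b)

def error(model, data):
    """Fraction of examples the learner misclassifies."""
    w0, w1, b = model
    wrong = sum(1 for (x0, x1), y in data
                if (w0 * x0 + w1 * x1 + b > 0) != (y == 1))
    return wrong / len(data)

def manipulate(data, delta):
    """Shift every positive example by delta; negatives are left unchanged."""
    return [((x0 + delta[0], x1 + delta[1]), y) if y == 1 else ((x0, x1), y)
            for (x0, x1), y in data]

def payoff(model, data, delta, cost=0.01):
    """Adversary payoff (assumed form): learner error minus a size penalty."""
    size = delta[0] ** 2 + delta[1] ** 2
    return error(model, manipulate(data, delta)) - cost * size

def evolve(model, data, rng, start, generations=30, pop=20, sigma=0.5):
    """Adversary move: keep a mutated delta only if it raises the payoff."""
    best, best_pay = start, payoff(model, data, start)
    for _ in range(generations):
        for _ in range(pop):
            cand = (best[0] + rng.gauss(0, sigma), best[1] + rng.gauss(0, sigma))
            p = payoff(model, data, cand)
            if p > best_pay:
                best, best_pay = cand, p
    return best

def play_game(data, rng, rounds=5):
    """One two-player game: the adversary leads, the learner retrains in reply."""
    model, delta = train(data), (0.0, 0.0)
    for _ in range(rounds):
        delta = evolve(model, data, rng, delta)
        model = train(manipulate(data, delta))
    return delta

def secure_learner(data, n_adversaries=3, seed=0):
    """Retrain on the original data plus every adversary's manipulated copy."""
    deltas = [play_game(data, random.Random(seed + k)) for k in range(n_adversaries)]
    augmented = list(data)
    for d in deltas:
        augmented += manipulate(data, d)
    return train(augmented), deltas

data = make_data(random.Random(42))
baseline = train(data)
secure, deltas = secure_learner(data)
attacked = [p for d in deltas for p in manipulate(data, d)]
print("baseline error under attack: ", error(baseline, attacked))
print("retrained error under attack:", error(secure, attacked))
```

Comparing the two printed error rates gives a rough sense of how much retraining on the pooled manipulations helps in this toy setting. The numbers say nothing about the CNN results reported in the article.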