Detecting adversarial examples by additional evidence from noise domain

Publication Type:
Journal Article
IET Image Processing, 2022, 16, (2), pp. 378-392
Issue Date:
Deep neural networks are widely adopted, powerful tools for perceptual tasks. However, recent research has indicated that they are easily fooled by adversarial examples, which are produced by adding imperceptible adversarial perturbations to clean examples. Here, the steganalysis rich model (SRM) is used to generate noise feature maps, which are combined with RGB images to expose the difference between adversarial examples and clean examples. In particular, a two-stream pseudo-siamese network is proposed that fuses the subtle differences in RGB images with the noise inconsistency in the noise features. The proposed method has strong detection capability and transferability, and can be combined with any model without modifying its architecture or training procedure. Extensive experiments show that, compared with state-of-the-art detection methods, the proposed approach achieves excellent performance in distinguishing adversarial samples generated by popular attack methods on different real datasets. Moreover, the method generalizes well: a detector trained against one specific adversary can defend effectively against other adversaries.
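As a rough illustration of the noise-domain evidence described in the abstract, the sketch below applies a single high-pass filter to a grayscale patch and shows that a flat region yields a zero residual while a small single-pixel perturbation does not. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say which SRM filters are used, so the 5x5 "KV" kernel that is commonly listed among SRM filters is assumed here, and the names `KV_KERNEL`, `KV_SCALE` and `srm_residual` are hypothetical.

```python
# Assumed filter: the 5x5 "KV" high-pass kernel commonly listed among SRM
# filters. The abstract does not specify the filters the authors used.
KV_KERNEL = [
    [-1, 2, -2, 2, -1],
    [2, -6, 8, -6, 2],
    [-2, 8, -12, 8, -2],
    [2, -6, 8, -6, 2],
    [-1, 2, -2, 2, -1],
]
KV_SCALE = 12.0  # usual normalisation constant for this kernel


def srm_residual(channel):
    """Return the 'valid' (unpadded) high-pass residual of a 2D list of pixels."""
    k = len(KV_KERNEL)
    rows, cols = len(channel), len(channel[0])
    out = []
    for r in range(rows - k + 1):
        out_row = []
        for c in range(cols - k + 1):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    acc += KV_KERNEL[i][j] * channel[r + i][c + j]
            out_row.append(acc / KV_SCALE)
        out.append(out_row)
    return out


# A flat 7x7 patch, and a copy with a small single-pixel perturbation.
flat = [[100] * 7 for _ in range(7)]
perturbed = [row[:] for row in flat]
perturbed[3][3] += 2

clean_residual = srm_residual(flat)      # kernel sums to zero, so all zeros
adv_residual = srm_residual(perturbed)   # perturbation survives the filter
```

Because the kernel coefficients sum to zero, smooth image content is suppressed and only local inconsistencies remain, which is the kind of signal a noise stream could take as input alongside the RGB image.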