Robust Deep Learning Models against Semantic-Preserving Adversarial Attack
- Publisher:
- IEEE
- Publication Type:
- Conference Proceeding
- Citation:
- 2023 International Joint Conference on Neural Networks (IJCNN), 2023, 00, pp. 1-8
- Issue Date:
- 2023-01-01
Closed Access
Filename | Description | Size | |||
---|---|---|---|---|---|
2304.03955v1.pdf | Submitted version | 7.28 MB |
Copyright Clearance Process
- Recently Added
- In Progress
- Closed Access
This item is closed access and not available.
Deep learning models can be fooled by small l(p)-norm adversarial perturbations and natural perturbations in terms of attributes. Although the robustness against each perturbation has been explored, it remains a challenge to address the robustness against joint perturbations effectively. In this paper, we study the robustness of deep learning models against joint perturbations by proposing a novel attack mechanism named Semantic-Preserving Adversarial (SPA) attack, which can then be used to enhance adversarial training. Specifically, we introduce an attribute manipulator to generate natural and human-comprehensible perturbations and a noise generator to generate diverse adversarial noises. Based on such combined noises, we optimize both the attribute value and the diversity variable to generate jointlyperturbed samples. For robust training, we adversarially train the deep learning model against the generated joint perturbations. Empirical results on four benchmarks show that the SPA attack causes a larger performance decline with small l1 norm-ball constraints compared to existing approaches. Furthermore, our SPA-enhanced training outperforms existing defense methods against such joint perturbations.
Please use this identifier to cite or link to this item: