Boosting Model Inversion Attacks with Adversarial Examples

Zhou, S; Zhu, T; Ye, D; Yu, X; Zhou, W

Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Publication Type: Journal Article
Citation: IEEE Transactions on Dependable and Secure Computing, 2023, PP, (99), pp. 1-18
Issue Date: 2023-01-01

Closed Access

	Filename	Description	Size	Format
	Boosting Model Inversion Attacks with Adversarial Examples_OPUS.pdf	Accepted version	5.09 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Zhou, S
dc.contributor.author	Zhu, T https://orcid.org/0000-0003-3411-7947
dc.contributor.author	Ye, D https://orcid.org/0000-0002-7561-0992
dc.contributor.author	Yu, X https://orcid.org/0000-0002-0269-5649
dc.contributor.author	Zhou, W
dc.date.accessioned	2024-03-10T23:56:53Z
dc.date.available	2024-03-10T23:56:53Z
dc.date.issued	2023-01-01
dc.identifier.citation	IEEE Transactions on Dependable and Secure Computing, 2023, PP, (99), pp. 1-18
dc.identifier.issn	1545-5971
dc.identifier.issn	1941-0018
dc.identifier.uri	http://hdl.handle.net/10453/176426
dc.description.abstract	Model inversion attacks involve reconstructing the training data of a target model, which raises serious privacy concerns for machine learning models. However, these attacks, especially learning-based methods, are likely to suffer from low attack accuracy, i.e., low classification accuracy of the reconstructed data by machine learning classifiers. Recent studies showed that an alternative strategy for model inversion attacks, GAN-based optimization, can improve the attack accuracy effectively. However, this series of GAN-based attacks reconstructs only class-representative training data for a class, whereas learning-based attacks can reconstruct diverse data for different training data in each class. Hence, in this paper, we propose a new training paradigm for a learning-based model inversion attack that can achieve higher attack accuracy in a black-box setting. First, we regularize the training process of the attack model with an added semantic loss function and, second, we inject adversarial examples into the training data to increase the diversity of the class-related parts (i.e., the essential features for classification tasks) in the training data. This scheme guides the attack model to pay more attention to the class-related parts of the original data during the data reconstruction process. The experimental results show that our method greatly boosts the performance of existing learning-based model inversion attacks. Even when no extra queries to the target model are allowed, the approach can still improve the attack accuracy of reconstructed data. This new attack shows that the severity of the threat from learning-based model inversion adversaries is underestimated and that more robust defenses are required.
dc.language	en
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.ispartof	IEEE Transactions on Dependable and Secure Computing
dc.relation.isbasedon	10.1109/TDSC.2023.3285015
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0803 Computer Software, 0804 Data Format, 0805 Distributed Computing
dc.subject.classification	Strategic, Defence & Security Studies
dc.subject.classification	4604 Cybersecurity and privacy
dc.subject.classification	4606 Distributed computing and systems software
dc.title	Boosting Model Inversion Attacks with Adversarial Examples
dc.type	Journal Article
utslib.citation.volume	PP
utslib.for	0803 Computer Software
utslib.for	0804 Data Format
utslib.for	0805 Distributed Computing
pubs.organisational-group	University of Technology Sydney
pubs.organisational-group	University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
pubs.organisational-group	University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	University of Technology Sydney/Strength - CCSP - Centre for Cyber Security and Privacy
utslib.copyright.status	closed_access	*
dc.date.updated	2024-03-10T23:56:47Z
pubs.issue	99
pubs.publication-status	Published
pubs.volume	PP
utslib.citation.issue	99

Abstract:

Model inversion attacks involve reconstructing the training data of a target model, which raises serious privacy concerns for machine learning models. However, these attacks, especially learning-based methods, are likely to suffer from low attack accuracy, i.e., low classification accuracy of the reconstructed data by machine learning classifiers. Recent studies showed that an alternative strategy for model inversion attacks, GAN-based optimization, can improve the attack accuracy effectively. However, this series of GAN-based attacks reconstructs only class-representative training data for a class, whereas learning-based attacks can reconstruct diverse data for different training data in each class. Hence, in this paper, we propose a new training paradigm for a learning-based model inversion attack that can achieve higher attack accuracy in a black-box setting. First, we regularize the training process of the attack model with an added semantic loss function and, second, we inject adversarial examples into the training data to increase the diversity of the class-related parts (i.e., the essential features for classification tasks) in the training data. This scheme guides the attack model to pay more attention to the class-related parts of the original data during the data reconstruction process. The experimental results show that our method greatly boosts the performance of existing learning-based model inversion attacks. Even when no extra queries to the target model are allowed, the approach can still improve the attack accuracy of reconstructed data. This new attack shows that the severity of the threat from learning-based model inversion adversaries is underestimated and that more robust defenses are required.
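
Illustrative sketch (not taken from the paper): the abstract defines attack accuracy as the classification accuracy of reconstructed data and mentions an added semantic loss without specifying its form. The Python below shows one way these two quantities might be computed. All function names are hypothetical, and the choice of mean squared error plus a weighted cross-entropy term is an assumption made only for illustration; the authors' actual loss may differ. The probability vector `class_probs` stands for whatever output a classifier returns for a reconstruction; how it is obtained in a black-box setting is not covered here.

```python
import math


def attack_accuracy(predicted_labels, true_labels):
    """Fraction of reconstructed samples that an evaluation classifier
    assigns to the class of the corresponding original sample."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("need at least one sample")
    hits = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return hits / len(true_labels)


def reconstruction_loss(original, reconstructed):
    """Mean squared error between two flattened samples of equal length."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("samples must be non-empty and of equal length")
    squared = ((o - r) ** 2 for o, r in zip(original, reconstructed))
    return sum(squared) / len(original)


def semantic_loss(class_probs, true_label):
    """ASSUMPTION: cross-entropy between a classifier's probabilities for
    the reconstruction and the label of the original sample. The abstract
    does not define the semantic loss; this is a stand-in."""
    return -math.log(max(class_probs[true_label], 1e-12))


def combined_loss(original, reconstructed, class_probs, true_label, weight=1.0):
    """ASSUMPTION: reconstruction error plus a weighted semantic term.
    `weight` is a hypothetical hyperparameter."""
    return (reconstruction_loss(original, reconstructed)
            + weight * semantic_loss(class_probs, true_label))


if __name__ == "__main__":
    # Three of four reconstructions are classified as their original class.
    print(attack_accuracy([0, 1, 1, 2], [0, 1, 2, 2]))  # 0.75
    print(combined_loss([0.0, 1.0], [0.0, 0.0], [0.5, 0.5], 0, weight=2.0))
```

Under these assumptions, the semantic term penalizes reconstructions that a classifier would not assign to the original class, which is one reading of how such a term could raise attack accuracy.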

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/176426