Towards Model Extraction Attacks in GAN-Based Image Translation via Domain Shift Mitigation

Mi, D; Zhang, Y; Zhang, LY; Hu, S; Zhong, Q; Yuan, H; Pan, S

Publisher: Association for the Advancement of Artificial Intelligence (AAAI)
Publication Type: Conference Proceeding
Citation: Proceedings of the AAAI Conference on Artificial Intelligence, 2024, 38, (18), pp. 19902-19910
Issue Date: 2024-03-25

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Mi, D
dc.contributor.author	Zhang, Y https://orcid.org/0000-0001-5611-3483
dc.contributor.author	Zhang, LY
dc.contributor.author	Hu, S
dc.contributor.author	Zhong, Q
dc.contributor.author	Yuan, H
dc.contributor.author	Pan, S
dc.date.accessioned	2024-04-15T02:26:55Z
dc.date.available	2024-04-15T02:26:55Z
dc.date.issued	2024-03-25
dc.identifier.citation	Proceedings of the AAAI Conference on Artificial Intelligence, 2024, 38, (18), pp. 19902-19910
dc.identifier.issn	2159-5399
dc.identifier.issn	2374-3468
dc.identifier.uri	http://hdl.handle.net/10453/177916
dc.language	en
dc.publisher	Association for the Advancement of Artificial Intelligence (AAAI)
dc.relation.ispartof	Proceedings of the AAAI Conference on Artificial Intelligence
dc.relation.isbasedon	10.1609/aaai.v38i18.29966
dc.rights	info:eu-repo/semantics/openAccess
dc.title	Towards Model Extraction Attacks in GAN-Based Image Translation via Domain Shift Mitigation
dc.type	Conference Proceeding
utslib.citation.volume	38
pubs.organisational-group	University of Technology Sydney
pubs.organisational-group	University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
utslib.copyright.status	open_access	*
dc.date.updated	2024-04-15T02:26:53Z
pubs.issue	18
pubs.publication-status	Published
pubs.volume	38
utslib.citation.issue	18

Abstract:

Model extraction attacks (MEAs) enable an attacker to replicate the functionality of a victim deep neural network (DNN) model solely by querying its API service remotely, posing a severe threat to the security and integrity of pay-per-query DNN-based services. Although most current research on MEAs has concentrated on neural classifiers, image-to-image translation (I2IT) tasks are increasingly prevalent in everyday activities. However, techniques developed for MEAs against DNN classifiers cannot be directly transferred to I2IT, so the vulnerability of I2IT models to MEAs is often underestimated. This paper unveils the threat of MEAs in I2IT tasks from a new perspective. Diverging from the traditional approach of bridging the distribution gap between attacker queries and victim training samples, we opt to mitigate the effect caused by the different distributions, known as the domain shift. This is achieved by introducing a new regularization term that penalizes high-frequency noise and by seeking a flatter minimum to avoid overfitting to the shifted distribution. Extensive experiments on different image translation tasks, including image super-resolution and style transfer, are performed on different backbone victim models, and the new design consistently outperforms the baseline by a large margin across all metrics. A few real-life I2IT APIs are also verified to be extremely vulnerable to our attack, emphasizing the need for enhanced defenses and potentially revised API publishing policies.
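
The abstract names two ingredients without giving their exact form: a regularization term that penalizes high-frequency noise, and a search for a flatter minimum. The sketch below is only an illustration of what such components could look like, not the paper's method. It assumes a 4-neighbour Laplacian as the high-pass filter and a sharpness-aware (SAM-style) two-step gradient update as the flat-minimum procedure. The function names `high_frequency_penalty` and `sharpness_aware_step`, and the parameters `lr` and `rho`, are hypothetical.

```python
# Illustrative stand-ins only: neither function is taken from the paper.


def high_frequency_penalty(image):
    """Mean squared response of a 4-neighbour Laplacian (a simple high-pass filter).

    `image` is a 2D list of floats (one grayscale channel). Smooth images
    score near 0; pixel-level noise scores high.
    """
    height, width = len(image), len(image[0])
    total, count = 0.0, 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            laplacian = (image[y - 1][x] + image[y + 1][x]
                         + image[y][x - 1] + image[y][x + 1]
                         - 4.0 * image[y][x])
            total += laplacian * laplacian
            count += 1
    return total / count if count else 0.0


def sharpness_aware_step(weights, grad_fn, lr=0.1, rho=0.05):
    """One SAM-style update (an assumed flat-minimum procedure).

    The gradient is evaluated at the perturbed point w + rho * g / ||g||
    and then applied to the original weights.
    """
    grad = grad_fn(weights)
    norm = sum(g * g for g in grad) ** 0.5
    if norm == 0.0:
        return list(weights)  # already at a stationary point
    perturbed = [w + rho * g / norm for w, g in zip(weights, grad)]
    grad_at_perturbed = grad_fn(perturbed)
    return [w - lr * g for w, g in zip(weights, grad_at_perturbed)]


# Usage: a constant image has no high-frequency content, a checkerboard has a lot.
flat = [[0.5] * 4 for _ in range(4)]
checker = [[float((x + y) % 2) for x in range(4)] for y in range(4)]
print(high_frequency_penalty(flat))     # 0.0
print(high_frequency_penalty(checker))  # 16.0

# Usage: one step on f(w) = w0^2 + w1^2, whose gradient is 2w.
new_weights = sharpness_aware_step([1.0, 0.0], lambda w: [2.0 * v for v in w])
print(new_weights)
```

In an extraction setting, a penalty of this kind would presumably be added to the surrogate model's training loss on its own outputs, with a weighting coefficient chosen by the attacker; that coefficient, like everything else in the sketch, is an assumption rather than a detail stated in the abstract.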

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/177916