Unmasking vulnerabilities: adversarial attacks via word-level manipulation on NLP models

Publication Type: Thesis
Issue Date: 2024
Natural language processing (NLP) models have advanced significantly and are widely used in applications such as sentiment analysis, machine translation, and chatbots. However, they remain vulnerable to adversarial attacks, which threaten their reliability and real-world adoption. This thesis examines the vulnerabilities of sequence-to-sequence and classification models and introduces techniques for crafting effective yet imperceptible adversarial examples. The Hybrid Attentive Attack (HAA) crafts subtle adversarial examples for neural machine translation by targeting semantically relevant words. The Fraud's Bargain Attack (FBA) uses randomization to improve the selection of adversarial examples against classifiers, combining the Word Manipulation Process (WMP) with a Metropolis-Hastings sampler. Two further algorithms, the Reversible Jump Attack (RJA) and Metropolis-Hastings Modification Reduction (MMR), expand the search space and balance the number of modifications against attack success. Extensive experiments demonstrate the effectiveness of the proposed methods.
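To illustrate the sampling idea behind FBA, the sketch below shows a generic Metropolis-Hastings acceptance step applied to word-level candidate edits. It is a minimal, hypothetical illustration only: the functions `propose` and `score` stand in for the thesis's Word Manipulation Process and adversarial objective (which are not specified in this abstract), and all names and parameters here are assumptions, not the thesis's actual implementation.

```python
import math
import random


def mh_accept(candidate_score: float, current_score: float,
              proposal_ratio: float = 1.0) -> bool:
    """Metropolis-Hastings acceptance test on unnormalized log-densities.

    candidate_score / current_score: log-density of the candidate and
    current adversarial texts under a target distribution that rewards
    misclassification and semantic similarity (assumed objective).
    proposal_ratio: q(current | candidate) / q(candidate | current).
    """
    log_alpha = candidate_score - current_score + math.log(proposal_ratio)
    return math.log(random.random()) < min(0.0, log_alpha)


def mh_word_attack(text: str, propose, score, steps: int = 200) -> str:
    """Random-walk search over word-level manipulations.

    propose(text) -> (candidate_text, proposal_ratio): one word-level
    edit (e.g. insert / replace / delete), a placeholder for the WMP.
    score(text) -> float: unnormalized log-density of the adversarial
    objective, a placeholder for the thesis's attack criterion.
    """
    current, best = text, text
    for _ in range(steps):
        candidate, ratio = propose(current)
        if mh_accept(score(candidate), score(current), ratio):
            current = candidate
            if score(current) > score(best):
                best = current
    return best
```

The accept/reject step lets the search occasionally take locally worse edits, which is what distinguishes this randomized selection from a purely greedy word-substitution attack.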