Bigram and Unigram Based Text Attack via Adaptive Monotonic Heuristic Search

Yang, X; Bailey, J; Liu, W; Liu, W

Publisher: AAAI Press
Publication Type: Conference Proceeding
Citation: Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI 2021), 2021, 1, pp. 706-714
Issue Date: 2021-05-18

Closed Access

Filename	Description	Size	Format
16151-Article Text-19645-1-2-20210518.pdf	Published version	268.51 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Yang, X https://orcid.org/0000-0001-6487-3183
dc.contributor.author	Bailey, J
dc.contributor.author	Liu, W https://orcid.org/0000-0002-3003-1313
dc.contributor.author	Liu, W https://orcid.org/0000-0002-3003-1313
dc.date	2021-02-02
dc.date.accessioned	2022-06-22T23:58:09Z
dc.date.available	2022-06-22T23:58:09Z
dc.date.issued	2021-05-18
dc.identifier.citation	Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI 2021), 2021, 1, pp. 706-714
dc.identifier.isbn	9781713835974
dc.identifier.issn	2159-5399
dc.identifier.uri	http://hdl.handle.net/10453/158324
dc.description.abstract	Deep neural networks (DNNs) are known to be vulnerable to adversarial images, while their robustness in text classification is rarely studied. Several lines of text attack methods have been proposed in the literature, such as character-level, word-level, and sentence-level attacks. However, it is still a challenge to minimize the number of word distortions necessary to induce misclassification while simultaneously ensuring lexical correctness, syntactic correctness, and semantic similarity. In this paper, we propose the Bigram and Unigram based Monotonic Heuristic Search (BU-MHS) method to examine the vulnerability of deep models. Our method has three major merits. Firstly, we propose to attack text documents not only at the unigram word level but also at the bigram level to avoid producing meaningless outputs. Secondly, we propose a hybrid method to replace the input words with both their synonyms and sememe candidates, which greatly enriches potential substitutions compared to only using synonyms. Lastly, we design a search algorithm, i.e., Monotonic Heuristic Search (MHS), to determine the priority of word replacements, aiming to reduce the modification cost in an adversarial attack. We evaluate the effectiveness of BU-MHS on the IMDB, AG's News, and Yahoo! Answers text datasets by attacking four state-of-the-art DNN models. Experimental results show that BU-MHS achieves the highest attack success rate while changing the smallest number of words compared with other existing methods.
dc.language	en
dc.publisher	AAAI Press
dc.relation.ispartof	Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI 2021)
dc.relation.ispartof	AAAI Conference on Artificial Intelligence
dc.rights	info:eu-repo/semantics/closedAccess
dc.title	Bigram and Unigram Based Text Attack via Adaptive Monotonic Heuristic Search
dc.type	Conference Proceeding
utslib.citation.volume	1
utslib.location.activity	Virtual
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAI - Advanced Analytics Institute Research Centre
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
utslib.copyright.status	closed_access	*
pubs.consider-herdc	true
dc.date.updated	2022-06-22T23:58:08Z
pubs.finish-date	2021-02-09
pubs.place-of-publication	Palo Alto, California USA
pubs.publication-status	Published
pubs.start-date	2021-02-02
pubs.volume	1
dc.location	Palo Alto, California USA

Abstract:

Deep neural networks (DNNs) are known to be vulnerable to adversarial images, while their robustness in text classification is rarely studied. Several lines of text attack methods have been proposed in the literature, such as character-level, word-level, and sentence-level attacks. However, it is still a challenge to minimize the number of word distortions necessary to induce misclassification while simultaneously ensuring lexical correctness, syntactic correctness, and semantic similarity. In this paper, we propose the Bigram and Unigram based Monotonic Heuristic Search (BU-MHS) method to examine the vulnerability of deep models. Our method has three major merits. Firstly, we propose to attack text documents not only at the unigram word level but also at the bigram level to avoid producing meaningless outputs. Secondly, we propose a hybrid method to replace the input words with both their synonyms and sememe candidates, which greatly enriches potential substitutions compared to only using synonyms. Lastly, we design a search algorithm, i.e., Monotonic Heuristic Search (MHS), to determine the priority of word replacements, aiming to reduce the modification cost in an adversarial attack. We evaluate the effectiveness of BU-MHS on the IMDB, AG's News, and Yahoo! Answers text datasets by attacking four state-of-the-art DNN models. Experimental results show that BU-MHS achieves the highest attack success rate while changing the smallest number of words compared with other existing methods.
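
Illustrative sketch (not the authors' implementation): the abstract describes the attack only at a high level, so the Python below shows a plain greedy word-substitution attack that considers both unigram and bigram replacements and stops as soon as the label flips. It is not the Monotonic Heuristic Search from the paper, whose details are not given in this record. The classifier (BIAS, WEIGHTS), the substitution table (CANDIDATES, standing in for synonym and sememe candidate sets), and the function names are all hypothetical, and each replacement is assumed to have the same length as the span it replaces.

```python
from typing import Dict, List, Optional, Set, Tuple

Span = Tuple[str, ...]

# Hypothetical toy sentiment model (assumption, not from the paper):
# the "positive" logit is a bias plus the sum of per-word weights.
BIAS = -1.5
WEIGHTS: Dict[str, float] = {
    "great": 2.0, "good": 1.0, "decent": 0.3, "very": 0.5, "fairly": 0.1,
}

# Hypothetical substitution table keyed by a unigram or a bigram.
# It stands in for the synonym and sememe candidate sets in the abstract.
CANDIDATES: Dict[Span, List[Span]] = {
    ("great",): [("good",), ("decent",)],
    ("good",): [("decent",)],
    ("very", "good"): [("fairly", "decent")],
}


def score(tokens: List[str]) -> float:
    """Toy positive-class logit for a token list."""
    return BIAS + sum(WEIGHTS.get(t, 0.0) for t in tokens)


def predict(tokens: List[str]) -> str:
    return "pos" if score(tokens) > 0 else "neg"


def greedy_attack(
    tokens: List[str], max_steps: int = 5
) -> Optional[Tuple[List[str], int]]:
    """Try to flip a 'pos' prediction to 'neg' with few substitutions.

    Returns (adversarial tokens, number of substitutions), or None if
    no sequence of available substitutions flips the label.
    """
    tokens = list(tokens)
    locked: Set[int] = set()  # positions already modified once
    for step in range(max_steps):
        if predict(tokens) == "neg":
            return tokens, step
        best = None  # (resulting score, start index, replacement span)
        for n in (2, 1):  # bigram spans first, then unigram spans
            for i in range(len(tokens) - n + 1):
                if any(j in locked for j in range(i, i + n)):
                    continue
                for repl in CANDIDATES.get(tuple(tokens[i:i + n]), []):
                    trial = tokens[:i] + list(repl) + tokens[i + n:]
                    s = score(trial)
                    if best is None or s < best[0]:
                        best = (s, i, repl)
        if best is None or best[0] >= score(tokens):
            return None  # nothing left that lowers the score
        _, i, repl = best
        tokens[i:i + len(repl)] = repl
        locked.update(range(i, i + len(repl)))
    return (tokens, max_steps) if predict(tokens) == "neg" else None


original = "the plot is great and the cast is very good".split()
result = greedy_attack(original)
```

On this toy input the sketch first replaces the unigram "great" and then the bigram "very good" as a single unit, which is the kind of bigram-level substitution the abstract motivates as a way to avoid meaningless outputs. A real attack would query an actual model and would also check that each candidate preserves syntax and meaning.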

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/158324