Deceiving question-answering models: A hybrid word-level adversarial approach.

Li, J; Ni, M; Gong, Y; Liu, W

Publisher: PERGAMON-ELSEVIER SCIENCE LTD
Publication Type: Journal Article
Citation: Neural Netw, 2026, 194, pp. 108105
Issue Date: 2026-02

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Li, J https://orcid.org/0009-0003-8595-0944
dc.contributor.author	Ni, M
dc.contributor.author	Gong, Y
dc.contributor.author	Liu, W https://orcid.org/0000-0002-3003-1313
dc.date.accessioned	2026-01-28T01:21:00Z
dc.date.available	2025-09-08
dc.date.available	2026-01-28T01:21:00Z
dc.date.issued	2026-02
dc.identifier.citation	Neural Netw, 2026, 194, pp. 108105
dc.identifier.issn	0893-6080
dc.identifier.issn	1879-2782
dc.identifier.uri	http://hdl.handle.net/10453/192459
dc.description.abstract	Deep learning underpins most current state-of-the-art natural language processing (NLP) tasks, such as text classification, neural machine translation (NMT), abstractive summarization and question-answering (QA). However, the robustness of these models, particularly QA models, against adversarial attacks is a critical concern that remains insufficiently explored. This paper introduces QA-Attack (Question Answering Attack), a novel word-level adversarial strategy that fools QA models. Our attention-based attack exploits a customized attention mechanism and a deletion ranking strategy to identify and target specific words within contextual passages. It creates deceptive inputs by carefully choosing and substituting synonyms, preserving grammatical integrity while misleading the model into producing incorrect responses. Our approach demonstrates versatility across various question types, particularly when dealing with long textual inputs. Extensive experiments on multiple benchmark datasets demonstrate that QA-Attack successfully deceives baseline QA models and surpasses existing adversarial techniques in success rate, semantic change, BLEU score, fluency and grammar error rate.
dc.format	Print-Electronic
dc.language	eng
dc.publisher	PERGAMON-ELSEVIER SCIENCE LTD
dc.relation.ispartof	Neural Netw
dc.relation.isbasedon	10.1016/j.neunet.2025.108105
dc.rights	info:eu-repo/semantics/openAccess
dc.subject.classification	Artificial Intelligence & Image Processing
dc.subject.classification	4602 Artificial intelligence
dc.subject.classification	4611 Machine learning
dc.subject.classification	4905 Statistics
dc.subject.mesh	Natural Language Processing
dc.subject.mesh	Humans
dc.subject.mesh	Neural Networks, Computer
dc.subject.mesh	Deep Learning
dc.subject.mesh	Semantics
dc.title	Deceiving question-answering models: A hybrid word-level adversarial approach.
dc.type	Journal Article
utslib.citation.volume	194
utslib.location.activity	United States
pubs.organisational-group	University of Technology Sydney
pubs.organisational-group	University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	University of Technology Sydney/UTS Groups
pubs.organisational-group	University of Technology Sydney/UTS Groups/Data Science Institute (DSI)
pubs.organisational-group	University of Technology Sydney/UTS Groups/Cyber Digital Centre (CDC)
utslib.copyright.status	open_access	*
dc.rights.license	This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/
dc.date.updated	2026-01-28T01:20:58Z
pubs.publication-status	Published
pubs.volume	194

Abstract:

Deep learning underpins most current state-of-the-art natural language processing (NLP) tasks, such as text classification, neural machine translation (NMT), abstractive summarization and question-answering (QA). However, the robustness of these models, particularly QA models, against adversarial attacks is a critical concern that remains insufficiently explored. This paper introduces QA-Attack (Question Answering Attack), a novel word-level adversarial strategy that fools QA models. Our attention-based attack exploits a customized attention mechanism and a deletion ranking strategy to identify and target specific words within contextual passages. It creates deceptive inputs by carefully choosing and substituting synonyms, preserving grammatical integrity while misleading the model into producing incorrect responses. Our approach demonstrates versatility across various question types, particularly when dealing with long textual inputs. Extensive experiments on multiple benchmark datasets demonstrate that QA-Attack successfully deceives baseline QA models and surpasses existing adversarial techniques in success rate, semantic change, BLEU score, fluency and grammar error rate.
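
Illustrative sketch (not the authors' implementation): the abstract describes ranking context words by attention and by deletion, then substituting synonyms for the selected words. The Python below shows one way such a pipeline could look on a toy example. Everything in it is an assumption made for illustration: the function names, the toy scoring function standing in for a QA model's answer confidence, the hand-written attention scores and synonym table, the choice to merge the two rankings as a union of each ranking's top-k words, and the greedy substitution loop. The paper itself should be consulted for the actual fusion rule and synonym selection.

```python
from typing import Callable, Dict, List, Sequence

Scorer = Callable[[List[str]], int]


def deletion_scores(tokens: List[str], confidence: Scorer) -> List[int]:
    """Score each word by how much the model's confidence drops when it is removed."""
    base = confidence(tokens)
    return [base - confidence(tokens[:i] + tokens[i + 1:]) for i in range(len(tokens))]


def top_k(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest scores (ties keep original word order)."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def hybrid_targets(tokens: List[str], attention: Sequence[float],
                   confidence: Scorer, k: int) -> List[int]:
    """ASSUMPTION: fuse the two rankings as the ordered union of each top-k list."""
    merged: List[int] = []
    for i in top_k(attention, k) + top_k(deletion_scores(tokens, confidence), k):
        if i not in merged:
            merged.append(i)
    return merged


def attack(tokens: List[str], attention: Sequence[float], confidence: Scorer,
           synonyms: Dict[str, List[str]], k: int = 2) -> List[str]:
    """ASSUMPTION: greedily keep the synonym that lowers confidence the most."""
    adversarial = list(tokens)
    for i in hybrid_targets(tokens, attention, confidence, k):
        candidates = synonyms.get(adversarial[i], [])
        if not candidates:
            continue
        best = min(candidates,
                   key=lambda w: confidence(adversarial[:i] + [w] + adversarial[i + 1:]))
        trial = adversarial[:i] + [best] + adversarial[i + 1:]
        if confidence(trial) < confidence(adversarial):
            adversarial = trial
    return adversarial


# Toy stand-in for a QA model: confidence is the summed weight of the
# evidence words still present in the context (hypothetical values).
EVIDENCE = {"built": 5, "1889": 4, "tower": 1}


def toy_confidence(tokens: List[str]) -> int:
    return sum(weight for word, weight in EVIDENCE.items() if word in tokens)


context = ["the", "tower", "was", "built", "in", "1889"]
toy_attention = [0.05, 0.30, 0.05, 0.35, 0.05, 0.20]  # hypothetical attention weights
toy_synonyms = {"built": ["constructed", "erected"], "tower": ["spire"]}

targets = hybrid_targets(context, toy_attention, toy_confidence, k=2)
adversarial = attack(context, toy_attention, toy_confidence, toy_synonyms, k=2)
print(targets)
print(" ".join(adversarial))
```

In this toy run the attention ranking and the deletion ranking agree on "built" but disagree on the second word, so the union targets three positions. A real attack would replace `toy_confidence` with the victim model's answer probability and would also check that substitutions preserve meaning and grammar, as the abstract emphasizes.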

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/192459