Mitigating Multi-class Unintended Demographic Bias in Text Classification with Adversarial Learning

Pan, L; Yao, L; Zhang, W; Wang, X

Publisher: Springer
Publication Type: Conference Proceeding
Citation: Web Information Systems Engineering – WISE 2022, 2022, 13724 LNCS, pp. 386-394
Issue Date: 2022-01-01

Closed Access

Filename	Description	Size	Format
978-3-031-20891-1_27.pdf	Published version	453.89 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Pan, L
dc.contributor.author	Yao, L
dc.contributor.author	Zhang, W
dc.contributor.author	Wang, X https://orcid.org/0000-0001-9582-3445
dc.date	2022-11-01
dc.date.accessioned	2023-05-23T10:55:53Z
dc.date.available	2023-05-23T10:55:53Z
dc.date.issued	2022-01-01
dc.identifier.citation	Web Information Systems Engineering – WISE 2022, 2022, 13724 LNCS, pp. 386-394
dc.identifier.isbn	9783031208904
dc.identifier.issn	0302-9743
dc.identifier.issn	1611-3349
dc.identifier.uri	http://hdl.handle.net/10453/170417
dc.language	en
dc.publisher	Springer
dc.relation	http://purl.org/au-research/grants/arc/DE180100251
dc.relation.ispartof	Web Information Systems Engineering – WISE 2022
dc.relation.ispartof	International Conference on Web Information Systems Engineering
dc.relation.ispartofseries	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.relation.isbasedon	10.1007/978-3-031-20891-1_27
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject.classification	Artificial Intelligence & Image Processing
dc.title	Mitigating Multi-class Unintended Demographic Bias in Text Classification with Adversarial Learning
dc.type	Conference Proceeding
utslib.citation.volume	13724 LNCS
utslib.location.activity	Biarritz, France
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	/University of Technology Sydney/Strength - CCSP - Centre for Cyber Security and Privacy
utslib.copyright.status	closed_access	*
pubs.consider-herdc	false
dc.date.updated	2023-05-23T10:55:51Z
pubs.finish-date	2022-11-03
pubs.place-of-publication	Switzerland
pubs.publication-status	Published
pubs.start-date	2022-11-01
pubs.volume	13724 LNCS
dc.location	Switzerland

Abstract:

Text classification enables higher efficiency on text data queries in information retrieval. However, unintended demographic bias can impair text toxicity classification. Thus, we propose a novel debiasing framework utilizing Adversarial Learning on word embeddings of multi-class sensitive demographic words to alleviate this bias. Slight adjustment over word embeddings with flipped sensitive indices is achieved, and the modified word embeddings are used in the downstream classification task to realize Demographic Parity. The experimental results validate the effectiveness of our proposed method in mitigating multi-class unintended demographic bias without impairing the original classification accuracy.
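
The abstract names Demographic Parity as the fairness target, but this record does not include the paper's evaluation code. As a rough illustration only, the sketch below shows one common way to quantify Demographic Parity across more than two demographic groups: compute the rate of positive (here, "toxic") predictions per group and report the largest gap between any two groups. The function names, the group labels, and the toy data are all hypothetical, and the gap measure is an assumption rather than the metric used by the authors.

```python
from collections import defaultdict


def positive_rates(predictions, groups):
    """Return the share of positive predictions (label 1) for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between any two groups.

    A gap of 0 means every group receives positive predictions at the same
    rate, which is the Demographic Parity condition.
    """
    rates = positive_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: 1 = classified as toxic, 0 = not toxic.
# The three group labels stand in for multi-class demographic identities.
toy_predictions = [1, 0, 1, 1, 0, 0, 1, 0, 0]
toy_groups = [
    "group_a", "group_a", "group_a",
    "group_b", "group_b", "group_b",
    "group_c", "group_c", "group_c",
]

rates = positive_rates(toy_predictions, toy_groups)
gap = demographic_parity_gap(toy_predictions, toy_groups)
print(rates)
print(gap)
```

In this reading, a debiasing method would be judged by how far it shrinks the gap while leaving classification accuracy roughly unchanged, which matches the claim made in the abstract.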

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/170417