Mitigating Multi-class Unintended Demographic Bias in Text Classification with Adversarial Learning

Publisher:
Springer
Publication Type:
Conference Proceeding
Citation:
Web Information Systems Engineering – WISE 2022, 2022, 13724 LNCS, pp. 386-394
Issue Date:
2022-01-01
Text classification makes queries over text data in information retrieval more efficient. However, unintended demographic bias can impair text toxicity classification. We therefore propose a novel debiasing framework that applies Adversarial Learning to the word embeddings of multi-class sensitive demographic words to alleviate this bias. The framework makes slight adjustments to word embeddings with flipped sensitive indices, and the modified embeddings are then used in the downstream classification task to achieve Demographic Parity. Experimental results validate the effectiveness of the proposed method in mitigating multi-class unintended demographic bias without impairing the original classification accuracy.
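As a rough illustration of the Demographic Parity criterion named in the abstract, the sketch below measures how far a classifier's positive (for example, toxic) prediction rate differs across several demographic groups. It assumes binary predictions and one group label per text. The function names and toy data are hypothetical and are not taken from the paper, which may evaluate parity differently.

```python
from collections import defaultdict


def positive_rates(predictions, groups):
    """Return the fraction of positive predictions for each group."""
    counts = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        counts[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / counts[g] for g in counts}


def demographic_parity_gap(predictions, groups):
    """Largest difference in positive rate between any two groups.

    A gap of 0 means every group is flagged at the same rate.
    """
    rates = positive_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: 1 = flagged as toxic, 0 = not flagged.
preds = [1, 0, 1, 1, 0, 0, 1, 0, 0]
groups = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]

rates = positive_rates(preds, groups)
gap = demographic_parity_gap(preds, groups)
```

A smaller gap after debiasing than before would be consistent with the goal the abstract describes, provided classification accuracy is checked alongside it.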