Generalizing Hate Speech Detection Using Multi-Task Learning: A Case Study of Political Public Figures
- Publisher:
- Academic Press Ltd - Elsevier Science Ltd
- Publication Type:
- Journal Article
- Citation:
- Computer Speech and Language, 2025, 89
- Issue Date:
- 2025-01-01
In Progress
Filename | Description | Size
---|---|---
1-s2.0-S0885230824000731-main.pdf | Published version | 1.71 MB
Copyright Clearance Process
This item is being processed and is not currently available.
Automatic identification of hateful and abusive content is vital in combating the spread of harmful online content and its damaging effects. Most existing work evaluates models by examining the generalization error on train–test splits of individual hate speech datasets. These datasets often differ in their definitions and labeling criteria, leading to poor generalization when predicting across new domains and datasets. This work proposes a new Multi-task Learning (MTL) pipeline that trains simultaneously across multiple hate speech datasets to construct a more encompassing classification model. Using a dataset-level leave-one-out evaluation (designating one dataset for testing and jointly training on all the others), we evaluate the MTL detector on new, previously unseen datasets. Our results consistently outperform a large sample of existing work: we show strong results when examining the generalization error on train–test splits and substantial improvements when predicting on previously unseen datasets. Furthermore, we assemble a novel dataset, dubbed PUBFIGS, focusing on the problematic speech of American public political figures. We crowdsource labels via Amazon MTurk for more than 20,000 tweets and machine-label problematic speech in all 305,235 tweets in PUBFIGS. We find that abusive and hateful tweeting originates mainly from right-leaning figures and relates to six topics, including Islam, women, ethnicity, and immigrants. We show that MTL builds embeddings that can simultaneously separate abusive from hate speech and identify its topics.
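The dataset-level leave-one-out protocol described in the abstract can be sketched as follows. This is a minimal illustration of the evaluation loop only, not the authors' MTL model: the `train_majority`/`predict_majority` stand-in classifier and all function names are hypothetical, chosen so the sketch runs with the standard library alone.

```python
from typing import Callable, Dict, List, Tuple

Dataset = Tuple[List[str], List[int]]  # (texts, binary labels)

def leave_one_dataset_out(
    datasets: Dict[str, Dataset],
    train_fn: Callable[[List[str], List[int]], object],
    predict_fn: Callable[[object, List[str]], List[int]],
) -> Dict[str, float]:
    """For each dataset, train jointly on all the others and report
    accuracy on the held-out dataset (cross-dataset generalization)."""
    scores = {}
    for held_out in datasets:
        train_texts: List[str] = []
        train_labels: List[int] = []
        for name, (texts, labels) in datasets.items():
            if name != held_out:  # jointly train on all other datasets
                train_texts.extend(texts)
                train_labels.extend(labels)
        model = train_fn(train_texts, train_labels)
        test_texts, test_labels = datasets[held_out]
        preds = predict_fn(model, test_texts)
        correct = sum(p == y for p, y in zip(preds, test_labels))
        scores[held_out] = correct / len(test_labels)
    return scores

# Hypothetical stand-in model: a majority-class baseline, used only to
# make the evaluation loop runnable end to end.
def train_majority(texts: List[str], labels: List[int]) -> int:
    return max(set(labels), key=labels.count)

def predict_majority(model: int, texts: List[str]) -> List[int]:
    return [model] * len(texts)

if __name__ == "__main__":
    toy = {
        "A": (["t1", "t2"], [1, 0]),
        "B": (["t3", "t4"], [1, 1]),
        "C": (["t5", "t6"], [1, 1]),
    }
    print(leave_one_dataset_out(toy, train_majority, predict_majority))
```

In practice `train_fn` would be the multi-task training run over the pooled datasets and `predict_fn` the trained classifier's inference; swapping those in leaves the evaluation loop unchanged.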