Improving Open-Domain Answer Sentence Selection by Distributed Clients with Privacy Preservation

Publisher:
Springer Nature
Publication Type:
Conference Proceeding
Citation:
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2023, 14180 LNAI, pp. 15-29
Issue Date:
2023-01-01
Open-domain answer sentence selection (OD-AS2), a practical branch of open-domain question answering (OD-QA), aims to answer a query with a potential answer sentence retrieved from a large-scale collection. A dense retrieval model plays a significant role across different solution paradigms, but its success depends heavily on sufficient labeled positive QA pairs and diverse hard-negative sampling in contrastive learning. These dependencies are hard to satisfy in a privacy-preserving distributed scenario, where each client holds only a few in-domain pairs and a relatively small collection, which cannot support effective dense retriever training. To alleviate this, we propose a brand-new learning framework for Privacy-preserving Distributed OD-AS2, dubbed PDD-AS2. Built upon federated learning, it consists of client-customized query encoding for better personalization and cross-client negative sampling for learning effectiveness. To evaluate our learning framework, we first construct a new OD-AS2 dataset, called Fed-NewsQA, based on NewsQA, to simulate distributed clients with different genre/domain data. Experimental results show that our learning framework outperforms its baselines and exhibits personalization ability.
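For readers unfamiliar with the contrastive objective that dense retrievers in this setting typically rely on, the following is a minimal, generic sketch of an InfoNCE-style loss with per-query hard negatives. It is an illustration of the standard technique the abstract refers to, not the paper's PDD-AS2 method; the tensor shapes, temperature value, and function name are assumptions for the example.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(q, pos, negs, temperature=0.05):
    """Generic InfoNCE-style loss for dense retrieval (illustrative sketch).

    q:    (B, D) query embeddings
    pos:  (B, D) positive answer-sentence embeddings, aligned with q
    negs: (B, K, D) hard-negative embeddings per query
    """
    # Dot-product similarity to each query's positive: shape (B,)
    pos_sim = (q * pos).sum(dim=-1)
    # Similarity to each query's K hard negatives: shape (B, K)
    neg_sim = torch.einsum("bd,bkd->bk", q, negs)
    # Positive logit at index 0, negatives after it: shape (B, 1 + K)
    logits = torch.cat([pos_sim.unsqueeze(1), neg_sim], dim=1) / temperature
    # The correct "class" for every query is the positive at index 0
    labels = torch.zeros(q.size(0), dtype=torch.long)
    return F.cross_entropy(logits, labels)

# Toy usage with random, L2-normalized embeddings
B, D, K = 4, 8, 3
q = F.normalize(torch.randn(B, D), dim=-1)
pos = F.normalize(torch.randn(B, D), dim=-1)
negs = F.normalize(torch.randn(B, K, D), dim=-1)
loss = contrastive_loss(q, pos, negs)
```

In a federated variant such as the one the abstract describes, the negatives for each client would be drawn not only locally but also across clients, which is what makes privacy-preserving negative sharing a design challenge.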