Advancing Chinese Conversation-based Patient Guidance with a Benchmark and Knowledge-Evolvable Assistant.

Lu, W; Liu, K; Wang, J; Peng, X; Shen, T; Zhu, F; Zhang, W; Zhu, J; Xin, T; Vasilakos, AV

Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Publication Type: Journal Article
Citation: IEEE J Biomed Health Inform, 2025, PP, (99), pp. 1-12
Issue Date: 2025-12-03

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Lu, W
dc.contributor.author	Liu, K
dc.contributor.author	Wang, J
dc.contributor.author	Peng, X https://orcid.org/0000-0002-8901-1472
dc.contributor.author	Shen, T
dc.contributor.author	Zhu, F
dc.contributor.author	Zhang, W
dc.contributor.author	Zhu, J
dc.contributor.author	Xin, T
dc.contributor.author	Vasilakos, AV
dc.date.accessioned	2025-12-07T23:20:06Z
dc.date.available	2025-12-07T23:20:06Z
dc.date.issued	2025-12-03
dc.identifier.citation	IEEE J Biomed Health Inform, 2025, PP, (99), pp. 1-12
dc.identifier.issn	2168-2194
dc.identifier.issn	2168-2208
dc.identifier.uri	http://hdl.handle.net/10453/190868
dc.description.abstract	Chinese Conversation-based Patient Guidance (CCPG) helps patients reach the correct hospital department through natural-language exchanges with medical staff. Despite the rapid success of large language models (LLMs) in other healthcare tasks, CCPG remains under-explored and lacks dedicated benchmarks. We address this gap with PG-Bench, the first comprehensive CCPG benchmark, spanning five subsets, 19,814 annotated dialogues, and 98 clinical departments. We evaluate 25 representative LLMs on PG-Bench and observe uniformly poor performance: even the latest models, such as GPT-4 and DeepSeek-V3, fail to meet practical requirements. To close this gap, we introduce the Knowledge-Evolvable Assistant (KEA), a novel framework that augments any LLM with (i) an experience bank of validated, successful CCPG cases for analogy-based reasoning; (ii) a reflection bank that records previously misclassified cases together with their corrections and self-summarized error analyses; and (iii) an external medical knowledge base. KEA employs retrieval-augmented generation to evolve its guidance knowledge iteratively. Experiments show that KEA consistently and significantly boosts the CCPG performance of all tested LLMs on PG-Bench. However, current best results still fall short of clinical expectations, underscoring the difficulty of CCPG and the need for further research. PG-Bench and KEA together establish a rigorous foundation and strong baseline for future work on conversation-driven patient guidance in Chinese healthcare settings.
dc.language	eng
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.ispartof	IEEE J Biomed Health Inform
dc.relation.isbasedon	10.1109/JBHI.2025.3639805
dc.rights	info:eu-repo/semantics/openAccess
dc.title	Advancing Chinese Conversation-based Patient Guidance with a Benchmark and Knowledge-Evolvable Assistant.
dc.type	Journal Article
utslib.citation.volume	PP
utslib.location.activity	United States
pubs.organisational-group	University of Technology Sydney
pubs.organisational-group	University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	University of Technology Sydney/UTS Groups
pubs.organisational-group	University of Technology Sydney/UTS Groups/Australian Artificial Intelligence Institute (AAII)
utslib.copyright.status	open_access	*
dc.date.updated	2025-12-07T23:20:04Z
pubs.issue	99
pubs.publication-status	Published online
pubs.volume	PP
utslib.citation.issue	99

Abstract:

Chinese Conversation-based Patient Guidance (CCPG) helps patients reach the correct hospital department through natural-language exchanges with medical staff. Despite the rapid success of large language models (LLMs) in other healthcare tasks, CCPG remains under-explored and lacks dedicated benchmarks. We address this gap with PG-Bench, the first comprehensive CCPG benchmark, spanning five subsets, 19,814 annotated dialogues, and 98 clinical departments. We evaluate 25 representative LLMs on PG-Bench and observe uniformly poor performance: even the latest models, such as GPT-4 and DeepSeek-V3, fail to meet practical requirements. To close this gap, we introduce the Knowledge-Evolvable Assistant (KEA), a novel framework that augments any LLM with (i) an experience bank of validated, successful CCPG cases for analogy-based reasoning; (ii) a reflection bank that records previously misclassified cases together with their corrections and self-summarized error analyses; and (iii) an external medical knowledge base. KEA employs retrieval-augmented generation to evolve its guidance knowledge iteratively. Experiments show that KEA consistently and significantly boosts the CCPG performance of all tested LLMs on PG-Bench. However, current best results still fall short of clinical expectations, underscoring the difficulty of CCPG and the need for further research. PG-Bench and KEA together establish a rigorous foundation and strong baseline for future work on conversation-driven patient guidance in Chinese healthcare settings.
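
The three-part design named in the abstract (experience bank, reflection bank, external knowledge base, combined through retrieval-augmented generation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class and function names (Case, Assistant, jaccard, top_k, learn), the token-overlap retriever, the prompt layout, and the placeholder error note are all assumptions made for illustration, and the abstract does not specify how KEA retrieves, prompts, or summarizes errors.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Case:
    dialogue: str      # patient-staff conversation text
    department: str    # correct department label
    note: str = ""     # error analysis; used by the reflection bank only


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; an assumed stand-in for a real retriever."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def top_k(query: str, bank: List[Case], k: int = 2) -> List[Case]:
    """Return the k stored cases most similar to the query dialogue."""
    return sorted(bank, key=lambda c: jaccard(query, c.dialogue), reverse=True)[:k]


@dataclass
class Assistant:
    llm: Callable[[str], str]  # hypothetical: any function mapping prompt -> department
    experience: List[Case] = field(default_factory=list)  # validated successes
    reflection: List[Case] = field(default_factory=list)  # corrected mistakes
    knowledge: List[str] = field(default_factory=list)    # external medical facts

    def build_prompt(self, dialogue: str) -> str:
        """Assemble retrieved context from all three sources into one prompt."""
        facts = sorted(self.knowledge, key=lambda f: jaccard(dialogue, f), reverse=True)[:2]
        parts = ["Similar successful cases:"]
        parts += [f"- {c.dialogue} -> {c.department}" for c in top_k(dialogue, self.experience)]
        parts.append("Past mistakes and corrections:")
        parts += [f"- {c.dialogue} -> {c.department} ({c.note})"
                  for c in top_k(dialogue, self.reflection)]
        parts.append("Medical knowledge:")
        parts += [f"- {f}" for f in facts]
        parts.append(f"Dialogue: {dialogue}")
        parts.append("Department:")
        return "\n".join(parts)

    def guide(self, dialogue: str) -> str:
        """Predict a department for a new dialogue."""
        return self.llm(self.build_prompt(dialogue)).strip()

    def learn(self, dialogue: str, gold: str) -> bool:
        """One evolution step: store a success, or a corrected failure."""
        pred = self.guide(dialogue)
        if pred == gold:
            self.experience.append(Case(dialogue, gold))
            return True
        # Placeholder note; the abstract describes a self-summarized analysis instead.
        self.reflection.append(Case(dialogue, gold, note=f"predicted {pred}"))
        return False
```

Calling learn repeatedly over labeled dialogues grows both banks, so later prompts carry more relevant precedents; this mirrors the iterative knowledge evolution the abstract describes only at a schematic level.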

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/190868