Measuring distance-based semantic similarity using meronymy and hyponymy relations

Cai, Y; Pan, S; Wang, X; Chen, H; Cai, X; Zuo, M

Publisher: Springer (part of Springer Nature)
Publication Type: Journal Article
Citation: Neural Computing and Applications, 2020, 32, (8), pp. 3521-3534
Issue Date: 2020

Closed Access

Filename	Description	Size	Format
Cai2020_Article_MeasuringDistance-basedSemanti.pdf	Published version	730.95 kB	Adobe PDF

Full metadata record

Field	Value	Language
dc.contributor.author	Cai, Y
dc.contributor.author	Pan, S https://orcid.org/0000-0003-0794-527X
dc.contributor.author	Wang, X
dc.contributor.author	Chen, H https://orcid.org/0000-0002-0893-1817
dc.contributor.author	Cai, X
dc.contributor.author	Zuo, M
dc.date.accessioned	2020-09-19T03:18:29Z
dc.date.available	2020-09-19T03:18:29Z
dc.date.issued	2020
dc.identifier.citation	Neural Computing and Applications, 2020, 32, (8), pp. 3521-3534
dc.identifier.issn	0941-0643
dc.identifier.issn	1433-3058
dc.identifier.uri	http://hdl.handle.net/10453/142750
dc.language	English
dc.publisher	Springer (part of Springer Nature)
dc.relation.ispartof	Neural Computing and Applications
dc.relation.isbasedon	10.1007/s00521-018-3766-9
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0801 Artificial Intelligence and Image Processing, 0906 Electrical and Electronic Engineering, 1702 Cognitive Sciences
dc.subject.classification	Artificial Intelligence & Image Processing
dc.title	Measuring distance-based semantic similarity using meronymy and hyponymy relations
dc.type	Journal Article
utslib.citation.volume	32
utslib.for	0801 Artificial Intelligence and Image Processing
utslib.for	1702 Cognitive Sciences
utslib.for	0906 Electrical and Electronic Engineering
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	/University of Technology Sydney/DVC (Research)
utslib.copyright.status	closed_access	*
pubs.consider-herdc	true
dc.date.updated	2020-09-19T03:18:24Z
pubs.issue	8
pubs.publication-status	Published
pubs.volume	32
utslib.citation.issue	8

Abstract:

© 2018, The Natural Computing Applications Forum. The assessment of semantic similarity between lexical terms plays a critical part in semantics-oriented applications in natural language processing and cognitive science. Optimizing the calculation model remains a challenge in improving the performance of similarity measurement. In this paper, we investigate WordNet-based measures, including distance-based, information-based, feature-based and hybrid measures. Among them, distance-based measures are considered to have the lowest computational complexity because the distance calculation is simple. However, most existing work ignores the meronymy relation between concepts and the non-uniformity of path distances caused by different semantic relations, and determines path distances from the hyponymy relation alone. To solve this problem, we propose a novel model for calculating the path distance between concepts, together with a similarity measure that nonlinearly transforms the distance into semantic similarity. In the proposed model, we assign different weights to the edges that link concepts according to the relation each edge represents. On the basis of the distance model, we use five structural properties of WordNet for similarity measurement: multiple meanings, multiple inheritance, link type, depth and local density. Our similarity measure is compared against state-of-the-art WordNet-based measures on the M&C, R&G and WS-353 datasets. According to the experimental results, the proposed measure outperforms the others in terms of both Pearson and Spearman correlation coefficients, which indicates the effectiveness of our distance model. We also construct six additional benchmarks to show that the proposed measure maintains stable performance.
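
The general idea in the abstract (relation-dependent edge weights, then a nonlinear mapping from path distance to similarity) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy concept graph, the weight values in RELATION_WEIGHTS, the exponential transform and its alpha parameter. The sketch also leaves out the five structural properties the abstract lists.

```python
import heapq
import math

# Hypothetical per-relation edge weights (assumed values, not the paper's).
RELATION_WEIGHTS = {"hyponymy": 1.0, "meronymy": 1.5}

# Toy concept graph for illustration only (not extracted from WordNet).
# Each entry is (concept_a, concept_b, relation).
EDGES = [
    ("vehicle", "car", "hyponymy"),
    ("vehicle", "bicycle", "hyponymy"),
    ("car", "wheel", "meronymy"),
    ("bicycle", "wheel", "meronymy"),
    ("car", "engine", "meronymy"),
]


def build_graph(edges, weights):
    """Build an undirected adjacency list whose edge cost depends on the relation."""
    graph = {}
    for a, b, relation in edges:
        cost = weights[relation]
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    return graph


def path_distance(graph, source, target):
    """Return the weighted shortest-path distance (Dijkstra), or inf if unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, cost in graph.get(node, []):
            candidate = dist + cost
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf


def similarity(distance, alpha=0.5):
    """Map a distance to (0, 1] with an assumed exponential decay (0.0 for inf)."""
    return math.exp(-alpha * distance)


GRAPH = build_graph(EDGES, RELATION_WEIGHTS)
```

For example, similarity(path_distance(GRAPH, "car", "bicycle")) scores two concepts that share the hypernym "vehicle". Changing the entries in RELATION_WEIGHTS shows how the choice of weights shifts which path counts as shortest.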

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/142750