Sprinkled semantic diffusion kernel for word sense disambiguation

Publication Type:
Journal Article
Engineering Applications of Artificial Intelligence, 2017, 64 pp. 43 - 51
Issue Date:
Filename | Description | Size | Format
1-s2.0-S0952197617301021-main.pdf | Published Version | 838.95 kB | Adobe PDF
© 2017 Elsevier Ltd. Word sense disambiguation (WSD), the task of identifying the intended meanings (senses) of words in context, has been a long-standing research objective for natural language processing (NLP). In this paper, we are concerned with kernel methods for automatic WSD. Under this framework, the main difficulty is to design an appropriate kernel function that represents the knowledge needed to distinguish senses. The semantic diffusion kernel, which models semantic similarity by means of a diffusion process on a graph defined by lexicon and co-occurrence information in order to smooth the typical “Bag of Words” (BOW) representation, has been successfully applied to WSD. However, the diffusion is an unsupervised process, so it fails to exploit the class information available in a supervised classification scenario. To address this limitation, we present a sprinkled semantic diffusion kernel that makes use of the class knowledge of training documents in addition to the co-occurrence knowledge. The basic idea is to construct an augmented term-document matrix by encoding class information as additional terms and appending them to training documents. Diffusion is then performed on the augmented term-document matrix. In this way, words belonging to the same class are indirectly drawn closer to each other, and hence the class-specific word correlations are strengthened. We evaluate our method on several Senseval/Semeval benchmark examples with a support vector machine (SVM), and show that the proposed kernel significantly improves disambiguation performance over the semantic diffusion kernel in terms of different measures and yields results competitive with the state-of-the-art kernel methods for WSD.
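The sprinkling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact diffusion formula, so the code assumes an exponential diffusion exp(decay * G) over the term co-occurrence matrix G = X Xᵀ, and the names `sprinkle`, `diffusion_matrix`, `kernel`, `decay`, and `copies` are hypothetical. The toy data are invented purely to show the effect.

```python
import numpy as np

def sprinkle(X, y, n_classes, copies=1):
    """Append artificial class-indicator rows to a term-by-document matrix.

    X has shape (n_terms, n_docs); y holds one integer class label per document.
    Each class becomes an extra "term" that occurs in every document of that class.
    """
    n_docs = X.shape[1]
    indicators = np.zeros((n_classes, n_docs))
    indicators[y, np.arange(n_docs)] = 1.0
    return np.vstack([X] + [indicators] * copies)

def diffusion_matrix(X_aug, n_terms, decay=0.1):
    """Assumed diffusion: S = exp(decay * G) with G = X_aug @ X_aug.T.

    The rows and columns of the sprinkled class terms are dropped afterwards,
    so S can be applied to unlabeled test documents.
    """
    G = X_aug @ X_aug.T
    eigvals, eigvecs = np.linalg.eigh(G)  # G is symmetric
    S = (eigvecs * np.exp(decay * eigvals)) @ eigvecs.T
    return S[:n_terms, :n_terms]

def kernel(A, B, S):
    """Smoothed bag-of-words kernel between the document columns of A and B."""
    return A.T @ S @ B

# Toy data (invented): four terms, four documents, no two terms ever co-occur.
X = np.eye(4)
y = np.array([0, 0, 1, 1])  # documents 0-1 belong to class 0, documents 2-3 to class 1

S_plain = diffusion_matrix(X, n_terms=4)                       # unsupervised diffusion
S_sprinkled = diffusion_matrix(sprinkle(X, y, 2), n_terms=4)   # class-aware diffusion
K_train = kernel(X, X, S_sprinkled)                            # Gram matrix for training
```

In this toy case terms 0 and 1 never share a document, so the unsupervised diffusion leaves them unrelated, whereas after sprinkling they become similar through the shared class term; terms from different classes stay unrelated. A Gram matrix such as `K_train` could then be passed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.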
Please use this identifier to cite or link to this item: