Similarity-based approach for positive and unlabelled learning

Publication Type: Conference Proceeding
IJCAI International Joint Conference on Artificial Intelligence, 2011, pp. 1577 - 1582
Positive and unlabelled learning (PU learning) addresses the setting in which only positive examples and unlabelled examples are available. Most previous work focuses on identifying some negative examples in the unlabelled data so that supervised learning methods can be applied to build a classifier. However, the remaining unlabelled examples, which cannot be explicitly identified as positive or negative (we call them ambiguous examples), are either excluded from the training phase or simply forced into one of the two classes. Consequently, the performance of these methods may be limited. This paper proposes a novel approach, called similarity-based PU learning (SPUL), which associates each ambiguous example with two similarity weights indicating its similarity to the positive class and to the negative class, respectively. Local similarity-based and global similarity-based mechanisms are proposed to generate the similarity weights. The ambiguous examples and their similarity weights are then incorporated into an SVM-based learning phase to build a more accurate classifier. Extensive experiments on real-world datasets show that SPUL outperforms state-of-the-art PU learning methods.
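
The idea of attaching two similarity weights to each ambiguous example can be illustrated with a small Python sketch. It is not the paper's formulation: the abstract does not give the weighting formulas, so the centroid-distance weighting below is an assumed stand-in for a "global" mechanism. The function names (`centroid`, `similarity_weights`, `weighted_training_set`) and the choice to emit each ambiguous example twice, once per class with its weight, are hypothetical and for illustration only.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def similarity_weights(x, pos_centroid, neg_centroid):
    """Return (m_pos, m_neg) for one ambiguous example.

    ASSUMPTION: weights come from Euclidean distances to the two class
    centroids and are normalised to sum to 1. The paper's actual local and
    global mechanisms are not specified in the abstract.
    """
    d_pos = math.dist(x, pos_centroid)
    d_neg = math.dist(x, neg_centroid)
    total = d_pos + d_neg
    if total == 0.0:
        # Both centroids coincide with x; no evidence either way.
        return 0.5, 0.5
    # The closer x is to a class centroid, the larger that class's weight.
    return d_neg / total, d_pos / total


def weighted_training_set(positives, negatives, ambiguous):
    """Build (features, label, weight) rows for a weighted classifier.

    Labelled positives and extracted negatives get weight 1.0. Each
    ambiguous example appears twice: once as +1 with weight m_pos and
    once as -1 with weight m_neg (an illustrative design choice).
    """
    pos_c = centroid(positives)
    neg_c = centroid(negatives)
    rows = [(x, +1, 1.0) for x in positives]
    rows += [(x, -1, 1.0) for x in negatives]
    for x in ambiguous:
        m_pos, m_neg = similarity_weights(x, pos_c, neg_c)
        rows.append((x, +1, m_pos))
        rows.append((x, -1, m_neg))
    return rows


if __name__ == "__main__":
    positives = [[0.0, 0.0], [2.0, 0.0]]
    negatives = [[9.0, 0.0], [11.0, 0.0]]
    ambiguous = [[4.0, 0.0]]
    for features, label, weight in weighted_training_set(positives, negatives, ambiguous):
        print(features, label, round(weight, 3))
```

Rows of this shape could then be handed to any SVM implementation that accepts per-example weights, for instance through the `sample_weight` argument of scikit-learn's `SVC.fit`. That would only approximate the SVM-based learning phase the abstract describes.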