Dynamic classifier ensemble for positive unlabeled text stream classification

Publication Type: Journal Article
Citation: Knowledge and Information Systems, 2012, 33 (2), pp. 267-287
Issue Date: 2012-11-01
Most studies on streaming data classification assume that the data can be fully labeled. In real-life applications, however, manually labeling an entire stream for training is impractical and time-consuming. In data stream environments it is very common that only a small set of positive examples and a large amount of unlabeled data are available. In this case, traditional streaming algorithms applied to a positive unlabeled stream with only straightforward adaptation may perform poorly. In this paper, we propose a Dynamic Classifier Ensemble method for Positive and Unlabeled text stream (DCEPU) classification scenarios. We address the problem of classifying positive and unlabeled text streams under various kinds of concept drift by constructing an appropriate validation set and designing a novel dynamic weighting scheme for the classification phase. Experimental results on the benchmark dataset RCV1-v2 demonstrate that the proposed DCEPU method outperforms the existing LELC (Li et al. 2009b), DVS (with necessary adaptation) (Tsymbal et al. in Inf Fusion 9(1):56-68, 2008), and Stacking-style ensemble-based algorithm (Zhang et al. 2008b). © 2011 Springer-Verlag London Limited.
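The abstract does not give the details of the DCEPU weighting scheme, so the following is only a generic sketch of what dynamic weighting over a classifier ensemble can look like, not the authors' method. It assumes that a small labeled validation set is already available (how to construct one from positive and unlabeled data is the paper's contribution and is not shown), that documents are bag-of-words dictionaries, and that each ensemble member is a plain function returning 1 (positive) or 0 (negative). All names (`cosine`, `dynamic_weights`, `ensemble_predict`, `k`) are hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two bag-of-words dicts (term -> weight)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def dynamic_weights(classifiers, validation, x, k=3):
    """Weight each classifier by its accuracy on the k validation
    documents most similar to the incoming document x.

    `validation` is a list of (document, label) pairs; assumed given.
    """
    neighbours = sorted(
        validation, key=lambda pair: cosine(x, pair[0]), reverse=True
    )[:k]
    weights = []
    for clf in classifiers:
        correct = sum(1 for doc, label in neighbours if clf(doc) == label)
        weights.append(correct / len(neighbours))
    return weights


def ensemble_predict(classifiers, validation, x, k=3):
    """Weighted vote: positive votes count +w, negative votes count -w."""
    weights = dynamic_weights(classifiers, validation, x, k)
    score = sum(
        w * (1 if clf(x) == 1 else -1)
        for w, clf in zip(weights, classifiers)
    )
    return 1 if score > 0 else 0


# Toy ensemble members (hypothetical keyword rules, for illustration only).
def clf_goal(doc):
    return 1 if "goal" in doc else 0


def clf_team(doc):
    return 1 if "team" in doc else 0


classifiers = [clf_goal, clf_team]
validation = [
    ({"goal": 1, "match": 1}, 1),
    ({"market": 1, "stock": 1}, 0),
    ({"goal": 1, "team": 1}, 1),
]
incoming = {"goal": 2, "match": 1}

weights = dynamic_weights(classifiers, validation, incoming, k=2)
prediction = ensemble_predict(classifiers, validation, incoming, k=2)
```

Because the weights are recomputed for every incoming document from its nearest validation neighbours, a classifier trained on an outdated concept loses influence on documents where it no longer agrees with the validation labels. This is one common way to react to concept drift; other weighting rules could be substituted in `dynamic_weights` without changing the voting step.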