Healing sample selection bias by source classifier selection

Publication Type:
Conference Proceeding
Proceedings - IEEE International Conference on Data Mining, ICDM, 2011, pp. 577 - 586
Issue Date:
Filename Description Size
Thumbnail2013004337OK.pdf432.2 kB
Adobe PDF
Full metadata record
Domain Adaptation (DA) methods are usually carried out by means of simply reducing the marginal distribution differences between the source and target domains, and subsequently using the resultant trained classifier, namely source classifier, for use in the target domain. However, in many cases, the true predictive distributions of the source and target domains can be vastly different especially when their class distributions are skewed, causing the issues of sample selection bias in DA. Hence, DA methods which leverage the source labeled data may suffer from poor generalization in the target domain, resulting in negative transfer. In addition, we observed that many DA methods use either a source classifier or a linear combination of source classifiers with a fixed weighting for predicting the target unlabeled data. Essentially, the labels of the target unlabeled data are spanned by the prediction of these source classifiers. Motivated by these observations, in this paper, we propose to construct many source classifiers of diverse biases and learn the weight for each source classifier by directly minimizing the structural risk defined on the target unlabeled data so as to heal the possible sample selection bias. Since the weights are learned by maximizing the margin of separation between opposite classes on the target unlabeled data, the proposed method is established here as Maximal Margin Target Label Learning (MMTLL), which is in a form of Multiple Kernel Learning problem with many label kernels. Extensive experimental studies of MMTLL against several state-of-the-art methods on the Sentiment and Newsgroups datasets with various imbalanced class settings showed that MMTLL exhibited robust accuracies on all the settings considered and was resilient to negative transfer, in contrast to other counterpart methods which suffered significantly in prediction accuracy. © 2011 IEEE.
Please use this identifier to cite or link to this item: