Learning with inadequate and incorrect supervision
- Publication Type: Conference Proceeding
- Proceedings - IEEE International Conference on Data Mining, ICDM, 2017, 2017-November pp. 889 - 894
- Issue Date:
© 2017 IEEE. In practice, the labeled data at hand are often too few to train a reliable classifier and, more seriously, some of them may be mislabeled owing to various human factors. This paper therefore proposes a novel semi-supervised learning paradigm that handles both label insufficiency and label inaccuracy. To address label insufficiency, we connect the data points with a graph so that label information can propagate from the scarce labeled examples to the unlabeled examples along the graph edges. To address label inaccuracy, we adopt Graph Trend Filtering (GTF) and Smooth Eigenbase Pursuit (SEP) to filter out the noise in the initial labels. GTF penalizes the ℓ0 norm of the label difference between connected examples in the graph and exhibits better local adaptivity than the traditional ℓ2-norm-based Laplacian smoother. SEP reconstructs the correct labels by emphasizing the leading eigenvectors of the Laplacian matrix, that is, those associated with small eigenvalues, because these eigenvectors reflect real label smoothness and carry rich class separation cues. We term our algorithm 'Semi-supervised learning under Inadequate and Incorrect Supervision' (SIIS). Thorough experiments on image classification, text categorization, and speech recognition demonstrate that SIIS is effective at correcting label errors and outperforms state-of-the-art methods in the presence of label noise and label scarcity.
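The spectral intuition behind the abstract (labels that are smooth on the graph lie mostly in the span of the Laplacian eigenvectors with small eigenvalues) can be illustrated with a short sketch. This is not the SIIS algorithm: it omits GTF entirely and replaces SEP with a hard projection onto the first few eigenvectors, a deliberate simplification. The function names `knn_graph` and `smooth_label_estimate`, the parameters `k` and `n_eig`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def knn_graph(X, k=5):
    """Symmetric k-nearest-neighbour adjacency matrix with 0/1 weights."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)             # a point is not its own neighbour
    nearest = np.argsort(d2, axis=1)[:, :k]  # indices of the k closest points
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(W, W.T)                # symmetrise

def smooth_label_estimate(X, Y0, k=5, n_eig=2):
    """Project initial labels onto the n_eig smoothest Laplacian eigenvectors.

    Y0 is (n, c): one-hot rows for labelled points, zero rows for unlabelled.
    Hypothetical simplification, not the method proposed in the paper.
    """
    W = knn_graph(X, k)
    L = np.diag(W.sum(axis=1)) - W           # unnormalised graph Laplacian
    _, vecs = np.linalg.eigh(L)              # eigenvalues in ascending order
    U = vecs[:, :n_eig]                      # eigenvectors with smallest eigenvalues
    F = U @ (U.T @ Y0)                       # smooth reconstruction of the labels
    return F.argmax(axis=1)

# Toy data (assumed): two well-separated groups of 20 points on a line.
X = np.concatenate([np.arange(20) * 0.1, 10 + np.arange(20) * 0.1])[:, None]
Y0 = np.zeros((40, 2))
Y0[[0, 5, 10], 0] = 1.0      # three correct labels in the first group
Y0[15, 1] = 1.0              # one wrong label in the first group (label noise)
Y0[[22, 27, 33], 1] = 1.0    # three correct labels in the second group

pred = smooth_label_estimate(X, Y0)
```

In this toy setting the graph splits into two connected components, so the two smoothest eigenvectors span the component indicators and the projection amounts to a per-component vote over the labelled points. The mislabeled point at index 15 is outvoted by its three correctly labelled neighbours, and the unlabeled points inherit their group's label.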