Exploiting Local Data Uncertainty to Boost Global Outlier Detection

Publication Type:
Conference Proceeding
ICDM 2010, The 10th IEEE International Conference on Data Mining, 2010, pp. 304 - 313
This paper presents a novel hybrid approach to outlier detection that incorporates local data uncertainty into the construction of a global classifier. To deal with local data uncertainty, we assign a confidence value to each example in the training data, which measures the strength of the corresponding class label. Our proposed method works in two steps. First, we generate a pseudo training dataset by computing, for each input example, a confidence value on its class label. We present two mechanisms for computing the confidence values from the local data behavior: a kernel k-means clustering algorithm and a kernel LOF-based algorithm. Second, we construct a global classifier for outlier detection by generalizing the SVDD-based learning framework to incorporate both positive and negative examples as well as their associated confidence values. By integrating local and global outlier detection, our proposed method explicitly handles the uncertainty of the input data and reduces the sensitivity of SVDD to noise. Extensive experiments on real-life datasets demonstrate that our proposed method achieves a better tradeoff between detection rate and false alarm rate than four state-of-the-art outlier detection algorithms.
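The first step can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the confidence formula, so the sketch assumes a single cluster per class (a degenerate case of kernel k-means), an RBF kernel, and a hypothetical mapping exp(-d / mean(d)) from kernel-space distance d to a confidence in (0, 1]. The names rbf_kernel, kernel_distance_to_center, and confidence_values are illustrative only.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel; the kernel choice and gamma are assumptions of this sketch."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def kernel_distance_to_center(x, cluster, kernel=rbf_kernel):
    """Squared feature-space distance from x to the mean of `cluster`.

    Uses the identity
    ||phi(x) - mu||^2 = k(x, x) - (2/n) sum_i k(x, c_i) + (1/n^2) sum_ij k(c_i, c_j),
    so the feature map phi never has to be computed explicitly.
    """
    n = len(cluster)
    k_xx = kernel(x, x)
    cross = sum(kernel(x, c) for c in cluster) / n
    within = sum(kernel(a, b) for a in cluster for b in cluster) / (n * n)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(k_xx - 2.0 * cross + within, 0.0)


def confidence_values(examples, kernel=rbf_kernel):
    """Hypothetical confidence per example, in (0, 1].

    Examples far from the class center in feature space get a low
    confidence on their class label. The exponential mapping is an
    assumption made for illustration, not taken from the paper.
    """
    dists = [kernel_distance_to_center(x, examples, kernel) for x in examples]
    mean_dist = sum(dists) / len(dists)
    if mean_dist == 0.0:
        # All examples coincide with the center: full confidence everywhere.
        return [1.0] * len(dists)
    return [math.exp(-d / mean_dist) for d in dists]


# Four tightly grouped examples and one isolated example, all labeled "normal".
normal = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (3.0, 3.0)]
conf = confidence_values(normal)
for point, c in zip(normal, conf):
    print(point, round(c, 3))
```

In this toy run the isolated example receives a much lower confidence than the four grouped ones. In the second step, such values would weight each example's contribution when the SVDD-style global classifier is trained; that optimization is not sketched here.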