Support Vector Machine for Outlier Detection in Breast Cancer Survivability Prediction
- Springer Berlin / Heidelberg
- Publication Type:
- Conference Proceeding
- Advanced Web and Network Technologies, and Applications, Lecture Notes in Computer Science, 2008, 4977, pp. 99 - 109
- Issue Date:
Finding and removing misclassified instances are important steps in data mining and machine learning that affect the performance of the data mining algorithm in general. In this paper, we propose a C-Support Vector Classification Filter (C-SVCF) to identify and remove the misclassified instances (outliers) in breast cancer survivability samples collected from Srinagarind Hospital in Thailand, to improve the accuracy of the prediction models. Only instances that are correctly classified by the filter are passed to the learning algorithm. Performance of the proposed technique is measured with accuracy and area under the receiver operating characteristic curve (AUC), and is compared with several popular ensemble filter approaches, including AdaBoost, Bagging, and ensembles of SVM with AdaBoost and Bagging filters. Our empirical results indicate that C-SVCF is an effective method for identifying misclassified outliers. This approach significantly benefits ongoing research on developing accurate and robust prediction models for breast cancer survivability.
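The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the filter is trained on the full sample and then drops every instance whose predicted label disagrees with its recorded label, and it substitutes a small linear soft-margin SVM trained by subgradient descent for whatever C-SVC solver and kernel the paper actually used. The names `train_linear_svc`, `predict` and `svc_filter`, the hyperparameters (`C`, `lr`, `epochs`) and the toy data are all hypothetical.

```python
# Hedged sketch of a classification filter with a linear soft-margin SVM.
# Assumption: train on all instances, then keep only those the trained
# model classifies correctly. Labels are expected to be -1 or +1.

def train_linear_svc(X, y, C=1.0, lr=0.01, epochs=1000):
    """Minimise 0.5*||w||^2 + C * sum(hinge loss) by subgradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)          # gradient of the 0.5*||w||^2 term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:    # instance lies inside or beyond the margin
                for j, xj in enumerate(xi):
                    grad_w[j] -= C * yi * xj
                grad_b -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


def svc_filter(X, y, C=1.0):
    """Split the data into instances the filter classifies correctly
    (kept) and instances it misclassifies (treated as outliers)."""
    w, b = train_linear_svc(X, y, C=C)
    kept, removed = [], []
    for xi, yi in zip(X, y):
        (kept if predict(w, b, xi) == yi else removed).append((xi, yi))
    return kept, removed


# Toy data (hypothetical): two well-separated groups on one feature,
# plus one instance at 2.2 whose label conflicts with its neighbours.
X = [[-3.0], [-2.5], [-2.0], [-1.5], [-1.0],
     [1.0], [1.5], [2.0], [2.5], [3.0],
     [2.2]]
y = [-1, -1, -1, -1, -1,
     1, 1, 1, 1, 1,
     -1]

kept, removed = svc_filter(X, y)
print(len(kept), removed)
```

In this sketch only the `kept` instances would be handed to the downstream learning algorithm. A smaller `C` tolerates more margin violations, so in general it tends to flag more instances as outliers.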