Cost-sensitive semi-supervised classification using CS-EM

Publication Type:
Conference Proceeding
Proceedings - 2008 IEEE 8th International Conference on Computer and Information Technology, CIT 2008, 2008, pp. 131 - 136
Issue Date:
In many real-world data mining and classification tasks, we face the problem of the high cost of creating training data sets. In addition, in many domains, different misclassification errors incur different costs. These two issues are often addressed separately, by semi-supervised learning and cost-sensitive learning respectively, yet in real-world applications they can arise at the same time. However, existing semi-supervised learning algorithms do not consider misclassification costs. In this paper, we propose a simple and novel method, CS-EM, for learning a cost-sensitive classifier from both labeled and unlabeled training data. CS-EM modifies EM, a popular semi-supervised learning algorithm, by incorporating misclassification costs into the probability estimation process. Our experiments show that CS-EM outperforms two competing methods on three benchmark text data sets across different cost ratios. © 2008 IEEE.
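The abstract does not spell out how costs enter the probability estimates, so the following is only a rough illustration of the general idea, not the authors' formulation. It assumes a multinomial naive Bayes model trained with EM on word-count vectors, plus a hypothetical `cost_weight` step that scales each class posterior by the cost of misclassifying that class and then renormalises. All function names, the weighting rule, and the toy data are assumptions made for this sketch.

```python
import math

def fit_nb(docs, resp, n_classes, alpha=1.0):
    """M-step: multinomial naive Bayes from soft class memberships."""
    n_words = len(docs[0])
    class_mass = [alpha + sum(r[c] for r in resp) for c in range(n_classes)]
    total = sum(class_mass)
    log_prior = [math.log(m / total) for m in class_mass]
    log_word = []
    for c in range(n_classes):
        # Expected word counts for class c, with Laplace smoothing.
        counts = [alpha + sum(r[c] * d[w] for d, r in zip(docs, resp))
                  for w in range(n_words)]
        z = sum(counts)
        log_word.append([math.log(k / z) for k in counts])
    return log_prior, log_word

def posterior(doc, model):
    """Class posterior P(class | doc) under the current model."""
    log_prior, log_word = model
    scores = [lp + sum(n * lw for n, lw in zip(doc, lws))
              for lp, lws in zip(log_prior, log_word)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def cost_weight(probs, class_cost):
    """ASSUMPTION (not from the paper): scale each class probability by the
    cost of misclassifying that class, then renormalise."""
    weighted = [p * c for p, c in zip(probs, class_cost)]
    z = sum(weighted)
    return [w / z for w in weighted]

def cs_em_sketch(lab_docs, lab_y, unl_docs, class_cost, n_iter=10):
    """EM over labeled and unlabeled documents with a cost-weighted E-step."""
    k = len(class_cost)
    lab_resp = [[1.0 if c == y else 0.0 for c in range(k)] for y in lab_y]
    model = fit_nb(lab_docs, lab_resp, k)  # initialise from labeled data only
    for _ in range(n_iter):
        # E-step: cost-weighted soft labels for the unlabeled documents.
        unl_resp = [cost_weight(posterior(d, model), class_cost)
                    for d in unl_docs]
        # M-step: refit on labeled plus softly labeled documents.
        model = fit_nb(lab_docs + unl_docs, lab_resp + unl_resp, k)
    return model

def predict(doc, model, class_cost):
    """Pick the class with the largest cost-weighted posterior."""
    weighted = cost_weight(posterior(doc, model), class_cost)
    return max(range(len(weighted)), key=weighted.__getitem__)

# Toy word-count vectors over a two-word vocabulary (illustrative data only).
labeled = [[3, 0], [0, 3]]
labels = [0, 1]
unlabeled = [[2, 0], [0, 2], [1, 1]]
costs = [1.0, 10.0]  # hypothetical: missing class 1 is ten times as costly
model = cs_em_sketch(labeled, labels, unlabeled, costs)
print(predict([1, 1], model, costs))
```

With equal costs the weighting step leaves the posteriors unchanged and the loop reduces to ordinary semi-supervised EM with naive Bayes; raising the cost of one class shifts ambiguous documents toward that class.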