A Cost-Sensitive Learning Strategy for Feature Extraction from Imbalanced Data

Springer International Publishing
Publication Type:
Conference Proceeding
Springer International Publishing, 2016, 9949 (0302-9743), pp. 78 - 86
Issue Date:
Files in This Item:
Filename: Conference paper.pdf
Description: Accepted Manuscript version
Size: 310.49 kB
Format: Adobe PDF
In this paper, novel cost-sensitive principal component analysis (CSPCA) and cost-sensitive non-negative matrix factorization (CSNMF) methods are proposed for feature extraction from imbalanced data. Highly imbalanced data misleads existing feature extraction techniques into producing biased features, which results in poor classification performance, especially on the minority class. To solve this problem, we propose a cost-sensitive learning strategy for feature extraction that uses the imbalance ratio of the classes to discount the majority samples. This strategy is applied to two popular feature extraction methods, PCA and NMF. The main advantage of the proposed methods is that they lessen the bias toward the majority class that is inherent in features extracted by existing PCA and NMF algorithms. Experiments on twelve public datasets with different imbalance ratios show that the proposed methods outperform state-of-the-art methods across multiple classifiers.
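The abstract does not give the CSPCA formulation, so the following is only a minimal sketch of the general idea of discounting majority samples by the imbalance ratio, written as a sample-weighted PCA in NumPy. The function names `imbalance_weights` and `weighted_pca`, the specific weight `n_min / n_c`, and the synthetic data are illustrative assumptions, not the authors' method.

```python
import numpy as np


def imbalance_weights(y):
    """Per-sample weights n_min / n_c (an assumed weighting scheme).

    Samples of the smallest class keep weight 1; samples of larger
    classes are discounted so every class has the same total weight.
    """
    y = np.asarray(y).ravel()
    _, inverse, counts = np.unique(y, return_inverse=True, return_counts=True)
    return (counts.min() / counts)[inverse]


def weighted_pca(X, y, n_components):
    """PCA on a class-weighted mean and covariance (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    w = imbalance_weights(y)
    # Weighted mean, so the majority class does not dominate the centering.
    mean = np.average(X, axis=0, weights=w)
    Xc = X - mean
    # Weighted covariance: each sample contributes in proportion to its weight.
    cov = (Xc * w[:, None]).T @ Xc / w.sum()
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order].T
    return Xc @ components.T, components, mean


# Synthetic two-class data with a 9:1 imbalance (hypothetical example data).
rng = np.random.default_rng(0)
X_major = rng.normal(loc=0.0, scale=1.0, size=(90, 5))
X_minor = rng.normal(loc=2.0, scale=1.0, size=(10, 5))
X_demo = np.vstack([X_major, X_minor])
y_demo = np.array([0] * 90 + [1] * 10)

Z_demo, components_demo, mean_demo = weighted_pca(X_demo, y_demo, n_components=2)
print(Z_demo.shape)
```

With these weights both classes contribute the same total mass to the mean and covariance, which is one simple way to reduce the pull of the majority class on the extracted directions. An analogous weighting could in principle be placed on the reconstruction error of NMF, but that variant is not sketched here.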