Efficient mining of contrast patterns on large scale imbalanced real-life data

Publisher:
Springer
Publication Type:
Journal Article
Citation:
Lecture Notes in Computer Science, 2013, 7818 (1), pp. 62 - 73
Issue Date:
2013-01
Contrast pattern mining has been studied intensively for its strong discriminative capability. However, state-of-the-art methods rarely consider the class imbalance problem, which has proven to be a major challenge in mining large-scale data. This paper introduces a novel pattern, the converging pattern, which refers to an itemset whose support contrasts sharply from the minority class to the majority class. A novel algorithm, ConvergMiner, is proposed; it adopts a T*-tree and branch-and-bound pruning strategies to mine converging patterns efficiently. Extensive experiments in online banking fraud detection show that ConvergMiner greatly outperforms existing cost-sensitive classification methods in terms of predictive accuracy. In particular, its efficiency improves as the data imbalance increases.
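To make the idea of a support contrast between classes concrete, here is a minimal Python sketch. It is not ConvergMiner: the abstract does not give the formal definition of a converging pattern, so the sketch assumes a simple ratio of minority-class support to majority-class support (in the spirit of a growth rate) and enumerates candidates by brute force, with no T*-tree and no branch-and-bound pruning. The function names, thresholds, and toy transactions are all hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def contrast_ratio(itemset, minority, majority):
    """Assumed contrast measure: minority support divided by majority support."""
    s_min = support(itemset, minority)
    s_maj = support(itemset, majority)
    if s_maj == 0.0:
        # Itemset never occurs in the majority class.
        return float("inf") if s_min > 0.0 else 0.0
    return s_min / s_maj


def naive_contrast_patterns(minority, majority, min_support, min_ratio, max_len=2):
    """Brute-force enumeration of itemsets up to `max_len` items (illustration only)."""
    items = sorted(set().union(*minority)) if minority else []
    found = []
    for k in range(1, max_len + 1):
        for cand in combinations(items, k):
            s_min = support(cand, minority)
            if s_min >= min_support and contrast_ratio(cand, minority, majority) >= min_ratio:
                found.append((cand, s_min))
    return found


# Hypothetical toy transactions: 3 minority (fraud) vs 8 majority (genuine).
minority = [frozenset(t) for t in (
    {"new_payee", "night", "high_amount"},
    {"new_payee", "night"},
    {"new_payee", "high_amount"},
)]
majority = [frozenset(t) for t in (
    {"known_payee", "day"},
    {"known_payee", "day", "high_amount"},
    {"new_payee", "day"},
    {"known_payee", "night"},
    {"known_payee", "day"},
    {"known_payee", "day"},
    {"known_payee", "day", "high_amount"},
    {"known_payee", "day"},
)]

patterns = naive_contrast_patterns(minority, majority, min_support=0.5, min_ratio=5.0)
for itemset, s_min in patterns:
    print(itemset, round(s_min, 3), contrast_ratio(itemset, minority, majority))
```

Brute-force enumeration grows exponentially with the number of items, which is the kind of cost that tree-based search and pruning are meant to avoid on large-scale data.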