A hierarchical VQSVM for imbalanced data sets

Publication Type:
Conference Proceeding
Citation:
IEEE International Conference on Neural Networks - Conference Proceedings, 2007, pp. 518 - 523
Issue Date:
2007-12-01
First, a hierarchical modelling method, VQSVM, is introduced, followed by some remarks. Second, the proposed VQSVM is applied to a nonstandard learning environment: imbalanced data sets. On extremely imbalanced, high-dimensional datasets, standard machine learning techniques tend to be overwhelmed by the large classes. The hierarchical VQSVM combines a set of local models, i.e. codevectors produced by Vector Quantization, with a global model, i.e. a Support Vector Machine, to rebalance datasets without significant information loss. Issues such as distortion and support vectors are discussed to address the trade-off between information loss and the undersampling rate. Experiments compare VQSVM with random resampling techniques on several imbalanced datasets with varied imbalance ratios; the results show that VQSVM performs as well as or better than random resampling, especially on extremely imbalanced large datasets. ©2007 IEEE.
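The rebalancing idea in the abstract can be illustrated with a minimal sketch. It assumes Lloyd's algorithm (k-means) as the vector quantizer and toy two-dimensional data; the abstract does not specify the paper's actual quantizer, parameters, or datasets. The function names `vector_quantize`, `distortion`, and `rebalance` are hypothetical and chosen only for this illustration. The sketch covers the local-model step only: the majority class is replaced by a small codebook, and the resulting balanced set would then be passed to an SVM from any library of choice.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def vector_quantize(points, n_codevectors, n_iter=20, seed=0):
    """Lloyd's algorithm (k-means), assumed here as the quantizer."""
    rng = random.Random(seed)
    # Initialise the codebook with randomly chosen data points.
    codebook = [list(p) for p in rng.sample(points, n_codevectors)]
    for _ in range(n_iter):
        # Assign each point to the cell of its nearest codevector.
        cells = [[] for _ in codebook]
        for p in points:
            nearest = min(range(len(codebook)),
                          key=lambda j: squared_distance(p, codebook[j]))
            cells[nearest].append(p)
        # Move each codevector to the mean of its cell; leave it if the cell is empty.
        for j, cell in enumerate(cells):
            if cell:
                codebook[j] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook

def distortion(points, codebook):
    """Mean squared distance from each point to its nearest codevector."""
    return sum(min(squared_distance(p, c) for c in codebook)
               for p in points) / len(points)

def rebalance(majority, minority, n_codevectors):
    """Replace the majority class by codevectors; keep the minority class as is."""
    codebook = vector_quantize(majority, n_codevectors)
    X = codebook + [list(p) for p in minority]
    y = [0] * len(codebook) + [1] * len(minority)
    return X, y

# Toy data (an assumption): 500 majority points against 10 minority points.
rng = random.Random(42)
majority = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(500)]
minority = [[rng.gauss(3.0, 0.5), rng.gauss(3.0, 0.5)] for _ in range(10)]

# One codevector per minority sample gives a 1:1 class ratio.
X, y = rebalance(majority, minority, n_codevectors=len(minority))
```

The `distortion` helper gives one way to look at the trade-off the abstract mentions: fewer codevectors mean a higher undersampling rate but a larger distortion, that is, more information lost about the majority class.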