Combine vector quantization and support vector machine for imbalanced datasets

Yu, T; Debenham, J; Jan, T; Simoff, S

Publication Type: Conference Proceeding
Citation: IFIP International Federation for Information Processing, 2006, 217, pp. 81-88
Issue Date: 2006-12-21

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Yu, T	en_US
dc.contributor.author	Debenham, J	en_US
dc.contributor.author	Jan, T	en_US
dc.contributor.author	Simoff, S	en_US
dc.date.issued	2006-12-21	en_US
dc.identifier.citation	IFIP International Federation for Information Processing, 2006, 217 pp. 81 - 88	en_US
dc.identifier.isbn	0387346554	en_US
dc.identifier.isbn	9780387346557	en_US
dc.identifier.issn	1571-5736	en_US
dc.identifier.uri	http://hdl.handle.net/10453/1765
dc.description.abstract	On extremely imbalanced, high-dimensional datasets, standard machine learning techniques tend to be overwhelmed by the large classes. This paper rebalances skewed datasets by compressing the majority class. The approach combines Vector Quantization and the Support Vector Machine into a new method, VQ-SVM, which rebalances datasets without significant information loss. Issues such as distortion and support vectors are discussed to address the trade-off between information loss and undersampling. Experiments compare VQ-SVM and a standard SVM on several imbalanced datasets with varied imbalance ratios; the results show that VQ-SVM outperforms the standard SVM, especially on extremely imbalanced large datasets. © 2006 International Federation for Information Processing.	en_US
dc.relation.ispartof	IFIP International Federation for Information Processing	en_US
dc.relation.isbasedon	10.1007/978-0-387-34747-9_9	en_US
dc.subject.classification	Information Systems	en_US
dc.title	Combine vector quantization and support vector machine for imbalanced datasets	en_US
dc.type	Conference Proceeding
utslib.citation.volume	217	en_US
utslib.for	080109 Pattern Recognition and Data Mining	en_US
dc.location.activity	Santiago, CHILE	en_US
dc.location.activity	Brisbane, Australia
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Electrical and Data Engineering
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Software
pubs.organisational-group	/University of Technology Sydney/Strength - AAI - Advanced Analytics Institute Research Centre
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US
pubs.volume	217	en_US

Abstract:

On extremely imbalanced, high-dimensional datasets, standard machine learning techniques tend to be overwhelmed by the large classes. This paper rebalances skewed datasets by compressing the majority class. The approach combines Vector Quantization and the Support Vector Machine into a new method, VQ-SVM, which rebalances datasets without significant information loss. Issues such as distortion and support vectors are discussed to address the trade-off between information loss and undersampling. Experiments compare VQ-SVM and a standard SVM on several imbalanced datasets with varied imbalance ratios; the results show that VQ-SVM outperforms the standard SVM, especially on extremely imbalanced large datasets. © 2006 International Federation for Information Processing.
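
The rebalancing idea in the abstract, compressing the majority class into a small set of representative vectors before training, can be illustrated with a minimal sketch. The abstract does not say which quantizer or codebook size the paper uses, so this sketch assumes Lloyd's k-means as the vector quantizer and a codebook the same size as the minority class. The function names (`quantize`, `distortion`, `rebalance`) and the synthetic data are hypothetical, and the SVM training step is left out: the rebalanced samples would be passed to any SVM implementation.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(points, k, iterations=20, seed=0):
    """Compress `points` into k codewords.

    Assumption: Lloyd's k-means stands in for the paper's quantizer,
    which the abstract does not specify.
    """
    rng = random.Random(seed)
    codebook = rng.sample(points, k)  # initialise with k distinct data points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, codebook[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous codeword
                dim = len(members[0])
                codebook[i] = tuple(
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                )
    return codebook


def distortion(points, codebook):
    """Mean squared distance from each point to its nearest codeword."""
    total = sum(min(squared_distance(p, c) for c in codebook) for p in points)
    return total / len(points)


def rebalance(majority, minority, seed=0):
    """Replace the majority class with a codebook as large as the minority class.

    Assumption: a 1:1 class ratio is the target; other ratios are possible.
    """
    codebook = quantize(majority, k=len(minority), seed=seed)
    samples = codebook + list(minority)
    labels = [0] * len(codebook) + [1] * len(minority)
    return samples, labels, distortion(majority, codebook)


# Synthetic 50:1 imbalanced data, for illustration only.
data_rng = random.Random(42)
majority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(500)]
minority = [(data_rng.gauss(3, 0.5), data_rng.gauss(3, 0.5)) for _ in range(10)]

samples, labels, majority_distortion = rebalance(majority, minority)
print(len(samples), labels.count(0), labels.count(1), majority_distortion)
```

The returned distortion gives a rough measure of how much majority-class information the compression discards, which is one side of the trade-off the abstract mentions. A larger codebook lowers distortion but leaves the training set less balanced.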

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/1765