Boosting imbalanced data learning with Wiener process oversampling

Li, Q; Li, G; Niu, W; Cao, Y; Chang, L; Tan, J; Guo, L

Publisher: Springer
Publication Type: Journal Article
Citation: Frontiers of Computer Science, 2017, 11, (5), pp. 836-851
Issue Date: 2017-10-01

Closed Access

Filename	Description	Size
Li2017_Article_BoostingImbalancedDataLearning.pdf		974.65 kB

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Li, Q https://orcid.org/0000-0002-8308-9551
dc.contributor.author	Li, G
dc.contributor.author	Niu, W
dc.contributor.author	Cao, Y
dc.contributor.author	Chang, L
dc.contributor.author	Tan, J
dc.contributor.author	Guo, L
dc.date.accessioned	2022-07-13T05:16:11Z
dc.date.available	2022-07-13T05:16:11Z
dc.date.issued	2017-10-01
dc.identifier.citation	Frontiers of Computer Science, 2017, 11, (5), pp. 836-851
dc.identifier.issn	2095-2236
dc.identifier.uri	http://hdl.handle.net/10453/158847
dc.description.abstract	Learning from imbalanced data is a challenging task in a wide range of applications and has attracted significant research effort from the machine learning and data mining communities. As a natural approach to this issue, oversampling balances the training samples by replicating existing samples or synthesizing new ones. In general, synthesis outperforms replication because it supplies additional information about the minority class. However, the additional information needs to follow the same normal distribution as the training set, which constrains the new samples to the predefined range of the training set. In this paper, we present the Wiener process oversampling (WPO) technique, which brings a physical phenomenon into sample synthesis. WPO constructs a robust decision region by expanding the attribute ranges of the training set while keeping the same normal distribution. WPO achieves satisfactory performance with much lower computational complexity. In addition, by integrating WPO with ensemble learning, the WPOBoost algorithm outperforms many prevalent imbalanced learning solutions.
dc.language	English
dc.publisher	Springer
dc.relation.ispartof	Frontiers of Computer Science
dc.relation.isbasedon	10.1007/s11704-016-5250-y
dc.rights	info:eu-repo/semantics/closedAccess
dc.title	Boosting imbalanced data learning with Wiener process oversampling
dc.type	Journal Article
utslib.citation.volume	11
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAI - Advanced Analytics Institute Research Centre
utslib.copyright.status	closed_access	*
pubs.consider-herdc	false
dc.date.updated	2022-07-13T05:16:10Z
pubs.issue	5
pubs.publication-status	Published
pubs.volume	11
utslib.citation.issue	5

Abstract:

Learning from imbalanced data is a challenging task in a wide range of applications and has attracted significant research effort from the machine learning and data mining communities. As a natural approach to this issue, oversampling balances the training samples by replicating existing samples or synthesizing new ones. In general, synthesis outperforms replication because it supplies additional information about the minority class. However, the additional information needs to follow the same normal distribution as the training set, which constrains the new samples to the predefined range of the training set. In this paper, we present the Wiener process oversampling (WPO) technique, which brings a physical phenomenon into sample synthesis. WPO constructs a robust decision region by expanding the attribute ranges of the training set while keeping the same normal distribution. WPO achieves satisfactory performance with much lower computational complexity. In addition, by integrating WPO with ensemble learning, the WPOBoost algorithm outperforms many prevalent imbalanced learning solutions.
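
This record does not include the paper's algorithm, so the following is only an illustrative sketch of the general idea the abstract describes: synthesizing minority-class samples by letting existing samples take a short Gaussian random walk (a discretized Wiener process), which can move them beyond the original attribute ranges. The function name `wiener_oversample`, the parameters `steps` and `dt`, and the choice to scale each step by the per-attribute standard deviation are all assumptions made for this example, not details taken from the paper.

```python
import random
import statistics


def wiener_oversample(minority, n_new, steps=10, dt=0.1, seed=0):
    """Synthesize n_new samples via a short Gaussian random walk.

    Hypothetical sketch, NOT the published WPO algorithm. Assumption:
    each attribute's step is scaled by that attribute's standard
    deviation within the minority class.
    """
    rng = random.Random(seed)
    n_attrs = len(minority[0])
    # Per-attribute spread of the minority class.
    sigmas = [statistics.pstdev([row[j] for row in minority])
              for j in range(n_attrs)]
    synthetic = []
    for _ in range(n_new):
        # Start from a copy of a randomly chosen existing minority sample.
        point = list(rng.choice(minority))
        for _ in range(steps):
            for j in range(n_attrs):
                # Wiener increment: Gaussian with variance dt, scaled by spread.
                point[j] += sigmas[j] * rng.gauss(0.0, dt ** 0.5)
        synthetic.append(point)
    return synthetic


# Toy minority class with two attributes (made-up values).
minority = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.2]]
new = wiener_oversample(minority, n_new=5)
```

Because each walk is unbounded, synthetic points are not confined to the minimum and maximum of the original attributes, in contrast to interpolation between existing samples. A fixed `seed` keeps the sketch reproducible.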

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/158847