Active learning from stream data using optimal weight classifier ensemble

Zhu, X; Zhang, P; Lin, X; Shi, Y

Publication Type: Journal Article
Citation: IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 2010, 40 (6), pp. 1607 - 1621
Issue Date: 2010-12-01

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Zhu, X	en_US
dc.contributor.author	Zhang, P https://orcid.org/0000-0001-7973-2746	en_US
dc.contributor.author	Lin, X	en_US
dc.contributor.author	Shi, Y	en_US
dc.date.issued	2010-12-01	en_US
dc.identifier.citation	IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 2010, 40 (6), pp. 1607 - 1621	en_US
dc.identifier.issn	1083-4419	en_US
dc.identifier.uri	http://hdl.handle.net/10453/14521
dc.description.abstract	In this paper, we propose a new research problem on active learning from data streams, where data volumes grow continuously, and labeling all data is considered expensive and impractical. The objective is to label a small portion of stream data from which a model is derived to predict future instances as accurately as possible. To tackle the technical challenges raised by the dynamic nature of the stream data, i.e., increasing data volumes and evolving decision concepts, we propose a classifier-ensemble-based active learning framework that selectively labels instances from data streams to build a classifier ensemble. We argue that a classifier ensemble's variance directly corresponds to its error rate, and reducing a classifier ensemble's variance is equivalent to improving its prediction accuracy. Because of this, one should label instances toward the minimization of the variance of the underlying classifier ensemble. Accordingly, we introduce a minimum-variance (MV) principle to guide the instance labeling process for data streams. In addition, we derive an optimal-weight calculation method to determine the weight values for the classifier ensemble. The MV principle and the optimal weighting module are combined to build an active learning framework for data streams. Experimental results on synthetic and real-world data demonstrate the performance of the proposed work in comparison with other approaches. © 2010 IEEE.	en_US
dc.relation.ispartof	IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics	en_US
dc.relation.isbasedon	10.1109/TSMCB.2010.2042445	en_US
dc.subject.classification	Artificial Intelligence & Image Processing	en_US
dc.subject.mesh	Algorithms	en_US
dc.subject.mesh	Decision Support Techniques	en_US
dc.subject.mesh	Models, Theoretical	en_US
dc.subject.mesh	Artificial Intelligence	en_US
dc.subject.mesh	Computer Simulation	en_US
dc.subject.mesh	Signal Processing, Computer-Assisted	en_US
dc.subject.mesh	Pattern Recognition, Automated	en_US
dc.title	Active learning from stream data using optimal weight classifier ensemble	en_US
dc.type	Journal Article
utslib.citation.volume	6	en_US
utslib.citation.volume	40	en_US
utslib.for	0102 Applied Mathematics	en_US
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
utslib.for	0906 Electrical and Electronic Engineering	en_US
dc.location.activity	ISI:000284364400016	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAI - Advanced Analytics Institute Research Centre
utslib.copyright.status	closed_access
pubs.issue	6	en_US
pubs.publication-status	Published	en_US
pubs.volume	40	en_US

Abstract:

In this paper, we propose a new research problem on active learning from data streams, where data volumes grow continuously, and labeling all data is considered expensive and impractical. The objective is to label a small portion of stream data from which a model is derived to predict future instances as accurately as possible. To tackle the technical challenges raised by the dynamic nature of the stream data, i.e., increasing data volumes and evolving decision concepts, we propose a classifier-ensemble-based active learning framework that selectively labels instances from data streams to build a classifier ensemble. We argue that a classifier ensemble's variance directly corresponds to its error rate, and reducing a classifier ensemble's variance is equivalent to improving its prediction accuracy. Because of this, one should label instances toward the minimization of the variance of the underlying classifier ensemble. Accordingly, we introduce a minimum-variance (MV) principle to guide the instance labeling process for data streams. In addition, we derive an optimal-weight calculation method to determine the weight values for the classifier ensemble. The MV principle and the optimal weighting module are combined to build an active learning framework for data streams. Experimental results on synthetic and real-world data demonstrate the performance of the proposed work in comparison with other approaches. © 2010 IEEE.
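
The abstract names two components: a variance-driven rule for choosing which stream instances to label, and a weighting scheme for the ensemble members. The sketch below illustrates the general idea only and is not the procedure from the paper. It assumes that each ensemble member outputs class probabilities, that "variance" can be approximated by the weighted disagreement of the members around their weighted mean prediction, and that labeling the highest-variance instances is a reasonable proxy for variance reduction. The inverse-error weighting is a simple stand-in, not the optimal-weight derivation the paper describes. All function names are hypothetical.

```python
from typing import List, Sequence


def inverse_error_weights(errors: Sequence[float], eps: float = 1e-9) -> List[float]:
    """Stand-in weighting (an assumption, not the paper's optimal weights):
    weight each member by 1 / error, then normalize so the weights sum to 1."""
    raw = [1.0 / (e + eps) for e in errors]
    total = sum(raw)
    return [r / total for r in raw]


def ensemble_variance(member_probs: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> float:
    """Weighted variance of the members' class-probability vectors around
    the weighted mean prediction, summed over classes."""
    total = sum(weights)
    norm = [w / total for w in weights]
    n_classes = len(member_probs[0])
    mean = [sum(w * p[c] for w, p in zip(norm, member_probs))
            for c in range(n_classes)]
    return sum(w * (p[c] - mean[c]) ** 2
               for w, p in zip(norm, member_probs)
               for c in range(n_classes))


def select_for_labeling(chunk_probs: Sequence[Sequence[Sequence[float]]],
                        weights: Sequence[float],
                        budget: int) -> List[int]:
    """Return the indices of the `budget` instances in a chunk on which the
    ensemble disagrees most (largest variance score)."""
    scores = [ensemble_variance(probs, weights) for probs in chunk_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Toy chunk: 3 unlabeled instances, 3 ensemble members, 2 classes.
# chunk_probs[i][m] is member m's class-probability vector for instance i.
chunk_probs = [
    [[0.90, 0.10], [0.80, 0.20], [0.85, 0.15]],  # members agree
    [[0.90, 0.10], [0.10, 0.90], [0.50, 0.50]],  # members disagree strongly
    [[0.60, 0.40], [0.50, 0.50], [0.55, 0.45]],  # mild disagreement
]

# Hypothetical validation error rates for the three members.
weights = inverse_error_weights([0.1, 0.2, 0.2])

# With a labeling budget of one, query the most contested instance.
picked = select_for_labeling(chunk_probs, weights, budget=1)
print(picked)
```

In this toy chunk the second instance (index 1) is selected, because its three predictions are spread furthest from the weighted mean. In a streaming setting the same selection would be repeated for each incoming chunk, with the members retrained and the weights recomputed after the newly labeled instances arrive.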

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/14521