Robust ensemble learning for mining noisy data streams

Publication Type:
Journal Article
Decision Support Systems, 2011, 50 (2), pp. 469-479
Issue Date:
File: 2011000602OK.pdf (Adobe PDF, 777.01 kB)
In this paper, we study the problem of learning from concept-drifting data streams with noise, where samples in a data stream may be mislabeled or contain erroneous values. Our goal is to build a robust prediction model from noisy stream data that accurately predicts future samples. For noisy data sources, most existing work relies on data preprocessing techniques to cleanse noisy samples before training decision models. In data stream environments, these preprocessing techniques are, unfortunately, hard to apply, mainly because concept drift in a data stream can make it very difficult to differentiate noise from samples of changing concepts. Accordingly, we propose an aggregate ensemble (AE) learning framework, whose aim is to build a robust ensemble model that can tolerate data errors. Theoretical and empirical studies on both synthetic and real-world data streams demonstrate that the proposed AE learning framework is capable of building accurate classification models from noisy data streams. © 2010 Elsevier B.V. All rights reserved.
Please use this identifier to cite or link to this item: