Fuzzy Clustering-Based Adaptive Regression for Drifting Data Streams

Publisher:
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Publication Type:
Journal Article
Citation:
IEEE Transactions on Fuzzy Systems, 2020, 28, (3), pp. 544-557
Issue Date:
2020-03-01
Filename:
08688575.pdf
Description:
Published version
Size:
3.63 MB (Adobe PDF)
© 1993-2012 IEEE. Models and algorithms are increasingly required to learn in nonstationary environments, because concept drift (or pattern shift) may occur; that is, the assumption that data are identically distributed may not hold in data streams. Once the data pattern changes, a well-trained model built on previous, now obsolete data cannot provide accurate predictions for future data. To obtain reliable predictions, it is important to understand the patterns present in the data stream and to know which pattern the current examples belong to during the modeling process. In many real-world cases, however, assigning an example to a single pattern is ambiguous. In this paper, we propose a novel adaptive regression approach, called FUZZ-CARE, that dynamically recognizes, trains, and stores patterns, and assigns upcoming examples membership degrees for each of these patterns. Membership degrees are represented by the membership matrix obtained from kernel fuzzy c-means clustering, which is trained and adapted synchronously with the regression parameters. Rather than designing a complicated procedure to continuously chase the newest pattern, which is a common approach in the literature, FUZZ-CARE abstracts useful past information to help predict newly arrived examples. It thus avoids the risk of insufficient training due to a lack of new data and improves prediction accuracy. Experiments on six synthetic datasets and 21 real-world datasets validate the high accuracy and robustness of our approach.
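The idea of weighting stored patterns by fuzzy membership can be illustrated with a minimal sketch. It is not the FUZZ-CARE algorithm itself: the abstract does not give the update rules, so the code assumes a Gaussian kernel, the textbook kernel fuzzy c-means membership formula, and one fixed linear regressor per pattern. All names (`kfcm_memberships`, `fuzzy_predict`, `centers`, `coefs`, `intercepts`, `sigma`, `m`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def gaussian_kernel(x, centers, sigma):
    # K(x, v_i) = exp(-||x - v_i||^2 / (2 * sigma^2)) for every center v_i.
    sq_dist = np.sum((centers - x) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def kfcm_memberships(x, centers, sigma=1.0, m=2.0, eps=1e-12):
    # Assumption: standard kernel fuzzy c-means membership, where the
    # kernel-induced distance to center i is proportional to 1 - K(x, v_i).
    # eps avoids division by zero when x coincides with a center.
    dist = np.maximum(1.0 - gaussian_kernel(x, centers, sigma), eps)
    weights = dist ** (-1.0 / (m - 1.0))
    return weights / weights.sum()


def fuzzy_predict(x, centers, coefs, intercepts, sigma=1.0, m=2.0):
    # Assumption: each stored pattern i has its own linear model
    # y_i = coefs[i] @ x + intercepts[i]; the final prediction is the
    # membership-weighted average of these local predictions.
    u = kfcm_memberships(x, centers, sigma, m)
    local_predictions = coefs @ x + intercepts
    return float(u @ local_predictions)


# Two hypothetical patterns with different local regressors.
centers = np.array([[0.0, 0.0], [4.0, 4.0]])
coefs = np.array([[1.0, 0.0], [0.0, 2.0]])
intercepts = np.array([0.0, 1.0])

x_new = np.array([2.0, 2.0])
memberships = kfcm_memberships(x_new, centers)
prediction = fuzzy_predict(x_new, centers, coefs, intercepts)
```

In this sketch an example that lies close to one center is predicted almost entirely by that pattern's regressor, while an example between centers blends the local models in proportion to its memberships. Adapting the centers and regression parameters jointly as the stream drifts, which the abstract describes, is outside the scope of the sketch.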
Please use this identifier to cite or link to this item: