Handling concept drift in case-based reasoning systems
No full text available. Access is restricted indefinitely.
We live in a changing world. In data mining, the phenomenon of change in data distribution over time is called concept drift. Traditional machine learning methods face challenges in this dynamic environment due to their lack of learning capabilities. This research aims to develop an adaptive Case-Based Reasoning (CBR) learner which is able to track and quickly adapt to such unforeseeable changes. The literature review reveals that: 1) there is a connection between Case-Based Reasoner Maintenance (CBRM) and concept drift; 2) CBRM research has neglected the question of triggering; 3) the current CBR methods for handling concept drift have several limitations.
This thesis proposes an integrated knowledge adaptation framework for CBR which expands on the relationship between CBRM and concept drift and provides a blueprint for handling concept drift in CBR. Following the framework, this study conducts a theoretical study of the competence model for CBR, which has become an important tool for CBRM. This thesis defines a competence closure model and proves three important properties of the competence closure model which the existing competence model does not possess.
To track and understand concept drift, this research develops a competence-based change detection method. To the best of our knowledge, this method is the first in the literature to attempt to track concept drift through a competence model. This study is the first to find and report that concept drift is also reflected in competence measurement. The proposed competence-based change detection method requires no prior knowledge of data distribution and provides a statistical guarantee of the detected change. In addition, experimental evaluations reveal that the proposed competence-based change detection method is more robust to smaller sample sizes, and can provide a meaningful description of the detected changes.
To facilitate future research in real-time adaptive methods, this research also proposes a competence model update procedure for case deletion. Finally, this research presents a two-stage case-base editing approach. In Stage 1, a Noise-Enhanced Fast Context Switching (NEFCS) algorithm is developed to prevent noise from being included during case retention and to hasten the context switching process in the face of concept drift. In Stage 2, a Recursive Conservation Redundancy Removal (RCRR) algorithm is developed to restrict the growth of the case-base. Experimental evaluations based on public real-world datasets show that these two case-base editing methods perform well on static tasks and demonstrate a significant improvement on time-varying tasks compared to other CBR methods. To conclude, this thesis targets an urgent issue of modern machine learning research. The approach taken in the thesis of building an adaptive case-based learner through maintenance is novel. There has previously been no systematic study on handling concept drift with CBRM. The findings of this thesis contribute to both scientific research and practical applications.