Ensemble Pruning via Individual Contribution Ordering

Publication Type:
Conference Proceeding
Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2010, pp. 871 - 880
Issue Date:
Full metadata record
Files in This Item:
Filename Description Size
Thumbnail2010001754OK.pdf1.16 MB
Adobe PDF
An ensemble is a set of learned models that make decisions collectively. Although an ensemble is usually more accurate than a single learner, existing ensemble methods often tend to construct unnecessarily large ensembles, which increases the memory consumption and computational cost. Ensemble pruning tackles this problem by selecting a subset of ensemble members to form subensembles that are subject to less resource consumption and response time with accuracy that is similar to or better than the original ensemble. In this paper, we analyze the accuracy/diversity trade-off and prove that classifiers that are more accurate and make more predictions in the minority group are more important for subensemble construction. Based on the gained insights, a heuristic metric that considers both accuracy and diversity is proposed to explicitly evaluate each individual classifierâs contribution to the whole ensemble. By incorporating ensemble members in decreasing order of their contributions, subensembles are formed such that users can select the top p percent of ensemble members, depending on their resource availability and tolerable waiting time, for predictions. Experimental results on 26 UCI data sets show that subensembles formed by the proposed EPIC (Ensemble Pruning via Individual Contribution ordering) algorithm outperform the original ensemble and a state-ofthe-art ensemble pruning method, Orientation Ordering (OO) [16].
Please use this identifier to cite or link to this item: