Missing Value Imputation Based on Data Clustering

Publisher:
Springer
Publication Type:
Journal Article
Citation:
Lecture Notes in Computer Science, 2008, 4750 (2008), pp. 128 - 138
Issue Date:
2008-01
We propose an efficient nonparametric missing value imputation method based on clustering, called CMI (Clustering-based Missing value Imputation), for dealing with missing values in target attributes. In our approach, we impute the missing values of an instance A with plausible values generated, using a kernel-based method, from the instances that contain no missing values and are most similar to A. Specifically, we first divide the dataset (including the instances with missing values) into clusters. Next, the missing values of an instance A are filled in with plausible values generated from A's cluster. Extensive experiments show the effectiveness of the proposed method on the missing value imputation task.
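The two steps described in the abstract (cluster all instances, then fill each missing target value from complete instances in the same cluster) can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the kernel, so everything specific here is an assumption made for illustration: plain k-means on the non-target attributes, a Gaussian kernel, a kernel-weighted average of the donors' target values, and the hypothetical names `kmeans`, `cluster_kernel_impute`, `k`, and `bandwidth`. It is not the authors' implementation.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means on tuples of floats (assumed clustering step).

    Requires k <= len(points). Returns one cluster label per point.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Recompute each center as the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def cluster_kernel_impute(X, y, k=2, bandwidth=1.0, seed=0):
    """Fill None entries of the target y (hypothetical helper).

    X holds fully observed attribute tuples; y holds target values,
    with None marking a missing value.
    """
    # Step 1: cluster every instance, including those with a missing target.
    labels = kmeans(X, k, seed=seed)
    observed = [j for j, v in enumerate(y) if v is not None]
    imputed = list(y)
    for i, value in enumerate(y):
        if value is not None:
            continue
        # Step 2: donors are complete instances in the same cluster.
        donors = [j for j in observed if labels[j] == labels[i]]
        if not donors:
            donors = observed  # assumed fallback: use all complete instances
        # Gaussian kernel weights: closer donors count more (assumed kernel).
        weights = [
            math.exp(-math.dist(X[i], X[j]) ** 2 / (2 * bandwidth ** 2))
            for j in donors
        ]
        total = sum(weights)
        if total == 0.0:
            # All weights underflowed; fall back to the plain donor mean.
            imputed[i] = sum(y[j] for j in donors) / len(donors)
        else:
            imputed[i] = sum(w * y[j] for w, j in zip(weights, donors)) / total
    return imputed


if __name__ == "__main__":
    # Toy data: two well-separated groups, one missing target in each.
    X = [(0.0,), (0.2,), (0.1,), (10.0,), (10.2,), (10.1,)]
    y = [1.0, 1.2, None, 5.0, 5.2, None]
    print(cluster_kernel_impute(X, y, k=2, bandwidth=1.0))
```

Restricting donors to the instance's own cluster keeps the kernel estimate local and avoids scanning the whole dataset for every missing value, which is one plausible reading of the efficiency claim in the abstract.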