Missing Value Imputation Based on Data Clustering
- Publication Type: Journal Article
- Lecture Notes in Computer Science, 2008, 4750 (2008), pp. 128 - 138
We propose an efficient nonparametric missing value imputation method based on clustering, called CMI (Clustering-based Missing value Imputation), for dealing with missing values in target attributes. In our approach, the missing values of an instance A are imputed with plausible values generated, using a kernel-based method, from the instances that contain no missing values and are most similar to A. Specifically, we first divide the dataset (including the instances with missing values) into clusters. Next, the missing values of an instance A are filled in with plausible values generated from A's cluster. Extensive experiments show the effectiveness of the proposed method on the missing value imputation task.
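The two steps described in the abstract (cluster first, then impute from complete instances in the same cluster with a kernel-based method) can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithm or the kernel, so this sketch assumes plain k-means on the non-target attributes and a Gaussian kernel-weighted average of the observed target values. The names `sq_dist`, `kmeans`, `cmi_impute`, and the parameters `k` and `bandwidth` are hypothetical and are not taken from the paper.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain k-means (an assumption; the abstract names no clustering method).

    Centroids start at the first k points. Returns one cluster label per point.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members; keep it if empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cmi_impute(features, targets, k=2, bandwidth=1.0):
    """Fill missing targets (None) from complete instances in the same cluster.

    Each missing value becomes a Gaussian-kernel-weighted average of the
    observed targets in the instance's cluster (assumed kernel and estimator).
    """
    labels = kmeans(features, k)
    imputed = list(targets)
    for i, t in enumerate(targets):
        if t is not None:
            continue
        # Donors: complete instances that share the cluster of instance i.
        donors = [j for j, tj in enumerate(targets)
                  if tj is not None and labels[j] == labels[i]]
        if not donors:
            # Assumed fallback: use every complete instance in the dataset.
            donors = [j for j, tj in enumerate(targets) if tj is not None]
        if not donors:
            continue  # nothing observed at all; leave the value missing
        weights = [math.exp(-sq_dist(features[i], features[j])
                            / (2 * bandwidth ** 2)) for j in donors]
        total = sum(weights)
        if total > 0:
            imputed[i] = sum(w * targets[j]
                             for w, j in zip(weights, donors)) / total
        else:
            # All weights underflowed to zero; fall back to the plain mean.
            imputed[i] = sum(targets[j] for j in donors) / len(donors)
    return imputed


# Toy data: two well-separated groups, one missing target in each.
features = [[0.0], [0.2], [0.1], [10.0], [10.2], [10.1]]
targets = [1.0, 1.2, None, 5.0, 5.2, None]
print(cmi_impute(features, targets, k=2, bandwidth=1.0))
```

In this sketch the clustering uses only the non-target attributes, which matches the abstract's setting of missing values in target attributes. Restricting donors to one cluster keeps the kernel sum small, which is one plausible reading of the efficiency claim.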