A Novel Spark-Based Attribute Reduction and Neighborhood Classification for Rough Evidence.

Publisher:
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Publication Type:
Journal Article
Citation:
IEEE Trans Cybern, 2022, PP, (99)
Issue Date:
2022-10-10
Neighborhood classification (NEC) algorithms have been widely used to solve classification problems. Most traditional NEC algorithms use majority voting as the basis for the final decision. However, this mechanism largely ignores the spatial differences and label uncertainty of the neighborhood samples, which may increase the risk of misclassification. In addition, traditional NEC algorithms need to load the entire dataset into memory at once, which is computationally inefficient for large datasets. To address these problems, we propose a novel Spark-based attribute reduction and NEC for rough evidence in this article. Specifically, we first construct a multigranular sample space using a parallel undersampling method. Then, we evaluate the significance of each attribute by the neighborhood rough evidence decision error rate and remove redundant attributes on the different sample subspaces. Based on this attribute reduction algorithm, we design a parallel attribute reduction algorithm that computes equivalence classes in parallel and parallelizes the search for candidate attributes. Finally, we introduce rough evidence into the classification decision of traditional NEC algorithms and parallelize the classification decision process. The proposed algorithms are implemented in the Spark parallel computing framework. Experimental results on both small and large-scale datasets show that the proposed algorithms outperform the benchmark algorithms in both classification accuracy and computational efficiency.
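The limitation of majority voting described in the abstract can be illustrated with a small sketch. The code below is not the authors' rough evidence method, which the abstract does not specify. It only contrasts an equal-weight vote with a simple distance-weighted vote inside a fixed-radius neighborhood. The function names, the radius `delta`, the linear weight `1 - d / delta`, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def neighborhood(train, query, delta):
    """Return (distance, label) pairs for training samples within radius delta."""
    result = []
    for features, label in train:
        d = math.dist(features, query)  # Euclidean distance
        if d <= delta:
            result.append((d, label))
    return result


def majority_vote(neigh):
    """Equal-weight decision: every neighbor counts once, regardless of distance."""
    if not neigh:
        return None  # empty neighborhood, no decision
    counts = Counter(label for _, label in neigh)
    return counts.most_common(1)[0][0]


def distance_weighted_vote(neigh, delta):
    """Illustrative stand-in: closer neighbors contribute more support.

    The linear weight (1 - d / delta) is an assumption, not the paper's formula.
    """
    if not neigh:
        return None
    support = defaultdict(float)
    for d, label in neigh:
        support[label] += 1.0 - d / delta
    return max(support, key=support.get)


# Hypothetical toy data: two "a" samples close to the query,
# three "b" samples near the edge of the neighborhood.
train = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.0), "a"),
    ((0.9, 0.0), "b"),
    ((0.0, 0.9), "b"),
    ((-0.9, 0.0), "b"),
]
query = (0.05, 0.0)
neigh = neighborhood(train, query, delta=1.0)

print(majority_vote(neigh))                # b (three distant votes beat two close ones)
print(distance_weighted_vote(neigh, 1.0))  # a (the two close samples dominate)
```

In this toy case the two decisions disagree because majority voting treats a sample at the edge of the neighborhood the same as one next to the query, which is the spatial-difference problem the abstract refers to.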