Correlated multi-label classification with incomplete label space and class imbalance

Publication Type:
Journal Article
Citation:
ACM Transactions on Intelligent Systems and Technology, 2019, 10 (5)
Issue Date:
2019-09-01
© 2019 Association for Computing Machinery. Multi-label classification is the problem of identifying the multiple labels or categories of new observations based on labeled training data. Multi-labeled data poses several challenges, including class imbalance, label correlation, incomplete multi-label matrices, and noisy and irrelevant features. In this article, we propose an integrated multi-label classification approach with incomplete label space and class imbalance (ML-CIB) that trains the multi-label classification model and addresses these challenges simultaneously. Because it is difficult to obtain a complete label vector for each instance in real-world data, the model learns a new label matrix and captures new label correlations. We also propose a label regularization to handle class imbalance in the new label matrix, and an l1-norm regularization term is incorporated in the objective function to select the relevant sparse features. A multi-label feature selection method (ML-CIB-FS) is presented as a variant of the proposed ML-CIB to show the efficacy of the proposed method in selecting the relevant features. ML-CIB is formulated as a constrained objective function, and we use the accelerated proximal gradient method to solve the resulting optimization problem. Finally, extensive experiments are conducted on 19 regular-scale and large-scale imbalanced multi-labeled datasets. The promising results show that our method significantly outperforms state-of-the-art methods.
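As a rough illustration of the solver family named in the abstract, the sketch below applies an accelerated proximal gradient iteration (in the FISTA style) to a plain l1-regularized least-squares problem. It is not the ML-CIB objective, which the abstract does not spell out: the squared loss, the function names (`soft_threshold`, `grad`, `apg_lasso`), and the parameters `lam`, `step`, and `n_iter` are all assumptions made for the example.

```python
import math

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied element-wise."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]

def grad(X, y, w):
    """Gradient of the assumed smooth loss 0.5 * ||X w - y||^2."""
    # Residual r = X w - y, then X^T r.
    r = [sum(xi * wi for xi, wi in zip(row, w)) - yi for row, yi in zip(X, y)]
    return [sum(row[j] * ri for row, ri in zip(X, r)) for j in range(len(w))]

def apg_lasso(X, y, lam, step, n_iter=200):
    """Minimize 0.5 * ||X w - y||^2 + lam * ||w||_1.

    `step` is assumed to be at most 1 / L, where L is the largest
    eigenvalue of X^T X (the Lipschitz constant of the gradient).
    """
    d = len(X[0])
    w = [0.0] * d   # current iterate
    z = list(w)     # extrapolated (momentum) point
    t = 1.0         # momentum parameter
    for _ in range(n_iter):
        g = grad(X, y, z)
        # Gradient step at z, followed by the l1 proximal step.
        w_new = soft_threshold(
            [zj - step * gj for zj, gj in zip(z, g)], step * lam
        )
        # Nesterov-style momentum update.
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = [wn + ((t - 1.0) / t_new) * (wn - wo) for wn, wo in zip(w_new, w)]
        w, t = w_new, t_new
    return w

if __name__ == "__main__":
    # Toy data: with an identity design matrix the minimizer is the
    # soft-thresholded target, so small coefficients are driven to zero.
    X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    y = [3.0, -0.5, 1.2]
    print(apg_lasso(X, y, lam=1.0, step=1.0))
```

The soft-thresholding step is what makes the l1 penalty produce exact zeros, which is the mechanism behind using such a penalty to select sparse features.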