Model-aware categorical data embedding: a data-driven approach

Publication Type: Journal Article
Citation: Soft Computing, 2018, 22 (11), pp. 3603-3619
Issue Date: 2018-06-01
Filename: Zhao2018_Article_Model-awareCategoricalDataEmbe.pdf
Description: Published Version
Size: 876.05 kB
Format: Adobe PDF
© 2018, Springer-Verlag GmbH Germany, part of Springer Nature. Learning from categorical data is a critical yet challenging task. Current research focuses either on leveraging the complex interactions between and within categorical values to generate a numerical representation, or on designing a model that can handle this type of data directly. However, both paradigms overlook the relation between the data characteristics and the hypothesis of the learning model. In this paper, we propose a model-aware categorical data embedding framework that jointly reveals the intrinsic characteristics of categorical data and optimizes the fitness of the representation for the follow-up learning model. An ELM-aware and an SVM-aware representation method have been instantiated under this framework. Extensive classification experiments with the embedded representation on 17 data sets demonstrate that the proposed framework can significantly improve categorical data representation performance compared with state-of-the-art competitors.