Mix2Vec: Unsupervised mixed data representation

Publisher:
IEEE
Publication Type:
Conference Proceeding
Citation:
Proceedings - 2020 IEEE 7th International Conference on Data Science and Advanced Analytics, DSAA 2020, 2020, 00, pp. 118-127
Issue Date:
2020-10-01
Filename Description Size
dsaa2020-final.pdfAccepted version970.49 kB
Adobe PDF
Full metadata record
Unsupervised representation learning on mixed data is highly challenging but rarely explored. It has to tackle significant challenges related to common issues in real-life mixed data, including sparsity, dynamics and heterogeneity of attributes and values. This work introduces an effective and efficient unsupervised deep representer called Mix2Vec to automatically learn a universal representation of dynamic mixed data with the above complex characteristics. Mix2Vec is empowered with three effective mechanisms: random shuffling prediction, prior distribution matching, and structural informativeness maximization, to tackle the aforementioned challenges. These mechanisms are implemented as an unsupervised deep neural representer Mix2Vec. Mix2Vec converts complex mixed data into vector space-based representations that are universal and comparable to all data objects and transparent and reusable for both unsupervised and supervised learning tasks. Extensive experiments on four large mixed datasets demonstrate that Mix2Vec performs significantly better than state-of-the-art deep representation methods. We also empirically verify the designed mechanisms in terms of representation quality, visualization and capability of enabling better performance of downstream tasks.
Please use this identifier to cite or link to this item: