Coupled Nominal Similarity in Unsupervised Learning

Publisher: ACM
Publication Type: Conference Proceeding
Citation: Proceedings of the 20th ACM International Conference on Information and Knowledge Management, 2011, pp. 973-978
Issue Date: 2011-01
Files in This Item:
Filename: 2010006758OK.pdf
Description: Published version
Size: 1.12 MB
Format: Adobe PDF
Measuring the similarity between nominal objects is not straightforward, especially in unsupervised learning. This paper proposes coupled similarity metrics for nominal objects that consider not only intra-coupled similarity within an attribute (i.e., value frequency distribution) but also inter-coupled similarity between attributes (i.e., feature dependency aggregation). Four metrics are designed to calculate the inter-coupled similarity between two categorical values by considering their relationships with other attributes. Theoretical analysis shows that the four metrics are equally accurate and that the intersection-based metric is more efficient than the others, particularly on large-scale data. Substantial experiments on a wide range of UCI data sets verify the theoretical conclusions. In addition, clustering experiments based on the derived dissimilarity metrics show a significant performance improvement.
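The two kinds of coupling named in the abstract can be illustrated with a minimal sketch. The abstract does not state the formulas, so everything below is an assumption made for illustration: the frequency-based form used for the intra-coupled similarity, the use of conditional probabilities and a minimum over shared values for the intersection-based inter-coupled similarity, and all function and variable names are hypothetical rather than the paper's own definitions.

```python
from collections import Counter


def intra_similarity(column, x, y):
    """Similarity of values x and y within one attribute, from their frequencies.

    Assumed form: f(x) * f(y) / (f(x) + f(y) + f(x) * f(y)).
    """
    counts = Counter(column)
    fx, fy = counts[x], counts[y]
    if fx == 0 or fy == 0:
        return 0.0
    return (fx * fy) / (fx + fy + fx * fy)


def conditional_distribution(target_column, other_column, value):
    """Estimate P(other attribute = w | target attribute = value) as a dict."""
    co_occurring = [w for v, w in zip(target_column, other_column) if v == value]
    total = len(co_occurring)
    if total == 0:
        return {}
    return {w: c / total for w, c in Counter(co_occurring).items()}


def inter_similarity(target_column, other_column, x, y):
    """Intersection-based similarity of x and y with respect to another attribute.

    Assumed form: only values of the other attribute that co-occur with both
    x and y are visited, and the smaller conditional probability is summed.
    """
    px = conditional_distribution(target_column, other_column, x)
    py = conditional_distribution(target_column, other_column, y)
    shared = px.keys() & py.keys()
    return sum(min(px[w], py[w]) for w in shared)


if __name__ == "__main__":
    # Toy data, invented for this illustration only.
    color = ["red", "red", "blue", "blue", "green"]
    shape = ["round", "square", "round", "round", "square"]
    print(intra_similarity(color, "red", "blue"))
    print(inter_similarity(color, shape, "red", "blue"))
```

Restricting the sum to shared values is what makes an intersection-based variant cheap in this sketch: values of the other attribute that co-occur with only one of the two values being compared are never visited.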