Coupled nominal similarity in unsupervised learning

Publication Type:
Conference Proceeding
International Conference on Information and Knowledge Management, Proceedings, 2011, pp. 973 - 978
Issue Date:
Filename: 2010006758OK.pdf (Adobe PDF)
Description: Published version
Size: 1.12 MB
Measuring the similarity between nominal objects is not straightforward, especially in unsupervised learning. This paper proposes coupled similarity metrics for nominal objects, which consider not only intra-coupled similarity within an attribute (i.e., value frequency distribution) but also inter-coupled similarity between attributes (i.e., feature dependency aggregation). Four metrics are designed to calculate the inter-coupled similarity between two categorical values by considering their relationships with other attributes. Theoretical analysis shows that the four metrics are equivalent in accuracy and that the intersection-based metric is more efficient than the others, in particular for large-scale data. Substantial experiments on a wide range of UCI data sets verify the theoretical conclusions. In addition, clustering experiments based on the derived dissimilarity metrics show a significant performance improvement. © 2011 ACM.
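The abstract names the two ingredients of the coupled similarity but gives no formulas. The following is a minimal sketch of how they could be computed, under stated assumptions: the intra-coupled score is taken to be a ratio built from the occurrence counts of the two values, and the inter-coupled score is taken to be an intersection-based sum of the smaller conditional probability over the values of another attribute that both co-occur with. The function names (`intra_similarity`, `conditional_distribution`, `inter_similarity`) and the toy data are hypothetical and are not taken from this record.

```python
from collections import Counter


def intra_similarity(column, x, y):
    """Assumed intra-coupled score: depends only on how often x and y occur."""
    counts = Counter(column)
    fx, fy = counts[x], counts[y]
    if fx == 0 or fy == 0:
        return 0.0
    # Assumed form: product of frequencies over (sum + product).
    return (fx * fy) / (fx + fy + fx * fy)


def conditional_distribution(column, other, value):
    """Empirical P(w | value): values of `other` on rows where `column` == value."""
    co_counts = Counter(w for v, w in zip(column, other) if v == value)
    total = sum(co_counts.values())
    return {w: c / total for w, c in co_counts.items()} if total else {}


def inter_similarity(column, other, x, y):
    """Assumed intersection-based inter-coupled score of x and y w.r.t. `other`."""
    px = conditional_distribution(column, other, x)
    py = conditional_distribution(column, other, y)
    # Only values of `other` that co-occur with both x and y contribute.
    shared = px.keys() & py.keys()
    return sum(min(px[w], py[w]) for w in shared)


# Toy data set with two nominal attributes (illustrative only).
color = ["red", "red", "blue", "blue", "green"]
size = ["S", "M", "S", "M", "L"]

print(intra_similarity(color, "red", "blue"))
print(inter_similarity(color, size, "red", "blue"))
print(inter_similarity(color, size, "red", "green"))
```

In this toy data, "red" and "blue" have identical distributions over `size`, so the assumed inter-coupled score treats them as maximally similar, whereas "red" and "green" share no `size` value and score zero. Restricting the sum to the shared values is what makes the intersection-based variant cheap: values seen with only one of the two categories are never visited.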