Diversified top-k clique search

Publication Type:
Conference Proceeding
Citation:
Proceedings - International Conference on Data Engineering, 2015, 2015-May pp. 387 - 398
Issue Date:
2015-01-01
Filename Description Size
[2015 ICDE] Diversified Top-K Clique Search.pdfPublished version463.24 kB
Adobe PDF
Full metadata record
© 2015 IEEE. Maximal clique enumeration is a fundamental problem in graph theory and has been extensively studied. However, maximal clique enumeration is time-consuming in large graphs and always returns enormous cliques with large overlaps. Motivated by this, in this paper, we study the diversified top-k clique search problem which is to find top-k maximal cliques that can cover most number of nodes in the graph. Diversified top-k clique search can be widely used in a lot of applications including community search, motif discovery, and anomaly detection in large graphs. A naive solution for diversified top-k clique search is to keep all maximal cliques in memory and then find k of them that cover most nodes in the graph by using the approximate greedy max k-cover algorithm. However, such a solution is impractical when the graph is large. In this paper, instead of keeping all maximal cliques in memory, we devise an algorithm to maintain k candidates in the process of maximal clique enumeration. Our algorithm has limited memory footprint and can achieve a guaranteed approximation ratio. We also introduce a novel light-weight PNP-Index, based on which we design an optimal maximal clique maintenance algorithm. We further explore three optimization strategies to avoid enumerating all maximal cliques and thus largely reduce the computational cost. We conduct extensive performance studies on six real graphs one of which contains 0.3 billion edges, and the results demonstrate the high efficiency and effectiveness of our approach.
Please use this identifier to cite or link to this item: