Efficient structural graph clustering: an index-based approach

Publication Type: Journal Article
Citation: VLDB Journal, 2019, 28 (3), pp. 377-399
Issue Date: 2019-06-01
© 2019, Springer-Verlag GmbH Germany, part of Springer Nature.

Graph clustering is a fundamental problem with many applications. The structural graph clustering (SCAN) method identifies not only clusters but also hubs and outliers. However, the clustering results depend heavily on two parameters, ϵ and μ, and the optimal setting of these parameters varies with the properties of the graph and the requirements of the user. In addition, all existing SCAN solutions need to scan at least the whole graph, even if only a small number of vertices belong to clusters. In this paper, we propose an index-based method for SCAN. Based on our index, we cluster the graph for any ϵ and μ in $O(\sum_{C \in \mathbb{C}} |E_C|)$ time, where $\mathbb{C}$ is the set of all clusters in the result and $|E_C|$ is the number of edges in a specific cluster $C$. In other words, the time spent on computing the structural clustering depends only on the result size, not on the size of the original graph. The space complexity of our index is O(m), where m is the number of edges in the graph. To handle dynamic graph updates, we propose algorithms and several optimization techniques for maintaining our index. We also design an index for I/O-efficient query processing. We conduct extensive experiments to evaluate the performance of all our proposed algorithms on 10 real-world networks, the largest of which contains more than 1 billion edges. The experimental results demonstrate that our approaches significantly outperform existing solutions.
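To make the roles of ϵ and μ concrete, here is a minimal Python sketch of the core-vertex test from the standard SCAN formulation, which the abstract itself does not spell out. It assumes the usual definition of structural similarity over closed neighborhoods, σ(u, v) = |N[u] ∩ N[v]| / sqrt(|N[u]| · |N[v]|), and treats a vertex as a core when at least μ vertices of its closed neighborhood have similarity at least ϵ. The function names are hypothetical, and this is the naive whole-graph computation, not the index-based method proposed in the paper.

```python
from math import sqrt


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def closed_neighborhood(adj, u):
    """Return N[u]: the neighbors of u together with u itself."""
    return adj[u] | {u}


def structural_similarity(adj, u, v):
    """Assumed standard SCAN similarity: |N[u] & N[v]| / sqrt(|N[u]| * |N[v]|)."""
    nu = closed_neighborhood(adj, u)
    nv = closed_neighborhood(adj, v)
    return len(nu & nv) / sqrt(len(nu) * len(nv))


def core_vertices(adj, eps, mu):
    """Return vertices whose eps-neighborhood (which includes the vertex) has size >= mu.

    This visits every vertex and every edge, which is the whole-graph
    scan that an index is meant to avoid.
    """
    cores = set()
    for u in adj:
        eps_neighbors = [
            v for v in closed_neighborhood(adj, u)
            if structural_similarity(adj, u, v) >= eps
        ]
        if len(eps_neighbors) >= mu:
            cores.add(u)
    return cores


# A triangle (0, 1, 2) with a pendant vertex 3 attached to vertex 2.
example_adj = build_adjacency([(0, 1), (0, 2), (1, 2), (2, 3)])
example_cores = core_vertices(example_adj, eps=0.8, mu=3)
```

In this small example the triangle vertices qualify as cores while the pendant vertex does not. Changing ϵ or μ changes that outcome, which is why a method that answers queries for arbitrary parameter values without rescanning the graph is attractive.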