Efficient structural node similarity computation on billion-scale graphs
- Springer Science and Business Media LLC
- Publication Type:
- Journal Article
- VLDB Journal, 2021, 30, (3), pp. 471-493
- Issue Date:
Copyright Clearance Process
- Recently Added
- In Progress
- Open Access
This item is open access.
The embargo period expires on 1 Feb 2022
Structural node similarity is widely used in analyzing complex networks. As one of the structural node similarity metrics, role similarity has the good merit of indicating automorphism (isomorphism). Existing algorithms to compute role similarity (e.g., RoleSim and NED) suffer from severe performance bottlenecks and thus cannot handle large real-world graphs. In this paper, we propose a new framework, namely StructSim, to compute nodes’ role similarity. Under this framework, we first prove that StructSim is an admissible role similarity metric based on the maximum matching. While the maximum matching is still too costly to scale, we then devise the BinCount matching that not only is efficient to compute but also guarantees the admissibility of StructSim. BinCount-based StructSim admits a precomputed index to query a single pair of node in O(klog D) time, where k is a small user-defined parameter and D is the maximum node degree. To build the index, we further devise an FM-sketch-based technique that can handle graphs with billions of edges. Extensive empirical studies show that StructSim performs much better than the existing works regarding both effectiveness and efficiency when applied to compute structural node similarities on the real-world graphs.
Please use this identifier to cite or link to this item: