Distance labeling: on parallelism, compression, and ordering

Publisher:
SPRINGER
Publication Type:
Journal Article
Citation:
VLDB Journal, 2022, 31, (1), pp. 129-155
Issue Date:
2022-01-01
Filename Description Size
Li2022_Article_DistanceLabelingOnParallelismC.pdfPublished version2.14 MB
Adobe PDF
Full metadata record
Distance labeling approaches are widely adopted to speed up the online performance of shortest-distance queries. The construction of the distance labeling, however, can be exhaustive, especially on big graphs. For a major category of large graphs, small-world networks, the state-of-the-art approach is pruned landmark labeling (PLL). PLL prunes distance labels based on a node order and directly constructs the pruned labels by performing breadth-first searches in the node order. The pruning technique, as well as the index construction, has a strong sequential nature which hinders PLL from being parallelized. It becomes an urgent issue on massive small-world networks whose index can hardly be constructed by a single thread within a reasonable time. This paper first scales distance labeling on small-world networks by proposing a parallel shortest-distance labeling (PSL) scheme. PSL insightfully converts the PLL’s node-order dependency to a shortest-distance dependence, which leads to a propagation-based parallel labeling in D rounds where D denotes the diameter of the graph. To further scale up PSL, it is critical to reduce the index size. This paper proposes effective index compression techniques based on graph properties as well as label properties; it also explores best practices in using betweenness-based node order to reduce the index size. The efficient betweenness estimation of the graph nodes proposed may be of independent interest to graph practitioners. Extensive experimental results verify our efficiency on billion-scale graphs, near-linear speedup in a multi-core environment, and up to 94 % reduction in the index size.
Please use this identifier to cite or link to this item: