Efficient subgraph matching by postponing Cartesian products
- Publication Type:
- Conference Proceeding
- Proceedings of the ACM SIGMOD International Conference on Management of Data, 2016, 26-June-2016 pp. 1199 - 1214
- Issue Date:
© 2016 ACM. In this paper, we study the problem of subgraph matching that extracts all subgraph isomorphic embeddings of a query graph q in a large data graph G. The existing algorithms for subgraph matching follow Ullmann's backtracking approach; that is, iteratively map query vertices to data vertices by following a matching order of query vertices. It has been shown that the matching order of query vertices is a very important aspect to the efficiency of a subgraph matching algorithm. Recently, many advanced techniques, such as enforcing connectivity and merging similar vertices in query or data graphs, have been proposed to provide an effective matching order with the aim to reduce unpromising intermediate results especially the ones caused by redundant Cartesian products. In this paper, for the first time we address the issue of unpromising results by Cartesian products from "dissimilar" vertices. We propose a new framework by postponing the Cartesian products based on the structure of a query to minimize the redundant Cartesian products. Our second contribution is proposing a new path-based auxiliary data structure, with the size O(|E(G)| × |V(q)|), to generate a matching order and conduct subgraph matching, which significantly reduces the exponential size O(|V(G)||V(q)|-1) of the existing path-based auxiliary data structure, where V(G) and E(G) are the vertex and edge sets of a data graph G, respectively, and V(q) is the vertex set of a query q. Extensive empirical studies on real and synthetic graphs demonstrate that our techniques outperform the state-of-the-art algorithms by up to 3 orders of magnitude.
Please use this identifier to cite or link to this item: