Efficient subgraph matching by postponing Cartesian products

Bi, F; Chang, L; Lin, X; Qin, L; Zhang, W

Publication Type: Conference Proceeding
Citation: Proceedings of the ACM SIGMOD International Conference on Management of Data, 2016, 26-June-2016, pp. 1199-1214
Issue Date: 2016-06-26

Closed Access

	Filename	Description	Size	Format
	Subgraph paper.pdf	Published version	1.05 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Bi, F	en_US
dc.contributor.author	Chang, L	en_US
dc.contributor.author	Lin, X	en_US
dc.contributor.author	Qin, L https://orcid.org/0000-0001-6068-5062	en_US
dc.contributor.author	Zhang, W	en_US
dc.date.issued	2016-06-26	en_US
dc.identifier.citation	Proceedings of the ACM SIGMOD International Conference on Management of Data, 2016, 26-June-2016 pp. 1199 - 1214	en_US
dc.identifier.isbn	9781450335317	en_US
dc.identifier.issn	0730-8078	en_US
dc.identifier.uri	http://hdl.handle.net/10453/121804
dc.description.abstract	© 2016 ACM. In this paper, we study the problem of subgraph matching that extracts all subgraph isomorphic embeddings of a query graph q in a large data graph G. The existing algorithms for subgraph matching follow Ullmann's backtracking approach; that is, iteratively map query vertices to data vertices by following a matching order of query vertices. It has been shown that the matching order of query vertices is a very important aspect to the efficiency of a subgraph matching algorithm. Recently, many advanced techniques, such as enforcing connectivity and merging similar vertices in query or data graphs, have been proposed to provide an effective matching order with the aim to reduce unpromising intermediate results especially the ones caused by redundant Cartesian products. In this paper, for the first time we address the issue of unpromising results by Cartesian products from "dissimilar" vertices. We propose a new framework by postponing the Cartesian products based on the structure of a query to minimize the redundant Cartesian products. Our second contribution is proposing a new path-based auxiliary data structure, with the size O(|E(G)| × |V(q)|), to generate a matching order and conduct subgraph matching, which significantly reduces the exponential size O(|V(G)|^{|V(q)|-1}) of the existing path-based auxiliary data structure, where V(G) and E(G) are the vertex and edge sets of a data graph G, respectively, and V(q) is the vertex set of a query q. Extensive empirical studies on real and synthetic graphs demonstrate that our techniques outperform the state-of-the-art algorithms by up to 3 orders of magnitude.	en_US
dc.relation.ispartof	Proceedings of the ACM SIGMOD International Conference on Management of Data	en_US
dc.relation.isbasedon	10.1145/2882903.2915236	en_US
dc.title	Efficient subgraph matching by postponing Cartesian products	en_US
dc.type	Conference Proceeding
utslib.citation.volume	26-June-2016	en_US
utslib.for	080101 Adaptive Agents and Intelligent Robotics	en_US
utslib.for	080109 Pattern Recognition and Data Mining	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US
pubs.volume	26-June-2016	en_US

Abstract:

© 2016 ACM. In this paper, we study the problem of subgraph matching that extracts all subgraph isomorphic embeddings of a query graph q in a large data graph G. The existing algorithms for subgraph matching follow Ullmann's backtracking approach; that is, iteratively map query vertices to data vertices by following a matching order of query vertices. It has been shown that the matching order of query vertices is a very important aspect to the efficiency of a subgraph matching algorithm. Recently, many advanced techniques, such as enforcing connectivity and merging similar vertices in query or data graphs, have been proposed to provide an effective matching order with the aim to reduce unpromising intermediate results especially the ones caused by redundant Cartesian products. In this paper, for the first time we address the issue of unpromising results by Cartesian products from "dissimilar" vertices. We propose a new framework by postponing the Cartesian products based on the structure of a query to minimize the redundant Cartesian products. Our second contribution is proposing a new path-based auxiliary data structure, with the size O(|E(G)| × |V(q)|), to generate a matching order and conduct subgraph matching, which significantly reduces the exponential size O(|V(G)|^{|V(q)|-1}) of the existing path-based auxiliary data structure, where V(G) and E(G) are the vertex and edge sets of a data graph G, respectively, and V(q) is the vertex set of a query q. Extensive empirical studies on real and synthetic graphs demonstrate that our techniques outperform the state-of-the-art algorithms by up to 3 orders of magnitude.
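
Illustrative sketch (not taken from the paper): the generic Ullmann-style backtracking that the abstract describes can be written in a few lines of Python. This is not the framework or the auxiliary data structure proposed by the authors. It assumes unlabelled, undirected graphs stored as adjacency sets, and it enumerates non-induced embeddings (every query edge must map to a data edge). The names `subgraph_matches`, `stats` and the toy graphs are hypothetical and chosen only for this example. The `stats` counter records how many partial assignments are made, which shows why a matching order that places two non-adjacent query vertices next to each other forms a Cartesian product of their candidates.

```python
from typing import Dict, Hashable, Iterator, List, Optional, Set

Graph = Dict[Hashable, Set[Hashable]]  # undirected graph as adjacency sets


def subgraph_matches(
    q: Graph,
    g: Graph,
    order: List[Hashable],
    stats: Optional[Dict[str, int]] = None,
) -> Iterator[Dict[Hashable, Hashable]]:
    """Yield every injective mapping of q's vertices into g's vertices
    that sends each edge of q to an edge of g, following `order`."""
    mapping: Dict[Hashable, Hashable] = {}
    used: Set[Hashable] = set()

    def extend(i: int) -> Iterator[Dict[Hashable, Hashable]]:
        if i == len(order):
            yield dict(mapping)  # all query vertices are mapped
            return
        u = order[i]
        for v in g:
            if v in used:
                continue  # keep the mapping injective
            # Each already-mapped neighbour of u must map to a neighbour of v.
            if all(mapping[w] in g[v] for w in q[u] if w in mapping):
                mapping[u] = v
                used.add(v)
                if stats is not None:
                    stats["assigned"] = stats.get("assigned", 0) + 1
                yield from extend(i + 1)
                del mapping[u]  # backtrack
                used.discard(v)

    yield from extend(0)


# Toy data graph: a triangle 1-2-3 with a pendant vertex 4 attached to 3.
data_graph: Graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
# Toy query: the path a - b - c.
path_query: Graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}

connected_stats: Dict[str, int] = {}
connected = list(
    subgraph_matches(path_query, data_graph, ["a", "b", "c"], connected_stats)
)

# "a" and "c" are not adjacent in the query, so mapping them back to back
# pairs every candidate of "a" with every unused candidate of "c".
cartesian_stats: Dict[str, int] = {}
cartesian = list(
    subgraph_matches(path_query, data_graph, ["a", "c", "b"], cartesian_stats)
)
```

Both orders return the same set of embeddings; only the number of intermediate assignments differs, which is the effect that matching-order techniques aim to reduce.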

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/121804