I/O efficient ECC graph decomposition via graph reduction

Yuan, L; Qin, L; Lin, X; Chang, L; Zhang, W

I/O efficient ECC graph decomposition via graph reduction

Yuan, L Qin, L

Lin, X

Chang, L Zhang, W

Permalink

Publication Type:: Journal Article
Citation:: VLDB Journal, 2017, 26 (2), pp. 275 - 300
Issue Date:: 2017-04-01

Closed Access

	Filename	Description	Size
	10.1007%2Fs00778-016-0451-4.pdf	Published Version	1.65 MB	Adobe PDF	View/Open

Copyright Clearance Process

Recently Added
In Progress
Closed Access

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Yuan, L	en_US
dc.contributor.author	Qin, L https://orcid.org/0000-0001-6068-5062	en_US
dc.contributor.author	Lin, X https://orcid.org/0000-0003-2396-7225	en_US
dc.contributor.author	Chang, L	en_US
dc.contributor.author	Zhang, W https://orcid.org/0000-0001-6572-2600	en_US
dc.date.issued	2017-04-01	en_US
dc.identifier.citation	VLDB Journal, 2017, 26 (2), pp. 275 - 300	en_US
dc.identifier.issn	1066-8888	en_US
dc.identifier.uri	http://hdl.handle.net/10453/124466
dc.description.abstract	© 2016, Springer-Verlag Berlin Heidelberg. The problem of computing k-edge connected components (k-ECCs) of a graph G for a specific k is a fundamental graph problem and has been investigated recently. In this paper, we study the problem of ECC decomposition, which computes the k-ECCs of a graph G for all possible k values. ECC decomposition can be widely applied in a variety of applications such as graph-topology analysis, community detection, Steiner Component Search, and graph visualization. A straightforward solution for ECC decomposition is to apply the existing k-ECC computation algorithm to compute the k-ECCs for all k values. However, this solution is not applicable to large graphs for two challenging reasons. First, all existing k-ECC computation algorithms are highly memory intensive due to the complex data structures used in the algorithms. Second, the number of possible k values can be very large, resulting in a high computational cost when each k value is independently considered. In this paper, we address the above challenges, and study I/O efficient ECC decomposition via graph reduction. We introduce two elegant graph reduction operators which aim to reduce the size of the graph loaded in memory while preserving the connectivity information of a certain set of edges to be computed for a specific k. We also propose three novel I/O efficient algorithms, Bottom-Up, Top-Down, and Hybrid, that explore the k values in different orders to reduce the redundant computations between different k values. We analyze the I/O and memory costs for all proposed algorithms. In addition, we extend our algorithm to build an efficient index for Steiner Component Search. We show that our index can be used to perform Steiner Component Search in optimal I/Os when only the node information of the graph is allowed to be loaded in memory. In our experiments, we evaluate our algorithms using seven real large datasets with various graph properties, one of which contains 1.95 billion edges. The experimental results show that our proposed algorithms are scalable and efficient.	en_US
dc.relation.ispartof	VLDB Journal	en_US
dc.relation.isbasedon	10.1007/s00778-016-0451-4	en_US
dc.subject.classification	Information Systems	en_US
dc.title	I/O efficient ECC graph decomposition via graph reduction	en_US
dc.type	Journal Article
utslib.citation.volume	2	en_US
utslib.citation.volume	26	en_US
utslib.for	0804 Data Format	en_US
utslib.for	0805 Distributed Computing	en_US
utslib.for	0806 Information Systems	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Science
pubs.organisational-group	/University of Technology Sydney/Faculty of Science/School of Life Sciences
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
utslib.copyright.status	closed_access
pubs.issue	2	en_US
pubs.publication-status	Published	en_US
pubs.volume	26	en_US

Abstract:

© 2016, Springer-Verlag Berlin Heidelberg. The problem of computing k-edge connected components (k-ECCs) of a graph G for a specific k is a fundamental graph problem and has been investigated recently. In this paper, we study the problem of ECC decomposition, which computes the k-ECCs of a graph G for all possible k values. ECC decomposition can be widely applied in a variety of applications such as graph-topology analysis, community detection, Steiner Component Search, and graph visualization. A straightforward solution for ECC decomposition is to apply the existing k-ECC computation algorithm to compute the k-ECCs for all k values. However, this solution is not applicable to large graphs for two challenging reasons. First, all existing k-ECC computation algorithms are highly memory intensive due to the complex data structures used in the algorithms. Second, the number of possible k values can be very large, resulting in a high computational cost when each k value is independently considered. In this paper, we address the above challenges, and study I/O efficient ECC decomposition via graph reduction. We introduce two elegant graph reduction operators which aim to reduce the size of the graph loaded in memory while preserving the connectivity information of a certain set of edges to be computed for a specific k. We also propose three novel I/O efficient algorithms, Bottom-Up, Top-Down, and Hybrid, that explore the k values in different orders to reduce the redundant computations between different k values. We analyze the I/O and memory costs for all proposed algorithms. In addition, we extend our algorithm to build an efficient index for Steiner Component Search. We show that our index can be used to perform Steiner Component Search in optimal I/Os when only the node information of the graph is allowed to be loaded in memory. In our experiments, we evaluate our algorithms using seven real large datasets with various graph properties, one of which contains 1.95 billion edges. The experimental results show that our proposed algorithms are scalable and efficient.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/124466