GPU-Accelerated Approaches for Graph Data Processing

Yu, Yuanhang

GPU-Accelerated Approaches for Graph Data Processing

Yu, Yuanhang

Permalink

Publication Type:: Thesis
Issue Date:: 2023

Open Access

Copyright Clearance Process

Recently Added
In Progress
Open Access

This item is open access.

Adobe PDF

Download thesisAdobe PDF (1.78 MB)

View statistics

Full metadata record

Field	Value	Language
dc.contributor.author	Yu, Yuanhang
dc.date.accessioned	2024-04-02T04:01:27Z
dc.date.available	2024-04-02T04:01:27Z
dc.date.issued	2023
dc.identifier.uri	http://hdl.handle.net/10453/177429
dc.description	University of Technology Sydney. Faculty of Engineering and Information Technology.	en_US.UTF-8
dc.description.abstract	This thesis investigates the increasing necessity of processing large-scale graph data across various domains, like bioinformatics and social networking, where traditional CPU performance lags due to the vastness of data and its inherent complexity, highlighting the need for high-parallelism hardware acceleration. We address this need through GPU-accelerated approaches for three prevalent graph data processing problems, considering the irregularity of data that complicates the use of GPU memory and threads efficiently. The first challenge we tackle is matrix factorization, essential for graph embedding and recommendation systems. Given its data-intensive nature, we harness heterogeneous multiprocessors for parallel processing. Our novel strategy involves non-uniform matrix block partitioning, ensuring optimal GPU resource use and balanced workload distribution, based on a custom cost model predicting the performance of different units. This model, combined with dynamic scheduling, significantly enhances performance and training quality, as proven by extensive experiments with real-world datasets. In the second part, we focus on the approximate k nearest neighbors search, a critical operation in community and anomaly detection. Although graph-based methods show promise, they are bottlenecked by the intensive computation required for high-dimensional data point distances, a perfect candidate for GPU acceleration. We identify inefficiencies in existing GPU-based solutions, specifically the costly data structure operations conducted by the single thread. To overcome this, we develop a GPU-optimized search approach, fully leveraging multi-threading capabilities throughout the search process. Furthermore, we introduce a GPU-accelerated algorithm for constructing high-quality graphs, exhibiting promising parallel efficiency. Benchmarked against high-dimensional datasets, our algorithms demonstrate superior performance in both nearest neighbors search and graph construction. The third problem is subgraph matching, fundamental in a wide range of research fields such as protein interaction analysis and social network analysis. As an NP-hard problem, subgraph matching is inherently challenging. This offers a chance for GPU acceleration. Existing work employs a lightweight filtering method to make a trade-off between high parallelism and efficient pruning performance. We notice that its pruning performance falls short of expectations when the labels on graphs are not diverse. Inspired by this, we propose a novel GPU-friendly filtering approach with strong pruning performance. Additionally, we propose a 1*-phase output scheme to reduce space consumption and optimize write transactions during enumeration. Together with a pipeline method, our approach outperforms existing work on real-world datasets through extensive experiments.	en_US.UTF-8
dc.format	Thesis (PhD)
dc.language.iso	en_US	en_US.UTF-8
dc.relation	https://opus.lib.uts.edu.au/bitstream/10453/177429/1/thesis.pdf
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	The author owns the copyright in this thesis including all reproduction and reuse rights for the work. The work may not be altered without the permission of the copyright owner. Attribution is essential when quoting or paraphrasing from this thesis.
dc.rights	© 2023 Yuanhang Yu
dc.rights	au.edu.uts.lib/cph
dc.title	GPU-Accelerated Approaches for Graph Data Processing	en_US.UTF-8
dc.type	Thesis
utslib.copyright.status	open_access	*

Abstract:

This thesis investigates the increasing necessity of processing large-scale graph data across various domains, like bioinformatics and social networking, where traditional CPU performance lags due to the vastness of data and its inherent complexity, highlighting the need for high-parallelism hardware acceleration. We address this need through GPU-accelerated approaches for three prevalent graph data processing problems, considering the irregularity of data that complicates the use of GPU memory and threads efficiently. The first challenge we tackle is matrix factorization, essential for graph embedding and recommendation systems. Given its data-intensive nature, we harness heterogeneous multiprocessors for parallel processing. Our novel strategy involves non-uniform matrix block partitioning, ensuring optimal GPU resource use and balanced workload distribution, based on a custom cost model predicting the performance of different units. This model, combined with dynamic scheduling, significantly enhances performance and training quality, as proven by extensive experiments with real-world datasets. In the second part, we focus on the approximate k nearest neighbors search, a critical operation in community and anomaly detection. Although graph-based methods show promise, they are bottlenecked by the intensive computation required for high-dimensional data point distances, a perfect candidate for GPU acceleration. We identify inefficiencies in existing GPU-based solutions, specifically the costly data structure operations conducted by the single thread. To overcome this, we develop a GPU-optimized search approach, fully leveraging multi-threading capabilities throughout the search process. Furthermore, we introduce a GPU-accelerated algorithm for constructing high-quality graphs, exhibiting promising parallel efficiency. Benchmarked against high-dimensional datasets, our algorithms demonstrate superior performance in both nearest neighbors search and graph construction. The third problem is subgraph matching, fundamental in a wide range of research fields such as protein interaction analysis and social network analysis. As an NP-hard problem, subgraph matching is inherently challenging. This offers a chance for GPU acceleration. Existing work employs a lightweight filtering method to make a trade-off between high parallelism and efficient pruning performance. We notice that its pruning performance falls short of expectations when the labels on graphs are not diverse. Inspired by this, we propose a novel GPU-friendly filtering approach with strong pruning performance. Additionally, we propose a 1*-phase output scheme to reduce space consumption and optimize write transactions during enumeration. Together with a pipeline method, our approach outperforms existing work on real-world datasets through extensive experiments.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/177429