Smart Contract Vulnerability Detection Based on Generative Adversarial Networks and Graph Matching Networks

Li, Hao

Publication Type: Thesis
Issue Date: 2024

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Li, Hao
dc.date.accessioned	2024-06-07T03:07:06Z
dc.date.available	2024-06-07T03:07:06Z
dc.date.issued	2024
dc.identifier.uri	http://hdl.handle.net/10453/179449
dc.description	University of Technology Sydney. Faculty of Engineering and Information Technology.	en_US.UTF-8
dc.format	Thesis (ME)
dc.language.iso	en_US	en_US.UTF-8
dc.relation	https://opus.lib.uts.edu.au/bitstream/10453/179449/1/thesis.pdf
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	The author owns the copyright in this thesis including all reproduction and reuse rights for the work. The work may not be altered without the permission of the copyright owner. Attribution is essential when quoting or paraphrasing from this thesis.
dc.rights	© 2024 Hao Li
dc.rights	au.edu.uts.lib/cph
dc.title	Smart Contract Vulnerability Detection Based on Generative Adversarial Networks and Graph Matching Networks	en_US.UTF-8
dc.type	Thesis
utslib.copyright.status	open_access	*

Abstract:

Owing to the decentralization and tamper-proof characteristics of blockchain technology, smart contracts have developed rapidly and are widely applied in critical areas such as the Internet of Things, digital management, healthcare, and finance. However, security vulnerabilities in smart contracts have led to significant economic losses. Because smart contracts cannot be modified once deployed on the blockchain, pre-deployment vulnerability detection is crucial. We propose a smart contract vulnerability detection model that combines code embedding with Generative Adversarial Networks (GAN) and can effectively identify integer overflow vulnerabilities. The study goes beyond traditional textual or structural analysis by using the Abstract Syntax Tree of smart contract source code for vectorization that preserves essential contract features. We use the GAN model to generate synthetic contract vector data, which allows the detection model to be built from small-sample data and eases the difficulty of acquiring source code. Our method integrates GAN discriminator feedback and employs both cosine similarity and correlation coefficients for vector similarity analysis. It improves the accuracy of vulnerability detection and has proved effective in practical scenarios. We also propose a second detection method, which uses GAN and Graph Matching Networks (GMN) to identify reentrancy and integer overflow vulnerabilities. Expanding on prior research, we improve code representation by incorporating control flow and data flow from code functions and statements. This approach converts source code into a semantically and structurally rich contract graph, which preserves key contract features and outperforms traditional methods. We explore few-shot learning and use a graph-based GAN to overcome data scarcity when training detection models.
The use of the GMN, an extension of Graph Neural Networks (GNN), improves the efficiency of vulnerability detection. The GMN model uses a cross-graph attention mechanism to calculate the feature similarity between the target contract and the vulnerable contracts. Our research not only improves the precision and efficiency of vulnerability detection models, but also introduces new concepts of vector and graph embedding for machine learning-oriented representation of smart contract source code. Furthermore, the GAN-based model construction approach proposed in this thesis is advantageous for building high-performance detection models from limited data samples. Lastly, our study demonstrates that GNNs, exemplified by the GMN, offer technical advantages and potential for further development, especially in learning and analyzing feature graphs of smart contract source code.
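
The abstract names two similarity computations (cosine similarity with correlation coefficients for contract vectors, and cross-graph attention for contract graphs) without giving formulas. The Python sketch below illustrates standard textbook forms of both. It is not the thesis's implementation: the function names, the use of the Pearson coefficient as the correlation measure, the dot-product scoring inside the attention, and the toy vectors are all assumptions made for illustration. The attention step follows the commonly published Graph Matching Network form, in which each node's mismatch vector is its own embedding minus an attention-weighted sum of the other graph's node embeddings.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pearson_correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson coefficient, computed as the cosine similarity of the mean-centered vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_graph_mismatch(nodes_a: Sequence[Sequence[float]],
                         nodes_b: Sequence[Sequence[float]]) -> List[List[float]]:
    """For each node embedding in graph A, attend over all nodes of graph B
    (dot-product scores, an assumption) and return the mismatch vector
    h_i - sum_j a_ij * h_j. A vector near zero means graph B contains a close match."""
    mismatches = []
    for h_i in nodes_a:
        scores = [sum(x * y for x, y in zip(h_i, h_j)) for h_j in nodes_b]
        weights = softmax(scores)
        attended = [sum(w * h_j[k] for w, h_j in zip(weights, nodes_b))
                    for k in range(len(h_i))]
        mismatches.append([x - y for x, y in zip(h_i, attended)])
    return mismatches


if __name__ == "__main__":
    # Hypothetical embeddings: a target contract vector and a known vulnerable one.
    target = [0.2, 0.9, 0.4]
    vulnerable = [0.25, 0.8, 0.5]
    print(cosine_similarity(target, vulnerable))
    print(pearson_correlation(target, vulnerable))
    # Hypothetical node embeddings for two small contract graphs.
    print(cross_graph_mismatch([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]))
```

In a detection setting, scores like these would be compared against known vulnerable samples. How the thesis combines the two vector measures, or aggregates node-level mismatches into a graph-level score, is not stated in the abstract and is left open here.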

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/179449