Smart Contract Vulnerability Detection Based on Generative Adversarial Networks and Graph Matching Networks

Publication Type: Thesis
Issue Date: 2024
Owing to the decentralized and tamper-proof nature of blockchain technology, smart contracts have developed rapidly and are now widely applied in critical areas such as the Internet of Things, digital management, healthcare, and finance. However, security vulnerabilities in smart contracts have led to significant economic losses. Once deployed on the blockchain, a smart contract cannot be modified, so pre-deployment vulnerability detection is crucial. We propose a smart contract vulnerability detection model that combines code embedding with Generative Adversarial Networks (GAN) and can effectively identify integer overflow vulnerabilities. The study advances beyond traditional textual or structural analysis by using the Abstract Syntax Tree of the smart contract source code for vectorization while preserving essential contract features. We use the GAN model to generate synthetic contract vector data, which makes it possible to build the detection model from small-sample data and reduces the difficulty of acquiring source code. Our method integrates GAN discriminator feedback and employs both cosine similarity and correlation coefficients for vector similarity analysis. It improves the accuracy of vulnerability detection and has proved effective in practical scenarios. We also propose a second detection method, based on GAN and Graph Matching Networks (GMN), that identifies reentrancy and integer overflow vulnerabilities. Expanding on prior research, we improve code representation by incorporating control flow and data flow from code functions and statements. This approach converts source code into a semantically and structurally rich contract graph, which preserves key contract features and outperforms traditional methods. We explore few-shot learning and use a graph-based GAN to overcome data scarcity when training detection models.
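The vector similarity analysis mentioned above for the first model can be illustrated with a minimal sketch. It assumes each contract has already been embedded as a fixed-length list of floats; the function names and the equal weighting in `combined_score` are hypothetical and are not taken from the thesis, which does not state how the two measures or the discriminator feedback are combined.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


def combined_score(target, reference, weight=0.5):
    """Hypothetical blend of the two measures (weighting is an assumption)."""
    return (weight * cosine_similarity(target, reference)
            + (1.0 - weight) * pearson_correlation(target, reference))
```

In such a setup, a target contract vector would be scored against vectors of known vulnerable contracts, and a score above a chosen threshold would flag the contract for review. The threshold itself would have to be tuned on labeled data.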
The use of the GMN, an extension of Graph Neural Networks (GNN), enhances the efficiency of vulnerability detection. The GMN model uses a cross-graph attention mechanism to calculate the feature similarity between the target contract and known vulnerable contracts. Our research not only enhances the precision and efficiency of vulnerability detection models, but also introduces new concepts of vector and graph embedding for machine-learning-oriented representation of smart contract source code. Furthermore, the GAN-based model construction approach proposed in this thesis is advantageous for building high-performance detection models with limited data samples. Lastly, our study demonstrates that GNNs, exemplified by the GMN, offer technical advantages and potential for further development, especially in learning and analyzing feature graphs of smart contract source code.
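The cross-graph attention mechanism mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the thesis's model: it follows the commonly described GMN formulation, in which each node of one graph attends over all nodes of the other graph and receives the attention-weighted difference as a matching signal. The dot-product similarity, the plain-list node embeddings, and the function names are assumptions made for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross_graph_attention(target_nodes, reference_nodes):
    """Return one matching vector per target node.

    For target node h_i and reference nodes h_j:
        a_ij = softmax_j(h_i . h_j)        (dot-product similarity assumed)
        mu_i = sum_j a_ij * (h_i - h_j) = h_i - sum_j a_ij * h_j
    A matching vector near zero means the node is well matched in the
    other graph.
    """
    matches = []
    for h_i in target_nodes:
        weights = softmax([dot(h_i, h_j) for h_j in reference_nodes])
        attended = [
            sum(w * h_j[k] for w, h_j in zip(weights, reference_nodes))
            for k in range(len(h_i))
        ]
        matches.append([x - a for x, a in zip(h_i, attended)])
    return matches
```

In a full GMN these matching vectors would be fed into each node update alongside the usual within-graph messages, and the final graph-level embeddings of the target and vulnerable contract graphs would be compared with a similarity function.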