DeepWukong: Statically Detecting Software Vulnerabilities Using Deep Graph Neural Network

Cheng, X; Wang, H; Hua, J; Xu, G; Sui, Y

Publisher: Association for Computing Machinery (ACM)
Publication Type: Journal Article
Citation: ACM Transactions on Software Engineering and Methodology, 2021, 30, (3), pp. 1-33
Issue Date: 2021-05-01

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Cheng, X
dc.contributor.author	Wang, H
dc.contributor.author	Hua, J
dc.contributor.author	Xu, G
dc.contributor.author	Sui, Y https://orcid.org/0000-0002-9510-6574
dc.date.accessioned	2022-04-12T21:28:05Z
dc.date.available	2022-04-12T21:28:05Z
dc.date.issued	2021-05-01
dc.identifier.citation	ACM Transactions on Software Engineering and Methodology, 2021, 30, (3), pp. 1-33
dc.identifier.issn	1049-331X
dc.identifier.issn	1557-7392
dc.identifier.uri	http://hdl.handle.net/10453/156137
dc.language	en
dc.publisher	Association for Computing Machinery (ACM)
dc.relation.ispartof	ACM Transactions on Software Engineering and Methodology
dc.relation.isbasedon	10.1145/3436877
dc.rights	info:eu-repo/semantics/openAccess
dc.subject	0803 Computer Software, 0806 Information Systems
dc.subject.classification	Software Engineering
dc.title	DeepWukong: Statically Detecting Software Vulnerabilities Using Deep Graph Neural Network
dc.type	Journal Article
utslib.citation.volume	30
utslib.for	0803 Computer Software
utslib.for	0806 Information Systems
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
utslib.copyright.status	open_access	*
dc.date.updated	2022-04-12T21:28:02Z
pubs.issue	3
pubs.publication-status	Published
pubs.volume	30
utslib.citation.issue	3

Abstract:

Static bug detection has shown its effectiveness in detecting well-defined memory errors, e.g., memory leaks, buffer overflows, and null dereference. However, modern software systems have a wide variety of vulnerabilities. These vulnerabilities are extremely complicated with sophisticated programming logic, and these bugs are often caused by different bad programming practices, challenging existing bug detection solutions. It is hard and labor-intensive to develop precise and efficient static analysis solutions for different types of vulnerabilities, particularly for those that may not have a clear specification as the traditional well-defined vulnerabilities. This article presents DeepWukong, a new deep-learning-based embedding approach to static detection of software vulnerabilities for C/C++ programs. Our approach makes a new attempt by leveraging advanced recent graph neural networks to embed code fragments in a compact and low-dimensional representation, producing a new code representation that preserves high-level programming logic (in the form of control- and data-flows) together with the natural language information of a program. Our evaluation studies the top 10 most common C/C++ vulnerabilities during the past 3 years. We have conducted our experiments using 105,428 real-world programs by comparing our approach with four well-known traditional static vulnerability detectors and three state-of-the-art deep-learning-based approaches. The experimental results demonstrate the effectiveness of our research and have shed light on the promising direction of combining program analysis with deep learning techniques to address the general static code analysis challenges.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/156137