Machine-learning-guided typestate analysis for static use-after-free detection

Yan, H; Sui, Y; Chen, S; Xue, J

Publication Type: Conference Proceeding
Citation: ACM International Conference Proceeding Series, 2017, Part F132521, pp. 42-54
Issue Date: 2017-12-04

Closed Access

Filename	Description	Size	Format
acsac17.pdf	Published version	1.33 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Yan, H	en_US
dc.contributor.author	Sui, Y https://orcid.org/0000-0002-9510-6574	en_US
dc.contributor.author	Chen, S	en_US
dc.contributor.author	Xue, J	en_US
dc.date.issued	2017-12-04	en_US
dc.identifier.citation	ACM International Conference Proceeding Series, 2017, Part F132521 pp. 42 - 54	en_US
dc.identifier.isbn	9781450353458	en_US
dc.identifier.uri	http://hdl.handle.net/10453/121642
dc.relation.ispartof	ACM International Conference Proceeding Series	en_US
dc.relation.isbasedon	10.1145/3134600.3134620	en_US
dc.title	Machine-learning-guided typestate analysis for static use-after-free detection	en_US
dc.type	Conference Proceeding
utslib.citation.volume	Part F132521	en_US
utslib.for	0803 Computer Software	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US
pubs.volume	Part F132521	en_US

Abstract:

Typestate analysis relies on pointer analysis for detecting temporal memory safety errors, such as use-after-free (UAF). For large programs, scalable pointer analysis is usually imprecise in analyzing their hard "corner cases", such as infeasible paths, recursion cycles, loops, arrays, and linked lists. Due to a sound over-approximation of the points-to information, a large number of spurious aliases will be reported conservatively, causing the corresponding typestate analysis to report a large number of false alarms. Thus, the usefulness of typestate analysis for heap-intensive clients, like UAF detection, becomes rather limited, in practice. We introduce Tac, a static UAF detector that bridges the gap between typestate and pointer analyses by machine learning. Tac learns the correlations between program features and UAF-related aliases by using a Support Vector Machine (SVM) and applies this knowledge to further disambiguate the UAF-related aliases reported imprecisely by the pointer analysis so that only the ones validated by its SVM classifier are further investigated by the typestate analysis. Despite its unsoundness, Tac represents a practical typestate analysis approach for UAF detection. We have implemented Tac in LLVM-3.8.0 and evaluated it using a set of eight open-source C/C++ programs. The results show that Tac is effective (in terms of finding 5 known CVE vulnerabilities, 1 known bug, and 8 new bugs with a low false alarm rate) and scalable (in terms of analyzing a large codebase with 2,098 KLOC in just over 4 hours).
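
The idea in the abstract can be illustrated with a toy sketch. This is not Tac's implementation: it assumes a straight-line trace of "free" and "use" events, a hand-written points-to map standing in for an over-approximating pointer analysis, and a hypothetical `accept_alias` predicate standing in for the trained SVM classifier. All names below are invented for illustration.

```python
from enum import Enum


class State(Enum):
    """Typestates of an abstract heap object in this toy model."""
    LIVE = "live"
    DEAD = "dead"


def detect_uaf(trace, points_to, accept_alias=lambda freed_ptr, used_ptr: True):
    """Report uses of objects that are in the DEAD typestate.

    trace        : list of (op, pointer) pairs, op is "free" or "use"
    points_to    : pointer -> set of abstract objects it may point to
                   (may contain spurious entries, as a conservative
                   pointer analysis would)
    accept_alias : hypothetical filter deciding whether the alias between
                   the freeing pointer and the using pointer is plausible;
                   the default accepts every alias the points-to map reports
    """
    state = {}      # abstract object -> State (absent means LIVE)
    freed_via = {}  # abstract object -> pointer used at its free site
    alarms = []
    for index, (op, ptr) in enumerate(trace):
        for obj in sorted(points_to.get(ptr, ())):
            current = state.get(obj, State.LIVE)
            if current is State.DEAD and op == "use":
                free_ptr = freed_via[obj]
                # The same pointer needs no alias check; otherwise ask the filter.
                if ptr == free_ptr or accept_alias(free_ptr, ptr):
                    alarms.append((index, ptr, obj))
            elif current is State.LIVE and op == "free":
                state[obj] = State.DEAD
                freed_via[obj] = ptr
            # Double frees are ignored in this sketch.
    return alarms


# p is freed, then q and r are used. The points-to map claims both may
# alias p through object o1; assume the (p, q) alias is spurious.
trace = [("free", "p"), ("use", "q"), ("use", "r")]
points_to = {"p": {"o1"}, "q": {"o1", "o2"}, "r": {"o1"}}


def reject_pq(freed_ptr, used_ptr):
    """Hard-coded stand-in for a learned classifier's verdict."""
    return (freed_ptr, used_ptr) != ("p", "q")


unfiltered = detect_uaf(trace, points_to)
filtered = detect_uaf(trace, points_to, accept_alias=reject_pq)
```

Without the filter, both uses are flagged; with it, only the use through `r` remains. The filter can also discard a real alias, which is the kind of unsoundness the abstract acknowledges.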

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/121642