Compacting points-to sets through object clustering

Barbar, M; Sui, Y

Publisher: Association for Computing Machinery (ACM)
Publication Type: Journal Article
Citation: Proceedings of the ACM on Programming Languages, 2021, 5, (OOPSLA), pp. 1-27
Issue Date: 2021-10-01

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Barbar, M
dc.contributor.author	Sui, Y https://orcid.org/0000-0002-9510-6574
dc.date.accessioned	2022-04-07T20:46:23Z
dc.date.available	2022-04-07T20:46:23Z
dc.date.issued	2021-10-01
dc.identifier.citation	Proceedings of the ACM on Programming Languages, 2021, 5, (OOPSLA), pp. 1-27
dc.identifier.issn	2475-1421
dc.identifier.uri	http://hdl.handle.net/10453/155983
dc.description.abstract	Inclusion-based set constraint solving is the most popular technique for whole-program points-to analysis, whereby an analysis is typically formulated as repeatedly resolving constraints between points-to sets of program variables. The set union operation is central to this process. The number of points-to sets can grow as analyses become more precise and input programs become larger, resulting in more time spent performing unions and more space used storing these points-to sets. Most existing approaches focus on improving scalability of precise points-to analyses from an algorithmic perspective and there has been less research into improving the data structures behind the analyses. Bit-vectors, one of the more popular data structures, have been used in several mainstream analysis frameworks to represent points-to sets. To store memory objects in bit-vectors, objects need to be mapped to integral identifiers. We observe that this object-to-identifier mapping is critical for a compact points-to set representation and the set union operation. If objects in the same points-to sets (co-pointees) are not given numerically close identifiers, points-to resolution can cost significantly more space and time. Without data on the unpredictable points-to relations which would be discovered by the analysis, finding an ideal mapping is extremely challenging. In this paper, we present a new approach to inclusion-based analysis by compacting points-to sets through object clustering. Inspired by recent staged analysis where an auxiliary analysis produces results approximating a more precise main analysis, we formulate points-to set compaction as an optimisation problem solved by integer programming using constraints generated from the auxiliary analysis's results in order to produce an effective mapping. We then develop a more approximate mapping, computed much more efficiently, using hierarchical clustering to compact bit-vectors. We also develop an improved representation of bit-vectors (called core bit-vectors) to fully take advantage of the newly produced mapping. Our approach requires no algorithmic change to the points-to analysis. We evaluate our object clustering on flow-sensitive points-to analysis using 8 open-source programs (>3.1 million lines of LLVM instructions) and our results show that our approach can successfully improve the analysis with up to a 1.83× speedup and up to a 4.05× reduction in memory usage.
dc.language	en
dc.publisher	Association for Computing Machinery (ACM)
dc.relation	http://purl.org/au-research/grants/arc/DP200101328
dc.relation	http://purl.org/au-research/grants/arc/DP210101348
dc.relation.ispartof	Proceedings of the ACM on Programming Languages
dc.relation.isbasedon	10.1145/3485547
dc.rights	info:eu-repo/semantics/openAccess
dc.title	Compacting points-to sets through object clustering
dc.type	Journal Article
utslib.citation.volume	5
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
utslib.copyright.status	open_access	*
dc.date.updated	2022-04-07T20:46:15Z
pubs.issue	OOPSLA
pubs.publication-status	Published
pubs.volume	5
utslib.citation.issue	OOPSLA

Abstract:

Inclusion-based set constraint solving is the most popular technique for whole-program points-to analysis, whereby an analysis is typically formulated as repeatedly resolving constraints between points-to sets of program variables. The set union operation is central to this process. The number of points-to sets can grow as analyses become more precise and input programs become larger, resulting in more time spent performing unions and more space used storing these points-to sets. Most existing approaches focus on improving scalability of precise points-to analyses from an algorithmic perspective and there has been less research into improving the data structures behind the analyses. Bit-vectors, one of the more popular data structures, have been used in several mainstream analysis frameworks to represent points-to sets. To store memory objects in bit-vectors, objects need to be mapped to integral identifiers. We observe that this object-to-identifier mapping is critical for a compact points-to set representation and the set union operation. If objects in the same points-to sets (co-pointees) are not given numerically close identifiers, points-to resolution can cost significantly more space and time. Without data on the unpredictable points-to relations which would be discovered by the analysis, finding an ideal mapping is extremely challenging. In this paper, we present a new approach to inclusion-based analysis by compacting points-to sets through object clustering. Inspired by recent staged analysis where an auxiliary analysis produces results approximating a more precise main analysis, we formulate points-to set compaction as an optimisation problem solved by integer programming using constraints generated from the auxiliary analysis's results in order to produce an effective mapping. We then develop a more approximate mapping, computed much more efficiently, using hierarchical clustering to compact bit-vectors. We also develop an improved representation of bit-vectors (called core bit-vectors) to fully take advantage of the newly produced mapping. Our approach requires no algorithmic change to the points-to analysis. We evaluate our object clustering on flow-sensitive points-to analysis using 8 open-source programs (>3.1 million lines of LLVM instructions) and our results show that our approach can successfully improve the analysis with up to a 1.83× speedup and up to a 4.05× reduction in memory usage.
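The abstract's observation that co-pointees should receive numerically close identifiers can be illustrated with a small sketch. This is not the authors' implementation: the 64-bit block width, the object names, the three points-to sets, and the hand-picked "clustered" ordering are all assumptions made for illustration. The sketch only counts how many fixed-width blocks a sparse bit-vector would need to hold the same sets under two different object-to-identifier mappings.

```python
WORD_BITS = 64  # assumed block width; real sparse bit-vectors may use other sizes


def words_used(points_to_set, obj_to_id):
    """Count the distinct WORD_BITS-wide blocks a sparse bit-vector would
    allocate to store this set under the given object-to-identifier mapping."""
    return len({obj_to_id[obj] // WORD_BITS for obj in points_to_set})


# Hypothetical memory objects and points-to sets, standing in for what an
# auxiliary analysis might report.
objects = [f"o{i}" for i in range(256)]
points_to_sets = [
    {"o0", "o64", "o128", "o192"},
    {"o1", "o65", "o129", "o193"},
    {"o0", "o64", "o1", "o65"},
]

# Mapping 1: identifiers follow creation order, so co-pointees land far apart.
naive = {obj: i for i, obj in enumerate(objects)}

# Mapping 2: co-pointees are placed first and given consecutive identifiers.
# The order is chosen by hand here; the paper derives its mapping via integer
# programming or hierarchical clustering instead.
clustered_order = ["o0", "o64", "o128", "o192", "o1", "o65", "o129", "o193"]
already_placed = set(clustered_order)
rest = [obj for obj in objects if obj not in already_placed]
clustered = {obj: i for i, obj in enumerate(clustered_order + rest)}

naive_words = sum(words_used(s, naive) for s in points_to_sets)
clustered_words = sum(words_used(s, clustered) for s in points_to_sets)

print(naive_words)      # 10 blocks across the three sets
print(clustered_words)  # 3 blocks across the three sets
```

Under these assumptions the same three sets need 10 blocks with the creation-order mapping and 3 with the clustered one. Fewer blocks means less storage and fewer block-wise operations per union, which is the effect the paper's mapping aims for at scale.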

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/155983