Static malware clustering using enhanced deep embedding method

Ng, CK; Jiang, F; Zhang, LY; Zhou, W

Publication Type: Conference Proceeding
Citation: Concurrency Computation, 2019, 31 (19)
Issue Date: 2019-10-10

Closed Access

Filename	Description	Size	Format
Static malware clustering using enhanced deep.pdf	Published version	1.25 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Ng, CK	en_US
dc.contributor.author	Jiang, F	en_US
dc.contributor.author	Zhang, LY	en_US
dc.contributor.author	Zhou, W https://orcid.org/0000-0002-1680-2521	en_US
dc.date.issued	2019-10-10	en_US
dc.identifier.citation	Concurrency Computation, 2019, 31 (19)	en_US
dc.identifier.issn	1532-0626	en_US
dc.identifier.uri	http://hdl.handle.net/10453/137328
dc.description.abstract	© 2019 John Wiley & Sons, Ltd. Malware refers to any software, programs, or files that are intentionally utilised to compromise a system and cause unexpected losses to end-users, such as economic losses or privacy breaches. The rapid growth of malware makes it impossible to keep up with its progress merely via human intervention or manual analysis. One challenge of human-oriented approaches is that they create a backlog and cannot keep up with the development of malware. Hence, an efficient method is urgently needed to analyse malware effectively and identify it accurately. Malware clustering has been extensively studied in the machine learning area with regard to distance functions, grouping algorithms, and cluster validation. A large number of research studies have used behavioral analysis for clustering to achieve high malware detection performance. However, behavioral approaches achieve better detection performance at the price of high computational cost. To date, little work has focused on deep learning representations for malware clustering. Therefore, in this paper, we propose an enhanced deep embedded clustering method to facilitate an effective and efficient malware clustering process. The new method takes advantage of linear dimensionality reduction and a customised deep neural network to learn malware representations in an orthogonal space and perform cluster assignments. Our experimental results demonstrate that the proposed clustering model outperforms the traditional K-means method with regard to the enhanced features using various auto-encoders, pre-trained weights, and principal component analysis (PCA).	en_US
dc.relation.ispartof	Concurrency Computation	en_US
dc.relation.isbasedon	10.1002/cpe.5234	en_US
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject.classification	Distributed Computing	en_US
dc.title	Static malware clustering using enhanced deep embedding method	en_US
dc.type	Conference Proceeding
utslib.citation.volume	19	en_US
utslib.citation.volume	31	en_US
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
utslib.for	0803 Computer Software	en_US
utslib.for	0805 Distributed Computing	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
utslib.copyright.status	closed_access	*
pubs.issue	19	en_US
pubs.publication-status	Published	en_US
pubs.volume	31	en_US

Abstract:

© 2019 John Wiley & Sons, Ltd. Malware refers to any software, programs, or files that are intentionally utilised to compromise a system and cause unexpected losses to end-users, such as economic losses or privacy breaches. The rapid growth of malware makes it impossible to keep up with its progress merely via human intervention or manual analysis. One challenge of human-oriented approaches is that they create a backlog and cannot keep up with the development of malware. Hence, an efficient method is urgently needed to analyse malware effectively and identify it accurately. Malware clustering has been extensively studied in the machine learning area with regard to distance functions, grouping algorithms, and cluster validation. A large number of research studies have used behavioral analysis for clustering to achieve high malware detection performance. However, behavioral approaches achieve better detection performance at the price of high computational cost. To date, little work has focused on deep learning representations for malware clustering. Therefore, in this paper, we propose an enhanced deep embedded clustering method to facilitate an effective and efficient malware clustering process. The new method takes advantage of linear dimensionality reduction and a customised deep neural network to learn malware representations in an orthogonal space and perform cluster assignments. Our experimental results demonstrate that the proposed clustering model outperforms the traditional K-means method with regard to the enhanced features using various auto-encoders, pre-trained weights, and principal component analysis (PCA).
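
The abstract does not spell out how cluster assignments are computed. As a rough illustration only, the sketch below shows the soft-assignment step of the widely used deep embedded clustering (DEC) formulation, on which an "enhanced" variant would plausibly build; this is an assumption, not the authors' confirmed method. The PCA reduction and the neural network encoder are omitted, and the toy embeddings, centroids, and all function names are hypothetical.

```python
import math


def soft_assign(embeddings, centroids, alpha=1.0):
    """Student's t soft assignment q[i][j] of point i to cluster j (standard DEC)."""
    q = []
    for z in embeddings:
        row = []
        for mu in centroids:
            dist_sq = sum((a - b) ** 2 for a, b in zip(z, mu))
            row.append((1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])
    return q


def target_distribution(q):
    """Sharpened targets p[i][j]: square q, divide by soft cluster size, renormalise."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def kl_divergence(p, q):
    """KL(P || Q) summed over all points and clusters, the usual DEC clustering loss."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
        if pij > 0.0
    )


# Hypothetical 2-D embeddings standing in for encoder outputs of four samples.
embeddings = [[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [2.9, 3.0]]
# Hypothetical centroids, for example as initialised by K-means.
centroids = [[0.0, 0.0], [3.0, 3.0]]

q = soft_assign(embeddings, centroids)
p = target_distribution(q)
labels = [row.index(max(row)) for row in q]  # hard cluster label per sample
loss = kl_divergence(p, q)
```

In a full pipeline of this kind, the loss would be minimised by gradient descent over both the encoder weights and the centroids, whereas plain K-means only alternates hard assignments and centroid updates in a fixed feature space.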

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/137328