Static malware clustering using enhanced deep embedding method

Publication Type:
Conference Proceeding
Concurrency Computation, 2019, 31 (19)
Issue Date:
Filename: Static malware clustering using enhanced deep.pdf
Description: Published version
Size: 1.25 MB
Format: Adobe PDF
© 2019 John Wiley & Sons, Ltd. Malware refers to any software, program, or file that is intentionally used to compromise a system and cause unexpected losses to end-users, such as financial losses or privacy breaches. The rapid growth of malware makes it impossible to keep up merely through human intervention or manual analysis: human-driven approaches create analysis backlogs and cannot keep pace with how malware evolves. Hence, an efficient method is urgently needed to analyse and accurately identify malware. Malware clustering has been studied extensively in the machine learning area with regard to distance functions, grouping algorithms, and cluster validation. Many studies have used behavioural analysis for clustering to achieve high malware detection performance. However, behavioural approaches trade this better detection performance against a high computational cost. To date, little work has focused on deep learning representations for malware clustering. Therefore, in this paper, we propose an enhanced deep embedded clustering method to facilitate an effective and efficient malware clustering process. The new method takes advantage of linear dimensionality reduction and a customised deep neural network to learn malware representations in an orthogonal space and perform cluster assignments. Our experimental results demonstrate that the proposed clustering model outperforms the traditional K-means method when using features enhanced with various auto-encoders, pre-trained weights, and principal component analysis (PCA).
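The abstract does not spell out the clustering objective, so the following is only a minimal sketch of the cluster-assignment step as it appears in the standard deep embedded clustering formulation: a Student's t kernel turns distances between embeddings and centroids into soft assignments, and a sharpened target distribution gives a KL-divergence loss to train against. Whether the paper's enhanced variant uses exactly these formulas is an assumption. The function names, the toy embeddings, and the fixed centroids are hypothetical, and the PCA and auto-encoder stages that would produce real embeddings are omitted.

```python
import math


def soft_assign(embeddings, centroids, alpha=1.0):
    """Student's t soft assignment: q[i][j] is the weight of point i in cluster j."""
    q = []
    for z in embeddings:
        row = []
        for mu in centroids:
            dist_sq = sum((a - b) ** 2 for a, b in zip(z, mu))
            row.append((1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])  # normalise each row to sum to 1
    return q


def target_distribution(q):
    """Sharpened targets: p[i][j] proportional to q[i][j]**2 / (soft cluster size)."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def kl_divergence(p, q):
    """KL(P || Q) summed over all points and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for prow, qrow in zip(p, q)
        for pij, qij in zip(prow, qrow)
        if pij > 0.0
    )


# Hypothetical 2-D embeddings standing in for encoder outputs (assumption).
embeddings = [[0.0, 0.1], [0.2, -0.1], [3.0, 3.1], [2.9, 2.8]]
# Hypothetical centroids, e.g. as a K-means initialisation might provide.
centroids = [[0.0, 0.0], [3.0, 3.0]]

q = soft_assign(embeddings, centroids)
p = target_distribution(q)
labels = [max(range(len(row)), key=row.__getitem__) for row in q]
loss = kl_divergence(p, q)
```

In a full pipeline of this kind, the loss would be minimised by gradient descent over both the encoder weights and the centroids, with the targets recomputed periodically. Here the values are computed once to show the shape of the calculation.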