A Game-Theoretic Method for Defending Against Advanced Persistent Threats in Cyber Systems

Zhang, L; Zhu, T; Hussain, FK; Ye, D; Zhou, W

Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Publication Type: Journal Article
Citation: IEEE Transactions on Information Forensics and Security, 2023, 18, (99), pp. 1349-1364
Issue Date: 2023-01-01

Closed Access

Full metadata record

Field	Value	Language
dc.contributor.author	Zhang, L
dc.contributor.author	Zhu, T https://orcid.org/0000-0003-3411-7947
dc.contributor.author	Hussain, FK
dc.contributor.author	Ye, D https://orcid.org/0000-0002-7561-0992
dc.contributor.author	Zhou, W https://orcid.org/0000-0002-1680-2521
dc.date.accessioned	2023-03-09T22:37:18Z
dc.date.available	2023-03-09T22:37:18Z
dc.date.issued	2023-01-01
dc.identifier.citation	IEEE Transactions on Information Forensics and Security, 2023, 18, (99), pp. 1349-1364
dc.identifier.issn	1556-6013
dc.identifier.issn	1556-6021
dc.identifier.uri	http://hdl.handle.net/10453/166904
dc.description.abstract	Advanced persistent threats (APTs) are one of today's major threats to cyber security. Highly determined attackers, along with novel and evasive exfiltration techniques, mean that APT attacks elude most intrusion detection and prevention systems. The result has been significant losses for governments, organizations, and commercial entities. Intriguingly, despite greater efforts to defend against APTs in recent times, frequent upgrades in defense strategies are not leading to increased security and protection. In this paper, we demonstrate this phenomenon in an appropriately designed APT rivalry game that captures the interactions between attackers and defenders. We show that the defender's strategy adjustments actually leave useful information for the attackers, and thus intelligent and rational attackers can improve themselves by analyzing this information. Hence, a critical part of a defense strategy must be finding a suitable time to adjust that strategy so that attackers learn the least possible information. Another challenge for defenders is determining how to make the best use of their resources to achieve a satisfactory defense level. In support of these efforts, we derive the optimal timings of a player's strategy adjustment in terms of information leakage, which form a family of Nash equilibria. Moreover, two learning mechanisms are proposed to help defenders find an appropriate defense level and allocate their resources reasonably. One is based on adversarial bandits, and the other is based on deep reinforcement learning. Experimental simulations show the rationale behind the game and the optimality of the equilibria. The results also demonstrate that players can indeed improve themselves by learning from past experiences, which shows the necessity of specifying optimal strategy adjustment timings when defending against APTs.
dc.language	en
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.relation	http://purl.org/au-research/grants/arc/DP200100946
dc.relation.ispartof	IEEE Transactions on Information Forensics and Security
dc.relation.isbasedon	10.1109/TIFS.2022.3229595
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	08 Information and Computing Sciences, 09 Engineering
dc.subject.classification	Strategic, Defence & Security Studies
dc.title	A Game-Theoretic Method for Defending Against Advanced Persistent Threats in Cyber Systems
dc.type	Journal Article
utslib.citation.volume	18
utslib.for	08 Information and Computing Sciences
utslib.for	09 Engineering
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
utslib.copyright.status	closed_access	*
dc.date.updated	2023-03-09T22:37:17Z
pubs.issue	99
pubs.publication-status	Published
pubs.volume	18
utslib.citation.issue	99

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/166904