TI - Automatic Selection of Security Service Function Chaining Using Reinforcement Learning
AU - Li, G
AU - Zhou, H
AU - Feng, B
AU - Yu, S
AB - © 2018 IEEE. When selecting security Service Function Chaining (SFC) for network defense, operators usually take security performance, service quality, deployment cost, and network function diversity into consideration, formulating the selection as a multi-objective optimization problem. However, as applications, users, and data volumes grow massively in networks, traditional mathematical approaches cannot be applied to online security SFC selection due to their high execution time and the uncertainty of network conditions. Thus, in this paper, we utilize reinforcement learning, specifically the Q-learning algorithm, to automatically choose proper security SFCs for various requirements. In particular, we design a reward function to make a tradeoff among different objectives and modify the standard ε-greedy exploration to pick out multiple ranked actions for diversified network defense. We compare Q-learning with mathematical optimization-based approaches, which are assumed to know network state changes in advance. The training and testing results show that the Q-learning based approach can capture changes in network conditions and make a tradeoff among different objectives.
JO - 2018 IEEE Globecom Workshops, GC Wkshps 2018 - Proceedings
DO - 10.1109/GLOCOMW.2018.8644122
DA - 2019/02/19
PY - 2019/02/19
Y1 - 2019/02/19
Y2 - 2024/03/29
ER -
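The abstract describes tabular Q-learning with a weighted multi-objective reward and an ε-greedy rule modified to return several ranked actions. The following is a minimal illustrative sketch of that idea, not the authors' implementation: the state/action counts, the objective weights, and the random stand-ins for security performance, service quality, cost, and diversity are all hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

N_STATES = 8      # hypothetical discretized network conditions
N_ACTIONS = 6     # hypothetical candidate security SFCs
ALPHA, GAMMA, EPSILON, TOP_K = 0.1, 0.9, 0.1, 2

Q = np.zeros((N_STATES, N_ACTIONS))

def reward(state, action):
    # Hypothetical weighted tradeoff among security performance, service
    # quality, deployment cost, and NF diversity (random stand-in metrics).
    security, quality, cost, diversity = rng.random(4)
    return 0.4 * security + 0.3 * quality - 0.2 * cost + 0.1 * diversity

def select_ranked_actions(state, k=TOP_K, eps=EPSILON):
    # epsilon-greedy modified to return k ranked actions instead of one:
    # with probability eps pick k random chains, otherwise the k highest-Q chains.
    if rng.random() < eps:
        return list(rng.choice(N_ACTIONS, size=k, replace=False))
    return list(np.argsort(Q[state])[::-1][:k])

for episode in range(2000):
    state = rng.integers(N_STATES)
    actions = select_ranked_actions(state)
    for action in actions:                     # deploy the ranked chains in turn
        r = reward(state, action)
        next_state = rng.integers(N_STATES)    # stand-in state transition
        # standard Q-learning update
        Q[state, action] += ALPHA * (r + GAMMA * Q[next_state].max() - Q[state, action])
        state = next_state

print("Top-ranked SFC per network state:", np.argmax(Q, axis=1))
```

In practice the state would encode observed network conditions and the reward would be computed from measured objective values rather than random placeholders; returning the top-k chains rather than a single greedy action is what provides the diversified defense mentioned in the abstract.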