Deep Reinforcement Learning and Privacy Preserving with Differential Privacy

Publication Type:
Thesis
Issue Date:
2023
With the rapid advances in technology and the emergence of multiple technologies that have transformed society, deep reinforcement learning (DRL) offers promising solutions and enhanced capabilities with demonstrated superior results. Learning in DRL proceeds by executing an action, receiving the corresponding reward, and then moving to the next state. Some complex problems require more than one RL agent, which leads to multi-agent reinforcement learning, where several agents share the same environment and work together to achieve a common goal. Deep reinforcement learning is now used in various areas, such as recommendation systems, robotics, and health applications. While these technologies offer enormous benefits, they also raise significant privacy concerns. Deep reinforcement learning is vulnerable to adversarial attacks: an adversary can infer private information through recursive querying, and when a trained policy is released to the client side, the adversary may infer private information from that policy, posing a real risk and constituting a breach of privacy. This research focuses on deep reinforcement learning, multi-agent reinforcement learning, and the related privacy issues. The contributions made by this research are as follows:

• This research proposes a solution for online food delivery services to increase the number of food delivery orders and thereby increase couriers' long-term income. The solution leverages two multi-agent reinforcement learning algorithms to guide couriers to areas with high demand for food delivery orders.
• This research proposes a solution to protect privacy in Double and Dueling Deep Q Networks by adopting the Differentially Private Stochastic Gradient Descent (DPSGD) method and injecting Gaussian noise into the gradients.

• This research proposes the Protect User Location Method (PULM) to protect customer location information in online food delivery services. PULM injects differential-privacy Laplace noise based on two factors: the size of the city and the frequency of customer orders.

• This research proposes the Protect Trajectory and Location in Food Delivery (PTLFD) method to maintain the privacy of customers' stored data in online food delivery services. PTLFD leverages multi-agent reinforcement learning and differential privacy to protect customer location information.
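The DRL interaction loop described above (execute an action, receive a reward, move to the next state) can be illustrated with a minimal sketch. The toy corridor environment and the random stand-in policy here are purely illustrative and are not part of the thesis:

```python
import random

# Hypothetical toy environment: a 1-D corridor where the agent moves
# left (0) or right (1) and is rewarded for reaching the rightmost cell.
class CorridorEnv:
    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        delta = 1 if action == 1 else -1
        self.state = max(0, min(self.length - 1, self.state + delta))
        reward = 1.0 if self.state == self.length - 1 else 0.0
        done = self.state == self.length - 1
        return self.state, reward, done

# The loop itself: act, receive the reward, transition to the next state.
env = CorridorEnv()
state = env.reset()
total_reward = 0.0
for _ in range(20):
    action = random.choice([0, 1])  # random policy as a stand-in for a learned one
    state, reward, done = env.step(action)
    total_reward += reward
    if done:
        break
```

In a real DRL agent the random choice would be replaced by a policy network updated from the observed rewards.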
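The DPSGD idea in the second contribution — clip each per-example gradient, average, then add calibrated Gaussian noise — can be sketched as follows. The function name, clipping norm, and noise multiplier are illustrative assumptions, not the thesis's implementation:

```python
import numpy as np

def dpsgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One DPSGD-style gradient computation (sketch only):
    clip each per-example gradient to clip_norm, average,
    then add Gaussian noise scaled by noise_multiplier * clip_norm."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose L2 norm exceeds clip_norm.
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    # Gaussian noise calibrated to the clipping bound and batch size.
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(clipped),
                       size=mean_grad.shape)
    return mean_grad + noise
```

In a Double or Dueling DQN, this noisy gradient would replace the ordinary batch gradient in each optimizer step, so the trained policy itself carries a differential-privacy guarantee.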
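The Laplace-noise mechanism behind PULM can likewise be sketched. The abstract states only that the noise depends on city size and order frequency; the specific weighting below, and the sensitivity value, are hypothetical placeholders for illustration:

```python
import numpy as np

def pulm_epsilon(city_size_factor, order_frequency_factor):
    """Hypothetical combination of the two factors named in the abstract;
    the actual PULM weighting is not specified here."""
    return 0.5 * city_size_factor + 0.5 * order_frequency_factor

def perturb_location(lat, lon, epsilon, sensitivity=0.01, rng=None):
    """Add Laplace noise with scale sensitivity / epsilon to a
    coordinate pair (the standard Laplace mechanism; sketch only)."""
    rng = rng or np.random.default_rng(0)
    scale = sensitivity / epsilon
    return lat + rng.laplace(0.0, scale), lon + rng.laplace(0.0, scale)
```

A larger epsilon (e.g. for a big city with frequent orders, under this illustrative mapping) yields a smaller noise scale and hence a more accurate but less private reported location.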