Machine Unlearning via Representation Forgetting With Parameter Self-Sharing

Publisher:
Institute of Electrical and Electronics Engineers Inc. (IEEE)
Publication Type:
Journal Article
Citation:
IEEE Transactions on Information Forensics and Security, 2024, 19, pp. 1099-1111
Issue Date:
2024-01-01
Filename: 1682950.pdf
Description: Published version
Size: 2.56 MB
Format: Adobe PDF
Abstract:
Machine unlearning enables data owners to remove the contribution of specified samples from trained models. However, existing methods fail to strike an optimal balance between erasure effectiveness and preservation of model utility. Previous studies implemented unlearning by removing the influence of user-specified data from the model as thoroughly as possible, which usually causes significant degradation of model utility, a phenomenon commonly called catastrophic unlearning. To address this issue, we consider machine unlearning systematically and formulate it as a two-objective optimization problem: forgetting the erased data while retaining previously learned knowledge, thereby making accuracy preservation an explicit goal of the unlearning process. We propose an unlearning method called representation-forgetting unlearning with parameter self-sharing (RFU-SS) to achieve this two-objective goal. First, we design a representation-forgetting unlearning (RFU) method that removes the contribution of specified samples from a trained representation by minimizing the mutual information between the representation and the erased data. The representation is learned with the information bottleneck (IB) method, and RFU is tailored to models with an IB structure for ease of presentation. Second, we customize a parameter self-sharing structural optimization method for RFU (i.e., RFU-SS) that optimizes the forgetting and retention objectives simultaneously to find the optimal balance. Extensive experimental results demonstrate that RFU-SS significantly outperforms state-of-the-art methods: it almost eliminates catastrophic unlearning, reducing model accuracy degradation from over 6% to less than 0.2% on the MNIST dataset while achieving an even better removal effect. The source code is available at https://github.com/wwq5-code/RFU-SS.git.
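The abstract states the two objectives only in prose. As a minimal sketch, assuming the standard information-bottleneck Lagrangian and introducing Z for the learned representation, D_e for the erased data, D_r for the retained data, and a trade-off weight lambda (none of this is the paper's own notation), the two-objective problem might be written as:

\[
\min_{\theta}\ \underbrace{I(Z; D_e)}_{\text{forgetting}} \;+\; \lambda\,\underbrace{\mathcal{L}_{\mathrm{IB}}(\theta; D_r)}_{\text{retention}},
\qquad
\mathcal{L}_{\mathrm{IB}} \;=\; I(X; Z) \;-\; \beta\, I(Z; Y),
\]

where the first term drives the mutual information between the representation and the erased samples toward zero, and the second keeps the IB training objective small on the retained data. How RFU-SS actually couples the two objectives through parameter self-sharing is specified in the paper itself, not here.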