Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/158483
Title: | 機器遺忘過程之特徵分布與權重更新分析 (Analysis of Feature Distribution and Weight Updates in the Machine Unlearning Process) |
Authors: | Lin, Yang-Jing (林暘景) |
Contributors: | Liao, Wen-Hung (廖文宏); Lin, Yang-Jing (林暘景) |
Keywords: | Machine Unlearning; Label Reassignment; Model Manipulation; Weight Reset; Selective Layer-wise Freezing |
Date: | 2025 |
Issue Date: | 2025-08-04 13:59:17 (UTC+8) |
Abstract: | Deep learning relies heavily on large volumes of data for training, which has raised public concerns about intellectual property and privacy protection. In response, several regions around the world have introduced regulations emphasizing the "Right to be Forgotten" (RTBF), granting individuals the right to have their personal data deleted. However, since traces of the data remain embedded in trained models, simply deleting the raw data does not achieve true forgetting. To address the problem of sensitive information being retained within models, machine unlearning techniques have been proposed; these aim to remove the influence of specific data from trained models, thereby protecting personal privacy.

In this thesis, we propose two machine unlearning algorithms with different modification strategies: weight resetting and label reassignment. The weight resetting method removes specific information by modifying model weights, while the label reassignment method achieves a comparable forgetting effect by altering the labels of the training data, causing the model to treat the data to be forgotten as belonging to other classes. We evaluate both methods on the CIFAR-100 dataset with the ResNet-50 architecture, analyzing their effectiveness with six metrics. To assess robustness under varying scales of forgetting, we design unlearning scenarios that forget 1, 5, 10, and 50 classes, the last in both 50Cluster and 50Uniform configurations. We also simulate incremental unlearning experiments, which differ from the traditional setting by verifying whether the proposed methods remain effective when unlearning requests arrive at different points in time. Experimental results show that both algorithms forget effectively in the traditional and incremental settings alike.

Unlike most existing machine unlearning studies, which focus primarily on metric-level performance, this work further investigates how feature distributions and network weights change during the unlearning process, to reveal the differences between the two strategies. At the feature level, we observe that the Gold Standard model (one that has never learned the forgotten classes) does not scatter those classes randomly across the remaining categories, but tends to misclassify them into feature-similar ones. The weight resetting method behaves similarly: as the reset ratio increases, the model gradually loses its ability to identify the forgotten classes and predicts them as similar categories, resembling natural forgetting. In contrast, the label reassignment method, depending on the chosen target class, confuses the forgotten classes with the designated categories, achieving forgetting by obscuring the original class information. Layer-wise analysis further reveals that weight resetting accomplishes unlearning through global adjustments across the model, whereas label reassignment relies more heavily on changes in the fully connected layer to induce class confusion.

Weight change analysis shows that label reassignment mainly adjusts high-level features, while weight resetting makes broader, model-wide modifications; as the reset ratio increases, so does the magnitude of the weight changes, and weight resetting alters the model substantially more than label reassignment. Moreover, as the number of forgotten classes grows, the weight changes induced by label reassignment increase, whereas weight resetting shows no obvious dependence on the forgetting scale. This reflects a difference in design philosophy: label reassignment tries to preserve the model, minimizing perturbation to its overall structure, while weight resetting clears information by broadly resetting weights, so its impact on the model is deeper and largely independent of how many classes are forgotten. Finally, a statistical analysis of the distribution of weight-change magnitudes indicates that label reassignment makes small, localized adjustments that leave most weights largely untouched, whereas weight resetting, by construction, produces larger-scale modifications. |
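To make the weight resetting strategy described in the abstract concrete, here is a minimal illustrative sketch in PyTorch. It is not the thesis's implementation: the function name `reset_weights`, the uniform random masking, and the Kaiming re-initialization are our own assumptions about one plausible way to reset a fraction of the weights before fine-tuning on the retained data.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

def reset_weights(model: nn.Module, reset_ratio: float) -> None:
    """Randomly re-initialize a fraction of the weights in every
    convolutional and fully connected layer (illustrative scheme only)."""
    with torch.no_grad():
        for module in model.modules():
            if isinstance(module, (nn.Conv2d, nn.Linear)):
                weight = module.weight
                mask = torch.rand_like(weight) < reset_ratio  # weights chosen for reset
                fresh = torch.empty_like(weight)
                nn.init.kaiming_normal_(fresh)                # freshly initialized values
                weight[mask] = fresh[mask]

model = resnet50(num_classes=100)        # CIFAR-100 setting from the abstract
reset_weights(model, reset_ratio=0.3)    # e.g. reset 30% of the weights
# Fine-tuning on the retained (non-forgotten) data would follow.
```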
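Similarly, the label reassignment strategy can be sketched as a dataset wrapper that maps samples of the classes to be forgotten onto a designated target class before fine-tuning. The class name `RelabeledDataset` and the example class indices are hypothetical; the thesis's exact reassignment rule may differ.

```python
from torch.utils.data import Dataset

class RelabeledDataset(Dataset):
    """Wraps a labeled dataset, mapping forgotten classes to target labels."""

    def __init__(self, base: Dataset, forget_to_target: dict):
        self.base = base
        self.forget_to_target = forget_to_target  # e.g. {12: 47}

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        x, y = self.base[idx]
        # Reassign the label if this sample belongs to a forgotten class.
        return x, self.forget_to_target.get(y, y)

# Example: teach the model to treat (hypothetical) forgotten class 12 as
# class 47, then fine-tune on the relabeled data:
# relabeled_train = RelabeledDataset(cifar100_train, {12: 47})
```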
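The layer-wise weight-change analysis mentioned in the abstract can be approximated by comparing parameters of the model before and after unlearning. This is a sketch under our own assumptions (relative L2 norm per named parameter), not the thesis's exact metric.

```python
import torch
import torch.nn as nn

def layerwise_weight_change(original: nn.Module, unlearned: nn.Module) -> dict:
    """Relative L2 change of each named parameter between two models."""
    changes = {}
    before = dict(original.named_parameters())
    with torch.no_grad():
        for name, after in unlearned.named_parameters():
            delta = (after - before[name]).norm() / (before[name].norm() + 1e-12)
            changes[name] = delta.item()
    return changes

# Per the abstract, label reassignment should concentrate change in the
# fully connected layer, while weight resetting spreads change model-wide:
# changes = layerwise_weight_change(model_before, model_after)
# top5 = sorted(changes.items(), key=lambda kv: kv[1], reverse=True)[:5]
```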
Reference: |
[1] Ray, P. P. (2023). ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope. Internet of Things and Cyber-Physical Systems.
[2] Gemini Team, Anil, R., Borgeaud, S., Alayrac, J. B., Yu, J., Soricut, R., & Blanco, L. (2023). Gemini: A family of highly capable multimodal models. arXiv preprint arXiv:2312.11805.
[3] General Data Protection Regulation (GDPR). (2018). https://gdpr-info.eu/
[4] California Consumer Privacy Act (CCPA). (2020). https://www.consumerprivacyact.com/california/
[5] Office of the Privacy Commissioner of Canada. (2018, October). Announcement: Privacy Commissioner seeks Federal Court determination on key issue for Canadians' online reputation.
[6] Ullah, E., Mai, T., Rao, A., Rossi, R. A., & Arora, R. (2021, July). Machine unlearning via algorithmic stability. Paper presented at the Conference on Learning Theory (COLT).
[7] Bourtoule, L., Chandrasekaran, V., Choquette-Choo, C. A., Jia, H., Travers, A., Zhang, B., … & Papernot, N. (2021, May). Machine unlearning. In 2021 IEEE Symposium on Security and Privacy (SP) (pp. 141–159). IEEE.
[8] Van der Maaten, L., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9, 2579–2605.
[9] Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision (pp. 618–626).
[10] Cao, Y., & Yang, J. (2015, May). Towards making systems forget with machine unlearning. In 2015 IEEE Symposium on Security and Privacy (pp. 463–480). IEEE.
[11] Shokri, R., Stronati, M., Song, C., & Shmatikov, V. (2017, May). Membership inference attacks against machine learning models. In 2017 IEEE Symposium on Security and Privacy (SP) (pp. 3–18). IEEE.
[12] He, Z., Zhang, T., & Lee, R. B. (2019, December). Model inversion attacks against collaborative inference. In Proceedings of the 35th Annual Computer Security Applications Conference (ACSAC '19) (pp. 148–162). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3359789.3359809
[13] NeurIPS 2023 Machine Unlearning Challenge. (2023). Retrieved from https://unlearning-challenge.github.io/
[14] Thudi, A., Jia, H., Shumailov, I., & Papernot, N. (2022). On the necessity of auditable algorithmic definitions for machine unlearning. In 31st USENIX Security Symposium (USENIX Security '22) (pp. 4007–4022). USENIX Association.
[15] Yan, H., Li, X., Guo, Z., Li, H., Li, F., & Lin, X. (2022, July). ARCANE: An efficient architecture for exact machine unlearning. In IJCAI (Vol. 6, p. 19).
[16] Graves, L., Nagisetty, V., & Ganesh, V. (2021, May). Amnesiac machine learning. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 35, No. 13, pp. 11516–11524).
[17] Golatkar, A., Achille, A., & Soatto, S. (2020). Eternal sunshine of the spotless net: Selective forgetting in deep networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 9304–9312).
[18] Deng, L. (2012). The MNIST database of handwritten digit images for machine learning research [Best of the Web]. IEEE Signal Processing Magazine, 29(6), 141–142.
[19] Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images (Technical Report). University of Toronto.
[20] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770–778).
[21] Chundawat, V. S., Tarun, A. K., Mandal, M., & Kankanhalli, M. (2023, June). Can bad teaching induce forgetting? Unlearning in deep networks using an incompetent teacher. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 37, No. 6, pp. 7210–7217).
[22] Li, N., Zhou, C., Gao, Y., Chen, H., Fu, A., Zhang, Z., & Shui, Y. (2024). Machine unlearning: Taxonomy, metrics, applications, challenges, and prospects. arXiv preprint arXiv:2403.08254.
[23] Golatkar, A., Achille, A., Ravichandran, A., Polito, M., & Soatto, S. (2021). Mixed-privacy forgetting in deep networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 792–801).
[24] Lin, J. (1991). Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory, 37(1), 145–151.
[25] Kadavath, S., Conerly, T., Askell, A., Henighan, T., Drain, D., Perez, E., … & Kaplan, J. (2022). Language models (mostly) know what they know. arXiv preprint arXiv:2207.05221. |
Description: | Master's thesis, Department of Computer Science, National Chengchi University (student ID 112753201) |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0112753201 |
Data Type: | thesis |
Appears in Collections: | [Department of Computer Science] Theses
Files in This Item:
File | Description | Size | Format
320101.pdf | | 2670 KB | Adobe PDF
All items in 政大典藏 are protected by copyright, with all rights reserved.