    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/146886


    Title: 以DeepSHAP生成決策邏輯的對抗樣本偵測研究
    DeepSHAP Summary for Adversarial Example Detection
    Authors: 林苡晴 (Lin, Yi-Ching)
    Contributors: 郁方 (Yu, Fang)
    林苡晴 (Lin, Yi-Ching)
    Keywords: 對抗樣本 (Adversarial example)
    可解釋人工智慧 (Explainable AI)
    DeepSHAP
    決策邏輯 (Decision logic)
    Date: 2023
    Issue Date: 2023-09-01 14:52:57 (UTC+8)
    Abstract: Deep learning is now used in a wide range of applications, and explainable AI helps interpret model predictions, strengthening model reliability and trustworthiness. This study proposes three adversarial example detection methods that extend the DeepSHAP Summary. It finds that normal and adversarial examples differ in their explanations and exhibit distinct decision logic that can be used to tell them apart. The study first uses the explainable-AI tool DeepSHAP to compute each neuron's layer-wise contribution in the classification model, selects critical neurons from these contributions, and generates a critical-neuron bitmap that represents the decision logic, yielding a new detection method based on decision logic rather than raw SHAP values. A decision graph that aggregates the critical neurons across all decision logic then provides a consensus view of each neuron's influence on the classification result. The study also detects adversarial examples with layer-wise SHAP explanations and recommends a decision-graph-based strategy for choosing the best layer, giving a better-justified single layer for detection. In addition, an activation-status method lowers computational cost by extracting the model's activation values over the decision graph as detection features. Experiments on three datasets show that: 1) providing SHAP values from more layers yields better detection results; 2) the decision-logic method, which focuses on critical neurons, matches the accuracy of using SHAP values from all layers while requiring fewer resources; and 3) the best-layer SHAP and activation-status methods provide a more lightweight detector with sufficient detection capability. All proposed methods also show effective detection transferability to adversarial examples they were not trained on.
    Deep learning has broad applications, and explainable AI (XAI) enhances interpretability and reliability. Leveraging XAI, we propose three adversarial example detection approaches based on the DeepSHAP Summary. Specifically, we use DeepSHAP to calculate neuron contributions, identify critical neurons by their SHAP values, and generate a critical-neuron bitmap as the decision logic. We reveal distinct interpretations and diverse decision logic between normal and adversarial examples, and our approach uses the decision logic instead of the SHAP signature for detection. We then employ layer-wise SHAP explanations and recommend a strategy for selecting the best layer through a decision graph that summarizes critical neurons, enhancing single-layer detection. The activation-status approach reduces computation by using activation values drawn from the decision graph. Results across three datasets show that detection accuracy improves with more SHAP layer information, that focusing on critical neurons yields competitive accuracy with fewer resources, and that the best-layer SHAP signature and activation-status approaches offer lightweight yet effective detection. This efficacy extends to the detection of attacks not seen during training.
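    The Python sketch below illustrates the critical-neuron idea described in the abstract; it is a minimal illustration under stated assumptions, not the thesis's implementation. It assumes a small torch.nn.Sequential classifier, the shap package's DeepExplainer, and hypothetical choices for the split index (SPLIT) and percentile threshold (TOP_PERCENT); it thresholds the absolute layer-wise DeepSHAP contributions of one layer into a binary critical-neuron bitmap.

    # Minimal sketch (illustrative only, not the thesis's code) of turning
    # layer-wise DeepSHAP contributions into a critical-neuron bitmap.
    # Assumptions: a torch.nn.Sequential classifier, the `shap` package,
    # and a hypothetical split index SPLIT marking the inspected layer.
    import numpy as np
    import shap
    import torch
    import torch.nn as nn

    SPLIT = 4          # hypothetical index: neurons of the layer at this split are inspected
    TOP_PERCENT = 10   # hypothetical choice: top-10% neurons by |SHAP| count as "critical"

    model = nn.Sequential(                      # stand-in classifier; replace with the real model
        nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU(),
        nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 10),
    ).eval()
    head, tail = model[:SPLIT], model[SPLIT:]   # sub-networks before/after the inspected layer

    def critical_neuron_bitmap(x, background):
        """Mark neurons whose DeepSHAP contribution falls in the top percentile."""
        with torch.no_grad():
            acts = head(x)                      # activations of the inspected layer
            bg_acts = head(background)          # background activations for the explainer
        explainer = shap.DeepExplainer(tail, bg_acts)
        sv = explainer.shap_values(acts)        # per-neuron contributions to each class output
        sv = np.stack(sv, axis=-1) if isinstance(sv, list) else np.asarray(sv)
        per_neuron = np.abs(sv).sum(axis=-1)    # aggregate attribution over classes -> (n, neurons)
        cut = np.percentile(per_neuron, 100 - TOP_PERCENT, axis=1, keepdims=True)
        return (per_neuron >= cut).astype(np.uint8)   # one binary bitmap row per example

    if __name__ == "__main__":
        background = torch.randn(64, 1, 28, 28)       # placeholder background set
        clean = torch.randn(8, 1, 28, 28)             # placeholder "normal" inputs
        print(critical_neuron_bitmap(clean, background).shape)   # (8, 128)

    Bitmaps computed this way for clean and adversarial inputs could then be fed to any binary detector; the thesis goes further by aggregating critical neurons across samples into a decision graph, using that graph to pick the best single layer, and, in the lightweight variant, reading only the activation values at the graph's neurons.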
    Description: Master's thesis
    National Chengchi University
    Department of Management Information Systems
    110356019
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0110356019
    Data Type: thesis
    Appears in Collections: [Department of Management Information Systems] Theses

    Files in This Item:

    File: 601901.pdf | Size: 5562 KB | Format: Adobe PDF

