Please use this persistent URL to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/158714
Title: | Enhancing Deepfake Detection: Statistical Analysis of Frame Features with Extension to Video
Author: | GAO, CHONG-ZHE (高崇哲)
Contributors: | YU, QING-XIANG (余清祥); GAO, CHONG-ZHE (高崇哲)
Keywords: | Deepfake videos; Dimensionality reduction; Data leakage; Data structuring; Inter-frame homogeneity
Date: | 2025
Upload time: | 2025-08-04 15:11:34 (UTC+8)
Abstract: | The rapid advancement of artificial intelligence and deep learning has brought significant benefits and innovations, but these technologies are also increasingly misused, most notably to create deepfake media that undermines the credibility of visual information. Most existing detection methods rely on deep learning models that achieve high accuracy yet offer limited interpretability and carry substantial computational cost. From a statistical perspective, this study presents a lightweight and interpretable detection method based on feature dimensionality reduction, achieving competitive performance with fewer than 1% of the features typically used by deep learning models.

Beyond interpretability and efficiency, the study makes two further contributions: it avoids data leakage, and it extends the detection unit from individual frames to whole videos to match practical needs. Prior methods typically split data at the frame level, so frames from the same video can appear in both the training and test sets; this data leakage inflates test results relative to real-world performance. We instead split the data at the video level, improving generalization and the credibility of the reported results. Building upon Chen (2023), we replace gradient intensities computed from single-scale blocks with low-quantile (outlier-suppressed) statistics computed from large-scale blocks and first-order differences, so that the features describe both global and local texture. To handle the angular discontinuity of the hue channel in the HSV (Hue, Saturation, Value) color space, hue is transformed by sine and cosine decomposition (sin H and cos H), improving both detection performance and interpretability. Beyond texture variation, we find that the distribution of texture types is also discriminative for deepfakes, and therefore add two families of statistical features: (1) the Angular Second Moment (ASM) computed from gray-level co-occurrence matrices, and (2) summary statistics extracted from Histograms of Oriented Gradients (HOG). These features serve as inputs to statistical and machine learning classifiers. Experiments on the Celeb-DF-v2 dataset, evaluated with 500 repetitions of cross-validation, show that the method reaches a detection accuracy of 69.55% with only 31 features, a 4.91% improvement over the baseline. Finally, frame-level predictions are aggregated to the video level by majority voting and median aggregation, better reflecting practical deployment scenarios.
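The sine and cosine decomposition of hue can be illustrated with a short sketch. This is a minimal illustration, not the thesis code: it assumes OpenCV's HSV convention (hue stored in [0, 180) for 8-bit images) and uses a synthetic random image in place of a real face crop.

```python
import numpy as np
import cv2

# Synthetic stand-in for a face crop (the thesis works on real video frames).
rng = np.random.default_rng(0)
bgr = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
# OpenCV stores hue in [0, 180) for uint8 images, i.e. degrees / 2.
hue_rad = np.deg2rad(hsv[:, :, 0].astype(np.float64) * 2.0)

# Decompose the circular hue into two continuous channels, so that
# hues near 0 and near 360 degrees are no longer treated as far apart.
sin_h, cos_h = np.sin(hue_rad), np.cos(hue_rad)
print(sin_h.mean(), cos_h.mean())
```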
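The "low-outlier statistics from large-scale blocks with first-order differencing" can be sketched as follows. The block size, the use of absolute first differences as the gradient proxy, and the 5th-percentile cut-off are illustrative assumptions; the abstract does not fix these choices.

```python
import numpy as np

def block_low_quantile_features(gray, block=32, q=5):
    """Split a grayscale frame into large blocks, take first-order
    differences as a cheap gradient proxy, and keep a low quantile
    per block to suppress the influence of extreme gradients."""
    h, w = gray.shape
    feats = []
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            patch = gray[i:i + block, j:j + block].astype(np.float64)
            # First-order differences along both axes.
            dx = np.abs(np.diff(patch, axis=1))
            dy = np.abs(np.diff(patch, axis=0))
            grad = np.concatenate([dx.ravel(), dy.ravel()])
            feats.append(np.percentile(grad, q))  # low-quantile statistic
    return np.array(feats)

gray = np.random.default_rng(1).integers(0, 256, (128, 128)).astype(np.uint8)
print(block_low_quantile_features(gray).shape)  # one value per 32x32 block
```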
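The two texture-type feature families can be sketched with scikit-image. The GLCM distances and angles, the HOG cell sizes, and the particular summary statistics are placeholder choices, not the thesis settings.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops, hog

gray = np.random.default_rng(2).integers(0, 256, (64, 64)).astype(np.uint8)

# (1) Angular Second Moment (ASM) from a gray-level co-occurrence matrix.
glcm = graycomatrix(gray, distances=[1], angles=[0, np.pi / 2],
                    levels=256, symmetric=True, normed=True)
asm = graycoprops(glcm, 'ASM').ravel()

# (2) Summary statistics of the HOG descriptor instead of the raw
# high-dimensional vector, keeping the total feature count small.
h = hog(gray, orientations=9, pixels_per_cell=(8, 8),
        cells_per_block=(2, 2), feature_vector=True)
hog_stats = np.array([h.mean(), h.std(), np.median(h)])

print(asm, hog_stats)
```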
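Video-level data splitting and frame-to-video aggregation can be sketched as below. The logistic-regression classifier, the synthetic features and labels, the 10-frames-per-video layout, and the 0.5 thresholds are all assumptions for illustration; the thesis evaluates several statistical and machine learning classifiers over 500 repeated splits.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(3)
X = rng.normal(size=(600, 31))             # 31 frame-level features
video_id = np.repeat(np.arange(60), 10)    # 10 sampled frames per video
y = np.repeat(rng.integers(0, 2, 60), 10)  # real(0)/fake(1), one label per video

# Split by video so no video contributes frames to both sets (no leakage).
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train, test = next(splitter.split(X, y, groups=video_id))

clf = LogisticRegression(max_iter=1000).fit(X[train], y[train])
prob = clf.predict_proba(X[test])[:, 1]    # frame-level fake probabilities

# Aggregate frame predictions into one decision per test video.
for vid in np.unique(video_id[test]):
    m = video_id[test] == vid
    vote = int((prob[m] > 0.5).mean() > 0.5)  # majority voting
    med = int(np.median(prob[m]) > 0.5)       # median aggregation
    print(vid, vote, med)
```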
References: | [1] Chen, H.-S. (陳慧霜) (2023). "Image Analysis and Detection of Deepfake Videos", Master's thesis, Department of Statistics, National Chengchi University.
[2] Ahmed, N., Natarajan, T., & Rao, K. R. (1974). "Discrete Cosine Transform", IEEE Transactions on Computers, C-23(1), 90–93.
[3] Akiba, T., Sano, S., Yanase, T., Ohta, T., & Koyama, M. (2019). "Optuna: A Next-Generation Hyperparameter Optimization Framework", Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2623–2631.
[4] Amari, S. (1967). "A Theory of Adaptive Pattern Classifiers", IEEE Transactions on Electronic Computers, EC-16(3), 299–307.
[5] Bertasius, G., Wang, H., & Torresani, L. (2021). "Is Space-Time Attention All You Need for Video Understanding?", Proceedings of the 38th International Conference on Machine Learning (ICML).
[6] Blanz, V., & Vetter, T. (2023). "A Morphable Model for the Synthesis of 3D Faces", Seminal Graphics Papers: Pushing the Boundaries, Volume 2, 157–164.
[7] Breiman, L. (2001). "Random Forests", Machine Learning, 45(1), 5–32.
[8] Chen, T., & Guestrin, C. (2016). "XGBoost: A Scalable Tree Boosting System", Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794.
[9] Cortes, C., & Vapnik, V. (1995). "Support-Vector Networks", Machine Learning, 20, 273–297.
[10] Cover, T., & Hart, P. (1967). "Nearest Neighbor Pattern Classification", IEEE Transactions on Information Theory, 13(1), 21–27.
[11] Cox, D. R. (1958). "The Regression Analysis of Binary Sequences", Journal of the Royal Statistical Society, Series B, 20(2), 215–232.
[12] Dalal, N., & Triggs, B. (2005). "Histograms of Oriented Gradients for Human Detection", Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 1, 886–893.
[13] Dolhansky, B., Bitton, J., Pflaum, B., Lu, J., Howes, R., Wang, M., & Ferrer, C. C. (2020). "The Deepfake Detection Challenge (DFDC) Dataset", arXiv preprint arXiv:2006.07397.
[14] Gabor, D. (1946). "Theory of Communication. Part 1: The Analysis of Information", Journal of the Institution of Electrical Engineers – Part III: Radio and Communication Engineering, 93(26), 429–441.
[15] Haralick, R. M., Shanmugam, K., & Dinstein, I. H. (1973). "Textural Features for Image Classification", IEEE Transactions on Systems, Man, and Cybernetics, SMC-3(6), 610–621.
[16] Horn, B. K. P., & Schunck, B. G. (1981). "Determining Optical Flow", Artificial Intelligence, 17(1–3), 185–203.
[17] Kim, H., Garrido, P., Tewari, A., Xu, W., Thies, J., Niessner, M., ... & Theobalt, C. (2018). "Deep Video Portraits", ACM Transactions on Graphics (TOG), 37(4), 1–14.
[18] LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). "Gradient-Based Learning Applied to Document Recognition", Proceedings of the IEEE, 86(11), 2278–2324.
[19] Li, L., Bao, J., Zhang, T., Yang, H., Chen, D., Wen, F., & Guo, B. (2020). "Face X-Ray for More General Face Forgery Detection", Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 5001–5010.
[20] Li, Y., Yang, X., Sun, P., Qi, H., & Lyu, S. (2020). "Celeb-DF: A Large-Scale Challenging Dataset for Deepfake Forensics", Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 3207–3216.
[21] Liu, Y., Zhang, K., Li, Y., Yan, Z., Gao, C., Chen, R., ... & Sun, L. (2024). "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models", arXiv preprint arXiv:2402.17177.
[22] Matern, F., Riess, C., & Stamminger, M. (2019). "Exploiting Visual Artifacts to Expose Deepfakes and Face Manipulations", 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW), 83–92.
[23] Nirkin, Y., Keller, Y., & Hassner, T. (2019). "FSGAN: Subject Agnostic Face Swapping and Reenactment", Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 7184–7193.
[24] Pérez, P., Gangnet, M., & Blake, A. (2023). "Poisson Image Editing", Seminal Graphics Papers: Pushing the Boundaries, Volume 2, 577–582.
[25] Polyak, A., Zohar, A., Brown, A., Tjandra, A., Sinha, A., Lee, A., ... & Du, Y. (2024). "Movie Gen: A Cast of Media Foundation Models", arXiv preprint arXiv:2410.13720.
[26] Rossler, A., Cozzolino, D., Verdoliva, L., Riess, C., Thies, J., & Nießner, M. (2019). "FaceForensics++: Learning to Detect Manipulated Facial Images", Proceedings of the IEEE/CVF International Conference on Computer Vision, 1–11.
[27] Siarohin, A., Lathuilière, S., Tulyakov, S., Ricci, E., & Sebe, N. (2019). "First Order Motion Model for Image Animation", Advances in Neural Information Processing Systems, 32.
[28] Tolosana, R., Vera-Rodriguez, R., Fierrez, J., Morales, A., & Ortega-Garcia, J. (2020). "Deepfakes and Beyond: A Survey of Face Manipulation and Fake Detection", Information Fusion, 64, 131–148.
[29] Wiles, O., Koepke, A., & Zisserman, A. (2018). "X2Face: A Network for Controlling Face Generation Using Images, Audio, and Pose Codes", Proceedings of the European Conference on Computer Vision (ECCV), 670–686.
[30] Yang, X., Li, Y., & Lyu, S. (2019). "Exposing Deep Fakes Using Inconsistent Head Poses", ICASSP 2019 – IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 8261–8265.
[31] Zhao, H., Zhou, W., Chen, D., Wei, T., Zhang, W., & Yu, N. (2021). "Multi-Attentional Deepfake Detection", Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2185–2194.
[32] Zhang, K., Zhang, Z., Li, Z., & Qiao, Y. (2016). "Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks", IEEE Signal Processing Letters, 23(10), 1499–1503.
[33] Zhao, T., Xu, X., Xu, M., Ding, H., Xiong, Y., & Xia, W. (2021). "Learning Self-Consistency for Deepfake Detection", Proceedings of the IEEE/CVF International Conference on Computer Vision, 15023–15033.
Description: | Master's thesis, Department of Statistics, National Chengchi University, 112354020
Source: | http://thesis.lib.nccu.edu.tw/record/#G0112354020
Data type: | thesis
Appears in collections: | [Department of Statistics] Theses
Files in This Item:
File | Description | Size | Format
402001.pdf | | 3901Kb | Adobe PDF
All items in the NCCU Institutional Repository are protected by the original copyright.