政大機構典藏-National Chengchi University Institutional Repository(NCCUR):Item 140.119/146898

Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/146898

Title:	利用 HTTP 封包採取非監督式深度學習演算法進行網路攻擊樣態分析 An Unsupervised Learning Approach for Cyber Attack Analysis with HTTP Payload Embedding
Authors:	陳唯哲 Chen, Wei-Zhe
Contributors:	蕭舜文 Hsiao, Shun-Wen 陳唯哲 Chen, Wei-Zhe
Keywords:	Language model; Packet embedding; NCCU; BERT; MITRE ATT&CK
Date:	2023
Issue Date:	2023-09-01 14:55:52 (UTC+8)
Abstract:	Cyber attacks keep growing in number, and attack methods keep evolving. Analysis by cybersecurity experts remains time-consuming, so an automated platform that applies artificial intelligence to large-scale data analysis is needed. Rather than verifying and protecting after an attack method has been observed, it is better to predict and analyze an attack before it occurs. If we can tell that an event matching an attack pattern (for example, reconnaissance of the target environment or theft of database data) is under way, we can defend against network attacks proactively. We observe that attackers carry out staged strategies (tactics) by combining different techniques at different attack stages, and reach their final goal by executing the complete attack life cycle. Identifying the attack pattern at each stage therefore reveals how far an attack has progressed and makes defense possible in its early stages. In our approach, we build an AI-based proactive defense system that uses a honeypot to capture ongoing attacks and analyzes their intent and life-cycle stage during a specific event period (e.g., a presidential election day). Automatically generating attack patterns helps protect network services proactively and reduces the risk that a specific event is affected by cybersecurity incidents. We develop neural algorithms that map honeypot packet data and honeypot log files into a high-dimensional space, use neural networks to cluster and analyze the behaviors collected by the honeypot, automatically predict their attack life cycle, and automatically generate attack pattern reports. For the collected honeypot behaviors, this study produces the attack behaviors at each stage of the life cycle, so that cybersecurity experts can follow how the behaviors develop and analyze them. The results can reduce the time and cost that cybersecurity experts spend analyzing large volumes of malicious attack logs and packets, and yield high-quality network analysis reports.
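The embed-then-cluster idea in the abstract can be illustrated with a small sketch. This is not the thesis's method: the keywords suggest a BERT-based packet embedding, whereas the sketch below substitutes hashed character trigrams for the embedding and a greedy leader clustering for the neural clustering, purely so that it runs with the Python standard library. The function names (`embed`, `cosine`, `cluster`), the vector width, the similarity threshold, and the sample payloads are all assumptions made for illustration.

```python
import hashlib
import math

DIM = 1024  # assumed embedding width; not taken from the thesis


def embed(payload: str, n: int = 3, dim: int = DIM) -> list[float]:
    """Hash character n-grams of a payload into an L2-normalised vector."""
    vec = [0.0] * dim
    text = payload.lower()
    for i in range(max(len(text) - n + 1, 1)):
        gram = text[i:i + n]
        digest = hashlib.md5(gram.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def cluster(payloads: list[str], threshold: float = 0.6) -> list[int]:
    """Greedy leader clustering: each payload joins the first cluster whose
    leader is at least `threshold` similar, otherwise it starts a new one."""
    leaders: list[list[float]] = []
    labels: list[int] = []
    for payload in payloads:
        vec = embed(payload)
        for cid, leader in enumerate(leaders):
            if cosine(vec, leader) >= threshold:
                labels.append(cid)
                break
        else:
            leaders.append(vec)
            labels.append(len(leaders) - 1)
    return labels


if __name__ == "__main__":
    # Hypothetical honeypot request lines, invented for this example.
    samples = [
        "GET /index.php?id=1 UNION SELECT user,pass FROM users",
        "GET /index.php?id=2 UNION SELECT name,pass FROM users",
        "GET /../../../../etc/passwd",
        "GET /../../../etc/passwd",
    ]
    for label, sample in zip(cluster(samples), samples):
        print(label, sample)
```

In a pipeline closer to the one described, each resulting cluster would then be reviewed or labelled with an attack life-cycle stage (for example a MITRE ATT&CK tactic); that mapping step is not shown here.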
Description:	Master's thesis, National Chengchi University, Department of Management Information Systems, 110356047
Source URI:	http://thesis.lib.nccu.edu.tw/record/#G0110356047
Data Type:	thesis
Appears in Collections:	[Department of Management Information Systems] Theses

Files in This Item:

File	Description	Size	Format
604701.pdf		4599Kb	Adobe PDF

All items in 政大典藏 are protected by copyright, with all rights reserved.

Copyright Announcement

1. The digital content of this website is part of the National Chengchi University Institutional Repository. It is provided free of charge for academic research, public education, and other non-commercial use. Please use it in a proper and reasonable manner and respect the rights of copyright owners. For commercial use, please obtain authorization from the copyright owner in advance.

2. Every effort has been made in building the NCCU Institutional Repository to protect the interests of copyright owners. If you believe that any material on the website infringes copyright, please contact our staff (nccur@nccu.edu.tw). We will remove the work from the repository and investigate your claim.
