    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/137731


    Title: 基於深度學習框架之衛星圖像人造物切割
    Segmentation of Man-Made Structures in Satellite Images Using Deep Learning Approaches
    Authors: 陳忠揚
    Chen, Chung-Yang
    Contributors: 廖文宏
    Liao, Wen-Hung
    陳忠揚
    Chen, Chung-Yang
    Keywords: 深度學習
    衛星圖資
    語意分割
    影像強化
    無監督域適應
    Deep Learning
    Satellite Images
    Semantic Segmentation
    Image Enhancement
    Unsupervised Domain Adaptation
    Date: 2021
    Issue Date: 2021-11-01 12:19:34 (UTC+8)
    Abstract: Remote sensing has been one of the most active areas of image processing in recent years, with wide applications in soil and water monitoring, environmental monitoring, and surveillance of military activities. Because satellite data are relatively costly to acquire, open datasets for academic research and related applications emerged comparatively late, and many studies show that the overall performance of semantic segmentation on satellite imagery is still unsatisfactory. This study divides satellite imagery into homogeneous and heterogeneous data: in the former, the training and test images come from the same satellite and imaging conditions, while in the latter, the training and test sets cover different regions and seasons. We examine how image enhancement and deep learning frameworks can improve object segmentation on satellite imagery, and how unsupervised domain adaptation (UDA) techniques allow the model to retain a degree of adaptability in cross-domain segmentation tasks when facing more complex satellite data.
    For homogeneous satellite imagery, this study pre-processes the training data, applies the concept of transfer learning by loading a pre-trained model, and combines model re-training, the Mixed Pooling Module (MPM), and parameter tuning to find the configuration that best improves segmentation performance. The pre-processing includes image enhancement, high-frequency emphasis, and edge sharpening, targeting man-made structures such as buildings and roads to raise the overall segmentation mIoU. With the chosen combination of data pre-processing, feature-enhancement module, and backbone network, an mIoU of 83.5% is obtained, roughly a 3% improvement over the original performance.
    For heterogeneous satellite imagery, this study evaluates, in turn, a source-only model, existing UDA techniques, and the Domain Transfer and Enhancement Network (DTEN) architecture, adjusting key parameter settings so that the model performs cross-domain segmentation more effectively. The final result surpasses the best UDA baseline mIoU by 3.6%, reaching 45.3%.
    Analysis of remote sensing images is an important application of computer vision. It has been widely used in land and water surveillance, environmental monitoring, and military intelligence. Due to the relatively high cost of obtaining satellite images and the lack of open data, academic research in satellite imagery analysis has gained attention only recently. Many well-developed computer vision techniques still have to prove their validity when applied to satellite images. In this work, we tackle semantic segmentation of man-made structures in satellite images in two settings: homogeneous and heterogeneous datasets. The former contains images from the same satellite and imaging conditions in both the training and test sets, while in the latter case, training and test data are captured in different locations or seasons. To obtain better performance, we experimented with different strategies, including image enhancement, backbone substitution, and architecture modification. We also employed unsupervised domain adaptation (UDA) techniques to cope with heterogeneous data, so that the model can maintain its capability in cross-domain segmentation tasks.
    For homogeneous satellite images, our research uses transfer learning, image pre-processing, backbone replacement, the mixed pooling module (MPM), and parameter tuning to find the combination that yields the best mIoU for building and road extraction. After extensive experiments, the highest mIoU is 83.5%, an improvement of about 3% over existing techniques. A sketch of the kind of pre-processing involved is given below.
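    To illustrate the kind of image pre-processing described above (contrast enhancement and edge sharpening of satellite tiles), the following is a minimal Python sketch using percentile-based contrast stretching and unsharp masking; the function names, parameter values, and use of OpenCV are illustrative assumptions rather than the exact pipeline used in the thesis.

```python
# Minimal sketch of contrast stretching + unsharp-mask sharpening.
# Parameter values are illustrative assumptions, not the thesis settings.
import numpy as np
import cv2


def contrast_stretch(img, low_pct=2, high_pct=98):
    """Linearly stretch intensities between two percentiles to [0, 255]."""
    lo, hi = np.percentile(img, (low_pct, high_pct))
    out = (img.astype(np.float32) - lo) * 255.0 / max(hi - lo, 1e-6)
    return np.clip(out, 0, 255).astype(np.uint8)


def unsharp_mask(img, ksize=(5, 5), sigma=1.0, amount=1.5):
    """Sharpen edges by adding back the high-frequency residual."""
    blurred = cv2.GaussianBlur(img, ksize, sigma)
    return cv2.addWeighted(img, 1.0 + amount, blurred, -amount, 0)


def preprocess(path):
    img = cv2.imread(path, cv2.IMREAD_COLOR)  # satellite tile, BGR
    img = contrast_stretch(img)               # enhance global contrast
    img = unsharp_mask(img)                   # emphasize building/road edges
    return img
```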
    For heterogeneous satellite images, our research tested and compared a source-only model, existing UDA methods, and the domain transfer and enhancement network (DTEN). Experimental results indicate that DTEN performs best, with an mIoU of 45.3%, an improvement of 3.6% over state-of-the-art UDA techniques.
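    For reference, the mIoU figures reported above (83.5% and 45.3%) follow the standard mean Intersection-over-Union definition, averaged over the evaluated classes. A minimal sketch, assuming integer-labelled prediction and ground-truth maps and an ignore index of 255 (both assumptions for illustration), is shown below.

```python
# Minimal mIoU sketch; the ignore index and class handling are assumptions.
import numpy as np


def mean_iou(pred, gt, num_classes, ignore_index=255):
    """Mean Intersection-over-Union over classes present in pred or gt."""
    valid = gt != ignore_index
    ious = []
    for c in range(num_classes):
        pred_c = (pred == c) & valid
        gt_c = (gt == c) & valid
        union = np.logical_or(pred_c, gt_c).sum()
        if union == 0:
            continue  # class absent from both maps; do not count it
        inter = np.logical_and(pred_c, gt_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious)) if ious else 0.0
```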
    Description: Master's thesis
    National Chengchi University
    Executive Master Program of Computer Science
    108971022
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0108971022
    Data Type: thesis
    DOI: 10.6814/NCCU202101671
    Appears in Collections: [Executive Master Program of Computer Science of NCCU] Theses

    Files in This Item:

    File: 102201.pdf (5961 KB, Adobe PDF)

