    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/137679


    Title: 基於深度學習之衛星圖像建物偵測
    Detection of Buildings in Satellite Images Using Deep Learning Techniques
    Authors: 陳芝宇
    Chen, Chih-Yu
    Contributors: 李蔡彥
    廖文宏

    Li, Tsai-Yen
    Liao, Wen-Hung

    陳芝宇
    Chen, Chih-Yu
    Keywords: Satellite Images
    Edge Detection
    YOLOv5
    Object Detection
    Image Segmentation
    Super-resolution
    Date: 2021
    Issue Date: 2021-11-01 12:01:23 (UTC+8)
    Abstract: Satellite images are used in a growing range of applications, yet identifying the locations of different types of objects in them remains a challenging task. Rapid progress in artificial intelligence and deep learning has produced good results in automatic object recognition and detection, but object recognition in satellite images, especially low-resolution ones, still leaves room for improvement. This thesis applies recent deep learning techniques to that problem.
    Two satellite image datasets with different resolutions, Google Maps and Xview, are used to locate buildings quickly with deep learning and to examine whether the two datasets call for different methods. Because the Google Maps satellite images lack object labels, this thesis proposes an image segmentation algorithm to speed up data preparation: it separates foreground (buildings) from background in the map street view by color, removes noise with a median filter, locates objects, and computes their areas.
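    The labeling steps named above (color-based foreground/background separation, noise filtering, object localization, area computation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the target color, tolerance, 3x3 median window, 4-connectivity, minimum area, and all function names are hypothetical choices made for the example.

```python
from collections import deque


def color_mask(image, target, tol):
    """Return a binary mask: 1 where every RGB channel is within `tol` of `target`."""
    return [[1 if all(abs(c - t) <= tol for c, t in zip(px, target)) else 0
             for px in row] for row in image]


def median_filter(mask, radius=1):
    """Median-filter a binary mask; pixels outside the image count as background (0)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = []
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    inside = 0 <= yy < h and 0 <= xx < w
                    window.append(mask[yy][xx] if inside else 0)
            window.sort()
            out[y][x] = window[len(window) // 2]
    return out


def find_objects(mask, min_area):
    """Label 4-connected regions; return ((x0, y0, x1, y1), area) for regions >= min_area."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    objects = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] == 0 or seen[y][x]:
                continue
            queue = deque([(y, x)])
            seen[y][x] = True
            area, y0, x0, y1, x1 = 0, y, x, y, x
            while queue:  # breadth-first flood fill of one region
                cy, cx = queue.popleft()
                area += 1
                y0, x0 = min(y0, cy), min(x0, cx)
                y1, x1 = max(y1, cy), max(x1, cx)
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                objects.append(((x0, y0, x1, y1), area))
    return objects


def segment_buildings(image, target=(240, 240, 240), tol=10, min_area=4):
    """Run the full sketch: color mask, median filter, region labeling, area filter."""
    mask = median_filter(color_mask(image, target, tol))
    return find_objects(mask, min_area)
```

    Here `image` is a list of rows of `(R, G, B)` tuples. The returned bounding boxes could then be written out as detector training labels; the median step removes isolated pixels but also rounds off region corners, which is one reason the area threshold is applied after filtering rather than before.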
    Regarding object detection, after testing several deep learning frameworks, we chose YOLOv5x6 as the baseline model. We designed image-enhancement pre-processing variants, including super-resolution, edge enhancement, and augmented channels, and tuned the number of anchor boxes and the threshold values in the model. Each variant was compared against a model trained on the original images to assess its effect on performance metrics such as accuracy, recall, and mAP. Experimental results show a best mAP of 0.687 on the Google Maps dataset and 0.783 on the Xview dataset. The experiments indicate that image-enhancement pre-processing improves the recognition rate on satellite images and that the best method differs between dataset types. We expect these results to serve as a reference for subsequent applications of satellite image recognition.
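    As a rough illustration of how a confidence threshold affects the reported metrics, the sketch below computes precision and recall for one image by matching detections to ground-truth boxes using intersection over union (IoU). It rests on assumptions the abstract does not confirm: "accuracy" is read as detection precision, the IoU threshold is set to 0.5, matching is greedy in descending score order, and all function names are hypothetical. mAP would additionally average precision over recall levels and classes, which is omitted here.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(detections, ground_truth, conf_thresh=0.5, iou_thresh=0.5):
    """Return (precision, recall) for one image.

    detections: list of (box, score); ground_truth: list of boxes.
    Detections below conf_thresh are discarded. The rest are matched greedily,
    highest score first, each to the unmatched ground-truth box with the best IoU.
    """
    kept = sorted((d for d in detections if d[1] >= conf_thresh),
                  key=lambda d: d[1], reverse=True)
    matched = set()
    true_pos = 0
    for box, _score in kept:
        best_iou, best_idx = 0.0, None
        for i, gt in enumerate(ground_truth):
            if i in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, i
        if best_idx is not None and best_iou >= iou_thresh:
            matched.add(best_idx)
            true_pos += 1
    precision = true_pos / len(kept) if kept else 0.0
    recall = true_pos / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```

    Lowering `conf_thresh` admits more low-score detections, which typically raises recall and may lower precision; sweeping it traces out the precision-recall curve from which average precision is derived.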
    Description: Master's thesis
    National Chengchi University
    Department of Computer Science
    108971021
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0108971021
    Data Type: thesis
    DOI: 10.6814/NCCU202101630
    Appears in Collections: [Department of Computer Science] Theses

    Files in This Item:

    File         Description   Size     Format
    102101.pdf                 6689Kb   Adobe PDF


    All items in 政大典藏 are protected by copyright, with all rights reserved.

