Title: 基於深度學習之衛星圖像建物偵測
Detection of Buildings in Satellite Images Using Deep Learning Techniques
Authors: 陳芝宇
Chen, Chih-Yu
Contributors: 李蔡彥
Li, Tsai-Yen
Liao, Wen-Hung
Chen, Chih-Yu
Keywords: 衛星圖像
Satellite Images
Object Detection
Image Segmentation
Date: 2021
Issue Date: 2021-11-01 12:01:23 (UTC+8)
Abstract: 衛星照片的應用日趨廣泛,從衛星照片中辨識出不同物體的位置,是一項具挑戰性的任務。近年來伴隨人工智慧與深度學習的快速發展,自動物件辨識與偵測已取得不錯的成果,然針對衛星照片的物件辨識,仍有進一步研究改進的空間,特別是低解析度衛星圖資。
本研究以Google Maps及Xview兩種不同解析度的衛星圖像資料集為基礎,希能透過深度學習的方法,快速地判別出建築物的位置,同時探討不同資料集所適用的方法是否有差異。由於Google Maps衛星圖缺乏物體的標記,為加速資料準備流程,本論文提出了一套圖像分割演算法,將Map街景圖透過顏色區分前後景、中值濾波器過濾雜訊、找物體再計算面積,最後將建築物與背景成功分離。
有關物件偵測方法,嘗試過多種深度學習框架後,我們選擇以YOLOv5x6模型為基底,設計高解析度、強化和未強化、擴增通道等不同之影像強化前處理模型,調校模型中Anchor偵測框數量以及門檻值,最後與原圖模型進行比較,以了解不同模型對準確度、召回率與mAP等辨識品質指標的影響。實驗結果顯示, Google Maps資料集的mAP最佳值0.687,而Xview資料集mAP最佳值0.783。我們以實驗方式證明影像強化的前處理方法對提高衛星影像的辨識率有幫助,且不同類型資料集的最佳方法亦有所不同,可作為衛星影像辨識後續應用的參考。
Satellite images are used in a widening range of applications, yet locating different objects in them remains a challenging task. With the rapid recent progress of artificial intelligence and deep learning, automatic object recognition and detection have achieved good results, but object recognition in satellite images, particularly low-resolution imagery, still leaves room for improvement.
This study uses two satellite image datasets of different resolutions, Google Maps and Xview, to locate buildings quickly with deep learning and to examine whether the methods best suited to each dataset differ. Because the Google Maps satellite images lack object labels, this thesis proposes an image segmentation algorithm to speed up data preparation: it separates foreground from background in the map view by color, removes noise with a median filter, locates objects, and computes their areas, thereby separating buildings from the background.
For object detection, after trying several deep learning frameworks, we chose YOLOv5x6 as the base model. We designed several image-enhancement pre-processing variants, including super-resolution, edge enhancement, and augmented channels, and tuned the number of anchor boxes and the threshold values. These variants were compared with a model trained on the original images to assess their effect on accuracy, recall, and mAP. The best mAP was 0.687 on the Google Maps dataset and 0.783 on the Xview dataset. The experiments indicate that image-enhancement pre-processing improves recognition in satellite imagery and that the best method differs between datasets; these results may serve as a reference for subsequent applications of satellite image recognition.
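The labeling steps named in the paragraph above (color separation, median filtering, object localization, area computation) can be illustrated with a small dependency-free sketch. This is not the thesis's implementation: the function names, the building color, the tolerance, the 3x3 window, and the minimum area are all assumptions chosen for illustration, and a real pipeline would more likely rely on an image-processing library.

```python
from collections import deque
from statistics import median

def color_mask(image, target, tol):
    """Mark pixels whose RGB value is within `tol` of `target` on every channel."""
    return [[int(all(abs(c - t) <= tol for c, t in zip(px, target))) for px in row]
            for row in image]

def median_filter(mask, radius=1):
    """Binary median filter: keep a cell only if more than half of its
    (2r+1)x(2r+1) neighbourhood (clipped at the borders) is foreground."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [mask[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = int(median(window) > 0.5)
    return out

def find_regions(mask, min_area):
    """Collect 4-connected foreground regions with at least `min_area` cells.
    Each region is reported as a bounding box (x_min, y_min, x_max, y_max) and an area."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            queue, cells = deque([(y, x)]), []
            seen[y][x] = True
            while queue:  # breadth-first flood fill
                cy, cx = queue.popleft()
                cells.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(cells) >= min_area:
                ys = [c[0] for c in cells]
                xs = [c[1] for c in cells]
                regions.append({"box": (min(xs), min(ys), max(xs), max(ys)),
                                "area": len(cells)})
    return regions

def segment_buildings(image, building_color, tol=10, min_area=4):
    """Color mask -> median filter -> connected regions filtered by area.
    `building_color`, `tol`, and `min_area` are hypothetical parameters."""
    mask = median_filter(color_mask(image, building_color, tol))
    return find_regions(mask, min_area)
```

In this sketch the median filter removes isolated pixels that happen to match the building color, and the area filter discards small leftover fragments; the surviving bounding boxes could then serve as approximate labels for the matching satellite tile.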
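To make the role of the threshold values concrete, the following sketch shows how a confidence threshold and an IoU threshold typically determine precision and recall for a set of detections. It is an assumption-laden illustration rather than the evaluation code used in the thesis: the function names, the default thresholds, and the greedy matching rule are hypothetical, and mAP (which averages precision over recall levels) is not computed here.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def precision_recall(predictions, truths, conf_thres=0.25, iou_thres=0.5):
    """predictions: list of (box, confidence); truths: list of boxes.
    Detections below `conf_thres` are dropped; the rest are matched greedily,
    highest confidence first, to the unmatched truth box with the best IoU."""
    kept = sorted((p for p in predictions if p[1] >= conf_thres),
                  key=lambda p: p[1], reverse=True)
    matched = set()
    true_positives = 0
    for box, _ in kept:
        best_index, best_iou = None, iou_thres
        for i, truth in enumerate(truths):
            if i in matched:
                continue
            overlap = iou(box, truth)
            if overlap >= best_iou:
                best_index, best_iou = i, overlap
        if best_index is not None:
            matched.add(best_index)
            true_positives += 1
    precision = true_positives / len(kept) if kept else 0.0
    recall = true_positives / len(truths) if truths else 0.0
    return precision, recall
```

Raising the confidence threshold in this sketch tends to increase precision at the cost of recall, which is one reason threshold tuning can shift the reported metrics.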
Description: Master's degree
Source URI:
Data Type: thesis
Appears in Collections: [Department of Computer Science] Theses and Dissertations

Files in This Item:

File: 102101.pdf
Size: 6689 KB
Format: Adobe PDF

All items in 學術集成 are protected by copyright, with all rights reserved.
