Please use this identifier to cite or link to this item: https://ah.lib.nccu.edu.tw/handle/140.119/136969
Title: Segmentation of Remote Sensing Images Using Unsupervised Domain Adaptation (基於非監督域適應之衛星圖資切割優化)
Author: Wang, Bor-Ren (王柏仁)
Contributors: Liao, Wen-Hung (廖文宏); Wang, Bor-Ren (王柏仁)
Keywords: Deep learning
Remote sensing images
Image segmentation
Generative adversarial network (GAN)
Style transfer
Unsupervised domain adaptation
Date: 2021
Upload date: 2-Sep-2021
Abstract: Satellite imagery plays an important role in national defense due to its high altitude, unique viewpoint, and large coverage. It has also been utilized in areas such as crop prediction, land surveying, and mapping. In this thesis, we focus on the tasks of image segmentation and color matching using deep learning frameworks. The former is concerned with accurately classifying each pixel of a satellite image into one of several categories; the latter develops techniques for robust semantic segmentation when satellite images are obtained from different sources, causing non-uniformity in their color distributions.

Deep neural networks are inherently data-driven, relying heavily on the quantity and quality of the training data, and a model trained on one dataset often fails to generalize to others, a phenomenon frequently observed in the analysis of satellite images. To address this issue, we employ unsupervised domain adaptation (UDA) techniques to transfer domain knowledge between different sources of satellite images, strengthening the model's ability to cope with heterogeneous data. We first experiment with existing source-only models and UDA algorithms and perform a comparative analysis. We then propose a novel architecture, the Domain Transfer and Enhancement Network (DTEN), which incorporates skip connections, a mixed pooling module, and a domain transfer module, and conduct ablation studies. The resulting model surpasses state-of-the-art methods in satellite image segmentation, achieving an mIoU of 46.4%, an improvement of nearly 3% over existing approaches.
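The mIoU (mean Intersection-over-Union) figure quoted above is the standard evaluation metric for semantic segmentation: per-class IoU averaged over all classes. A minimal sketch of how it is typically computed (an illustrative implementation, not the thesis's actual evaluation code):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union across classes.

    pred, target: integer class maps of identical shape.
    Classes absent from both prediction and ground truth are skipped
    so they neither inflate nor penalize the average.
    """
    ious = []
    for c in range(num_classes):
        p = (pred == c)
        t = (target == c)
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class never appears; skip it
        inter = np.logical_and(p, t).sum()
        ious.append(inter / union)
    return float(np.mean(ious))

# Example: two classes over a 2x2 label map
pred   = np.array([[0, 1], [1, 1]])
target = np.array([[0, 1], [0, 1]])
print(round(mean_iou(pred, target, num_classes=2), 4))  # 0.5833
```

In practice, per-class intersections and unions are accumulated over the whole test set before dividing, rather than averaging per-image scores.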
Description: Master's thesis
National Chengchi University
Department of Computer Science
108753204
Source: http://thesis.lib.nccu.edu.tw/record/#G0108753204
Type: thesis
Appears in Collections: Theses (學位論文)

Files in This Item:
File: 320401.pdf (14.95 MB, Adobe PDF)
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.