Academic Output: Degree Theses
Title: 基於語意分割之電纜線偵測 (Power Line Detection Based on Semantic Segmentation)
Author: YOU, CHIN-JUI (游晉瑞)
Advisor: Liao, Wen-Hung (廖文宏)
Keywords: computer vision; semantic segmentation; deep learning
Date: 2021
Uploaded: 1 Jul 2021 19:54:29 (UTC+8)
Abstract
In the field of computer vision, semantic segmentation is an essential technique, widely employed in tasks such as unmanned vehicles, autonomous driving, scene understanding, and obstacle avoidance. With the advancement of deep learning, the performance of semantic segmentation has progressed rapidly, yet there is still much room for improvement in the detection of power lines. Open-source datasets of transmission towers and power lines are currently quite limited. This research conducts semantic segmentation experiments for power line inspection on two open-source datasets, the Ground Truth of Powerline Dataset and the PLD-UAV dataset, and re-labels the ground truth of both so that models receive accurate ground-truth information.

Researchers have reported the benefits of fusing data from different sensors, such as thermal or depth sensors, to improve the accuracy of optical-image models. Among these, RTFNet uses two encoders to fuse thermal-image features into the optical stream; however, this architecture does not consider that optical-image features can likewise be fused into the thermal branch, so the two modalities cannot assist each other. Building on RTFNet, this research proposes the Dual Segmentation Model (DSM), which strengthens the optical-image model of power lines through edge enhancement so that semantic-level information from the two branches is complementary and segmentation accuracy improves. Experimental results indicate that the DSM outperforms LS-Net, a strong existing power line detector. Specifically, the precision of our model (0.7919) is comparable to that of LS-Net (0.8004), while the recall (0.7710) surpasses that of LS-Net (0.5368). The F-score of the DSM is 0.7753, exceeding that of LS-Net (0.5940) by nearly 0.2, validating the reliability of the proposed approach.

References
[1] Y. LeCun, C. Cortes, and C. J. C. Burges. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/, last visited Dec. 2018.
[2] ImageNet. http://www.image-net.org/
[3] Y. Lin, F. Lv, S. Zhu, M. Yang, T. Cour, K. Yu, L. Cao, and T. Huang. Large-scale image classification: Fast feature extraction and SVM training. In CVPR, pp. 1689–1696, 2011.
[4] J. Deng et al. ImageNet large scale visual recognition competition 2012 (ILSVRC2012). http://www.image-net.org/challenges/LSVRC/, 2012.
[5] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. arXiv:1409.4842, 2014.
[6] E. Park et al. ILSVRC-2017. http://www.image-net.org/challenges/LSVRC/2017, 2017.
[7] J. Hu, L. Shen, and G. Sun. Squeeze-and-excitation networks. arXiv:1709.01507, 2017.
[8] Y. LeCun et al. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998.
[9] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556, 2014.
[10] K. He et al. Deep residual learning for image recognition. In CVPR, 2016.
[11] G. Huang et al. Densely connected convolutional networks. In CVPR, 2017.
[12] R. K. Srivastava, K. Greff, and J. Schmidhuber. Highway networks. arXiv:1505.00387, 2015.
[13] O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation. arXiv:1505.04597, 2015.
[14] V. N. Nguyen, R. Jenssen, and D. Roverso. LS-Net: Fast single-shot line-segment detector. arXiv:1912.09532, 2020.
[15] P. V. C. Hough. Machine analysis of bubble chamber pictures. In Proc. Int. Conf. High Energy Accelerators and Instrumentation, 1959.
[16] R. O. Duda and P. E. Hart. Use of the Hough transformation to detect lines and curves in pictures. Communications of the ACM, 15, pp. 11–15, 1972.
[17] Y. Chen, Y. Li, and H. Zhang. Automatic power line extraction from high resolution remote sensing imagery based on an improved Radon transform. Pattern Recognition, 49, pp. 174–186, 2016.
[18] Y. Chen, Y. Li, H. Zhang, L. Tong, Y. Cao, and Z. Xue. Automatic power line extraction from high resolution remote sensing imagery based on an improved Radon transform. https://doi.org/10.1016/j.patcog.2015.07.004
[19] P. Arbeláez, M. Maire, C. Fowlkes, and J. Malik. From contours to regions: An empirical evaluation. In CVPR, pp. 2294–2301, 2009.
[20] P. Arbeláez, M. Maire, C. Fowlkes, and J. Malik. Contour detection and hierarchical image segmentation. IEEE Trans. Pattern Anal. Mach. Intell., 33, pp. 898–916, 2011.
[21] W. Shen, X. Wang, Y. Wang, and B. Xiang. DeepContour: A deep convolutional feature learned by positive-sharing loss for contour detection. In CVPR, pp. 3982–3991, 2015.
[22] G. Bertasius, J. Shi, and L. Torresani. DeepEdge: A multi-scale bifurcated deep network for top-down contour detection. In CVPR, pp. 4380–4389, 2015.
[23] J. Yang, B. Price, and S. Cohen. Object contour detection with a fully convolutional encoder-decoder network. In CVPR, pp. 193–202, 2016.
[24] K. K. Maninis, J. Pont-Tuset, and P. Arbeláez. Convolutional oriented boundaries: From image segmentation to high-level tasks. IEEE Trans. Pattern Anal. Mach. Intell., 40, pp. 819–833, 2018.
[25] R. Madaan, D. Maturana, and S. Scherer. Wire detection using synthetic data and dilated convolutional networks for unmanned aerial vehicles. In IROS, pp. 3487–3494, 2017.
[26] X. Zhang, G. Xiao, K. Gong, P. Ye, and J. Zhao. Power line detection for aircraft safety based on image processing techniques: Advances and recommendations.
[27] J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In CVPR, 2015.
[28] J. Canny. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell., PAMI-8(6), pp. 679–698, 1986. doi:10.1109/TPAMI.1986.4767851
[29] R. Abdelfattah, X. Wang, and S. Wang. TTPLA: An aerial-image dataset for detection and segmentation of transmission towers and power lines.
[30] Ö. E. Yetgin and Ö. N. Gerek. Ground Truth of Powerline Dataset (Infrared-IR and Visible Light-VL). Mendeley Data, v9, 2019. http://dx.doi.org/10.17632/twxp8xccsw.9
[31] PLD-UAV dataset. https://github.com/SnorkerHeng/PLD-UAV
[32] Y. Sun, W. Zuo, and M. Liu. RTFNet: RGB-thermal fusion network for semantic segmentation of urban scenes. IEEE Robotics and Automation Letters, 2019. doi:10.1109/LRA.2019.2904733
[33] J. Hu, L. Shen, and G. Sun. Squeeze-and-excitation networks. In CVPR, 2018.
[34] X. Huang and S. Belongie. Arbitrary style transfer in real-time with adaptive instance normalization. In ICCV, 2017.
[35] Q. Hou et al. Strip pooling: Rethinking spatial pooling for scene parsing. arXiv:2003.13328, 2020.

Degree: Master's
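The metric comparison in the abstract can be sanity-checked against the standard F-score formula. A minimal sketch (note: the pooled harmonic mean of the quoted precision and recall comes out slightly higher than the F-scores reported in the abstract, which are presumably averaged over images rather than pooled):

```python
def f_score(precision, recall):
    """F1 score: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision/recall pairs quoted in the abstract.
dsm_f = f_score(0.7919, 0.7710)    # ~0.781 (abstract reports 0.7753)
lsnet_f = f_score(0.8004, 0.5368)  # ~0.643 (abstract reports 0.5940)
```

Either way, the ordering matches the thesis's conclusion: the DSM's much higher recall at near-equal precision gives it a clearly higher F-score than LS-Net.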
National Chengchi University
Department of Computer Science
Student ID: 107753043
Source: http://thesis.lib.nccu.edu.tw/record/#G0107753043
Identifier: G0107753043
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/135981
Type: thesis
Format: application/pdf, 4918113 bytes
DOI: 10.6814/NCCU202100504

Table of Contents
List of Tables 5
List of Figures 6
Chapter 1 Introduction 9
1.1 Research Background and Motivation 9
1.2 Research Objectives 11
1.3 Thesis Organization 11
Chapter 2 Background and Related Work 12
2.1 Deep Learning Architectures 12
2.2 Overview of Convolutional Neural Networks 15
2.2.1 Fully Connected vs. Locally Connected Layers 16
2.2.2 Pooling Layers 17
2.2.3 ResNet 18
2.2.4 U-Net 18
2.2.5 Semantic Segmentation 20
2.3 Related Work on Power Line Detection 21
2.3.1 LS-Net 21
2.3.2 Hough Transform 23
2.3.3 Traditional Gradient-Based Methods 24
2.3.4 Deep-Learning-Based Methods 25
2.3.5 Encoder-Decoder Architectures 26
2.3.6 RTFNet 26
2.3.7 Low-Level Feature Domain Adaptation (SEAdIN) 28
2.3.8 High-Level Feature Self-Attention: Strip Pooling Module 29
2.4 Summary 30
Chapter 3 Power Line Detection and Dataset Overview 31
3.1 Canny Edge Detector 31
3.2 Ground Truth of Powerline Dataset 33
3.3 PLD-UAV 34
Chapter 4 Methodology and Experimental Results 37
4.1 Semantic Segmentation Results on Power Line Images 37
4.2 Dual Segmentation Model Architecture 49
4.3 Ablation Study 51
4.4 Denoising 54
4.5 Summary 56
Chapter 5 Conclusions and Future Work 57
5.1 Conclusions 57
5.2 Future Work 58
References 60
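The table of contents lists a Canny edge detector stage (Section 3.1) and traditional gradient-based methods (Section 2.3.3). The gradient-magnitude step underlying both can be sketched in pure NumPy; this is an illustrative sketch, not the thesis's implementation, and the 3x3 Sobel kernels and the synthetic one-pixel "wire" image are assumptions:

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude via 3x3 Sobel kernels: the first stage of
    Canny-style edge detection (correlation, edge-replicated border)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    pad = np.pad(img.astype(float), 1, mode="edge")
    h, w = img.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(3):  # accumulate the 9 shifted-and-weighted windows
        for j in range(3):
            window = pad[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    return np.hypot(gx, gy)

# Synthetic stand-in for an aerial frame: one horizontal "wire" on empty sky.
frame = np.zeros((16, 16))
frame[8, :] = 1.0
mag = sobel_magnitude(frame)
# The response concentrates on the rows adjacent to the wire and vanishes
# far from it; thresholding such a map is the core of the gradient-based
# line extraction that the thesis's edge-enhanced branch builds on.
```

A Hough or Radon transform (Sections 2.3.2 and refs [15]–[18]) would then be applied to the thresholded map to recover straight line segments.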