Please use this identifier to cite or link to this item: https://ah.lib.nccu.edu.tw/handle/140.119/135981
Title: Power Line Detection Based on Semantic Segmentation (基於語意分割之電纜線偵測)
Author: YOU, CHIN-JUI (游晉瑞)
Contributors: Liao, Wen-Hung (廖文宏)
YOU, CHIN-JUI (游晉瑞)
Keywords: Computer vision
Semantic segmentation
Deep learning
Date: 2021
Issue Date: 1-Jul-2021
Abstract: In the field of computer vision, semantic segmentation is a key technique, widely employed in tasks such as unmanned vehicles, autonomous driving, scene understanding, and obstacle avoidance. With the advancement of deep learning, semantic segmentation has progressed rapidly, yet there is still much room for improvement in power line detection. Open-source data for electric towers and power lines are currently quite limited. This research conducts semantic segmentation experiments for power line inspection on the two major open-source power line datasets, namely the Ground Truth of Powerline Dataset and the PLD-UAV dataset, and re-labels the ground truth of both so that models can learn from accurate annotations.

In recent years, researchers have reported that fusing data from different sensors, such as thermal or depth sensors, can improve the accuracy of optical image models. Among these approaches, RTFNet uses two encoders to fuse thermal image features into the optical stream, but its architecture does not consider fusing optical image features into the thermal stream so that the two modalities can assist each other. Building on RTFNet, this research proposes the Dual Segmentation Model (DSM), which strengthens the optical branch for power lines through edge enhancement so that semantic-level information from the two branches becomes complementary and segmentation accuracy improves beyond that of LS-Net, a model with excellent performance in power line detection. Experimental results show that the precision of the proposed model (0.7919) is comparable to that of LS-Net (0.8004), while its recall (0.7710) surpasses LS-Net's 0.5368. The resulting F-score of 0.7753 exceeds LS-Net's 0.5940 by nearly 0.2, demonstrating the reliability of the proposed model.
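The precision, recall, and F-score quoted above are the standard pixel-wise metrics for binary segmentation masks. As a minimal sketch of how such scores are computed (the function name and the toy masks are illustrative, not taken from the thesis):

```python
import numpy as np

def segmentation_scores(pred, gt):
    """Pixel-wise precision, recall and F-score for binary masks."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()    # correctly detected line pixels
    fp = np.logical_and(pred, ~gt).sum()   # background predicted as line
    fn = np.logical_and(~pred, gt).sum()   # missed line pixels
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f

# Toy example: a one-pixel-wide "power line" on row 1 of a 4x4 image
gt = np.zeros((4, 4), dtype=np.uint8)
gt[1, :] = 1                  # ground-truth line spans the full row
pred = np.zeros((4, 4), dtype=np.uint8)
pred[1, :3] = 1               # prediction misses the last pixel
p, r, f = segmentation_scores(pred, gt)
print(round(p, 3), round(r, 3), round(f, 3))  # → 1.0 0.75 0.857
```

Note that when scores are averaged per test image, the mean F-score generally differs from the harmonic mean of the mean precision and recall, so a reported F-score (here 0.7753) need not equal the harmonic mean of the reported precision (0.7919) and recall (0.7710).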
References: [1] Y. LeCun, C. Cortes, C. J. C. Burges. The MNIST Database of Handwritten Digits. http://yann.lecun.com/exdb/mnist/, last visited Dec 2018.
[2] ImageNet. http://www.image-net.org/
[3] Y. Lin, F. Lv, S. Zhu, M. Yang, T. Cour, K. Yu, L. Cao, T. Huang. Large-scale image classification: Fast feature extraction and SVM training. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1689–1696, 2011.
[4] J. Deng et al. ImageNet Large Scale Visual Recognition Challenge 2012 (ILSVRC2012). http://www.image-net.org/challenges/LSVRC (2012).
[5] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich. Going Deeper with Convolutions. arXiv:1409.4842, 2014.
[6] E. Park et al. ILSVRC-2017. http://www.image-net.org/challenges/LSVRC/2017 (2017).
[7] J. Hu, L. Shen, G. Sun. Squeeze-and-Excitation Networks. arXiv preprint arXiv:1709.01507, 2017.
[8] Y. LeCun et al. Gradient-based learning applied to document recognition. 1998.
[9] K. Simonyan, A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv preprint arXiv:1409.1556, 2014.
[10] K. He et al. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[11] G. Huang et al. Densely Connected Convolutional Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[12] R. K. Srivastava, K. Greff, J. Schmidhuber. Highway Networks. arXiv:1505.00387 [cs.LG].
[13] O. Ronneberger, P. Fischer, T. Brox. U-Net: Convolutional Networks for Biomedical Image Segmentation. arXiv:1505.04597 [cs.CV], 18 May 2015.
[14] V. N. Nguyen, R. Jenssen, D. Roverso (UiT Machine Learning Group, UiT The Arctic University of Norway; Analytics Department, eSmart Systems). LS-Net: Fast Single-Shot Line-Segment Detector. arXiv:1912.09532 [cs.CV], 24 Jan 2020.
[15] P. V. C. Hough. Machine Analysis of Bubble Chamber Pictures. In Proceedings of the International Conference on High Energy Accelerators and Instrumentation, 1959.
[16] R. O. Duda, P. E. Hart. Use of the Hough Transformation to Detect Lines and Curves in Pictures. Communications of the ACM, 15:11–15, January 1972.
[17] Y. Chen, Y. Li, H. Zhang. Automatic power line extraction from high resolution remote sensing imagery based on an improved Radon transform. Pattern Recognition, 49:174–186, 2016.
[18] Y. Chen, Y. Li, H. Zhang, L. Tong, Y. Cao, Z. Xue. Automatic power line extraction from high resolution remote sensing imagery based on an improved Radon transform. https://doi.org/10.1016/j.patcog.2015.07.004
[19] P. Arbeláez, M. Maire, C. Fowlkes, J. Malik. From contours to regions: An empirical evaluation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, 20–25 June 2009, pp. 2294–2301.
[20] P. Arbeláez, M. Maire, C. Fowlkes, J. Malik. Contour detection and hierarchical image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33:898–916, 2011.
[21] W. Shen, X. Wang, Y. Wang, B. Xiang. DeepContour: A deep convolutional feature learned by positive-sharing loss for contour detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015, pp. 3982–3991.
[22] G. Bertasius, J. Shi, L. Torresani. DeepEdge: A multi-scale bifurcated deep network for top-down contour detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015, pp. 4380–4389.
[23] J. Yang, B. Price, S. Cohen. Object Contour Detection with a Fully Convolutional Encoder-Decoder Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016, pp. 193–202.
[24] K. K. Maninis, J. Pont-Tuset, P. Arbeláez. Convolutional oriented boundaries: From image segmentation to high-level tasks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40:819–833, 2018.
[25] R. Madaan, D. Maturana, S. Scherer. Wire detection using synthetic data and dilated convolutional networks for unmanned aerial vehicles. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017, pp. 3487–3494.
[26] X. Zhang, G. Xiao, K. Gong, P. Ye, J. Zhao. Power Line Detection for Aircraft Safety Based on Image Processing Techniques: Advances and Recommendations.
[27] J. Long, E. Shelhamer, T. Darrell. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[28] J. Canny. A Computational Approach to Edge Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-8(6):679–698, Nov. 1986. doi:10.1109/TPAMI.1986.4767851.
[29] R. Abdelfattah, X. Wang, S. Wang. TTPLA: An Aerial-Image Dataset for Detection and Segmentation of Transmission Towers and Power Lines.
[30] Ö. E. Yetgin, Ö. N. Gerek. Ground Truth of Powerline Dataset (Infrared-IR and Visible Light-VL). Mendeley Data, v9, 2019. http://dx.doi.org/10.17632/twxp8xccsw.9
[31] PLD-UAV Dataset. https://github.com/SnorkerHeng/PLD-UAV
[32] Y. Sun, W. Zuo, M. Liu. RTFNet: RGB-Thermal Fusion Network for Semantic Segmentation of Urban Scenes. IEEE Robotics and Automation Letters, 2019. doi:10.1109/LRA.2019.2904733.
[33] J. Hu, L. Shen, G. Sun. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[34] X. Huang, S. Belongie. Arbitrary style transfer in real-time with adaptive instance normalization. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017.
[35] Q. Hou et al. Strip Pooling: Rethinking Spatial Pooling for Scene Parsing. arXiv:2003.13328, 2020.
Description: Master's thesis
National Chengchi University
Department of Computer Science
107753043
Source: http://thesis.lib.nccu.edu.tw/record/#G0107753043
Data Type: thesis
Appears in Collections: Theses

Files in This Item: 304301.pdf (4.8 MB, Adobe PDF)
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.