Real-time object detection with applications to underwater recreational activities

Title	Real-time object detection with applications to underwater recreational activities (應用於水下休閒活動之即時物件偵測系統)
Author	Cheng, Wen-Chia (鄭文嘉)
Contributors	Liao, Wen-Hung (廖文宏); Cheng, Wen-Chia (鄭文嘉)
Keywords	Underwater image; Color correction; Object detection; Transfer learning; Deep learning
Date	2020
Upload time	2-Sep-2020 13:15:19 (UTC+8)
Abstract	This thesis applies color correction to underwater images taken with ordinary cameras so that object detection models trained on existing image datasets can handle underwater scenes without extensive retraining. The study uses the pix2pix network, a variant of the generative adversarial network (GAN), to correct underwater images so that they resemble images taken above water. Restoration quality is analyzed and evaluated across different combinations of loss function, number of iterations, and data grouping. The object detection model is trained with transfer learning, and the per-class average precision (AP) and overall mean average precision (mAP) are analyzed against the requirements of real-time underwater object detection. After evaluating different models and parameter settings, the best result is an AP of 0.71 for Fish, 0.72 for Jellyfish, and 0.39 for Diver, with an overall mAP of 0.606, a 50.3% improvement in mAP over the same setup without color correction. The system is intended to let people engaged in underwater recreation, whose field of view is limited by the diving mask, use an underwater camera to quickly detect the position and information of objects of interest inside and outside that field of view, making better use of limited time underwater.
Description	Master's thesis; National Chengchi University; In-service Master's Program, Department of Computer Science; 105971018
Source	http://thesis.lib.nccu.edu.tw/record/#G0105971018
Type	thesis

dc.contributor.advisor	廖文宏	zh_TW
dc.contributor.advisor	Liao, Wen-Hung	en_US
dc.contributor.author (author)	鄭文嘉	zh_TW
dc.contributor.author (author)	Cheng, Wen-Chia	en_US
dc.creator (author)	鄭文嘉	zh_TW
dc.creator (author)	Cheng, Wen-Chia	en_US
dc.date (date)	2020	en_US
dc.date.accessioned	2-Sep-2020 13:15:19 (UTC+8)	-
dc.date.available	2-Sep-2020 13:15:19 (UTC+8)	-
dc.date.issued (upload time)	2-Sep-2020 13:15:19 (UTC+8)	-
dc.identifier (other identifier)	G0105971018	en_US
dc.identifier.uri (URI)	http://nccur.lib.nccu.edu.tw/handle/140.119/131937	-
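dc.description (description)	Master's	zh_TW
dc.description (description)	National Chengchi University	zh_TW
dc.description (description)	In-service Master's Program, Department of Computer Science	zh_TW
dc.description (description)	105971018	zh_TW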
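dc.description.tableofcontents	Chapter 1 Introduction: 1.1 Research motivation; 1.2 Thesis organization. Chapter 2 Related work: 2.1 Principles of underwater imaging; 2.2 Deep learning, object detection, and color correction techniques; 2.3 Color correction and generative adversarial networks. Chapter 3 Methodology: 3.1 Basic concept; 3.2 Preliminary study; 3.2.1 Underwater image collection; 3.2.2 Paired image generation; 3.2.3 Model evaluation; 3.2.4 Loss function evaluation; 3.2.5 Collection of object images for object detection; 3.2.6 Performance evaluation of object detection models; 3.3 Research framework design; 3.3.1 Problem statement; 3.3.2 Research framework. Chapter 4 Research process and result analysis: 4.1 Research process; 4.1.1 Dataset description; 4.1.2 Paired image output; 4.1.3 Color correction evaluation stage; 4.1.4 Object detection evaluation stage; 4.1.5 Object detection retraining stage; 4.2 Analysis stage; 4.2.1 Paired image analysis; 4.2.2 Analysis of color correction results; 4.2.3 Analysis of object detection retraining results; 4.2.4 Analysis of object detection results after color correction; 4.3 System feasibility analysis; 4.3.1 Feasibility of color correction; 4.3.2 Feasibility of real-time detection; 4.3.3 Feasibility of real-time object detection; 4.4 Applications of the research results. Chapter 5 Conclusions and future work: 5.1 Conclusions; 5.2 Future research directions. References. Appendix.	zh_TW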
dc.format.extent	10270772 bytes	-
dc.format.mimetype	application/pdf	-
dc.source.uri (source)	http://thesis.lib.nccu.edu.tw/record/#G0105971018	en_US
dc.subject (keywords)	Underwater image	en_US
dc.subject (keywords)	Color correction	en_US
dc.subject (keywords)	Object detection	en_US
dc.subject (keywords)	Transfer learning	en_US
dc.subject (keywords)	Deep learning	en_US
dc.title (title)	應用於水下休閒活動之即時物件偵測系統	zh_TW
dc.title (title)	Real-time object detection with applications to underwater recreational activities	en_US
dc.type (type)	thesis	en_US
dc.identifier.doi (DOI)	10.6814/NCCU202001418	en_US
