Title: Performance Evaluation on the AI Object Detection Accelerator for 4K UHD Video Streaming Application
Original title: 4K UHD高解析度影像串流下的AI物件辨識加速器效能評測
Author: Hsieh, Chiung-Chi (謝瓊琪)
Contributors: Jang, Hung-Chin (張宏慶), advisor
Hsieh, Chiung-Chi (謝瓊琪)
Keywords: Ultra high-definition 4K video streaming
artificial intelligence inference
artificial intelligence inference accelerator
performance evaluation
Date: 2023
Uploaded: 1-Sep-2023 15:38:57 (UTC+8)
Abstract: As video coding has evolved from H.264 to H.265 and VP9, the compressed size of high-resolution video can be reduced by nearly 50%. Together with significant improvements in wired and wireless transmission technology and in network equipment specifications, this has lowered the cost of network transmission. Streaming video now accounts for as much as 65% of internet data traffic, and mainstream video resolution has moved from Full High Definition (FHD) 1080p to 4K Ultra High Definition (UHD). Artificial intelligence object detection and recognition has a wide range of applications, from feature recognition to video analysis, and plays a critical role in fields such as security, healthcare, sports, transportation, and manufacturing. Most current studies on object detection and recognition in video focus on FHD 1080p resolution. In view of steadily increasing video resolution, this thesis takes 4K UHD video streaming applications as its starting point: it pairs an NPU with different hardware architectures (CPU, GPGPU, SoC) and software pipelines, and evaluates them to identify the better-performing video-based AI inference accelerator architecture.
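To put the move from FHD to 4K UHD in perspective, the following sketch computes pixel counts and uncompressed data rates at 30 frames per second. The frame sizes (1920x1080 and 3840x2160) are the standard values for these formats; the 8-bit YUV 4:2:0 pixel format (12 bits per pixel on average) and the function name are assumptions made for illustration and are not taken from the thesis.

```python
def raw_bitrate_mbps(width, height, fps, bits_per_pixel=12):
    """Uncompressed video data rate in megabits per second (1 Mb = 10**6 bits).

    bits_per_pixel=12 assumes 8-bit YUV 4:2:0: 8 bits of luma per pixel plus
    two chroma planes subsampled by 2 in each dimension (4 bits on average).
    """
    return width * height * fps * bits_per_pixel / 1_000_000


FHD = (1920, 1080)      # Full High Definition frame size
UHD_4K = (3840, 2160)   # 4K Ultra High Definition frame size

fhd_rate = raw_bitrate_mbps(*FHD, fps=30)
uhd_rate = raw_bitrate_mbps(*UHD_4K, fps=30)
pixel_ratio = (UHD_4K[0] * UHD_4K[1]) / (FHD[0] * FHD[1])

print(f"1080p30 raw: {fhd_rate:.1f} Mb/s")
print(f"4Kp30 raw:   {uhd_rate:.1f} Mb/s")
print(f"4K UHD has {pixel_ratio:.0f}x the pixels of 1080p")
```

Under these assumptions a 4K frame carries four times the pixels of a 1080p frame, so every stage that touches decoded frames (decode, color conversion, scaling, inference pre-processing) handles roughly four times the data per frame.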
Description: Master's thesis
National Chengchi University
In-service Master's Program, Department of Computer Science
104971019
Source: http://thesis.lib.nccu.edu.tw/record/#G0104971019
Type: thesis
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/147092
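Table of Contents:
     Chapter 1 Introduction
     1.1 Research Background
     1.2 Thesis Organization
     Chapter 2 Related Technologies
     2.1 Video Streaming Formats
     2.1.1 Resolution (1080, 4K, 8K)
     2.1.2 Video Coding Formats (H.264/AVC, H.265/HEVC)
     2.1.3 Color Encoding (RGB, YUV)
     2.2 GStreamer Multimedia Framework
     2.3 Neural Network Models for Object Classification and Recognition
     2.3.1 YOLO (You Only Look Once)
     2.3.2 ResNet (Residual Network)
     2.3.3 SSD-MobileNet (Single Shot MultiBox Detector-MobileNet)
     2.4 Related Performance Metrics
     2.4.1 Energy Efficiency: FPS/Watt
     2.4.2 Explicit Unit Benefit Cost: Cost/Utility
     Chapter 3 Design and Implementation of the Performance Evaluation
     3.1 Hardware Specifications (Test Environment)
     3.2 System Platform and Software Pipeline Flow
     3.2.1 Hardware Architectures of Configurations (A), (B), and (C)
     3.2.2 Pipeline Flow for AI Object Recognition on Video Streams
     3.3 Video Files, Software Names, and GStreamer Plug-ins
     3.3.1 Video Files
     3.3.2 Software Names
     3.3.3 GStreamer Plug-ins
     3.4 Experimental Architecture and Evaluation Method
     3.4.1 Pipeline Resource Allocation under the Video-Based AI Inference Accelerator Hardware Architecture
     3.4.2 Evaluation Method
     Chapter 4 Experimental Results and Analysis: Single-Stream 4Kp30 Video Object Recognition Performance
     4.1 Experiment 1: 4Kp30 Video Input, YOLOv5m AI Object Recognition, 2Kp30 Output
     4.2 Experiment 2: 4K Video Input, YOLOv5m AI Object Recognition, 4K Video Output
     4.3 Comparison of Output to 2K and 4K Displays after AI Object Recognition on a 4Kp30 Video Stream
     4.4 Supplementary Information
     Chapter 5 Experimental Results and Analysis: Multi-Stream 4Kp30 Video Object Recognition Performance
     5.1 Platform Software and Hardware Architecture
     5.1.1 Hardware Architecture of Configuration (C)
     5.1.2 Pipeline Flow of Configuration (C)
     5.2 Pipeline Flow and Hardware Resource Allocation for 4-Stream 4Kp30 AI Object Recognition
     5.3 Evaluation Results for 4-Stream 4Kp30 AI Object Recognition
     Chapter 6 Conclusion
     6.1 Summary
     6.2 Future Research Directions
     References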
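Section 2.4.1 of the outline names FPS/Watt as the energy-efficiency metric. A minimal sketch of how such a figure could be computed follows, assuming throughput is measured as frames processed over a wall-clock window and power is an average reading in watts over the same window. The function names and the sample numbers are hypothetical and are not results from the thesis.

```python
def frames_per_second(frame_count, elapsed_seconds):
    """Average throughput over a measurement window."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return frame_count / elapsed_seconds


def fps_per_watt(fps, average_power_watts):
    """Energy efficiency: frames per second delivered per watt consumed."""
    if average_power_watts <= 0:
        raise ValueError("average_power_watts must be positive")
    return fps / average_power_watts


# Hypothetical measurement, for illustration only (not thesis data):
# 900 frames processed in 30 s while drawing 15 W on average.
fps = frames_per_second(frame_count=900, elapsed_seconds=30.0)
efficiency = fps_per_watt(fps, average_power_watts=15.0)
print(f"{fps:.1f} FPS at 15.0 W -> {efficiency:.2f} FPS/W")
```

Normalizing by power lets configurations with very different power budgets, such as a discrete GPGPU and an SoC, be compared on a single scale.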