Title: 基於交叉注意力合成之二曝光影像融合
Two-Exposure Image Fusion based on Cross Attention Fusion
Author: 黃莎涴
Huang, Sha-Wan
Contributors: 彭彥璁 (Peng, Yan-Tsung), advisor
黃莎涴 (Sha-Wan Huang), author
Keywords: High Dynamic Range imaging
Two-exposure image fusion
Date: 2021
Uploaded: 2-Sep-2021 16:55:51 (UTC+8)
Abstract:
High Dynamic Range (HDR) imaging requires fusing images of the same scene captured at multiple exposure ratios to cover the entire dynamic range. Doing so with only a few low dynamic range (LDR) images remains a challenging task. This thesis presents a novel two-exposure image fusion model featuring the proposed Cross Attention Fusion Module (CAFM), which uses one image's highlights to compensate for the content the other image loses to under-exposure or over-exposure. The CAFM consists of Cross Attention Fusion and Channel Attention Fusion, which together form a dual-branch fusion that produces superior fusion results. Extensive experiments on public benchmark HDR datasets demonstrate that the proposed model performs favorably against state-of-the-art image fusion methods.
Description: Master's degree
National Chengchi University (國立政治大學)
Department of Computer Science
108753138
Source: http://thesis.lib.nccu.edu.tw/record/#G0108753138
Type: thesis
dc.identifier (Other Identifiers): G0108753138
dc.identifier.uri (URI): http://nccur.lib.nccu.edu.tw/handle/140.119/136966
dc.description.tableofcontents:
Abstract (Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Thesis Organization
2 Technical Background and Related Work
2.1 HDR Image Fusion Based on Traditional Image Processing
2.2 HDR Image Fusion Based on Deep Learning
2.3 Attention Mechanisms: Introduction and Advances
2.4 Summary
3 Methodology
3.1 High Dynamic Range Image Generation
3.1.1 Introduction to Strip Pooling Attention
3.1.2 Cross Attention Fusion (XAF)
3.1.3 Channel Attention Fusion (CAF)
3.2 Loss Function
3.2.1 Loss Function
3.3 Dataset
3.4 Training Setup
3.5 Fusion Evaluation Metrics
4 Experimental Results and Analysis
5 Ablation Study
6 Conclusion and Future Work
References
dc.format.extent: 8999798 bytes
dc.format.mimetype: application/pdf
dc.identifier.doi (DOI): 10.6814/NCCU202101538
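Illustrative note: the abstract describes a dual-branch fusion that combines Cross Attention Fusion with Channel Attention Fusion. This record does not give the layer-level design, so the following is only a minimal sketch of the general idea, not the thesis's CAFM. It assumes features are small lists of per-position channel vectors, uses plain scaled dot-product attention with no learned projections, and stands in a simple sigmoid gate on channel means for channel attention. All function names (`cross_attention`, `channel_attention`, `fuse`) and the averaging of the two branches are hypothetical choices made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query_feats, context_feats):
    """Each query position receives a softmax-weighted average of context features.

    Assumption: no learned query/key/value projections; raw features are used.
    """
    dim = len(query_feats[0])
    out = []
    for q in query_feats:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in context_feats
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * v[c] for w, v in zip(weights, context_feats)) for c in range(dim)]
        )
    return out


def channel_attention(feats):
    """Rescale each channel by the sigmoid of its mean over all positions.

    Assumption: a parameter-free stand-in for a learned channel-gating layer.
    """
    n, dim = len(feats), len(feats[0])
    means = [sum(f[c] for f in feats) / n for c in range(dim)]
    gates = [1.0 / (1.0 + math.exp(-m)) for m in means]
    return [[f[c] * gates[c] for c in range(dim)] for f in feats]


def fuse(under_feats, over_feats):
    """Hypothetical dual-branch fusion: average the two branch outputs."""
    # Branch 1: under-exposed positions attend to the over-exposed features.
    cross = cross_attention(under_feats, over_feats)
    # Branch 2: channel gating applied to the element-wise sum of both inputs.
    summed = [[u + o for u, o in zip(uf, of)] for uf, of in zip(under_feats, over_feats)]
    gated = channel_attention(summed)
    return [[0.5 * (a + b) for a, b in zip(ca, ga)] for ca, ga in zip(cross, gated)]


# Toy features: 3 positions, 2 channels each (made-up values).
under = [[0.1, 0.0], [0.2, 0.1], [0.0, 0.3]]
over = [[0.9, 0.8], [1.0, 0.7], [0.8, 1.0]]
fused = fuse(under, over)
```

In this sketch `fused` has the same shape as the inputs (one channel vector per position). A real model would operate on convolutional feature maps and learn the attention weights during training.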