Title 應用於白平衡校正的輔助雙模態跨域變換器
ABCDFormer: Auxiliary Bimodal Cross-Domain Transformer for White Balance Correction
Creator 邱鈺臻 (Chiu, Yu-Cheng)
Contributor (Advisor) 彭彥璁 (Peng, Yan-Tsung)
邱鈺臻 (Chiu, Yu-Cheng)
Key Words 白平衡 (White balance)
多模態 (Multiple modalities)
變換器 (Transformer)
Date 2025
Date Issued 3-Mar-2025 14:03:26 (UTC+8)
Summary 對於sRGB影像實現準確的白平衡(WB)是一項具有挑戰性的任務,需要修正來自不同光源的色溫變化並消除色偏,以生成自然且中性的顏色。現有的白平衡方法通常因後期處理中的全局色彩調整以及當前數據集中有限的色彩多樣性而受限,導致在色偏明顯的影像中表現不佳。為了解決這些限制,我們提出了一種輔助雙模態跨域變換器(Auxiliary Bimodal Cross-Domain Transformer, ABCDFormer),通過整合來自多模態與多領域的互補知識來提升白平衡校正效果。ABCDFormer結合了兩個輔助模型,用於提取全域顏色和色度直方圖訊息,進一步豐富目標模型對sRGB輸入的處理。此外,我們引入了一個互動通道注意力模組(Interactive Channel Attention, ICA),以促進跨模態知識轉移,將精煉的顏色特徵嵌入到影像表示中,實現更精確的校正。在公共白平衡基準數據集上的大量實驗表明,ABCDFormer優於現有的最先進方法。
Achieving accurate white balance (WB) for sRGB images is a challenging task, requiring the correction of color temperature variations from diverse light sources and the elimination of color casts to produce natural, neutral colors. Existing WB methods often struggle due to the limitations of global color adjustments applied during post-sRGB processing and the restricted color diversity in current datasets, resulting in suboptimal performance, especially for images with significant color shifts. To address these limitations, we propose an Auxiliary Bimodal Cross-Domain Transformer (ABCDFormer), which enhances WB correction by leveraging complementary knowledge from multiple modalities and domains. ABCDFormer integrates two auxiliary models to extract global color and chromaticity histograms, enriching the target model’s sRGB input processing. Additionally, an Interactive Channel Attention (ICA) module is introduced to facilitate cross-modality knowledge transfer, embedding refined color features into image representations for more precise corrections. Extensive experiments on benchmark WB datasets demonstrate that ABCDFormer outperforms state-of-the-art methods.
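The abstract describes auxiliary models that extract global color and chromaticity histogram information from the sRGB input. As a rough illustration of what such an auxiliary input can look like (a minimal sketch under assumed binning, not the thesis's actual feature extractor), an rg-chromaticity histogram can be computed as:

```python
import numpy as np

def rg_chroma_histogram(img, bins=64):
    """2D rg-chromaticity histogram of an RGB image.

    img: float array of shape (H, W, 3) with values in [0, 1].
    Returns a (bins, bins) histogram normalized to sum to 1.
    """
    px = img.reshape(-1, 3).astype(np.float64)
    s = px.sum(axis=1)
    px = px[s > 1e-6]  # drop near-black pixels (undefined chromaticity)
    chroma = px[:, :2] / px.sum(axis=1, keepdims=True)  # (r, g) coordinates
    hist, _, _ = np.histogram2d(chroma[:, 0], chroma[:, 1],
                                bins=bins, range=[[0.0, 1.0], [0.0, 1.0]])
    return hist / hist.sum()

# A flat gray image has a single chromaticity (r = g = 1/3),
# so all histogram mass lands in one bin.
gray = np.full((8, 8, 3), 0.5)
h = rg_chroma_histogram(gray, bins=64)
```

Because chromaticity normalizes out intensity, such a histogram summarizes the global color cast of an image independently of brightness, which is why histogram-domain inputs are a natural complement to the spatial sRGB image.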
References
[1] Mahmoud Afifi and Michael S Brown. Deep white-balance editing. In CVPR, 2020.
[2] Mahmoud Afifi, Marcus A Brubaker, and Michael S Brown. Auto white-balance correction for mixed-illuminant scenes. In WACV, 2022.
[3] Mahmoud Afifi, Brian Price, Scott Cohen, and Michael S Brown. When color constancy goes wrong: Correcting improperly white-balanced images. In CVPR, 2019.
[4] Nikola Banić, Karlo Koščević, and Sven Lončarić. Unsupervised learning for color constancy. arXiv preprint arXiv:1712.00436, 2017.
[5] Jonathan T Barron. Convolutional color constancy. In ICCV, 2015.
[6] Jonathan T Barron and Ben Poole. The fast bilateral solver. In ECCV, 2016.
[7] Jonathan T Barron and Yun-Ta Tsai. Fast Fourier color constancy. In CVPR, 2017.
[8] Simone Bianco and Claudio Cusano. Quasi-unsupervised color constancy. In CVPR, 2019.
[9] Simone Bianco and Raimondo Schettini. Adaptive color constancy using faces. IEEE TPAMI, 2014.
[10] David H Brainard and Brian A Wandell. Analysis of the retinex theory of color vision. JOSA A, 1986.
[11] Gershon Buchsbaum. A spatial processor model for object colour perception. Journal of the Franklin Institute, 1980.
[12] Vladimir Bychkovsky, Sylvain Paris, Eric Chan, and Frédo Durand. Learning photographic global tonal adjustment with a database of input/output image pairs. In CVPR, 2011.
[13] Jonathan Cepeda-Negrete and Raul E Sanchez-Yanez. Gray-world assumption on perceptual color spaces. In PSIVT, 2014.
[14] Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, and Wen Gao. Pre-trained image processing transformer. In CVPR, 2021.
[15] Dongliang Cheng, Dilip K Prasad, and Michael S Brown. Illuminant estimation for color constancy: Why spatial-domain methods work and the role of the color distribution. JOSA A, 2014.
[16] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.
[17] Gaurav Sharma. The CIEDE2000 color-difference formula: Implementation notes, supplementary test data, and mathematical observations. Color Research & Application, 30(1):21–30, 2005.
[18] Peter Vincent Gehler, Carsten Rother, Andrew Blake, Tom Minka, and Toby Sharp. Bayesian color constancy revisited. In CVPR, 2008.
[19] Jie Hu, Li Shen, and Gang Sun. Squeeze-and-excitation networks. In CVPR, 2018.
[20] Yuanming Hu, Baoyuan Wang, and Stephen Lin. FC4: Fully convolutional color constancy with confidence-weighted pooling. In CVPR, 2017.
[21] Xun Huang and Serge Belongie. Arbitrary style transfer in real-time with adaptive instance normalization. In ICCV, 2017.
[22] Thomas Kailath. The divergence and Bhattacharyya distance measures in signal selection. IEEE Transactions on Communication Technology, 1967.
[23] Hakki Can Karaimer and Michael S Brown. Improving color reproduction accuracy on cameras. In CVPR, 2018.
[24] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[25] Furkan Kınlı, Doğa Yılmaz, Barış Özcan, and Furkan Kıraç. Modeling the lighting in scenes as style for auto white-balance correction. In WACV, 2023.
[26] Chunxiao Li, Xuejing Kang, and Anlong Ming. WBFlow: Few-shot white balance for sRGB images via reversible neural flows. In IJCAI, 2023.
[27] Chunxiao Li, Xuejing Kang, Zhifeng Zhang, and Anlong Ming. SWBNet: A stable white balance network for sRGB images. In AAAI, 2023.
[28] Yi-Chen Lo, Chia-Che Chang, Hsuan-Chao Chiu, Yu-Hao Huang, Chia-Ping Chen, Yu-Lin Chang, and Kevin Jou. CLCC: Contrastive learning for color constancy. In CVPR, 2021.
[29] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In ICML, 2021.
[30] Wu Shi, Chen Change Loy, and Xiaoou Tang. Deep specialized network for illuminant estimation. In ECCV, 2016.
[31] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
[32] Joost Van De Weijer, Theo Gevers, and Arjan Gijsenij. Edge-based color constancy. IEEE TIP, 2007.
[33] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In NeurIPS, 2017.
[34] Zhendong Wang, Xiaodong Cun, Jianmin Bao, Wengang Zhou, Jianzhuang Liu, and Houqiang Li. Uformer: A general U-shaped transformer for image restoration. In CVPR, 2022.
[35] Yiyuan Zhang, Xiaohan Ding, Kaixiong Gong, Yixiao Ge, Ying Shan, and Xiangyu Yue. Multimodal pathway: Improve transformers with irrelevant data from other modalities. In CVPR, 2024.
[36] 陳冠融. White balance correction based on a histogram-visual dual-transformer architecture. Master's thesis, National Chengchi University, Taiwan, 2024. (In Chinese; National Digital Library of Theses and Dissertations in Taiwan.)
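The Interactive Channel Attention (ICA) module described in the abstract embeds refined color features via channel attention; reference [19] is the classic squeeze-and-excitation formulation of this idea. Below is a minimal NumPy sketch of SE-style channel gating to illustrate the mechanism (the function name, weights, and reduction ratio are hypothetical, not the thesis's actual ICA design):

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation-style channel attention.

    feat: feature map of shape (C, H, W).
    w1:   (C // r, C) reduction weights; w2: (C, C // r) expansion weights.
    Returns the channel-reweighted feature map (same shape as feat).
    """
    squeeze = feat.mean(axis=(1, 2))               # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)         # ReLU bottleneck
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gate per channel
    return feat * scale[:, None, None]             # rescale each channel

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))
out = channel_attention(feat, w1, w2)
```

In a cross-modality setting, the gating vector can be driven by features from one branch (e.g., the histogram model) to reweight channels of another (the sRGB model), which is the general pattern the ICA module builds on.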
Description 碩士 (Master's thesis)
國立政治大學 (National Chengchi University)
資訊科學系 (Department of Computer Science)
111753202
資料來源 (Source) http://thesis.lib.nccu.edu.tw/record/#G0111753202
Type thesis
dc.identifier (Other Identifiers) G0111753202en_US
dc.identifier.uri (URI) https://nccur.lib.nccu.edu.tw/handle/140.119/155970-
dc.description.tableofcontents
Front matter: 摘要 (Chinese abstract); Abstract; Contents; List of Figures; List of Tables
1 Introduction
1.1 Motivation and Challenges
1.2 Contributions
1.3 Thesis Structure
2 Related Work
2.1 Traditional Color Constancy Methods
2.1.1 Statistic-based Methods
2.1.2 Learning-based Methods
2.2 End-to-end sRGB Image Color Correction Methods
2.2.1 Single Illuminant Task
2.2.2 Multi-illuminant Methods
2.3 Multimodal Training Approaches
2.4 Transformer
3 Approach
3.1 Approach Overview
3.2 Auxiliary Model - PDFformer
3.3 Target Model - sRGBformer
3.4 Interactive Channel Attention (ICA)
3.5 Loss Functions
4 Experimental Results
4.1 Datasets
4.2 Rendered WB Dataset
4.2.1 Rendered WB Dataset - Set1
4.2.2 Rendered WB Dataset - Set2
4.3 Rendered Cube+ Dataset
4.4 Experimental Settings
4.4.1 Training and Testing Sets
4.4.2 Evaluation Metrics
4.4.3 Implementation Details
4.5 Experimental Results
4.5.1 Quantitative Experimental Results
4.5.2 Qualitative Experimental Results
4.6 Ablation Studies
4.7 Bhattacharyya-Based Image Histogram Analysis
4.8 Analysis on Learnable Parameters for Cross-modality Knowledge Transfer
4.9 Inference Time
5 Extension to the Multi-illuminant Task: ABCDFormerM
5.1 Method
5.2 Loss Functions
5.3 Experimental Settings and Results
6 Conclusion
Reference
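Section 4.7 of the outline analyzes image histograms with the Bhattacharyya distance (see reference [22] in the list above). For context, a minimal sketch of the standard Bhattacharyya coefficient and distance between two normalized histograms (the example histograms are made up):

```python
import numpy as np

def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two histograms.

    Both inputs are renormalized to sum to 1. The coefficient
    BC = sum(sqrt(p * q)) lies in [0, 1]; the distance is -log(BC),
    which is 0 for identical histograms and grows as they diverge.
    """
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    p = p / p.sum()
    q = q / q.sum()
    bc = np.sum(np.sqrt(p * q))       # Bhattacharyya coefficient
    return -np.log(max(bc, 1e-12))    # clamp to avoid log(0)

p = np.array([0.25, 0.25, 0.25, 0.25])
q = np.array([0.4, 0.3, 0.2, 0.1])
d_same = bhattacharyya_distance(p, p)  # identical histograms -> 0
d_diff = bhattacharyya_distance(p, q)  # differing histograms -> > 0
```

This kind of measure is a common way to quantify how close two color distributions are, e.g., a corrected image's histogram versus the ground truth's.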
dc.format.extent 16827742 bytes-
dc.format.mimetype application/pdf-