Academic Output: Degree Thesis
Title: 以山水畫為主的水墨風格化生成對抗網路 (Generative Adversarial Network for Landscape Ink Wash Painting Style)
Author: 詹彬 (Zhan, Bin)
Advisor: 紀明德 (Chi, Ming-Te)
Keywords: Chinese Ink Wash Painting; Landscape Painting; Style Transfer; Generative Adversarial Networks; Multimodal
Date: 2024
Uploaded: 1 March 2024, 13:42:52 (UTC+8)
Abstract (translated from the Chinese): This research aims to improve the performance of generative adversarial networks (GANs) in the generation of Chinese ink wash painting. We introduce the concept of depth values and design three loss functions, Recognize Loss, Geometric Loss, and TV Loss, to meet the requirements of ink painting generation. Depth values are introduced to simulate the ink density and tonal depth found in ink wash painting. These loss functions do more than improve the recognizability of the generated images: they also help capture distinctive features of ink painting, such as fluent brushwork, natural lines, and rich texture. By keeping geometric shapes and structure consistent, the generated painting preserves the overall layout and structure of the source image, which strengthens its artistry and realism. In addition, a loss term that suppresses noise and fine detail helps eliminate unnecessary detail from the generated images. We expect these improvements to raise the artistic quality of the generated images and increase their resemblance to traditional ink wash painting.
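Of the three losses, TV Loss (Total Variation Loss) is a standard smoothness penalty. The record does not give the thesis's exact formulation, but a common anisotropic L1 variant for a grayscale image can be sketched as follows (the function name, the L1 form, and the use of NumPy are illustrative assumptions, not the thesis implementation):

```python
import numpy as np

def tv_loss(img: np.ndarray) -> float:
    """Anisotropic L1 total-variation penalty for a 2-D grayscale image.

    Summing absolute differences between vertically and horizontally
    adjacent pixels penalises high-frequency noise and stray detail,
    which matches the role the abstract assigns to TV Loss.
    (Illustrative sketch only; not the thesis implementation.)
    """
    dv = np.abs(np.diff(img, axis=0)).sum()  # vertical neighbour differences
    dh = np.abs(np.diff(img, axis=1)).sum()  # horizontal neighbour differences
    return float(dv + dh)
```

In a GAN objective this term would typically be weighted and summed with the adversarial and other losses; a perfectly flat wash scores 0, so minimising it pushes the generator toward the smooth ink gradients of traditional washes rather than noisy texture.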
Abstract (English): This research aims to enhance the performance of Generative Adversarial Networks (GANs) in the field of Chinese ink painting generation. To achieve this goal, we introduce the concept of depth values and design three types of loss functions tailored to the requirements of ink painting generation. The introduction of depth values is primarily aimed at simulating the ink density and depth in Chinese ink paintings. Through the application of these new loss functions, we aim to increase the artistic quality of the generated images while enhancing their similarity to traditional Chinese ink paintings.
Description: Master's thesis
National Chengchi University
Department of Computer Science
Student ID: 110753213
Source: http://thesis.lib.nccu.edu.tw/record/#G0110753213
Type: thesis
Identifier: G0110753213
URI: https://nccur.lib.nccu.edu.tw/handle/140.119/150174
Table of Contents (translated from the Chinese):
Chapter 1 Introduction 1
 1.1 Research Motivation and Purpose 1
 1.2 Problem Setting 2
Chapter 2 Related Work 4
 2.1 Texture Synthesis 4
 2.2 Neural Style Transfer 5
 2.3 Generative Adversarial Networks 7
 2.4 Depth Maps 7
Chapter 3 Datasets 10
 3.1 ChipPhi 10
 3.2 Chinese Style Transfer 11
 3.3 Chinese Landscape Painting Dataset 11
Chapter 4 Methodology 14
 4.1 Data Preprocessing 14
 4.2 Proposed Improvements 15
  4.2.1 Feature Map Extraction 17
  4.2.2 Generating Geometric Information from Feature Maps 17
  4.2.3 Feature Value Extraction 18
 4.3 Loss Functions 19
  4.3.1 Recognize Loss 19
  4.3.2 Geometric Loss 20
  4.3.3 TV Loss (Total Variation Loss) 21
  4.3.4 Full Objective 22
Chapter 5 Experiments 23
 5.1 Comprehensive Study 24
  5.1.1 Study across Different Models 24
  5.1.2 Study across Different Datasets 25
  5.1.3 Ablation Study 27
 5.2 User Study (2AFC) 28
Chapter 6 Conclusion and Future Work 32
 6.1 Conclusion 32
 6.2 Limitations 33
 6.3 Future Work 33
Appendix A 35
References 43
File format: 36722800 bytes, application/pdf