Academic Output - Degree Thesis
Title | 以山水畫為主的水墨風格化生成對抗網路 Generative Adversarial Network for Landscape Ink Wash Painting Style |
Author | 詹彬 Zhan, Bin |
Contributors | 紀明德 Chi, Ming-Te; 詹彬 Zhan, Bin |
Keywords | Chinese Ink Wash Painting; Landscape Painting; Style Transfer; Generative Adversarial Networks; Multimodal |
Date | 2024 |
Date Uploaded | 1-Mar-2024 13:42:52 (UTC+8) |
Abstract | This research aims to improve the performance of Generative Adversarial Networks (GANs) in Chinese ink wash painting generation. We introduce the concept of depth values and design three loss functions (Recognize Loss, Geometric Loss, and TV Loss) tailored to the requirements of ink wash painting generation. The depth values are intended to simulate the ink density and tonal depth of ink wash painting. Beyond improving the recognizability of the generated images, these loss functions help capture the distinctive characteristics of ink wash painting, such as fluid brushstrokes, natural lines, and rich texture. By preserving geometric shapes and structural consistency, the generated paintings keep the overall layout and structure of the source images, enhancing their artistic quality and realism. In addition, a loss that suppresses noise and fine detail helps eliminate unnecessary detail in the generated images. We expect these improvements to raise the artistic quality of the generated images and strengthen their resemblance to traditional Chinese ink wash painting. |
Description | Master's thesis, National Chengchi University, Department of Computer Science, 110753213 |
Source | http://thesis.lib.nccu.edu.tw/record/#G0110753213 |
Type | thesis |
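The abstract names TV Loss (Total Variation Loss) as the term that suppresses noise and unnecessary detail. The thesis's exact formulation is not given in this record; the following is a minimal illustrative sketch of the common anisotropic (squared-difference) variant, not the author's implementation:

```python
import numpy as np

def tv_loss(img: np.ndarray) -> float:
    """Anisotropic total variation: sum of squared differences between
    vertically and horizontally adjacent pixels. Large values indicate
    noisy, high-frequency detail; minimizing this smooths the image."""
    dh = img[1:, :] - img[:-1, :]   # vertical neighbour differences
    dw = img[:, 1:] - img[:, :-1]   # horizontal neighbour differences
    return float(np.sum(dh ** 2) + np.sum(dw ** 2))
```

A flat image scores zero, while any abrupt pixel-to-pixel change raises the loss, which is why TV terms are commonly used to discourage speckle artifacts in GAN outputs.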
dc.contributor.advisor | 紀明德 | zh_TW |
dc.contributor.advisor | Chi, Ming-Te | en_US
dc.contributor.author (Author) | 詹彬 | zh_TW
dc.contributor.author (Author) | Zhan, Bin | en_US
dc.creator (Author) | 詹彬 | zh_TW
dc.creator (Author) | Zhan, Bin | en_US
dc.date (Date) | 2024 | en_US
dc.date.accessioned | 1-Mar-2024 13:42:52 (UTC+8) | -
dc.date.available | 1-Mar-2024 13:42:52 (UTC+8) | -
dc.date.issued (Date Uploaded) | 1-Mar-2024 13:42:52 (UTC+8) | -
dc.identifier (Other Identifier) | G0110753213 | en_US
dc.identifier.uri (URI) | https://nccur.lib.nccu.edu.tw/handle/140.119/150174 | - |
dc.description (Description) | Master's | zh_TW
dc.description (Description) | National Chengchi University | zh_TW
dc.description (Description) | Department of Computer Science | zh_TW
dc.description (Description) | 110753213 | zh_TW
dc.description.abstract (Abstract) | This research aims to improve the performance of Generative Adversarial Networks (GANs) in Chinese ink wash painting generation. We introduce the concept of depth values and design three loss functions (Recognize Loss, Geometric Loss, and TV Loss) tailored to the requirements of ink wash painting generation. The depth values are intended to simulate the ink density and tonal depth of ink wash painting. Beyond improving the recognizability of the generated images, these loss functions help capture the distinctive characteristics of ink wash painting, such as fluid brushstrokes, natural lines, and rich texture. By preserving geometric shapes and structural consistency, the generated paintings keep the overall layout and structure of the source images, enhancing their artistic quality and realism. In addition, a loss that suppresses noise and fine detail helps eliminate unnecessary detail in the generated images. We expect these improvements to raise the artistic quality of the generated images and strengthen their resemblance to traditional Chinese ink wash painting. | zh_TW
dc.description.abstract (Abstract) | This research aims to enhance the performance of Generative Adversarial Networks (GANs) in the field of Chinese ink painting generation. To achieve this goal, we introduce the concept of depth values and design three types of loss functions tailored to the requirements of ink painting generation. The introduction of depth values is primarily aimed at simulating the ink density and depth in Chinese ink paintings. Through the application of these new loss functions, we aim to increase the artistic quality of the generated images while enhancing their similarity to traditional Chinese ink paintings. | en_US
dc.description.tableofcontents (Table of Contents) |
Chapter 1 Introduction 1
  1.1 Research Motivation and Objectives 1
  1.2 Problem Setting 2
Chapter 2 Related Work 4
  2.1 Texture Synthesis 4
  2.2 Neural Style Transfer 5
  2.3 Generative Adversarial Networks 7
  2.4 Depth Map 7
Chapter 3 Datasets 10
  3.1 ChipPhi 10
  3.2 Chinese Style Transfer 11
  3.3 Chinese Landscape Painting Dataset 11
Chapter 4 Methodology 14
  4.1 Data Preprocessing 14
  4.2 Improved Method 15
    4.2.1 Extracting Feature Maps 17
    4.2.2 Generating Geometric Information from Feature Maps 17
    4.2.3 Extracting Feature Values 18
  4.3 Loss Functions 19
    4.3.1 Recognize Loss 19
    4.3.2 Geometric Loss 20
    4.3.3 TV Loss (Total Variation Loss) 21
    4.3.4 Full Objective 22
Chapter 5 Experiments 23
  5.1 Comprehensive Studies 24
    5.1.1 Studies across Models 24
    5.1.2 Studies across Datasets 25
    5.1.3 Ablation Study 27
  5.2 User Study (2AFC) 28
Chapter 6 Conclusion and Future Work 32
  6.1 Conclusion 32
  6.2 Limitations 33
  6.3 Future Work 33
Appendix A 35
References 43
| zh_TW
dc.format.extent | 36722800 bytes | - |
dc.format.mimetype | application/pdf | - |
dc.source.uri (Source) | http://thesis.lib.nccu.edu.tw/record/#G0110753213 | en_US
dc.subject (Keyword) | Ink Wash Painting | zh_TW
dc.subject (Keyword) | Landscape Painting | zh_TW
dc.subject (Keyword) | Style Transfer | zh_TW
dc.subject (Keyword) | Generative Adversarial Networks | zh_TW
dc.subject (Keyword) | Multimodal | zh_TW
dc.subject (Keyword) | Chinese Ink Wash Painting | en_US
dc.subject (Keyword) | Landscape Painting | en_US
dc.subject (Keyword) | Style Transfer | en_US
dc.subject (Keyword) | Generative Adversarial Networks | en_US
dc.subject (Keyword) | Multimodal | en_US
dc.title (Title) | 以山水畫為主的水墨風格化生成對抗網路 | zh_TW
dc.title (Title) | Generative Adversarial Network for Landscape Ink Wash Painting Style | en_US
dc.type (Type) | thesis | en_US
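The abstract states that depth values are introduced to simulate ink density and tonal depth. The thesis's actual mechanism is not described in this record; purely as an illustration (the function name, normalization, and gamma curve are my own assumptions), a monocular depth map could be mapped to an ink-tone weight like this:

```python
import numpy as np

def depth_to_ink(depth: np.ndarray, gamma: float = 1.5) -> np.ndarray:
    """Map a depth map (larger = farther) to an ink-tone weight in [0, 1]:
    near regions get dark, dense ink; far regions fade toward blank paper,
    echoing atmospheric perspective in landscape ink wash painting."""
    span = np.ptp(depth)  # peak-to-peak range of the depth values
    d = (depth - depth.min()) / span if span > 0 else np.zeros_like(depth)
    return (1.0 - d) ** gamma  # invert; gamma shapes the tonal falloff
```

Such a weight map could then scale pixel intensities or a per-region style loss, so that distant mountains render in lighter washes than foreground rocks.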