Please use this identifier to cite or link to this item: https://ah.lib.nccu.edu.tw/handle/140.119/128110
DC Field  Value  Language
dc.contributor.advisor  郁方  zh_TW
dc.contributor.author  翁健豪  zh_TW
dc.contributor.author  Weng, Chien-Hao  en_US
dc.creator  翁健豪  zh_TW
dc.creator  Weng, Chien-Hao  en_US
dc.date  2019  en_US
dc.date.accessioned  2020-01-03T07:53:39Z  -
dc.date.available  2020-01-03T07:53:39Z  -
dc.date.issued  2020-01-03T07:53:39Z  -
dc.identifier  G0106356034  en_US
dc.identifier.uri  http://nccur.lib.nccu.edu.tw/handle/140.119/128110  -
dc.description  碩士  zh_TW
dc.description  國立政治大學  zh_TW
dc.description  資訊管理學系  zh_TW
dc.description  106356034  zh_TW
dc.description.abstract  本文提出了一種新的圖像到圖像轉換方法,CycleCoopNet。圖像到圖像的轉換是一種將圖片從一種樣式轉換為另一種樣式的方法,透過該方法,我們可以創造不存在的新穎圖片。CycleCoopNet採用CoopNet框架,具有兩個主要模型,稱為generator和descriptor。generator生成圖片,該圖片由descriptor通過MCMC(Markov Chain Monte Carlo)採樣進行修正,因此generator可以從descriptor指導的監督式學習中學習。另一方面,descriptor透過modified contrastive divergence從真實數據中學習,使得descriptor被調整為對修正後的數據和真實數據輸出相同的結果。

先前已有一些論文實作圖像到圖像的轉換方法。CycleGAN是與我們的工作類似的著名作品之一,它使用GAN(生成對抗網路)的概念來實現此方法,並展示了良好的圖像到圖像轉換效能。但是,CycleGAN通過非監督式學習生成圖片,也就是說,generator的結果在學習過程中沒有標準答案。CycleGAN僅使用discriminator來判斷結果正確與否。每個結果只需要通過discriminator的測試,這使得generator只需要找到通過測試的方法,而無需嘗試找到正確的答案或更多可能的答案。這個問題稱為mode collapse,它導致結果的變異性較小,也就是說,generator始終生成相同的圖片以獲得更好的分數。

在我們的實驗中,我們使用edges2handbags數據集來觀察圖片如何從草圖轉換為手提包。我們發現我們的模型可以產生更多樣的結果,並且這些結果可以被另一個相反方向的generator穩定地還原為原始圖片。在另一個實驗中,我們使用vangogh2photo數據集來觀察圖片如何從照片轉換為梵谷風格的圖片,並展示我們的模型可以產生更好的變化。

我們的目標是透過將discriminator改為descriptor來升級該網路。descriptor模型改編自CoopNet(合作神經網路)。這個想法是改變discriminator(descriptor)卷積網路的輸出維度。使用descriptor可以讓我們的generator以標記答案調整其模型參數,並將此問題轉換為監督式學習問題。另外,使用descriptor可以防止mode collapse,避免generator始終生成相似的結果。  zh_TW
dc.description.abstract  This paper proposes a new image-to-image translation method, CycleCoopNet. Image-to-image translation changes pictures from one style to another, allowing us to create novel pictures that do not exist. CycleCoopNet adopts the CoopNet framework with two main models, called the generator and the descriptor. The generator produces pictures that the descriptor revises with MCMC (Markov chain Monte Carlo) sampling, so the generator learns in a supervised fashion guided by the descriptor. The descriptor, in turn, learns from real data by modified contrastive divergence, so that it is adjusted to output the same result for the revised data and the real data.

Several previous works address image-to-image translation. CycleGAN, one of the best-known works similar to ours, uses the concept of a GAN (generative adversarial network) and demonstrates good translation performance. However, CycleGAN generates pictures by unsupervised learning; that is, the generator's results have no reference answer during training. CycleGAN only uses a discriminator to decide whether a result is correct or incorrect. Since every result only needs to pass the discriminator's test, the generator merely has to find a way to pass that test rather than find the correct answer or explore more possible answers. This problem, called mode collapse, reduces the variability of the results: the generator keeps producing the same picture to cheat the discriminator into a better score.

In our experiments, we use the edges2handbags dataset to observe how pictures change from sketches to handbags. We find that our model generates more diverse results, and that these results can be stably recovered to the original picture by the generator trained in the opposite direction. In another experiment, we use the vangogh2photo dataset to observe how pictures change from photos to Van Gogh-style pictures, and show that our model produces better variety.

Our goal is to upgrade this network by replacing the discriminator with a descriptor adapted from CoopNet (cooperative neural network). The idea is to change the output dimension of the discriminator's (descriptor's) convolutional network. The descriptor gives our generator a labeled answer for adjusting its model parameters, which turns the task into a supervised learning problem. It also prevents mode collapse, so the generator does not keep producing similar patterns.  en_US
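The generator/descriptor loop summarized in the abstract can be sketched in miniature. The one-dimensional quadratic energy, the step sizes, and the update rules below are illustrative assumptions for exposition only, not the thesis's actual networks:

```python
import random

# Minimal 1-D sketch of the cooperative learning loop: the "descriptor"
# is a toy quadratic energy with mode `mu`, and the "generator" is a
# single scalar output. All names and constants are illustrative.

def langevin_revision(x, mu, steps=20, step_size=0.1):
    """Descriptor-side MCMC revision: noisy gradient descent on the
    toy energy 0.5 * (x - mu) ** 2 (Langevin dynamics)."""
    for _ in range(steps):
        grad = x - mu                                  # d(energy)/dx
        x = x - step_size * grad + 0.01 * random.gauss(0.0, 1.0)
    return x

def train_step(gen_out, real, mu, lr=0.5):
    """One cooperative step:
    1. the descriptor revises the generator's output toward low energy;
    2. the generator is pulled toward the revised sample (supervised);
    3. the descriptor's mode shifts so that revised and real data agree
       (a crude stand-in for modified contrastive divergence)."""
    revised = langevin_revision(gen_out, mu)
    gen_out = gen_out + lr * (revised - gen_out)       # generator update
    mu = mu + lr * (real - revised)                    # descriptor update
    return gen_out, mu
```

Iterating `train_step` pulls both the descriptor's mode and the generator's output toward the real data, mirroring how the descriptor teaches the generator through MCMC revision rather than a pass/fail discriminator score.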
dc.description.tableofcontents
1 Introduction 1
1.1 Motivations and contributions 1
1.2 Change and difference of models 2
1.3 Cooperative Learning Networks 5
2 Related Work 6
2.1 GAN and pix2pix 6
2.2 CycleGAN 8
2.3 Cooperative Learning Model 9
3 Model structure 9
3.1 Overview of the model 10
3.2 Generator part of CycleCoopNet 10
3.2.1 Batch normalization 10
3.2.2 Activation function 11
3.2.3 Dropout 13
3.2.4 Skip connection and layer concatenation 13
3.2.5 Details of generator 14
3.3 Descriptor part of CycleCoopNet 15
3.3.1 Batch normalization 17
3.3.2 Activation function 17
3.3.3 Fully-connected layers 17
3.3.4 Details of descriptor 17
4 Algorithm 19
4.1 Update descriptor networks 19
4.2 Update generator networks 20
4.3 CycleCoopNet algorithm 21
4.3.1 Step 0: randomly choose two pictures from different domains 23
4.3.2 Step G1: use the generators to generate B- and A-domain pictures from A and B 23
4.3.3 Step D1: use Langevin revision to revise the pictures generated in steps 1 and 4 23
4.3.4 Step R1: use the opposite generators to revert the B- and A-domain pictures generated in steps 1 and 4 back to domains A and B 23
4.3.5 Step G2: update the generators from the generator loss 24
4.3.6 Step D2: update the descriptors from the descriptor loss 24
4.3.7 Step R2: update the generators from the cycle-consistency loss 25
4.4 Theoretical understanding 25
4.5 Calculate similarity 28
5 Experiments 30
5.1 Experiment 1: Generating bag texture patterns 30
5.2 Experiment 2: Generating VanGogh-style pictures 39
5.3 Experiment 3: Comparison with different descriptor output dimensions 41
6 Conclusion 46
7 Project page 47
8 Sample results 47
9 References 52
zh_TW
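The step sequence listed in Section 4.3 (Step 0 through Step R2) can be laid out as a runnable skeleton. The offset "generators" and the identity-stubbed revision below are illustrative stand-ins, not the thesis's networks:

```python
# Skeleton of one training pass over the steps G1, D1, R1 listed above.
# Domains A and B are modeled as scalars; g_ab / g_ba / langevin_revise
# are hypothetical toy stand-ins for the real convolutional models.

def g_ab(a, shift):
    """Step G1 direction: toy generator mapping domain A to B."""
    return a + shift

def g_ba(b, shift):
    """Step R1 direction: toy generator mapping domain B to A."""
    return b - shift

def langevin_revise(x):
    """Step D1: the descriptor's MCMC revision, stubbed as identity."""
    return x

def cycle_step(a, b, shift):
    """Steps G1, D1, R1 for one pair of pictures; the returned
    cycle-consistency loss drives the updates in steps G2, D2, R2."""
    fake_b = langevin_revise(g_ab(a, shift))   # G1 then D1: A -> B
    fake_a = langevin_revise(g_ba(b, shift))   # G1 then D1: B -> A
    rec_a = g_ba(fake_b, shift)                # R1: recover A from fake B
    rec_b = g_ab(fake_a, shift)                # R1: recover B from fake A
    return abs(rec_a - a) + abs(rec_b - b)     # cycle-consistency loss
```

With generators that are exact inverses of each other, the cycle-consistency loss is zero; during training, the loss measures how far the opposite generator is from undoing the first, which is what Step R2 minimizes.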
dc.format.extent  7947794 bytes  -
dc.format.mimetype  application/pdf  -
dc.source.uri  http://thesis.lib.nccu.edu.tw/record/#G0106356034  en_US
dc.subject  生成式合作網路  zh_TW
dc.subject  圖像轉換  zh_TW
dc.subject  深度學習  zh_TW
dc.subject  神經網路  zh_TW
dc.subject  Cooperative learning networks  en_US
dc.subject  Image-to-Image Translation  en_US
dc.subject  deep learning  en_US
dc.subject  neural network  en_US
dc.title  CycleCoopNet: 基於合作學習的神經網路進行圖片轉換  zh_TW
dc.title  CycleCoopNet: Image-to-Image Translation with Cooperative Learning Networks  en_US
dc.type  thesis  en_US
dc.relation.reference
[1] J. Xie, Y. Lu, R. Gao, and Y. N. Wu, “Cooperative learning of energy-based model and latent variable model via MCMC teaching,” in Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
[2] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, “Unpaired image-to-image translation using cycle-consistent adversarial networks,” in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 2223–2232.
[3] J. Hui, “GAN — Why it is so hard to train Generative Adversarial Networks!” https://medium.com/@jonathan_hui/gan-why-it-is-so-hard-to-train-generative-advisory-networks-819a86b3750b, 2018.
[4] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in Neural Information Processing Systems, 2014, pp. 2672–2680.
[5] J. Xie, Y. Lu, S.-C. Zhu, and Y. Wu, “A theory of generative ConvNet,” in International Conference on Machine Learning, 2016, pp. 2635–2644.
[6] Y. Lu, S.-C. Zhu, and Y. N. Wu, “Learning FRAME models using CNN filters,” arXiv preprint arXiv:1509.08379, 2015.
[7] T. Han, Y. Lu, S.-C. Zhu, and Y. N. Wu, “Alternating back-propagation for generator network,” in Thirty-First AAAI Conference on Artificial Intelligence, 2017.
[8] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, “Image-to-image translation with conditional adversarial networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1125–1134.
[9] R. Zhang, P. Isola, and A. A. Efros, “Colorful image colorization,” in European Conference on Computer Vision. Springer, 2016, pp. 649–666.
[10] G. Larsson, M. Maire, and G. Shakhnarovich, “Learning representations for automatic colorization,” in European Conference on Computer Vision. Springer, 2016, pp. 577–593.
[11] S. Iizuka, E. Simo-Serra, and H. Ishikawa, “Let there be color!: Joint end-to-end learning of global and local image priors for automatic image colorization with simultaneous classification,” ACM Transactions on Graphics (TOG), vol. 35, no. 4, p. 110, 2016.
[12] A. Hertzmann, C. E. Jacobs, N. Oliver, B. Curless, and D. H. Salesin, “Image analogies,” in Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques. ACM, 2001, pp. 327–340.
[13] A. A. Efros and T. K. Leung, “Texture synthesis by non-parametric sampling,” in Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2. IEEE, 1999, pp. 1033–1038.
[14] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 3431–3440.
[15] M. Mirza and S. Osindero, “Conditional generative adversarial nets,” arXiv preprint arXiv:1411.1784, 2014.
[16] P. Sangkloy, J. Lu, C. Fang, F. Yu, and J. Hays, “Scribbler: Controlling deep image synthesis with sketch and color,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 5400–5409.
[17] A. Radford, L. Metz, and S. Chintala, “Unsupervised representation learning with deep convolutional generative adversarial networks,” arXiv preprint arXiv:1511.06434, 2015.
[18] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” arXiv preprint arXiv:1502.03167, 2015.
[19] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2015, pp. 234–241.
[20] G. E. Hinton, “Training products of experts by minimizing contrastive divergence,” Neural Computation, vol. 14, no. 8, pp. 1771–1800, 2002.
[21] R. Salakhutdinov and G. Hinton, “Deep Boltzmann machines,” in Artificial Intelligence and Statistics, 2009, pp. 448–455.
[22] G. E. Hinton, S. Osindero, and Y.-W. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.
[23] T. Kim and Y. Bengio, “Deep directed generative models with energy-based probability estimation,” arXiv preprint arXiv:1606.03439, 2016.
[24] G. Hinton, O. Vinyals, and J. Dean, “Distilling the knowledge in a neural network,” arXiv preprint arXiv:1503.02531, 2015.
[25] Y. Lu, S.-C. Zhu, and Y. N. Wu, “Learning FRAME models using CNN filters,” arXiv preprint arXiv:1509.08379, 2015.
[26] A. Dosovitskiy, J. Tobias Springenberg, and T. Brox, “Learning to generate chairs with convolutional neural networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1538–1546.
[27] A. Adam, E. Rivlin, and I. Shimshoni, “Robust fragments-based tracking using the integral histogram,” in 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), vol. 1. IEEE, 2006, pp. 798–805.
[28] Y. Ma, X. Gu, and Y. Wang, “Histogram similarity measure using variable bin size distance,” Computer Vision and Image Understanding, vol. 114, no. 8, pp. 981–989, 2010.
[29] L.-M. Po and K.-M. Wong, “A new palette histogram similarity measure for MPEG-7 dominant color descriptor,” in 2004 International Conference on Image Processing (ICIP’04), vol. 3. IEEE, 2004, pp. 1533–1536.
[30] N. Krawetz, “Kind of like that,” The Hacker Factor Blog, 2013.
[31] ——, “Looks like it,” The Hacker Factor Blog, 2011.
[32] C. Zauner, “Implementation and benchmarking of perceptual image hash functions,” 2010.
[33] K. R. Rao and P. Yip, Discrete Cosine Transform: Algorithms, Advantages, Applications. Academic Press, 2014.
[34] C.-H. Weng, “GitHub of our work, CycleCoopNet,” https://github.com/howarder3/CycleCoopNet, 2019.
zh_TW
dc.identifier.doi  10.6814/NCCU201901290  en_US
item.cerifentitytype  Publications  -
item.openairecristype  http://purl.org/coar/resource_type/c_46ec  -
item.grantfulltext  open  -
item.openairetype  thesis  -
item.fulltext  With Fulltext  -
Appears in Collections: 學位論文 (Theses)
Files in This Item:
File  Size  Format
603401.pdf  7.76 MB  Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.