基於深度殘差神經網路的流形嵌入與霍奇排名的連續性之探討

林澤佑; Lin, Tse-Yu

Please use this identifier to cite or link to this item: https://ah.lib.nccu.edu.tw/handle/140.119/134283

題名:	基於深度殘差神經網路的流形嵌入與霍奇排名的連續性之探討 Manifold embedding with deep ResNet and a study of the continuity of HodgeRank
作者:	林澤佑 Lin, Tse-Yu
貢獻者:	蔡炎龍<br>劉宣谷 Tsai, Yen-Lung<br>Liu, Hsuan-Ku 林澤佑 Lin, Tse-Yu
關鍵詞:	深層神經網路深度殘差網路移動邊界問題流形重建霍奇排名組合霍奇理論拉普拉斯矩陣同儕互評 Deep neural network Deep residual network Moving boundary problem Manifold reconstruction HodgeRank Combinatorial Hodge theory Graph Laplacian Peer assessment
日期:	2020
上傳時間:	3-三月-2021
摘要:	仿造人腦的功能性，深層神經網路被建立用於萃取高階訊息。從數學的觀點來看，神經網路可以視為在適當定義遇上近似任意函數的近似器。\n為了展示深層神經網路的威力，我們在本論文的第一部分考慮神經網路的兩種不同形式的應用。第一種應用是源於衍生性金融商品的定價模型，而另一種應用則是將基於仿射空間的流形重建演算法改寫為一殘差神經網路的學習過程，而這樣的改寫提供了深層神經網路在幾何演算法上的潛在應用。\n本論文的第二個部分，我們聚焦在 HodgeRank，一個基於組合霍奇理論的逐對排名演算法。我們首先會回顧組合貨奇理論的背景知識，接著，我們考慮 HodgeRank 在線上同儕互評上之應用。最後，將 HodgeRank 視為 Moore-Penrose 廣義逆算子與矩陣-向量乘法的合成函數，我們可以探討 HodgeRank 的連續性。最後，我們從圖的角度證明了關於 HodgeRank 的一個連續性定理。 Deep neural networks are modeled to extract higher-level information in a way that is like the function of the human brain. From a mathematical perspective, neural networks are function approximators, which can approximate any function on a suitable domain.\n\nIn the first part of this dissertation, we consider two different tasks to demonstrate the power of deep neural networks. One task is derived from a option pricing model of financial derivatives while another task is to rewrite an affine subspaces based manifold reconstruction algorithm to a learning process of a deep residual network. Such reformulation offers a possibility for potential application of deep neural networks to various geometrical related algorithms.\n\nIn the second part, we focus on the HodgeRank, a pairwise ranking method based on the combinatorial Hodge theory. We first quick review the background of combinatorial Hodge theory, then a real world application of HodgeRank to online peer assessment is provided. Finally, by considering HodgeRank as a composition of Moore-Penrose generalized inverse and matrix-vector product, we can study the continuity of HodgeRank. In terms of graph, a theorem of continuity of the HodgeRank is provided in the end.
參考文獻:	[1] D. BAHDANAU, K. H. CHO, AND Y. BENGIO, Neural machine translation by jointly learning to align and translate, in 3rd International Conference on Learning\nRepresentations, ICLR 2015, 2015.\n[2] S. BELCIUG AND F. GORUNESCU, Learning a single-hidden layer feed forward neural network using a rank correlation-based strategy with application to high dimensional gene expression and proteomic spectra datasets in cancer detection, Journal of biomedical informatics, 83 (2018), pp. 159–166.\n[3] Y. BENGIO, Learning deep architectures for AI, Now Publishers Inc, 2009.\n[4] J.-D. BOISSONNAT AND A. GHOSH, Manifold reconstruction using tangential delaunay complexes, Discrete & Computational Geometry, 51 (2014), pp. 221–267.\n[5] J.-D. BOISSONNAT, L. J. GUIBAS, AND S. Y. OUDOT, Manifold reconstruction in arbitrary dimensions using witness complexes, Discrete & Computational Geometry, 42 (2009), pp. 37–70.\n[6] C. BREGLER AND S. M. OMOHUNDRO, Nonlinear manifold learning for visual speech recognition, in Proceedings of IEEE International Conference on Computer Vision, IEEE, 1995, pp. 494–499.\n[7] H. BREZIS, Functional analysis, Sobolev spaces and partial differential equations, Springer Science & Business Media, 2010.\n[8] T. B. BROWN, B. MANN, N. RYDER, M. SUBBIAH, J. KAPLAN, P. DHARIWAL, A. NEELAKANTAN, P. SHYAM, G. SASTRY, A. ASKELL, ET AL., Language models are few-shot learners, arXiv preprint arXiv:2005.14165, (2020).\n[9] D. BURAGO, I. D. BURAGO, Y. BURAGO, S. IVANOV, AND S. A. IVANOV, A course in metric geometry, vol. 33, American Mathematical Soc., 2001.\n[10] J. CAO AND Z. LIN, Extreme learning machines on high dimensional and large data applications: a survey, Mathematical Problems in Engineering, (2015).\n[11] I. CARAGIANNIS, G. A. KRIMPAS, AND A. A. VOUDOURIS, Aggregating partial rankings with applications to peer grading in massive online open courses, in Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 2015, pp. 675–683.\n[12] M. P. D. CARMO, Riemannian geometry, Birkhäuser, 1992.\n[13] T. Q. CHEN, Y. RUBANOVA, J. BETTENCOURT, AND D. K. DUVENAUD, Neural ordinary differential equations, in Advances in Neural Information Processing Systems, 2018, pp. 6572–6583.\n[14] S.-W. CHENG, T. K. DEY, AND E. A. RAMOS, Manifold reconstruction from point samples., in SODA, vol. 5, 2005, pp. 1018–1027.\n[15] K. CHO, B. VAN MERRIENBOER, Ç. GULYEHRE, D. BAHDANAU, F. BOUGARES, H. SCHWENK, AND Y. BENGIO, Learning phrase representations using rnn encoder-decoder for statistical machine translation, in EMNLP, 2014.\n[16] F. R. CHUNG AND F. C. GRAHAM, Spectral graph theory, no. 92, American Mathematical Soc., 1997.\n[17] J. C. COX, J. E. INGERSOLL JR, AND S. A. ROSS, A theory of the term structure of interest rates, in Theory of valuation, World Scientific, 2005, pp. 129–164.\n[18] M. A. COX AND T. F. COX, Multidimensional scaling, in Handbook of data visualization, Springer, 2008, pp. 315–347.\n[19] G. CRAWFORD, The geometric mean procedure for estimating the scale of a judgement matrix, Mathematical Modelling, 9 (1987), pp. 327–334.\n[20] G. CYBENKO, Approximation by superpositions of a sigmoidal function, Mathematics of control, signals and systems, 2 (1989), pp. 303–314.\n[21] L. DENG, G. HINTON, AND B. KINGSBURY, New types of deep neural network learning for speech recognition and related applications: An overview, in 2013 IEEE international conference on acoustics, speech and signal processing, IEEE, 2013, pp. 8599–8603.\n[22] J. DETEMPLE AND Y. KITAPBAYEV, On American VIX options under the generalized 3/2 and 1/2 models, Mathematical Finance, 28 (2018), pp. 550–581.\n[23] J. DETEMPLE AND C. OSAKWE, The valuation of volatility options, Review of Finance, 4 (2000), pp. 21–50.\n[24] F. DEUTSCH, The angle between subspaces of a Hilbert space, in Approximation theory, wavelets and applications, Springer, 1995, pp. 107–130.\n[25] J. DEVLIN, M.-W. CHANG, K. LEE, AND K. TOUTANOVA, BERT: Pre-training of deep\nbidirectional transformers for language understanding, in Proceedings of the 2019\nConference of the North American Chapter of the Association for Computational\nLinguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 2019, pp. 4171–4186.\n[26] M. DRAZIN, Pseudo-inverses in associative rings and semigroups, The American mathematical monthly, 65 (1958), pp. 506–514.\n[27] B. ERIKSSON, Learning to top-k search using pairwise comparisons, in Artificial Intelligence and Statistics, PMLR, 2013, pp. 265–273.\n[28] C. FARABET, C. COUPRIE, L. NAJMAN, AND Y. LECUN, Learning hierarchical features for scene labeling, IEEE transactions on pattern analysis and machine intelligence, 35 (2012), pp. 1915–1929.\n[29] C. FEFFERMAN, S. IVANOV, Y. KURYLEV, M. LASSAS, AND H. NARAYANAN, Fitting a putative manifold to noisy data, in Proceedings of COLT 2018 (Conference On Learning Theory), 2018.\n[30] -, Reconstruction and interpolation of manifolds. I: The geometric Whitney problem, Foundations of Computational Mathematics, (2019), pp. 1–99.\n[31] C. FEFFERMAN, S. IVANOV, M. LASSAS, AND H. NARAYANAN, Reconstruction of a Riemannian manifold from noisy intrinsic distances, SIAM Journal on Mathematics of Data Science, 2 (2020), pp. 770–808.\n[32] W. FELLER, Two singular diffusion problems, Annals of mathematics, (1951), pp. 173–182.\n[33] R. A. FISHER, The use of multiple measurements in taxonomic problems, Annals of eugenics, 7 (1936), pp. 179–188.\n[34] G. B. FOLLAND, Real analysis: modern techniques and their applications, vol. 40, John Wiley & Sons, 1999.\n[35] S. FREEMAN AND J. W. PARKS, How accurate is peer grading?, CBE-Life Sciences Education, 9 (2010), pp. 482–488.\n[36] F. A. GERS, J. SCHMIDHUBER, AND F. CUMMINS, Learning to forget: Continual prediction with LSTM, (1999).\n[37] J. GOARD AND M. MAZUR, Stochastic volatility models and the pricing of VIX options, Mathematical Finance: An International Journal of Mathematics, Statistics and Financial Economics, 23 (2013), pp. 439–458.\n[38] I. GOODFELLOW, Y. BENGIO, A. COURVILLE, AND Y. BENGIO, Deep learning, vol. 1, MIT press Cambridge, 2016.\n[39] A. GRAVES, A.-R. MOHAMED, AND G. HINTON, Speech recognition with deep recurrent neural networks, in 2013 IEEE international conference on acoustics, speech and signal processing, Ieee, 2013, pp. 6645–6649.\n[40] A. GRUNBICHLER AND F. A. LONGSTAFF, Valuing futures and options on volatility, Journal of Banking & Finance, 20 (1996), pp. 985–1001.\n[41] E. HABER AND L. RUTHOTTO, Stable architectures for deep neural networks, Inverse Problems, 34 (2017), p. 014004.\n[42] B. HANIN AND M. SELLKE, Approximating continuous functions by relu nets of minimal width, arXiv preprint arXiv:1710.11278, (2017).\n[43] K. HE, X. ZHANG, S. REN, AND J. SUN, Deep residual learning for image recognition, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778.\n[44] M. HERBSTER, M. PONTIL, AND L. WAINER, Online learning over graphs, in Proceedings of the 22nd international conference on Machine learning, 2005, pp. 305–312.\n[45] S. HOCHREITER AND J. SCHMIDHUBER, Long short-term memory, Neural computation, 9 (1997), pp. 1735–1780.\n[46] L. HOGBEN, Spectral graph theory and the inverse eigenvalue problem of a graph, The Electronic Journal of Linear Algebra, 14 (2005).\n[47] M. HOHENWARTER, Geogebra–ein softwaresystem für dynamische geometrie und algebra der ebene [geogebra–a software system for dynamic geometry and algebra in the plane] (unpublished master’s thesis), Universität Salzburg, Salzburg, Austria, (2002).\n[48] J. HOWARD AND S. RUDER, Universal language model fine-tuning for text classification, in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2018, pp. 328–339.\n[49] G.-B. HUANG, Q.-Y. ZHU, AND C.-K. SIEW, Extreme learning machine: a new learning scheme of feedforward neural networks, in 2004 IEEE international joint conference on neural networks (IEEE Cat. No. 04CH37541), vol. 2, Ieee, 2004, pp. 985–990.\n[50] X. HUO, X. S. NI, AND A. K. SMITH, A survey of manifold-based learning methods, Recent advances in data mining of enterprise data, (2007), pp. 691–745.\n[51] X. JIANG, L.-H. LIM, Y. YAO, AND Y. YE, Statistical ranking and combinatorial Hodge theory, Mathematical Programming, 127 (2011), pp. 203–244.\n[52] D. P. KINGMA AND J. BA, Adam: A method for stochastic optimization, in ICLR (Poster), 2015.\n[53] D. B. KOTLOW, A free boundary problem connected with the optimal stopping problem for diffusion processes, Transactions of the American Mathematical Society, 184 (1973), pp. 457–478.\n[54] A. KRIZHEVSKY, I. SUTSKEVER, AND G. E. HINTON, ImageNet classification with deep convolutional neural networks, in Advances in neural information processing systems, 2012, pp. 1097–1105.\n[55] Y. LECUN, Y. BENGIO, AND G. HINTON, Deep learning, nature, 521 (2015), pp. 436–444.\n[56] Y. LECUN, L. BOTTOU, Y. BENGIO, AND P. HAFFNER, Gradient-based learning applied to document recognition, Proceedings of the IEEE, 86 (1998), pp. 2278–2324.\n[57] D. LEVIN, The approximation power of moving least-squares, Mathematics of computation, 67 (1998), pp. 1517–1531.\n[58] H. LIN AND S. JEGELKA, Resnet with one-neuron hidden layers is a universal approximator, Advances in neural information processing systems, 31 (2018), pp. 6169–6178.\n[59] T. LIN AND H. ZHA, Riemannian manifold learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, 30 (2008), pp. 796–809.\n[60] T.-Y. LIN AND Y.-L. TSAI, An application of hodgerank to online peer assessment, in Proceedings of the 2017 SIAM Workshop on Machine Learning for Recommender Systems (MLRec), 2018.\n[61] Y. LIPMAN, D. COHEN-OR, D. LEVIN, AND H. TAL-EZER, Parameterization-free projection for geometry reconstruction, ACM Transactions on Graphics (TOG), 26 (2007), pp. 22–es.\n[62] H.-K. LIU, Properties of American volatility options in the mean-reverting 3/2 volatility model, SIAM Journal on Financial Mathematics, 6 (2015), pp. 53–65.\n[63] H.-K. LIU, T.-Y. LIN, AND Y.-L. TSAI, On the pricing formula for the perpetual American volatility option under the mean-reverting processes, Taiwanese Journal of Mathematics, (2020).\n[64] T.-Y. LIU, Learning to rank for information retrieval, (2011).\n[65] Y. LU, A. ZHONG, Q. LI, AND B. DONG, Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations, in International Conference on Machine Learning, PMLR, 2018, pp. 3276–3285.\n[66] Z. LU, H. PU, F. WANG, Z. HU, AND L. WANG, The expressive power of neural networks: A view from the width, Advances in neural information processing systems, 30 (2017), pp. 6231–6239.\n[67] A. L. MAAS, A. Y. HANNUN, AND A. Y. NG, Rectifier nonlinearities improve neural\nnetwork acoustic models, in ICML Workshop on Deep Learning for Audio, Speech\nand Language Processing, 2013.\n[68] G. A. MILLER, The magical number seven, plus or minus two: Some limits on our capacity for processing information., Psychological review, 63 (1956), p. 81.\n[69] J. W. Milnor and j. D. Stasheff, Characteristic Classes, no. 76, Princeton University Press, 1974.\n[70] V. MNIH, K. KAVUKCUOGLU, D. SILVER, A. GRAVES, I. ANTONOGLOU, D. WIERSTRA, AND M. RIEDMILLER, Playing Atari with deep reinforcement learning, (2013). cite arxiv:1312.5602Comment: NIPS Deep Learning Workshop 2013.\n[71] Y. NESTEROV, A method for unconstrained convex minimization problem with the rate of convergence o (1/kˆ 2), in Doklady an ussr, vol. 269, 1983, pp. 543–547.\n[72] B. NEYSHABUR, S. BHOJANAPALLI, D. MCALLESTER, AND N. SREBRO, Exploring generalization in deep learning, in Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017, pp. 5949–5958.\n[73] P. NIYOGI, S. SMALE, AND S. WEINBERGER, Finding the homology of submanifolds with high confidence from random samples, Discrete & Computational Geometry, 39 (2008), pp. 419–441.\n[74] Y. PARMAR, S. NATARAJAN, AND G. SOBHA, Deeprange: deep-learning-based object detection and ranging in autonomous driving, IET Intelligent Transport Systems, 13 (2019), pp. 1256–1264.\n[75] M. PETERS, M. NEUMANN, M. IYYER, M. GARDNER, C. CLARK, K. LEE, AND L. ZETTLEMOYER, Deep contextualized word representations, in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2018, pp. 2227–2237.\n[76] G. PHILIPP, D. SONG, AND J. G. CARBONELL, Gradients explode-deep networks are shallow-resnet explained, (2018).\n[77] R. PIZIAK AND P. ODELL, Affine projections, Computers & Mathematics with Applications, 48 (2004), pp. 177–190.\n[78] R. PLESS AND R. SOUVENIR, A survey of manifold learning for images, IPSJ Transactions on Computer Vision and Applications, 1 (2009), pp. 83–94.\n[79] L. RUTHOTTO AND E. HABER, Deep neural networks motivated by partial differential equations, Journal of Mathematical Imaging and Vision, (2019), pp. 1–13.\n[80] R. W. SAATY, The analytic hierarchy process-what it is and how it is used, Mathematical modelling, 9 (1987), pp. 161–176.\n[81] M. SAERENS, F. FOUSS, L. YEN, AND P. DUPONT, The principal components analysis of a graph, and its relationships to spectral clustering, in European conference on machine learning, Springer, 2004, pp. 371–383.\n[82] F. SCARSELLI, M. GORI, A. C. TSOI, M. HAGENBUCHNER, AND G. MONFARDINI, The graph neural network model, IEEE Transactions on Neural Networks, 20 (2008), pp. 61–80.\n[83] R. K. SIZEMORE, HodgeRank: Applying combinatorial Hodge theory to sports ranking, PhD thesis, Wake Forest University, 2013.\n[84] B. SOBER AND D. LEVIN, Manifold approximation by moving least-squares projection (MMLS), Constructive Approximation, (2019), pp. 1–46.\n[85] G. STEWART, On the continuity of the generalized inverse, SIAM Journal on Applied Mathematics, 17 (1969), pp. 33–45.\n[86] Y. TAY, M. DEHGHANI, D. BAHRI, AND D. METZLER, Efficient transformers: A survey, arXiv preprint arXiv:2009.06732, (2020).\n[87] N. M. TRAN, Pairwise ranking: choice of method can produce arbitrarily different rank order, Linear Algebra and its Applications, 438 (2013), pp. 1012–1024.\n[88] G. VAN ROSSUM AND F. L. DRAKE JR, Python tutorial, vol. 620, Centrum voor Wiskunde en Informatica Amsterdam, 1995.\n[89] A. VASWANI, N. SHAZEER, N. PARMAR, J. USZKOREIT, L. JONES, A. N. GOMEZ, L. KAISER, AND I. POLOSUKHIN, Attention is all you need, Advances in neural information processing systems, 30 (2017), pp. 5998–6008.\n[90] A. VEIT, M. J. WILBER, AND S. BELONGIE, Residual networks behave like ensembles of relatively shallow networks, Advances in neural information processing systems, 29 (2016), pp. 550–558.\n[91] P. VELICKOVIC, G. CUCURULL, A. CASANOVA, A. ROMERO, P. LIO, AND Y. BENGIO, Graph attention networks, in International Conference on Learning Representations, 2018.\n[92] V. P. VISHWAKARMA AND M. GUPTA, A new learning algorithm for single hidden layer feedforward neural networks, International Journal of Computer Applications, 28 (2011), pp. 26–33.\n[93] T. WALSH, The PeerRank method for peer assessment, in Proceedings of the Twenty-first European Conference on Artificial Intelligence, 2014, pp. 909–914.\n[94] T.-W. WENG, H. ZHANG, P.-Y. CHEN, J. YI, D. SU, Y. GAO, C.-J. HSIEH, AND L. DANIEL, valuating the robustness of neural networks: An extreme value theory approach, in International Conference on Learning Representations, 2018.\n[95] D. B. WEST, Introduction to graph theory, vol. 2, Prentice hall Upper Saddle River, 2001.\n[96] H. WHITNEY, Differentiable manifolds, Annals of Mathematics, (1936), pp. 645–680.\n[97] S. WOLD, K. ESBENSEN, AND P. GELADI, Principal component analysis, Chemometrics and intelligent laboratory systems, 2 (1987), pp. 37–52.\n[98] Z. WU, S. PAN, F. CHEN, G. LONG, C. ZHANG, AND S. Y. PHILIP, A comprehensive survey on graph neural networks, IEEE Transactions on Neural Networks and Learning Systems, (2020).\n[99] S. XIE, R. GIRSHICK, P. DOLLAR, Z. TU, AND K. HE, Aggregated residual transformations for deep neural networks, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 1492–1500.\n[100] K. XU, W. HU, J. LESKOVEC, AND S. JEGELKA, How powerful are graph neural networks?, in International Conference on Learning Representations, 2018.
描述:	博士國立政治大學應用數學系 101751503
資料來源:	http://thesis.lib.nccu.edu.tw/record/#G0101751503
資料類型:	thesis
Appears in Collections:	學位論文

Files in This Item:

File	Description	Size	Format
150301.pdf		903.26 kB	Adobe PDF2	View/Open

Show full item record

Google Scholar^TM

Check

Files in This Item:

Google Scholar^TM

Altmetric

Altmetric

Files in This Item:

Google ScholarTM

Altmetric

Altmetric

Google Scholar^TM