學術產出-Theses
Article View/Open
Publication Export
-
題名 基於深度殘差神經網路的流形嵌入與霍奇排名的連續性之探討
Manifold embedding with deep ResNet and a study of the continuity of HodgeRank作者 林澤佑
Lin, Tse-Yu貢獻者 蔡炎龍<br>劉宣谷
Tsai, Yen-Lung<br>Liu, Hsuan-Ku
林澤佑
Lin, Tse-Yu關鍵詞 深層神經網路
深度殘差網路
移動邊界問題
流形重建
霍奇排名
組合霍奇理論
拉普拉斯矩陣
同儕互評
Deep neural network
Deep residual network
Moving boundary problem
Manifold reconstruction
HodgeRank
Combinatorial Hodge theory
Graph Laplacian
Peer assessment日期 2020 上傳時間 2021-03-03 摘要 仿造人腦的功能性,深層神經網路被建立用於萃取高階訊息。從數學的觀點來看,神經網路可以視為在適當定義遇上近似任意函數的近似器。為了展示深層神經網路的威力,我們在本論文的第一部分考慮神經網路的兩種不同形式的應用。第一種應用是源於衍生性金融商品的定價模型,而另一種應用則是將基於仿射空間的流形重建演算法改寫為一殘差神經網路的學習過程,而這樣的改寫提供了深層神經網路在幾何演算法上的潛在應用。本論文的第二個部分,我們聚焦在 HodgeRank,一個基於組合霍奇理論的逐對排名演算法。我們首先會回顧組合貨奇理論的背景知識,接著,我們考慮 HodgeRank 在線上同儕互評上之應用。最後,將 HodgeRank 視為 Moore-Penrose 廣義逆算子與矩陣-向量乘法的合成函數,我們可以探討 HodgeRank 的連續性。最後,我們從圖的角度證明了關於 HodgeRank 的一個連續性定理。
Deep neural networks are modeled to extract higher-level information in a way that is like the function of the human brain. From a mathematical perspective, neural networks are function approximators, which can approximate any function on a suitable domain.In the first part of this dissertation, we consider two different tasks to demonstrate the power of deep neural networks. One task is derived from a option pricing model of financial derivatives while another task is to rewrite an affine subspaces based manifold reconstruction algorithm to a learning process of a deep residual network. Such reformulation offers a possibility for potential application of deep neural networks to various geometrical related algorithms.In the second part, we focus on the HodgeRank, a pairwise ranking method based on the combinatorial Hodge theory. We first quick review the background of combinatorial Hodge theory, then a real world application of HodgeRank to online peer assessment is provided. Finally, by considering HodgeRank as a composition of Moore-Penrose generalized inverse and matrix-vector product, we can study the continuity of HodgeRank. In terms of graph, a theorem of continuity of the HodgeRank is provided in the end.參考文獻 [1] D. BAHDANAU, K. H. CHO, AND Y. BENGIO, Neural machine translation by jointly learning to align and translate, in 3rd International Conference on LearningRepresentations, ICLR 2015, 2015.[2] S. BELCIUG AND F. GORUNESCU, Learning a single-hidden layer feed forward neural network using a rank correlation-based strategy with application to high dimensional gene expression and proteomic spectra datasets in cancer detection, Journal of biomedical informatics, 83 (2018), pp. 159–166.[3] Y. BENGIO, Learning deep architectures for AI, Now Publishers Inc, 2009.[4] J.-D. BOISSONNAT AND A. GHOSH, Manifold reconstruction using tangential delaunay complexes, Discrete & Computational Geometry, 51 (2014), pp. 221–267.[5] J.-D. BOISSONNAT, L. J. GUIBAS, AND S. Y. OUDOT, Manifold reconstruction in arbitrary dimensions using witness complexes, Discrete & Computational Geometry, 42 (2009), pp. 37–70.[6] C. BREGLER AND S. M. OMOHUNDRO, Nonlinear manifold learning for visual speech recognition, in Proceedings of IEEE International Conference on Computer Vision, IEEE, 1995, pp. 494–499.[7] H. BREZIS, Functional analysis, Sobolev spaces and partial differential equations, Springer Science & Business Media, 2010.[8] T. B. BROWN, B. MANN, N. RYDER, M. SUBBIAH, J. KAPLAN, P. DHARIWAL, A. NEELAKANTAN, P. SHYAM, G. SASTRY, A. ASKELL, ET AL., Language models are few-shot learners, arXiv preprint arXiv:2005.14165, (2020).[9] D. BURAGO, I. D. BURAGO, Y. BURAGO, S. IVANOV, AND S. A. IVANOV, A course in metric geometry, vol. 33, American Mathematical Soc., 2001.[10] J. CAO AND Z. LIN, Extreme learning machines on high dimensional and large data applications: a survey, Mathematical Problems in Engineering, (2015).[11] I. CARAGIANNIS, G. A. KRIMPAS, AND A. A. VOUDOURIS, Aggregating partial rankings with applications to peer grading in massive online open courses, in Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 2015, pp. 675–683.[12] M. P. D. CARMO, Riemannian geometry, Birkhäuser, 1992.[13] T. Q. CHEN, Y. RUBANOVA, J. BETTENCOURT, AND D. K. DUVENAUD, Neural ordinary differential equations, in Advances in Neural Information Processing Systems, 2018, pp. 6572–6583.[14] S.-W. CHENG, T. K. DEY, AND E. A. RAMOS, Manifold reconstruction from point samples., in SODA, vol. 5, 2005, pp. 1018–1027.[15] K. CHO, B. VAN MERRIENBOER, Ç. GULYEHRE, D. BAHDANAU, F. BOUGARES, H. SCHWENK, AND Y. BENGIO, Learning phrase representations using rnn encoder-decoder for statistical machine translation, in EMNLP, 2014.[16] F. R. CHUNG AND F. C. GRAHAM, Spectral graph theory, no. 92, American Mathematical Soc., 1997.[17] J. C. COX, J. E. INGERSOLL JR, AND S. A. ROSS, A theory of the term structure of interest rates, in Theory of valuation, World Scientific, 2005, pp. 129–164.[18] M. A. COX AND T. F. COX, Multidimensional scaling, in Handbook of data visualization, Springer, 2008, pp. 315–347.[19] G. CRAWFORD, The geometric mean procedure for estimating the scale of a judgement matrix, Mathematical Modelling, 9 (1987), pp. 327–334.[20] G. CYBENKO, Approximation by superpositions of a sigmoidal function, Mathematics of control, signals and systems, 2 (1989), pp. 303–314.[21] L. DENG, G. HINTON, AND B. KINGSBURY, New types of deep neural network learning for speech recognition and related applications: An overview, in 2013 IEEE international conference on acoustics, speech and signal processing, IEEE, 2013, pp. 8599–8603.[22] J. DETEMPLE AND Y. KITAPBAYEV, On American VIX options under the generalized 3/2 and 1/2 models, Mathematical Finance, 28 (2018), pp. 550–581.[23] J. DETEMPLE AND C. OSAKWE, The valuation of volatility options, Review of Finance, 4 (2000), pp. 21–50.[24] F. DEUTSCH, The angle between subspaces of a Hilbert space, in Approximation theory, wavelets and applications, Springer, 1995, pp. 107–130.[25] J. DEVLIN, M.-W. CHANG, K. LEE, AND K. TOUTANOVA, BERT: Pre-training of deepbidirectional transformers for language understanding, in Proceedings of the 2019Conference of the North American Chapter of the Association for ComputationalLinguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 2019, pp. 4171–4186.[26] M. DRAZIN, Pseudo-inverses in associative rings and semigroups, The American mathematical monthly, 65 (1958), pp. 506–514.[27] B. ERIKSSON, Learning to top-k search using pairwise comparisons, in Artificial Intelligence and Statistics, PMLR, 2013, pp. 265–273.[28] C. FARABET, C. COUPRIE, L. NAJMAN, AND Y. LECUN, Learning hierarchical features for scene labeling, IEEE transactions on pattern analysis and machine intelligence, 35 (2012), pp. 1915–1929.[29] C. FEFFERMAN, S. IVANOV, Y. KURYLEV, M. LASSAS, AND H. NARAYANAN, Fitting a putative manifold to noisy data, in Proceedings of COLT 2018 (Conference On Learning Theory), 2018.[30] -, Reconstruction and interpolation of manifolds. I: The geometric Whitney problem, Foundations of Computational Mathematics, (2019), pp. 1–99.[31] C. FEFFERMAN, S. IVANOV, M. LASSAS, AND H. NARAYANAN, Reconstruction of a Riemannian manifold from noisy intrinsic distances, SIAM Journal on Mathematics of Data Science, 2 (2020), pp. 770–808.[32] W. FELLER, Two singular diffusion problems, Annals of mathematics, (1951), pp. 173–182.[33] R. A. FISHER, The use of multiple measurements in taxonomic problems, Annals of eugenics, 7 (1936), pp. 179–188.[34] G. B. FOLLAND, Real analysis: modern techniques and their applications, vol. 40, John Wiley & Sons, 1999.[35] S. FREEMAN AND J. W. PARKS, How accurate is peer grading?, CBE-Life Sciences Education, 9 (2010), pp. 482–488.[36] F. A. GERS, J. SCHMIDHUBER, AND F. CUMMINS, Learning to forget: Continual prediction with LSTM, (1999).[37] J. GOARD AND M. MAZUR, Stochastic volatility models and the pricing of VIX options, Mathematical Finance: An International Journal of Mathematics, Statistics and Financial Economics, 23 (2013), pp. 439–458.[38] I. GOODFELLOW, Y. BENGIO, A. COURVILLE, AND Y. BENGIO, Deep learning, vol. 1, MIT press Cambridge, 2016.[39] A. GRAVES, A.-R. MOHAMED, AND G. HINTON, Speech recognition with deep recurrent neural networks, in 2013 IEEE international conference on acoustics, speech and signal processing, Ieee, 2013, pp. 6645–6649.[40] A. GRUNBICHLER AND F. A. LONGSTAFF, Valuing futures and options on volatility, Journal of Banking & Finance, 20 (1996), pp. 985–1001.[41] E. HABER AND L. RUTHOTTO, Stable architectures for deep neural networks, Inverse Problems, 34 (2017), p. 014004.[42] B. HANIN AND M. SELLKE, Approximating continuous functions by relu nets of minimal width, arXiv preprint arXiv:1710.11278, (2017).[43] K. HE, X. ZHANG, S. REN, AND J. SUN, Deep residual learning for image recognition, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778.[44] M. HERBSTER, M. PONTIL, AND L. WAINER, Online learning over graphs, in Proceedings of the 22nd international conference on Machine learning, 2005, pp. 305–312.[45] S. HOCHREITER AND J. SCHMIDHUBER, Long short-term memory, Neural computation, 9 (1997), pp. 1735–1780.[46] L. HOGBEN, Spectral graph theory and the inverse eigenvalue problem of a graph, The Electronic Journal of Linear Algebra, 14 (2005).[47] M. HOHENWARTER, Geogebra–ein softwaresystem für dynamische geometrie und algebra der ebene [geogebra–a software system for dynamic geometry and algebra in the plane] (unpublished master’s thesis), Universität Salzburg, Salzburg, Austria, (2002).[48] J. HOWARD AND S. RUDER, Universal language model fine-tuning for text classification, in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2018, pp. 328–339.[49] G.-B. HUANG, Q.-Y. ZHU, AND C.-K. SIEW, Extreme learning machine: a new learning scheme of feedforward neural networks, in 2004 IEEE international joint conference on neural networks (IEEE Cat. No. 04CH37541), vol. 2, Ieee, 2004, pp. 985–990.[50] X. HUO, X. S. NI, AND A. K. SMITH, A survey of manifold-based learning methods, Recent advances in data mining of enterprise data, (2007), pp. 691–745.[51] X. JIANG, L.-H. LIM, Y. YAO, AND Y. YE, Statistical ranking and combinatorial Hodge theory, Mathematical Programming, 127 (2011), pp. 203–244.[52] D. P. KINGMA AND J. BA, Adam: A method for stochastic optimization, in ICLR (Poster), 2015.[53] D. B. KOTLOW, A free boundary problem connected with the optimal stopping problem for diffusion processes, Transactions of the American Mathematical Society, 184 (1973), pp. 457–478.[54] A. KRIZHEVSKY, I. SUTSKEVER, AND G. E. HINTON, ImageNet classification with deep convolutional neural networks, in Advances in neural information processing systems, 2012, pp. 1097–1105.[55] Y. LECUN, Y. BENGIO, AND G. HINTON, Deep learning, nature, 521 (2015), pp. 436–444.[56] Y. LECUN, L. BOTTOU, Y. BENGIO, AND P. HAFFNER, Gradient-based learning applied to document recognition, Proceedings of the IEEE, 86 (1998), pp. 2278–2324.[57] D. LEVIN, The approximation power of moving least-squares, Mathematics of computation, 67 (1998), pp. 1517–1531.[58] H. LIN AND S. JEGELKA, Resnet with one-neuron hidden layers is a universal approximator, Advances in neural information processing systems, 31 (2018), pp. 6169–6178.[59] T. LIN AND H. ZHA, Riemannian manifold learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, 30 (2008), pp. 796–809.[60] T.-Y. LIN AND Y.-L. TSAI, An application of hodgerank to online peer assessment, in Proceedings of the 2017 SIAM Workshop on Machine Learning for Recommender Systems (MLRec), 2018.[61] Y. LIPMAN, D. COHEN-OR, D. LEVIN, AND H. TAL-EZER, Parameterization-free projection for geometry reconstruction, ACM Transactions on Graphics (TOG), 26 (2007), pp. 22–es.[62] H.-K. LIU, Properties of American volatility options in the mean-reverting 3/2 volatility model, SIAM Journal on Financial Mathematics, 6 (2015), pp. 53–65.[63] H.-K. LIU, T.-Y. LIN, AND Y.-L. TSAI, On the pricing formula for the perpetual American volatility option under the mean-reverting processes, Taiwanese Journal of Mathematics, (2020).[64] T.-Y. LIU, Learning to rank for information retrieval, (2011).[65] Y. LU, A. ZHONG, Q. LI, AND B. DONG, Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations, in International Conference on Machine Learning, PMLR, 2018, pp. 3276–3285.[66] Z. LU, H. PU, F. WANG, Z. HU, AND L. WANG, The expressive power of neural networks: A view from the width, Advances in neural information processing systems, 30 (2017), pp. 6231–6239.[67] A. L. MAAS, A. Y. HANNUN, AND A. Y. NG, Rectifier nonlinearities improve neuralnetwork acoustic models, in ICML Workshop on Deep Learning for Audio, Speechand Language Processing, 2013.[68] G. A. MILLER, The magical number seven, plus or minus two: Some limits on our capacity for processing information., Psychological review, 63 (1956), p. 81.[69] J. W. Milnor and j. D. Stasheff, Characteristic Classes, no. 76, Princeton University Press, 1974.[70] V. MNIH, K. KAVUKCUOGLU, D. SILVER, A. GRAVES, I. ANTONOGLOU, D. WIERSTRA, AND M. RIEDMILLER, Playing Atari with deep reinforcement learning, (2013). cite arxiv:1312.5602Comment: NIPS Deep Learning Workshop 2013.[71] Y. NESTEROV, A method for unconstrained convex minimization problem with the rate of convergence o (1/kˆ 2), in Doklady an ussr, vol. 269, 1983, pp. 543–547.[72] B. NEYSHABUR, S. BHOJANAPALLI, D. MCALLESTER, AND N. SREBRO, Exploring generalization in deep learning, in Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017, pp. 5949–5958.[73] P. NIYOGI, S. SMALE, AND S. WEINBERGER, Finding the homology of submanifolds with high confidence from random samples, Discrete & Computational Geometry, 39 (2008), pp. 419–441.[74] Y. PARMAR, S. NATARAJAN, AND G. SOBHA, Deeprange: deep-learning-based object detection and ranging in autonomous driving, IET Intelligent Transport Systems, 13 (2019), pp. 1256–1264.[75] M. PETERS, M. NEUMANN, M. IYYER, M. GARDNER, C. CLARK, K. LEE, AND L. ZETTLEMOYER, Deep contextualized word representations, in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2018, pp. 2227–2237.[76] G. PHILIPP, D. SONG, AND J. G. CARBONELL, Gradients explode-deep networks are shallow-resnet explained, (2018).[77] R. PIZIAK AND P. ODELL, Affine projections, Computers & Mathematics with Applications, 48 (2004), pp. 177–190.[78] R. PLESS AND R. SOUVENIR, A survey of manifold learning for images, IPSJ Transactions on Computer Vision and Applications, 1 (2009), pp. 83–94.[79] L. RUTHOTTO AND E. HABER, Deep neural networks motivated by partial differential equations, Journal of Mathematical Imaging and Vision, (2019), pp. 1–13.[80] R. W. SAATY, The analytic hierarchy process-what it is and how it is used, Mathematical modelling, 9 (1987), pp. 161–176.[81] M. SAERENS, F. FOUSS, L. YEN, AND P. DUPONT, The principal components analysis of a graph, and its relationships to spectral clustering, in European conference on machine learning, Springer, 2004, pp. 371–383.[82] F. SCARSELLI, M. GORI, A. C. TSOI, M. HAGENBUCHNER, AND G. MONFARDINI, The graph neural network model, IEEE Transactions on Neural Networks, 20 (2008), pp. 61–80.[83] R. K. SIZEMORE, HodgeRank: Applying combinatorial Hodge theory to sports ranking, PhD thesis, Wake Forest University, 2013.[84] B. SOBER AND D. LEVIN, Manifold approximation by moving least-squares projection (MMLS), Constructive Approximation, (2019), pp. 1–46.[85] G. STEWART, On the continuity of the generalized inverse, SIAM Journal on Applied Mathematics, 17 (1969), pp. 33–45.[86] Y. TAY, M. DEHGHANI, D. BAHRI, AND D. METZLER, Efficient transformers: A survey, arXiv preprint arXiv:2009.06732, (2020).[87] N. M. TRAN, Pairwise ranking: choice of method can produce arbitrarily different rank order, Linear Algebra and its Applications, 438 (2013), pp. 1012–1024.[88] G. VAN ROSSUM AND F. L. DRAKE JR, Python tutorial, vol. 620, Centrum voor Wiskunde en Informatica Amsterdam, 1995.[89] A. VASWANI, N. SHAZEER, N. PARMAR, J. USZKOREIT, L. JONES, A. N. GOMEZ, L. KAISER, AND I. POLOSUKHIN, Attention is all you need, Advances in neural information processing systems, 30 (2017), pp. 5998–6008.[90] A. VEIT, M. J. WILBER, AND S. BELONGIE, Residual networks behave like ensembles of relatively shallow networks, Advances in neural information processing systems, 29 (2016), pp. 550–558.[91] P. VELICKOVIC, G. CUCURULL, A. CASANOVA, A. ROMERO, P. LIO, AND Y. BENGIO, Graph attention networks, in International Conference on Learning Representations, 2018.[92] V. P. VISHWAKARMA AND M. GUPTA, A new learning algorithm for single hidden layer feedforward neural networks, International Journal of Computer Applications, 28 (2011), pp. 26–33.[93] T. WALSH, The PeerRank method for peer assessment, in Proceedings of the Twenty-first European Conference on Artificial Intelligence, 2014, pp. 909–914.[94] T.-W. WENG, H. ZHANG, P.-Y. CHEN, J. YI, D. SU, Y. GAO, C.-J. HSIEH, AND L. DANIEL, valuating the robustness of neural networks: An extreme value theory approach, in International Conference on Learning Representations, 2018.[95] D. B. WEST, Introduction to graph theory, vol. 2, Prentice hall Upper Saddle River, 2001.[96] H. WHITNEY, Differentiable manifolds, Annals of Mathematics, (1936), pp. 645–680.[97] S. WOLD, K. ESBENSEN, AND P. GELADI, Principal component analysis, Chemometrics and intelligent laboratory systems, 2 (1987), pp. 37–52.[98] Z. WU, S. PAN, F. CHEN, G. LONG, C. ZHANG, AND S. Y. PHILIP, A comprehensive survey on graph neural networks, IEEE Transactions on Neural Networks and Learning Systems, (2020).[99] S. XIE, R. GIRSHICK, P. DOLLAR, Z. TU, AND K. HE, Aggregated residual transformations for deep neural networks, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 1492–1500.[100] K. XU, W. HU, J. LESKOVEC, AND S. JEGELKA, How powerful are graph neural networks?, in International Conference on Learning Representations, 2018. 描述 博士
國立政治大學
應用數學系
101751503資料來源 http://thesis.lib.nccu.edu.tw/record/#G0101751503 資料類型 thesis dc.contributor.advisor 蔡炎龍<br>劉宣谷 zh_TW dc.contributor.advisor Tsai, Yen-Lung<br>Liu, Hsuan-Ku en_US dc.contributor.author (Authors) 林澤佑 zh_TW dc.contributor.author (Authors) Lin, Tse-Yu en_US dc.creator (作者) 林澤佑 zh_TW dc.creator (作者) Lin, Tse-Yu en_US dc.date (日期) 2020 en_US dc.date.accessioned 2021-03-03 - dc.date.available 2021-03-03 - dc.date.issued (上傳時間) 2021-03-03 - dc.identifier (Other Identifiers) G0101751503 en_US dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/134283 - dc.description (描述) 博士 zh_TW dc.description (描述) 國立政治大學 zh_TW dc.description (描述) 應用數學系 zh_TW dc.description (描述) 101751503 zh_TW dc.description.abstract (摘要) 仿造人腦的功能性,深層神經網路被建立用於萃取高階訊息。從數學的觀點來看,神經網路可以視為在適當定義遇上近似任意函數的近似器。為了展示深層神經網路的威力,我們在本論文的第一部分考慮神經網路的兩種不同形式的應用。第一種應用是源於衍生性金融商品的定價模型,而另一種應用則是將基於仿射空間的流形重建演算法改寫為一殘差神經網路的學習過程,而這樣的改寫提供了深層神經網路在幾何演算法上的潛在應用。本論文的第二個部分,我們聚焦在 HodgeRank,一個基於組合霍奇理論的逐對排名演算法。我們首先會回顧組合貨奇理論的背景知識,接著,我們考慮 HodgeRank 在線上同儕互評上之應用。最後,將 HodgeRank 視為 Moore-Penrose 廣義逆算子與矩陣-向量乘法的合成函數,我們可以探討 HodgeRank 的連續性。最後,我們從圖的角度證明了關於 HodgeRank 的一個連續性定理。 zh_TW dc.description.abstract (摘要) Deep neural networks are modeled to extract higher-level information in a way that is like the function of the human brain. From a mathematical perspective, neural networks are function approximators, which can approximate any function on a suitable domain.In the first part of this dissertation, we consider two different tasks to demonstrate the power of deep neural networks. One task is derived from a option pricing model of financial derivatives while another task is to rewrite an affine subspaces based manifold reconstruction algorithm to a learning process of a deep residual network. Such reformulation offers a possibility for potential application of deep neural networks to various geometrical related algorithms.In the second part, we focus on the HodgeRank, a pairwise ranking method based on the combinatorial Hodge theory. We first quick review the background of combinatorial Hodge theory, then a real world application of HodgeRank to online peer assessment is provided. Finally, by considering HodgeRank as a composition of Moore-Penrose generalized inverse and matrix-vector product, we can study the continuity of HodgeRank. In terms of graph, a theorem of continuity of the HodgeRank is provided in the end. en_US dc.description.tableofcontents 1 Introduction 11.1 Deep Learning 11.2 Manifold Reconstruction 21.3 Pairwise Ranking and Combinatorial Hodge Theory 31.4 Organization of this Dissertation 4I Deep Neural Network and Manifold Reconstruction 52 Deep Learning 62.1 Standard Structure of Neural Network 62.1.1 Learning Process 92.1.2 Universal Approximation Theorem for Shallow Net 112.2 Residual Network 132.2.1 Residual Network 132.2.2 Universal Approximation Theorem for Narrow ResNet 142.3 Graph Neural Network 162.3.1 Graph 162.3.2 Graph Attention Network 193 Neural Network Approach for Pricing American Volatility Options 243.1 Introduction 243.2 Problem Definition 253.2.1 The Probability Density Function and Expectation 253.2.2 The Solution of Partial Differential Equations 263.2.3 The Free Boundary Problem for Pricing an American Volatility Option 273.3 Properties of the Solution 293.4 Asymptotic Behavior of Exercise Boundary Infinitely Far from Expiry 333.5 Neural Network Approach 373.5.1 Comparisons 394 Reconstruction and Interpolation of Manifolds 424.1 Introduction 424.2 Hausdorff Distance and Closeness of Graphs 434.3 Reconstruction and Interpolation of Manifolds 464.4 Algorithms for Manifold Interpolation 494.5 Affine Space and Affine Projection 524.6 Manifold Interpolation as Residual Network 564.6.1 Further Relaxation 574.7 Conclusion 58II HodgeRank and its Continuity 595 Pairwise Comparison and Combinatorial Hodge Theory 605.1 Introductions 605.2 Combinatorial Hodge Theory and HodgeRank 646 On Continuity of the HodgeRank 726.1 Graph Laplacian and Generalized Inverse 726.2 HodgeRank as a Composition Function 746.2.1 Continuity of X |-> LX 766.2.2 Continuity of the Pseudoinverse Operator 766.3 Class of Graph Laplacian Matrices 786.4 Perspective from Grassmannian Manifold 786.5 Conclusion 797 Online Peer Assessment Problem 807.1 Introduction 807.2 Problem Definition 817.3 Data Description 827.4 Numerical Results 828 Conclusions and Future Works 858.1 Conclusions 858.2 Unsolved Problems and Future works 868.2.1 Dimension of the Reconstructed Manifold 868.2.2 HodgeRank and Graph Neural Network 87Bibliography 88 zh_TW dc.format.extent 924942 bytes - dc.format.mimetype application/pdf - dc.source.uri (資料來源) http://thesis.lib.nccu.edu.tw/record/#G0101751503 en_US dc.subject (關鍵詞) 深層神經網路 zh_TW dc.subject (關鍵詞) 深度殘差網路 zh_TW dc.subject (關鍵詞) 移動邊界問題 zh_TW dc.subject (關鍵詞) 流形重建 zh_TW dc.subject (關鍵詞) 霍奇排名 zh_TW dc.subject (關鍵詞) 組合霍奇理論 zh_TW dc.subject (關鍵詞) 拉普拉斯矩陣 zh_TW dc.subject (關鍵詞) 同儕互評 zh_TW dc.subject (關鍵詞) Deep neural network en_US dc.subject (關鍵詞) Deep residual network en_US dc.subject (關鍵詞) Moving boundary problem en_US dc.subject (關鍵詞) Manifold reconstruction en_US dc.subject (關鍵詞) HodgeRank en_US dc.subject (關鍵詞) Combinatorial Hodge theory en_US dc.subject (關鍵詞) Graph Laplacian en_US dc.subject (關鍵詞) Peer assessment en_US dc.title (題名) 基於深度殘差神經網路的流形嵌入與霍奇排名的連續性之探討 zh_TW dc.title (題名) Manifold embedding with deep ResNet and a study of the continuity of HodgeRank en_US dc.type (資料類型) thesis en_US dc.relation.reference (參考文獻) [1] D. BAHDANAU, K. H. CHO, AND Y. BENGIO, Neural machine translation by jointly learning to align and translate, in 3rd International Conference on LearningRepresentations, ICLR 2015, 2015.[2] S. BELCIUG AND F. GORUNESCU, Learning a single-hidden layer feed forward neural network using a rank correlation-based strategy with application to high dimensional gene expression and proteomic spectra datasets in cancer detection, Journal of biomedical informatics, 83 (2018), pp. 159–166.[3] Y. BENGIO, Learning deep architectures for AI, Now Publishers Inc, 2009.[4] J.-D. BOISSONNAT AND A. GHOSH, Manifold reconstruction using tangential delaunay complexes, Discrete & Computational Geometry, 51 (2014), pp. 221–267.[5] J.-D. BOISSONNAT, L. J. GUIBAS, AND S. Y. OUDOT, Manifold reconstruction in arbitrary dimensions using witness complexes, Discrete & Computational Geometry, 42 (2009), pp. 37–70.[6] C. BREGLER AND S. M. OMOHUNDRO, Nonlinear manifold learning for visual speech recognition, in Proceedings of IEEE International Conference on Computer Vision, IEEE, 1995, pp. 494–499.[7] H. BREZIS, Functional analysis, Sobolev spaces and partial differential equations, Springer Science & Business Media, 2010.[8] T. B. BROWN, B. MANN, N. RYDER, M. SUBBIAH, J. KAPLAN, P. DHARIWAL, A. NEELAKANTAN, P. SHYAM, G. SASTRY, A. ASKELL, ET AL., Language models are few-shot learners, arXiv preprint arXiv:2005.14165, (2020).[9] D. BURAGO, I. D. BURAGO, Y. BURAGO, S. IVANOV, AND S. A. IVANOV, A course in metric geometry, vol. 33, American Mathematical Soc., 2001.[10] J. CAO AND Z. LIN, Extreme learning machines on high dimensional and large data applications: a survey, Mathematical Problems in Engineering, (2015).[11] I. CARAGIANNIS, G. A. KRIMPAS, AND A. A. VOUDOURIS, Aggregating partial rankings with applications to peer grading in massive online open courses, in Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 2015, pp. 675–683.[12] M. P. D. CARMO, Riemannian geometry, Birkhäuser, 1992.[13] T. Q. CHEN, Y. RUBANOVA, J. BETTENCOURT, AND D. K. DUVENAUD, Neural ordinary differential equations, in Advances in Neural Information Processing Systems, 2018, pp. 6572–6583.[14] S.-W. CHENG, T. K. DEY, AND E. A. RAMOS, Manifold reconstruction from point samples., in SODA, vol. 5, 2005, pp. 1018–1027.[15] K. CHO, B. VAN MERRIENBOER, Ç. GULYEHRE, D. BAHDANAU, F. BOUGARES, H. SCHWENK, AND Y. BENGIO, Learning phrase representations using rnn encoder-decoder for statistical machine translation, in EMNLP, 2014.[16] F. R. CHUNG AND F. C. GRAHAM, Spectral graph theory, no. 92, American Mathematical Soc., 1997.[17] J. C. COX, J. E. INGERSOLL JR, AND S. A. ROSS, A theory of the term structure of interest rates, in Theory of valuation, World Scientific, 2005, pp. 129–164.[18] M. A. COX AND T. F. COX, Multidimensional scaling, in Handbook of data visualization, Springer, 2008, pp. 315–347.[19] G. CRAWFORD, The geometric mean procedure for estimating the scale of a judgement matrix, Mathematical Modelling, 9 (1987), pp. 327–334.[20] G. CYBENKO, Approximation by superpositions of a sigmoidal function, Mathematics of control, signals and systems, 2 (1989), pp. 303–314.[21] L. DENG, G. HINTON, AND B. KINGSBURY, New types of deep neural network learning for speech recognition and related applications: An overview, in 2013 IEEE international conference on acoustics, speech and signal processing, IEEE, 2013, pp. 8599–8603.[22] J. DETEMPLE AND Y. KITAPBAYEV, On American VIX options under the generalized 3/2 and 1/2 models, Mathematical Finance, 28 (2018), pp. 550–581.[23] J. DETEMPLE AND C. OSAKWE, The valuation of volatility options, Review of Finance, 4 (2000), pp. 21–50.[24] F. DEUTSCH, The angle between subspaces of a Hilbert space, in Approximation theory, wavelets and applications, Springer, 1995, pp. 107–130.[25] J. DEVLIN, M.-W. CHANG, K. LEE, AND K. TOUTANOVA, BERT: Pre-training of deepbidirectional transformers for language understanding, in Proceedings of the 2019Conference of the North American Chapter of the Association for ComputationalLinguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 2019, pp. 4171–4186.[26] M. DRAZIN, Pseudo-inverses in associative rings and semigroups, The American mathematical monthly, 65 (1958), pp. 506–514.[27] B. ERIKSSON, Learning to top-k search using pairwise comparisons, in Artificial Intelligence and Statistics, PMLR, 2013, pp. 265–273.[28] C. FARABET, C. COUPRIE, L. NAJMAN, AND Y. LECUN, Learning hierarchical features for scene labeling, IEEE transactions on pattern analysis and machine intelligence, 35 (2012), pp. 1915–1929.[29] C. FEFFERMAN, S. IVANOV, Y. KURYLEV, M. LASSAS, AND H. NARAYANAN, Fitting a putative manifold to noisy data, in Proceedings of COLT 2018 (Conference On Learning Theory), 2018.[30] -, Reconstruction and interpolation of manifolds. I: The geometric Whitney problem, Foundations of Computational Mathematics, (2019), pp. 1–99.[31] C. FEFFERMAN, S. IVANOV, M. LASSAS, AND H. NARAYANAN, Reconstruction of a Riemannian manifold from noisy intrinsic distances, SIAM Journal on Mathematics of Data Science, 2 (2020), pp. 770–808.[32] W. FELLER, Two singular diffusion problems, Annals of mathematics, (1951), pp. 173–182.[33] R. A. FISHER, The use of multiple measurements in taxonomic problems, Annals of eugenics, 7 (1936), pp. 179–188.[34] G. B. FOLLAND, Real analysis: modern techniques and their applications, vol. 40, John Wiley & Sons, 1999.[35] S. FREEMAN AND J. W. PARKS, How accurate is peer grading?, CBE-Life Sciences Education, 9 (2010), pp. 482–488.[36] F. A. GERS, J. SCHMIDHUBER, AND F. CUMMINS, Learning to forget: Continual prediction with LSTM, (1999).[37] J. GOARD AND M. MAZUR, Stochastic volatility models and the pricing of VIX options, Mathematical Finance: An International Journal of Mathematics, Statistics and Financial Economics, 23 (2013), pp. 439–458.[38] I. GOODFELLOW, Y. BENGIO, A. COURVILLE, AND Y. BENGIO, Deep learning, vol. 1, MIT press Cambridge, 2016.[39] A. GRAVES, A.-R. MOHAMED, AND G. HINTON, Speech recognition with deep recurrent neural networks, in 2013 IEEE international conference on acoustics, speech and signal processing, Ieee, 2013, pp. 6645–6649.[40] A. GRUNBICHLER AND F. A. LONGSTAFF, Valuing futures and options on volatility, Journal of Banking & Finance, 20 (1996), pp. 985–1001.[41] E. HABER AND L. RUTHOTTO, Stable architectures for deep neural networks, Inverse Problems, 34 (2017), p. 014004.[42] B. HANIN AND M. SELLKE, Approximating continuous functions by relu nets of minimal width, arXiv preprint arXiv:1710.11278, (2017).[43] K. HE, X. ZHANG, S. REN, AND J. SUN, Deep residual learning for image recognition, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778.[44] M. HERBSTER, M. PONTIL, AND L. WAINER, Online learning over graphs, in Proceedings of the 22nd international conference on Machine learning, 2005, pp. 305–312.[45] S. HOCHREITER AND J. SCHMIDHUBER, Long short-term memory, Neural computation, 9 (1997), pp. 1735–1780.[46] L. HOGBEN, Spectral graph theory and the inverse eigenvalue problem of a graph, The Electronic Journal of Linear Algebra, 14 (2005).[47] M. HOHENWARTER, Geogebra–ein softwaresystem für dynamische geometrie und algebra der ebene [geogebra–a software system for dynamic geometry and algebra in the plane] (unpublished master’s thesis), Universität Salzburg, Salzburg, Austria, (2002).[48] J. HOWARD AND S. RUDER, Universal language model fine-tuning for text classification, in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2018, pp. 328–339.[49] G.-B. HUANG, Q.-Y. ZHU, AND C.-K. SIEW, Extreme learning machine: a new learning scheme of feedforward neural networks, in 2004 IEEE international joint conference on neural networks (IEEE Cat. No. 04CH37541), vol. 2, Ieee, 2004, pp. 985–990.[50] X. HUO, X. S. NI, AND A. K. SMITH, A survey of manifold-based learning methods, Recent advances in data mining of enterprise data, (2007), pp. 691–745.[51] X. JIANG, L.-H. LIM, Y. YAO, AND Y. YE, Statistical ranking and combinatorial Hodge theory, Mathematical Programming, 127 (2011), pp. 203–244.[52] D. P. KINGMA AND J. BA, Adam: A method for stochastic optimization, in ICLR (Poster), 2015.[53] D. B. KOTLOW, A free boundary problem connected with the optimal stopping problem for diffusion processes, Transactions of the American Mathematical Society, 184 (1973), pp. 457–478.[54] A. KRIZHEVSKY, I. SUTSKEVER, AND G. E. HINTON, ImageNet classification with deep convolutional neural networks, in Advances in neural information processing systems, 2012, pp. 1097–1105.[55] Y. LECUN, Y. BENGIO, AND G. HINTON, Deep learning, nature, 521 (2015), pp. 436–444.[56] Y. LECUN, L. BOTTOU, Y. BENGIO, AND P. HAFFNER, Gradient-based learning applied to document recognition, Proceedings of the IEEE, 86 (1998), pp. 2278–2324.[57] D. LEVIN, The approximation power of moving least-squares, Mathematics of computation, 67 (1998), pp. 1517–1531.[58] H. LIN AND S. JEGELKA, Resnet with one-neuron hidden layers is a universal approximator, Advances in neural information processing systems, 31 (2018), pp. 6169–6178.[59] T. LIN AND H. ZHA, Riemannian manifold learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, 30 (2008), pp. 796–809.[60] T.-Y. LIN AND Y.-L. TSAI, An application of hodgerank to online peer assessment, in Proceedings of the 2017 SIAM Workshop on Machine Learning for Recommender Systems (MLRec), 2018.[61] Y. LIPMAN, D. COHEN-OR, D. LEVIN, AND H. TAL-EZER, Parameterization-free projection for geometry reconstruction, ACM Transactions on Graphics (TOG), 26 (2007), pp. 22–es.[62] H.-K. LIU, Properties of American volatility options in the mean-reverting 3/2 volatility model, SIAM Journal on Financial Mathematics, 6 (2015), pp. 53–65.[63] H.-K. LIU, T.-Y. LIN, AND Y.-L. TSAI, On the pricing formula for the perpetual American volatility option under the mean-reverting processes, Taiwanese Journal of Mathematics, (2020).[64] T.-Y. LIU, Learning to rank for information retrieval, (2011).[65] Y. LU, A. ZHONG, Q. LI, AND B. DONG, Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations, in International Conference on Machine Learning, PMLR, 2018, pp. 3276–3285.[66] Z. LU, H. PU, F. WANG, Z. HU, AND L. WANG, The expressive power of neural networks: A view from the width, Advances in neural information processing systems, 30 (2017), pp. 6231–6239.[67] A. L. MAAS, A. Y. HANNUN, AND A. Y. NG, Rectifier nonlinearities improve neuralnetwork acoustic models, in ICML Workshop on Deep Learning for Audio, Speechand Language Processing, 2013.[68] G. A. MILLER, The magical number seven, plus or minus two: Some limits on our capacity for processing information., Psychological review, 63 (1956), p. 81.[69] J. W. Milnor and j. D. Stasheff, Characteristic Classes, no. 76, Princeton University Press, 1974.[70] V. MNIH, K. KAVUKCUOGLU, D. SILVER, A. GRAVES, I. ANTONOGLOU, D. WIERSTRA, AND M. RIEDMILLER, Playing Atari with deep reinforcement learning, (2013). cite arxiv:1312.5602Comment: NIPS Deep Learning Workshop 2013.[71] Y. NESTEROV, A method for unconstrained convex minimization problem with the rate of convergence o (1/kˆ 2), in Doklady an ussr, vol. 269, 1983, pp. 543–547.[72] B. NEYSHABUR, S. BHOJANAPALLI, D. MCALLESTER, AND N. SREBRO, Exploring generalization in deep learning, in Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017, pp. 5949–5958.[73] P. NIYOGI, S. SMALE, AND S. WEINBERGER, Finding the homology of submanifolds with high confidence from random samples, Discrete & Computational Geometry, 39 (2008), pp. 419–441.[74] Y. PARMAR, S. NATARAJAN, AND G. SOBHA, Deeprange: deep-learning-based object detection and ranging in autonomous driving, IET Intelligent Transport Systems, 13 (2019), pp. 1256–1264.[75] M. PETERS, M. NEUMANN, M. IYYER, M. GARDNER, C. CLARK, K. LEE, AND L. ZETTLEMOYER, Deep contextualized word representations, in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2018, pp. 2227–2237.[76] G. PHILIPP, D. SONG, AND J. G. CARBONELL, Gradients explode-deep networks are shallow-resnet explained, (2018).[77] R. PIZIAK AND P. ODELL, Affine projections, Computers & Mathematics with Applications, 48 (2004), pp. 177–190.[78] R. PLESS AND R. SOUVENIR, A survey of manifold learning for images, IPSJ Transactions on Computer Vision and Applications, 1 (2009), pp. 83–94.[79] L. RUTHOTTO AND E. HABER, Deep neural networks motivated by partial differential equations, Journal of Mathematical Imaging and Vision, (2019), pp. 1–13.[80] R. W. SAATY, The analytic hierarchy process-what it is and how it is used, Mathematical modelling, 9 (1987), pp. 161–176.[81] M. SAERENS, F. FOUSS, L. YEN, AND P. DUPONT, The principal components analysis of a graph, and its relationships to spectral clustering, in European conference on machine learning, Springer, 2004, pp. 371–383.[82] F. SCARSELLI, M. GORI, A. C. TSOI, M. HAGENBUCHNER, AND G. MONFARDINI, The graph neural network model, IEEE Transactions on Neural Networks, 20 (2008), pp. 61–80.[83] R. K. SIZEMORE, HodgeRank: Applying combinatorial Hodge theory to sports ranking, PhD thesis, Wake Forest University, 2013.[84] B. SOBER AND D. LEVIN, Manifold approximation by moving least-squares projection (MMLS), Constructive Approximation, (2019), pp. 1–46.[85] G. STEWART, On the continuity of the generalized inverse, SIAM Journal on Applied Mathematics, 17 (1969), pp. 33–45.[86] Y. TAY, M. DEHGHANI, D. BAHRI, AND D. METZLER, Efficient transformers: A survey, arXiv preprint arXiv:2009.06732, (2020).[87] N. M. TRAN, Pairwise ranking: choice of method can produce arbitrarily different rank order, Linear Algebra and its Applications, 438 (2013), pp. 1012–1024.[88] G. VAN ROSSUM AND F. L. DRAKE JR, Python tutorial, vol. 620, Centrum voor Wiskunde en Informatica Amsterdam, 1995.[89] A. VASWANI, N. SHAZEER, N. PARMAR, J. USZKOREIT, L. JONES, A. N. GOMEZ, L. KAISER, AND I. POLOSUKHIN, Attention is all you need, Advances in neural information processing systems, 30 (2017), pp. 5998–6008.[90] A. VEIT, M. J. WILBER, AND S. BELONGIE, Residual networks behave like ensembles of relatively shallow networks, Advances in neural information processing systems, 29 (2016), pp. 550–558.[91] P. VELICKOVIC, G. CUCURULL, A. CASANOVA, A. ROMERO, P. LIO, AND Y. BENGIO, Graph attention networks, in International Conference on Learning Representations, 2018.[92] V. P. VISHWAKARMA AND M. GUPTA, A new learning algorithm for single hidden layer feedforward neural networks, International Journal of Computer Applications, 28 (2011), pp. 26–33.[93] T. WALSH, The PeerRank method for peer assessment, in Proceedings of the Twenty-first European Conference on Artificial Intelligence, 2014, pp. 909–914.[94] T.-W. WENG, H. ZHANG, P.-Y. CHEN, J. YI, D. SU, Y. GAO, C.-J. HSIEH, AND L. DANIEL, valuating the robustness of neural networks: An extreme value theory approach, in International Conference on Learning Representations, 2018.[95] D. B. WEST, Introduction to graph theory, vol. 2, Prentice hall Upper Saddle River, 2001.[96] H. WHITNEY, Differentiable manifolds, Annals of Mathematics, (1936), pp. 645–680.[97] S. WOLD, K. ESBENSEN, AND P. GELADI, Principal component analysis, Chemometrics and intelligent laboratory systems, 2 (1987), pp. 37–52.[98] Z. WU, S. PAN, F. CHEN, G. LONG, C. ZHANG, AND S. Y. PHILIP, A comprehensive survey on graph neural networks, IEEE Transactions on Neural Networks and Learning Systems, (2020).[99] S. XIE, R. GIRSHICK, P. DOLLAR, Z. TU, AND K. HE, Aggregated residual transformations for deep neural networks, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 1492–1500.[100] K. XU, W. HU, J. LESKOVEC, AND S. JEGELKA, How powerful are graph neural networks?, in International Conference on Learning Representations, 2018. zh_TW dc.identifier.doi (DOI) 10.6814/NCCU202100296 en_US