基於深度殘差神經網路的流形嵌入與霍奇排名的連續性之探討

學術產出-Theses

Article View/Open

pdf(31)

Publication Export

Google Scholar^TM

政大圖書館

學術資源探索系統

Citation Infomation

Simple Record
Full Record

題名	基於深度殘差神經網路的流形嵌入與霍奇排名的連續性之探討 Manifold embedding with deep ResNet and a study of the continuity of HodgeRank
作者	林澤佑 Lin, Tse-Yu
貢獻者	蔡炎龍<br>劉宣谷 Tsai, Yen-Lung<br>Liu, Hsuan-Ku 林澤佑 Lin, Tse-Yu
關鍵詞	深層神經網路深度殘差網路移動邊界問題流形重建霍奇排名組合霍奇理論拉普拉斯矩陣同儕互評 Deep neural network Deep residual network Moving boundary problem Manifold reconstruction HodgeRank Combinatorial Hodge theory Graph Laplacian Peer assessment
日期	2020
上傳時間	2021-03-03
摘要	仿造人腦的功能性，深層神經網路被建立用於萃取高階訊息。從數學的觀點來看，神經網路可以視為在適當定義遇上近似任意函數的近似器。為了展示深層神經網路的威力，我們在本論文的第一部分考慮神經網路的兩種不同形式的應用。第一種應用是源於衍生性金融商品的定價模型，而另一種應用則是將基於仿射空間的流形重建演算法改寫為一殘差神經網路的學習過程，而這樣的改寫提供了深層神經網路在幾何演算法上的潛在應用。本論文的第二個部分，我們聚焦在 HodgeRank，一個基於組合霍奇理論的逐對排名演算法。我們首先會回顧組合貨奇理論的背景知識，接著，我們考慮 HodgeRank 在線上同儕互評上之應用。最後，將 HodgeRank 視為 Moore-Penrose 廣義逆算子與矩陣-向量乘法的合成函數，我們可以探討 HodgeRank 的連續性。最後，我們從圖的角度證明了關於 HodgeRank 的一個連續性定理。 Deep neural networks are modeled to extract higher-level information in a way that is like the function of the human brain. From a mathematical perspective, neural networks are function approximators, which can approximate any function on a suitable domain. In the first part of this dissertation, we consider two different tasks to demonstrate the power of deep neural networks. One task is derived from a option pricing model of financial derivatives while another task is to rewrite an affine subspaces based manifold reconstruction algorithm to a learning process of a deep residual network. Such reformulation offers a possibility for potential application of deep neural networks to various geometrical related algorithms. In the second part, we focus on the HodgeRank, a pairwise ranking method based on the combinatorial Hodge theory. We first quick review the background of combinatorial Hodge theory, then a real world application of HodgeRank to online peer assessment is provided. Finally, by considering HodgeRank as a composition of Moore-Penrose generalized inverse and matrix-vector product, we can study the continuity of HodgeRank. In terms of graph, a theorem of continuity of the HodgeRank is provided in the end.
參考文獻	[1] D. BAHDANAU, K. H. CHO, AND Y. BENGIO, Neural machine translation by jointly learning to align and translate, in 3rd International Conference on Learning Representations, ICLR 2015, 2015. [2] S. BELCIUG AND F. GORUNESCU, Learning a single-hidden layer feed forward neural network using a rank correlation-based strategy with application to high dimensional gene expression and proteomic spectra datasets in cancer detection, Journal of biomedical informatics, 83 (2018), pp. 159–166. [3] Y. BENGIO, Learning deep architectures for AI, Now Publishers Inc, 2009. [4] J.-D. BOISSONNAT AND A. GHOSH, Manifold reconstruction using tangential delaunay complexes, Discrete & Computational Geometry, 51 (2014), pp. 221–267. [5] J.-D. BOISSONNAT, L. J. GUIBAS, AND S. Y. OUDOT, Manifold reconstruction in arbitrary dimensions using witness complexes, Discrete & Computational Geometry, 42 (2009), pp. 37–70. [6] C. BREGLER AND S. M. OMOHUNDRO, Nonlinear manifold learning for visual speech recognition, in Proceedings of IEEE International Conference on Computer Vision, IEEE, 1995, pp. 494–499. [7] H. BREZIS, Functional analysis, Sobolev spaces and partial differential equations, Springer Science & Business Media, 2010. [8] T. B. BROWN, B. MANN, N. RYDER, M. SUBBIAH, J. KAPLAN, P. DHARIWAL, A. NEELAKANTAN, P. SHYAM, G. SASTRY, A. ASKELL, ET AL., Language models are few-shot learners, arXiv preprint arXiv:2005.14165, (2020). [9] D. BURAGO, I. D. BURAGO, Y. BURAGO, S. IVANOV, AND S. A. IVANOV, A course in metric geometry, vol. 33, American Mathematical Soc., 2001. [10] J. CAO AND Z. LIN, Extreme learning machines on high dimensional and large data applications: a survey, Mathematical Problems in Engineering, (2015). [11] I. CARAGIANNIS, G. A. KRIMPAS, AND A. A. VOUDOURIS, Aggregating partial rankings with applications to peer grading in massive online open courses, in Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 2015, pp. 675–683. [12] M. P. D. CARMO, Riemannian geometry, Birkhäuser, 1992. [13] T. Q. CHEN, Y. RUBANOVA, J. BETTENCOURT, AND D. K. DUVENAUD, Neural ordinary differential equations, in Advances in Neural Information Processing Systems, 2018, pp. 6572–6583. [14] S.-W. CHENG, T. K. DEY, AND E. A. RAMOS, Manifold reconstruction from point samples., in SODA, vol. 5, 2005, pp. 1018–1027. [15] K. CHO, B. VAN MERRIENBOER, Ç. GULYEHRE, D. BAHDANAU, F. BOUGARES, H. SCHWENK, AND Y. BENGIO, Learning phrase representations using rnn encoder-decoder for statistical machine translation, in EMNLP, 2014. [16] F. R. CHUNG AND F. C. GRAHAM, Spectral graph theory, no. 92, American Mathematical Soc., 1997. [17] J. C. COX, J. E. INGERSOLL JR, AND S. A. ROSS, A theory of the term structure of interest rates, in Theory of valuation, World Scientific, 2005, pp. 129–164. [18] M. A. COX AND T. F. COX, Multidimensional scaling, in Handbook of data visualization, Springer, 2008, pp. 315–347. [19] G. CRAWFORD, The geometric mean procedure for estimating the scale of a judgement matrix, Mathematical Modelling, 9 (1987), pp. 327–334. [20] G. CYBENKO, Approximation by superpositions of a sigmoidal function, Mathematics of control, signals and systems, 2 (1989), pp. 303–314. [21] L. DENG, G. HINTON, AND B. KINGSBURY, New types of deep neural network learning for speech recognition and related applications: An overview, in 2013 IEEE international conference on acoustics, speech and signal processing, IEEE, 2013, pp. 8599–8603. [22] J. DETEMPLE AND Y. KITAPBAYEV, On American VIX options under the generalized 3/2 and 1/2 models, Mathematical Finance, 28 (2018), pp. 550–581. [23] J. DETEMPLE AND C. OSAKWE, The valuation of volatility options, Review of Finance, 4 (2000), pp. 21–50. [24] F. DEUTSCH, The angle between subspaces of a Hilbert space, in Approximation theory, wavelets and applications, Springer, 1995, pp. 107–130. [25] J. DEVLIN, M.-W. CHANG, K. LEE, AND K. TOUTANOVA, BERT: Pre-training of deep bidirectional transformers for language understanding, in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 2019, pp. 4171–4186. [26] M. DRAZIN, Pseudo-inverses in associative rings and semigroups, The American mathematical monthly, 65 (1958), pp. 506–514. [27] B. ERIKSSON, Learning to top-k search using pairwise comparisons, in Artificial Intelligence and Statistics, PMLR, 2013, pp. 265–273. [28] C. FARABET, C. COUPRIE, L. NAJMAN, AND Y. LECUN, Learning hierarchical features for scene labeling, IEEE transactions on pattern analysis and machine intelligence, 35 (2012), pp. 1915–1929. [29] C. FEFFERMAN, S. IVANOV, Y. KURYLEV, M. LASSAS, AND H. NARAYANAN, Fitting a putative manifold to noisy data, in Proceedings of COLT 2018 (Conference On Learning Theory), 2018. [30] -, Reconstruction and interpolation of manifolds. I: The geometric Whitney problem, Foundations of Computational Mathematics, (2019), pp. 1–99. [31] C. FEFFERMAN, S. IVANOV, M. LASSAS, AND H. NARAYANAN, Reconstruction of a Riemannian manifold from noisy intrinsic distances, SIAM Journal on Mathematics of Data Science, 2 (2020), pp. 770–808. [32] W. FELLER, Two singular diffusion problems, Annals of mathematics, (1951), pp. 173–182. [33] R. A. FISHER, The use of multiple measurements in taxonomic problems, Annals of eugenics, 7 (1936), pp. 179–188. [34] G. B. FOLLAND, Real analysis: modern techniques and their applications, vol. 40, John Wiley & Sons, 1999. [35] S. FREEMAN AND J. W. PARKS, How accurate is peer grading?, CBE-Life Sciences Education, 9 (2010), pp. 482–488. [36] F. A. GERS, J. SCHMIDHUBER, AND F. CUMMINS, Learning to forget: Continual prediction with LSTM, (1999). [37] J. GOARD AND M. MAZUR, Stochastic volatility models and the pricing of VIX options, Mathematical Finance: An International Journal of Mathematics, Statistics and Financial Economics, 23 (2013), pp. 439–458. [38] I. GOODFELLOW, Y. BENGIO, A. COURVILLE, AND Y. BENGIO, Deep learning, vol. 1, MIT press Cambridge, 2016. [39] A. GRAVES, A.-R. MOHAMED, AND G. HINTON, Speech recognition with deep recurrent neural networks, in 2013 IEEE international conference on acoustics, speech and signal processing, Ieee, 2013, pp. 6645–6649. [40] A. GRUNBICHLER AND F. A. LONGSTAFF, Valuing futures and options on volatility, Journal of Banking & Finance, 20 (1996), pp. 985–1001. [41] E. HABER AND L. RUTHOTTO, Stable architectures for deep neural networks, Inverse Problems, 34 (2017), p. 014004. [42] B. HANIN AND M. SELLKE, Approximating continuous functions by relu nets of minimal width, arXiv preprint arXiv:1710.11278, (2017). [43] K. HE, X. ZHANG, S. REN, AND J. SUN, Deep residual learning for image recognition, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778. [44] M. HERBSTER, M. PONTIL, AND L. WAINER, Online learning over graphs, in Proceedings of the 22nd international conference on Machine learning, 2005, pp. 305–312. [45] S. HOCHREITER AND J. SCHMIDHUBER, Long short-term memory, Neural computation, 9 (1997), pp. 1735–1780. [46] L. HOGBEN, Spectral graph theory and the inverse eigenvalue problem of a graph, The Electronic Journal of Linear Algebra, 14 (2005). [47] M. HOHENWARTER, Geogebra–ein softwaresystem für dynamische geometrie und algebra der ebene [geogebra–a software system for dynamic geometry and algebra in the plane] (unpublished master’s thesis), Universität Salzburg, Salzburg, Austria, (2002). [48] J. HOWARD AND S. RUDER, Universal language model fine-tuning for text classification, in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2018, pp. 328–339. [49] G.-B. HUANG, Q.-Y. ZHU, AND C.-K. SIEW, Extreme learning machine: a new learning scheme of feedforward neural networks, in 2004 IEEE international joint conference on neural networks (IEEE Cat. No. 04CH37541), vol. 2, Ieee, 2004, pp. 985–990. [50] X. HUO, X. S. NI, AND A. K. SMITH, A survey of manifold-based learning methods, Recent advances in data mining of enterprise data, (2007), pp. 691–745. [51] X. JIANG, L.-H. LIM, Y. YAO, AND Y. YE, Statistical ranking and combinatorial Hodge theory, Mathematical Programming, 127 (2011), pp. 203–244. [52] D. P. KINGMA AND J. BA, Adam: A method for stochastic optimization, in ICLR (Poster), 2015. [53] D. B. KOTLOW, A free boundary problem connected with the optimal stopping problem for diffusion processes, Transactions of the American Mathematical Society, 184 (1973), pp. 457–478. [54] A. KRIZHEVSKY, I. SUTSKEVER, AND G. E. HINTON, ImageNet classification with deep convolutional neural networks, in Advances in neural information processing systems, 2012, pp. 1097–1105. [55] Y. LECUN, Y. BENGIO, AND G. HINTON, Deep learning, nature, 521 (2015), pp. 436–444. [56] Y. LECUN, L. BOTTOU, Y. BENGIO, AND P. HAFFNER, Gradient-based learning applied to document recognition, Proceedings of the IEEE, 86 (1998), pp. 2278–2324. [57] D. LEVIN, The approximation power of moving least-squares, Mathematics of computation, 67 (1998), pp. 1517–1531. [58] H. LIN AND S. JEGELKA, Resnet with one-neuron hidden layers is a universal approximator, Advances in neural information processing systems, 31 (2018), pp. 6169–6178. [59] T. LIN AND H. ZHA, Riemannian manifold learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, 30 (2008), pp. 796–809. [60] T.-Y. LIN AND Y.-L. TSAI, An application of hodgerank to online peer assessment, in Proceedings of the 2017 SIAM Workshop on Machine Learning for Recommender Systems (MLRec), 2018. [61] Y. LIPMAN, D. COHEN-OR, D. LEVIN, AND H. TAL-EZER, Parameterization-free projection for geometry reconstruction, ACM Transactions on Graphics (TOG), 26 (2007), pp. 22–es. [62] H.-K. LIU, Properties of American volatility options in the mean-reverting 3/2 volatility model, SIAM Journal on Financial Mathematics, 6 (2015), pp. 53–65. [63] H.-K. LIU, T.-Y. LIN, AND Y.-L. TSAI, On the pricing formula for the perpetual American volatility option under the mean-reverting processes, Taiwanese Journal of Mathematics, (2020). [64] T.-Y. LIU, Learning to rank for information retrieval, (2011). [65] Y. LU, A. ZHONG, Q. LI, AND B. DONG, Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations, in International Conference on Machine Learning, PMLR, 2018, pp. 3276–3285. [66] Z. LU, H. PU, F. WANG, Z. HU, AND L. WANG, The expressive power of neural networks: A view from the width, Advances in neural information processing systems, 30 (2017), pp. 6231–6239. [67] A. L. MAAS, A. Y. HANNUN, AND A. Y. NG, Rectifier nonlinearities improve neural network acoustic models, in ICML Workshop on Deep Learning for Audio, Speech and Language Processing, 2013. [68] G. A. MILLER, The magical number seven, plus or minus two: Some limits on our capacity for processing information., Psychological review, 63 (1956), p. 81. [69] J. W. Milnor and j. D. Stasheff, Characteristic Classes, no. 76, Princeton University Press, 1974. [70] V. MNIH, K. KAVUKCUOGLU, D. SILVER, A. GRAVES, I. ANTONOGLOU, D. WIERSTRA, AND M. RIEDMILLER, Playing Atari with deep reinforcement learning, (2013). cite arxiv:1312.5602Comment: NIPS Deep Learning Workshop 2013. [71] Y. NESTEROV, A method for unconstrained convex minimization problem with the rate of convergence o (1/kˆ 2), in Doklady an ussr, vol. 269, 1983, pp. 543–547. [72] B. NEYSHABUR, S. BHOJANAPALLI, D. MCALLESTER, AND N. SREBRO, Exploring generalization in deep learning, in Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017, pp. 5949–5958. [73] P. NIYOGI, S. SMALE, AND S. WEINBERGER, Finding the homology of submanifolds with high confidence from random samples, Discrete & Computational Geometry, 39 (2008), pp. 419–441. [74] Y. PARMAR, S. NATARAJAN, AND G. SOBHA, Deeprange: deep-learning-based object detection and ranging in autonomous driving, IET Intelligent Transport Systems, 13 (2019), pp. 1256–1264. [75] M. PETERS, M. NEUMANN, M. IYYER, M. GARDNER, C. CLARK, K. LEE, AND L. ZETTLEMOYER, Deep contextualized word representations, in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2018, pp. 2227–2237. [76] G. PHILIPP, D. SONG, AND J. G. CARBONELL, Gradients explode-deep networks are shallow-resnet explained, (2018). [77] R. PIZIAK AND P. ODELL, Affine projections, Computers & Mathematics with Applications, 48 (2004), pp. 177–190. [78] R. PLESS AND R. SOUVENIR, A survey of manifold learning for images, IPSJ Transactions on Computer Vision and Applications, 1 (2009), pp. 83–94. [79] L. RUTHOTTO AND E. HABER, Deep neural networks motivated by partial differential equations, Journal of Mathematical Imaging and Vision, (2019), pp. 1–13. [80] R. W. SAATY, The analytic hierarchy process-what it is and how it is used, Mathematical modelling, 9 (1987), pp. 161–176. [81] M. SAERENS, F. FOUSS, L. YEN, AND P. DUPONT, The principal components analysis of a graph, and its relationships to spectral clustering, in European conference on machine learning, Springer, 2004, pp. 371–383. [82] F. SCARSELLI, M. GORI, A. C. TSOI, M. HAGENBUCHNER, AND G. MONFARDINI, The graph neural network model, IEEE Transactions on Neural Networks, 20 (2008), pp. 61–80. [83] R. K. SIZEMORE, HodgeRank: Applying combinatorial Hodge theory to sports ranking, PhD thesis, Wake Forest University, 2013. [84] B. SOBER AND D. LEVIN, Manifold approximation by moving least-squares projection (MMLS), Constructive Approximation, (2019), pp. 1–46. [85] G. STEWART, On the continuity of the generalized inverse, SIAM Journal on Applied Mathematics, 17 (1969), pp. 33–45. [86] Y. TAY, M. DEHGHANI, D. BAHRI, AND D. METZLER, Efficient transformers: A survey, arXiv preprint arXiv:2009.06732, (2020). [87] N. M. TRAN, Pairwise ranking: choice of method can produce arbitrarily different rank order, Linear Algebra and its Applications, 438 (2013), pp. 1012–1024. [88] G. VAN ROSSUM AND F. L. DRAKE JR, Python tutorial, vol. 620, Centrum voor Wiskunde en Informatica Amsterdam, 1995. [89] A. VASWANI, N. SHAZEER, N. PARMAR, J. USZKOREIT, L. JONES, A. N. GOMEZ, L. KAISER, AND I. POLOSUKHIN, Attention is all you need, Advances in neural information processing systems, 30 (2017), pp. 5998–6008. [90] A. VEIT, M. J. WILBER, AND S. BELONGIE, Residual networks behave like ensembles of relatively shallow networks, Advances in neural information processing systems, 29 (2016), pp. 550–558. [91] P. VELICKOVIC, G. CUCURULL, A. CASANOVA, A. ROMERO, P. LIO, AND Y. BENGIO, Graph attention networks, in International Conference on Learning Representations, 2018. [92] V. P. VISHWAKARMA AND M. GUPTA, A new learning algorithm for single hidden layer feedforward neural networks, International Journal of Computer Applications, 28 (2011), pp. 26–33. [93] T. WALSH, The PeerRank method for peer assessment, in Proceedings of the Twenty-first European Conference on Artificial Intelligence, 2014, pp. 909–914. [94] T.-W. WENG, H. ZHANG, P.-Y. CHEN, J. YI, D. SU, Y. GAO, C.-J. HSIEH, AND L. DANIEL, valuating the robustness of neural networks: An extreme value theory approach, in International Conference on Learning Representations, 2018. [95] D. B. WEST, Introduction to graph theory, vol. 2, Prentice hall Upper Saddle River, 2001. [96] H. WHITNEY, Differentiable manifolds, Annals of Mathematics, (1936), pp. 645–680. [97] S. WOLD, K. ESBENSEN, AND P. GELADI, Principal component analysis, Chemometrics and intelligent laboratory systems, 2 (1987), pp. 37–52. [98] Z. WU, S. PAN, F. CHEN, G. LONG, C. ZHANG, AND S. Y. PHILIP, A comprehensive survey on graph neural networks, IEEE Transactions on Neural Networks and Learning Systems, (2020). [99] S. XIE, R. GIRSHICK, P. DOLLAR, Z. TU, AND K. HE, Aggregated residual transformations for deep neural networks, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 1492–1500. [100] K. XU, W. HU, J. LESKOVEC, AND S. JEGELKA, How powerful are graph neural networks?, in International Conference on Learning Representations, 2018.
描述	博士國立政治大學應用數學系 101751503
資料來源	http://thesis.lib.nccu.edu.tw/record/#G0101751503
資料類型	thesis

dc.contributor.advisor	蔡炎龍<br>劉宣谷	zh_TW
dc.contributor.advisor	Tsai, Yen-Lung<br>Liu, Hsuan-Ku	en_US
dc.contributor.author (Authors)	林澤佑	zh_TW
dc.contributor.author (Authors)	Lin, Tse-Yu	en_US
dc.creator (作者)	林澤佑	zh_TW
dc.creator (作者)	Lin, Tse-Yu	en_US
dc.date (日期)	2020	en_US
dc.date.accessioned	2021-03-03	-
dc.date.available	2021-03-03	-
dc.date.issued (上傳時間)	2021-03-03	-
dc.identifier (Other Identifiers)	G0101751503	en_US
dc.identifier.uri (URI)	http://nccur.lib.nccu.edu.tw/handle/140.119/134283	-
dc.description (描述)	博士	zh_TW
dc.description (描述)	國立政治大學	zh_TW
dc.description (描述)	應用數學系	zh_TW
dc.description (描述)	101751503	zh_TW
dc.description.abstract (摘要)	仿造人腦的功能性，深層神經網路被建立用於萃取高階訊息。從數學的觀點來看，神經網路可以視為在適當定義遇上近似任意函數的近似器。為了展示深層神經網路的威力，我們在本論文的第一部分考慮神經網路的兩種不同形式的應用。第一種應用是源於衍生性金融商品的定價模型，而另一種應用則是將基於仿射空間的流形重建演算法改寫為一殘差神經網路的學習過程，而這樣的改寫提供了深層神經網路在幾何演算法上的潛在應用。本論文的第二個部分，我們聚焦在 HodgeRank，一個基於組合霍奇理論的逐對排名演算法。我們首先會回顧組合貨奇理論的背景知識，接著，我們考慮 HodgeRank 在線上同儕互評上之應用。最後，將 HodgeRank 視為 Moore-Penrose 廣義逆算子與矩陣-向量乘法的合成函數，我們可以探討 HodgeRank 的連續性。最後，我們從圖的角度證明了關於 HodgeRank 的一個連續性定理。	zh_TW
dc.description.abstract (摘要)	Deep neural networks are modeled to extract higher-level information in a way that is like the function of the human brain. From a mathematical perspective, neural networks are function approximators, which can approximate any function on a suitable domain. In the first part of this dissertation, we consider two different tasks to demonstrate the power of deep neural networks. One task is derived from a option pricing model of financial derivatives while another task is to rewrite an affine subspaces based manifold reconstruction algorithm to a learning process of a deep residual network. Such reformulation offers a possibility for potential application of deep neural networks to various geometrical related algorithms. In the second part, we focus on the HodgeRank, a pairwise ranking method based on the combinatorial Hodge theory. We first quick review the background of combinatorial Hodge theory, then a real world application of HodgeRank to online peer assessment is provided. Finally, by considering HodgeRank as a composition of Moore-Penrose generalized inverse and matrix-vector product, we can study the continuity of HodgeRank. In terms of graph, a theorem of continuity of the HodgeRank is provided in the end.	en_US
dc.description.tableofcontents	1 Introduction 1 1.1 Deep Learning 1 1.2 Manifold Reconstruction 2 1.3 Pairwise Ranking and Combinatorial Hodge Theory 3 1.4 Organization of this Dissertation 4 I Deep Neural Network and Manifold Reconstruction 5 2 Deep Learning 6 2.1 Standard Structure of Neural Network 6 2.1.1 Learning Process 9 2.1.2 Universal Approximation Theorem for Shallow Net 11 2.2 Residual Network 13 2.2.1 Residual Network 13 2.2.2 Universal Approximation Theorem for Narrow ResNet 14 2.3 Graph Neural Network 16 2.3.1 Graph 16 2.3.2 Graph Attention Network 19 3 Neural Network Approach for Pricing American Volatility Options 24 3.1 Introduction 24 3.2 Problem Definition 25 3.2.1 The Probability Density Function and Expectation 25 3.2.2 The Solution of Partial Differential Equations 26 3.2.3 The Free Boundary Problem for Pricing an American Volatility Option 27 3.3 Properties of the Solution 29 3.4 Asymptotic Behavior of Exercise Boundary Infinitely Far from Expiry 33 3.5 Neural Network Approach 37 3.5.1 Comparisons 39 4 Reconstruction and Interpolation of Manifolds 42 4.1 Introduction 42 4.2 Hausdorff Distance and Closeness of Graphs 43 4.3 Reconstruction and Interpolation of Manifolds 46 4.4 Algorithms for Manifold Interpolation 49 4.5 Affine Space and Affine Projection 52 4.6 Manifold Interpolation as Residual Network 56 4.6.1 Further Relaxation 57 4.7 Conclusion 58 II HodgeRank and its Continuity 59 5 Pairwise Comparison and Combinatorial Hodge Theory 60 5.1 Introductions 60 5.2 Combinatorial Hodge Theory and HodgeRank 64 6 On Continuity of the HodgeRank 72 6.1 Graph Laplacian and Generalized Inverse 72 6.2 HodgeRank as a Composition Function 74 6.2.1 Continuity of X \|-> LX 76 6.2.2 Continuity of the Pseudoinverse Operator 76 6.3 Class of Graph Laplacian Matrices 78 6.4 Perspective from Grassmannian Manifold 78 6.5 Conclusion 79 7 Online Peer Assessment Problem 80 7.1 Introduction 80 7.2 Problem Definition 81 7.3 Data Description 82 7.4 Numerical Results 82 8 Conclusions and Future Works 85 8.1 Conclusions 85 8.2 Unsolved Problems and Future works 86 8.2.1 Dimension of the Reconstructed Manifold 86 8.2.2 HodgeRank and Graph Neural Network 87 Bibliography 88	zh_TW
dc.format.extent	924942 bytes	-
dc.format.mimetype	application/pdf	-
dc.source.uri (資料來源)	http://thesis.lib.nccu.edu.tw/record/#G0101751503	en_US
dc.subject (關鍵詞)	深層神經網路	zh_TW
dc.subject (關鍵詞)	深度殘差網路	zh_TW
dc.subject (關鍵詞)	移動邊界問題	zh_TW
dc.subject (關鍵詞)	流形重建	zh_TW
dc.subject (關鍵詞)	霍奇排名	zh_TW
dc.subject (關鍵詞)	組合霍奇理論	zh_TW
dc.subject (關鍵詞)	拉普拉斯矩陣	zh_TW
dc.subject (關鍵詞)	同儕互評	zh_TW
dc.subject (關鍵詞)	Deep neural network	en_US
dc.subject (關鍵詞)	Deep residual network	en_US
dc.subject (關鍵詞)	Moving boundary problem	en_US
dc.subject (關鍵詞)	Manifold reconstruction	en_US
dc.subject (關鍵詞)	HodgeRank	en_US
dc.subject (關鍵詞)	Combinatorial Hodge theory	en_US
dc.subject (關鍵詞)	Graph Laplacian	en_US
dc.subject (關鍵詞)	Peer assessment	en_US
dc.title (題名)	基於深度殘差神經網路的流形嵌入與霍奇排名的連續性之探討	zh_TW
dc.title (題名)	Manifold embedding with deep ResNet and a study of the continuity of HodgeRank	en_US
dc.type (資料類型)	thesis	en_US
dc.relation.reference (參考文獻)	[1] D. BAHDANAU, K. H. CHO, AND Y. BENGIO, Neural machine translation by jointly learning to align and translate, in 3rd International Conference on Learning Representations, ICLR 2015, 2015. [2] S. BELCIUG AND F. GORUNESCU, Learning a single-hidden layer feed forward neural network using a rank correlation-based strategy with application to high dimensional gene expression and proteomic spectra datasets in cancer detection, Journal of biomedical informatics, 83 (2018), pp. 159–166. [3] Y. BENGIO, Learning deep architectures for AI, Now Publishers Inc, 2009. [4] J.-D. BOISSONNAT AND A. GHOSH, Manifold reconstruction using tangential delaunay complexes, Discrete & Computational Geometry, 51 (2014), pp. 221–267. [5] J.-D. BOISSONNAT, L. J. GUIBAS, AND S. Y. OUDOT, Manifold reconstruction in arbitrary dimensions using witness complexes, Discrete & Computational Geometry, 42 (2009), pp. 37–70. [6] C. BREGLER AND S. M. OMOHUNDRO, Nonlinear manifold learning for visual speech recognition, in Proceedings of IEEE International Conference on Computer Vision, IEEE, 1995, pp. 494–499. [7] H. BREZIS, Functional analysis, Sobolev spaces and partial differential equations, Springer Science & Business Media, 2010. [8] T. B. BROWN, B. MANN, N. RYDER, M. SUBBIAH, J. KAPLAN, P. DHARIWAL, A. NEELAKANTAN, P. SHYAM, G. SASTRY, A. ASKELL, ET AL., Language models are few-shot learners, arXiv preprint arXiv:2005.14165, (2020). [9] D. BURAGO, I. D. BURAGO, Y. BURAGO, S. IVANOV, AND S. A. IVANOV, A course in metric geometry, vol. 33, American Mathematical Soc., 2001. [10] J. CAO AND Z. LIN, Extreme learning machines on high dimensional and large data applications: a survey, Mathematical Problems in Engineering, (2015). [11] I. CARAGIANNIS, G. A. KRIMPAS, AND A. A. VOUDOURIS, Aggregating partial rankings with applications to peer grading in massive online open courses, in Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 2015, pp. 675–683. [12] M. P. D. CARMO, Riemannian geometry, Birkhäuser, 1992. [13] T. Q. CHEN, Y. RUBANOVA, J. BETTENCOURT, AND D. K. DUVENAUD, Neural ordinary differential equations, in Advances in Neural Information Processing Systems, 2018, pp. 6572–6583. [14] S.-W. CHENG, T. K. DEY, AND E. A. RAMOS, Manifold reconstruction from point samples., in SODA, vol. 5, 2005, pp. 1018–1027. [15] K. CHO, B. VAN MERRIENBOER, Ç. GULYEHRE, D. BAHDANAU, F. BOUGARES, H. SCHWENK, AND Y. BENGIO, Learning phrase representations using rnn encoder-decoder for statistical machine translation, in EMNLP, 2014. [16] F. R. CHUNG AND F. C. GRAHAM, Spectral graph theory, no. 92, American Mathematical Soc., 1997. [17] J. C. COX, J. E. INGERSOLL JR, AND S. A. ROSS, A theory of the term structure of interest rates, in Theory of valuation, World Scientific, 2005, pp. 129–164. [18] M. A. COX AND T. F. COX, Multidimensional scaling, in Handbook of data visualization, Springer, 2008, pp. 315–347. [19] G. CRAWFORD, The geometric mean procedure for estimating the scale of a judgement matrix, Mathematical Modelling, 9 (1987), pp. 327–334. [20] G. CYBENKO, Approximation by superpositions of a sigmoidal function, Mathematics of control, signals and systems, 2 (1989), pp. 303–314. [21] L. DENG, G. HINTON, AND B. KINGSBURY, New types of deep neural network learning for speech recognition and related applications: An overview, in 2013 IEEE international conference on acoustics, speech and signal processing, IEEE, 2013, pp. 8599–8603. [22] J. DETEMPLE AND Y. KITAPBAYEV, On American VIX options under the generalized 3/2 and 1/2 models, Mathematical Finance, 28 (2018), pp. 550–581. [23] J. DETEMPLE AND C. OSAKWE, The valuation of volatility options, Review of Finance, 4 (2000), pp. 21–50. [24] F. DEUTSCH, The angle between subspaces of a Hilbert space, in Approximation theory, wavelets and applications, Springer, 1995, pp. 107–130. [25] J. DEVLIN, M.-W. CHANG, K. LEE, AND K. TOUTANOVA, BERT: Pre-training of deep bidirectional transformers for language understanding, in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 2019, pp. 4171–4186. [26] M. DRAZIN, Pseudo-inverses in associative rings and semigroups, The American mathematical monthly, 65 (1958), pp. 506–514. [27] B. ERIKSSON, Learning to top-k search using pairwise comparisons, in Artificial Intelligence and Statistics, PMLR, 2013, pp. 265–273. [28] C. FARABET, C. COUPRIE, L. NAJMAN, AND Y. LECUN, Learning hierarchical features for scene labeling, IEEE transactions on pattern analysis and machine intelligence, 35 (2012), pp. 1915–1929. [29] C. FEFFERMAN, S. IVANOV, Y. KURYLEV, M. LASSAS, AND H. NARAYANAN, Fitting a putative manifold to noisy data, in Proceedings of COLT 2018 (Conference On Learning Theory), 2018. [30] -, Reconstruction and interpolation of manifolds. I: The geometric Whitney problem, Foundations of Computational Mathematics, (2019), pp. 1–99. [31] C. FEFFERMAN, S. IVANOV, M. LASSAS, AND H. NARAYANAN, Reconstruction of a Riemannian manifold from noisy intrinsic distances, SIAM Journal on Mathematics of Data Science, 2 (2020), pp. 770–808. [32] W. FELLER, Two singular diffusion problems, Annals of mathematics, (1951), pp. 173–182. [33] R. A. FISHER, The use of multiple measurements in taxonomic problems, Annals of eugenics, 7 (1936), pp. 179–188. [34] G. B. FOLLAND, Real analysis: modern techniques and their applications, vol. 40, John Wiley & Sons, 1999. [35] S. FREEMAN AND J. W. PARKS, How accurate is peer grading?, CBE-Life Sciences Education, 9 (2010), pp. 482–488. [36] F. A. GERS, J. SCHMIDHUBER, AND F. CUMMINS, Learning to forget: Continual prediction with LSTM, (1999). [37] J. GOARD AND M. MAZUR, Stochastic volatility models and the pricing of VIX options, Mathematical Finance: An International Journal of Mathematics, Statistics and Financial Economics, 23 (2013), pp. 439–458. [38] I. GOODFELLOW, Y. BENGIO, A. COURVILLE, AND Y. BENGIO, Deep learning, vol. 1, MIT press Cambridge, 2016. [39] A. GRAVES, A.-R. MOHAMED, AND G. HINTON, Speech recognition with deep recurrent neural networks, in 2013 IEEE international conference on acoustics, speech and signal processing, Ieee, 2013, pp. 6645–6649. [40] A. GRUNBICHLER AND F. A. LONGSTAFF, Valuing futures and options on volatility, Journal of Banking & Finance, 20 (1996), pp. 985–1001. [41] E. HABER AND L. RUTHOTTO, Stable architectures for deep neural networks, Inverse Problems, 34 (2017), p. 014004. [42] B. HANIN AND M. SELLKE, Approximating continuous functions by relu nets of minimal width, arXiv preprint arXiv:1710.11278, (2017). [43] K. HE, X. ZHANG, S. REN, AND J. SUN, Deep residual learning for image recognition, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778. [44] M. HERBSTER, M. PONTIL, AND L. WAINER, Online learning over graphs, in Proceedings of the 22nd international conference on Machine learning, 2005, pp. 305–312. [45] S. HOCHREITER AND J. SCHMIDHUBER, Long short-term memory, Neural computation, 9 (1997), pp. 1735–1780. [46] L. HOGBEN, Spectral graph theory and the inverse eigenvalue problem of a graph, The Electronic Journal of Linear Algebra, 14 (2005). [47] M. HOHENWARTER, Geogebra–ein softwaresystem für dynamische geometrie und algebra der ebene [geogebra–a software system for dynamic geometry and algebra in the plane] (unpublished master’s thesis), Universität Salzburg, Salzburg, Austria, (2002). [48] J. HOWARD AND S. RUDER, Universal language model fine-tuning for text classification, in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2018, pp. 328–339. [49] G.-B. HUANG, Q.-Y. ZHU, AND C.-K. SIEW, Extreme learning machine: a new learning scheme of feedforward neural networks, in 2004 IEEE international joint conference on neural networks (IEEE Cat. No. 04CH37541), vol. 2, Ieee, 2004, pp. 985–990. [50] X. HUO, X. S. NI, AND A. K. SMITH, A survey of manifold-based learning methods, Recent advances in data mining of enterprise data, (2007), pp. 691–745. [51] X. JIANG, L.-H. LIM, Y. YAO, AND Y. YE, Statistical ranking and combinatorial Hodge theory, Mathematical Programming, 127 (2011), pp. 203–244. [52] D. P. KINGMA AND J. BA, Adam: A method for stochastic optimization, in ICLR (Poster), 2015. [53] D. B. KOTLOW, A free boundary problem connected with the optimal stopping problem for diffusion processes, Transactions of the American Mathematical Society, 184 (1973), pp. 457–478. [54] A. KRIZHEVSKY, I. SUTSKEVER, AND G. E. HINTON, ImageNet classification with deep convolutional neural networks, in Advances in neural information processing systems, 2012, pp. 1097–1105. [55] Y. LECUN, Y. BENGIO, AND G. HINTON, Deep learning, nature, 521 (2015), pp. 436–444. [56] Y. LECUN, L. BOTTOU, Y. BENGIO, AND P. HAFFNER, Gradient-based learning applied to document recognition, Proceedings of the IEEE, 86 (1998), pp. 2278–2324. [57] D. LEVIN, The approximation power of moving least-squares, Mathematics of computation, 67 (1998), pp. 1517–1531. [58] H. LIN AND S. JEGELKA, Resnet with one-neuron hidden layers is a universal approximator, Advances in neural information processing systems, 31 (2018), pp. 6169–6178. [59] T. LIN AND H. ZHA, Riemannian manifold learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, 30 (2008), pp. 796–809. [60] T.-Y. LIN AND Y.-L. TSAI, An application of hodgerank to online peer assessment, in Proceedings of the 2017 SIAM Workshop on Machine Learning for Recommender Systems (MLRec), 2018. [61] Y. LIPMAN, D. COHEN-OR, D. LEVIN, AND H. TAL-EZER, Parameterization-free projection for geometry reconstruction, ACM Transactions on Graphics (TOG), 26 (2007), pp. 22–es. [62] H.-K. LIU, Properties of American volatility options in the mean-reverting 3/2 volatility model, SIAM Journal on Financial Mathematics, 6 (2015), pp. 53–65. [63] H.-K. LIU, T.-Y. LIN, AND Y.-L. TSAI, On the pricing formula for the perpetual American volatility option under the mean-reverting processes, Taiwanese Journal of Mathematics, (2020). [64] T.-Y. LIU, Learning to rank for information retrieval, (2011). [65] Y. LU, A. ZHONG, Q. LI, AND B. DONG, Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations, in International Conference on Machine Learning, PMLR, 2018, pp. 3276–3285. [66] Z. LU, H. PU, F. WANG, Z. HU, AND L. WANG, The expressive power of neural networks: A view from the width, Advances in neural information processing systems, 30 (2017), pp. 6231–6239. [67] A. L. MAAS, A. Y. HANNUN, AND A. Y. NG, Rectifier nonlinearities improve neural network acoustic models, in ICML Workshop on Deep Learning for Audio, Speech and Language Processing, 2013. [68] G. A. MILLER, The magical number seven, plus or minus two: Some limits on our capacity for processing information., Psychological review, 63 (1956), p. 81. [69] J. W. Milnor and j. D. Stasheff, Characteristic Classes, no. 76, Princeton University Press, 1974. [70] V. MNIH, K. KAVUKCUOGLU, D. SILVER, A. GRAVES, I. ANTONOGLOU, D. WIERSTRA, AND M. RIEDMILLER, Playing Atari with deep reinforcement learning, (2013). cite arxiv:1312.5602Comment: NIPS Deep Learning Workshop 2013. [71] Y. NESTEROV, A method for unconstrained convex minimization problem with the rate of convergence o (1/kˆ 2), in Doklady an ussr, vol. 269, 1983, pp. 543–547. [72] B. NEYSHABUR, S. BHOJANAPALLI, D. MCALLESTER, AND N. SREBRO, Exploring generalization in deep learning, in Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017, pp. 5949–5958. [73] P. NIYOGI, S. SMALE, AND S. WEINBERGER, Finding the homology of submanifolds with high confidence from random samples, Discrete & Computational Geometry, 39 (2008), pp. 419–441. [74] Y. PARMAR, S. NATARAJAN, AND G. SOBHA, Deeprange: deep-learning-based object detection and ranging in autonomous driving, IET Intelligent Transport Systems, 13 (2019), pp. 1256–1264. [75] M. PETERS, M. NEUMANN, M. IYYER, M. GARDNER, C. CLARK, K. LEE, AND L. ZETTLEMOYER, Deep contextualized word representations, in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2018, pp. 2227–2237. [76] G. PHILIPP, D. SONG, AND J. G. CARBONELL, Gradients explode-deep networks are shallow-resnet explained, (2018). [77] R. PIZIAK AND P. ODELL, Affine projections, Computers & Mathematics with Applications, 48 (2004), pp. 177–190. [78] R. PLESS AND R. SOUVENIR, A survey of manifold learning for images, IPSJ Transactions on Computer Vision and Applications, 1 (2009), pp. 83–94. [79] L. RUTHOTTO AND E. HABER, Deep neural networks motivated by partial differential equations, Journal of Mathematical Imaging and Vision, (2019), pp. 1–13. [80] R. W. SAATY, The analytic hierarchy process-what it is and how it is used, Mathematical modelling, 9 (1987), pp. 161–176. [81] M. SAERENS, F. FOUSS, L. YEN, AND P. DUPONT, The principal components analysis of a graph, and its relationships to spectral clustering, in European conference on machine learning, Springer, 2004, pp. 371–383. [82] F. SCARSELLI, M. GORI, A. C. TSOI, M. HAGENBUCHNER, AND G. MONFARDINI, The graph neural network model, IEEE Transactions on Neural Networks, 20 (2008), pp. 61–80. [83] R. K. SIZEMORE, HodgeRank: Applying combinatorial Hodge theory to sports ranking, PhD thesis, Wake Forest University, 2013. [84] B. SOBER AND D. LEVIN, Manifold approximation by moving least-squares projection (MMLS), Constructive Approximation, (2019), pp. 1–46. [85] G. STEWART, On the continuity of the generalized inverse, SIAM Journal on Applied Mathematics, 17 (1969), pp. 33–45. [86] Y. TAY, M. DEHGHANI, D. BAHRI, AND D. METZLER, Efficient transformers: A survey, arXiv preprint arXiv:2009.06732, (2020). [87] N. M. TRAN, Pairwise ranking: choice of method can produce arbitrarily different rank order, Linear Algebra and its Applications, 438 (2013), pp. 1012–1024. [88] G. VAN ROSSUM AND F. L. DRAKE JR, Python tutorial, vol. 620, Centrum voor Wiskunde en Informatica Amsterdam, 1995. [89] A. VASWANI, N. SHAZEER, N. PARMAR, J. USZKOREIT, L. JONES, A. N. GOMEZ, L. KAISER, AND I. POLOSUKHIN, Attention is all you need, Advances in neural information processing systems, 30 (2017), pp. 5998–6008. [90] A. VEIT, M. J. WILBER, AND S. BELONGIE, Residual networks behave like ensembles of relatively shallow networks, Advances in neural information processing systems, 29 (2016), pp. 550–558. [91] P. VELICKOVIC, G. CUCURULL, A. CASANOVA, A. ROMERO, P. LIO, AND Y. BENGIO, Graph attention networks, in International Conference on Learning Representations, 2018. [92] V. P. VISHWAKARMA AND M. GUPTA, A new learning algorithm for single hidden layer feedforward neural networks, International Journal of Computer Applications, 28 (2011), pp. 26–33. [93] T. WALSH, The PeerRank method for peer assessment, in Proceedings of the Twenty-first European Conference on Artificial Intelligence, 2014, pp. 909–914. [94] T.-W. WENG, H. ZHANG, P.-Y. CHEN, J. YI, D. SU, Y. GAO, C.-J. HSIEH, AND L. DANIEL, valuating the robustness of neural networks: An extreme value theory approach, in International Conference on Learning Representations, 2018. [95] D. B. WEST, Introduction to graph theory, vol. 2, Prentice hall Upper Saddle River, 2001. [96] H. WHITNEY, Differentiable manifolds, Annals of Mathematics, (1936), pp. 645–680. [97] S. WOLD, K. ESBENSEN, AND P. GELADI, Principal component analysis, Chemometrics and intelligent laboratory systems, 2 (1987), pp. 37–52. [98] Z. WU, S. PAN, F. CHEN, G. LONG, C. ZHANG, AND S. Y. PHILIP, A comprehensive survey on graph neural networks, IEEE Transactions on Neural Networks and Learning Systems, (2020). [99] S. XIE, R. GIRSHICK, P. DOLLAR, Z. TU, AND K. HE, Aggregated residual transformations for deep neural networks, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 1492–1500. [100] K. XU, W. HU, J. LESKOVEC, AND S. JEGELKA, How powerful are graph neural networks?, in International Conference on Learning Representations, 2018.	zh_TW
dc.identifier.doi (DOI)	10.6814/NCCU202100296	en_US

學術產出-Theses

Article View/Open

Publication Export

Google ScholarTM

政大圖書館

Citation Infomation

Google Scholar^TM