Title: 應用大數據於信用評等之模型探討
The Application of Big Data on Credit Scoring Model
Author: 林瑀甯
Advisors: 鄭宇庭 (Cheng, Yu Ting); 郭訓志 (Kuo, Hsun Chih)
Keywords: Credit risk
Logistic regression
Credit scoring model
Date: 2018
Upload time: 1-Jun-2018 17:33:42 (UTC+8)
Abstract: Credit risk, or credit default risk, refers to the probability that a financial institution is not repaid after providing services to a customer. It is studied intensively in the field of bank lending decisions because it plays an especially important role for financial institutions and, for commercial banks in particular, is often hard to interpret or control. Thanks to advances in technology, financial institutions can now develop robust, refined systems and models for predicting and managing credit risk at lower cost than in the past. Since the start of the big data era, customer scoring has left assessable traces even for students. This study starts from the basic dimensions that earlier work tested or relied on, such as customers' personal information, financial status, and past and present borrowing records with the lending company, and supplements them with information from open data to propose additional variables that may affect the credit default rate, such as records of the customer under external regulations. The aim is to verify whether such a new dimension is worth developing and, if so, how strong its effect is.
Many methods have been proposed for predicting the credit default rate, and the choice among them depends on the institution's complexity and size and on the type of credit. Discriminant analysis is the most common, but it imposes strict assumptions on the variables. Neural networks, a newer method, overcome this limitation and predict well, but they return only predictions while the computation remains opaque, which does not help in understanding the relationships among variables. This study therefore uses logistic regression, which predicts a binary outcome and also shows the relationship between the dependent and independent variables through the model coefficients. Following earlier work on logistic regression for credit risk, the study gives a complete account of how the data are cleaned, how variables are transformed, how the model is built, and how the final model is selected. Although the procedure is long, the results support the research hypothesis: the new dimension does affect credit risk, and the model coefficients show the size of its effect. To make the information that logistic regression provides about the variables more apparent, the results are finally presented as visualizations, which are easier to read than text.
Description: Master's degree
National Chengchi University (國立政治大學)
Department of Statistics
105354001
Source: http://thesis.lib.nccu.edu.tw/record/#G0105354001
Type: thesis
Identifier: G0105354001
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/117439
Format: application/pdf, 1380458 bytes
Table of contents:
Contents
List of Tables
List of Figures
Chapter 1 Introduction
  Section 1 Research Background and Motivation
  Section 2 Research Objectives
  Section 3 Research Process
Chapter 2 Literature Review
  Section 1 Review of the Credit Rating Literature
  Section 2 Overview of Common Methods for Building Credit Scoring Models
Chapter 3 Research Methods
  Section 1 Data Sources
  Section 2 Research Framework
  Section 3 Operational Definitions of Variables
  Section 4 Analysis Methods
Chapter 4 Empirical Analysis
  Section 1 Exploratory Analysis
  Section 2 Test of Sample Representativeness (Chi-Square Goodness-of-Fit Test)
  Section 3 Fine Classing & Coarse Classing
  Section 4 Logistic Regression Analysis
Chapter 5 Conclusions and Recommendations
  Section 1 Conclusions
  Section 2 Recommendations
References
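Illustration: the abstract explains that logistic regression was chosen because it predicts a binary default outcome while its coefficients show how each variable relates to that outcome. The sketch below is not the thesis's model or data. It fits a small logistic regression by plain gradient ascent on synthetic records, and the feature names debt_ratio and has_external_record are hypothetical stand-ins for a conventional variable and an open-data variable of the kind the abstract describes.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, features):
    """P(default = 1 | features); weights[0] is the intercept."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return sigmoid(z)


def fit_logistic(X, y, learning_rate=0.5, epochs=5000):
    """Fit logistic regression by batch gradient ascent on the mean log-likelihood."""
    n, p = len(X), len(X[0])
    weights = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for features, label in zip(X, y):
            error = label - predict_proba(weights, features)
            grad[0] += error
            for j, x in enumerate(features):
                grad[j + 1] += error * x
        weights = [w + learning_rate * g / n for w, g in zip(weights, grad)]
    return weights


# Synthetic, hypothetical data (not from the thesis):
# each row is [debt_ratio, has_external_record]; label 1 = defaulted, 0 = repaid.
X = [[r / 10, flag] for flag in (0, 1) for r in range(1, 9)]
y = [0, 0, 0, 0, 1, 0, 1, 1,   # rows with has_external_record = 0
     0, 1, 0, 1, 1, 1, 1, 1]   # rows with has_external_record = 1

weights = fit_logistic(X, y)
intercept, b_debt, b_record = weights

print("coefficients:", [round(w, 3) for w in weights])
# exp(coefficient) is the multiplicative change in the odds of default
# when the flag goes from 0 to 1, holding debt_ratio fixed.
print("odds ratio for has_external_record:", round(math.exp(b_record), 3))
print("P(default | debt_ratio=0.5, no record):", round(predict_proba(weights, [0.5, 0]), 3))
print("P(default | debt_ratio=0.5, record):", round(predict_proba(weights, [0.5, 1]), 3))
```

Reading the fitted coefficients this way mirrors the interpretability argument in the abstract: the sign of a coefficient gives the direction of a variable's effect on default, and its exponential gives the size of that effect as an odds ratio. In practice a statistics package would normally replace the hand-written fitting loop.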