Construction of Corporate Credit Rating Prediction Model Based on Natural Language Analysis (基於自然語言分析建構預測企業信用評等變動之模型)


Title	基於自然語言分析建構預測企業信用評等變動之模型 (Construction of Corporate Credit Rating Prediction Model Based on Natural Language Analysis)
Author	陳明勝 (Chen, Ming-Sheng)
Contributors	江彌修 (Chiang, Mi-Hsiu); 趙世偉 (Chao, Shih-Wei); 陳明勝 (Chen, Ming-Sheng)
Keywords	自然語言分析 (Natural Language Analysis); 神經網路 (Neural Network); 領域遷移 (Domain Adaption); 企業信用預警 (Corporate Credit Prediction)
Date	2022
Upload time	1-Aug-2022 17:30:32 (UTC+8)
Abstract	Earlier language-analysis models could not recognize polysemy and suffered from a mismatch between the training domain and the prediction domain. To address both problems, this study applies domain adaptation to the BERT (Bidirectional Encoder Representations from Transformers) model using financial-domain text and compares model performance with and without adaptation. The adapted model is then used to analyze the news about US companies contained in the RavenPack database, and the resulting sentiment is used to construct an early-warning model for credit rating changes. The empirical results show that the adapted model's accuracy in predicting the sentiment of financial text is 30.47% higher than that of the non-adapted model, and that the news sentiment identified after domain adaptation improves the prediction of future corporate credit rating changes. In addition, four random forest models are constructed to show that media sentiment about a company's financial affairs contains useful information about possible future changes in the company's rating.
Description	Master's thesis; National Chengchi University (國立政治大學); Department of Money and Banking (金融學系); 109352029
Source	http://thesis.lib.nccu.edu.tw/record/#G0109352029
Type	thesis
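
The abstract describes using news sentiment as an input to a rating-change early-warning model, but this record does not say how article-level sentiment becomes a firm-level feature. The sketch below shows one possible aggregation, assuming each article has already been labeled positive, neutral, or negative by a sentiment classifier. The function name `net_sentiment`, the label-to-score mapping, and the sample firm identifiers are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict

# Hypothetical label-to-score mapping; the thesis record does not specify one.
LABEL_SCORE = {"positive": 1, "neutral": 0, "negative": -1}


def net_sentiment(articles):
    """Average the label scores of each firm's articles.

    `articles` is an iterable of (firm_id, label) pairs. Returns a dict
    mapping firm_id to a mean score in the range [-1, 1].
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for firm_id, label in articles:
        totals[firm_id] += LABEL_SCORE[label]
        counts[firm_id] += 1
    return {firm: totals[firm] / counts[firm] for firm in totals}


# Illustrative, made-up input: (firm identifier, predicted sentiment label).
sample = [
    ("FIRM_A", "negative"),
    ("FIRM_A", "negative"),
    ("FIRM_A", "neutral"),
    ("FIRM_B", "positive"),
]
scores = net_sentiment(sample)
print(scores)
```

A score like this could sit alongside financial variables as one input column of a random forest classifier. Whether the thesis aggregates sentiment this way is not stated in this record.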

dc.contributor.advisor	江彌修; 趙世偉	zh_TW
dc.contributor.advisor	Chiang, Mi-Hsiu; Chao, Shih-Wei	en_US
dc.contributor.author (Authors)	陳明勝	zh_TW
dc.contributor.author (Authors)	Chen, Ming-Sheng	en_US
dc.creator (Author)	陳明勝	zh_TW
dc.creator (Author)	Chen, Ming-Sheng	en_US
dc.date (Date)	2022	en_US
dc.date.accessioned	1-Aug-2022 17:30:32 (UTC+8)	-
dc.date.available	1-Aug-2022 17:30:32 (UTC+8)	-
dc.date.issued (Upload time)	1-Aug-2022 17:30:32 (UTC+8)	-
dc.identifier (Other Identifiers)	G0109352029	en_US
dc.identifier.uri (URI)	http://nccur.lib.nccu.edu.tw/handle/140.119/141068	-
dc.description (Description)	碩士 (Master's)	zh_TW
dc.description (Description)	國立政治大學 (National Chengchi University)	zh_TW
dc.description (Description)	金融學系 (Department of Money and Banking)	zh_TW
dc.description (Description)	109352029	zh_TW
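dc.description.tableofcontents	Chapter 1 Introduction: 1.1 Research Motivation and Background; 1.2 Research Objectives. Chapter 2 Literature Review: 2.1 Measuring Corporate Credit Risk; 2.2 Text Analysis Models. Chapter 3 Methodology: 3.1 BERT Model; 3.2 Random Forest; 3.3 Model Performance Metrics. Chapter 4 Empirical Analysis: 4.1 Data Processing; 4.2 Feature Generation; 4.3 Constructing the Credit Rating Early-Warning Model; 4.4 Early-Warning Performance of Each Model. Chapter 5 Conclusions and Suggestions. References.	zh_TW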
dc.format.extent	2437979 bytes	-
dc.format.mimetype	application/pdf	-
dc.source.uri (Source)	http://thesis.lib.nccu.edu.tw/record/#G0109352029	en_US
dc.subject (Keywords)	自然語言分析 (Natural Language Analysis)	zh_TW
dc.subject (Keywords)	神經網路 (Neural Network)	zh_TW
dc.subject (Keywords)	領域遷移 (Domain Adaption)	zh_TW
dc.subject (Keywords)	企業信用預警 (Corporate Credit Prediction)	zh_TW
dc.subject (Keywords)	Natural Language Analysis	en_US
dc.subject (Keywords)	Neural Network	en_US
dc.subject (Keywords)	Domain Adaption	en_US
dc.subject (Keywords)	Corporate Credit Prediction	en_US
dc.title (Title)	基於自然語言分析建構預測企業信用評等變動之模型	zh_TW
dc.title (Title)	Construction of Corporate Credit Rating Prediction Model Based on Natural Language Analysis	en_US
dc.type (Type)	thesis	en_US
dc.identifier.doi (DOI)	10.6814/NCCU202200901	en_US
