Title: Construction of Corporate Credit Rating Prediction Model Based on Natural Language Analysis (基於自然語言分析建構預測企業信用評等變動之模型)
Author: Chen, Ming-Sheng (陳明勝)
Advisors: Chiang, Mi-Hsiu (江彌修); Chao, Shih-Wei (趙世偉)
Keywords: Natural Language Analysis; Neural Network; Domain Adaption; Corporate Credit Prediction
Date: 2022
Upload time: 1-Aug-2022 17:30:32 (UTC+8)
Abstract:
Earlier language analysis models cannot resolve polysemy and suffer from a mismatch between the training domain and the prediction domain. To address both problems, this study applies domain adaptation to the BERT (Bidirectional Encoder Representations from Transformers) model using a financial corpus and compares model performance with and without adaptation. The adapted model is then used to analyze news about US companies in the RavenPack database, and the resulting sentiment is used to construct an early warning model for credit rating changes. Empirically, the adapted model's accuracy in predicting the sentiment of financial texts is 30.47% higher than that of the non-adapted model, and the news sentiment identified after adaptation improves the prediction of future corporate credit rating changes. In addition, four random forest models are constructed to demonstrate that media sentiment about a company's financial condition contains useful information about possible changes in its future rating.
Description: Master's thesis
National Chengchi University (國立政治大學)
Department of Money and Banking (金融學系)
109352029
Source: http://thesis.lib.nccu.edu.tw/record/#G0109352029
Type: thesis
Other identifier: G0109352029
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/141068
DOI: 10.6814/NCCU202200901
Format: application/pdf, 2437979 bytes
Table of contents:
Chapter 1 Introduction
1.1 Research Motivation and Background
1.2 Research Objectives
Chapter 2 Literature Review
2.1 Measuring Corporate Credit Risk
2.2 Text Analysis Models
Chapter 3 Methodology
3.1 BERT Model
3.2 Random Forest
3.3 Model Performance Metrics
Chapter 4 Empirical Analysis
4.1 Data Processing
4.2 Feature Generation
4.3 Constructing the Credit Rating Early Warning Model
4.4 Early Warning Performance of Each Model
Chapter 5 Conclusions and Recommendations
References
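Illustrative sketch: the abstract describes scoring company news with a sentiment model and feeding that sentiment into a credit rating early warning model. The record does not say how article-level sentiment becomes model features, so the aggregation below is only one plausible reading, not the thesis's method. The firm identifiers, dates, probabilities, the `monthly_sentiment` helper, and the choice of a firm-month mean of P(positive) minus P(negative) are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-article classifier output (not data from the thesis):
# (firm identifier, ISO publication date, P(positive), P(negative)).
articles = [
    ("AAA", "2021-03-02", 0.70, 0.10),
    ("AAA", "2021-03-15", 0.20, 0.60),
    ("AAA", "2021-04-01", 0.10, 0.80),
    ("BBB", "2021-03-09", 0.55, 0.25),
]


def monthly_sentiment(rows):
    """Average net sentiment, P(positive) - P(negative), per firm and month."""
    buckets = defaultdict(list)
    for firm, date, p_pos, p_neg in rows:
        month = date[:7]  # keep the "YYYY-MM" prefix of the ISO date
        buckets[(firm, month)].append(p_pos - p_neg)
    # One feature row per (firm, month): mean score and article count.
    return {
        key: {"mean_score": mean(scores), "n_articles": len(scores)}
        for key, scores in buckets.items()
    }


features = monthly_sentiment(articles)
for key in sorted(features):
    print(key, features[key])
```

Rows of this kind could then be joined with rating-change labels and passed to a classifier such as a random forest. That joining step is omitted here because the record gives no detail on it.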