Academic Output: Theses
Title: 利用領域適應建構自然語言情緒分類模型:以台灣財經新聞為例
Title (English): A Semantic Sentiment Classification Model Based on Domain Adaptation: The Case of TWSE News
Author: Chang, Chun (張群)
Advisor: Chiang, Mi-Hsiu (江彌修)
Keywords: Natural Language Processing; Pre-training; Downstream Task; Sentiment Classification; Domain Adaptation; Topic Grouping
Date: 2021
Uploaded: 1-Jul-2021 18:01:18 (UTC+8)
Abstract (translated from Chinese): Using Taiwanese financial news, this study applies domain adaptation and topic grouping to address the corpus heterogeneity that BERT faces between its unsupervised pre-training corpus and the domain corpus, as well as among the downstream training corpora, and builds a sentiment classification model under each approach. For domain adaptation, the conclusions drawn from the loss function, the confusion matrix, and the ROC curve with its AUC value all agree with Araci (2019): no significant difference is found. For topic grouping, however, significant differences in sentiment classification performance appear on all of these metrics; moreover, the topic-grouped models show improved understanding of negative news, mitigating the disadvantage caused by the inherent scarcity of negative-news samples.
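The abstract compares models by their loss, confusion matrix, and ROC/AUC. As an illustrative aside (not the thesis's actual code), the two classification metrics can be computed for binary sentiment labels (1 = positive news, 0 = negative news) without any dependencies; the AUC uses the Mann-Whitney rank formulation, i.e. the probability that a random positive example outranks a random negative one:

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) counts for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def auc(y_true, scores):
    """AUC as P(random positive score > random negative score);
    ties count as 0.5 (Mann-Whitney U formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: model scores are the predicted probability of positive news.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
print(confusion_matrix(y_true, y_pred))  # (2, 1, 1, 2)
print(auc(y_true, scores))               # 0.888...
```

The labels, scores, and threshold here are hypothetical; the thesis derives these quantities from fine-tuned BERT classifier outputs on TWSE news.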
Abstract (English): In this thesis, we apply domain adaptation and topic grouping to deal with the heterogeneity between the pre-training corpus and the domain corpus, as well as among the downstream corpora, for a TWSE news sentiment classification model. The empirical results show that the domain adaptation model yields no significant effect, which agrees with Araci (2019). However, the topic grouping model achieves better performance than the other models and improves the understanding of the negative corpus.

References
[1] Araci, D. (2019). "FinBERT: Financial Sentiment Analysis with Pre-trained Language Models." arXiv preprint arXiv:1908.10063.
[2] Bengio, Y., Ducharme, R., Vincent, P., and Jauvin, C. (2003). "A Neural Probabilistic Language Model." Journal of Machine Learning Research 3, 1137–1155.
[3] Devlin, J., Chang, M. W., Lee, K., and Toutanova, K. (2018). "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding." arXiv preprint arXiv:1810.04805.
[4] Santos, C. D., and Gatti, M. (2014). "Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts." Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics, 69–78.
[5] Howard, J., and Ruder, S. (2018). "Universal Language Model Fine-tuning for Text Classification." arXiv preprint arXiv:1801.06146.
[6] Jotheeswaran, J., and Koteeswaran, S. (2015). "Decision Tree Based Feature Selection and Multilayer Perceptron for Sentiment Analysis." ARPN Journal of Engineering and Applied Sciences 10(14), 5883–5894.
[7] Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013). "Efficient Estimation of Word Representations in Vector Space." arXiv preprint arXiv:1301.3781.
[8] Pranckevičius, T., and Marcinkevičius, V. (2017). "Comparison of Naive Bayes, Random Forest, Decision Tree, Support Vector Machines, and Logistic Regression Classifiers for Text Reviews Classification." Baltic Journal of Modern Computing 5(2), 221.
[9] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. (2017). "Attention Is All You Need." arXiv preprint arXiv:1706.03762.
[10] Wang, X., Liu, Y., Sun, C., Wang, B., and Wang, X. (2015). "Predicting Polarities of Tweets by Composing Word Embeddings with Long Short-Term Memory." Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Beijing, China, 1343–1353.
[11] Yang, Y., Cer, D., Ahmad, A., Guo, M., Law, J., Constant, N., Abrego, A. G., Yuan, S., Tar, C., Sung, Y., Strope, B., and Kurzweil, R. (2019). "Multilingual Universal Sentence Encoder for Semantic Retrieval." arXiv preprint arXiv:1907.04307.
[12] Wang, Y., Wang, M., and Fujita, H. (2020). "Word Sense Disambiguation: A Comprehensive Knowledge Exploitation Framework." Knowledge-Based Systems 190, 105030.
[13] Zhang, D. W., Xu, H., Su, Z., and Xu, Y. (2015). "Chinese Comments Sentiment Classification Based on word2vec and SVMperf." Expert Systems with Applications 42(4), 1857–1863.

Description: Master's thesis
National Chengchi University
Department of Money and Banking (金融學系)
Student ID: 108352022
Source: http://thesis.lib.nccu.edu.tw/record/#G0108352022
Type: thesis
Other Identifier: G0108352022
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/135940
DOI: 10.6814/NCCU202100606

Table of Contents
Chapter 1 Introduction 9
Chapter 2 Literature Review 13
  Section 1 Sentiment Analysis in Finance 13
  Section 2 Origins of the Two-Stage Model 14
  Section 3 Domain Adaptation and Topic Grouping 22
Chapter 3 Methodology 23
  Section 1 Experimental Procedure 23
  Section 2 Data Collection 25
  Section 3 Data Preprocessing and Topic Classification 28
  Section 4 Model Experiments 30
Chapter 4 Empirical Results 35
  Section 1 Experimental Setup 35
  Section 2 Experimental Results 36
Chapter 5 Conclusions and Suggestions for Future Research 47
  Section 1 Conclusions 47
  Section 2 Suggestions for Future Research 48
References 49