Title: 主題分析方法在經濟文獻學上的應用:隱含狄利克雷分配與代理人基計算經濟學
Topic Analysis in the Automatic Organization of Economic Literature: The Case of Agent-Based Computational Economics with the Use of Latent Dirichlet Allocation
Author: 胡瑞軒 (Hu, Ruei-Xsuan)
Contributors: 陳樹衡 (Chen, Shu-Heng); 胡瑞軒 (Hu, Ruei-Xsuan)
Keywords: Agent-Based Modeling; Unsupervised Learning; TF-IDF; Wordcloud; NLP; Topic coherence; Topic similarity
Date: 2022
Upload time: 2-Sep-2022 15:26:59 (UTC+8)
Abstract:
This thesis uses Latent Dirichlet Allocation (LDA), a topic model, to classify Agent-Based Modeling (ABM) papers drawn from multiple journals. TF-IDF (Term Frequency-Inverse Document Frequency) analysis is then used to recover words that are related to a topic but were filtered out initially, and word clouds are used to identify words shared across topics. We also classify the journals to which each topic belongs and analyze how the topics change over time. Finally, the topic similarity, topic ranking, and topic coherence analyses show that the topics overlap little and that the topic explanation ratio and coherence are both high. Unlike previous studies, this work analyzes multiple journals and evaluates the classification afterwards. These evaluations indicate that the LDA model can classify documents effectively and in a quantifiable way, and that it reduces time cost and data complexity compared with manual classification.
Description: Master's thesis
National Chengchi University (國立政治大學)
Department of Economics
109258032
Source: http://thesis.lib.nccu.edu.tw/record/#G0109258032
Type: thesis
dc.contributor.advisor: 陳樹衡 (Chen, Shu-Heng)
dc.identifier (Other Identifiers): G0109258032
dc.identifier.uri (URI): http://nccur.lib.nccu.edu.tw/handle/140.119/141743
dc.description.tableofcontents:
Abstract (Chinese)
Abstract
1 Introduction
2 Research Process
3 Theoretical Framework
  3.1 Topic Modeling
  3.2 Basic Concepts of Latent Dirichlet Allocation
  3.3 Classification Methods and Underlying Theory
    3.3.1 TF-IDF Algorithm
    3.3.2 Word Cloud
    3.3.3 Cosine Distance
    3.3.4 Topic Coherence
4 Data and Data Analysis Methods
  4.1 Data
  4.2 Data Analysis Methods
5 Results
  5.1 Interpreting the Topics
  5.2 Analyzing Topic Categories
  5.3 Topic-Related Terms
  5.4 Differences from Conventional Document Classification
  5.5 TF-IDF Algorithm
  5.6 Topic Changes over Time
  5.7 Identifying Topics with Word Clouds
  5.8 Topic Similarity
  5.9 Topic Ranking
  5.10 Topic Coherence
6 Conclusions and Suggestions
  6.1 Conclusions
  6.2 Suggestions
7 References
dc.format.extent: 20186703 bytes
dc.format.mimetype: application/pdf
dc.identifier.doi (DOI): 10.6814/NCCU202201265