Title 華語流行音樂用詞風格探勘系統
Lexical Style Mining System for Chinese Popular Music
Author 陳沛穎 (Chen, Pei-Ying)
Contributors 沈錳坤 (Shan, Man-Kwan)
陳沛穎 (Chen, Pei-Ying)
Keywords Chinese Popular Lyrics (華語流行歌詞)
Lexical Style Mining (用詞探勘)
Co-occurrence Relationship (共現關係)
Date 2022
Uploaded 5-Oct-2022 09:13:43 (UTC+8)
Abstract Popular music conveys emotions, stories and experiences, and ideas and attitudes through its lyrics, and it plays an important role in everyday life. Lyrics move listeners, resonate with the general public, and are closely tied to social culture. Many people use lyrics to express their ideas, and the word choices in lyrics reflect the social culture and thinking of their time, which makes popular music an important asset of popular culture.
This thesis compiles a corpus of Chinese popular lyrics and develops a lexical style mining system for them. The system provides theme classification, rhyme judgment, sentiment analysis, metaphor analysis, and word co-occurrence mining. The focus of the thesis is the integration of word embedding techniques to discover how co-occurrence relationships between words evolve over time. The system helps users explore and analyze the lexical style of Chinese popular lyrics in different eras.
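The co-occurrence mining mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation (its table of contents indicates GloVe and Word2Vec were used); it only shows the raw per-era pair counting that embedding models such as GloVe start from. The function names, the decade-sized era, the window size, and the toy lyrics are all assumptions made for illustration, and the input is assumed to be already word-segmented.

```python
from collections import Counter, defaultdict


def era_of(year, span=10):
    """Map a release year to the first year of its era (a decade by default)."""
    return year - year % span


def cooccurrence_by_era(songs, window=2, span=10):
    """Count unordered word pairs within `window` tokens of each other, per era.

    `songs` is an iterable of (year, tokens) pairs; tokens are assumed to be
    already segmented words.
    """
    counts = defaultdict(Counter)
    for year, tokens in songs:
        era = era_of(year, span)
        for i, left in enumerate(tokens):
            # Pair the current word with the next `window` words.
            for right in tokens[i + 1 : i + 1 + window]:
                if left != right:
                    counts[era][tuple(sorted((left, right)))] += 1
    return counts


# Hypothetical toy lyrics, not taken from the thesis corpus.
songs = [
    (1995, ["love", "rain", "night"]),
    (1998, ["love", "rain", "tears"]),
    (2015, ["love", "city", "night"]),
]
counts = cooccurrence_by_era(songs)
```

Comparing `counts[1990]` with `counts[2010]` for the same word pair gives a crude view of how a co-occurrence relationship changes between eras; an embedding-based approach would replace the raw counts with vector similarities trained per era.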
Description Master's
National Chengchi University
Department of Computer Science
108753114
Source http://thesis.lib.nccu.edu.tw/record/#G0108753114
Type thesis
dc.contributor.advisor 沈錳坤 (Shan, Man-Kwan)
dc.identifier (Other Identifiers) G0108753114
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/142119
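dc.description.tableofcontents Acknowledgements
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
Chapter 2 Related Work
2.1 Characteristics of Popular Music Lyrics by Period
2.2 Word Usage in Popular Music Lyrics
2.3 Semantic Change
Chapter 3 Methodology
3.1 Research Framework
3.2 Data Collection
3.3 Data Cleaning
3.4 Theme, Rhyme, Sentiment, and Metaphor Judgment
3.5 Co-occurrence Mining
3.5.1 GloVe
3.5.2 Word2Vec
3.6 Query Functions
3.7 Automatic Lyric Writing
Chapter 4 Implementation and Experiments
4.1 System Implementation
4.2 Data Sources
4.3 Data Cleaning
4.4 Theme, Rhyme, Sentiment, and Metaphor Judgment
4.5 Co-occurrence Mining
4.5.1 GloVe
4.5.2 Comparison of GloVe and Word2Vec
4.6 Query Functions
Chapter 5 Conclusion and Future Work
References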
dc.format.extent 4033160 bytes
dc.format.mimetype application/pdf
dc.identifier.doi (DOI) 10.6814/NCCU202201648