Title Song Mood Classification Based on Textual Analysis Method
Original title 基於文本分析方法探討流行歌曲情緒辨識之研究
Author 駱昱岑 (Luo, Yu-Tsen)
Contributors 翁久幸
駱昱岑 (Luo, Yu-Tsen)
Keywords Popular songs
Text analysis
Mood recognition
Date 2019
Upload time 1-Jul-2019 10:43:32 (UTC+8)
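Abstract (translated from the Chinese original) With the rise of music streaming services, providers now offer many services beyond playback, such as song recommendation and curated playlists for various themes. However, thousands of new musical works are released every day, and they cannot be labeled and organized one by one by hand alone. Having machines sort songs into categories is therefore an important problem. The goal of this study is to use machine learning methods to classify songs by mood.
We start from the lyrics and first compare the TF-IDF-related lyric feature extraction methods proposed by Laurier et al. [8] and Zaanen et al. [1]. Second, because popular songs often contain many repeated words, which may unduly influence TF-IDF, we propose a TF-IDF variant that removes repeated words when extracting lyric features. In addition, we combine lyric and audio features to improve classification accuracy. Two datasets are used: the KKBOX-Song-Mood-Dataset of 593 Chinese songs and the NJU-Music-Mood V1.0-Dataset of 777 English songs. The experimental results show that TF-IDF with repeated words removed yields significantly higher classification accuracy than the previous methods.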
Abstract (author's English version) Music streaming services are on the rise. Beyond playing music for users, they offer many additional services, such as song recommendation and playlists organized by topic. However, thousands of songs are released every day, and we cannot rely on people to check every single one. It is therefore important to let machines do this routine work for us. The goal of this study is to use machine learning methods to classify songs by mood.
This study uses three lyric feature extraction methods related to TF-IDF. The first two were proposed by Laurier et al. [8] and Zaanen et al. [1]. No previous study has compared the advantages and disadvantages of these two methods, so this study first compares them. We also found that, in popular songs, repeated words influence the method of Zaanen et al. [1], and we therefore propose a new TF-IDF-related lyric feature extraction method. The experiments mainly use the KKBOX-Song-Mood-Dataset and the NJU-Music-Mood V1.0-Dataset. The results show that the classification accuracy obtained by the proposed method is significantly higher than that of the previous two methods.
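The TF-IDF document definitions this record refers to (one song per document, all lyrics of one mood label per document, and a variant that removes repeated words) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the toy lyrics, the plain tf × log(N/df) weighting, and the choice to remove repeats by keeping each distinct word once per song are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


def song_documents(songs):
    """Each song's lyrics form one document."""
    return [tokens for tokens, _mood in songs]


def mood_documents(songs, dedupe=False):
    """All lyrics sharing a mood label form one document.

    With dedupe=True (an assumed reading of "removing repeated words"),
    each song contributes every distinct word only once.
    """
    by_mood = {}
    for tokens, mood in songs:
        words = sorted(set(tokens)) if dedupe else tokens
        by_mood.setdefault(mood, []).extend(words)
    moods = sorted(by_mood)
    return moods, [by_mood[m] for m in moods]


# Made-up toy lyrics, not taken from either dataset.
songs = [
    ("la la la la sunny day".split(), "happy"),
    ("dance all night sunny".split(), "happy"),
    ("tears fall like rain".split(), "sad"),
]

per_song = tf_idf(song_documents(songs))

moods, raw_docs = mood_documents(songs)
_, dedup_docs = mood_documents(songs, dedupe=True)
raw = dict(zip(moods, tf_idf(raw_docs)))
dedup = dict(zip(moods, tf_idf(dedup_docs)))

# Compare how a chorus filler word fares against a word shared by two songs.
print(raw["happy"]["la"], raw["happy"]["sunny"])
print(dedup["happy"]["la"], dedup["happy"]["sunny"])
```

In this toy corpus the repeated filler word "la" outweighs "sunny" in the raw mood-level document, while after per-song deduplication "sunny", which occurs in two happy songs, ranks higher. This is the kind of effect the repeated-word variant is meant to address.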
References [1] Menno van Zaanen and Pieter Kanters. Automatic Mood Classification Using tf*idf Based on Lyrics. In J. Stephen Downie and Remco C. Veltkamp, editors, 11th International Society for Music Information Retrieval Conference, August 2010.
[2] Hao Xue, Like Xue, and Feng Su. Multimodal Music Mood Classification by Fusion of Audio and Lyrics. In Proceedings of MMM 2015, LNCS 8936, pages 26–37.
[3] Jen-Yu Liu and Yi-Hsuan Yang. Event Localization in Music Auto-tagging. 2016. http://mac.citi.sinica.edu.tw/~yang/pub/liu16mm.pdf
[4] Wei-Yun Ma and Keh-Jiann Chen. A Bottom-up Merging Algorithm for Chinese Unknown Word Extraction. In Proceedings of the Second SIGHAN Workshop on Chinese Language Processing, volume 17, pages 31–38. Association for Computational Linguistics, 2003.
[5] Wei-Yun Ma and Keh-Jiann Chen. Introduction to CKIP Chinese Word Segmentation System for the First International Chinese Word Segmentation Bakeoff. In Proceedings of the Second SIGHAN Workshop on Chinese Language Processing, volume 17, pages 168–171. Association for Computational Linguistics, 2003.
[6] Brian McFee, Colin Raffel, Dawen Liang, Daniel P. W. Ellis, Matt McVicar, Eric Battenberg, and Oriol Nieto. librosa: Audio and Music Signal Analysis in Python. In Proceedings of the 14th Python in Science Conference, pages 18–25, 2015.
[7] Martin F. McKinney and Jeroen Breebaart. Features for Audio and Music Classification. In Proceedings of the International Conference on Music Information Retrieval, 2003.
[8] C. Laurier, J. Grivolla, and P. Herrera. Multimodal Music Mood Classification Using Audio and Lyrics. In Proceedings of the International Conference on Machine Learning and Applications, 2008.
[9] Y.-H. Yang, Y.-C. Lin, H.-T. Cheng, I.-B. Liao, Y.-C. Ho, and H. H. Chen. Toward Multi-modal Music Emotion Classification. In Proceedings of the Pacific-Rim Conference on Multimedia, pages 70–79. Springer, 2008.
[10] Xing Wang, Xiaoou Chen, Deshun Yang, and Yuqian Wu. Music Emotion Classification of Chinese Songs Based on Lyrics Using TF*IDF and Rhyme.
Description Master's degree
National Chengchi University
Department of Statistics
106354007
Source http://thesis.lib.nccu.edu.tw/record/#G0106354007
Type thesis
dc.contributor.advisor 翁久幸 (zh_TW)
dc.contributor.author (Authors) 駱昱岑 (zh_TW)
dc.contributor.author (Authors) Luo, Yu-Tsen (en_US)
dc.creator (Author) 駱昱岑 (zh_TW)
dc.creator (Author) Luo, Yu-Tsen (en_US)
dc.date (Date) 2019 (en_US)
dc.date.accessioned 1-Jul-2019 10:43:32 (UTC+8)
dc.date.available 1-Jul-2019 10:43:32 (UTC+8)
dc.date.issued (Upload time) 1-Jul-2019 10:43:32 (UTC+8)
dc.identifier (Other Identifiers) G0106354007 (en_US)
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/124121
dc.description (Description) Master's degree (zh_TW)
dc.description (Description) National Chengchi University (zh_TW)
dc.description (Description) Department of Statistics (zh_TW)
dc.description (Description) 106354007 (zh_TW)
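dc.description.tableofcontents (zh_TW)
Chapter 1 Introduction
Chapter 2 Literature Review
Chapter 3 Methodology
3.1 TF-IDF
3.2 TF-IDF-Based Lyric Feature Extraction Methods
3.2.1 Method 1: Treating the Lyrics of One Song as One Document
3.2.2 Method 2: Treating All Lyrics with the Same Mood Label as One Document
3.2.3 Method 3: Handling the Repeated-Word Problem
3.3 Classification Models
3.3.1 Nearest Neighbors
3.3.2 Support Vector Machine
3.3.3 Random Forest
Chapter 4 Data
4.1 KKBOX Chinese Song Mood Label Data
4.1.1 Mood Labels
4.1.2 Lyrics
4.1.3 Audio Features
4.2 NJU-Music-Mood V1.0 Dataset
4.2.1 Mood Labels
4.2.2 Lyrics
4.2.3 Audio Features
Chapter 5 Modeling and Analysis
5.1 Lyric Preprocessing
5.2 Two-Class Classification: Sad and Happy
5.3 Three-Class Classification: Sad, Happy, and Calm
5.4 Filtering Repeated Words in Lyrics
5.5 Comparing the Strengths and Weaknesses of Method 3 and Method 2
5.5.1 Aspect 1
5.5.2 Aspect 2
5.6 Model Output Probabilities
Chapter 6 Conclusions and Suggestions
Appendix
References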
dc.format.extent 1547746 bytes
dc.format.mimetype application/pdf
dc.source.uri (Source) http://thesis.lib.nccu.edu.tw/record/#G0106354007 (en_US)
dc.subject (Keywords) Popular songs (zh_TW)
dc.subject (Keywords) Text analysis (zh_TW)
dc.subject (Keywords) Mood recognition (zh_TW)
dc.title (Title) 基於文本分析方法探討流行歌曲情緒辨識之研究 (zh_TW)
dc.title (Title) Song Mood Classification Based on Textual Analysis Method (en_US)
dc.type (Type) thesis (en_US)
dc.identifier.doi (DOI) 10.6814/NCCU201900092 (en_US)