Academic Output – Theses
Title: 以資料探勘技術預測電影配樂的使用時機
Title (English): Timing Prediction of Movie Scoring Based on Data Mining Techniques
Author: 段承甫 (Duan, Cheng-Fu)
Advisor: 沈錳坤 (Shan, Man-Kwan)
Keywords: 電影配樂 (Film score); 時機預測 (Timing prediction); 電影分段 (Movie segmentation); 資料探勘 (Data mining)
Date: 2018
Uploaded: 3-Sep-2018 15:52:01 (UTC+8)

Abstract (translated from the Chinese): A good film score is an indispensable part of an outstanding movie. Composers tailor a score to a movie's genre and style and place it at exactly the right moments. Although much prior research addresses video content analysis, none has tried to predict when a film score should be used. This study takes movies with outstanding scores as sample data and uses data mining techniques to learn a model that predicts scoring timing, so that the model can automatically identify the segments of an unscored movie that are suitable for scoring. The approach can also be extended to predicting background-music timing for user-generated video. We cast the timing prediction problem as a binary classification problem. So that each segment is representative with respect to score usage, we segment movies at the scene level, using the alignment between the movie script and its subtitles together with shot information. After segmentation, we extract visual features, text features, movie metadata, and other features from each scene and use them to train the prediction model. In the experiments we evaluate Decision Tree, Logistic Regression, Support Vector Machine, Random Forest, and Conditional Random Field models, examine the key factors that determine scoring timing and how prediction results vary across movies, and test whether considering the scene's context improves prediction. The results show that the main factors are a segment's position in the movie, the proportion of time occupied by dialogue, and the dialogue density. Considering scene context improves prediction for most movies, and Random Forest performs best (R-Precision ≈ 0.663, ROC AUC ≈ 0.675).
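The abstract above reports R-Precision (about 0.663) and ROC AUC (about 0.675) for the Random Forest model. As a quick reference for how these two metrics are computed, here is a minimal stdlib-only Python sketch; the labels and scores below are invented for illustration and are not the thesis's data:

```python
# Minimal sketch of the two evaluation metrics the thesis reports.
# Labels: 1 = scene where a score was actually used, 0 = not.
# Scores: a model's predicted probability that a scene should be scored.

def r_precision(labels, scores):
    """Precision among the top-R ranked items, where R = number of positives."""
    r = sum(labels)
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    return sum(lab for _, lab in ranked[:r]) / r

def roc_auc(labels, scores):
    """Probability a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
print(r_precision(labels, scores))  # top 3 ranked scenes contain 2 positives
print(roc_auc(labels, scores))      # 7/9 of (positive, negative) pairs ordered correctly
```

The rank-based AUC above is equivalent to the area under the ROC curve, which is why it serves as a drop-in check for the headline numbers.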
Abstract (English): Film scores are essential to movies. Composers write background scores to fit a movie's style and genre. Much research has been done on video content analysis, but none on predicting the timing of film scores. In this thesis, we investigate timing prediction of film scores based on data mining techniques; the approach is also applicable to timing prediction of background music for user-generated content. In the proposed approach, the timing prediction problem is transformed into a binary classification problem. We first segment movies into scenes by aligning the scripts with the subtitles of movies. After segmentation, visual features, text features, movie metadata, and sentiment features of each scene are extracted and used to learn the prediction model. In the experiments, Decision Tree, Logistic Regression, Support Vector Machine, Random Forest, and Conditional Random Field algorithms are employed for model training. The experimental results show that a scene's timestamp, subtitle proportion, and word density are the key factors for timing prediction, and that taking context into consideration improves prediction performance.

Description: Master's thesis
National Chengchi University
Department of Computer Science
Student ID: 104753017
Source: http://thesis.lib.nccu.edu.tw/record/#G0104753017
Type: thesis
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/119909
DOI: 10.6814/THE.NCCU.CS.012.2018.B02
Format: application/pdf, 2783825 bytes

Table of Contents:
Chapter 1  Introduction
Chapter 2  Related Work
  2.1 Video Abstraction
  2.2 Movie Trailers
    2.2.1 Selecting Trailer Material without Trained Models
    2.2.2 Selecting Trailer Material with Trained Models
  2.3 Video Highlights
Chapter 3  Methodology
  3.1 System Architecture
  3.2 Movie Segmentation
    3.2.1 Shot Boundary Detection
    3.2.2 Script–Subtitle Alignment
    3.2.3 Shot Merge
  3.3 Scene Labeling (Ground Truth)
  3.4 Feature Extraction
    3.4.1 Visual Features
    3.4.2 Text Features
    3.4.3 Movie Metadata
    3.4.4 Other Features
  3.5 Prediction Models
    3.5.1 Support Vector Machine (SVM)
    3.5.2 Logistic Regression
    3.5.3 Decision Trees
    3.5.4 Random Forest
    3.5.5 Conditional Random Fields (CRF)
Chapter 4  Experiments
  4.1 Experimental Data
  4.2 Tools Used in the Experiments
    4.2.1 Movie Segmentation
    4.2.2 Feature Extraction
    4.2.3 Prediction Models
  4.3 Predicting Scenes Suitable for Scoring
    4.3.1 Evaluation
    4.3.2 Scene Counts Before and After CRF
    4.3.3 Scene Prediction Results
    4.3.4 Considering Scene Context
Chapter 5  Conclusion and Future Work
References
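The segmentation step described in the abstract relies on aligning the movie script with timed subtitles to locate scene boundaries. The thesis's actual alignment method is not reproduced here; the following is only a toy stdlib-only sketch of the idea, with made-up scene texts and subtitles and `difflib` string similarity standing in for the real matcher:

```python
# Toy sketch of script-subtitle alignment for scene segmentation: each timed
# subtitle line is assigned to the script scene whose dialogue it most
# resembles, and a scene's start time is taken from its earliest matched
# subtitle. All data below is invented for illustration.
from difflib import SequenceMatcher

script_scenes = [  # dialogue text per script scene (hypothetical)
    "I never thought we would make it this far",
    "the storm is coming we have to leave tonight",
]
subtitles = [      # (start time in seconds, subtitle text) (hypothetical)
    (12.0, "I never thought we'd make it"),
    (15.5, "this far"),
    (83.2, "the storm is coming"),
    (86.0, "we have to leave tonight"),
]

def best_scene(text):
    """Index of the script scene most similar to a subtitle line."""
    sims = [SequenceMatcher(None, text, s).ratio() for s in script_scenes]
    return max(range(len(sims)), key=sims.__getitem__)

# Scene start time = time of the earliest subtitle assigned to that scene.
starts = {}
for t, text in subtitles:
    starts.setdefault(best_scene(text), t)
print(starts)  # {0: 12.0, 1: 83.2}
```

A real pipeline would combine this text alignment with shot-boundary detection, as the thesis does, so that scene boundaries snap to visual cuts rather than to subtitle timestamps alone.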