學術產出-Theses

Article View/Open

Publication Export

Google ScholarTM

政大圖書館

Citation Infomation

  • No doi shows Citation Infomation
題名 有效率的探勘演化重覆性樣式演算法
Efficient algorithms for mining evolutionary repeating patterns
作者 陳俊豪
貢獻者 沈錳坤
陳俊豪
關鍵詞 演化重覆性樣式
音樂動機
日期 2007
上傳時間 9-Apr-2010 14:49:28 (UTC+8)
摘要 一個片段在序列中重複出現的現像稱之為「重覆性樣式」。這樣的重覆性樣式在許多不同的領域像是音樂分析以及生物資訊演算法上伴演著重要的角色。
     在音樂分析上,重覆性樣式即為一段連續的音符在樂曲中重覆出現的現象。在樂理中,這樣的重覆出現的片段即稱之為「音樂動機」,在貝多芬的第五交響曲中,即利用四個簡單的音符“sol sol sol mi”做為音樂動機,並利用這個音樂動機創作出整首交響曲。分析音樂的動機將有助於應用在音樂檢索上為音樂建立出適合的索引。
     動機的型式並非永遠一成不變,作曲者可能透過些微的變化,在樂曲中重覆出現。重覆性的片段在序列中未產生任何變變化而重覆出現的現象稱之為「精確重覆」;在序列中的重覆片段果存在些微的變化,且這些變化的序列皆與某一個序列相似則稱之為「近似重覆」。
     精確重覆性樣式及近似重覆性樣式的探勘,已在過去的許多研究中被提出。在本篇論文中,我們研究一種新的重覆性樣式,在每個重覆出現的序列中,都和前一個出現的序列相似,而非與某一個特定的序列相似,這樣的重覆樣式稱之為「演化重覆性樣式」。本篇論文提出演算重覆性樣式的問題,並提出兩種基本的演算法,改善在探勘演化重覆性樣式時的效率;最後結合兩種演算法的特性,提出一種綜合演算法,同時擁有上述兩個演算法的優點,獲得更好的效率。最後並在實驗中證明我們所提出的演算法能夠得到良好的執行效率。
A repeating pattern is a substring of a sequence, which repeats several times in the sequence. Repeating patterns play an important role in a variety of applications such as music analysis and bioinformatics.
     In music analysis, a repeating pattern is a sequence of notes repeats several times in a music object. In musicology, a repeating pattern corresponds to a motive which is a salient recurring segment of notes that may be used to construct all or some of the melody and themes. The well-know segment "sol-sol-sol-mi" in Beethoven`s Symphony no 5 is an example of motives. The repeating pattern is an efficient semantic representation to index music sequences for content-based music retrieval.
     The repetition of a motive may have some variations and not necessarily be an exact repetition in the music object. For example, motivic development is a composition technique that allows a composer to generate the entire music based on a motive which repeats in the form of variations. Moreover, it is possible that a motive may evolve in certain types of composition.
     Some exact and approximate repeating pattern mining algorithms have been developed. In this paper, a new form of approximate repeating patterns, evolutionary repeating patterns is investigated.
     In evolutionary repeating pattern, each instance is similar to the previous one, rather than the original pattern. We proposed two approaches, matrix-based and Apriori-based ones, to discover the evolutionary repeating patterns. Moreover, we also present a hybrid approach to combine features of the above two approaches. Scale-up experiments show that the proposed algorithms perform well.
參考文獻 [1] 黎翁斯坦, 潘皇龍譯, “音樂的結構與風格”, 全音樂譜出版社, 1974.
註:Leon Stein, “Structure & Style:The Study and Analysis of Musical Forms”, Summy-Birchard Music, 1979.(expanded edition)
[2] R. Agrawal, and R. Srikant, "Fast Algorithm for Mining Association Rules," Proc. of International Conference on Very Large Data Bases VLDB, 1994.
[3] R. Agrawal, and R. Srikant, “Mining Sequential Patterns,” Proc. of International Conference on Data Engineering ICDE, 1995.
[4] T. Crawford, C. S. Iliopoulos, R. Winder, and H. Yu. “Approximate Musical Evolution,” Proc. of Artificial Intelligence and Simulation of Behaviour Symposium AISB, 1999.
[5] J. Han, G. Dong, and Y. Yin, “Efficient Mining of Partial Periodic Patterns in Time Series Databases,” Proc. of International Conference on Data Engineering ICDE, 1999.
[6] C.S. Iliopoulos, M. Kumar, L. Mouchard, and S. Venkatesh, “Motif Evolution in Polyphonic Musical Sequences,” Proc. of Australasian Workshop on Combinatorial Algorithms AWOCA, 2000.
[7] C. S. Iliopoulos, K. Lemström, M. Nuyad, and Y. J. Pinzón, “Evolution of Musical Motifs in Polyphonic Passages,” Proc. of Symposium on AI and Creativity in Arts and Science AISB, 2002
[8] A. M. Helena “Finding All Maximal Frequent Sequences in Text,” Proc. of International Conference on Machine Learning ICML, 1999.
[9] J. L. Hsu, A. L. P. Chen, and H. C. Chen, “Finding Approximate Repeating Patterns from Sequence Data,” Proc.of Intertional Symposium on Music Information Retrieval ISMIR, 2004.
[10] J. L. Hsu, C. C. Liu, and A. L. P. Chen, “Efficient Repeating Pattern Finding in Music Database,” Proc. of Conference on Information and Knowledge Management CIKM, 1998.
[11] J. L. Hsu, C. C. Liu, and A. L. P. Chen, “Discovering Non-Trivial Repeating Patterns in Music Data,” IEEE Transactions on Multimedia, 2001.
[12] J. L. Koh, and Y. T. Kung “An Efficient Approach for Mining Top-K Fault-Tolerant Repeating Patterns,” Proc. of International Conference on Database Systems for Advanced Applications DASFAA, 2006.
[13] C. C. Liu, J. L. Hsu, and A. L. P. Chen, “Efficient Theme and Non-Trivial Repeating Pattern Discovering in Music Databases,” Proc. of International Conference on Data Engineering ICDE, 1999.
[14] N. H. Liu, Y. H. Wu, and A. L. P. Chen “An Efficient Approach to Extracting Approximate Repeating Pattern in Music Databases,” Proc. of International Conference on Database Systems for Advanced Applications DASFAA, 2005.
[15] Y. L. Lo, and C. Y. Chen “Fault Tolerant Non-trivial Repeating Pattern Discovering for Music Data,” Proc. of International Conference on Computer and Information Science ICIS, 2007.
[16] Y. L. Lo, W. L. Lee, and L. H. Chang “True suffix tree approach for discovering non-trivial repeating patterns in a music object,” Multimedia Tools and Applications, 2007.
[17] Y. L. Lo, and W. L. Li, “Linear Time for Discovering Non-trivial Repeating Pattern in Music Database,” Proc. of International Conference on Multimedia and Expo ICME, 2004.
[18] C. Meek, and W. P. Birmingham “Thematic Extractor,” Proc.of Intertional Symposium on Music Information Retrieval ISMIR, 2001.
[19] H. H. Shin, S. S. Narayanan, and C. C. J. Kuo, “A Dictionary Approach to Repetitive Pattern Finding In Music,” Proc. of International Conference on Multimedia and Expo ICME, 2001.
[20] P. Weiner (1973). “Linear Pattern Matching Algorithm,” Proc. of Symposium on Switching and Automata Theory. pp.1-11, 1973.
描述 碩士
國立政治大學
資訊科學學系
95753010
96
資料來源 http://thesis.lib.nccu.edu.tw/record/#G0095753010
資料類型 thesis
dc.contributor.advisor 沈錳坤zh_TW
dc.contributor.author (Authors) 陳俊豪zh_TW
dc.creator (作者) 陳俊豪zh_TW
dc.date (日期) 2007en_US
dc.date.accessioned 9-Apr-2010 14:49:28 (UTC+8)-
dc.date.available 9-Apr-2010 14:49:28 (UTC+8)-
dc.date.issued (上傳時間) 9-Apr-2010 14:49:28 (UTC+8)-
dc.identifier (Other Identifiers) G0095753010en_US
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/38539-
dc.description (描述) 碩士zh_TW
dc.description (描述) 國立政治大學zh_TW
dc.description (描述) 資訊科學學系zh_TW
dc.description (描述) 95753010zh_TW
dc.description (描述) 96zh_TW
dc.description.abstract (摘要) 一個片段在序列中重複出現的現像稱之為「重覆性樣式」。這樣的重覆性樣式在許多不同的領域像是音樂分析以及生物資訊演算法上伴演著重要的角色。
     在音樂分析上,重覆性樣式即為一段連續的音符在樂曲中重覆出現的現象。在樂理中,這樣的重覆出現的片段即稱之為「音樂動機」,在貝多芬的第五交響曲中,即利用四個簡單的音符“sol sol sol mi”做為音樂動機,並利用這個音樂動機創作出整首交響曲。分析音樂的動機將有助於應用在音樂檢索上為音樂建立出適合的索引。
     動機的型式並非永遠一成不變,作曲者可能透過些微的變化,在樂曲中重覆出現。重覆性的片段在序列中未產生任何變變化而重覆出現的現象稱之為「精確重覆」;在序列中的重覆片段果存在些微的變化,且這些變化的序列皆與某一個序列相似則稱之為「近似重覆」。
     精確重覆性樣式及近似重覆性樣式的探勘,已在過去的許多研究中被提出。在本篇論文中,我們研究一種新的重覆性樣式,在每個重覆出現的序列中,都和前一個出現的序列相似,而非與某一個特定的序列相似,這樣的重覆樣式稱之為「演化重覆性樣式」。本篇論文提出演算重覆性樣式的問題,並提出兩種基本的演算法,改善在探勘演化重覆性樣式時的效率;最後結合兩種演算法的特性,提出一種綜合演算法,同時擁有上述兩個演算法的優點,獲得更好的效率。最後並在實驗中證明我們所提出的演算法能夠得到良好的執行效率。
zh_TW
dc.description.abstract (摘要) A repeating pattern is a substring of a sequence, which repeats several times in the sequence. Repeating patterns play an important role in a variety of applications such as music analysis and bioinformatics.
     In music analysis, a repeating pattern is a sequence of notes repeats several times in a music object. In musicology, a repeating pattern corresponds to a motive which is a salient recurring segment of notes that may be used to construct all or some of the melody and themes. The well-know segment "sol-sol-sol-mi" in Beethoven`s Symphony no 5 is an example of motives. The repeating pattern is an efficient semantic representation to index music sequences for content-based music retrieval.
     The repetition of a motive may have some variations and not necessarily be an exact repetition in the music object. For example, motivic development is a composition technique that allows a composer to generate the entire music based on a motive which repeats in the form of variations. Moreover, it is possible that a motive may evolve in certain types of composition.
     Some exact and approximate repeating pattern mining algorithms have been developed. In this paper, a new form of approximate repeating patterns, evolutionary repeating patterns is investigated.
     In evolutionary repeating pattern, each instance is similar to the previous one, rather than the original pattern. We proposed two approaches, matrix-based and Apriori-based ones, to discover the evolutionary repeating patterns. Moreover, we also present a hybrid approach to combine features of the above two approaches. Scale-up experiments show that the proposed algorithms perform well.
en_US
dc.description.tableofcontents 目錄
     誌謝…………………………………………………………………………………………………i
     摘要……………………………………………………………………...……….iii
     目錄……………………………………………………………………..……….v
     圖目錄…………………………………………………………………………..vii
     表目錄.................................................................................................................ix
     第一章 概論 ……………………………………………….……………………1
     1.1 研究動機……………………..……………………..………………1
     1.2 論文架構…………………………………………..…………….….2
     第二章 相關研究…………………………………………….…….……...……..3
     2.1 動機演化 (Motif Evolution)…..…..……….…......…..3
     2.1.1 音樂動機…………….....………………………….….......….3
     2.1.2 動機演化……………………..…………….……..…....….……3
     2.2探勘精確重複樣式(Mining Exact Repeating Pattern)..…..6
     2.2.1 字典法(Dictionary-based Approach)……………………..…..7
     2.2.2 相關矩陣法(Correlative Matrix)…………………….....…….7
     2.2.3 合併字串法(String-join)……………………………......…….9
     2.2.4 字尾樹(Suffix Tree)……………………………………....….10
      2.3探勘精確重複樣式(Mining Approximate Repeating Pattern)…..…12
     2.3.1 允許插入(Insertion)元素的編輯距離………………….…...12
     2.3.2 允許插入(Insertion)或刪除(Deletion)元素的編輯距離.......14
     2.3.3 允許插入(Insertion)/刪除(Deletion)/相異(Mismatch)
     元素的編輯距離....................................................................15
     第三章 基本定義……………………………………………….……………....18
     第四章 探勘頻繁的演化重複性樣式……………….………………………....21
     4.1 矩陣法(Matrix-based Approach)………………………………….21
     4.1.1矩陣建立(Matrix Construction)……………………………..…..21
     4.1.2產生頻繁樣式(Frequent Pattern Generation)…………….......….25
     4.2 Apriori-based Approach....................................................................30
     4.3 混合法(Hybrid Approach)...............................................................33
     第五章 探勘的具代表性的演化重複性樣式………………………..…..….....35
     5.1 封閉性演化樣式(Closed Evolution Pattern)...................................35
     5.2 條件限制下的演化樣式(Evolution Patterns with Constrain).…....37
     第六章 實驗結果與評估.....................................................................................42
     6.1探勘樣式演化演算法的效能比較...................................................42
     6.1.1 序列長度實驗........................................................................43
     6.1.2 最大距離實驗………………………………………..…......44
     6.1.3 最小支持數實驗………………………………….………...46
     6.2 代表性的演化重複性樣式………………………………..……....47
     6.2.1 封閉演化樣式實驗................................................................47
     6.2.2 最小樣式大小………………………………..……….…….49
     第七章 結論與未來研究……………………………………………….……...50
     參考文獻………………………………………………………………….….....51
     
     
     
     圖目錄
     
     圖1.1 基因序列中演化重複樣式之範例……………………………..……………….……2
     圖2.1 貝多芬「命運交響曲中」重複出現的音樂動機……………………………………3
     圖2.2 計算距離的四種情況:相符(Match)、插入(Insertion)、
     刪除(Deletion)、相異(Mismatch)………………………………………………..…4
     圖2.3 給定初始動機“ABCD”,找出的動機演化鏈………………………………………5
     圖2.4 給定初始動機,找到在音樂序列中的所有演化樣式[12]…………………………5
     圖2.5 將得到的演化過程,記錄成動機演化樹[12]………………………………………5
     圖2.6 將音樂序列以小節為單位建立索引…………………………………..……………7
     圖2.7 將音符序列建立出相關矩陣………………………………………….…………… 8
     圖2.8 利用字串結合,找到重複樣式……………………………………….……………10
     圖2.9 將序列中的字尾字串儲存成字尾樹(Suffix Tree)…………………….…..……….11
     圖2.10 利用Apriori的特性,避免找出不可能的樣式……………………….…………14
     圖2.11 利用H向量表示序列片段的範例[16]………………………………….…………16
     圖2.12 利用R*-tree索引所有的H向量的範例[16]……………………………...………17
     圖3.1 序列中的演化樣式、演化鏈………………………………………………………19
     圖3.2 序列S=abcdabefhafgbefh中演化樣式abc之演化樹………………………………20
     圖4.1 序列S=abcdabefhafgnefh之實體矩陣………………………………………...……22
     圖4.2 實體矩陣建構演算法………………………………………………..........................24
     圖4.3 說明在圖4.1之矩陣在第5對角(5-th diagonal)上建構的過程…………….……25
     圖4.4 產生樣式的遞迴演算法……………………………………………………………27
     圖4.5 利用遞迴找出演化樣式之範例……………………………………………..………28
     圖4.6 在演化圖上找出樣式的深度優先追蹤演算法……………………………..………29
     圖4.7 演化圖之實例………………………………………………………………..………29
     圖4.8 演化樣式符合向下封閉性之實例……………………………………………..……31
     圖4.9 Apriori-based演算法………………………………………………………….……32
     圖4.10 利用Apriori特性找出不同長度下所有的頻繁演化樣式之實例..........................33
     圖4.11 將Apriori特性加入矩陣追蹤演算法......................................................................34
     圖5.1 封閉性樣式之實例.....................................................................................................36
     圖5.2 從矩陣追蹤演算法中探勘封閉樣式之演算法.........................................................37
     圖5.3 最小樣式大小限制之實例.........................................................................................38
     圖5.4 非重疊限制之實例.....................................................................................................39
     圖5.5 最長時間(Time)限制之實例......................................................................................39
     圖5.6 覆蓋度限制之實例.....................................................................................................40
     圖5.7 最大平均距離限制之實例.........................................................................................41
     圖6.1 序列長度對執行時間之影響.....................................................................................43
     圖6.2 序列長度對記憶體使用之影響.................................................................................44
     圖6.3 最大距離對執行時間之影響.....................................................................................45
     圖6.4 最大距離對演化樣式、演化鏈、候選樣式個數之影響.........................................45
     圖6.5 最大距離對記憶體變化量的影響.............................................................................46
     圖6.6 最小支持度對執行時間之影響.................................................................................47
     圖6.7 序列長度對封閉樣式個數之影響.............................................................................48
     圖6.8 最大距離對封閉樣式個數之影響.............................................................................48
     圖6.9 最小支持度對封閉樣式個數之影響.........................................................................49
     圖6.10 最小樣式大小之限制對演化樣式和演化鏈個數之影響.....................................49
     表目錄
     
     表1 序列Seq的所有字尾字串(Suffix String)。…………………………………………11
     表2 利用改變滑動視窗的範圍,避免大量的重複計算。………………………………13
     表3 字串Seq的位元索引表。……………………………………………………………14
zh_TW
dc.language.iso en_US-
dc.source.uri (資料來源) http://thesis.lib.nccu.edu.tw/record/#G0095753010en_US
dc.subject (關鍵詞) 演化重覆性樣式zh_TW
dc.subject (關鍵詞) 音樂動機zh_TW
dc.title (題名) 有效率的探勘演化重覆性樣式演算法zh_TW
dc.title (題名) Efficient algorithms for mining evolutionary repeating patternsen_US
dc.type (資料類型) thesisen
dc.relation.reference (參考文獻) [1] 黎翁斯坦, 潘皇龍譯, “音樂的結構與風格”, 全音樂譜出版社, 1974.zh_TW
dc.relation.reference (參考文獻) 註:Leon Stein, “Structure & Style:The Study and Analysis of Musical Forms”, Summy-Birchard Music, 1979.(expanded edition)zh_TW
dc.relation.reference (參考文獻) [2] R. Agrawal, and R. Srikant, "Fast Algorithm for Mining Association Rules," Proc. of International Conference on Very Large Data Bases VLDB, 1994.zh_TW
dc.relation.reference (參考文獻) [3] R. Agrawal, and R. Srikant, “Mining Sequential Patterns,” Proc. of International Conference on Data Engineering ICDE, 1995.zh_TW
dc.relation.reference (參考文獻) [4] T. Crawford, C. S. Iliopoulos, R. Winder, and H. Yu. “Approximate Musical Evolution,” Proc. of Artificial Intelligence and Simulation of Behaviour Symposium AISB, 1999.zh_TW
dc.relation.reference (參考文獻) [5] J. Han, G. Dong, and Y. Yin, “Efficient Mining of Partial Periodic Patterns in Time Series Databases,” Proc. of International Conference on Data Engineering ICDE, 1999.zh_TW
dc.relation.reference (參考文獻) [6] C.S. Iliopoulos, M. Kumar, L. Mouchard, and S. Venkatesh, “Motif Evolution in Polyphonic Musical Sequences,” Proc. of Australasian Workshop on Combinatorial Algorithms AWOCA, 2000.zh_TW
dc.relation.reference (參考文獻) [7] C. S. Iliopoulos, K. Lemström, M. Nuyad, and Y. J. Pinzón, “Evolution of Musical Motifs in Polyphonic Passages,” Proc. of Symposium on AI and Creativity in Arts and Science AISB, 2002zh_TW
dc.relation.reference (參考文獻) [8] A. M. Helena “Finding All Maximal Frequent Sequences in Text,” Proc. of International Conference on Machine Learning ICML, 1999.zh_TW
dc.relation.reference (參考文獻) [9] J. L. Hsu, A. L. P. Chen, and H. C. Chen, “Finding Approximate Repeating Patterns from Sequence Data,” Proc.of Intertional Symposium on Music Information Retrieval ISMIR, 2004.zh_TW
dc.relation.reference (參考文獻) [10] J. L. Hsu, C. C. Liu, and A. L. P. Chen, “Efficient Repeating Pattern Finding in Music Database,” Proc. of Conference on Information and Knowledge Management CIKM, 1998.zh_TW
dc.relation.reference (參考文獻) [11] J. L. Hsu, C. C. Liu, and A. L. P. Chen, “Discovering Non-Trivial Repeating Patterns in Music Data,” IEEE Transactions on Multimedia, 2001.zh_TW
dc.relation.reference (參考文獻) [12] J. L. Koh, and Y. T. Kung “An Efficient Approach for Mining Top-K Fault-Tolerant Repeating Patterns,” Proc. of International Conference on Database Systems for Advanced Applications DASFAA, 2006.zh_TW
dc.relation.reference (參考文獻) [13] C. C. Liu, J. L. Hsu, and A. L. P. Chen, “Efficient Theme and Non-Trivial Repeating Pattern Discovering in Music Databases,” Proc. of International Conference on Data Engineering ICDE, 1999.zh_TW
dc.relation.reference (參考文獻) [14] N. H. Liu, Y. H. Wu, and A. L. P. Chen “An Efficient Approach to Extracting Approximate Repeating Pattern in Music Databases,” Proc. of International Conference on Database Systems for Advanced Applications DASFAA, 2005.zh_TW
dc.relation.reference (參考文獻) [15] Y. L. Lo, and C. Y. Chen “Fault Tolerant Non-trivial Repeating Pattern Discovering for Music Data,” Proc. of International Conference on Computer and Information Science ICIS, 2007.zh_TW
dc.relation.reference (參考文獻) [16] Y. L. Lo, W. L. Lee, and L. H. Chang “True suffix tree approach for discovering non-trivial repeating patterns in a music object,” Multimedia Tools and Applications, 2007.zh_TW
dc.relation.reference (參考文獻) [17] Y. L. Lo, and W. L. Li, “Linear Time for Discovering Non-trivial Repeating Pattern in Music Database,” Proc. of International Conference on Multimedia and Expo ICME, 2004.zh_TW
dc.relation.reference (參考文獻) [18] C. Meek, and W. P. Birmingham “Thematic Extractor,” Proc.of Intertional Symposium on Music Information Retrieval ISMIR, 2001.zh_TW
dc.relation.reference (參考文獻) [19] H. H. Shin, S. S. Narayanan, and C. C. J. Kuo, “A Dictionary Approach to Repetitive Pattern Finding In Music,” Proc. of International Conference on Multimedia and Expo ICME, 2001.zh_TW
dc.relation.reference (參考文獻) [20] P. Weiner (1973). “Linear Pattern Matching Algorithm,” Proc. of Symposium on Switching and Automata Theory. pp.1-11, 1973.zh_TW