Title 具概念飄移的動態社群網絡之類別預測
Label Prediction on Dynamic Social Networks with Concept Drifting
Author 游詳閔
Yu, Hsiang-Min
Contributors 沈錳坤
Shan, Man-Kwan
游詳閔
Yu, Hsiang-Min
Keywords Label Prediction (類別預測)
Dynamic Social Networks (動態社群網絡)
Concept Drifting (概念飄移)
Date 2010
Upload time 6-Nov-2019 15:27:14 (UTC+8)
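Abstract Social networks play an increasingly important role in computer science research, and label prediction is one of the popular topics in this area. The goal of label prediction is to use the nodes in a network whose labels are known to predict the labels of the remaining, unlabeled nodes.

Previous work on label prediction has focused on static social networks; however, social networks usually evolve over time. In a dynamic network, the nodes, links, and labels may all change as time passes, and so does the mutual influence among nodes. This change can be regarded as a form of concept drift.

Unlike previous studies, we identify the problem of label classification in dynamic networks and propose a solution that combines classification techniques for static networks with methods for handling concept drift.

The experiments use a social network built from IMDb (Internet Movie Database) to predict the labels of actors. The results show that incorporating the evolution of the dynamic social network as an additional signal improves the accuracy of label classification in dynamic networks.
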
Label prediction is one of the central questions of social network research. Its core task is to use the labeled nodes in a social network to predict the labels of the unlabeled nodes. A labeled social network is a social network in which some or all of the nodes carry labels. Nodes in the same social network influence each other’s labels.
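
To make the task concrete, the following is a minimal sketch of label prediction by neighbor majority vote on a static network. It is an illustration only and is not taken from the thesis: the function name `predict_labels`, the toy edge list, and the genre-style labels are all assumptions introduced here.

```python
from collections import Counter


def predict_labels(edges, known_labels):
    """Give each unlabeled node the most common label among its labeled neighbors.

    Illustrative helper (not from the thesis). Nodes with no labeled
    neighbor are left out of the result.
    """
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    predictions = {}
    for node in sorted(neighbors):
        if node in known_labels:
            continue
        votes = Counter(
            known_labels[n] for n in neighbors[node] if n in known_labels
        )
        if votes:
            # Highest vote count wins; ties are broken by label name
            # so the result is deterministic.
            predictions[node] = min(votes, key=lambda lab: (-votes[lab], lab))
    return predictions


# Hypothetical toy network: nodes are actors, edges are co-appearances.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
known = {"a": "drama", "b": "drama", "e": "comedy"}
result = predict_labels(edges, known)
print(result)
```

Here node `c` has two labeled neighbors that agree, while node `d` has only one labeled neighbor, so each prediction rests on the mutual influence between linked nodes that the abstract describes.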

Previous research on label prediction has focused on static social networks. In reality, however, social networks are dynamic. In a dynamic social network, the links between nodes, and even the labels of nodes, can change over time. The mutual influence among nodes can change as well. This change is called “Concept Drift.”

This thesis predicts labels in a dynamic labeled social network. We address the problem of classification in a dynamic social network by combining label prediction techniques for static social networks with algorithms designed to handle concept drift.
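
One common way to handle concept drift is to keep several classifiers trained on different time periods and to down-weight those that err on recent data. The sketch below shows a generic weighted-majority scheme of that kind. It is an assumption-laden illustration, not the thesis's own ensemble method: the names `weighted_vote` and `update_weights`, the penalty factor `beta`, and the three stand-in classifiers are hypothetical.

```python
from collections import defaultdict


def weighted_vote(classifiers, weights, x):
    """Return the label with the largest total weight among the classifiers."""
    scores = defaultdict(float)
    for clf, w in zip(classifiers, weights):
        scores[clf(x)] += w
    # Ties are broken by label name so the result is deterministic.
    return min(scores, key=lambda lab: (-scores[lab], lab))


def update_weights(classifiers, weights, x, true_label, beta=0.5):
    """Multiply the weight of every classifier that mispredicts x by beta."""
    return [
        w * beta if clf(x) != true_label else w
        for clf, w in zip(classifiers, weights)
    ]


# Stand-ins for models fit on an early, a middle, and a recent snapshot.
def clf_old(node):
    return "drama"


def clf_mid(node):
    return "drama" if node["drama_links"] >= node["comedy_links"] else "comedy"


def clf_new(node):
    return "comedy" if node["comedy_links"] > 0 else "drama"


classifiers = [clf_old, clf_mid, clf_new]
weights = [1.0, 1.0, 1.0]

# Hypothetical labeled nodes observed in the latest snapshot.
recent = [
    ({"drama_links": 1, "comedy_links": 3}, "comedy"),
    ({"drama_links": 2, "comedy_links": 1}, "comedy"),
]
for node, label in recent:
    weights = update_weights(classifiers, weights, node, label)

query = {"drama_links": 3, "comedy_links": 1}
prediction = weighted_vote(classifiers, weights, query)
print(weights, prediction)
```

With uniform weights the two older classifiers would outvote the recent one on `query`; after the updates, the classifier that matches the recent data carries the vote.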

Experiments were performed on a labeled social network constructed from the Internet Movie Database. The results show that taking the evolution of a dynamic social network into account yields more accurate label predictions.
References [1] L. Breiman, “Random Forests,” Machine Learning, Vol. 45, No. 1, pp. 5-32, 2001.

[2] D. Brezeale and D. J. Cook, “Automatic Video Classification: A Survey of the Literature,” IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, Vol. 38, pp. 416-430, 2008.

[3] C. Desrosiers and G. Karypis, “Within-Network Classification Using Local Structure Similarity,” Proc. of the European Conference on Machine Learning (ECML) and the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases, pp. 260-275, 2009.

[4] B. Gallagher, H. Tong, T. Eliassi-Rad, and C. Faloutsos, “Using Ghost Edges for Classification in Sparsely Labeled Networks,” Proc. of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 256-264, 2008.

[5] J. He, J. Carbonell, and Y. Liu, “Graph-Based Semi-Supervised Learning as a Generative Model,” Proc. of the International Joint Conference on Artificial Intelligence, pp. 2492-2497, 2007.

[6] J. He, M. J. Li, H. J. Zhang, H. H. Tong, and C. S. Zhang, “Manifold-Ranking Based Image Retrieval,” Proc. of the 12th Annual ACM International Conference on Multimedia, pp. 9-16, 2004.

[7] J. Z. Kolter and M. A. Maloof, “Dynamic Weighted Majority: A New Ensemble Method for Tracking Concept Drift,” Proc. of the 3rd IEEE International Conference on Data Mining, pp. 123-130, 2003.

[8] F. Lin and W. W. Cohen, “Semi-Supervised Classification of Network Data Using Very Few Labels,” Proc. of the 2010 International Conference on Advances in Social Networks Analysis and Mining, pp. 192-199, 2010.

[9] S. A. Macskassy and F. Provost, “A Simple Relational Classifier,” Proc. of the Second Workshop on Multi-Relational Data Mining (MRDM-2003) at the International Conference on Knowledge Discovery and Data Mining, pp. 64-76, 2003.

[10] S. A. Macskassy and F. Provost, “Classification in Networked Data: A Toolkit and a Univariate Case Study,” Journal of Machine Learning Research, Vol. 8, pp. 935-983, 2007.

[11] L. McDowell, K. M. Gupta, and D. W. Aha, “Cautious Inference in Collective Classification,” Journal of Machine Learning Research, Vol. 10, pp. 596-601, 2007.

[12] L. McDowell, K. M. Gupta, and D. W. Aha, “Meta-Prediction for Collective Classification,” Proc. of the 23rd International FLAIRS Conference, 2010.

[13] J. Y. Pan, H. J. Yang, C. Faloutsos, and P. Duygulu, “Automatic Multimedia Cross-Modal Correlation Discovery,” Proc. of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 653-658, 2004.

[14] P. Sen, G. M. Namata, M. Bilgic, L. Getoor, B. Gallagher, and T. Eliassi-Rad, “Collective Classification in Network Data,” AI Magazine, Vol. 29, No. 3, pp. 93-106, 2008.

[15] W. Street and Y. Kim, “A Streaming Ensemble Algorithm (SEA) for Large-Scale Classification,” Proc. of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 377-382, 2001.

[16] A. Sultan and A. Hegami, “Classical and Incremental Classification in Data Mining Process,” International Journal of Computer Science and Network Security, Vol. 7, No. 12, pp. 179-187, 2007.

[17] P. N. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Addison-Wesley Longman Publishing Co., 2005.

[18] H. Tong and C. Faloutsos, “Center-Piece Subgraphs: Problem Definition and Fast Solutions,” Proc. of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 404-413, 2006.

[19] H. Tong, C. Faloutsos, and J. Y. Pan, “Fast Random Walk with Restart and Its Applications,” Proc. of the 6th IEEE International Conference on Data Mining, pp. 613-622, 2006.

[20] G. Tsoumakas and I. Katakis, “Multi-Label Classification: An Overview,” International Journal of Data Warehousing and Mining, Vol. 3, No. 3, pp. 1-13, 2007.

[21] A. Tsymbal, M. Pechenizkiy, P. Cunningham, and S. Puuronen, “Dynamic Integration of Classifiers for Handling Concept Drift,” Information Fusion, Vol. 9, pp. 56-68, 2008.

[22] D. Zhou, O. Bousquet, T. N. Lal, J. Weston, and B. Schölkopf, “Learning with Local and Global Consistency,” Advances in Neural Information Processing Systems, Vol. 16, pp. 321-328, 2004.

[23] X. Zhu, Z. Ghahramani, and J. Lafferty, “Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions,” Proc. of the 20th International Conference on Machine Learning, pp. 912-919, 2003.

[24] X. Zhu, Semi-Supervised Learning Literature Survey, University of Wisconsin-Madison Department of Computer Sciences, 2005.

[25] I. Žliobaitė, “Learning under Concept Drift: An Overview,” Technical Report, Vilnius University, 2010.
Description Master's degree
National Chengchi University
Department of Computer Science
97753029
Source http://thesis.lib.nccu.edu.tw/record/#G0097753029
Type thesis
dc.contributor.advisor 沈錳坤 (zh_TW)
dc.contributor.advisor Shan, Man-Kwan (en_US)
dc.contributor.author (Authors) 游詳閔 (zh_TW)
dc.contributor.author (Authors) Yu, Hsiang-Min (en_US)
dc.creator (Author) 游詳閔 (zh_TW)
dc.creator (Author) Yu, Hsiang-Min (en_US)
dc.date (Date) 2010 (en_US)
dc.date.accessioned 6-Nov-2019 15:27:14 (UTC+8)
dc.date.available 6-Nov-2019 15:27:14 (UTC+8)
dc.date.issued (Upload time) 6-Nov-2019 15:27:14 (UTC+8)
dc.identifier (Other Identifiers) G0097753029 (en_US)
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/127215
dc.description (Description) Master's degree
dc.description (Description) National Chengchi University
dc.description (Description) Department of Computer Science
dc.description (Description) 97753029
dc.description.tableofcontents Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Related Work
2.1 Collective Classification
2.2 Local Structure Similarity
2.3 Graph-based Semi-supervised Learning
2.4 Ghost Edges

Chapter 3 Methodology
3.1 Problem Definition
3.2 Research Framework
3.3 Base Classifier Learning
3.3.1 Ghost Edge
3.3.2 Random Walk With Restart
3.3.3 Base Classifier
3.4 Feature Selection
3.5 Ensemble Box Learning
3.5.1 Concept Drift
3.5.2 Ensemble Box
3.6 Labeling
3.6.1 Iterative Classification Algorithm

Chapter 4
4.1 Dataset
4.1.1 Dataset Characteristics
4.2 Experimental Design

Chapter 5
5.1 Conclusion
5.2 Future Work

References
dc.format.extent 911188 bytes
dc.format.mimetype application/pdf
dc.source.uri (Source) http://thesis.lib.nccu.edu.tw/record/#G0097753029 (en_US)
dc.subject (Keywords) 類別預測 (zh_TW)
dc.subject (Keywords) 動態社群網絡 (zh_TW)
dc.subject (Keywords) 概念飄移 (zh_TW)
dc.subject (Keywords) Label Prediction (en_US)
dc.subject (Keywords) Dynamic Social Networks (en_US)
dc.subject (Keywords) Concept Drifting (en_US)
dc.title (Title) 具概念飄移的動態社群網絡之類別預測 (zh_TW)
dc.title (Title) Label Prediction on Dynamic Social Networks with Concept Drifting (en_US)
dc.type (Type) thesis (en_US)
dc.identifier.doi (DOI) 10.6814/NCCU201901233 (en_US)