Title 基於異質型偏好排序表示法之整合圖結構資訊於改進推薦系統效能
Improving Recommendation Performance via Incorporating Graph Structural Information based on Heterogeneous Preference Embedding
Author 張麒竑
Chang, Chi-Hung
Contributors 蔡銘峰
Tsai, Ming-Feng
Chang, Chi-Hung
Keywords Recommender system
Graph learning
Graph structure
Date 2022
Upload time 2-Dec-2022 15:20:18 (UTC+8)
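Abstract Recommender systems have been under development for more than thirty years. They have progressed from simple best-seller recommendation to methods that draw on information about other users and items, such as traditional collaborative filtering and content-based filtering, then to hybrid methods that combine several approaches, and more recently to a wide range of advanced models built on machine learning and deep learning. Today's advanced models either introduce knowledge graphs or add neural networks. These do improve prediction accuracy, but they require longer training time and more memory, and some information that appears to carry useful signal for prediction may still be ignored and left out of model training. Graph structure is one such potentially helpful source of information, yet it is rarely used in model training.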

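In this thesis, we propose the HPEstruc method. It is inspired by SEAL (Learning from Subgraphs, Embeddings, and Attributes for Link prediction), a model proposed by a Facebook team in 2018, which not only trains with deep learning but also adds graph structural information to the model and achieves quite good results on prediction tasks. We therefore believe that graph structure should have a positive effect on the training and prediction of recommendation models. Accordingly, we choose Heterogeneous Preference Embedding (HPE) as the recommendation model and use struc2vec, which converts graph structure into vector representations, to extract graph structural information. We add the extracted information to HPE training and examine whether graph structure helps the model's training and prediction. The experimental results show that HPEstruc achieves better prediction accuracy than the original HPE model on social network datasets, demonstrating that graph structure helps recommendation prediction on social network data.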

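Besides examining whether adding graph structural information improves the recommendation model's predictions, we also compare the results with those of widely used deep learning models and with SEAL, a more advanced model with better prediction results, and we discuss in depth the differences between the approaches taken by HPEstruc and SEAL.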
Recommender systems have been under development for about 30 years. Beyond traditional methods such as collaborative filtering and content-based filtering, hybrid methods have also been proposed. In recent years, more advanced methods based on deep learning have appeared, some of which incorporate knowledge graphs for better prediction. These models, however, rarely exploit the structure of the graph itself, so leveraging graph structural information is a potentially crucial research direction.

In this thesis, we propose a training method named HPEstruc, which is inspired by a model named SEAL (Learning from Subgraphs, Embeddings, and Attributes for Link prediction). SEAL uses deep learning for training, adds graph structural information to the model, and obtains good results on prediction tasks. We therefore believe that graph structural information can positively influence the training and prediction of a recommender model. We choose Heterogeneous Preference Embedding (HPE) as our training model and use struc2vec, which converts graph structure into embeddings, to extract the graph structural information. We then add the extracted information to the HPE model and examine whether structural information helps training and prediction. The results show that HPEstruc achieves better prediction accuracy than the original HPE model on social network datasets. This thesis thus shows that graph structure is helpful for prediction on this type of recommendation problem.

Furthermore, we compare the results of HPEstruc with those of widely used deep learning models. We also discuss in detail the differences between HPEstruc and SEAL, a state-of-the-art model.
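
The abstract does not say how the struc2vec output enters HPE's sampling step. As an illustration only, the sketch below assumes that structural embeddings are already available for every node and uses them to bias a weighted random walk toward structurally similar neighbors. The function names, the toy graph, the embedding values, and the weighting rule `edge_weight * (1 + cosine) / 2` are all hypothetical and are not taken from the thesis.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def transition_weights(graph, struct_emb, node):
    """Scale each edge weight by a structural-similarity factor in [0, 1].

    Assumed rule (not from the thesis): edge_weight * (1 + cosine) / 2.
    A tiny constant keeps every neighbor reachable.
    """
    weights = {}
    for nbr, edge_weight in graph[node].items():
        sim = cosine(struct_emb[node], struct_emb[nbr])
        weights[nbr] = edge_weight * (1.0 + sim) / 2.0 + 1e-9
    return weights


def biased_walk(graph, struct_emb, start, steps, rng):
    """Walk `steps` edges from `start`, sampling neighbors by the biased weights."""
    walk = [start]
    for _ in range(steps):
        weights = transition_weights(graph, struct_emb, walk[-1])
        if not weights:  # dead end: no outgoing edges
            break
        nbrs = list(weights)
        probs = [weights[n] for n in nbrs]
        walk.append(rng.choices(nbrs, weights=probs, k=1)[0])
    return walk


# Toy undirected user-item graph stored as adjacency dicts (hypothetical data).
toy_graph = {
    "u1": {"i1": 1.0, "i2": 1.0},
    "u2": {"i1": 1.0},
    "i1": {"u1": 1.0, "u2": 1.0},
    "i2": {"u1": 1.0},
}
# Stand-in structural embeddings; in practice these would come from struc2vec.
toy_emb = {
    "u1": [1.0, 0.0],
    "u2": [0.8, 0.6],
    "i1": [1.0, 0.0],
    "i2": [0.0, 1.0],
}

if __name__ == "__main__":
    print(biased_walk(toy_graph, toy_emb, "u1", steps=5, rng=random.Random(0)))
```

Under these assumptions, the sampled walks could be fed to an embedding model as node-context pairs. A multiplicative factor was chosen here so that the original edge weights still matter; an additive mix or a hard similarity threshold would be equally plausible readings.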
dc.description.tableofcontents Acknowledgements
Chinese Abstract
Abstract

Chapter 1 Introduction
Chapter 2 Related Work
2.1 Collaborative Filtering
2.1.1 k-Nearest Neighbors (kNN)
2.1.2 Matrix Factorization
2.2 Content-Based Filtering
2.2.1 TF-IDF (Term Frequency-Inverse Document Frequency)
2.2.2 Okapi BM25
2.3 Hybrid Method
2.4 Network Embedding for Recommender Systems
2.4.1 Representation Learning
2.4.2 Network Embedding
2.4.3 Convolutional Neural Network
2.4.4 Graph Convolutional Network
2.4.5 Learning from Subgraphs, Embeddings, and Attributes for Link prediction (SEAL)
2.5 Knowledge Graph
2.5.1 Collaborative Knowledge Base Embedding for Recommender Systems (CKE)
2.5.2 Propagating User Preferences on the Knowledge Graph for Recommender Systems (RippleNet)
Chapter 3 Methodology
3.1 Struc2vec
3.1.1 Computing Structural Similarity
3.1.2 Constructing the Context
3.1.3 Generating Context for Nodes
3.1.4 Learning a Language Model
3.2 Heterogeneous Preference Embedding
3.2.1 Network Construction
3.2.2 Edge Sampling via Weighted Random Walks
3.2.3 Problem-Oriented Modeling with Heterogeneous Preference Embedding
3.3 Random Walks Based on Graph Structural Similarity
Chapter 4 Experimental Results and Discussion
4.1 Datasets
4.2 Experimental Settings
4.3 Experimental Results
4.4 Discussion
4.4.1 Effect of Node Structural Depth on Model Training
4.4.2 Computation Time
4.4.3 Memory Usage
4.4.4 Comparison With and Without Graph Structural Information
4.4.5 Comparison with the SEAL Model
Chapter 5 Conclusion
5.1 Conclusion
References
