Title: Application of Matrix Factorization and Random Effect Model to analysis and comparison of movie rating data
(original title: 矩陣分解法與隨機效應模型法應用於電影評分資料分析比較)
Author: Chou, Ting Chih (周鼎智)
Contributors: Weng, Chiu Hsing (翁久幸)
Chou, Ting Chih (周鼎智)
Keywords: Recommender system
Random effects model
Matrix factorization
MovieLens
Date: 2018
Upload time: 2 May 2018 15:46:18 (UTC+8)
Abstract:
Recommender systems (RS) emerged to address the problem of information overload. Demand for them has grown with advances in technology and the spread of the Internet, and the related techniques have become more diverse and mature. Statistical models, which are widely used in many fields, are among these techniques.

An RS relies on user preference information, and the preference space formed by users and items is often very large and unbalanced. Statistically, relatively complex random effects models or mixed effects models are needed to describe such a structure, and their parameters usually have to be estimated by computationally inefficient iterative procedures. Perry (2014) and Gao & Owen (2016) therefore proposed moment-based methods for hierarchical linear models and two-factor random effects models, respectively, trading statistical efficiency for computational efficiency.

In this study, we fit the random effects model with parameters estimated by both the maximum likelihood method and the moment-based method, and compare it with matrix factorization, which likewise takes a collaborative filtering perspective. The recommendation performance of each algorithm on the MovieLens data is evaluated in terms of prediction accuracy and computational efficiency.

According to the experimental results, the random effects model is less accurate in prediction than matrix factorization whichever parameter estimation method is used. However, moment-based estimation is comparable to matrix factorization in prediction stability and much better in computational efficiency.

References:
Ansari, A., Essegaier, S., & Kohli, R. (2000). Internet recommendation systems. Journal of Marketing Research, 37, 363–375.
Bates, D., Maechler, M., Bolker, B., Walker, S., Christensen, R. H. B., Singmann, H., Dai, B., Grothendieck, G., & Green, P. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48.
Bates, D., Maechler, M., Bolker, B., Walker, S., Christensen, R. H. B., Singmann, H., Dai, B., Grothendieck, G., & Green, P. (2017). lme4: Linear Mixed-Effects Models using `Eigen` and S4. R package version 1.1-14. https://cran.r-project.org/web/packages/lme4/index.html
Chin, W. S., Zhuang, Y., Juan, Y. C., & Lin, C. J. (2015). A fast parallel stochastic gradient method for matrix factorization in shared memory systems. ACM Transactions on Intelligent Systems and Technology (TIST), 6(1), 2.
Condliff, M. K., Lewis, D. D., Madigan, D., & Posse, C. (1999). Bayesian mixed-effects models for recommender systems. In Proceedings of the ACM SIGIR '99 Workshop on Recommender Systems: Algorithms and Evaluation.
Funk, S. (2006, Dec.). Netflix Update: Try This at Home. http://sifter.org/~simon/journal/20061211.html
Gao, K., & Owen, A. (2016). Efficient moment calculations for variance components in large unbalanced crossed random effects models. Electronic Journal of Statistics, 11(1), 2017, 1235–1296.
Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix factorization techniques for recommender systems. Computer, 42(8).
Matloff, N. S. (2017). rectools: Advanced Package for Recommender Systems. R package. https://github.com/Pooja-Rajkumar/rectools
Perry, P. (2014). Fast moment-based estimation for hierarchical models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79(1), 267–291.
Qiu, Y., Lin, C. J., Juan, Y. C., Chin, W. S., Zhuang, Y., Yuan, B. W., Yang, M. Y., et al. (2017). recosystem: Recommender System using Matrix Factorization. R package version 0.4.2. https://cran.r-project.org/web/packages/recosystem/index.html
Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., & Riedl, J. (1994). GroupLens: An open architecture for collaborative filtering of netnews. In Proceedings of the 1994 ACM Conference on Computer Supported Cooperative Work (pp. 175–186). ACM.
Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2001). Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th International Conference on World Wide Web (pp. 285–295). ACM.
Description: Master's thesis
National Chengchi University
Department of Statistics
104354026
Source: http://thesis.lib.nccu.edu.tw/record/#G0104354026
Type: thesis
Identifier: G0104354026
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/117016
Format: application/pdf, 1507963 bytes
Table of contents:
Chapter 1 Introduction
1.1 Research Background
1.1.1 Overview of Recommender Systems
1.1.2 Problems Facing Recommender Systems
1.2 Research Objectives
Chapter 2 Literature Review
2.1 Nearest-Neighbor Methods
2.2 Matrix Factorization
2.3 Statistical Model Methods
Chapter 3 Methodology
3.1 Matrix Factorization
3.2 Random Effects Model
3.2.1 Maximum Likelihood Method
3.2.2 Moment-Based Method
Chapter 4 Empirical Study
4.1 Data Description
4.2 Experimental Procedure
4.3 Analysis of Results
4.3.1 MovieLens 100K Dataset
4.3.2 MovieLens 1M Dataset
Chapter 5 Conclusions and Suggestions
References
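
Illustrative note: the abstract compares an additive random effects view of ratings with matrix factorization. The Python sketch below is only a rough illustration of that contrast and is not the thesis's code (its reference list points to R packages such as lme4 and recosystem). Everything in it is an assumption made for illustration: the toy ratings, the function names, the hyperparameters, and in particular the use of plain residual means for the user and item effects, which is a crude stand-in for the maximum likelihood and moment-based estimators discussed in the abstract.

```python
import math
import random

# Toy (user, item, rating) triples, invented for illustration; not MovieLens data.
RATINGS = [
    (0, 0, 5.0), (0, 1, 3.0), (0, 2, 4.0),
    (1, 0, 4.0), (1, 2, 5.0),
    (2, 1, 2.0), (2, 2, 3.0),
    (3, 0, 1.0), (3, 1, 1.0),
]


def rmse(predict, data):
    """Root mean squared error of predict(user, item) over rating triples."""
    return math.sqrt(sum((r - predict(u, i)) ** 2 for u, i, r in data) / len(data))


def group_means(pairs):
    """Mean of the values attached to each key in (key, value) pairs."""
    groups = {}
    for key, value in pairs:
        groups.setdefault(key, []).append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


def fit_global_mean(data):
    """Baseline that predicts the overall mean rating for every pair."""
    mu = sum(r for _, _, r in data) / len(data)
    return lambda u, i: mu


def fit_additive(data):
    """Additive model r_ui = mu + a_u + b_i + e_ui.

    ASSUMPTION: a_u and b_i are plain residual means, not the maximum
    likelihood or moment-based estimates named in the abstract.
    """
    mu = sum(r for _, _, r in data) / len(data)
    a = group_means((u, r - mu) for u, _, r in data)
    b = group_means((i, r - mu - a[u]) for u, i, r in data)
    return lambda u, i: mu + a.get(u, 0.0) + b.get(i, 0.0)


def fit_mf(data, n_factors=2, lr=0.02, reg=0.05, epochs=500, seed=0):
    """Matrix factorization r_ui ~ mu + p_u . q_i, fitted by SGD with an L2 penalty."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in data) / len(data)
    # Small random starting factors for every user and item seen in the data.
    p = {u: [rng.gauss(0.0, 0.1) for _ in range(n_factors)] for u, _, _ in data}
    q = {i: [rng.gauss(0.0, 0.1) for _ in range(n_factors)] for _, i, _ in data}

    def predict(u, i):
        if u not in p or i not in q:
            return mu  # unseen user or item: fall back to the global mean
        return mu + sum(pu * qi for pu, qi in zip(p[u], q[i]))

    for _ in range(epochs):
        for u, i, r in data:
            err = r - predict(u, i)
            for k in range(n_factors):
                pu, qi = p[u][k], q[i][k]
                p[u][k] += lr * (err * qi - reg * pu)
                q[i][k] += lr * (err * pu - reg * qi)
    return predict


if __name__ == "__main__":
    fitters = [("global mean", fit_global_mean),
               ("additive effects", fit_additive),
               ("matrix factorization", fit_mf)]
    for name, fit in fitters:
        print(f"{name}: training RMSE = {rmse(fit(RATINGS), RATINGS):.3f}")
```

The script reports in-sample RMSE on nine made-up ratings, so it says nothing about the accuracy or timing results summarized in the abstract; a held-out split on the MovieLens 100K or 1M data would be needed for that.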