Title 矩陣分解法與隨機效應模型法應用於電影評分資料分析比較
Application of Matrix Factorization and Random Effect Model to analysis and comparison of movie rating data
Author 周鼎智
Chou, Ting Chih
Contributors 翁久幸
Weng, Chiu Hsing
周鼎智
Chou, Ting Chih
Keywords Recommender system
Random effects model
Matrix factorization
MovieLens
Date 2018
Upload time 2-May-2018 15:46:18 (UTC+8)
Abstract Recommender systems (RS) emerged to address the problem of information overload. Demand for them has grown with advances in technology and the spread of the Internet, and the related techniques have become increasingly diverse and mature. Statistical models, which are widely used across many fields, are among these techniques.

An RS relies on user preference information, and the user-by-item preference space is often very large and unbalanced. Statistically, describing such a structure requires relatively complex random effects or mixed effects models, whose parameters are usually estimated by computationally inefficient iterative procedures. Perry (2014) and Gao & Owen (2016) therefore proposed moment-based methods for hierarchical linear models and two-factor random effects models, respectively, an approach that sacrifices statistical efficiency in exchange for computational efficiency.

In this study, we fit the random effects model with parameters estimated by both the maximum likelihood method and the moment-based method, and compare it with matrix factorization, which is likewise a collaborative filtering technique. We evaluate the recommendation performance of each algorithm on the MovieLens data in terms of prediction accuracy and computational efficiency.

According to the experimental results, the random effects model is less accurate in prediction than matrix factorization regardless of the parameter estimation method used. However, moment-based estimation is comparable to matrix factorization in prediction stability and is much more computationally efficient.
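The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the thesis code, and the averaging estimator below is a deliberately simplified stand-in for moment-style estimation, not the estimators of Perry (2014) or Gao & Owen (2016). The toy ratings, the function names, and all hyperparameters (`rank`, `lr`, `reg`, `epochs`) are assumptions made for illustration. The additive model `rating = mu + a_u + b_i + noise` is fitted with a fixed number of passes over the data, while matrix factorization is fitted by iterative stochastic gradient descent.

```python
import math
import random

# Toy (user, item, rating) triples; hypothetical data, not taken from MovieLens.
RATINGS = [
    (0, 0, 5.0), (0, 1, 3.0), (0, 2, 4.0),
    (1, 0, 4.0), (1, 1, 2.0),
    (2, 1, 1.0), (2, 2, 2.0),
    (3, 0, 5.0), (3, 2, 5.0),
]


def rmse(ratings, predict):
    """Root mean squared error of predict(user, item) over the given ratings."""
    return math.sqrt(sum((r - predict(u, i)) ** 2 for u, i, r in ratings) / len(ratings))


def fit_additive_effects(ratings):
    """Non-iterative estimates of mu, user effects a_u, and item effects b_i.

    Simplified stand-in (assumption): plain averages of residuals, computed
    in a fixed number of passes, with no variance-component estimation.
    """
    mu = sum(r for _, _, r in ratings) / len(ratings)
    user_sum, user_cnt = {}, {}
    for u, _, r in ratings:
        user_sum[u] = user_sum.get(u, 0.0) + (r - mu)
        user_cnt[u] = user_cnt.get(u, 0) + 1
    a = {u: user_sum[u] / user_cnt[u] for u in user_sum}
    item_sum, item_cnt = {}, {}
    for u, i, r in ratings:
        item_sum[i] = item_sum.get(i, 0.0) + (r - mu - a[u])
        item_cnt[i] = item_cnt.get(i, 0) + 1
    b = {i: item_sum[i] / item_cnt[i] for i in item_sum}
    return mu, a, b


def fit_matrix_factorization(ratings, n_users, n_items, rank=2,
                             lr=0.05, reg=0.02, epochs=300, seed=0):
    """Iterative SGD fit of rating ~ mu + p_u . q_i with L2 regularization."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)
    p = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_users)]
    q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - mu - sum(pk * qk for pk, qk in zip(p[u], q[i]))
            for k in range(rank):
                pu, qi = p[u][k], q[i][k]  # keep old values for a simultaneous update
                p[u][k] += lr * (err * qi - reg * pu)
                q[i][k] += lr * (err * pu - reg * qi)
    return mu, p, q


mu, a, b = fit_additive_effects(RATINGS)
mf_mu, p, q = fit_matrix_factorization(RATINGS, n_users=4, n_items=3)

baseline_rmse = rmse(RATINGS, lambda u, i: mu)
additive_rmse = rmse(RATINGS, lambda u, i: mu + a[u] + b[i])
mf_rmse = rmse(RATINGS, lambda u, i: mf_mu + sum(pk * qk for pk, qk in zip(p[u], q[i])))

print("global mean RMSE:", round(baseline_rmse, 4))
print("additive effects RMSE:", round(additive_rmse, 4))
print("matrix factorization RMSE:", round(mf_rmse, 4))
```

These are training errors on a nine-rating toy set, so they show only the mechanics of the two fits and say nothing about the accuracy or timing results reported in the thesis.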
References Ansari, A., Essegaier, S., & Kohli, R. (2000). Internet recommendations systems. J. Marketing Research 37, 363–375.
Bates, D., Maechler, M., Bolker, B., Walker, S., Christensen, R. H. B., Singmann, H., Dai, B., Grothendieck, G., & Green, P. (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1–48.
Bates, D., Maechler, M., Bolker, B., Walker, S., Christensen, R. H. B., Singmann, H., Dai, B., Grothendieck, G., & Green, P. (2017). lme4: Linear Mixed-Effects Models using `Eigen` and S4. R package version 1.1-14. https://cran.r-project.org/web/packages/lme4/index.html
Chin, W. S., Zhuang, Y., Juan, Y. C., & Lin, C. J. (2015). A fast parallel stochastic gradient method for matrix factorization in shared memory systems. ACM Transactions on Intelligent Systems and Technology (TIST), 6(1), 2.
Condliff, M. K., Lewis, D. D., Madigan, D., & Posse, C. (1999). Bayesian mixed-effects models for recommender systems. In Proceedings of the ACM SIGIR ’99 Workshop on Recommender Systems: Algorithms and Evaluation.
Funk, S. (2006, Dec.). Netflix Update: Try This at Home. http://sifter.org/~simon/journal/20061211.html.
Gao, K. & Owen, A. (2016). Efficient Moment Calculations for Variance Components in Large Unbalanced Crossed. Electron. J. Statist. Volume 11, Number 1, 2017, 1235-1296.
Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix factorization techniques for recommender systems. Computer, 42(8).
Matloff, N. S. (2017). rectools: Advanced Package for Recommender Systems. R package. https://github.com/Pooja-Rajkumar/rectools
Perry, P. (2014). Fast Moment-Based Estimation for Hierarchical Models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79(1), 267-291.
Qiu, Y., Lin, C. J., Juan, Y. C., Chin, W. S., Zhuang, Y., Yuan, B. W., Yang, M. Y., et al. (2017). recosystem: Recommender System using Matrix Factorization. R package version 0.4.2. https://cran.r-project.org/web/packages/recosystem/index.html
Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., & Riedl, J. (1994). GroupLens: an open architecture for collaborative filtering of netnews. In Proceedings of the 1994 ACM conference on Computer supported cooperative work (pp. 175-186). ACM.
Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2001). Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th international conference on World Wide Web (pp. 285-295). ACM.
Description Master's
National Chengchi University
Department of Statistics
104354026
Source http://thesis.lib.nccu.edu.tw/record/#G0104354026
Type thesis
dc.contributor.advisor 翁久幸
dc.contributor.advisor Weng, Chiu Hsing
dc.contributor.author (Authors) 周鼎智
dc.contributor.author (Authors) Chou, Ting Chih
dc.creator (Author) 周鼎智
dc.creator (Author) Chou, Ting Chih
dc.date (Date) 2018
dc.date.accessioned 2-May-2018 15:46:18 (UTC+8)
dc.date.available 2-May-2018 15:46:18 (UTC+8)
dc.date.issued (Upload time) 2-May-2018 15:46:18 (UTC+8)
dc.identifier (Other Identifiers) G0104354026
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/117016
dc.description (Description) Master's
dc.description (Description) National Chengchi University
dc.description (Description) Department of Statistics
dc.description (Description) 104354026
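dc.description.tableofcontents Chapter 1 Introduction
1.1 Research Background
1.1.1 Overview of Recommender Systems
1.1.2 Problems Facing Recommender Systems
1.2 Research Objectives
Chapter 2 Literature Review
2.1 Nearest-Neighbor Methods
2.2 Matrix Factorization
2.3 Statistical Model Methods
Chapter 3 Methodology
3.1 Matrix Factorization
3.2 Random Effects Model
3.2.1 Maximum Likelihood Method
3.2.2 Moment-Based Method
Chapter 4 Empirical Study
4.1 Data Description
4.2 Experimental Procedure
4.3 Analysis of Results
4.3.1 MovieLens 100K Dataset
4.3.2 MovieLens 1M Dataset
Chapter 5 Conclusions and Suggestions
References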
dc.format.extent 1507963 bytes
dc.format.mimetype application/pdf
dc.source.uri (Source) http://thesis.lib.nccu.edu.tw/record/#G0104354026
dc.subject (Keywords) Recommender system
dc.subject (Keywords) Random effects model
dc.subject (Keywords) Matrix factorization
dc.subject (Keywords) MovieLens
dc.title (Title) 矩陣分解法與隨機效應模型法應用於電影評分資料分析比較
dc.title (Title) Application of Matrix Factorization and Random Effect Model to analysis and comparison of movie rating data
dc.type (Type) thesis