Title Statistical analysis of online rating data (網路評比資料之統計分析)
Author 張孫浩
Contributors Weng, Chui Hsing (翁久幸)
張孫浩
Keywords online rating
recommender system
IRT model-based method
correlation-coefficient method
matrix factorization
Date 2010
Upload time 5-Sep-2013 15:12:30 (UTC+8)
Abstract With the growth of the internet, websites are flooded with all kinds of information and products. When users search for information or shop online, some websites offer a recommender system that surfaces related information or products. If the recommendations match users' habits, users come to trust the system more, so whether a recommender system can accurately predict users' preferences is an important question. This study uses two data sets, "Mondo" and "MovieLens", and analyzes and compares three methods from the related literature: the IRT model-based method, the correlation-coefficient method, and matrix factorization.

The empirical analysis leads to the following conclusions:
1. The IRT model-based method predicts less accurately than the other two methods, but the model explains the relationships among variables and supports graphical displays of predicted probabilities, so it still has value.
2. The correlation-coefficient method often cannot produce a prediction because the ratings are sparse. It could be combined with a content-based recommendation approach.
3. Matrix factorization predicts better than the IRT model-based method, but the factor values are only the result of an optimization, so the meaning of the factors and their values cannot really be interpreted.
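The second conclusion, that the correlation-coefficient method can fail to predict under sparsity, can be illustrated with a small sketch. This record does not include the thesis's own implementation, so the code below assumes a GroupLens-style neighbor prediction in the spirit of Resnick et al. (1994): a user's mean rating plus a Pearson-weighted average of other users' deviations from their own means. The toy `ratings` data, the user and item names, and the function names `pearson` and `predict` are all hypothetical.

```python
import math

# Hypothetical toy data: user -> {item: rating}. Not taken from Mondo or MovieLens.
ratings = {
    "ann": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 2, "c": 3, "d": 4},
    "cat": {"d": 2, "e": 4},
    "dan": {"a": 1, "e": 5},
}


def pearson(ratings_u, ratings_v):
    """Pearson correlation over co-rated items, or None when it is undefined."""
    common = sorted(set(ratings_u) & set(ratings_v))
    if len(common) < 2:
        return None  # too few co-rated items: the sparsity problem
    xs = [ratings_u[i] for i in common]
    ys = [ratings_v[i] for i in common]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # constant ratings: correlation is undefined
    return sxy / math.sqrt(sxx * syy)


def predict(all_ratings, user, item):
    """Predict a rating from correlated neighbors, or None if none are usable."""
    own = all_ratings[user]
    mean_u = sum(own.values()) / len(own)
    num = 0.0
    den = 0.0
    for other, r in all_ratings.items():
        if other == user or item not in r:
            continue
        w = pearson(own, r)
        if w is None:
            continue
        mean_v = sum(r.values()) / len(r)
        num += w * (r[item] - mean_v)
        den += abs(w)
    if den == 0:
        return None  # no neighbor with a defined, non-zero correlation
    return mean_u + num / den


print(predict(ratings, "ann", "d"))  # one usable neighbor ("bob")
print(predict(ratings, "ann", "e"))  # no usable neighbor, so no prediction
```

In this toy data, "ann" shares three rated items with "bob", so item "d" can be predicted. The only users who rated item "e" share at most one item with "ann", so the correlation is undefined and the function returns `None`. A content-based fallback, as the abstract suggests, would be one way to handle that case.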
References Agresti, A., "An Introduction to Categorical Data Analysis," Wiley-Interscience.
Cheung, K., Tsui, K., and Liu, J. (2004), "Extended Latent Class Models for Collaborative Recommendation," IEEE Transactions on Systems, Man & Cybernetics: Part A, Jan 2004, Vol. 34, Issue 1, pp. 143-148.
Conry, D. C. (2009), "Recommender Systems for the Conference Paper Assignment Problem," thesis submitted to the Faculty of the Virginia Polytechnic Institute and State University.
Ho, D. E., and Quinn, K. M. (2008), "Improving the Presentation and Interpretation of Online Ratings Data with Model-Based Figures," The American Statistician, Nov 2008, Vol. 62, Issue 4, pp. 279-288.
Kagie, M., Loos, M., and Wezel, M. (2009), "Including item characteristics in the probabilistic latent semantic analysis model for collaborative filtering," AI Communications, Vol. 22, pp. 249-265.
Konstan, J. A., Miller, B. N., Maltz, D., Herlocker, J. L., Gordon, L. R., and Riedl, J. (1997), "GroupLens: Applying Collaborative Filtering to Usenet News," Communications of the ACM, Mar 1997, Vol. 40, Issue 3, pp. 77-87.
Koren, Y., Bell, R., and Volinsky, C. (2009), "Matrix Factorization Techniques for Recommender Systems," IEEE Computer Society, Aug 2009, Vol. 42, Issue 8, pp. 42-49.
Koren, Y. (2010), "Collaborative Filtering with Temporal Dynamics," Communications of the ACM, Apr 2010, Vol. 53, Issue 4, pp. 89-98.
Li, W., Lee, K., and Leung, K. (2006), "Generalized Regularized Least-Squares Learning with Predefined Features in a Hilbert Space," Neural Information Processing Systems - NIPS, pp. 881-888.
Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., and Riedl, J. (1994), "GroupLens: An Open Architecture for Collaborative Filtering of Netnews," Proceedings of ACM 1994 Conference on Computer Supported Cooperative Work, Chapel Hill, pp. 175-186.
Weisberg, S., "Applied Linear Regression," Wiley-Interscience.
Williamson, S., and Ghahramani, Z. (2008), "Probabilistic Models for Data Combination in Recommender Systems," in: Learning from Multiple Sources Workshop, 8-12 December 2008, Vancouver and Whistler, British Columbia, Canada.
Zhou, H., and Lange, K. (2009), "Rating Movies and Rating the Raters Who Rate Them," The American Statistician, pp. 297-307.
馮文正 (2001), "Collaborative Website Recommender System" (合作式網站推薦系統), master's thesis, Institute of Computer and Information Science, National Chiao Tung University.
吳肇銘 (2004), "An Adaptive Recommender System Based on Consumer Purchase Decisions" (以消費者購買決策為基礎之適性化推薦系統), master's thesis, Department of Information Management, Chung Yuan Christian University.
Amazon. Retrieved Nov, 2010, from http://www.amazon.com
Mondo Times. Retrieved Nov, 2010, from http://www.mondotimes.com
Netflix. Retrieved Nov, 2010, from http://www.netflix.com
PC Magazine. Retrieved Nov, 2010, from http://www.pcmag.com
TiVo Taiwan website. Retrieved Nov, 2010, from http://www.tgc-taiwan.com.tw/index.php
Description Master's thesis
National Chengchi University
Graduate Institute of Statistics
98354010
99
Source http://thesis.lib.nccu.edu.tw/record/#G0098354010
Type thesis
dc.contributor.advisor Weng, Chui Hsing (翁久幸)
dc.identifier (Other Identifiers) G0098354010
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/60440
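dc.description.tableofcontents Chapter 1 Introduction
1.1 Research background
1.1.1 Overview of recommender systems
1.1.2 Flaws in how online ratings are currently presented
1.2 Research objectives
Chapter 2 Literature review
2.1 IRT model-based method
2.2 Correlation-coefficient prediction method
2.3 Matrix factorization theory
Chapter 3 Methodology
3.1 IRT model-based method
3.2 Correlation-coefficient prediction method
3.3 Matrix factorization method
Chapter 4 Empirical study
4.1 Empirical data
4.2 Analysis with the IRT model-based method
4.2.1 Whether beta_r should be greater than 0
4.2.2 Mondo prediction results
4.2.3 MovieLens prediction results
4.2.4 Improving the IRT model-based method
4.3 Analysis with the correlation-coefficient method
4.3.1 Mondo prediction results
4.3.2 MovieLens prediction results
4.4 Analysis with the matrix factorization method
4.4.1 Mondo prediction results
4.4.2 MovieLens prediction results
4.5 Comparison of prediction results
Chapter 5 Conclusions and recommendations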
dc.format.extent 1462105 bytes
dc.format.mimetype application/pdf
dc.language.iso en_US