Reinforcement learning based live streaming e-commerce recommender system (基於增強學習的直播電商推薦系統)


Title	基於增強學習的直播電商推薦系統 / Reinforcement learning based live streaming e-commerce recommender system
Author	唐思琪 Tang, Szu-Chi
Contributors	林怡伶 Ling, Yi-Lin; 唐思琪 Tang, Szu-Chi
Keywords	直播電商 (live streaming e-commerce); 推薦系統 (recommender system); 強化學習 (reinforcement learning); 探索與利用之權衡 (exploration-exploitation trade-off); 神經網路 (neural networks); Live commerce; Live streaming; E-commerce; Recommender system; Recommendation system; Multi-armed bandit; Reinforcement learning; User context; Uncertainty; Exploitation-exploration trade-off; Gated Recurrent Unit; Variational Autoencoder; Bayesian neural networks
Date	2022
Upload time	1-Aug-2022 17:20:44 (UTC+8)
Abstract	In recent years, live streaming e-commerce has received extensive attention from e-commerce businesses and streaming platforms. Unlike traditional online shopping and one-way TV shopping, live streaming e-commerce emphasizes real-time interaction between users and streamers. Because starting a live stream is cheap, streamers broadcast frequently and new products roll out continuously, which creates a complex and fast-changing user context. Recommender systems play a crucial role in helping users find information and make decisions under information overload. Previous recommender systems mainly focus on optimizing accuracy, which creates filter bubbles and, because similar products are recommended again and again, leads to customer dissatisfaction and high churn rates in the long run. To balance the exploration-exploitation (EE) trade-off in this dynamic recommendation context, this research formulates the problem as a contextual multi-armed bandit problem. The study provides a reinforcement learning (RL)-based solution for a new business scenario (live streaming e-commerce) that addresses the three relationships among customers, streamers, and products in both static and temporal user contexts. We use a Gated Recurrent Unit (GRU), a type of recurrent neural network, to model changes over time in users' preferences for streamers and products while maintaining their long-term engagement. By encoding uncertainty in neural networks, with a Variational Autoencoder (VAE) for user modeling and a Bayesian Neural Network (BNN) for product recommendation, the proposed Live E-commerce Recommender System (LERS) can control the balance of the EE trade-off. To the best of our knowledge, our study is the first to propose a neural network-based contextual bandit algorithm for the recommendation problem on live streaming e-commerce platforms. We compared our algorithm with classic multi-armed bandit algorithms including UCB1, LinUCB, Exp3, and NeuralUCB. Preliminary experimental results on real-world data corroborate our theory and shed light on potential applications of our algorithm to real-world business problems.
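Description	Master's degree (碩士); National Chengchi University (國立政治大學); Department of Management Information Systems (資訊管理學系); 109356002
Source	http://thesis.lib.nccu.edu.tw/record/#G0109356002
Type	thesis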
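
Illustrative note: the abstract frames recommendation as a bandit problem and names UCB1 among the classic baselines. The following is a minimal Python sketch of UCB1, assuming Bernoulli click rewards. The names `UCB1`, `simulate`, and `click_probs` are hypothetical and chosen only for illustration. It is not the thesis's LERS model or its experimental code.

```python
import math
import random


class UCB1:
    """Classic (context-free) UCB1 bandit; an illustrative sketch only."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms    # number of times each arm was recommended
        self.values = [0.0] * n_arms  # running mean reward of each arm

    def select_arm(self):
        # Try every arm once before using confidence bounds.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        # Score = estimated reward (exploitation) + uncertainty bonus (exploration).
        scores = [
            self.values[a] + math.sqrt(2.0 * math.log(total) / self.counts[a])
            for a in range(len(self.counts))
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the arm's mean reward.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(click_probs, rounds, seed=0):
    """Run UCB1 against hypothetical Bernoulli click probabilities."""
    rng = random.Random(seed)
    policy = UCB1(len(click_probs))
    for _ in range(rounds):
        arm = policy.select_arm()
        reward = 1.0 if rng.random() < click_probs[arm] else 0.0
        policy.update(arm, reward)
    return policy
```

For example, `simulate([0.1, 0.5, 0.9], rounds=2000).counts` shows pulls concentrating on the third arm while the other arms are still tried occasionally, which is the exploration-exploitation balance the abstract refers to. A contextual method such as LinUCB additionally conditions each arm's score on user features.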

dc.contributor.advisor	林怡伶	zh_TW
dc.contributor.advisor	Ling, Yi-Lin	en_US
dc.contributor.author (Authors)	唐思琪	zh_TW
dc.contributor.author (Authors)	Tang, Szu-Chi	en_US
dc.creator (Author)	唐思琪	zh_TW
dc.creator (Author)	Tang, Szu-Chi	en_US
dc.date (Date)	2022	en_US
dc.date.accessioned	1-Aug-2022 17:20:44 (UTC+8)	-
dc.date.available	1-Aug-2022 17:20:44 (UTC+8)	-
dc.date.issued (Upload time)	1-Aug-2022 17:20:44 (UTC+8)	-
dc.identifier (Other Identifiers)	G0109356002	en_US
dc.identifier.uri (URI)	http://nccur.lib.nccu.edu.tw/handle/140.119/141029	-
dc.description (Description)	Master's degree (碩士)	zh_TW
dc.description (Description)	National Chengchi University (國立政治大學)	zh_TW
dc.description (Description)	Department of Management Information Systems (資訊管理學系)	zh_TW
dc.description (Description)	109356002	zh_TW
dc.description.tableofcontents	Acknowledgements; Abstract (Chinese); Abstract; Contents; List of Figures; List of Tables; 1 Introduction; 2 Related Work; 2.1 Live Streaming E-commerce; 2.2 Recommender Systems; 2.3 Live Streaming Recommender System; 2.4 Contextual Multi-armed Bandit Methods; 2.5 Uncertainty Modeling; 3 The Proposed Framework; 3.1 Problem Definition; 3.2 Framework Overview; 3.3 Gated Recurrent Unit Networks in Temporal Context Model; 3.4 Variational Autoencoder for Blurry Context; 3.5 Bayesian Neural Networks for Exploring Product Recommendation; 3.6 Training Procedure; 4 Experiments; 4.1 Datasets; 4.2 Implementation Environment; 4.3 Customer Context Features; 4.3.1 Static Context Features; 4.3.2 Customer-Product Context Features; 4.3.3 Customer-Streamer Context Features; 4.4 Temporal Context Modeling; 4.4.1 RNN-based Models for Temporal Context; 4.4.2 Identify the Appropriate Sequence Length of Temporal Context; 4.5 Full Context Analysis; 4.6 Dimension Reduction Analysis; 4.7 Production Recommendation Analysis; 4.7.1 Evaluation Metrics; 4.7.2 Experiment Dataset; 4.7.3 Recommendation Context for Product Recommendation; 4.7.4 Temporal Context for Product Recommendation; 4.7.5 End-to-End Live E-commerce Recommender System; 4.8 Algorithm Comparison Experiments; 4.8.1 Experiments Settings; 4.8.2 Normal Dataset; 4.8.3 Active Dataset; 4.8.4 Repeat Dataset; 5 Discussion; 5.1 Offline Environment; 5.2 Feature Enrichment; 5.3 Context Engineering; 5.4 Neural Network; 6 Conclusion; References	zh_TW
dc.format.extent	2478692 bytes	-
dc.format.mimetype	application/pdf	-
dc.source.uri (Source)	http://thesis.lib.nccu.edu.tw/record/#G0109356002	en_US
dc.subject (Keywords)	Live streaming e-commerce (直播電商)	zh_TW
dc.subject (Keywords)	Recommender system (推薦系統)	zh_TW
dc.subject (Keywords)	Reinforcement learning (強化學習)	zh_TW
dc.subject (Keywords)	Exploration-exploitation trade-off (探索與利用之權衡)	zh_TW
dc.subject (Keywords)	Neural networks (神經網路)	zh_TW
dc.subject (Keywords)	Live commerce	en_US
dc.subject (Keywords)	Live streaming	en_US
dc.subject (Keywords)	E-commerce	en_US
dc.subject (Keywords)	Recommender system	en_US
dc.subject (Keywords)	Recommendation system	en_US
dc.subject (Keywords)	Multi-armed bandit	en_US
dc.subject (Keywords)	Reinforcement learning	en_US
dc.subject (Keywords)	User context	en_US
dc.subject (Keywords)	Uncertainty	en_US
dc.subject (Keywords)	Exploitation-exploration trade-off	en_US
dc.subject (Keywords)	Gated Recurrent Unit	en_US
dc.subject (Keywords)	Variational Autoencoder	en_US
dc.subject (Keywords)	Bayesian neural networks	en_US
dc.title (Title)	基於增強學習的直播電商推薦系統	zh_TW
dc.title (Title)	Reinforcement learning based live streaming e-commerce recommender system	en_US
dc.type (Type)	thesis	en_US
dc.identifier.doi (DOI)	10.6814/NCCU202201098	en_US
