Title: Interactive hashtag recommendation system (互動式主題標籤推薦系統)
Author: Lin, Chun-Ting (林俊廷)
Contributors: Li, Tsai-Yen (李蔡彥)
Lin, Chun-Ting (林俊廷)
Keywords: Recommendation system
Natural language processing
Social media
Hashtag
Date: 2022
Uploaded: 1-Jul-2022 16:21:47 (UTC+8)
Abstract:
With the continued advance of Internet technologies, more and more users share their lives by posting tweets on social media platforms such as Twitter. These tweets are linked by hashtags, which together form discussion topics on social media. However, because most users are not in the habit of using hashtags, a large number of tweets cannot be classified into their corresponding topics immediately, which leaves information in a scattered state. To solve this problem, this thesis proposes an interactive hashtag recommendation system that predicts the topic of an input tweet and interactively recommends relevant hashtags. Based on the user's interactive feedback, the system can provide suitable hashtags at different phases of writing a tweet, help the community reach a consensus on topics, and speed up the convergence of opinions on social media. We conducted user experiments to verify the usability of the recommendation system. The experimental results and user feedback show that the interactive recommendation process can help users find hashtags suited to the topic of a tweet.
Description: Master's thesis
National Chengchi University (國立政治大學)
Department of Computer Science (資訊科學系)
109753208
Source: http://thesis.lib.nccu.edu.tw/record/#G0109753208
Type: thesis
Advisor: Li, Tsai-Yen (李蔡彥)
Other identifier: G0109753208
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/140664
Table of contents:
Acknowledgments
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Expected Contributions
1.4 Thesis Organization
Chapter 2 Related Work
2.1 Natural Language Processing Models
2.2 Topic Models
2.3 Recommendation Systems
2.4 Summary
Chapter 3 Methodology
3.1 Problem Definition
3.2 Research Framework and Design
3.3 Data Collection
3.4 Preparation Phase
3.4.1 Data Preprocessing
3.4.2 Filter Dictionary
3.4.3 Topic Model
3.5 Divergence Phase
3.5.1 Topic Prediction
3.5.2 Probabilistic Model
3.6 Convergence Phase
3.6.1 Language Model
3.6.2 Interactive Recommendation
Chapter 4 System Implementation
4.1 System Architecture
4.1.1 Underlying Architecture
4.1.2 Back-End Architecture
4.1.3 Front-End Architecture
4.2 Interactive Hashtag Recommendation System Interface
4.2.1 Divergence Phase
4.2.2 Convergence Phase
4.3 Hashtag Recommendation System (Baseline) Interface
4.3.1 Initial Interface
4.3.2 Recommendation Interface
4.4 Summary
Chapter 5 Experimental Method
5.1 Experimental Objectives
5.2 Participants
5.3 Experimental Procedure
5.3.1 Offline Experiment
5.3.2 Online Experiment
5.3.3 Questionnaires, Open-Ended Questions, and In-Depth Interviews
5.4 Analysis of Experimental Results
5.4.1 Analysis of Offline Experiment Results
5.4.2 Analysis of the Overall System Evaluation Questionnaire
5.4.3 Analysis of the System Function Evaluation Questionnaire
5.4.4 Analysis of Open-Ended Questions and In-Depth Interviews
Chapter 6 Conclusion and Future Work
6.1 Conclusions
6.2 Future Development and Improvements
6.2.1 Improvements to Topic Recommendation
6.2.2 Improvements to Hashtag Recommendation
6.2.3 Future Outlook
6.2.4 Practical Applications
References
Format: application/pdf, 4912262 bytes
DOI: 10.6814/NCCU202200469
