Title: Demographic Prediction from Transaction Data Using Knowledge Graph (運用知識圖譜由交易資料預測人口統計資料)
Author: Wu, Jin-Ren (巫謹任)
Contributors: Shan, Man-Kwan (沈錳坤); Wu, Jin-Ren (巫謹任)
Keywords: Demographic Prediction; Knowledge Graph
Date: 2024
Uploaded: 4-Sep-2024 14:59:21 (UTC+8)
Abstract:
With the rise of the big data era, the information that people provide online has become immensely valuable. For most businesses, users' background data is especially valuable because it helps them formulate strategies more precisely and offer personalized services. However, as people grow increasingly concerned about privacy, many choose not to disclose their background information online. To address this, many researchers have attempted to predict users' background labels (i.e., demographic data) from user-item transaction data. This study approaches the problem from a different angle: it provides model interpretability alongside the predictions, with the aim of making the predictions more reliable and building user trust. It predicts users' demographic data with a knowledge graph and integrates deep learning techniques to improve both the accuracy and the interpretability of the predictions.

In this thesis, paths from user nodes to demographic nodes are generated from the knowledge graph combined with the bipartite graph formed from the transaction data. The paths are transformed into semantically rich embedding vectors, a Long Short-Term Memory (LSTM) model learns the sequential relationships within each path, and a weighted pooling layer produces the demographic prediction.

Description: Master's thesis
National Chengchi University
Department of Computer Science
111753124
Advisor: Shan, Man-Kwan (沈錳坤)
Source: http://thesis.lib.nccu.edu.tw/record/#G0111753124
Type: thesis
Identifier: G0111753124
URI: https://nccur.lib.nccu.edu.tw/handle/140.119/153376
Format: application/pdf, 1523720 bytes
Table of contents:
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
Chapter 2 Related Work
2.1 Demographic Prediction
2.2 Knowledge Graph Background and Related Research
Chapter 3 Methodology
3.1 Problem Definition
3.2 Research Framework
3.3 Graph Construction
3.4 Finding Paths in the Graph
3.5 Embedding Layer
3.6 LSTM Layer
3.7 Pooling Layer
3.8 Fully Connected Layer
3.9 Model Evaluation Metrics
Chapter 4 Experiments
4.1 Dataset
4.2 Gender Experiments
4.2.1 Data Preprocessing
4.2.2 Experimental Design
4.2.3 Evaluation Metrics
4.2.4 Experimental Results
4.3 Age Experiments
4.3.1 Data Preprocessing
4.3.2 Experimental Design
4.3.3 Evaluation Metrics
4.3.4 Experimental Results
4.4 Interpretability of Gender Prediction
Chapter 5 Conclusion
References
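
The pipeline summarized in the abstract (merge the transaction data with a knowledge graph, find paths from a user node to demographic nodes, score the paths, pool the scores) can be illustrated on a toy example. The sketch below is not the thesis implementation. The data, the node names, the `demo:` prefix, and the functions `build_graph`, `find_paths`, `log_sum_exp_pool`, and `predict` are all hypothetical. The per-path score `1 / hops` is only a placeholder for the learned embedding, LSTM, and fully connected layers, and log-sum-exp with a temperature `gamma` is one common form of weighted pooling assumed here, since this record does not state the pooling formula or the exact layer ordering.

```python
import math
from collections import defaultdict

# Hypothetical toy data: user-item transactions, item knowledge-graph triples,
# and demographic labels that are known for some users only.
transactions = [("u1", "lipstick"), ("u2", "lipstick"), ("u2", "razor"), ("u3", "razor")]
kg_triples = [("lipstick", "category", "cosmetics"), ("razor", "category", "grooming")]
labels = {"u2": "female", "u3": "male"}


def build_graph(transactions, kg_triples, labels):
    """Merge the bipartite user-item graph, the KG, and label edges (undirected)."""
    adj = defaultdict(set)

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for user, item in transactions:
        link(user, item)
    for head, _relation, tail in kg_triples:
        link(head, tail)
    for user, label in labels.items():
        link(user, "demo:" + label)  # demographic nodes carry a "demo:" prefix
    return adj


def find_paths(adj, source, max_hops):
    """Enumerate simple paths from `source` that end at a demographic node."""
    paths = []

    def dfs(node, path):
        if node.startswith("demo:"):
            paths.append(path)
            return
        if len(path) - 1 == max_hops:
            return
        for nxt in sorted(adj[node]):
            if nxt not in path:
                dfs(nxt, path + [nxt])

    dfs(source, [source])
    return paths


def log_sum_exp_pool(scores, gamma=1.0):
    """Assumed pooling: log(sum(exp(s / gamma))), computed in a stable way."""
    scaled = [s / gamma for s in scores]
    m = max(scaled)
    return m + math.log(sum(math.exp(x - m) for x in scaled))


def predict(adj, user, max_hops, gamma=1.0):
    """Pool placeholder path scores per demographic node and return the best label."""
    by_label = defaultdict(list)
    for path in find_paths(adj, user, max_hops):
        hops = len(path) - 1
        by_label[path[-1]].append(1.0 / hops)  # placeholder, NOT a learned score
    if not by_label:
        return None
    pooled = {label: log_sum_exp_pool(s, gamma) for label, s in by_label.items()}
    best = max(pooled, key=pooled.get)
    return best[len("demo:"):]


adj = build_graph(transactions, kg_triples, labels)
for p in find_paths(adj, "u1", max_hops=5):
    print(" -> ".join(p))
print(predict(adj, "u1", max_hops=5))  # female
```

Each enumerated path doubles as a human-readable explanation of a prediction (for example, `u1 -> lipstick -> u2 -> demo:female`), which is the kind of interpretability the abstract refers to. In a trained model the placeholder score would be replaced by the output of the network for that path.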