Personality-based Response Generation for Social Discussion (基於性格特質的社群討論回應生成)

Title	Personality-based Response Generation for Social Discussion (基於性格特質的社群討論回應生成)
Author	Chen, Ting-Yu (陳定宇)
Contributors	Huang, Chia-Hui (黃佳慧) and Huang, Hen-Hsen (黃瀚萱), advisors; Chen, Ting-Yu (陳定宇), author
Keywords	Dialog generation; Personalities; Discriminative learning
Date	2021
Uploaded	2 September 2021 15:37:33 (UTC+8)
Abstract	Prior work on personalized dialog generation has focused mainly on an individual's language style or on background information such as occupation and gender. This work explores a different dimension of personalization: generating text that exhibits a specific personality trait, emulating how people with different personalities post on social media. The training corpus combines existing datasets with discussion threads crawled from a social media platform, labeled with the users' personalities. To strengthen the generation model's handling of different personalities, a novel discriminative learning approach with a new loss function is proposed, steering the neural generation model toward the extrovert or the introvert personality so that its output is fluent and sensible while showing a more distinct personality. Automatic and human evaluation both indicate that the proposed approach is effective.
Description	Master's thesis; National Chengchi University (國立政治大學); Department of Statistics; 108354004
Source	http://thesis.lib.nccu.edu.tw/record/#G0108354004
Type	thesis

dc.contributor.advisor	黃佳慧; 黃瀚萱	zh_TW
dc.contributor.advisor	Huang, Chia-Hui; Huang, Hen-Hsen	en_US
dc.contributor.author (Author)	陳定宇	zh_TW
dc.contributor.author (Author)	Chen, Ting-Yu	en_US
dc.creator (Author)	陳定宇	zh_TW
dc.creator (Author)	Chen, Ting-Yu	en_US
dc.date (Date)	2021	en_US
dc.date.accessioned	2 September 2021 15:37:33 (UTC+8)	-
dc.date.available	2 September 2021 15:37:33 (UTC+8)	-
dc.date.issued (Uploaded)	2 September 2021 15:37:33 (UTC+8)	-
dc.identifier (Other identifier)	G0108354004	en_US
dc.identifier.uri (URI)	http://nccur.lib.nccu.edu.tw/handle/140.119/136829	-
dc.description (Description)	Master's degree (碩士)	zh_TW
dc.description (Description)	National Chengchi University (國立政治大學)	zh_TW
dc.description (Description)	Department of Statistics (統計學系)	zh_TW
dc.description (Description)	108354004	zh_TW
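dc.description.tableofcontents	Chapter 1 Introduction: 1. Background; 2. Research Objectives. Chapter 2 Literature Review: 1. Literature Review. Chapter 3 Related Work: 1. Sequence-to-Sequence Models; 2. Rule-based Systems; 3. RNN-based; 4. GPT-1-based; 5. GPT-2-based; 6. GPT-3-based; 7. Conditional Transformer Language Model; 8. Literature on Natural Language Processing and Personality. Chapter 4 Dataset: 1. Dataset Background; 2. Dataset; 3. Data Cleaning; 4. Exploratory Data Analysis. Chapter 5 Methodology: 1. Condition Definition; 2. DialoGPT Model; 3. CTRL Model; 4. DialogRPT Model; 5. XGBoost; 6. Applying the Conditions in Different Models. Chapter 6 Experiments: 1. Evaluation Criteria; 2. Hyperparameter Settings; 3. Experimental Results; 4. CTRL Model; 5. DialogRPT Model; 6. DialoGPT Model; 7. XGBoost; 8. Discriminative Learning; 9. Human Evaluation; 10. Human Evaluation Results. Chapter 7 Conclusion and Future Work. References.	zh_TW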
dc.format.extent	5788200 bytes	-
dc.format.mimetype	application/pdf	-
dc.source.uri (Source)	http://thesis.lib.nccu.edu.tw/record/#G0108354004	en_US
dc.subject (Keywords)	Dialog generation (對話生成)	zh_TW
dc.subject (Keywords)	Discriminative learning (鑑別學習)	zh_TW
dc.subject (Keywords)	Personality trait modeling (人格特質建模)	zh_TW
dc.subject (Keywords)	Dialog generation	en_US
dc.subject (Keywords)	Personalities	en_US
dc.subject (Keywords)	Discriminative learning	en_US
dc.title (Title)	基於性格特質的社群討論回應生成	zh_TW
dc.title (Title)	Personality-based Response Generation for Social Discussion	en_US
dc.type (Type)	thesis	en_US
dc.identifier.doi (DOI)	10.6814/NCCU202101488	en_US
