Title 基於事件常識的合理故事改寫文字生成模型
A Text Generation Model Based on Event Commonsense Knowledge for Reasonable Story Revision
Author Chuang, Ching-Yun (莊景雲)
Advisors Huang, Hen-Hsen (黃瀚萱); Chen, Yihsiu (陳宜秀)
Keywords Natural language processing
Story rewriting
Commonsense knowledge extraction
Story generation
Date 2022
Uploaded 2-Sep-2022 15:42:52 (UTC+8)
Abstract With the development of natural language processing, beyond training models to generate fluent text, increasing attention is being paid to machines' learning of human knowledge and understanding of common sense. In recent years, more and more NLP tasks have incorporated knowledge bases of human common sense, so that machines can carry out tasks informed by the background knowledge implicit in text. Commonsense knowledge has previously been applied to story writing, story-ending prediction, and question answering over articles. This study combines background commonsense data from an external knowledge base to perform a story rewriting task: given only the beginning and the ending of a story, the model incorporates a commonsense knowledge dataset to complete the rewriting, and, by understanding the commonsense knowledge implicit in the sentences, connects the story more logically and fluently than previous models.
Experimental results show that the stories generated by our model, which incorporates an external commonsense dataset, outperform the rewriting results of previous models. Human evaluation likewise shows that adding commonsense knowledge improves the causal relationships between generated sentences and the plausibility of the connection between the beginning and the ending. We also find from the evaluation results that linguistic factors, such as word choice and sentence construction, may influence human judgments.
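To make the task concrete, the sketch below illustrates the general idea described in the abstract — conditioning a story-infilling model on commonsense relations retrieved for the given beginning and ending — not the thesis's actual pipeline. The tiny ATOMIC-style knowledge base, the relation names, and the prompt format are hypothetical placeholders; a real system would query a full knowledge base (e.g. ATOMIC or ConceptNet) and feed the prompt to a pretrained infilling model.

```python
# Illustrative sketch: given only a story's first and last sentence, retrieve
# ATOMIC-style commonsense relations for events they mention and assemble a
# prompt for a text-infilling model. The knowledge base below is a toy stand-in.

ATOMIC_LIKE_KB = {
    "win the lottery": {"xEffect": "becomes rich", "xReact": "feels thrilled"},
    "buy a mansion":   {"xIntent": "wants to live in luxury"},
}

def retrieve_relations(sentence, kb):
    """Return commonsense relations whose event phrase appears in the sentence."""
    return {event: rels for event, rels in kb.items()
            if event in sentence.lower()}

def build_infilling_prompt(beginning, ending, kb):
    """Concatenate retrieved commonsense facts with the story's beginning, a
    mask span for the missing middle, and the ending."""
    facts = []
    for sent in (beginning, ending):
        for event, rels in retrieve_relations(sent, kb).items():
            for rel, tail in rels.items():
                facts.append(f"{event} {rel} {tail}")
    knowledge = " | ".join(facts)
    return f"knowledge: {knowledge} story: {beginning} <extra_id_0> {ending}"

prompt = build_infilling_prompt(
    "Sam was lucky enough to win the lottery.",
    "Finally, Sam decided to buy a mansion.",
    ATOMIC_LIKE_KB,
)
print(prompt)
```

The `<extra_id_0>` sentinel marks the span the generation model would fill in; the retrieved facts give it the causal background (winning the lottery leads to wealth) needed to connect the two sentences coherently.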
References [1] Prithviraj Ammanabrolu et al. “Automated storytelling via causal, commonsense plot ordering”. In: arXiv preprint arXiv:2009.00829 (2020).
[2] Gabor Angeli, Melvin Jose Johnson Premkumar, and Christopher D Manning. “Leveraging linguistic structure for open domain information extraction”. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2015, pp. 344–354.
[3] Ashutosh Baheti et al. “Generating more interesting responses in neural conversation models with distributional constraints”. In: arXiv preprint arXiv:1809.01215 (2018).
[4] Antoine Bosselut et al. “COMET: Commonsense transformers for knowledge graph construction”. In: Association for Computational Linguistics (ACL). 2019.
[5] Kevin Clark and Christopher D Manning. “Deep reinforcement learning for mention-ranking coreference models”. In: arXiv preprint arXiv:1609.08667 (2016).
[6] Chris Donahue, Mina Lee, and Percy Liang. “Enabling language models to fill in the blanks”. In: arXiv preprint arXiv:2005.05339 (2020).
[7] Nick C Ellis. “Frequency effects in language processing: A review with implications for theories of implicit and explicit language acquisition”. In: Studies in Second Language Acquisition 24.2 (2002), pp. 143–188.
[8] Jessica Ficler and Yoav Goldberg. “Controlling linguistic style aspects in neural language generation”. In: arXiv preprint arXiv:1707.02633 (2017).
[9] Marjan Ghazvininejad et al. “Hafez: an interactive poetry generation system”. In: Proceedings of ACL 2017, System Demonstrations. 2017, pp. 43–48.
[10] Jian Guan et al. “A knowledge-enhanced pretraining model for commonsense story generation”. In: Transactions of the Association for Computational Linguistics 8 (2020), pp. 93–108.
[11] Lynn Hasher and Rose T Zacks. “Automatic processing of fundamental information: the case of frequency of occurrence.” In: American Psychologist 39.12 (1984), p. 1372.
[12] Ari Holtzman et al. “Learning to write with cooperative discriminators”. In: arXiv preprint arXiv:1805.06087 (2018).
[13] Jena D Hwang et al. “COMET-ATOMIC 2020: On symbolic and neural commonsense knowledge graphs”. In: arXiv preprint arXiv:2010.05953 (2020).
[14] Daphne Ippolito et al. “Unsupervised hierarchical story infilling”. In: Proceedings of the First Workshop on Narrative Understanding. 2019, pp. 37–43.
[15] Yuta Kikuchi et al. “Controlling output length in neural encoder-decoders”. In: arXiv preprint arXiv:1609.09552 (2016).
[16] Boyang Li et al. “Story generation with crowdsourced plot graphs”. In: Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 27. 1. 2013.
[17] Hugo Liu and Push Singh. “ConceptNet—a practical commonsense reasoning toolkit”. In: BT Technology Journal 22.4 (2004), pp. 211–226.
[18] Nicholas Metropolis et al. “Equation of state calculations by fast computing machines”. In: The Journal of Chemical Physics 21.6 (1953), pp. 1087–1092.
[19] Nasrin Mostafazadeh et al. “A corpus and cloze evaluation for deeper understanding of commonsense stories”. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016, pp. 839–849.
[20] Srini Narayanan and Daniel Jurafsky. “A Bayesian model predicts human parse preference and reading times in sentence processing”. In: Advances in Neural Information Processing Systems 14 (2001).
[21] Nathan Ng et al. “Facebook FAIR’s WMT19 News Translation Task Submission”. In: arXiv preprint arXiv:1907.06616 (2019).
[22] Kishore Papineni et al. “BLEU: a method for automatic evaluation of machine translation”. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. 2002, pp. 311–318.
[23] Lianhui Qin et al. “Counterfactual Story Reasoning and Generation”. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019, pp. 5043–5053.
[24] Alec Radford et al. “Improving language understanding by generative pre-training”. In: ().
[25] Alec Radford et al. “Language models are unsupervised multitask learners”. In: OpenAI Blog 1.8 (2019), p. 9.
[26] Colin Raffel et al. “Exploring the limits of transfer learning with a unified text-to-text transformer”. In: arXiv preprint arXiv:1910.10683 (2019).
[27] Colin Raffel et al. “Exploring the limits of transfer learning with a unified text-to-text transformer.” In: Journal of Machine Learning Research 21.140 (2020), pp. 1–67.
[28] Maarten Sap et al. “ATOMIC: An atlas of machine commonsense for if-then reasoning”. In: Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 33. 01. 2019, pp. 3027–3035.
[29] Claude Elwood Shannon. “A mathematical theory of communication”. In: The Bell System Technical Journal 27.3 (1948), pp. 379–423.
[30] Zhixing Tian et al. “Scene Restoring for Narrative Machine Reading Comprehension”. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2020, pp. 3063–3073.
[31] Ashish Vaswani et al. “Attention is all you need”. In: Advances in Neural Information Processing Systems. 2017, pp. 5998–6008.
[32] Jingjing Xu et al. “A skeleton-based model for promoting coherence among sentences in narrative story generation”. In: arXiv preprint arXiv:1808.06945 (2018).
[33] Peng Xu et al. “MEGATRON-CNTRL: Controllable story generation with external knowledge using large-scale language models”. In: arXiv preprint arXiv:2010.00840 (2020).
[34] Kyra Yee et al. “Simple and effective noisy channel modeling for neural machine translation”. In: arXiv preprint arXiv:1908.05731 (2019).
[35] Lantao Yu et al. “SeqGAN: Sequence generative adversarial nets with policy gradient”. In: Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 31. 1. 2017.
[36] Lei Yu et al. “Putting Machine Translation in Context with the Noisy Channel Model”. In: (2019).
[37] Lei Yu et al. “The neural noisy channel”. In: arXiv preprint arXiv:1611.02554 (2016).
[38] Daniel M Ziegler et al. “Fine-tuning language models from human preferences”. In: arXiv preprint arXiv:1909.08593 (2019).
Description Master's thesis
National Chengchi University
Master's Program in Digital Content
109462007
Source http://thesis.lib.nccu.edu.tw/record/#G0109462007
Type thesis
URI http://nccur.lib.nccu.edu.tw/handle/140.119/141816
Table of Contents Acknowledgements
Abstract (Chinese)
Abstract
Contents
List of Figures
List of Tables
1 Introduction
1.1 Problem Background and Research Motivation
1.2 Problem Definition and Research Strategy
1.3 Research Objectives
2 Literature Review
2.1 Natural-Language Text Generation
2.1.1 Text Generation Models
2.1.2 Controlling Story Content
2.2 Commonsense Datasets
2.3 Commonsense-Based Story Writing
3 Methodology
3.1 Story Knowledge Extraction
3.2 Connecting the Story's Beginning and Ending
3.3 Story Content Optimization
3.4 Evaluation
3.4.1 Quantitative Evaluation
3.4.2 Human Evaluation Criteria
3.4.3 Baselines
3.4.4 Human Evaluation Design
4 Results and Discussion
4.1 Experimental Results and Analysis
4.1.1 Results
4.1.2 Case Studies
5 Conclusion
References
Format application/pdf (7,433,681 bytes)
DOI 10.6814/NCCU202201499