Title: ScoreRAG: A Retrieval-Augmented Generation Framework with Consistency-Relevance Scoring and Structured Summarization for News Generation
Author: Lin, Pei-Yun (林佩昀)
Contributors: Tsai, Yen-Lung (蔡炎龍) (advisor); Lin, Pei-Yun (林佩昀)
Keywords: Retrieval-Augmented Generation; News Generation; Large Language Models; Semantic Reranking; Graded Summarization; Natural Language Processing
Date: 2025
Upload time: 1-Sep-2025 16:30:18 (UTC+8)
Abstract: This research introduces ScoreRAG, an approach to enhancing the quality of automated news generation. Despite advances in natural language processing and large language models, current news generation methods often struggle with hallucinations, factual inconsistencies, and a lack of domain-specific expertise when producing news articles. ScoreRAG addresses these challenges through a multi-stage framework combining retrieval-augmented generation, consistency-relevance evaluation, and structured summarization. The system first retrieves relevant news documents from a vector database and maps them to complete articles in a news database. It then assigns each article a consistency-relevance score based on large language model evaluations, reranks the articles by score, and filters out low-relevance items. Finally, the framework generates graded summaries according to the relevance scores; these summaries, together with a system prompt, guide the large language model in producing complete news articles that follow professional journalistic standards. Through this approach, ScoreRAG aims to significantly improve the accuracy, coherence, informativeness, and professionalism of generated news articles while maintaining stability and consistency throughout the generation process. Code and demo: https://github.com/peiyun2260/ScoreRAG
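The abstract describes a multi-stage pipeline: retrieve, map to full articles, score, rerank and filter, graded summarization, then guided generation. The following Python sketch is purely illustrative of how those stages could be wired together; it is not the thesis's actual implementation, and every name in it (vector_db, news_db, llm, the prompts, and the score thresholds) is a hypothetical stand-in.

```python
# Minimal sketch of a ScoreRAG-style pipeline, assuming duck-typed
# vector_db / news_db objects and an llm callable that returns a string.
# All names and thresholds here are illustrative, not from the thesis.
from dataclasses import dataclass


@dataclass
class ScoredDoc:
    article: str
    score: float  # consistency-relevance score assigned by the LLM judge


def score_rag(query: str, llm, vector_db, news_db,
              top_k: int = 10, min_score: float = 3.0) -> str:
    # 1. Retrieve candidate documents from the vector database.
    candidates = vector_db.similarity_search(query, k=top_k)

    # 2. Map each hit back to the full article in the news database.
    articles = [news_db.get(doc.id) for doc in candidates]

    # 3. Have the LLM rate each article's relevance to the query
    #    (assumes the judge prompt returns a bare number as text).
    scored = [
        ScoredDoc(a, float(llm(
            f"Rate the relevance (0-5) of this article to '{query}':\n{a}")))
        for a in articles
    ]

    # 4. Rerank by score and drop low-relevance items.
    kept = sorted((d for d in scored if d.score >= min_score),
                  key=lambda d: d.score, reverse=True)

    # 5. Graded summarization: higher-scored articles get more detailed
    #    summaries, lower-scored ones a brief mention.
    summaries = [
        llm(f"Summarize "
            f"{'in detail' if d.score >= 4 else 'in one sentence'}:\n"
            f"{d.article}")
        for d in kept
    ]

    # 6. Guided generation: feed the summaries plus a system prompt
    #    to the LLM for the final article.
    context = "\n\n".join(summaries)
    return llm("You are a professional journalist. Using only the context "
               f"below, write a news article about: {query}\n\n{context}")
```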
Description: Master's thesis
National Chengchi University
Department of Applied Mathematics
111751002
Source: http://thesis.lib.nccu.edu.tw/record/#G0111751002
Type: thesis
Identifier: G0111751002
URI: https://nccur.lib.nccu.edu.tw/handle/140.119/159318
Table of contents:
Abstract (Chinese) i
Abstract ii
Contents iii
List of Tables vi
List of Figures vii
1 Introduction 1
  1.1 Research Background 1
2 Literature Review 3
  2.1 Overview of Natural Language Processing 3
  2.2 Transformer Architecture 4
    2.2.1 Input Embedding and Positional Encoding 5
    2.2.2 Self-Attention Mechanism 7
    2.2.3 Multi-Head Self-Attention Mechanism 9
    2.2.4 Position-wise Fully Connected Feed-Forward Network 11
    2.2.5 Residual Connections and Normalization 11
    2.2.6 Masked Multi-Head Attention Mechanism 13
    2.2.7 The Final Linear and Softmax Layer 14
  2.3 Decoding Strategies 15
    2.3.1 Temperature Sampling 15
    2.3.2 Top-k Sampling and Top-p Sampling 16
  2.4 Enhancing LLMs into AI Agents 17
    2.4.1 Retrieval-Augmented Generation 17
    2.4.2 Rerank 18
    2.4.3 AI Planning 19
3 Methods 22
  3.1 Data Preprocessing 22
    3.1.1 Text Embeddings 23
  3.2 System Architecture 25
    3.2.1 RAG-based News Retrieval 25
    3.2.2 Mapping News from Database 26
    3.2.3 Consistency Scoring and Reranking 27
    3.2.4 Score-based Summarization Generation 27
    3.2.5 Guided News Generation 28
4 Experiments 30
  4.1 Experiment Setup 30
    4.1.1 Compared Methods 30
    4.1.2 Evaluation Strategy 30
  4.2 Results 32
    4.2.1 LLM Evaluation 32
    4.2.2 Expert Evaluation 33
    4.2.3 Analysis 35
5 Conclusion 36
6 Future Work 37
Bibliography 39
Appendix A Detailed generated news from different prompts 42
Appendix B Clean text function in data preprocessing 44
Appendix C Embedding configuration for text preprocessing 45
Appendix D Retriever function for news embedding 46
Appendix E The function of evaluating consistency score 47
Appendix F The function of generating graded summaries 49
Appendix G The function of guided news generation 51
Appendix H Frontend interface of ScoreRAG 52
Appendix I The evaluation criteria of ScoreRAG 54
Format: application/pdf, 13,648,174 bytes