Title: 基於大型語言模型的即時新聞檢索與生成系統開發 (Building a Real-Time News Retrieval-Augmented Generation System with Large Language Models)
Author: Lin, Pei-Hsin (林佩欣)
Advisor: Tsai, Yen-Lung (蔡炎龍)
Keywords: Retrieval-Augmented Generation (RAG); Large Language Models (LLM); News Retrieval; Generative AI; Open-Source Models
Date: 2025
Uploaded: 1-Jul-2025 15:49:00 (UTC+8)

Abstract: Large language models (LLMs) offer strong query understanding and text generation capabilities, effectively addressing the semantic challenges of traditional retrieval systems. However, because they rely heavily on pre-trained knowledge, they tend to produce outdated or fabricated information, a weakness that is especially pronounced in real-time news retrieval. Compared with other data types, news data is highly time-sensitive, semantically repetitive, and fragmented, which further complicates both retrieval and generation. This study therefore takes news data as its foundation and implements a retrieval-augmented generation (RAG) system, framed around the varied needs of news professionals working with an internal query system, in order to examine whether RAG can effectively reduce hallucination and improve factual correctness.

The study first crawled ETtoday news articles as the knowledge source and designed experiments comparing four model configurations: Llama3-8B, Llama3-8B with RAG, GPT-4, and GPT-4 with RAG. The test tasks covered factual question answering, event synthesis, and news summarization, and the generated content was scored on correctness and completeness. The results show that introducing RAG significantly improves the factual accuracy of model responses and effectively reduces hallucination.

On the system side, the implementation spans data crawling, preprocessing, document chunking, and vector database design, with an interactive interface built in Gradio. Throughout, the study emphasizes mechanisms for tracking retrieval quality and incorporating feedback to ensure the reliability of the final generated answers.
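To make the pipeline described in the abstract concrete, here is a minimal sketch of the retrieve-then-generate flow. Only FAISS, GPT-4, and the general RAG flow come from the record itself; the embedding model name, the news_chunks placeholder, the top-k value, and the prompt wording are illustrative assumptions, not the thesis's actual configuration.

```python
# Minimal sketch of the retrieve-then-generate flow described in the abstract.
# Assumptions (not from the thesis): the embedding model, the `news_chunks`
# placeholder, k=3, and the prompt wording are illustrative only.
import faiss
from sentence_transformers import SentenceTransformer
from openai import OpenAI

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
client = OpenAI()  # requires OPENAI_API_KEY in the environment

# Pre-split article passages (cf. "Text Segmentation" in the table of contents).
news_chunks = ["...", "..."]
vectors = embedder.encode(news_chunks).astype("float32")

index = faiss.IndexFlatL2(vectors.shape[1])  # exact L2 search over chunk vectors
index.add(vectors)

def answer(query: str, k: int = 3) -> str:
    """Retrieve the k nearest chunks and ground the GPT-4 answer in them."""
    q_vec = embedder.encode([query]).astype("float32")
    _, idx = index.search(q_vec, k)
    context = "\n".join(news_chunks[i] for i in idx[0])
    messages = [
        {"role": "system", "content": "Answer only from the provided news context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
    ]
    resp = client.chat.completions.create(model="gpt-4", messages=messages)
    return resp.choices[0].message.content
```

Swapping the GPT-4 call for a locally served Llama3-8B endpoint would mirror the study's open-source configuration; the retrieval side stays unchanged.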
References:
Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D. M., Wu, J., Winter, C., . . . Amodei, D. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877–1901. https://proceedings.neurips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf
Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–623. https://doi.org/10.1145/3442188.3445922
Chase, H. (2022). LangChain: Building applications with LLMs through composability. GitHub repository.
Gharge, S., & Chavan, M. (2017). An integrated approach for malicious tweets detection using NLP. 2017 International Conference on Inventive Communication and Computational Technologies (ICICCT). https://doi.org/10.1109/icicct.2017.7975235
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). Generative adversarial nets. Advances in Neural Information Processing Systems, 27, 2672–2680.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press. https://dl.acm.org/citation.cfm?id=3086952
Guu, K., Lee, K., Tung, Z., Pasupat, P., & Chang, M. (2020). REALM: Retrieval-augmented language model pre-training. arXiv. https://arxiv.org/abs/2002.08909
Hearst, M. A. (2009). Search user interfaces. Cambridge University Press. http://ci.nii.ac.jp/ncid/BA91702558
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
Hochreiter, S. (1998). The vanishing gradient problem during learning recurrent neural nets and problem solutions. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 6(2), 107–116. https://doi.org/10.1142/s0218488598000094
Holtzman, A., Buys, J., Du, L., Forbes, M., & Choi, Y. (2020). The curious case of neural text degeneration. arXiv. https://arxiv.org/pdf/1904.09751.pdf
Izacard, G., & Grave, E. (2020). Leveraging passage retrieval with generative models for open domain question answering. arXiv. https://arxiv.org/abs/2007.01282
Johnson, J., Douze, M., & Jégou, H. (2017). Billion-scale similarity search with GPUs. arXiv. https://arxiv.org/abs/1702.08734
Karpukhin, V., Oguz, B., Min, S., Lewis, P., Wu, L., Edunov, S., Chen, D., & Yih, W. (2020). Dense passage retrieval for open-domain question answering. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). https://doi.org/10.18653/v1/2020.emnlp-main.550
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539
Maynez, J., Narayan, S., Bohnet, B., & McDonald, R. (2020). On faithfulness and factuality in abstractive summarization. arXiv. https://arxiv.org/abs/2005.00661
Manning, C. D., Raghavan, P., & Schütze, H. (2009). Introduction to information retrieval. Choice Reviews Online, 46(05), 46-2715. https://doi.org/10.5860/choice.46-2715
Mitra, B., & Craswell, N. (2018). An introduction to neural information retrieval. Foundations and Trends in Information Retrieval, 13(1), 1–126. https://doi.org/10.1561/1500000061
Mikolov, T., Karafiát, M., Burget, L., Černocký, J., & Khudanpur, S. (2010). Recurrent neural network based language model. Interspeech 2010. https://doi.org/10.21437/interspeech.2010-343
Masterman, T., Besen, S., Sawtell, M., & Chao, A. (2024). The landscape of emerging AI agent architectures for reasoning, planning, and tool calling: A survey. arXiv. https://arxiv.org/abs/2404.11584
Nogueira, R., & Cho, K. (2019). Passage re-ranking with BERT. arXiv. https://arxiv.org/abs/1901.04085
Pang, B., & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1–2), 1–135. http://dx.doi.org/10.1561/1500000011
Robertson, S., & Zaragoza, H. (2009). The probabilistic relevance framework: BM25 and beyond. Foundations and Trends in Information Retrieval, 3(4), 333–389. https://doi.org/10.1561/1500000019
Sawhney, R., Joshi, H., Gandhi, S., & Shah, R. R. (2020). A time-aware transformer based model for suicide ideation detection on social media. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 7685–7697. https://doi.org/10.18653/v1/2020.emnlp-main.619
Shuster, K., Poff, S., Chen, M., Kiela, D., & Weston, J. (2021). Retrieval augmentation reduces hallucination in conversation. Findings of the Association for Computational Linguistics: EMNLP 2021, 3784–3803. https://aclanthology.org/2021.findings-emnlp.320/
Trabelsi, M., Chen, Z., Davison, B. D., & Heflin, J. (2021). Neural ranking models for document retrieval. Information Retrieval, 24(6), 400–444. https://doi.org/10.1007/s10791-021-09398-0
Thorne, J., & Vlachos, A. (2018). Automated fact checking: Task formulations, methods and future directions. Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018), 3346–3359. https://arxiv.org/pdf/1806.07687.pdf
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30. https://doi.org/10.48550/arXiv.1706.03762
Yasunaga, M., Ren, H., Bosselut, A., Liang, P., & Leskovec, J. (2021). QA-GNN: Reasoning with language models and knowledge graphs for question answering. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. https://doi.org/10.18653/v1/2021.naacl-main.45
Zhang, Y., Ni, A., Mao, Z., Wu, C. H., Zhu, C., Deb, B., Awadallah, A. H., Radev, D., & Zhang, R. (2021). Summ^N: A multi-stage summarization framework for long input dialogues and documents. arXiv. https://arxiv.org/abs/2110.10150
Zhang, T., Ladhak, F., Durmus, E., Liang, P., McKeown, K., & Hashimoto, T. B. (2023). Benchmarking large language models for news summarization. arXiv. https://arxiv.org/abs/2301.13848

Description: Master's thesis
National Chengchi University
Master's Program in Global Communication and Innovation Technology
111ZM1027

Source: http://thesis.lib.nccu.edu.tw/record/#G0111ZM1027
Identifier: G0111ZM1027
URI: https://nccur.lib.nccu.edu.tw/handle/140.119/157880
Type: thesis

Table of Contents:
1. Introduction
 1.1 Research Objectives
 1.2 Research Framework
2. Theoretical Background
 2.1 Traditional Information Retrieval Methods
  2.1.1 Keyword-Based Retrieval
  2.1.2 TF-IDF
  2.1.3 BM25
 2.2 Deep Learning for Information Retrieval
  2.2.1 Recurrent Neural Networks (RNN)
  2.2.2 Transformer
 2.3 Large Language Models (LLM): Advantages and Limitations; Common Solutions
 2.4 Generative Artificial Intelligence (AI): Advantages and Limitations; Applications of Generative AI in News Information Retrieval
 2.5 Retrieval-Augmented Generation (RAG): Applications; Limitations and Challenges
 2.6 AI Agent
3. Methodology
 3.1 System Structure
 3.2 Data Preprocessing: Data Set; Text Segmentation
 3.3 Creating the Vector Database: Optimization of Sentence Embeddings; Storing Vectorized Data with FAISS
 3.4 Response Generation: Llama3-8B; Llama3-8B + RAG; GPT-4; GPT-4 + RAG; System Prompt
4. Experiment Results: Validation Question Design; Experiment Results and Demo; Key Factors Influencing Evaluation Differences; Recommended Task Types for Each Model; Key Findings; Summary of Model Performance
5. Conclusions: Limitations; Adaptive Retrieval Techniques; Future Expectations
6. References
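The table of contents names the system's stages end to end: text segmentation, sentence-embedding optimization, FAISS-backed vector storage, and a Gradio front end. The sketch below wires hypothetical versions of those stages together; the fixed-width splitter, chunk size, embedding model, and retrieval-only retrieve() function are illustrative assumptions, with the generation step omitted (see the earlier sketch).

```python
# Hypothetical wiring of the stages listed in the table of contents:
# text segmentation -> sentence embeddings -> FAISS storage -> Gradio front end.
import faiss
import gradio as gr
from sentence_transformers import SentenceTransformer

def segment(article: str, size: int = 200) -> list[str]:
    """Naive fixed-width chunking; the thesis's actual splitter is not specified here."""
    return [article[i:i + size] for i in range(0, len(article), size)]

articles = ["..."]  # crawled ETtoday articles would be loaded here
chunks = [c for a in articles for c in segment(a)]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
vectors = embedder.encode(chunks).astype("float32")
faiss.normalize_L2(vectors)                  # normalize so inner product = cosine
index = faiss.IndexFlatIP(vectors.shape[1])  # cosine-similarity search over chunks
index.add(vectors)
faiss.write_index(index, "news.faiss")       # persist the vector store to disk

def retrieve(query: str, k: int = 3) -> str:
    """Return the k most similar chunks; generation would follow as sketched earlier."""
    q = embedder.encode([query]).astype("float32")
    faiss.normalize_L2(q)
    _, idx = index.search(q, k)
    return "\n\n".join(chunks[i] for i in idx[0])

# Interactive front end, matching the abstract's mention of a Gradio interface.
demo = gr.Interface(fn=retrieve, inputs="text", outputs="text", title="News RAG demo")

if __name__ == "__main__":
    demo.launch()
```

Persisting the index with faiss.write_index lets the retrieval service restart without re-embedding the full news corpus, which matters for a corpus refreshed in real time.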
