
Title: 結合多種群基因演算法提升問答系統生成品質之研究——以奇美博物館為例
Enhancing the Answer Quality of Question-Answering Systems with Multi-Population Genetic Algorithms: A Case Study of the Chimei Museum
Author: 羅永傑 (Luo, Yong-Jie)
Contributors: 蔡子傑 (Tsai, Tzu-Chieh, advisor); 羅永傑 (Luo, Yong-Jie)
Keywords: Multi-Population Genetic Algorithm; Museum QA; Hallucination; Retrieval-Augmented Generation
Date: 2026
Uploaded: 2-Mar-2026 12:22:18 (UTC+8)
Abstract:
When large language models (LLMs) are applied to knowledge question-answering (QA) systems, they often generate factually incorrect responses, a phenomenon known as hallucination, which poses challenges to system reliability and practical usability. To address this issue, Retrieval-Augmented Generation (RAG) techniques have been introduced in recent years, leveraging external knowledge bases to improve answer accuracy. However, traditional RAG methods still face several limitations: semantic retrieval may overlook critical information in the source text, overly broad or vague user queries may lead to irrelevant retrievals, and single-step retrieval strategies are insufficient for complex tasks that require iterative reasoning. This study proposes a QA inference method that integrates a Multi-Population Genetic Algorithm (MPGA) to enhance the quality of generated answers. The research framework consists of four main steps: (1) applying a Multi-head Self-Attention mechanism to capture multiple semantic perspectives of the query; (2) employing cosine similarity and the K-Nearest Neighbor (K-NN) algorithm to retrieve relevant text fragments from a vector database; (3) optimizing the retrieved results through MPGA to strengthen the contextual relevance and coherence; and (4) combining the query with the optimized text into a prompt, which is then fed into the language model to generate the final answer. For the experimental design, this study uses the book “New Chimei: A Guide to the Collections of the Chimei Museum (Chinese Edition)” as the primary knowledge source, constructing a museum-specific vector database for QA testing. The experimental results demonstrate that the combination of multi-head self-attention and MPGA enables the system to better capture diverse semantic features of the text, effectively reduce hallucinations, and outperform the baseline model in terms of answer accuracy, completeness, and diversity. 
This research confirms that the MPGA-RAG approach can significantly improve the reliability of museum QA systems and highlights its potential for practical applications in smart museum guides and cultural education.
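The four-step pipeline in the abstract (multi-head semantic encoding, cosine-similarity/K-NN retrieval, MPGA optimization of the retrieved context, prompt construction) can be sketched in miniature as follows. This is an illustrative sketch, not the thesis implementation: the embeddings are random stand-ins for a real encoder, the fitness function, population sizes, mutation rate, and migration scheme are invented for the example, and the multi-head attention step and the final LLM call are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def cosine_sim(a, b):
    # Cosine similarity between query vector a and each row of matrix b.
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return b @ a

def knn_retrieve(query_vec, doc_vecs, k=8):
    # Step 2: K-nearest-neighbour retrieval by cosine similarity.
    sims = cosine_sim(query_vec, doc_vecs)
    top = np.argsort(sims)[::-1][:k]
    return top, sims[top]

def mpga_select(sims, n_select=3, n_pops=3, pop_size=20, gens=30):
    """Step 3 (sketch): several populations evolve boolean masks over the
    retrieved candidates. The (invented) fitness rewards total similarity
    and penalises contexts larger than n_select; each generation the
    globally best individual migrates into every population."""
    n = len(sims)

    def fitness(mask):
        return sims[mask].sum() - 0.5 * max(0, mask.sum() - n_select)

    pops = [rng.integers(0, 2, (pop_size, n)).astype(bool) for _ in range(n_pops)]
    for _ in range(gens):
        for p in range(n_pops):
            pop = pops[p]
            fit = np.array([fitness(ind) for ind in pop])
            parents = pop[np.argsort(fit)[::-1][: pop_size // 2]]
            children = []
            for _ in range(pop_size - len(parents)):
                a, b = parents[rng.integers(len(parents), size=2)]
                cut = rng.integers(1, n)                 # one-point crossover
                child = np.concatenate([a[:cut], b[cut:]])
                child = child ^ (rng.random(n) < 0.05)   # bit-flip mutation
                children.append(child)
            pops[p] = np.vstack([parents, children])
        best = max((ind for pop in pops for ind in pop), key=fitness)
        for p in range(n_pops):
            pops[p][-1] = best                           # migration
    best = max((ind for pop in pops for ind in pop), key=fitness)
    return np.flatnonzero(best)

docs = rng.normal(size=(100, 64))    # stand-in passage embeddings
query = rng.normal(size=64)          # stand-in query embedding
cand_idx, cand_sims = knn_retrieve(query, docs, k=8)
chosen = cand_idx[mpga_select(cand_sims)]
# Step 4 sketch: the selected passages would be packed into a prompt here.
print(chosen)
```

In a real system the random embeddings would come from a sentence encoder over the museum corpus, and the fitness would score candidate contexts against the query rather than just summing similarities.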
References:
[1] Zhang, Y., Li, Y., Cui, L., Cai, D., Liu, L., Fu, T., Huang, X., Zhao, E., Zhang, Y., Chen, Y., Wang, L., Luu, A. T., Bi, W., Shi, F., & Shi, S. (2023). Siren's song in the AI ocean: A survey on hallucination in large language models. arXiv, abs/2309.01219.
[2] 奇美博物館 (Chimei Museum) (2014). 《閱讀 新.奇.美 - 奇美博物館典藏欣賞》 [New Chimei: A Guide to the Collections of the Chimei Museum] (Chinese edition). Chimei Museum Foundation.
[3] Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., Ishii, E., Bang, Y., Madotto, A., & Fung, P. (2023). Survey of hallucination in natural language generation. ACM Computing Surveys.
[4] Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., Küttler, H., Lewis, M., Yih, W., Rocktäschel, T., Riedel, S., & Kiela, D. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. Neural Information Processing Systems.
[5] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q., & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. Neural Information Processing Systems.
[6] Yao, S., Zhao, J., Yu, D., Narasimhan, K., & Cao, Y. (2023). Tree of Thoughts: Deliberate problem solving with large language models. arXiv preprint arXiv:2305.10601.
[7] Zhou, Y., & Ma, X. (2023). EvoPrompt: Evolving prompts for large language models. arXiv preprint arXiv:2211.12560.
[8] Chen, H., & Shu, K. (2023). GAuGE: Genetic algorithm with unified guidance for evolving prompts. arXiv preprint arXiv:2309.10821.
[9] Tallon, L., & Walker, K. (Eds.). (2008). Digital technologies and the museum experience: Handheld guides and other media. AltaMira Press.
[10] Cox, J., & Kolodner, J. (2021). Intelligent conversational agents in museums: Opportunities and challenges. Journal on Computing and Cultural Heritage.
[11] Jiang, A. Q., Sablayrolles, A., Roux, A., Mensch, A., et al. (2024). Mixtral of experts. Mistral AI. arXiv preprint arXiv:2401.04088.
Description: Master's thesis
National Chengchi University
In-service Master's Program, Department of Computer Science
111971012
Source: http://thesis.lib.nccu.edu.tw/record/#G0111971012
Type: thesis
URI: https://nccur.lib.nccu.edu.tw/handle/140.119/161854
Table of contents:
Chapter 1 Introduction
  1.1 Research Motivation
  1.2 Research Objectives
  1.3 Literature Review
Chapter 2 Methodology
  2.1 System Architecture Design
  2.2 Multi-head Self-Attention Mechanism
  2.3 Vector Retrieval and the K-Nearest Neighbor Algorithm
  2.4 Multi-Population Genetic Algorithm (MPGA) Optimization Mechanism
  2.5 Answer Generation and Prompt Design
  2.6 System Implementation Environment
Chapter 3 Experiment and Analysis
  3.1 Data Sources and Preprocessing
  3.2 Experiment Setup
  3.3 Evaluation Metrics and Analysis Methods
Chapter 4 Results and Discussion
  4.1 Overview of Experimental Results
  4.2 Analysis of Automated Evaluation Results
  4.3 Human Evaluation and Subjective Observations
  4.4 General Discussion and Research Implications
Chapter 5 Conclusions and Future Work
  5.1 Conclusions
  5.2 Suggestions for Future Work
References
File: application/pdf, 1344422 bytes