Title 基於對比式學習之負樣本探勘於改進進階檢索模型
Improving Dense Retrieval Model via Hard Negative Mining based on Contrastive Learning
Author 陳采宗 (Chen, Tsai-Tsung)
Contributors 蔡銘峰 (Tsai, Ming-Feng, advisor); 陳采宗 (Chen, Tsai-Tsung)
Keywords Information Retrieval; Contrastive Learning; Negative Sample Selection
Date 2024
Date uploaded 4-Sep-2024 15:01:22 (UTC+8)
Abstract Information retrieval has widespread applications in real life, and its importance is becoming increasingly significant. With the advent of the artificial intelligence era, information retrieval technology has become an indispensable part of our daily lives and work. For example, web search and question-answering systems are common applications that retrieve relevant information from vast amounts of text based on user queries. However, selecting appropriate negative samples during training to enhance the performance of retrieval models remains a challenge. The selection of negative samples is particularly critical for contrastive learning, as it directly affects the model's discriminative power and final performance. Overly simple negative samples may not challenge the model sufficiently, while overly difficult negative samples can confuse the model and lead to suboptimal results. Therefore, finding a method that balances the difficulty of negative samples is crucial.

This study addresses the problem of negative sample selection in dense retrieval by proposing a method based on contrastive learning and dynamic margin adjustment that mines moderately difficult negative samples to enhance retrieval performance. Our method effectively avoids the adverse effects of overly simple or overly difficult negatives on model training and significantly improves retrieval effectiveness. On the E5-Small, SBERT, and BERT models, it demonstrates superior performance across multiple datasets, including NFCorpus and SciFact, achieving the best results on the Recall@100 and NDCG@10 metrics. Compared with other dynamic negative-sample selection strategies, our approach also offers higher training efficiency, significantly reducing training time and resource consumption.

Experimental results show that our method can dynamically adjust the difficulty of negative samples to suit different models and datasets, improving both training effectiveness and stability. In future work, we plan to validate the method on more datasets and models and explore further improvements to retrieval performance in natural language processing tasks.
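Note: the abstract does not detail the mining procedure, so the following is only a minimal illustrative sketch of the general idea it describes, not the thesis's implementation: keep candidate negatives that are neither trivially easy nor harder than the positive, then train with a standard InfoNCE contrastive loss. The function names (mine_margin_negatives, info_nce_loss) and the margin and temperature values are hypothetical, and the margins here are fixed constants rather than dynamically adjusted as in the proposed method.

import torch
import torch.nn.functional as F

def mine_margin_negatives(q, pos, cands, lower_margin=0.3, upper_margin=0.05):
    # Keep candidates whose cosine similarity to the query falls in a band just below
    # the positive's similarity: not trivially easy, not harder than the positive.
    # q: (d,) query embedding; pos: (d,) positive embedding; cands: (n, d) candidate negatives.
    q, pos, cands = F.normalize(q, dim=-1), F.normalize(pos, dim=-1), F.normalize(cands, dim=-1)
    s_pos = (q * pos).sum()          # similarity of the relevant passage
    s_neg = cands @ q                # similarities of all candidate negatives
    keep = (s_neg > s_pos - lower_margin) & (s_neg < s_pos - upper_margin)
    return cands[keep]

def info_nce_loss(q, pos, negs, temperature=0.05):
    # Standard InfoNCE loss: the positive is scored against the mined negatives (index 0 is the positive).
    q, pos, negs = F.normalize(q, dim=-1), F.normalize(pos, dim=-1), F.normalize(negs, dim=-1)
    logits = torch.cat([(q * pos).sum().unsqueeze(0), negs @ q]) / temperature
    return F.cross_entropy(logits.unsqueeze(0), torch.zeros(1, dtype=torch.long))

A dynamic variant would recompute the margins (or re-mine the negatives) as training progresses, since what counts as "moderately difficult" shifts as the encoder improves; the two-sided band is what distinguishes this from simply taking the top-k hardest candidates.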
References
[1] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
[2] T. Gao, X. Yao, and D. Chen. SimCSE: Simple contrastive learning of sentence embeddings. arXiv preprint arXiv:2104.08821, 2021.
[3] V. Karpukhin, B. Oğuz, S. Min, P. Lewis, L. Wu, S. Edunov, D. Chen, and W.-t. Yih. Dense passage retrieval for open-domain question answering. arXiv preprint arXiv:2004.04906, 2020.
[4] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis, W.-t. Yih, T. Rocktäschel, et al. Retrieval-augmented generation for knowledge-intensive NLP tasks. Advances in Neural Information Processing Systems, 33:9459–9474, 2020.
[5] B. Mitra, F. Diaz, and N. Craswell. Learning to match using local and distributed representations of text for web search. In Proceedings of the 26th International Conference on World Wide Web, pages 1291–1299, 2017.
[6] R. Nogueira and K. Cho. Passage re-ranking with BERT. arXiv preprint arXiv:1901.04085, 2019.
[7] N. Reimers and I. Gurevych. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. arXiv preprint arXiv:1908.10084, 2019.
[8] S. E. Robertson, S. Walker, S. Jones, M. M. Hancock-Beaulieu, M. Gatford, et al. Okapi at TREC-3. NIST Special Publication SP, 109:109, 1995.
[9] G. Salton and C. Buckley. Term-weighting approaches in automatic text retrieval. Information Processing & Management, 24(5):513–523, 1988.
[10] N. Thakur, N. Reimers, A. Rückle, A. Srivastava, and I. Gurevych. BEIR: A heterogeneous benchmark for zero-shot evaluation of information retrieval models. arXiv preprint arXiv:2104.08663, 2021.
[11] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017.
[12] L. Wang, N. Yang, X. Huang, B. Jiao, L. Yang, D. Jiang, R. Majumder, and F. Wei. Text embeddings by weakly-supervised contrastive pre-training. arXiv preprint arXiv:2212.03533, 2022.
[13] L. Xiong, C. Xiong, Y. Li, K.-F. Tang, J. Liu, P. Bennett, J. Ahmed, and A. Overwijk. Approximate nearest neighbor negative contrastive learning for dense text retrieval. arXiv preprint arXiv:2007.00808, 2020.
[14] J. Zhan, J. Mao, Y. Liu, J. Guo, M. Zhang, and S. Ma. Optimizing dense retrieval model training with hard negatives. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 1503–1512, 2021.
Description Master's thesis
National Chengchi University
Department of Computer Science
111753160
Source http://thesis.lib.nccu.edu.tw/record/#G0111753160
Type thesis
URI https://nccur.lib.nccu.edu.tw/handle/140.119/153387
Table of contents Chapter 1 Introduction; Chapter 2 Related Work; Chapter 3 Methodology; Chapter 4 Experimental Results and Discussion; Chapter 5 Conclusion; References
Format 1468707 bytes (application/pdf)