Title: 為建構勞訴類案推薦系統以法律資料訓練生成式語言模型
(Training Large Language Models for Similar Case Recommendation of Labor and Employment Disputes)
Author: Li, Wei-Jie (李韋杰)
Advisor: Liu, Chao-Lin (劉昭麟)
Keywords: generative language model (LLMs); RAG; similar case recommendation; natural language processing; legal document analysis; labor and employment disputes
Date: 2024
Uploaded: 2-Jan-2025 11:39:38 (UTC+8)
Abstract: This study develops a recommendation system for labor dispute cases based on large language models (LLMs). Using open legal data provided by the Judicial Yuan and the Ministry of Justice, we train a model specifically designed to handle Taiwanese legal documents, with a particular focus on answering questions about labor dispute cases. To enhance the model's response capability, we build a vector database that can analyze and recommend similar cases. With this database and the trained large language model, the system effectively handles the complexities of labor disputes and significantly improves case-handling efficiency. Through several self-designed experiments and comparisons with commonly used models, we show that the system achieves satisfactory accuracy in similar-case recommendation and satisfactory dialogue responses, offering a substantial contribution to future applications of AI in law.
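To make the retrieval-augmented flow in the abstract concrete, the following minimal Python sketch shows the general pattern: embed each case's dispute summary, store the vectors, rank stored cases by cosine similarity to a user query, and pass the top matches to the language model as context. The embedding function, case identifiers, summaries, and prompt wording below are hypothetical placeholders, not the thesis's actual models or data; the real system would use a fine-tuned sentence-embedding model and a vector database rather than the toy hashing vectorizer and in-memory dictionary shown here.

import hashlib
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Stand-in embedding: hashed character trigrams, L2-normalized.
    A placeholder for a fine-tuned sentence-embedding model."""
    vec = np.zeros(dim)
    for i in range(len(text) - 2):
        idx = int(hashlib.md5(text[i:i + 3].encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

# Hypothetical "vector database": one embedding per case dispute summary.
case_summaries = {
    "case_001": "雇主未依約給付加班費,勞工請求給付工資差額。",
    "case_002": "雇主違法解僱,勞工請求確認僱傭關係存在。",
    "case_003": "勞工請求給付資遣費與未休特別休假工資。",
}
index = {cid: embed(text) for cid, text in case_summaries.items()}

def recommend_similar_cases(query: str, top_k: int = 2) -> list[tuple[str, float]]:
    """Rank stored cases by cosine similarity to the user's query."""
    q = embed(query)
    scores = {cid: float(np.dot(q, v)) for cid, v in index.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

def build_rag_prompt(query: str) -> str:
    """Assemble the retrieved cases as context for the fine-tuned LLM."""
    retrieved = recommend_similar_cases(query)
    context = "\n".join(f"[{cid}] {case_summaries[cid]}" for cid, _ in retrieved)
    return f"參考類似案例:\n{context}\n\n使用者問題:{query}\n請依據上述案例回答。"

if __name__ == "__main__":
    print(build_rag_prompt("公司片面資遣我,卻不給資遣費,我可以怎麼主張?"))

In practice the hashed-trigram embed() would be replaced by the fine-tuned embedding model, and the in-memory dictionary by a persistent vector store, but the retrieve-then-generate structure stays the same.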
Description: Master's thesis, Department of Computer Science, National Chengchi University
Student ID: 110753128
Source: http://thesis.lib.nccu.edu.tw/record/#G0110753128
Type: thesis
Format: application/pdf (5,445,123 bytes)

Table of Contents
Chapter 1  Introduction
  1.1  Research Background
  1.2  Research Objectives
  1.3  Contributions
  1.4  Thesis Organization
Chapter 2  Literature Review
  2.1  Language Models
  2.2  AI Chatbots and Models
  2.3  The Llama 2 Model
  2.4  Evolution of Model Training Techniques
  2.5  Retrieval-Augmented Generation
  2.6  Applications of Language Models in Law
Chapter 3  Research Architecture
Chapter 4  Model Pre-training and Validation
  4.1  Training Data
  4.2  Tokenizer Expansion
  4.3  Base Model Training Strategy
  4.4  Validation Tasks on Dispute Points in Labor Litigation Cases
  4.5  Pre-training Results
Chapter 5  Model Fine-tuning and Dialogue Training
  5.1  Training Data Preprocessing
  5.2  Fine-tuning Parameters
  5.3  Fine-tuning Validation Tasks
  5.4  Dialogue Capability Test Results
Chapter 6  Labor Litigation Similar-Case System
  6.1  Question-Answering System and Judgment Summary Extraction
  6.2  Embedding Model Fine-tuning
  6.3  Vector Database Construction and Search
  6.4  Similar-Case Recommendation Scores
  6.5  Consolidated Answering over Similar Cases
Chapter 7  Conclusion
References