政大機構典藏-National Chengchi University Institutional Repository(NCCUR):Item 140.119/154979

English | 正體中文 | 简体中文 | Post-Print筆數 : 27 | Items with full text/Total items : 115256/146303 (79%)
Visitors : 54507526 Online Users : 271

RC Version 6.0 © Powered By DSPACE, MIT. Enhanced by NTU Library IR team.

Scope

please add "double quotation mark" for query phrases to get precise results

please goto advance search for comprehansive author search

Adv. Search

Home ‧ Login ‧ Upload ‧ Help ‧ About ‧ Administer

Goto mobile version

政大機構典藏 > 資訊學院 > 資訊科學系 > 學位論文 > Item 140.119/154979

Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/154979

Title:	為建構勞訴類案推薦系統以法律資料訓練生成式語言模型 Training Large Language Models for Similar Case Recommendation of Labor and Employment Disputes
Authors:	李韋杰 Li, Wei-Jie
Contributors:	劉昭麟 Liu, Chao-Lin 李韋杰 Li, Wei-Jie
Keywords:	生成式語言模型 RAG 類案推薦自然語言處理法律文件分析勞資爭議 LLMs generative language model case recommendation Natural Language Processing legal document analysis
Date:	2024
Issue Date:	2025-01-02 11:39:38 (UTC+8)
Abstract:	本研究旨在開發一個基於大型語言模型的勞資爭議案件推薦系統。透過司法院與法務部提供的開放法律數據，訓練一個專門處理台灣法律文件的模型。該模型特別聚焦於勞資爭議案件的回答能力。本研究為提高模型回答能力，建立了可分析並推薦相似案例的向量資料庫，有了該資料庫與訓練好的大型語言模型，能有效解決了繁複的勞資爭議問題，並大幅提高案件處理效率。並且透過多種自制實驗驗證與多種常用模型進行對比，證明了系統在相似案例推薦準確度與對話回應的能力皆達到令人滿意的水平，對於未來的法律AI應用具有實質貢獻。 This study aims to develop a recommendation system for labor dispute cases based on large language models (LLMs) . By utilizing open legal data provided by the Judicial Yuan and the Ministry of Justice, we train a model specifically designed to handle Taiwanese legal documents, with a particular focus on its ability to address labor dispute cases. To enhance the model's response capability, we established a vector database that can analyze and recommend similar cases. With this database and the trained large language model, we effectively tackle the complexities of labor disputes and significantly improve case handling efficiency. Furthermore, through various self-conducted experiments and comparisons with commonly used models, we demonstrate that the system achieves satisfactory levels in both the accuracy of similar case recommendations and dialogue responses, making a substantial contribution to future applications of AI in law.
Reference:	[1] J. S. Dhani, R. Bhatt, B. Ganesan, P. Sirohi, and V. Bhatnagar, “Similar Cases Recommendation using Legal Knowledge Graphs,” Jul. 10, 2021, arXiv: arXiv:2107.04771. doi: 10.48550/arXiv.2107.04771. [2] C.-L. Liu and Y.-F. Liu, “Some Practical Analyses of the Judgment Documents of Labor Litigations for Social Conflicts and Similar Cases,” presented at the LegalAIIA 2023, Braga, Minho, Portugal, Jun. 2023, p. 100‒109. [Online]. Available: https://ceur-ws.org/Vol-3423/paper10.pdf [3] D. M. Katz, M. J. Bommarito, S. Gao, and P. Arredondo, “GPT-4 Passes the Bar Exam,” Mar. 15, 2023, Rochester, NY: 4389233. doi: 10.2139/ssrn.4389233. [4] H. Surden, “ChatGPT, Artificial Intelligence (AI) Large Language Models, and Law,” Mar. 31, 2024, Social Science Research Network, Rochester, NY: 4779694. Accessed: Oct. 26, 2024. [Online]. Available: https://papers.ssrn.com/abstract=4779694 [5] Y.-T. Lin and Y.-N. Chen, “Taiwan LLM: Bridging the Linguistic Divide with a Culturally Aligned Language Model,” Nov. 29, 2023, arXiv: arXiv:2311.17487. doi: 10.48550/arXiv.2311.17487. [6] Y. Bengio, R. Ducharme, and P. Vincent, “A Neural Probabilistic Language Model,” in Advances in Neural Information Processing Systems, MIT Press, 2000. Accessed: Aug. 15, 2023. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2000/hash/728f206c2a01bf572b5940d7d9a8fa4c-Abstract.html [7] S. R. Eddy, “Hidden Markov models,” Current Opinion in Structural Biology, vol. 6, no. 3, pp. 361–365, Jun. 1996, doi: 10.1016/S0959-440X(96)80056-X. [8] A. McCallum, D. Freitag, and F. Pereira, “Maximum Entropy Markov Models for Information Extraction and Segmentation,” presented at the 17th International Conf. on Machine Learning, 2000. [9] T. R. Niesler and P. C. Woodland, “A variable-length category-based n-gram language model,” in 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings, May 1996, pp. 164–167 vol. 1. doi: 10.1109/ICASSP.1996.540316. [10] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient Estimation of Word Representations in Vector Space,” Sep. 06, 2013, arXiv: arXiv:1301.3781. Accessed: Oct. 09, 2024. [Online]. Available: http://arxiv.org/abs/1301.3781 [11] A. Vaswani et al., “Attention Is All You Need,” Dec. 05, 2017, arXiv: arXiv:1706.03762. doi: 10.48550/arXiv.1706.03762. [12] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” May 24, 2019, arXiv: arXiv:1810.04805. doi: 10.48550/arXiv.1810.04805. [13] M. Lewis et al., “BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension,” Oct. 29, 2019, arXiv: arXiv:1910.13461. doi: 10.48550/arXiv.1910.13461. [14] A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever, “Improving Language Understanding by Generative Pre-Training,” 2018, [Online]. Available: https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf [15] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever, “Language Models are Unsupervised Multitask Learners,” OpenAI blog, 2019, [Online]. Available: https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf [16] T. Brown et al., “Language Models are Few-Shot Learners,” in Advances in Neural Information Processing Systems, Curran Associates, Inc., 2020, pp. 1877–1901. Accessed: Mar. 22, 2023. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html [17] OpenAI, “GPT-4 Technical Report,” Mar. 27, 2023, arXiv: arXiv:2303.08774. doi: 10.48550/arXiv.2303.08774. [18] J. Weizenbaum, “ELIZA—a computer program for the study of natural language communication between man and machine,” Commun. ACM, vol. 9, no. 1, pp. 36–45, Jan. 1966, doi: 10.1145/365153.365168. [19] R. S. Wallace, “The Anatomy of A.L.I.C.E.,” in Parsing the Turing Test: Philosophical and Methodological Issues in the Quest for the Thinking Computer, R. Epstein, G. Roberts, and G. Beber, Eds., Dordrecht: Springer Netherlands, 2009, pp. 181–210. doi: 10.1007/978-1-4020-6710-5_13. [20] H. Touvron et al., “LLaMA: Open and Efficient Foundation Language Models,” Feb. 27, 2023, arXiv: arXiv:2302.13971. doi: 10.48550/arXiv.2302.13971. [21] Y. Wang et al., “Self-Instruct: Aligning Language Models with Self-Generated Instructions,” May 25, 2023, arXiv: arXiv:2212.10560. doi: 10.48550/arXiv.2212.10560. [22] B. Goertzel, “Artificial General Intelligence: Concept, State of the Art, and Future Prospects,” Journal of Artificial General Intelligence, vol. 5, no. 1, pp. 1–48, Dec. 2014, doi: 10.2478/jagi-2014-0001. [23] H. Touvron et al., “Llama 2: Open Foundation and Fine-Tuned Chat Models,” Jul. 19, 2023, arXiv: arXiv:2307.09288. doi: 10.48550/arXiv.2307.09288. [24] J. Su, M. Ahmed, Y. Lu, S. Pan, W. Bo, and Y. Liu, “RoFormer: Enhanced transformer with Rotary Position Embedding,” Neurocomputing, vol. 568, p. 127063, Feb. 2024, doi: 10.1016/j.neucom.2023.127063. [25] S. Chen, S. Wong, L. Chen, and Y. Tian, “Extending Context Window of Large Language Models via Positional Interpolation,” Jun. 28, 2023, arXiv: arXiv:2306.15595. Accessed: Jul. 25, 2024. [Online]. Available: http://arxiv.org/abs/2306.15595 [26] C. Li et al., “DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing,” arXiv.org. Accessed: Aug. 16, 2023. [Online]. Available: https://arxiv.org/abs/2212.03597v2 [27] S. Rajbhandari, J. Rasley, O. Ruwase, and Y. He, “ZeRO: Memory Optimizations Toward Training Trillion Parameter Models,” arXiv.org. Accessed: Aug. 16, 2023. [Online]. Available: https://arxiv.org/abs/1910.02054v3 [28] G. Wang et al., “ZeRO++: Extremely Efficient Collective Communication for Giant Model Training,” Jun. 16, 2023, arXiv: arXiv:2306.10209. doi: 10.48550/arXiv.2306.10209. [29] E. J. Hu et al., “LoRA: Low-Rank Adaptation of Large Language Models,” Oct. 16, 2021, arXiv: arXiv:2106.09685. doi: 10.48550/arXiv.2106.09685. [30] T. Dettmers, A. Pagnoni, A. Holtzman, and L. Zettlemoyer, “QLoRA: Efficient Finetuning of Quantized LLMs,” May 23, 2023, arXiv: arXiv:2305.14314. doi: 10.48550/arXiv.2305.14314. [31] X. L. Li and P. Liang, “Prefix-Tuning: Optimizing Continuous prompts for Generation,” Jan. 01, 2021, arXiv: arXiv:2101.00190. doi: 10.48550/arXiv.2101.00190. [32] X. Liu et al., “P-Tuning v2: prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks,” Mar. 20, 2022, arXiv: arXiv:2110.07602. doi: 10.48550/arXiv.2110.07602. [33] T. Dao, D. Y. Fu, S. Ermon, A. Rudra, and C. Ré, “FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness,” Jun. 23, 2022, arXiv: arXiv:2205.14135. doi: 10.48550/arXiv.2205.14135. [34] P. Lewis et al., “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks,” Apr. 12, 2021, arXiv: arXiv:2005.11401. Accessed: Oct. 09, 2024. [Online]. Available: http://arxiv.org/abs/2005.11401 [35] Y. Gao et al., “Retrieval-Augmented Generation for Large Language Models: A Survey,” Mar. 27, 2024, arXiv: arXiv:2312.10997. Accessed: Jul. 31, 2024. [Online]. Available: http://arxiv.org/abs/2312.10997 [36] Z. Zhou et al., “LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model,” Jun. 06, 2024, arXiv: arXiv:2406.04614. Accessed: Aug. 15, 2024. [Online]. Available: http://arxiv.org/abs/2406.04614 [37] H.-T. Nguyen, “A Brief Report on LawGPT 1.0: A Virtual Legal Assistant Based on GPT-3,” Feb. 14, 2023, arXiv: arXiv:2302.05729. doi: 10.48550/arXiv.2302.05729. [38] V. R. Doncel and E. M. Ponsoda, “LYNX: Towards a Legal Knowledge Graph for Multilingual Europe,” Law in Context. A Socio-legal Journal, vol. 37, no. 1, pp. 175–178, Dec. 2020, doi: 10.26826/law-in-context.v37i1.129. [39] X. Wei et al., “Zero-Shot Information Extraction via Chatting with ChatGPT,” Feb. 20, 2023, arXiv: arXiv:2302.10205. Accessed: Mar. 07, 2023. [Online]. Available: http://arxiv.org/abs/2302.10205 [40] N. Muennighoff, “SGPT: GPT Sentence Embeddings for Semantic Search,” Aug. 05, 2022, arXiv: arXiv:2202.08904. Accessed: Nov. 29, 2023. [Online]. Available: http://arxiv.org/abs/2202.08904 [41] Y. Cui, Z. Yang, and X. Yao, “Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca,” Apr. 17, 2023, arXiv: arXiv:2304.08177. doi: 10.48550/arXiv.2304.08177. [42] T. Kudo and J. Richardson, “SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing,” Aug. 19, 2018, arXiv: arXiv:1808.06226. doi: 10.48550/arXiv.1808.06226. [43] P.-H. Wu, C.-L. Liu, and W.-J. Li, “An empirical evaluation of using ChatGPT to summarize disputes for recommending similar labor and employment cases in Chinese,” presented at the Proceedings of the Eighteenth International Workshop on Juris-Informatics (JURISIN 2024), associated with the Sixteenth JSAI International Symposium on AI (JSAI-isAI 2024), Hamamatsu, Shizuoka, Japan, Sep. 2024, p. 101‒114. doi: 10.48550/arXiv.2409.09280. [44] T. Zhang, V. Kishore, F. Wu, K. Q. Weinberger, and Y. Artzi, “BERTScore: Evaluating Text Generation with BERT,” Feb. 24, 2020, arXiv: arXiv:1904.09675. doi: 10.48550/arXiv.1904.09675. [45] Y. S. Chan and H. T. Ng, “MAXSIM: A Maximum Similarity Metric for Machine Translation Evaluation,” in Proceedings of ACL-08: HLT, J. D. Moore, S. Teufel, J. Allan, and S. Furui, Eds., Columbus, Ohio: Association for Computational Linguistics, Jun. 2008, pp. 55–62. Accessed: Oct. 08, 2024. [Online]. Available: https://aclanthology.org/P08-1007 [46] Q. Huang et al., “Lawyer LLaMA Technical Report,” Oct. 13, 2023, arXiv: arXiv:2305.15062. doi: 10.48550/arXiv.2305.15062. [47] N. Reimers and I. Gurevych, “Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China: Association for Computational Linguistics, Jan. 2019, pp. 3982–3992. doi: 10.18653/v1/D19-1410. [48] Y. Cui, W. Che, T. Liu, B. Qin, S. Wang, and G. Hu, “Revisiting Pre-Trained Models for Chinese Natural Language Processing,” in Findings of the Association for Computational Linguistics: EMNLP 2020, 2020, pp. 657–668. doi: 10.18653/v1/2020.findings-emnlp.58. [49] R. Hadsell, S. Chopra, and Y. LeCun, “Dimensionality Reduction by Learning an Invariant Mapping,” in 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2 (CVPR’06), New York, NY, USA: IEEE, 2006, pp. 1735–1742. doi: 10.1109/CVPR.2006.100.
Description:	碩士國立政治大學資訊科學系 110753128
Source URI:	http://thesis.lib.nccu.edu.tw/record/#G0110753128
Data Type:	thesis
Appears in Collections:	[資訊科學系] 學位論文

Files in This Item:

File	Description	Size	Format
312801.pdf		5317Kb	Adobe PDF	0	View/Open

All items in 政大典藏 are protected by copyright, with all rights reserved.

社群 sharing

著作權政策宣告 Copyright Announcement

1.本網站之數位內容為國立政治大學所收錄之機構典藏，無償提供學術研究與公眾教育等公益性使用，惟仍請適度，合理使用本網站之內容，以尊重著作權人之權益。商業上之利用，則請先取得著作權人之授權。
The digital content of this website is part of National Chengchi University Institutional Repository. It provides free access to academic research and public education for non-commercial use. Please utilize it in a proper and reasonable manner and respect the rights of copyright owners. For commercial use, please obtain authorization from the copyright owner in advance.

2.本網站之製作，已盡力防止侵害著作權人之權益，如仍發現本網站之數位內容有侵害著作權人權益情事者，請權利人通知本網站維護人員(nccur@nccu.edu.tw)，維護人員將立即採取移除該數位著作等補救措施。
NCCU Institutional Repository is made to protect the interests of copyright owners. If you believe that any material on the website infringes copyright, please contact our staff(nccur@nccu.edu.tw). We will remove the work from the repository and investigate your claim.

DSpace Software Copyright © 2002-2004 MIT & Hewlett-Packard / Enhanced by NTU Library IR team Copyright © - Feedback