Reference: | [1] J. S. Dhani, R. Bhatt, B. Ganesan, P. Sirohi, and V. Bhatnagar, “Similar Cases Recommendation using Legal Knowledge Graphs,” Jul. 10, 2021, arXiv: arXiv:2107.04771. doi: 10.48550/arXiv.2107.04771. [2] C.-L. Liu and Y.-F. Liu, “Some Practical Analyses of the Judgment Documents of Labor Litigations for Social Conflicts and Similar Cases,” presented at the LegalAIIA 2023, Braga, Minho, Portugal, Jun. 2023, p. 100‒109. [Online]. Available: https://ceur-ws.org/Vol-3423/paper10.pdf [3] D. M. Katz, M. J. Bommarito, S. Gao, and P. Arredondo, “GPT-4 Passes the Bar Exam,” Mar. 15, 2023, Rochester, NY: 4389233. doi: 10.2139/ssrn.4389233. [4] H. Surden, “ChatGPT, Artificial Intelligence (AI) Large Language Models, and Law,” Mar. 31, 2024, Social Science Research Network, Rochester, NY: 4779694. Accessed: Oct. 26, 2024. [Online]. Available: https://papers.ssrn.com/abstract=4779694 [5] Y.-T. Lin and Y.-N. Chen, “Taiwan LLM: Bridging the Linguistic Divide with a Culturally Aligned Language Model,” Nov. 29, 2023, arXiv: arXiv:2311.17487. doi: 10.48550/arXiv.2311.17487. [6] Y. Bengio, R. Ducharme, and P. Vincent, “A Neural Probabilistic Language Model,” in Advances in Neural Information Processing Systems, MIT Press, 2000. Accessed: Aug. 15, 2023. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2000/hash/728f206c2a01bf572b5940d7d9a8fa4c-Abstract.html [7] S. R. Eddy, “Hidden Markov models,” Current Opinion in Structural Biology, vol. 6, no. 3, pp. 361–365, Jun. 1996, doi: 10.1016/S0959-440X(96)80056-X. [8] A. McCallum, D. Freitag, and F. Pereira, “Maximum Entropy Markov Models for Information Extraction and Segmentation,” presented at the 17th International Conf. on Machine Learning, 2000. [9] T. R. Niesler and P. C. Woodland, “A variable-length category-based n-gram language model,” in 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings, May 1996, pp. 164–167 vol. 1. doi: 10.1109/ICASSP.1996.540316. [10] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient Estimation of Word Representations in Vector Space,” Sep. 06, 2013, arXiv: arXiv:1301.3781. Accessed: Oct. 09, 2024. [Online]. Available: http://arxiv.org/abs/1301.3781 [11] A. Vaswani et al., “Attention Is All You Need,” Dec. 05, 2017, arXiv: arXiv:1706.03762. doi: 10.48550/arXiv.1706.03762. [12] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” May 24, 2019, arXiv: arXiv:1810.04805. doi: 10.48550/arXiv.1810.04805. [13] M. Lewis et al., “BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension,” Oct. 29, 2019, arXiv: arXiv:1910.13461. doi: 10.48550/arXiv.1910.13461. [14] A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever, “Improving Language Understanding by Generative Pre-Training,” 2018, [Online]. Available: https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf [15] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever, “Language Models are Unsupervised Multitask Learners,” OpenAI blog, 2019, [Online]. Available: https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf [16] T. Brown et al., “Language Models are Few-Shot Learners,” in Advances in Neural Information Processing Systems, Curran Associates, Inc., 2020, pp. 1877–1901. Accessed: Mar. 22, 2023. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html [17] OpenAI, “GPT-4 Technical Report,” Mar. 27, 2023, arXiv: arXiv:2303.08774. doi: 10.48550/arXiv.2303.08774. [18] J. Weizenbaum, “ELIZA—a computer program for the study of natural language communication between man and machine,” Commun. ACM, vol. 9, no. 1, pp. 36–45, Jan. 1966, doi: 10.1145/365153.365168. [19] R. S. Wallace, “The Anatomy of A.L.I.C.E.,” in Parsing the Turing Test: Philosophical and Methodological Issues in the Quest for the Thinking Computer, R. Epstein, G. Roberts, and G. Beber, Eds., Dordrecht: Springer Netherlands, 2009, pp. 181–210. doi: 10.1007/978-1-4020-6710-5_13. [20] H. Touvron et al., “LLaMA: Open and Efficient Foundation Language Models,” Feb. 27, 2023, arXiv: arXiv:2302.13971. doi: 10.48550/arXiv.2302.13971. [21] Y. Wang et al., “Self-Instruct: Aligning Language Models with Self-Generated Instructions,” May 25, 2023, arXiv: arXiv:2212.10560. doi: 10.48550/arXiv.2212.10560. [22] B. Goertzel, “Artificial General Intelligence: Concept, State of the Art, and Future Prospects,” Journal of Artificial General Intelligence, vol. 5, no. 1, pp. 1–48, Dec. 2014, doi: 10.2478/jagi-2014-0001. [23] H. Touvron et al., “Llama 2: Open Foundation and Fine-Tuned Chat Models,” Jul. 19, 2023, arXiv: arXiv:2307.09288. doi: 10.48550/arXiv.2307.09288. [24] J. Su, M. Ahmed, Y. Lu, S. Pan, W. Bo, and Y. Liu, “RoFormer: Enhanced transformer with Rotary Position Embedding,” Neurocomputing, vol. 568, p. 127063, Feb. 2024, doi: 10.1016/j.neucom.2023.127063. [25] S. Chen, S. Wong, L. Chen, and Y. Tian, “Extending Context Window of Large Language Models via Positional Interpolation,” Jun. 28, 2023, arXiv: arXiv:2306.15595. Accessed: Jul. 25, 2024. [Online]. Available: http://arxiv.org/abs/2306.15595 [26] C. Li et al., “DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing,” arXiv.org. Accessed: Aug. 16, 2023. [Online]. Available: https://arxiv.org/abs/2212.03597v2 [27] S. Rajbhandari, J. Rasley, O. Ruwase, and Y. He, “ZeRO: Memory Optimizations Toward Training Trillion Parameter Models,” arXiv.org. Accessed: Aug. 16, 2023. [Online]. Available: https://arxiv.org/abs/1910.02054v3 [28] G. Wang et al., “ZeRO++: Extremely Efficient Collective Communication for Giant Model Training,” Jun. 16, 2023, arXiv: arXiv:2306.10209. doi: 10.48550/arXiv.2306.10209. [29] E. J. Hu et al., “LoRA: Low-Rank Adaptation of Large Language Models,” Oct. 16, 2021, arXiv: arXiv:2106.09685. doi: 10.48550/arXiv.2106.09685. [30] T. Dettmers, A. Pagnoni, A. Holtzman, and L. Zettlemoyer, “QLoRA: Efficient Finetuning of Quantized LLMs,” May 23, 2023, arXiv: arXiv:2305.14314. doi: 10.48550/arXiv.2305.14314. [31] X. L. Li and P. Liang, “Prefix-Tuning: Optimizing Continuous prompts for Generation,” Jan. 01, 2021, arXiv: arXiv:2101.00190. doi: 10.48550/arXiv.2101.00190. [32] X. Liu et al., “P-Tuning v2: prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks,” Mar. 20, 2022, arXiv: arXiv:2110.07602. doi: 10.48550/arXiv.2110.07602. [33] T. Dao, D. Y. Fu, S. Ermon, A. Rudra, and C. Ré, “FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness,” Jun. 23, 2022, arXiv: arXiv:2205.14135. doi: 10.48550/arXiv.2205.14135. [34] P. Lewis et al., “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks,” Apr. 12, 2021, arXiv: arXiv:2005.11401. Accessed: Oct. 09, 2024. [Online]. Available: http://arxiv.org/abs/2005.11401 [35] Y. Gao et al., “Retrieval-Augmented Generation for Large Language Models: A Survey,” Mar. 27, 2024, arXiv: arXiv:2312.10997. Accessed: Jul. 31, 2024. [Online]. Available: http://arxiv.org/abs/2312.10997 [36] Z. Zhou et al., “LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model,” Jun. 06, 2024, arXiv: arXiv:2406.04614. Accessed: Aug. 15, 2024. [Online]. Available: http://arxiv.org/abs/2406.04614 [37] H.-T. Nguyen, “A Brief Report on LawGPT 1.0: A Virtual Legal Assistant Based on GPT-3,” Feb. 14, 2023, arXiv: arXiv:2302.05729. doi: 10.48550/arXiv.2302.05729. [38] V. R. Doncel and E. M. Ponsoda, “LYNX: Towards a Legal Knowledge Graph for Multilingual Europe,” Law in Context. A Socio-legal Journal, vol. 37, no. 1, pp. 175–178, Dec. 2020, doi: 10.26826/law-in-context.v37i1.129. [39] X. Wei et al., “Zero-Shot Information Extraction via Chatting with ChatGPT,” Feb. 20, 2023, arXiv: arXiv:2302.10205. Accessed: Mar. 07, 2023. [Online]. Available: http://arxiv.org/abs/2302.10205 [40] N. Muennighoff, “SGPT: GPT Sentence Embeddings for Semantic Search,” Aug. 05, 2022, arXiv: arXiv:2202.08904. Accessed: Nov. 29, 2023. [Online]. Available: http://arxiv.org/abs/2202.08904 [41] Y. Cui, Z. Yang, and X. Yao, “Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca,” Apr. 17, 2023, arXiv: arXiv:2304.08177. doi: 10.48550/arXiv.2304.08177. [42] T. Kudo and J. Richardson, “SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing,” Aug. 19, 2018, arXiv: arXiv:1808.06226. doi: 10.48550/arXiv.1808.06226. [43] P.-H. Wu, C.-L. Liu, and W.-J. Li, “An empirical evaluation of using ChatGPT to summarize disputes for recommending similar labor and employment cases in Chinese,” presented at the Proceedings of the Eighteenth International Workshop on Juris-Informatics (JURISIN 2024), associated with the Sixteenth JSAI International Symposium on AI (JSAI-isAI 2024), Hamamatsu, Shizuoka, Japan, Sep. 2024, p. 101‒114. doi: 10.48550/arXiv.2409.09280. [44] T. Zhang, V. Kishore, F. Wu, K. Q. Weinberger, and Y. Artzi, “BERTScore: Evaluating Text Generation with BERT,” Feb. 24, 2020, arXiv: arXiv:1904.09675. doi: 10.48550/arXiv.1904.09675. [45] Y. S. Chan and H. T. Ng, “MAXSIM: A Maximum Similarity Metric for Machine Translation Evaluation,” in Proceedings of ACL-08: HLT, J. D. Moore, S. Teufel, J. Allan, and S. Furui, Eds., Columbus, Ohio: Association for Computational Linguistics, Jun. 2008, pp. 55–62. Accessed: Oct. 08, 2024. [Online]. Available: https://aclanthology.org/P08-1007 [46] Q. Huang et al., “Lawyer LLaMA Technical Report,” Oct. 13, 2023, arXiv: arXiv:2305.15062. doi: 10.48550/arXiv.2305.15062. [47] N. Reimers and I. Gurevych, “Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China: Association for Computational Linguistics, Jan. 2019, pp. 3982–3992. doi: 10.18653/v1/D19-1410. [48] Y. Cui, W. Che, T. Liu, B. Qin, S. Wang, and G. Hu, “Revisiting Pre-Trained Models for Chinese Natural Language Processing,” in Findings of the Association for Computational Linguistics: EMNLP 2020, 2020, pp. 657–668. doi: 10.18653/v1/2020.findings-emnlp.58. [49] R. Hadsell, S. Chopra, and Y. LeCun, “Dimensionality Reduction by Learning an Invariant Mapping,” in 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2 (CVPR’06), New York, NY, USA: IEEE, 2006, pp. 1735–1742. doi: 10.1109/CVPR.2006.100. |