Utilization of LLM in the Automation Process of Contract Template Recognition

Adam Ramdani Kusnandar
Herman Bedi Agtriadi

Abstract

This research investigates text similarity methods for automating the recognition of contract templates across varied contract drafts. Identifying the correct template is a crucial step before the automation proceeds to clause-by-clause evaluation. Recognition works by dynamically comparing clause text between a draft and each candidate template, relying only on the available text and requiring no data labeling. Testing covered traditional methods (Jaccard similarity, TF-IDF, BM25) and natural language processing methods (BERT, LaBSE, LLM). The methodology consists of acquiring contract samples from various sources, creating templates, and testing template recognition. The output of each method is evaluated on how effectively it captures semantic equivalence and contextual understanding. The results show that the LLM recognizes templates robustly while learning from only the first few sample clauses. These findings indicate that LLM-based template recognition provides the best precision and accuracy among the methods tested, outperforming both the traditional methods and the other natural language processing methods. This research can therefore serve as a foundation for developing a template-based contract review automation system that is more robust to contract variations.
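
As a rough illustration of the label-free clause comparison described in the abstract, the following minimal sketch scores a draft against candidate templates using Jaccard similarity, one of the traditional baselines named above. The tokenizer, the scoring rule (mean of each draft clause's best match), the function names, and the sample clauses are all assumptions made for illustration; they are not taken from the paper and may differ from the authors' implementation.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| between two token sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recognize_template(draft_clauses, templates):
    """Return (name, score) of the template that best matches the draft.

    Assumed scoring rule: for each draft clause take its best Jaccard
    match among the template's clauses, then average over the draft.
    """
    scores = {}
    for name, template_clauses in templates.items():
        template_sets = [tokenize(c) for c in template_clauses]
        per_clause = [
            max((jaccard(tokenize(d), t) for t in template_sets), default=0.0)
            for d in draft_clauses
        ]
        scores[name] = sum(per_clause) / len(per_clause) if per_clause else 0.0
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical templates and draft, invented for this example only.
templates = {
    "lease": [
        "The tenant shall pay rent monthly.",
        "The landlord shall maintain the premises.",
    ],
    "employment": [
        "The employee shall receive a monthly salary.",
        "The employer may terminate this agreement with notice.",
    ],
}
draft = [
    "The tenant shall pay the rent every month.",
    "The landlord must maintain the premises.",
]

best_name, best_score = recognize_template(draft, templates)
print(best_name, round(best_score, 3))
```

A token-overlap score like this only rewards shared wording, which is why the abstract contrasts such baselines with embedding-based and LLM-based methods that aim to capture semantic equivalence.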

How to Cite
Kusnandar, A. R., & Agtriadi, H. B. (2025). Utilization of LLM in the Automation Process of Contract Template Recognition. Jurnal E-Komtek (Elektro-Komputer-Teknik), 9(2), 563-581. https://doi.org/10.37339/e-komtek.v9i2.2511
