Articles

The Unreasonable Effectiveness of Large Language Models in Zero-Shot Semantic Annotation of Legal Texts

Jaromir Savelka, Carnegie Mellon University School of Computer Science
Kevin D. Ashley, University of Pittsburgh School of Law

Document Type

Article

Publication Date

11-17-2023

Abstract

The emergence of ChatGPT has sensitized the general public, including the legal profession, to large language models' (LLMs) potential uses (e.g., document drafting, question answering, and summarization). Although recent studies have shown how well the technology performs in diverse semantic annotation tasks focused on legal texts, an influx of newer, more capable (GPT-4) or cost-effective (GPT-3.5-turbo) models requires another analysis. This paper addresses recent developments in the ability of LLMs to semantically annotate legal texts in zero-shot learning settings. Given the transition to mature generative AI systems, we examine the performance of GPT-4 and GPT-3.5-turbo(-16k), comparing it to the previous generation of GPT models, on three legal text annotation tasks involving diverse documents such as adjudicatory opinions, contractual clauses, or statutory provisions. We also compare the models' performance and cost to better understand the trade-offs. We found that the GPT-4 model clearly outperforms the GPT-3.5 models on two of the three tasks. The cost-effective GPT-3.5-turbo matches the performance of the 20× more expensive text-davinci-003 model. While one can annotate multiple data points within a single prompt, the performance degrades as the size of the batch increases. This work provides valuable information relevant for many practical applications (e.g., in contract review) and research projects (e.g., in empirical legal studies). Legal scholars and practicing lawyers alike can leverage these findings to guide their decisions in integrating LLMs in a wide range of workflows involving semantic annotation of legal texts.

Recommended Citation

Jaromir Savelka & Kevin D. Ashley, The Unreasonable Effectiveness of Large Language Models in Zero-Shot Semantic Annotation of Legal Texts, 6 Frontiers in Artificial Intelligence 1 (2023).
Available at: https://scholarship.law.pitt.edu/fac_articles/582

DOI Link

10.3389/frai.2023.1279794

Included in

Artificial Intelligence and Robotics Commons, Computer Law Commons, Educational Assessment, Evaluation, and Research Commons, Educational Technology Commons, Law and Society Commons, Legal Profession Commons, Legal Writing and Research Commons, Speech and Rhetorical Studies Commons
