Universidad Panamericana
Assessing AI-Generated Legal Reasoning: A Benchmark for Legal Text Quality from Literature Review

Journal
Artificial Intelligence – COMIA 2025 17th Mexican Congress, Mexico City, Mexico, May 12–16, 2025, Proceedings, Part I
ISSN
1865-0929
Publisher
Springer Nature Switzerland
Date Issued
2025
Author(s)
Prince Tritto, Philippe  
Facultad de Derecho - CampCM  
Ponce, Hiram  
Facultad de Ingeniería - CampCM  
Type
text::book::book part
DOI
10.1007/978-3-031-97907-1_5
URL
https://scripta.up.edu.mx/handle/20.500.12552/12538
Abstract
The adoption of Large Language Models in law has sparked debate over how best to evaluate AI-generated legal reasoning. Existing benchmarks focus on surface-level accuracy, overlooking deeper dimensions such as argumentative coherence, practical usability, and alignment with jurisprudential values. This paper provides a comprehensive framework that integrates insights from formalism, interpretivism, realism, and argumentation theory to assess legal AI outputs. We first explore the philosophical foundations of legal reasoning, drawing on MacCormick’s concepts of internal and external justification and Perelman’s notions of audience-centered persuasion to highlight the rhetorical and moral dimensions essential for evaluation. Next, we examine structured approaches to evaluation from related fields before showing why existing benchmarks (e.g., LexGLUE, LegalBench, LegalAgentBench) only partially capture the subtleties of legal reasoning. We also contrast common law and civil law traditions to illustrate how a one-size-fits-all approach neglects the distinct roles of precedent versus codified statutes. Building on these theoretical and comparative insights, we propose a three-stage evaluation methodology that begins with automated screening for factual consistency, proceeds to expert-led rubric assessment across five dimensions (Accuracy, Reasoning, Clarity, Usefulness, and Safety), and concludes with iterative refinement through reliability checks. This structured approach, validated through a pilot study, aims to strike a balance between scalability and nuance, equipping researchers and practitioners with a robust tool for assessing AI-generated legal texts. Unifying theoretical rigor, domain-specific practicality, and cross-jurisdictional adaptability, this framework lays a solid foundation for legal AI benchmarks and paves the way for safer, more transparent deployment of AI in law. © The authors © Springer.
Subjects

Legal AI Benchmarking...

AI-Generated Legal Te...

Legal Reasoning Evalu...

Legal NLP Metrics

Argumentative Coheren...

Legal Text Quality As...

Human-in-the-Loop AI ...

License
Restricted Access
How to cite
Prince Tritto, P., Ponce, H. (2025). Assessing AI-Generated Legal Reasoning: A Benchmark for Legal Text Quality from Literature Review. In: Martínez-Villaseñor, L., Martínez-Seis, B., Pichardo, O. (eds) Artificial Intelligence – COMIA 2025. COMIA 2025. Communications in Computer and Information Science, vol 2552. Springer, Cham. https://doi.org/10.1007/978-3-031-97907-1_5
