Generative AI Passes National Lawyer Ethics Exam: A New Era in Legal Studies

GPT-4 and Claude 2 both scored above the approximate passing threshold for the MPRE, with GPT-4 even outperforming the average human test-taker.

In a groundbreaking development, two generative AI large language models (LLMs) successfully passed the Multistate Professional Responsibility Exam (MPRE), marking a significant milestone in legal technology. According to a new study conducted by contract review and drafting startup LegalOn Technologies, both OpenAI’s GPT-4 and Anthropic’s Claude 2 were tested without any specific training about legal ethics. 

The MPRE is a two-hour, 60-question multiple-choice examination that is administered three times per year. Developed by the National Conference of Bar Examiners (NCBE), the MPRE is required for admission to the bars of all but two U.S. jurisdictions. The exam is designed to measure the examinee’s knowledge and understanding of established standards related to a lawyer’s professional conduct.

The Study

The study, conducted by Gabor Melli, PhD, VP of AI at LegalOn Technologies; Daniel Lewis, JD, US CEO of LegalOn Technologies; and Professor Dru Stevenson, JD, of South Texas College of Law Houston, tested OpenAI's GPT-4 and GPT-3.5, Anthropic's Claude 2, and Google's PaLM 2 Bison on 100 simulated exams.

Among these, GPT-4 performed best, answering 74% of questions correctly, an estimated 6% better than the average human test-taker. GPT-4 and Claude 2 both scored above the approximate passing threshold for the MPRE, estimated to range from 56% to 64% depending on the jurisdiction. GPT-3.5 and PaLM 2 Bison both scored below the estimated passing threshold.
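For readers curious about the mechanics of such a test, the sketch below shows one way a multiple-choice evaluation of this kind might be scripted. It is a minimal illustration only: the question data, prompt wording, model name, and scoring logic are assumptions for the example, not LegalOn Technologies' actual methodology.

```python
# Illustrative sketch of scoring an LLM on simulated multiple-choice exam items.
# Question content below is placeholder text, not real MPRE material.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Each simulated item: question stem, answer choices, and the keyed answer.
questions = [
    {
        "stem": "A lawyer is asked by a former client to ...",
        "choices": {"A": "...", "B": "...", "C": "...", "D": "..."},
        "answer": "B",
    },
    # ... remaining items of a simulated 60-question exam
]

def ask_model(item):
    """Present one item to the model and return its single-letter answer."""
    prompt = (
        f"{item['stem']}\n\n"
        + "\n".join(f"{k}. {v}" for k, v in item["choices"].items())
        + "\n\nAnswer with the single letter of the best choice."
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()[:1].upper()

correct = sum(ask_model(q) == q["answer"] for q in questions)
print(f"Score: {correct}/{len(questions)} = {correct / len(questions):.0%}")
```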

Additionally, the LLMs performed better in some subject areas than others. For example, GPT-4 scored higher on topics related to conflicts of interest and client relationships, and lower in areas such as the safekeeping of funds.

GPT-4 Passes UBE

The results of the LegalOn Technologies study follow earlier research this year that tested a preliminary version of GPT-4 against prior generations of GPT on the entire Uniform Bar Examination (UBE). The UBE includes not only the multiple-choice Multistate Bar Examination (MBE) but also the open-ended Multistate Essay Exam (MEE) and Multistate Performance Test (MPT) components. This comprehensive evaluation allowed the researchers to assess the AI's ability to handle a variety of question types and legal scenarios.

On the MBE, GPT-4 significantly outperformed both human test-takers and prior models, demonstrating a 26% increase over ChatGPT and beating humans in five of seven subject areas. This is a remarkable achievement, as it shows that the AI was able to understand and correctly answer complex legal questions.

The MEE and MPT, which had not previously been evaluated by scholars, posed a different challenge. These components required the AI to generate longer, more detailed responses. Despite this, GPT-4 scored an average of 4.2/6.0, compared with much lower scores for ChatGPT. This suggests that the AI has made significant strides in its ability to generate coherent, relevant, and accurate legal arguments.

When graded across the UBE components in the manner a human test-taker would be, GPT-4 scored approximately 297 points, well above the passing threshold in every UBE jurisdiction. These findings document not just the rapid and remarkable advance of large language model performance generally, but also the potential for such models to support the delivery of legal services in society.

Implications and Future Directions

The success of GPT-4 and Claude 2 in passing the MPRE opens up exciting possibilities for the use of AI in the legal field. AI chatbots could potentially assist in legal research, document review, and even provide basic legal advice, thereby increasing efficiency and reducing costs.

“This research advances our understanding of how AI can assist lawyers and helps us assess its current strengths and limitations,” LegalOn Technologies CEO Daniel Lewis said in a press release. “We are not suggesting that AI knows right from wrong or that its behavior is guided by moral principles, but these findings do indicate that AI has potential to support ethical decision-making.”

As AI takes a larger role in legal decision-making, the ethical terrain becomes harder to navigate. A framework grounded in transparency, fairness, and accountability will be needed to ensure that AI-assisted decisions meet the exacting ethical standards expected within the legal domain.

As we move forward, it will be crucial to continue exploring the potential applications of AI in law, while also addressing the ethical and practical challenges that arise.
