AI Tools Surpass Lawyers in Legal Research Accuracy, Vals Report Finds

A new Vals AI report shows tools like Alexi, Counsel Stack, Midpage, and ChatGPT outperform lawyers in legal research accuracy and authoritativeness.

Key points:

  • AI tools scored higher than lawyers in a 200-question legal research evaluation.
  • Alexi, Counsel Stack, Midpage, and ChatGPT all surpassed the human lawyer baseline.
  • ChatGPT performed well despite not being purpose-built for legal work.
  • AI systems struggled with multi-jurisdictional and citation-specific questions.
  • Experts say human review remains essential for accuracy and interpretation.

The report from LLM evaluation startup Vals AI compared the performance of Alexi, Counsel Stack, Midpage, and OpenAI’s ChatGPT against human lawyers on 200 U.S. legal research questions. The questions were sourced from attorneys at firms including Reed Smith, Fisher Phillips, McDermott Will & Emery, Ogletree Deakins, Paul Hastings, and Paul, Weiss, Rifkind, Wharton & Garrison.

Each response, whether from an AI tool or a human lawyer, was scored for accuracy, authoritativeness, and clarity. The lawyer baseline averaged 69%, and every AI tool scored higher: Counsel Stack led at 78%, followed by Alexi at 77%, Midpage at 76%, and ChatGPT at 74%.

Tara Waters, the project’s lead, said she expected ChatGPT to excel in citation quality but found the opposite. “ChatGPT doesn't seem to be, yet, well-engineered for the sourcing and citation,” she told Legaltech News. The generalist AI tended to rely on broad web-based materials rather than pinpointing authoritative statutes or cases, she said.

The legal-focused tools showed weaknesses as well. When prompted to survey all 50 states for a single statute, they underperformed ChatGPT. “That was surprising,” said Vals AI CEO Rayan Krishnan, who noted that the systems “should be able to check each one procedurally” without fatigue. He speculated that some tools may have jurisdictional coverage limits or outdated data.

Krishnan cautioned that despite their strong aggregate scores, AI outputs still leave critical gaps. “Even if these tools are getting 70% accuracy, that remaining 30% is really valuable to have human input for,” he said.

Waters added that Vals AI plans to make its evaluation process more repeatable and automated but emphasized the need for continued human involvement in review and scoring. “There won't ever be a pure automated answer for this,” she said, “but we’ll be able to do it more frequently and consistently.”

This study follows Vals AI’s February benchmark that evaluated legal AI platforms from Thomson Reuters, Harvey, vLex, LexisNexis, and Vecflow, assessing how accurately they handled case analysis and transactional work. The new results suggest that, while AI continues to narrow the performance gap with lawyers, the future of legal research may depend as much on oversight as on automation.
