AI Terms Legal Professionals Should Know: A Practical Glossary

This glossary explains core AI terms in plain English, with context and examples geared towards attorneys and other legal professionals.

Artificial intelligence is becoming part of everyday legal work. Lawyers, legal operations teams, contract managers, compliance professionals, and outside counsel are all being asked to evaluate, use, supervise, or buy AI tools.

The vocabulary can still sound technical: large language models, tokens, embeddings, transformers, RAG, grounding, agents, MCP, context windows, fine-tuning, reasoning models, and more.

Legal professionals do not need to become AI engineers. But understanding the core concepts makes it easier to evaluate tools, design better workflows, and use AI more effectively.

This glossary explains key AI terms in plain English, with practical examples for legal professionals.

1. Large Language Model

A large language model, or LLM, is the technology behind tools like ChatGPT, Claude, Gemini, and many AI products used in legal work.

At its core, an LLM predicts the next piece of language based on the language that came before it. If the input is “Time is of the…,” the model can predict “essence.” If the input is “This Mutual NDA is entered into by and between…,” the model can infer that the next language will likely identify the parties.

This basic capability becomes powerful at scale. Drafting, summarizing, clause review, legal research assistance, issue spotting, email drafting, and document comparison are all built on top of language prediction.

LLMs are especially relevant to legal work because legal work is language-heavy. Contracts, policies, regulations, briefs, memos, discovery documents, emails, board materials, and billing guidelines are all text-rich.

Practical example: An LLM can create a first-pass summary of an NDA, identify key provisions, and suggest areas for review.
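
To make next-word prediction concrete, here is a minimal Python sketch. It is a toy word-pair counter, not how a real LLM works internally: real models use neural networks trained on vastly more text. The small corpus and the function names are invented for illustration.

```python
from collections import Counter, defaultdict

# Tiny invented corpus; a real model learns from vastly more text.
CORPUS = [
    "time is of the essence",
    "time is of the essence in this agreement",
    "the parties agree that time is of the essence",
    "the essence of the agreement is confidentiality",
]

def build_bigram_counts(sentences):
    """Count which word follows each word in the corpus."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

counts = build_bigram_counts(CORPUS)
print(predict_next(counts, "the"))  # prints: essence
```

The sketch only looks one word back. An LLM considers far more of the preceding text, which is part of why its predictions are so much more useful.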

2. Datasets

Datasets are the collections of text and other materials used to train an AI model.

For LLMs, datasets may include books, websites, articles, code, public records, legal materials, contracts, regulations, cases, filings, and other written content. The dataset gives the model examples of how language is structured and how concepts relate to each other.

In legal work, datasets matter because the quality, breadth, and relevance of training materials can influence how well a model recognizes legal language, document structures, and professional writing patterns.

Practical example: A model exposed to many examples of contracts may be better at recognizing common sections such as definitions, confidentiality obligations, indemnities, limitations of liability, termination rights, and governing law.

3. Neural Networks

A neural network is the computational structure that allows an AI model to learn patterns from data.

Neural networks are inspired by the idea of connected nodes, where many small calculations combine to recognize patterns and produce outputs. In an LLM, the neural network learns relationships between words, phrases, concepts, and structures.

For legal professionals, the key point is that a neural network is not a database of stored answers. It is a pattern-learning system that generates responses based on what it has learned.

Practical example: A neural network can learn that NDAs, MSAs, SOWs, DPAs, employment agreements, and outside counsel guidelines each have recognizable structures and recurring concepts.

4. Parameters

Parameters are the internal numerical values, often called weights, that are adjusted during training.

When a model is trained, it repeatedly updates these parameters to become better at predicting useful language. Large models may have billions of parameters, which is one reason they can capture subtle relationships across many kinds of text.

Parameters are not manually written rules. They are learned internal values that influence how the model responds.

Practical example: Parameters help the model learn that “governing law,” “venue,” “jurisdiction,” and “choice of law” often appear in related legal contexts, even though they are not the same thing.
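
As a rough illustration of what "adjusting parameters during training" means, the Python sketch below trains a model with exactly one parameter. The data, learning rate, and step count are invented; a real LLM applies the same basic idea to billions of parameters at once.

```python
# Toy data that follows y = 3 * x. The "right" parameter value is 3.0,
# but the training loop is never told that directly.
DATA = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

def train(data, steps=200, learning_rate=0.01):
    """Fit a single parameter w so that w * x approximates y."""
    w = 0.0  # the parameter starts at an arbitrary value
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        gradient = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= learning_rate * gradient  # nudge w to reduce the error
    return w

learned_w = train(DATA)
print(round(learned_w, 3))
```

Nobody writes the rule "multiply by three" into the code. The value is learned from examples, which is the sense in which parameters are not manually written rules.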

5. Tokens

A token is a piece of text that an AI model processes. A token may be a word, part of a word, punctuation, or even a space.

Tokens matter because they affect cost, speed, and context. When a user uploads a long contract, policy manual, or transcript, the model processes that material as tokens rather than pages.

Tokens also matter because meaning can depend on small textual differences. “Indemnitor” and “indemnitee” are related terms, but they point in different directions. The same is true of “lessor” and “lessee,” or “offeror” and “offeree.”

Practical example: In contract review, token-level understanding helps the model distinguish between related terms and identify how they are used in context.
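
The Python sketch below shows the general idea of splitting text into tokens. It is a simplification: it splits on whole words and punctuation, whereas production tokenizers usually break text into subword pieces, so real token counts will differ. The function name is invented for illustration.

```python
import re

def simple_tokenize(text):
    """Split text into words and punctuation marks.

    Simplification: production tokenizers usually split text into
    subword pieces, so real token counts will differ.
    """
    return re.findall(r"\w+|[^\w\s]", text)

clause = "The Indemnitor shall defend the Indemnitee."
tokens = simple_tokenize(clause)
print(tokens)       # ['The', 'Indemnitor', 'shall', 'defend', 'the', 'Indemnitee', '.']
print(len(tokens))  # 7
```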

6. Context Window

A context window is the amount of text or information a model can consider at one time.

This is important in legal work because legal documents are often long and interconnected. A limitation of liability clause may need to be read alongside indemnity, confidentiality, data security, insurance, remedies, and carveout provisions. A defined term introduced early in a document may control language much later.

A larger context window allows a model to consider more material at once. The most effective workflows combine sufficient context with good document retrieval and clear instructions.

Practical example: A model reviewing a long MSA benefits from enough context to understand definitions, cross-references, schedules, exhibits, and related clauses.
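
When a document is larger than the context window, one common workaround is to split it into chunks that each fit. The Python sketch below assumes word counts are an acceptable stand-in for token counts, which is only an approximation; the function name and sample paragraphs are invented.

```python
def split_into_chunks(paragraphs, max_words):
    """Group paragraphs into chunks that each stay within max_words.

    Word count stands in for token count here, which is an approximation.
    A single paragraph longer than max_words becomes its own chunk.
    """
    chunks, current, current_len = [], [], 0
    for paragraph in paragraphs:
        length = len(paragraph.split())
        if current and current_len + length > max_words:
            chunks.append(current)
            current, current_len = [], 0
        current.append(paragraph)
        current_len += length
    if current:
        chunks.append(current)
    return chunks

paragraphs = [
    "1. Definitions. Confidential Information means any non-public information.",
    "2. Term. This Agreement lasts two years.",
    "3. Governing Law. New York law governs.",
]
chunks = split_into_chunks(paragraphs, max_words=15)
print(len(chunks))  # 2
```

Splitting has a cost in legal documents: a chunk containing a clause may no longer contain the definitions that clause relies on, which is one reason retrieval and larger context windows matter.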

7. Embeddings

Embeddings are numerical representations of meaning.

An embedding allows an AI system to represent a word, sentence, clause, document, or concept in a way that can be compared to other words, sentences, clauses, documents, or concepts.

A useful way to think about embeddings is as a map of meaning. Items that are similar in meaning are placed closer together on that map.

Practical example: “Force majeure,” “act of God,” “impossibility,” and “frustration of purpose” may appear close together because they often relate to similar legal concepts.

8. Vectors

A vector is an ordered list of numbers. It is the mathematical form an embedding takes.

In AI systems, text can be converted into vectors so that similar meanings can be compared and retrieved. This is what allows a system to find relevant information even when the exact words do not match.

For legal professionals, the practical value is concept-based search. A user can describe what they are looking for in plain English, and the system can search for materials with similar meaning.

Practical example: A legal team might ask, “Which agreements let the customer walk away without cause?” A vector-based system may find termination-for-convenience language even if the contract uses different wording.
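
To show how "closeness in meaning" can be measured, the Python sketch below compares vectors using cosine similarity, a standard measure of how closely two vectors point in the same direction. The three-number vectors are made up by hand for illustration; real embeddings are produced by a trained model and are much longer.

```python
import math

# Hand-made 3-number vectors for illustration only. Real embeddings come
# from a trained model and typically have hundreds or thousands of dimensions.
TOY_EMBEDDINGS = {
    "force majeure": [0.9, 0.1, 0.0],
    "act of God": [0.8, 0.2, 0.1],
    "governing law": [0.0, 0.1, 0.9],
}

def cosine_similarity(a, b):
    """Return a score near 1.0 for similar directions and near 0.0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

close = cosine_similarity(TOY_EMBEDDINGS["force majeure"], TOY_EMBEDDINGS["act of God"])
far = cosine_similarity(TOY_EMBEDDINGS["force majeure"], TOY_EMBEDDINGS["governing law"])
print(close > far)  # True
```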

9. Vector Database

A vector database stores and retrieves embeddings, or meaning-based representations of text.

In practical terms, a vector database enables concept-based search. Instead of asking only, “Which documents contain this exact word?” it asks, “Which documents are closest in meaning to this question?”

This is useful in legal work because the same concept can appear in many different forms. A party saying, “I want out,” may implicate termination, breach, cure periods, walk-away rights, force majeure, or convenience termination. A keyword search may miss some of that context. A vector search can retrieve more conceptually relevant materials.

Practical example: A vector database can help retrieve contracts with clauses similar to a proposed limitation of liability provision, even when the wording differs.
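
The Python sketch below is a tiny in-memory stand-in for a vector database. It is illustrative only: the class name, the two-number vectors, and the sample clauses are all invented, and a real system would compute the vectors with an embedding model and use a purpose-built index.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class TinyVectorStore:
    """In-memory stand-in for a vector database (illustrative only)."""

    def __init__(self):
        self._items = []  # list of (text, vector) pairs

    def add(self, text, vector):
        self._items.append((text, vector))

    def search(self, query_vector, top_k=1):
        """Return the top_k stored texts closest in meaning to the query."""
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(query_vector, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:top_k]]

store = TinyVectorStore()
# Vectors are invented; a real system would compute them with an embedding model.
store.add("Either party may terminate for convenience on 30 days' notice.", [0.9, 0.1])
store.add("This Agreement is governed by the laws of Delaware.", [0.1, 0.9])

# Pretend this is the embedding of a question about walking away without cause.
query_vector = [0.8, 0.2]
results = store.search(query_vector)
print(results[0])
```

Note that the question and the matching clause need not share any words. The match comes from the vectors being close together.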

10. Attention

Attention is one of the key breakthroughs behind modern AI. It allows models to weigh surrounding words and understand what a term means in context.

Legal language is full of words with multiple meanings. “Consideration” can mean the bargained-for exchange in contract law, or it can simply mean something being taken into account. “Execution” can mean signing a document, carrying out an obligation, enforcing a judgment, or something else entirely.

Attention helps the model determine which meaning is intended based on surrounding words.

The famous 2017 paper “Attention Is All You Need” introduced the transformer architecture that became foundational to modern language models. The title is worth remembering because attention is central to how these systems handle context.

Practical example: In the phrase “adequate consideration,” the model should understand a contract-law context. In “out of consideration for the family,” it should understand a different meaning.
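
For readers curious about the mechanics, the Python sketch below implements scaled dot-product attention, the core calculation described in “Attention Is All You Need,” for a single query. The two-number word vectors are invented for illustration; in a real model they are learned during training.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # Blend the value vectors according to the attention weights.
    dims = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dims)]
    return weights, output

# Invented 2-number vectors for three words that might surround a term.
words = ["adequate", "consideration", "family"]
keys = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
values = keys
query = [2.0, 0.0]  # a query that "looks for" contract-law context

weights, output = attention(query, keys, values)
for word, weight in zip(words, weights):
    print(f"{word}: {weight:.2f}")
```

In this toy setup the largest weight goes to “adequate,” mirroring how surrounding words pull the model toward one meaning of a term.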

11. Transformer

A transformer is a type of AI model architecture that uses attention to process language.

People sometimes use “transformer” and “large language model” interchangeably, but they are not the same thing. The LLM is the overall model or product. The transformer is the underlying architecture, or engine design, that made many modern LLMs possible.

Transformers process language in layers. Early layers may help disambiguate words. Later layers may capture more complex relationships, such as structure, implication, contradiction, or reasoning patterns.

Practical example: A transformer-based model can analyze how a defined term is used across a long contract and connect that usage to related obligations elsewhere in the document.

12. Training

Training is the process by which a model learns patterns from data.

During training, a model is exposed to very large amounts of text. It learns relationships between words, concepts, styles, and structures. That is how it learns what an NDA usually looks like, how a legal memo is structured, what a contract clause looks like, and how legal arguments are typically framed.

Training creates the model’s general language capability. Later steps, such as fine-tuning, retrieval, and workflow design, can make that capability more useful for specific professional tasks.

Practical example: A model may learn common contract structures during training, then apply that general knowledge when summarizing or comparing agreements.

13. Self-Supervised Learning

Self-supervised learning is a training method where the model learns by predicting missing or next pieces of data.

In text, this can mean hiding the end of a sentence and asking the model to predict what comes next. For example, “Time is of the…” likely leads to “essence.” The correct answer is already present in the original text, so humans do not need to manually label every example.

This approach allowed AI models to learn from massive volumes of text. It is one reason LLMs can learn broad patterns from books, websites, articles, contracts, cases, regulations, filings, commentary, and other materials.

Practical example: By observing many legal documents, a model can learn that certain concepts often appear together, such as confidentiality obligations and exclusions, or indemnity provisions and liability caps.
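
The Python sketch below shows why no manual labeling is needed: every sentence can be turned into training examples automatically, because the "answer" is simply the next word. The function name is invented, and real training pipelines work on tokens rather than whole words.

```python
def make_training_pairs(text):
    """Turn raw text into (context, next_word) examples with no manual labels."""
    words = text.split()
    return [(" ".join(words[:i]), words[i]) for i in range(1, len(words))]

pairs = make_training_pairs("Time is of the essence")
for context, target in pairs:
    print(f"{context!r} -> {target!r}")
# The last pair is ('Time is of the', 'essence').
```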

14. Inference

Inference is the moment when a trained model is actually used to produce an answer.

Training is when the model learns. Inference is when the model responds.

For example, when a lawyer asks an AI system to summarize a contract, identify clauses, draft fallback language, or compare a provision against a playbook, the model is performing inference.

This distinction matters because many workflow decisions happen at inference time: what context is sent to the model, which documents are retrieved, what instructions are included, and whether the model is allowed to take actions or only suggest them.

Practical example: A legal team may use the same underlying model for different inference workflows, such as contract summarization, legal intake triage, policy Q&A, and matter updates.

15. Fine-Tuning

Fine-tuning means taking a general model and further training it for a specific domain, task, behavior, or style.

A model can be fine-tuned to classify documents, answer in a specific format, follow a certain drafting style, or perform a narrow task more reliably.

In legal work, fine-tuning might be useful for clause classification, legal intake routing, document extraction, or applying a consistent style to certain outputs.

Fine-tuning changes the model. Retrieval gives the model relevant context. In many legal workflows, the two approaches can be complementary.

Practical example: A model might be fine-tuned to classify incoming legal requests, while a retrieval workflow provides current playbooks and policies for answering specific questions.

16. Reinforcement Learning from Human Feedback

Reinforcement learning from human feedback, or RLHF, is a process where humans help shape model behavior.

A model may generate two possible answers, and a human selects the better one. These comparisons are typically used to train a separate reward model, which then steers further training: preferred kinds of answers are rewarded, and less helpful ones are penalized. Over time, the model becomes more likely to produce answers that humans prefer.

This is one reason modern AI tools often feel more helpful, polished, and conversational than earlier systems.

Practical example: Feedback from users can help models produce clearer summaries, more helpful explanations, and better-formatted responses for professional workflows.

17. Prompting

Prompting is the way a user instructs an AI model.

A weak prompt is vague: “Review this contract.”

A stronger prompt gives context, role, task, standards, format, and constraints:

“Review this NDA from the perspective of a B2B SaaS company receiving confidential information. Identify deviations from our standard positions. Separate legal risk from business risk. Use plain English. Include proposed fallback language.”

For legal professionals, prompting is best understood as clear delegation. The same way a senior lawyer would give context to a junior lawyer, a legal professional should give the model the background needed to perform the task well.

Good prompting usually includes the role, task, materials, risks to focus on, desired format, and any limits on what the model should do.

Practical example: A contract review prompt should specify whether the company is the customer or vendor, whether the clause is on company paper or third-party paper, and what output format is most useful.

18. Few-Shot Prompting

Few-shot prompting means giving the model examples before asking it to perform a task.

Instead of simply telling the model what to do, the user shows what good output looks like.

For example, a legal team might provide three examples of contract issue summaries:

- A low-priority issue that can be accepted.
- A medium-priority issue requiring business input.
- A high-priority issue requiring legal review.

The model can then follow the pattern.

This is especially useful for repeatable workflows, such as intake summaries, contract issue lists, privilege log descriptions, matter updates, outside counsel billing reviews, board updates, employment investigation summaries, or regulatory alerts.

Practical example: A legal operations team can improve AI-generated matter updates by giving examples of prior updates that were clear, concise, and useful to the business.
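
A few-shot prompt is ultimately just text with worked examples placed ahead of the new task. The Python sketch below assembles one; the example issues, the wording of the instruction, and the function name are all invented for illustration.

```python
# Invented examples: one low, one medium, and one high priority issue.
EXAMPLES = [
    ("Payment terms are net 45 instead of net 30.", "Low priority: can be accepted."),
    ("Customer requests a 99.99% uptime commitment.", "Medium priority: needs business input."),
    ("Vendor liability is uncapped for all claims.", "High priority: needs legal review."),
]

def build_few_shot_prompt(examples, new_issue):
    """Assemble a prompt that shows worked examples before the new task."""
    lines = ["Classify each contract issue by priority, following the examples.", ""]
    for issue, summary in examples:
        lines += [f"Issue: {issue}", f"Summary: {summary}", ""]
    # End with the new issue and an open "Summary:" for the model to complete.
    lines += [f"Issue: {new_issue}", "Summary:"]
    return "\n".join(lines)

prompt = build_few_shot_prompt(EXAMPLES, "Governing law is changed to another state.")
print(prompt)
```

Ending the prompt with an open “Summary:” invites the model to continue the pattern it has just been shown.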

19. Zero-Shot Prompting

Zero-shot prompting means asking a model to perform a task without providing examples.

This works best when the task is familiar, straightforward, or well-described in the prompt. The model relies on its general training and the instructions provided by the user.

Zero-shot prompting is often useful for quick first drafts, simple summaries, reformatting, or general explanations.

Practical example: A user might ask, “Summarize this clause in plain English,” without providing examples of prior summaries.

20. Prompt Chaining

Prompt chaining means breaking a larger task into multiple smaller prompts or steps.

Instead of asking the model to complete a complex workflow all at once, the user or system breaks the process into stages. Each stage produces an output that becomes input for the next stage.

This can improve quality and make review easier because each step has a defined purpose.

Practical example: A contract review workflow might first identify key clauses, then compare them to a playbook, then generate an issue list, then draft a business-facing summary.
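
The four-stage workflow in the example above can be sketched in Python as follows. The `fake_model` function is a placeholder that only echoes its input; in a real workflow it would be replaced by a call to whichever model the team uses. All names and prompt wording are invented.

```python
def fake_model(prompt):
    """Placeholder for a real model call; it just echoes a tagged response."""
    return f"[model output for: {prompt[:40]}]"

def run_chain(contract_text, playbook_text, model=fake_model):
    """Run four stages, feeding each stage's output into the next."""
    clauses = model(f"Identify the key clauses in this contract:\n{contract_text}")
    comparison = model(
        f"Compare these clauses to the playbook.\nClauses: {clauses}\nPlaybook: {playbook_text}"
    )
    issues = model(f"Turn this comparison into an issue list:\n{comparison}")
    summary = model(f"Write a business-facing summary of these issues:\n{issues}")
    # Keep every intermediate result so a reviewer can check each step.
    return {"clauses": clauses, "comparison": comparison, "issues": issues, "summary": summary}

result = run_chain("Sample contract text.", "Sample playbook text.")
print(list(result))  # ['clauses', 'comparison', 'issues', 'summary']
```

Returning the intermediate outputs, rather than only the final summary, is a design choice that makes each step reviewable on its own.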

21. Retrieval-Augmented Generation

Retrieval-augmented generation, or RAG, is one of the most important concepts in legal AI.

In a RAG workflow, the system retrieves relevant documents before asking the model to answer. The model then generates a response using those documents as context.

This matters because the most important information for many legal tasks is often in a legal team’s own materials:

- Contract templates
- Playbooks
- Clause libraries
- Outside counsel guidelines
- Billing rules
- Matter histories
- Employment policies
- Board materials
- Regulatory memos
- Knowledge bases
- Negotiated precedent

RAG helps AI output become more specific to the documents and standards supplied to the workflow.

Practical example: If a user asks, “Can we accept this indemnity clause?” a RAG system can retrieve the company’s indemnity playbook, fallback language, and similar prior agreements before generating an answer.
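
The retrieve-then-generate pattern can be sketched in a few lines of Python. This sketch makes two simplifying assumptions: retrieval is done by counting shared words (real systems more often use embeddings and vector search), and the final model call is left out, so the output is the prompt that would be sent. The document snippets and function names are invented.

```python
import re

# Invented snippets standing in for a legal team's internal documents.
DOCUMENTS = {
    "indemnity_playbook": "Indemnity: accept mutual indemnity for third-party IP claims only.",
    "billing_rules": "Outside counsel invoices must be submitted monthly.",
    "nda_template": "Confidential Information excludes publicly available information.",
}

def words(text):
    """Return the set of lowercase words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Rank documents by shared words with the question (a crude retriever)."""
    question_words = words(question)
    ranked = sorted(
        documents.items(),
        key=lambda item: len(question_words & words(item[1])),
        reverse=True,
    )
    return ranked[:top_k]

def build_rag_prompt(question, documents):
    """Put the retrieved text in front of the question for the model to use."""
    sources = retrieve(question, documents)
    context = "\n".join(f"[{name}] {text}" for name, text in sources)
    return f"Answer using only these sources:\n{context}\n\nQuestion: {question}"

prompt = build_rag_prompt("Can we accept this indemnity clause?", DOCUMENTS)
print(prompt)
```

Labeling each retrieved passage with its source name also makes it easier for the answer to point back to where its support came from.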

22. Grounding

Grounding means tying an AI response to specific source material.

A grounded answer does not just provide a conclusion. It also points to the document, section, clause, policy, case, or source that supports the answer.

For legal professionals, grounding is useful because it makes AI output easier to review, cite, and incorporate into existing workflows.

A general answer might say:

“The contract allows termination for convenience.”

A grounded answer might say:

“Section 12.3 allows either party to terminate for convenience on 30 days’ written notice.”

Practical example: A contract review tool can link each issue it identifies back to the specific clause being analyzed.

23. Hallucination

A hallucination is when an AI system provides an answer that appears plausible but is not supported by the underlying source material.

In legal work, this can include incorrect citations, misstated holdings, inaccurate summaries, or references to language that does not appear in the document being reviewed.

The practical response is to design workflows that make review straightforward: use grounding, citations, source links, document comparison, and appropriate human review.

Practical example: An AI-generated research summary is more useful when it includes citations and links to the underlying authorities so the user can quickly confirm the result.
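
One simple review aid is to check automatically whether language an AI tool attributes to a document actually appears in it. The Python sketch below does an exact-text check after normalizing case and spacing. It is a narrow safeguard under that assumption: it cannot catch a paraphrase that misstates the meaning. The contract text, quotes, and function names are invented.

```python
def normalize(text):
    """Lowercase and collapse whitespace so minor formatting differences do not matter."""
    return " ".join(text.lower().split())

def find_unsupported_quotes(quotes, source_text):
    """Return the quotes that do not appear verbatim in the source."""
    source = normalize(source_text)
    return [quote for quote in quotes if normalize(quote) not in source]

contract = """12.3 Termination for Convenience. Either party may terminate this
Agreement on 30 days' written notice."""

ai_quotes = [
    "Either party may terminate this Agreement on 30 days' written notice",
    "Customer may terminate immediately without notice",  # not in the contract
]
unsupported = find_unsupported_quotes(ai_quotes, contract)
print(unsupported)  # ['Customer may terminate immediately without notice']
```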

24. Chain of Thought

Chain of thought refers to step-by-step reasoning.

Legal professionals already use structured reasoning. A legal memo often follows issue, rule, application, conclusion. A contract review often follows clause, standard position, deviation, priority level, fallback, escalation.

The AI version is similar: instead of jumping directly to an answer, the model works through a problem in steps.

For legal work, the most useful output is often not every internal step, but a clear explanation of the conclusion, source, assumptions, priority level, recommended action, and any open questions.

Practical example: Instead of merely saying “high priority,” an AI contract review can explain which clause creates the issue, what standard is being applied, and what action is suggested.

25. Reasoning Models

Reasoning models are designed to spend more effort on complex problems.

They are useful when a task requires multi-step analysis, comparison, tradeoff evaluation, or planning. Rather than producing an immediate answer, a reasoning model may work through a problem more carefully before responding.

For legal work, reasoning models can be useful for tasks such as:

- Comparing multiple indemnity clauses
- Reviewing deposition excerpts for inconsistencies
- Evaluating a contract redline against a playbook
- Preparing a negotiation strategy
- Analyzing risks across multiple documents
- Structuring a legal memo from messy source material

Practical example: A reasoning model may help organize the issues in a complex contract negotiation before a lawyer reviews and finalizes the position.

26. Connectors

Connectors allow AI systems to access external tools, databases, and work systems.

For legal teams, these systems may include email, calendar, document management, CLM, e-billing, matter management, shared drives, Slack, Teams, CRM, HRIS, ticketing systems, legal research platforms, panel management systems, and rate databases.

Without connectors, users often copy and paste information into a chat window. With connectors, AI can work across the systems where legal work already happens.

Practical example: A connected AI assistant could prepare a meeting brief by reviewing the calendar invite, email thread, matter summary, recent invoices, and open tasks.

27. Model Context Protocol

Model Context Protocol, or MCP, is an emerging open standard, introduced by Anthropic in late 2024, for connecting AI models to tools and data sources.

A useful analogy is that MCP acts like a standardized port for AI context and tool access. It gives models a more consistent way to interact with external systems.

For legal workflows, this matters because much of the relevant context lives outside the chat window: in document systems, matter systems, contract systems, email, calendars, and databases.

Practical example: An MCP-enabled workflow could allow an AI assistant to retrieve the latest matter update, check a related document, and prepare a summary using current information from connected systems.

28. Tool Use and Function Calling

Tool use, sometimes called function calling, means an AI model can call a specific tool to complete part of a task.

This is different from simply generating text. A model might call a search tool, query a database, calculate a date, create a ticket, draft an email, or update a system of record.

Tool use is one reason AI systems are becoming more operational. The model can decide when a tool is needed, pass the right inputs, receive the result, and incorporate that result into the workflow.

Practical example: A legal intake assistant might classify a request, create a ticket, assign it to the right queue, and draft a confirmation message.
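
The Python sketch below shows the basic shape of tool use: the model emits a structured request, and ordinary code runs the matching function and returns the result. The JSON layout, the `add_days` tool, and the registry are hypothetical and do not reflect any particular vendor's API. The date tool counts calendar days only and does not apply court rules or business-day conventions.

```python
import json
from datetime import date, timedelta

def add_days(start, days):
    """Tool: return the ISO date that falls `days` calendar days after `start`."""
    return (date.fromisoformat(start) + timedelta(days=days)).isoformat()

# Registry of tools the model is allowed to call.
TOOLS = {"add_days": add_days}

def run_tool_call(tool_call_json):
    """Parse a model's tool request and run the matching function."""
    call = json.loads(tool_call_json)
    tool = TOOLS.get(call["name"])
    if tool is None:
        raise ValueError(f"Unknown tool: {call['name']}")
    return tool(**call["arguments"])

# Hypothetical model output asking for a notice deadline to be computed.
model_request = '{"name": "add_days", "arguments": {"start": "2025-01-15", "days": 30}}'
deadline = run_tool_call(model_request)
print(deadline)  # 2025-02-14
```

Because only registered tools can run, the registry doubles as a guardrail: a request for anything else is rejected rather than executed.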

29. Context Engineering

Context engineering is the practice of giving an AI model the right information, in the right structure, at the right time.

Prompt engineering focuses on the instruction. Context engineering focuses on the full package of information the model needs to do the job well.

For legal work, context may include:

- The document being reviewed
- The governing playbook
- Preferred positions
- Fallback language
- Counterparty type
- Deal size
- Business urgency
- Prior approvals
- Similar precedent
- Escalation rules
- Output format

Context engineering is often the difference between a generic answer and a useful workflow-specific answer.

Practical example: An NDA review workflow can provide the NDA, the company’s standard template, the NDA playbook, fallback language, whether the company is disclosing or receiving information, and escalation rules.

30. Agents

An agent is an AI system that can take a goal, break it into steps, use tools, and work through a process.

A chatbot answers a question. An agent works on a task.

For example, a chatbot might answer, “What are the key issues in this contract?”

An agent might:

- Find the contract
- Identify the relevant playbook
- Compare clauses
- Create an issue list
- Draft proposed redlines
- Prepare an email to the business owner
- Flag issues requiring review
- Update a tracker
- Schedule a follow-up

Agents are powerful because they move AI from answering questions to assisting with workflows.

Practical example: An agent may support daily legal intake triage, contract first-pass review, invoice guideline checks, regulatory monitoring, or recurring matter updates.

31. Multimodal Models

Multimodal models can work with more than text. They may process PDFs, images, screenshots, spreadsheets, audio, video, charts, and slide decks.

This matters because legal information often lives in many formats: scanned contracts, exhibits, invoice PDFs, product screenshots, deposition recordings, emails, spreadsheets, and presentations.

Practical example: A multimodal model may help summarize a scanned agreement, extract information from an invoice PDF, review a slide deck, or interpret a spreadsheet of legal spend data.

32. Small Language Models

Small language models are smaller, faster, cheaper models designed for narrower tasks.

Not every legal workflow requires the most powerful frontier model. Some workflows benefit from models that are optimized for speed, cost, privacy, or a specific task.

Small language models may be useful for classification, routing, extraction, formatting, or other repeatable workflows where the task is narrow and well-defined.

Practical example: Classifying incoming legal requests, extracting renewal dates, identifying governing law, or detecting missing signatures may be handled by smaller models.

33. Structured Output

Structured output means the AI returns information in a predictable format rather than free-form prose.

Structured output can take the form of a table, checklist, JSON object, issue list, intake form, risk matrix, clause summary, or set of fields that another system can use.

This is important because structured outputs are easier to review, compare, store, and route through workflows.

Practical example: A contract review system might return each issue with fields for clause name, section reference, priority level, summary, recommended action, and fallback language.
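
The fields in the example above can be expressed as a small schema that code checks before the output is stored or routed. The Python sketch below assumes the model returns JSON; the field names follow the example, while the class name, the allowed priority values, and the sample data are invented.

```python
import json
from dataclasses import dataclass

@dataclass
class ContractIssue:
    clause_name: str
    section_reference: str
    priority_level: str
    summary: str
    recommended_action: str
    fallback_language: str

ALLOWED_PRIORITIES = {"low", "medium", "high"}

def parse_issue(raw_json):
    """Convert model output into a ContractIssue, rejecting malformed results."""
    data = json.loads(raw_json)
    issue = ContractIssue(**data)  # raises TypeError on missing or extra fields
    if issue.priority_level not in ALLOWED_PRIORITIES:
        raise ValueError(f"Unexpected priority: {issue.priority_level}")
    return issue

# Invented sample of what a model might return.
raw = json.dumps({
    "clause_name": "Limitation of Liability",
    "section_reference": "Section 9.2",
    "priority_level": "high",
    "summary": "Liability is uncapped for all claims.",
    "recommended_action": "Escalate to legal review.",
    "fallback_language": "Liability is capped at fees paid in the prior 12 months.",
})
issue = parse_issue(raw)
print(issue.clause_name, issue.priority_level)
```

Rejecting output that does not fit the schema means downstream systems never have to guess what a missing or misspelled field was meant to be.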

34. Evaluation

Evaluation is the discipline of testing whether AI performs well on real tasks.

For legal teams, evaluation should be tied to specific workflows. A model that performs well on contract summarization may not perform equally well on legal research, billing review, intake triage, or policy Q&A.

Useful evaluation looks at accuracy, consistency, source support, review burden, speed, and whether the output saves time after human review.

Practical example: A legal team evaluating an AI contract review tool might test whether it identifies the right issues, links them to the right clauses, applies the right playbook positions, and produces a useful summary for the business.
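
One way to put numbers on "identifies the right issues" is to compare the tool's issue list against a list prepared by a reviewer. The Python sketch below computes two standard measures: precision (how much of what the tool flagged was correct) and recall (how much of what it should have flagged it actually found). The issue labels are invented, and the sketch assumes issues can be matched by exact label.

```python
def precision_recall(found, expected):
    """Compare issues the tool flagged against issues a reviewer expected."""
    found, expected = set(found), set(expected)
    correct = found & expected
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(expected) if expected else 0.0
    return precision, recall

expected_issues = ["uncapped liability", "auto-renewal", "governing law", "assignment"]
tool_issues = ["uncapped liability", "auto-renewal", "payment terms"]

precision, recall = precision_recall(tool_issues, expected_issues)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.67 recall=0.50
```

For many legal workflows a missed issue is costlier than a false alarm, so a team may reasonably weigh recall more heavily than precision.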

35. Human-in-the-Loop

Human-in-the-loop means a human remains involved in reviewing, approving, or supervising the AI’s work.

In legal workflows, the right level of human involvement depends on the task. Some outputs may need light review. Others may require detailed review by a lawyer or subject-matter expert.

The concept is not unique to AI. Legal work already has review layers: junior lawyers, senior lawyers, specialists, business approvers, and clients. AI can fit into similar review structures.

Practical example: A legal intake system may automatically classify requests, but a legal operations professional may review exceptions or high-priority matters before assignment.
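
The routing idea in the example above can be sketched as a simple rule in Python. This assumes the classifier reports a confidence score between 0 and 1; the threshold of 0.8, the category names, and the function name are invented, and the right threshold would depend on the task.

```python
def route_request(category, confidence, priority, threshold=0.8):
    """Send uncertain or high-priority classifications to a human reviewer."""
    if priority == "high" or confidence < threshold:
        return "human_review"
    return f"auto_assign:{category}"

print(route_request("nda", confidence=0.95, priority="low"))         # auto_assign:nda
print(route_request("nda", confidence=0.55, priority="low"))         # human_review
print(route_request("litigation", confidence=0.99, priority="high")) # human_review
```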

Final Takeaway

AI systems are powerful because they can process, generate, retrieve, and reason over language at scale. That makes them highly relevant to legal work.

For legal professionals, the most useful AI workflows usually combine strong models with the right context, source materials, workflow design, and human review.

The goal is not to replace legal judgment. The goal is to give legal professionals more leverage: better first drafts, faster review, stronger knowledge retrieval, more consistent workflows, and clearer decision support.

The legal teams that benefit most from AI will be the ones that understand their workflows, structure their knowledge, apply the right guardrails, and use AI to make legal work faster, more consistent, and more strategic.
