Vertical · 2026-05-27 · 8 min read

AI for law firms, what actually ships and what stays a pilot

An operator's view of AI for law firms in 2026. Where the wins are real, where the risk is concentrated, and how a small or mid-size firm should sequence its first six months.

Most "AI for law firms" pitches in 2026 are written by people who have never billed an hour. The promises rhyme with the dashboards they are trying to sell. The constraint, the part the studio actually has to design around, never makes it into the deck.

The constraint is this. A law firm is a regulated trust business that sells time, runs on documents, and survives on referrals. Any AI you put inside it has to respect privilege, has to behave well around discovery, and cannot make the firm sound dumber in front of a client than the partner does on the phone. That is a tighter design brief than the brochures suggest.

We have shipped enough vertical AI now (HIPAA-grade dental on Vertex AI Gemini 3 Pro, production dispatch on Howdy Dispatch, the stealth multi-tenant nonprofit platform) to recognize a pattern that maps to law firms cleanly. This post is the honest version of where AI actually helps a small or mid-size firm today, where it stays a pilot, and how we would sequence the first six months.

If you want the methodology that sits underneath this, the agent orchestration methodology is the longer read. This is the vertical translation.

The four real use cases for legal AI today

Across the firms we have looked at, the same four use cases keep showing up as actually worth shipping. The rest is theater.

1. Document review for diligence and discovery

This is the cleanest win. The work product is well-defined (find the relevant clauses, the changes from a prior version, the indemnity language, the privileged communications), the volume is high, and the reviewer is a junior associate costing the firm an effective $300 to $500 an hour on a task that has a well-bounded answer.

The right tool here is a focused review agent, not a generic chat interface. Anthropic Claude's prompt caching and citation features fit this well, because a diligence dataset is often re-queried thousands of times and the citations let the reviewing attorney trace every assertion back to the source paragraph. Google Vertex AI Gemini is a strong alternative when the dataset is multimodal (scanned exhibits, signature pages, hand-marked redlines). OpenAI's file search inside the Responses API is the easiest off-the-shelf path for firms that do not want a build.

The honest part: a junior associate still has to verify the output before the partner sees it. The win is not "AI does diligence." The win is "diligence that took two associates a week now takes one associate two days, with better completeness."
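To make "trace every assertion back to the source paragraph" concrete, here is a minimal sketch in plain Python. It assumes the review agent returns each finding with a document id, a paragraph index, and a verbatim quote; the Finding shape, the field names, and the sample contract text are all hypothetical and not any vendor's API. The check only confirms that the quote really appears where the agent says it does. It does not replace the associate's review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One assertion produced by a review agent (hypothetical shape)."""
    claim: str       # what the agent asserts
    doc_id: str      # which document the claim comes from
    paragraph: int   # index of the source paragraph within that document
    quote: str       # verbatim text the agent says supports the claim


def normalize(text: str) -> str:
    """Collapse whitespace and case so line wrapping does not break matching."""
    return " ".join(text.split()).lower()


def is_traceable(finding: Finding, corpus: dict[str, list[str]]) -> bool:
    """True only if the quoted text really appears in the cited paragraph."""
    paragraphs = corpus.get(finding.doc_id)
    if paragraphs is None or not 0 <= finding.paragraph < len(paragraphs):
        return False
    quote = normalize(finding.quote)
    return bool(quote) and quote in normalize(paragraphs[finding.paragraph])


def split_for_review(findings, corpus):
    """Separate findings with a matching source from ones needing a manual look."""
    traced = [f for f in findings if is_traceable(f, corpus)]
    untraced = [f for f in findings if not is_traceable(f, corpus)]
    return traced, untraced


# Made-up sample data for illustration only.
corpus = {
    "msa-v2": [
        "This Agreement is governed by the laws of Texas.",
        "Supplier's aggregate liability shall not exceed the fees paid\n"
        "in the prior twelve months.",
    ]
}
findings = [
    Finding("Liability is capped at 12 months of fees", "msa-v2", 1,
            "shall not exceed the fees paid in the prior twelve months"),
    Finding("Agreement includes a non-compete", "msa-v2", 0,
            "shall not compete for two years"),
]
traced, untraced = split_for_review(findings, corpus)
print(len(traced), "traced;", len(untraced), "flagged for manual review")
```

Findings in the untraced bucket go to the top of the associate's queue. A traced finding still needs a human read, because a quote can be real and the claim drawn from it can still be wrong.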

2. Intake triage

A solo or small firm loses a meaningful percentage of qualified leads in the first 48 hours after the initial inquiry. The lead form gets answered late, the conflict check is delayed, and the intake call happens after the prospect has already retained someone else.

A simple intake agent can do three things well: capture the structured facts from the inquiry, run a conflict pre-check against the firm's matter database, and draft a first-response email for the partner to review. Built right, this is two days of work with Anthropic's Claude Agent SDK or OpenAI's Agents SDK and a connection into the firm's case management system through MCP or a small custom integration.

This is also where most "AI for legal intake" vendors are oversold. The agent does not replace the lawyer's judgment on whether to take the matter. It removes the silence between inquiry and first response, which is where the lead is actually lost.
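The three steps of the intake agent can be sketched in a few lines. This is an illustration under stated assumptions, not a production design: the Inquiry and IntakeResult types are hypothetical, the matter database is stood in for by a plain set of party names, the conflict match is a naive exact-name comparison, and the drafted email is a fixed template where a real build would call a model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Inquiry:
    """Structured facts captured from the inquiry (hypothetical fields)."""
    prospect_name: str
    adverse_party: str
    matter_summary: str


@dataclass
class IntakeResult:
    conflict_hits: list[str]
    draft_email: Optional[str]          # None when the pre-check did not clear
    needs_partner_review: bool = True   # the lawyer still decides on the matter


def conflict_precheck(inquiry: Inquiry, known_parties: set[str]) -> list[str]:
    """Return names from the inquiry that already appear in the matter data."""
    known = {party.casefold() for party in known_parties}
    names = {inquiry.prospect_name, inquiry.adverse_party}
    return sorted(name for name in names if name.casefold() in known)


def triage(inquiry: Inquiry, known_parties: set[str]) -> IntakeResult:
    hits = conflict_precheck(inquiry, known_parties)
    if hits:
        # No draft is produced until a human has cleared the conflict.
        return IntakeResult(conflict_hits=hits, draft_email=None)
    # Placeholder template; a real build would generate this with a model.
    draft = (
        f"Dear {inquiry.prospect_name},\n\n"
        "Thank you for contacting us about your matter. "
        "A member of the firm will follow up with you shortly.\n"
    )
    return IntakeResult(conflict_hits=[], draft_email=draft)


# Made-up names for illustration only.
known_parties = {"Acme Holdings", "Jordan Reyes"}
clear = triage(
    Inquiry("Dana Whitfield", "Northside Towing", "Contract dispute"),
    known_parties,
)
blocked = triage(
    Inquiry("Sam Ortiz", "ACME Holdings", "Wrongful termination"),
    known_parties,
)
print(clear.draft_email is not None, blocked.conflict_hits)
```

Both results carry needs_partner_review set to true. The agent shortens the gap between inquiry and first response; it never sends anything on its own.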

3. First-draft legal research

A first-draft research memo on a question with established case law is now a credible AI task, provided the agent is forced to cite. Claude with citations turned on is the strongest tool here today, because the citation behavior is built into the model rather than bolted on. Grounding with Vertex AI Search and OpenAI's file search do similar work with different integration tradeoffs.

The trap is the unsupervised version. A research memo that hallucinates a case is worse than no memo at all, because it costs the firm trust with the partner reviewing it. The non-negotiable design rule for legal research AI is that every assertion is citable, every citation is verifiable, and the human reviewing the memo is forced to confirm the citations before the work product is used.
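One way to enforce that rule is to make "releasable" a property the memo has to earn. The sketch below is a hypothetical data model, not a feature of any of the tools named above: every assertion maps to its citations, every citation carries the initials of the person who confirmed it, and the memo stays blocked until both conditions hold. The case names and reporter cites are deliberate placeholders, not real authorities.

```python
from dataclasses import dataclass, field


@dataclass
class Citation:
    case_name: str
    reporter_cite: str
    confirmed_by: str = ""  # initials of the human who pulled and read the source


@dataclass
class ResearchMemo:
    question: str
    # Maps each assertion in the memo to the citations that support it.
    assertions: dict[str, list[Citation]] = field(default_factory=dict)

    def uncited(self) -> list[str]:
        """Assertions with no supporting citation at all."""
        return [text for text, cites in self.assertions.items() if not cites]

    def unconfirmed(self) -> list[Citation]:
        """Citations no human has confirmed yet."""
        return [
            cite
            for cites in self.assertions.values()
            for cite in cites
            if not cite.confirmed_by
        ]

    def releasable(self) -> bool:
        """Usable as work product only when fully cited and fully confirmed."""
        return bool(self.assertions) and not self.uncited() and not self.unconfirmed()


# Placeholder citations only; these are deliberately not real cases.
first = "Courts enforce such clauses when the scope is reasonable."
second = "The clause at issue is reasonable in scope."

memo = ResearchMemo("Is the non-solicitation clause enforceable?")
memo.assertions[first] = [Citation("Placeholder v. Example", "000 X.0d 000")]
memo.assertions[second] = []

before = memo.releasable()  # one uncited assertion, one unconfirmed citation

memo.assertions[second] = [
    Citation("Sample v. Placeholder", "000 X.0d 001", confirmed_by="JA"),
]
memo.assertions[first][0].confirmed_by = "JA"

after = memo.releasable()   # everything cited and confirmed
print(before, after)
```

The point of the design is that the confirmation lives in the data, so a memo cannot leave the research desk on the strength of a prompt that asked the model to be careful.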

This is one of the use cases where build-vs-buy actually splits cleanly. For solo firms, an off-the-shelf tool with a verified citation feature is the right call. For mid-size firms with a real research desk, a thin custom layer over Claude with the firm's house style baked in pays back inside the first quarter.

4. Client communications drafting

Status update emails, fee letters, intake responses, motion notifications. This is the lowest-risk and highest-volume category, and most firms are already doing it informally in ChatGPT, which is the worst possible answer for confidentiality.

The right pattern is a small drafting surface inside the firm's existing tools (Outlook, the matter system, Teams) that uses a model under the right contract (Anthropic with a zero-retention agreement, Vertex AI with a BAA-equivalent contract, OpenAI with an enterprise data processing addendum) and pulls in firm-approved templates as context.

The build is small. The right answer here is almost always a thin slice, not a platform.

The three things that quietly kill legal AI projects

Across the projects we have reviewed and the ones we have shipped, the same three failure modes show up.

Data hygiene that is worse than the firm admitted. Every firm believes its matter system is the source of truth. In practice, the real source of truth is a partner's inbox, a paralegal's hard drive, and a SharePoint folder nobody can find. Until the data the agent reads from is the actual data the firm operates on, the AI will be confidently wrong in a way that is hard to debug.

A platform purchase where a thin slice belongs. The vendor sells the firm a 12-month all-in platform deal that promises to do diligence and intake and research and drafting. Eighteen months later the firm has paid for everything and shipped nothing. The right pattern is one workflow at a time, in production, with a measurable before-and-after, before the next one starts. This is the Interview, Analyze, Execute discipline we run on every engagement, and legal is one of the verticals where skipping it costs the most.

A model choice that is downstream of a sales rep. The firm picks a model because a vendor pitched it, not because the use case demanded it. Diligence on multimodal scanned exhibits should not be running on a text-only model. Real-time intake triage should not be running on a slow batch model. The right answer is to pick the lab per use case, not to pick a lab for the firm. We are explicitly multi-provider for this reason.
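Picking the lab per use case can start as a plain requirements filter. The sketch below is illustrative only: the model profile names and latency figures are made up, not real products or benchmarks, and a real catalog would also carry contract terms, cost, and context limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    name: str                 # hypothetical profile name, not a real product
    multimodal: bool          # can it read scans and images
    typical_latency_s: float  # assumed figure for illustration


@dataclass(frozen=True)
class UseCase:
    name: str
    needs_multimodal: bool
    max_latency_s: float


def eligible_models(use_case: UseCase, catalog: list[ModelProfile]) -> list[str]:
    """Keep only models that meet the use case's hard requirements."""
    return [
        model.name
        for model in catalog
        if (model.multimodal or not use_case.needs_multimodal)
        and model.typical_latency_s <= use_case.max_latency_s
    ]


catalog = [
    ModelProfile("text-batch-model", multimodal=False, typical_latency_s=60.0),
    ModelProfile("text-fast-model", multimodal=False, typical_latency_s=2.0),
    ModelProfile("multimodal-model", multimodal=True, typical_latency_s=8.0),
]
use_cases = [
    UseCase("diligence on scanned exhibits", needs_multimodal=True, max_latency_s=120.0),
    UseCase("real-time intake triage", needs_multimodal=False, max_latency_s=5.0),
]

routing = {use_case.name: eligible_models(use_case, catalog) for use_case in use_cases}
for name, models in routing.items():
    print(f"{name}: {models}")
```

With these assumed profiles, the scanned-exhibit workflow and the intake workflow end up on different models, which is the whole argument for staying multi-provider.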

How we would sequence the first six months

For a firm of 10 to 60 attorneys that has decided this year is the year, here is the sequencing we would actually run.

Month 1, the interview. Two weeks with the managing partner, the COO, two partners from the highest-volume practice area, the head of intake, and the head of IT. Map the document flow, the matter system schema, the email infrastructure, the conflict process, and the existing tool stack. End the month with a one-page constraints document and a ranked list of the five highest-leverage workflows.

Month 2, the thin slice. Ship one workflow into production. Almost always intake triage or client communications drafting, because the surface area is small, the risk is contained, and the time-savings show up immediately. Single model choice, single integration surface, single team using it. The success criterion is hours back per week, not a feature list.

Months 3 to 4, the second slice with the harder lift. Usually document review for diligence, because by month three the firm has built enough internal confidence to handle a higher-stakes workflow. This is where build-vs-buy gets real, because diligence tools span an enormous price range and a custom thin layer can be cheaper inside three months than an off-the-shelf seat-priced product over the year.
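The build-vs-buy comparison is simple arithmetic once the firm has real quotes. A minimal sketch follows; every number in it is a made-up placeholder chosen only to show the calculation, not a price from any vendor or from our engagements.

```python
from typing import Optional


def seat_cost(seats: int, price_per_seat_month: float, months: int) -> float:
    """Cumulative cost of a seat-priced product after a number of months."""
    return seats * price_per_seat_month * months


def custom_cost(build_fee: float, run_cost_month: float, months: int) -> float:
    """Cumulative cost of a custom thin layer: one-time build plus running costs."""
    return build_fee + run_cost_month * months


def breakeven_month(
    seats: int,
    price_per_seat_month: float,
    build_fee: float,
    run_cost_month: float,
    horizon: int = 12,
) -> Optional[int]:
    """First month where the custom build is no more expensive, or None."""
    for month in range(1, horizon + 1):
        if custom_cost(build_fee, run_cost_month, month) <= seat_cost(
            seats, price_per_seat_month, month
        ):
            return month
    return None


# Hypothetical inputs: 30 seats at 400 per seat per month,
# versus a 30,000 build with 1,500 per month in running costs.
month = breakeven_month(
    seats=30, price_per_seat_month=400, build_fee=30_000, run_cost_month=1_500
)
year_seats = seat_cost(30, 400, 12)
year_custom = custom_cost(30_000, 1_500, 12)
print(month, year_seats, year_custom)
```

The same function returns None for a five-seat firm on a cheap product, which is the case where buying is the right call. The inputs matter more than the formula, so run it with the firm's actual seat count and quotes.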

Months 5 to 6, the platform decision. By month five the firm has two workflows in production and a real internal sense of which model behaviors fit which problems. Now is when the platform conversation happens, not in month one. The decision is informed by what shipped, not by the vendor deck.

This sequencing breaks down when a firm skips ahead to the platform decision. It works when the firm holds the line on shipping one workflow first.

Compliance and the non-negotiables

A few things are not negotiable for any AI work inside a law firm, regardless of size.

Every model the firm uses has to be under a contract that prohibits training on the firm's data and provides a zero-retention or short-retention mode. This is available from Anthropic, Google Vertex AI, and OpenAI under their enterprise contracts. It is not available on consumer ChatGPT.
Every assertion in a work product has to be traceable. This means citations on research memos, source links on diligence findings, and review queues on drafted communications.
The conflict check is the AI's hard stop, not its suggestion. An intake agent must not draft a first-response email until the conflict check has cleared.
Privileged content is segregated from any context that touches an external API. This is a design decision made at the data layer, not a prompt instruction.

These are not optional. A firm that cannot meet these four lines should not be putting AI inside its operations yet, no matter what its competitors are doing.
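The privilege rule is the one most often left to a prompt, so here is a minimal sketch of what "at the data layer" can mean. It assumes each document carries a privilege label set upstream by the firm; the Privilege enum, the Document shape, and the ExternalContext class are hypothetical names for illustration. Unreviewed documents are treated as privileged, and the context object refuses to be built from anything that has not been cleared.

```python
from dataclasses import dataclass
from enum import Enum


class Privilege(Enum):
    NONE = "none"
    PRIVILEGED = "privileged"
    UNREVIEWED = "unreviewed"  # treated as privileged until someone classifies it


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    privilege: Privilege


class ExternalContext:
    """The only object allowed to cross to an external API in this sketch."""

    def __init__(self, documents):
        blocked = [d.doc_id for d in documents if d.privilege is not Privilege.NONE]
        if blocked:
            raise PermissionError(f"privileged or unreviewed documents: {blocked}")
        self._documents = tuple(documents)

    def render(self) -> str:
        """Serialize the cleared documents into a single context string."""
        return "\n\n".join(f"[{d.doc_id}]\n{d.text}" for d in self._documents)


def build_external_context(store) -> ExternalContext:
    """Filter at the data layer, then build the context from what is left."""
    cleared = [d for d in store if d.privilege is Privilege.NONE]
    return ExternalContext(cleared)


# Made-up documents for illustration only.
store = [
    Document("lease-12", "Tenant shall pay rent monthly.", Privilege.NONE),
    Document("memo-7", "Advice to client on litigation risk.", Privilege.PRIVILEGED),
    Document("email-3", "Forwarded thread, not yet classified.", Privilege.UNREVIEWED),
]
context = build_external_context(store)
print(context.render())
```

Passing the unfiltered store straight to ExternalContext raises a PermissionError, so a caller that forgets the filter fails loudly instead of leaking quietly.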

What we will not do

We will not promise a specific percentage of associate hours saved. The honest number depends on the firm's existing workflows, and we will not say "30 percent" before we have measured the before.

We will not pitch the firm a build when an off-the-shelf tool is the right call. The intake category in particular has decent off-the-shelf options for firms under 20 attorneys.

We will not publicly name the legal tech vendors we think are weakest. We will tell you in the room. There is a real difference between a tool that demos well and one that survives a partner's first week of use.

If you are thinking about shipping AI inside your firm this year

The single highest-value thing a firm can do in the first conversation is bring a real workflow to the table. Not "we want AI." Not "what's possible." A specific workflow with a specific bottleneck and a specific person whose week we are trying to free up.

If that sounds like the conversation you want to have, start a project with us or read the method. We work with a small number of firms at a time, on purpose. The studio is set up to ship, not to pitch.
