What Claude Opus 4.8 Means for a Lake Forest Tax Practice
Anthropic's new flagship model is a modest upgrade with one change that matters to fiduciaries: it is far more willing to admit what it does not know.
Key Takeaways
- ✓ Anthropic released Claude Opus 4.8 on May 28, 2026, at the same price as the prior version: $5 per million tokens in, $25 per million out.
- ✓ The change that matters for a tax practice is not the benchmarks. It is honesty: the model is about four times less likely to let an error in its own work pass unflagged.
- ✓ It reads dense documents (returns, K-1s, IRS notices) more accurately and cheaply, and can hold an entire client file in memory at once.
- ✓ The practical use is operations, not advice: intake, client letters, and practice finances. It extends a small team. It does not replace judgment.
Lake Forest, Ill. May 31, 2026. On May 28, Anthropic released Claude Opus 4.8, its most capable model to date, three days before this was written and forty-one days after the version it replaces. Most model releases are noise for a law firm. This one is mostly an incremental upgrade, with a single exception that is genuinely relevant to anyone who signs returns for a living. The model is now markedly more willing to say it does not know.
That is the thread worth pulling for a tax practice. The rest of this is a short overview of what shipped, why that one change matters to a fiduciary, and three jobs the model can do on the operations side of a firm in Lake Forest or anywhere else.
What Changed
Opus 4.8 is Anthropic's top model, above the smaller and cheaper Claude models (Sonnet and Haiku). Firms already using Claude through its website, the desktop tool Claude Cowork, or a vendor integration are being moved onto it automatically over the coming weeks. Four things changed that bear on tax work.
First, honesty. The known failure of these tools, and the reason careful lawyers have kept their distance, is that they sound certain while being wrong. Anthropic's own testing shows Opus 4.8 is roughly four times less likely than the prior version to let a flaw in its work pass without flagging it. In practice it is more likely to write "I cannot confirm this figure, verify it" than to invent one.
Second, document reading. Tax is a document business, and the model is better at dense, messy source material: scanned returns, brokerage statements, a K-1 (the form reporting a partner's share of partnership income), and IRS notices. One enterprise tester reported processing PDFs at 61% lower cost than before, and another reported better "citation precision," meaning the model points to the right line rather than one nearby.
Third, memory. A one-million-token context window (a token is about three-quarters of a word) is enough to load a complete client file, prior returns, and correspondence at once, and reason across all of it without losing the start by the end.
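To get a feel for what "an entire client file" means in tokens, here is a rough sizing sketch. It uses only the two figures given above (a one-million-token window, and about three-quarters of a word per token). The word counts and the reserve set aside for the model's reply are hypothetical, and scanned pages may be counted differently from typed text, so treat the result as an estimate rather than a guarantee.

```python
# Rough check: does a client file fit in a one-million-token context window?
# Assumption from the article: one token is about three-quarters of a word.
WORDS_PER_TOKEN = 0.75
CONTEXT_WINDOW_TOKENS = 1_000_000


def estimate_tokens(word_count: int) -> int:
    """Convert a word count to an approximate token count."""
    return round(word_count / WORDS_PER_TOKEN)


def fits_in_window(word_counts: list[int], reserve_tokens: int = 50_000) -> bool:
    """True if all documents, plus a reserve for the reply, fit in the window.

    reserve_tokens is an illustrative allowance, not a documented requirement.
    """
    total = sum(estimate_tokens(w) for w in word_counts)
    return total + reserve_tokens <= CONTEXT_WINDOW_TOKENS


# Hypothetical client file: word counts per document, for illustration only.
client_file = [12_000, 45_000, 8_500, 150_000]
print(estimate_tokens(sum(client_file)), fits_in_window(client_file))
```

A file that fails this check is a candidate for splitting by tax year or by document type before it is handed to the model.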
Fourth, an effort control on every plan that lets a user set how hard the model works on a task: fast and cheap for reformatting, slow and thorough for analysis.
Pricing did not move: five dollars per million tokens in, twenty-five out. A more careful model arrived at last month's price. The primary sources, for anyone who wants to check, are Anthropic's announcement and the developer release notes.
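The arithmetic behind those prices is simple enough to sketch. The example below applies the quoted rates per million tokens ($5 in, $25 out) to one run. The token counts are hypothetical, and the figure ignores any subscription or vendor markup a firm might pay on top.

```python
# Prices quoted in the article, in dollars per million tokens.
INPUT_PRICE_PER_MILLION = 5.00
OUTPUT_PRICE_PER_MILLION = 25.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated dollar cost of one run at the quoted prices."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return round(input_cost + output_cost, 2)


# Hypothetical intake run: a 400,000-token file in, a 20,000-token summary out.
print(f"${estimate_cost(400_000, 20_000):.2f}")
```

Input dominates for document-heavy work: in this illustration the file itself accounts for $2.00 of the total and the written summary for the remaining $0.50.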
Why the Honesty Gain Matters for Tax
The companies that build legal and tax software tested this model before launch and said unusual things on the record. The maker of one legal-research tool said the model posted the highest score the company had recorded on its internal benchmark. The chief technology officer of Thomson Reuters, behind the CoCounsel assistant many firms already pay for, singled out "legal and tax professionals" by name.
"As we build fiduciary-grade AI systems for legal and tax professionals, advances like these help raise the standard for trusted AI performance in real-world workflows."
Joel Hron, Chief Technology Officer, Thomson Reuters (CoCounsel)
The blocker for tax was never that the model could not write. It was that it could not be trusted to say "I do not know." A tool that confidently supplies a basis it cannot support (basis being the figure subtracted from a sale price to compute taxable gain) is worse than no tool, because it hides risk behind the appearance of work. The honesty gain is aimed at exactly that. It is the first version where the right answer for a small firm is "yes, carefully," rather than "not yet."
The three uses below are all operations. None asks the model to give advice, take a position, or sign anything. Each takes a task that eats attorney or paralegal hours and hands the routine first pass to the model, with a person reviewing before anything leaves the firm.
1. Intake: Sorting the Document Pile
Turning a client's pile of files into an organized matter, with a missing-items list, before an hour is billed.
Every matter starts with a pile: prior returns, brokerage statements, a K-1 or three, an IRS notice, and a note to "let me know if you need anything else." Finding out what is missing means a person reading every page. With the document gains and the large memory, the model can take the whole pile at once and report what is there and what is not.
SAMPLE CLAUDE PROMPT
"Attached is everything a new client sent for their 2025 return: three prior 1040s, brokerage statements, two K-1s, and an IRS CP2000 notice. Build an intake summary listing every document, the tax year it covers, and key figures. Then give me a missing-items checklist of anything I would normally need that is not here. For any figure you are not confident you read correctly, flag it and name the page to verify. Do not estimate anything you cannot see."
The last two sentences are the point, and they work only because of the honesty change. The valuable output is not the tidy summary; it is the short list of figures the model could not read cleanly and the documents it expected and did not find. Catching a missing K-1 at intake, rather than in October, is the difference that prevents the deadline scramble.
2. Client Letters: Explaining Without Losing the Day
Plain-English client letters drafted in your voice, so you review and sign rather than write from scratch.
A surprising share of a tax attorney's day is explaining: what a notice means, why a refund shrank, the difference between an extension to file and an extension to pay. This writing is real work, hard to bill, and the basis on which clients judge the relationship. It is also what Opus 4.8 now does well, given examples of the firm's own voice.
SAMPLE CLAUDE PROMPT
"Here is a client's IRS CP2000 notice and my one-line note on how we plan to respond. Draft a short letter to the client in plain English: what the notice is, what it means for them, what we will do, and what we need from them. Calm and reassuring, no jargon. Match the voice in the three sample letters attached. Leave every dollar figure and deadline blank for me to fill in, since I verify those myself."
Two guardrails make this safe: feeding the model past letters so the draft sounds like the firm, and instructing it to leave numbers and dates blank, because the model should never be the system of record for a figure or a deadline. The attorney reviews, fills in verified figures, and signs. A letter that took twenty-five minutes from a blank page takes five to review. With Cowork's scheduled tasks, a solo practitioner could have weekly client status updates drafted automatically every Friday, ready to skim and send Monday.
3. Practice Finances: The Hire-or-Not Decision
Reading messy billing exports to show which matters make money and whether you can afford another person.
The question that keeps a small-firm owner up at night is whether to hire. The data needed to answer it sits trapped across a practice-management system, a billing export, and a QuickBooks file that do not talk to each other. Two terms decide the answer. Realization is the share of recorded time actually collected (log ten hours, bill eight, collect seven, and realization is 70%). Work in progress, or WIP, is time done but not yet billed. A firm can look busy and quietly starve.
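The realization example above can be written out as a short calculation. This is a minimal sketch of the definition as the paragraph states it (collected hours divided by logged hours). The `leakage` helper is an added illustration, not a term from any billing system, showing where the missing hours went.

```python
def realization_rate(hours_logged: float, hours_collected: float) -> float:
    """Share of recorded time that was actually collected, as a fraction."""
    if hours_logged <= 0:
        raise ValueError("hours_logged must be positive")
    return hours_collected / hours_logged


def leakage(hours_logged: float, hours_billed: float, hours_collected: float) -> dict[str, float]:
    """Hypothetical breakdown: hours written down before billing, and hours billed but unpaid."""
    return {
        "written_down": hours_logged - hours_billed,
        "uncollected": hours_billed - hours_collected,
    }


# The example from the paragraph above: log ten hours, bill eight, collect seven.
rate = realization_rate(hours_logged=10, hours_collected=7)
print(f"{rate:.0%}", leakage(10, 8, 7))
```

Splitting the gap this way separates two different problems: time the firm chose not to bill, and bills the client did not pay.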
SAMPLE CLAUDE PROMPT
"Attached are exports from my practice-management and billing systems for the last 18 months. Calculate realization rate by client and matter type. Show which matter types are profitable once hours are counted, and which lose money. List work in progress older than 90 days, and flag any client consistently slow to pay. If a number looks wrong or an export is missing data, tell me before calculating. Do not guess."
The output resembles what a fractional finance chief would charge thousands to build, run monthly for the cost of a sandwich. It replaces a feeling with figures: whether the trust and estate work carries the firm while one-off returns lose money, or the reverse. That is the read worth having before raising a fee, adding a person, or letting go of a chronic slow-pay client. It informs the decision; it does not make it, and it is not the books of record.
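Because this output is not the books of record, it helps to be able to spot-check the model's figures by hand. The sketch below shows one way to recompute realization by matter type and list work in progress older than 90 days from a handful of time entries. The entries, field names, and dates are all hypothetical and do not match any real export format, and leaving unbilled time out of the realization figure is a choice made here, not a rule from the article.

```python
from collections import defaultdict
from datetime import date

# Hypothetical time entries; field names are illustrative, not from a real export.
entries = [
    {"matter_type": "Trust and estate", "hours": 40.0, "collected": 36.0,
     "billed": True, "work_date": date(2026, 1, 10)},
    {"matter_type": "Trust and estate", "hours": 10.0, "collected": 9.0,
     "billed": True, "work_date": date(2026, 2, 3)},
    {"matter_type": "Individual return", "hours": 20.0, "collected": 11.0,
     "billed": True, "work_date": date(2026, 3, 15)},
    {"matter_type": "Individual return", "hours": 6.0, "collected": 0.0,
     "billed": False, "work_date": date(2026, 1, 20)},
    {"matter_type": "Trust and estate", "hours": 3.0, "collected": 0.0,
     "billed": False, "work_date": date(2026, 4, 25)},
]


def realization_by_type(rows: list[dict]) -> dict[str, float]:
    """Collected hours divided by logged hours, per matter type, billed entries only."""
    logged: dict[str, float] = defaultdict(float)
    collected: dict[str, float] = defaultdict(float)
    for row in rows:
        if not row["billed"]:
            continue  # unbilled time is work in progress, handled separately
        logged[row["matter_type"]] += row["hours"]
        collected[row["matter_type"]] += row["collected"]
    return {t: collected[t] / logged[t] for t in logged if logged[t] > 0}


def stale_wip(rows: list[dict], as_of: date, max_age_days: int = 90) -> list[dict]:
    """Unbilled entries whose work date is more than max_age_days before as_of."""
    return [
        row for row in rows
        if not row["billed"] and (as_of - row["work_date"]).days > max_age_days
    ]


rates = realization_by_type(entries)
old_wip = stale_wip(entries, as_of=date(2026, 5, 31))
print(rates)
print([(row["matter_type"], row["hours"]) for row in old_wip])
```

If the model's figures for a few matters disagree with a hand check like this one, that is a reason to inspect the export before trusting the rest of the analysis.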
Where to Start, and What It Will Not Do
The sensible first step is one pilot on a real matter the firm already understands, so the output can be checked: run the intake prompt on a single messy file and judge whether the missing-items list saves the first read. If it does, build a voice file of past letters for the second use, and run the finance pass once after the deadline rush. Keep the toolset small. A law firm is not a software project.
The limits are firm. Opus 4.8 does not give tax advice, decide a position, sign a return, or speak to the IRS, and it should never be allowed to. It is a clerk, not the attorney. It is not the system of record for deadlines or dollar figures; those live in the calendar, the tax software, and the attorney's own verification. The practices that pull ahead this year are not the ones replacing staff with software. They are the ones letting a small, careful team handle more clients by giving the routine first pass to the model and keeping every judgment human.
If you want to work out which of these fits your firm first, book a free 30-minute AI audit: in person in Lake Forest or on video, no obligation, with a one-page plan to show for it.
Frequently Asked Questions
Is it safe to use Claude Opus 4.8 for tax work?
For operations (organizing documents, drafting letters, summarizing files, first-pass financial analysis), yes, with a human reviewing every output. For giving advice, deciding a position, or filing, no. The honesty gains make the operations use safer than before, but they do not change the rule that a person verifies the work.
What is the single most important change for a law firm?
Honesty. Opus 4.8 is about four times less likely than the prior version to let an error in its own work pass unflagged. For a field where a confidently wrong answer is malpractice, a model that says "check this" beats one that always sounds certain.
How much does it cost to run?
Unchanged from the prior version: $5 per million tokens of input and $25 per million of output (a token is about three-quarters of a word). A single client matter typically costs a few dollars to run. The real value is the review time saved.
Will my client data be safe?
That depends on which version of Claude you use and how it is configured, and it should be settled before any client document goes into any AI tool. Enterprise and team plans offer stronger data-handling terms than a personal account. It is the first thing to nail down in a rollout for a fiduciary practice.
About the author
Written by
Michael Pavlovskyi
Founder, Bace Agency
Michael builds custom Claude and GPT workflows for insurance agencies, law firms, and PE firms on Chicago's North Shore. Speaker at Northwestern and Lake Forest College on practical AI adoption for professional services.