The Journal. - How a $1.5 Billion Settlement Could Alter the Course of AI
Episode Date: September 10, 2025
Get more information about our first-ever live show here! Limited tickets left. Artificial intelligence company Anthropic agreed to pay at least $1.5 billion to settle a copyright infringement lawsuit over the company's use of pirated books to train large-language models. WSJ's Melissa Korn unpacks the settlement and explores what the precedent could mean for the AI industry. Ryan Knutson hosts. Further Listening: Why Elon Musk's AI Chatbot Went Rogue The Company Behind ChatGPT Sign up for WSJ's free What's News newsletter. Learn more about your ad choices. Visit megaphone.fm/adchoices
Transcript
The boom in AI over the last few years has also led to a boom in lawsuits.
As artificial intelligence chatbots are growing more and more popular,
a number of authors are now suing the companies behind them.
Two major studios have sued an AI startup claiming it has, quote, blatantly copied famous movie characters.
Sarah Silverman is going into battle against the chatbots.
George R.R. Martin and more than a dozen other authors now suing.
The New York Times becoming the first major media company to sue over AI.
So we've got some movie studios suing.
We've got book publishers and authors.
We've got newspaper publishers, including two subsidiaries of News Corp, our parent company.
There's a lot of litigation right now over how AI companies are kind of
coming into the material that they're using to train their large language models and then
also what they're doing with that material once it is in their system.
Companies like OpenAI have argued that training AI models amounts to fair use.
Last year, a group of authors filed one of those lawsuits against the AI company Anthropic.
They alleged that Anthropic, the AI startup,
infringed their copyrights for a number of books because of the way Anthropic uploaded and
then used the material for training and other purposes.
Last week, there was a big development in the case.
Artificial Intelligence Company Anthropic has agreed to pay $1.5 billion to settle a class action
lawsuit.
It's big.
If it's approved, it will still help influence how AI companies
think about taking content, how publishers or other media companies might want to try to strike
deals with AI companies, and it starts to put a bit of a price tag on what that material is
worth to the AI company.
Welcome to The Journal, our show about money, business, and power.
I'm Ryan Knutson.
It's Wednesday, September 10th.
Coming up on the show, Anthropic's proposed settlement and what it means for the future of AI.
This episode is brought to you by Nespresso Professional.
Designed to create meaningful moments of connection at work,
Nespresso Professional delivers exceptional coffee every time.
With variety at your fingertips, you are sure to satisfy every taste.
Discover the right solution for your workplace today.
Unforgettable coffee, meaningful connections.
Nespresso Professional.
Visit nespresso.com to learn more.
All right.
Well, so first of all, can you introduce yourself?
My name is Melissa Korn, and I'm an editor on the tech and media team.
And are you a human or a robot?
I am a human.
Are you sure?
Fairly certain.
I can tell you look like a human.
It's natural intelligence, not artificial.
Okay, so now introduce us to Anthropic, which makes the Claude chatbot, whose intelligence is artificial.
Anthropic is a very fast-growing AI company.
I mean, its valuation has tripled in just the last six or seven months.
It is now valued at $183 billion.
It is one of the kind of white-hot AI companies.
There's quite a few of them these days.
But it is really one of the giants in this space.
And how does Anthropic train Claude?
So Claude, we're acting as if Claude is a guy sitting in the room with us, right?
So Anthropic and all the other AI companies feed tons and tons of content into these computers
to train them, to make them learn, to make Claude learn how to put a sentence together,
what's happened in history, how to do math, all of that.
It's sort of like the underlying principle, right, is just sort of the same way a
human learns.
So you just expose it to enough material and then it'll identify the patterns
and then it can start kind of, I guess, arguably, thinking for itself?
Right.
And it's not quite thinking for itself yet, but it is learning how to assess information
that it's given. So some of these computer models, unlike you and me, it ingests everything
really, really fast, right? It takes me a few seconds to read a sentence. Claude can read an entire
book extremely quickly, so it can ingest millions of books very quickly and then get to training
and get to learning. Are you saying you haven't read millions of books? I mean, I have a pretty
good library at home, but I don't think it's quite millions now. And I certainly haven't done it in a
matter of, you know, minutes or hours.
Oh, God, that would be so cool, wouldn't it?
Just to be able, like, to read every book that ever came out?
I don't know.
I kind of enjoy the process.
I relish the experience.
I don't want it to go too fast.
Okay, human.
Last year, three humans,
the authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, filed a lawsuit
against Anthropic, arguing the company improperly used their copyrighted works to train
its AI models.
They said that they think this should be a class action lawsuit because they are representing a number of other authors.
So they filed the suit last summer.
Things started to move through the court system.
Obviously, that's a slow, laborious process.
It's not millions-of-books-per-second speed.
It's a judicial process speed.
Correct, correct.
So according to the lawsuit, how did Anthropic allegedly acquire the books that it used to train its models?
So there were a few different ways that Anthropic got the books. The lawsuit says that Anthropic stole a lot. So they used these kind of shadow libraries, places like LibGen, and there's a few others. Almost like, if you remember, Napster back in our teen days. LimeWire.
LimeWire, exactly, right? Like, people had just uploaded these things and they downloaded them. That simple. They also bought books
and had some other access to books that are, you know, out of copyright.
So they got the books through a few different means,
and in some cases got the same book a few different ways.
But they did not go out and purchase a copy of every single book
that they used to train Claude.
And that's according to the lawsuit.
So in the court case, what did Anthropic argue,
and what did the authors argue?
So Anthropic asked for this summary judgment
on the fair use argument.
Meaning they wanted the case dismissed.
Exactly.
And just help me understand
what is the fair use argument
they're making here?
If you wrote a book
and I copied your book
word for word
and published it as my own,
that would be problematic.
If you wrote a book
and I wrote a book report
about it, an analysis of it,
or I used it to inspire
another book
that maybe has some similar characters
but otherwise quite different.
Or I wrote a song
about your book.
That's fair use because I'm not just copying what you already did and have protected under the law.
You're reading my book and then you are creating some new work out of it as transforming it into something else.
Exactly. So that word transformative is important here. That's generally the bar. That it needs to be transformative, or it needs to have been accessed in a fair way where there is, you know, compensation for the artist.
And so Anthropic is arguing, yeah, we're reading the book, but we're also reading millions of books.
It's just making our AI really smart.
We're not just, like, stealing the book
and then distributing it to other people to read.
Exactly.
The authors, meanwhile, argue that Anthropic stole their books
and that they deserve to be compensated.
And earlier this summer, the judge reached an important decision.
So in June, the judge issued summary judgment
for Anthropic on the fair use argument,
saying it's okay for AI companies
or for Anthropic specifically here to use this content in this way, it's fair use; that part of the argument, the authors are not going to win.
And that seems like kind of a big deal here because if a judge does issue a summary judgment, that is a court precedent where a judge is saying, yes, reading all these books, using it to train your model, that's an okay thing to do.
That is fair use.
Right, exactly.
So this was seen as a big win for Anthropic, for AI companies in general.
But came with a bit of an asterisk with the judge saying,
yes, the way you've used this stuff is fair use,
but the way you got some of it is not okay.
And that asterisk is a very expensive asterisk.
That's next.
Wendy's most important deal of the day has a fresh lineup.
Pick any two breakfast items for $4.
New four-piece French toast sticks, bacon or sausage wrap, biscuit or English muffin sandwiches, small hot coffee, and more.
Limited time only at participating Wendy's. Taxes extra.
The judge's ruling came with some good news for Anthropic.
He said the company's fair use claim was valid.
It could use copyrighted books to train its AI models.
But the judge also gave Anthropic
some bad news.
What he did say was
Anthropic had no entitlement
to use pirated copies for its library.
So how they got the books
became the thing at issue.
In other words, it's okay to use the books,
it's not okay to steal the books.
Exactly.
Violating copyright law can carry a penalty
of anywhere between $750
and $150,000 per book.
And how many books are we talking here?
The initial number was that there were 7 million works that were used by Anthropic.
The numbers, the dollar signs could start to add up very, very quickly if this class action lawsuit went forward and a jury found in favor of this class of authors against Anthropic.
It could be existential.
This brings us to the proposed settlement that was announced last week.
So the headline number of the proposed settlement is
$1.5 billion, which is how much Anthropic would pay to the class to compensate them for the
stolen materials, for the pirated materials. As part of the settlement, the number of impacted
books got whittled down from over 7 million to around 500,000. And you get to that 500,000 from
that initial 7 million because some works were ultimately purchased. There's some duplication, a lot of
duplication in the initial list of what was downloaded. Some books were not actually registered
with the Copyright Office, the way one would assume they are right away. So we get to about roughly
500,000. We do not know what books, what works are included in that list yet. We do not know
exactly how the payouts will come, right? Authors, publishers, who gets what portion of what money.
How much money do we think will go to each author? So it's roughly $3,000 per work.
But again, we don't know if that's to the author, a portion to the author, a portion to a publisher, how that gets worked out.
And one and a half billion dollars, is that a lot of money to Anthropic?
Anthropic's valuation is $183 billion.
So it's, you know, not going to bankrupt the company, which is, you know, part of why they wanted to settle.
It's not nothing, but it is not going to debilitate them.
While the authors and Anthropic have agreed to the terms of the settlement, there's still one more hurdle.
The judge has to approve it, and he's expressed some skepticism.
So over the weekend, the judge issued an order saying, we're going to have a hearing to discuss the proposed settlement, but it was very, like, I'm not angry, I'm disappointed vibes from the letter.
I'm not mad at you.
I'm mad at your behavior.
Pretty much.
And it used the word disappointed, right?
The judge said, like, there's still a lot of unanswered questions here.
And I am not totally convinced that you have enough of it worked out yet for me to approve the settlement.
I see.
So not necessarily it should be more money or less money, just more like there's not really enough detail here yet.
Right.
So he says, we don't know what the list of works is yet.
We don't know what the list of authors is yet.
How do you make sure everyone knows that they're eligible
for a potential payout?
How are you going to do the payouts?
So they held a hearing earlier this week in San Francisco in court,
and the judge was pretty harsh in laying out,
especially to the plaintiffs' lawyers,
I don't think you have it all there yet.
It's not fully baked.
And he also said that he was concerned that there would be some more lawsuits,
like people coming out of the woodwork,
after this, right, saying, wait, I wasn't included. I should have been included. I wasn't
notified fairly. I should have been notified. The goal of a settlement is to end litigation,
not launch a whole bunch of new stuff. He set a schedule of deadlines over the next few weeks.
You know, you have to have a list of the works that are involved by X date. You have to have
the class list by X date. They're going to convene again in a few weeks in early October,
and he will presumably then approve or reject the proposed settlement.
So what might this mean for all those other cases that are playing out in court right now
between creators and AI companies?
It will have an impact.
Again, it's not legal precedent,
so it won't immediately dismiss everyone else's cases or anything like that.
But the other cases are still moving forward in a lot of instances,
including there was a new case filed last week, right?
The day that Anthropic and the authors announced their proposed settlement,
there was another lawsuit filed by some authors against Apple
about how they ingested books for their training models for AI.
So there's still a lot of litigation action coming down the pike
and not just in books, right?
We're talking movies, we're talking songs,
we're talking all sorts of creative content.
It feels like as a society and as a legal system,
we're feeling our way through the dark to try to figure out and find an answer to this really big question.
Yes, and I think we're starting to get a few answers, right, that the AI companies can use this content.
It is fair use, but there's still a lot of unknowns about what a compensation model should look like or could look like and how that's going to differ by medium, how that's going to differ by AI company.
And so you're seeing this in litigation, you're also seeing this in discussions between AI companies and media companies about what can they do on the front end so that they don't end up in litigation on the back end.
Has anybody asked Claude, Anthropic's AI machine, what it thinks the answer should be for how authors and creators should be compensated?
I have not, but I have a feeling I know what you're going to be doing later this morning.
Certainly.
Hey, Claude, uh, mind if I ask you a question?
Of course, go ahead and ask me anything.
How much do you think authors should be compensated for having their work used to train AI models?
That's a really complex question with valid arguments on multiple sides.
Authors deserve fair compensation when their creative work contributes value to AI systems.
But the mechanics of how that would work are challenging to figure out.
What's your take on it?
I'm asking you.
I think authors should receive some form of compensation,
especially when their work directly contributes to commercial AI systems.
The challenge is creating a fair system that recognizes their contribution
without making AI development prohibitively expensive.
Okay. Thanks, Claude.
You're welcome.
Before we go, I want to make sure you heard that the journal is hosting
our first ever live show in New York City next month.
We'll be at the Gramercy Theater on Tuesday, October 7th.
There are literally only a handful of tickets left.
We are almost sold out.
So if you want to be there, head to bit.ly slash the journal Live 25 for tickets and more information.
You can find the link in our show notes.
We'd love to see you there.
That's all for today.
Wednesday, September 10th.
A quick note: the Wall Street Journal's parent company, News Corp, has a content
deal with OpenAI.
Two of News Corp's subsidiaries
have also sued Perplexity.
The Journal is a co-production of Spotify
and the Wall Street Journal.
Additional reporting in this episode
by Jeffrey Trachtenberg
and Angel Au-Yeung.
Thanks for listening.
See you tomorrow.
