The AI Daily Brief: Artificial Intelligence News and Analysis - New York Times Sues OpenAI In What Could Be A Landmark Lawsuit
Episode Date: December 28, 2023
The New York Times has sued OpenAI and Microsoft. While it's not the first copyright suit of its type, it's certainly the most significant so far. The case -- which will almost certainly end up at the... Supreme Court -- could have significant ramifications for how LLMs and generative AI develop. Today's Sponsors: Notion - Notion AI. Knowledge, answers, ideas. One click away. - https://notion.com/aibreakdown Listen to the chart-topping podcast 'web3 with a16z crypto' wherever you get your podcasts or here: https://link.chtbl.com/xz5kFVEK?sid=AIBreakdown ABOUT THE AI BREAKDOWN The AI Breakdown helps you understand the most important news and discussions in AI. Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown Join the community: bit.ly/aibreakdown Learn more: http://breakdown.network/
Transcript
Today on the AI breakdown, we're looking at a new lawsuit from the New York Times against OpenAI and Apple's simultaneous courting of publishers to pay them for data to train their AI models.
Before that on the brief, OpenAI is raising at a $100 billion valuation.
The AI breakdown is a daily podcast and video about the most important news and discussions in AI.
Go to Breakdown.network for more information about our YouTube, our newsletter, and our Discord.
Welcome back to the AI breakdown brief.
All the AI headline news you need in around five minutes.
Given that this is the week between Christmas and New Year's,
it's about the lowest information density that we've had in AI since, I don't know,
before ChatGPT first launched.
Still, there are some interesting nuggets for those of you who are still paying attention
this week.
And we kick off with reports that OpenAI is in talks to raise another round of funding
at a valuation of more than $100 billion,
which would make it the second most valuable U.S. startup
behind only SpaceX.
According to Bloomberg, OpenAI is in early discussions
to raise a fresh round of funding at a valuation at or above $100 billion.
Investors potentially involved in the fundraising round
have been included in preliminary discussions,
but details like the terms, valuation, and timing
haven't yet been finalized and could still change.
Now, this is apparently separate from the tender offer
being led by Thrive Capital that we've been talking about
for over a month now. That tender offer is happening at an $86 billion valuation,
or at least was before the whole Sam Altman fired-by-the-board thing, and seemed like a way
specifically to give early employees access to liquidity. According to Bloomberg reporting,
that tender offer saw more demand than they could accommodate, which has led to this new
discussion at an even higher valuation. I think right now that, realistically speaking, the way
that investors are looking at the frontier model companies is that the companies that have a
plausible chance to actually be one of the leaders in this space are basically worth
infinity dollars. The calculation seems to be that if you can get into these companies
at basically any price, you should do so. Now, on top of this, there is also reporting that OpenAI
has been holding discussions to raise between $8 and $10 billion from Abu Dhabi-based G42.
If you've been listening closely to the AI breakdown, you'll know that G42 has been
somewhat controversial this year. More than just about any other company, they've been trying to have
their cake and eat it too when it comes to their relationships with U.S.-based companies and China-based
companies, but recently had to make the decision to focus entirely on the U.S. and cut ties with
their Chinese partners based on pressure from the U.S. government. Now, it's apparently unclear to
Bloomberg's sources whether this $8 to $10 billion fundraise from G42 is related to that $100 billion
valuation round or is something entirely separate. What is known
is that the G42 partnership has been exploring a dedicated AI chip project,
although it's not clear how much of that is within OpenAI versus outside of it.
One of the speculations around why Sam Altman was fired is the idea that he was trading
on his brand associated with OpenAI in order to raise money for a completely separate venture
that OpenAI didn't really have a stake in.
The codename for that project, related to OpenAI or not, has been reported to be Tigris.
Speaking of frontier model companies,
The Information is reporting that Anthropic's revenue projections have gone up
significantly. As recently as, I feel like, a week or two ago, Anthropic had been suggesting to
investors that they expected to make somewhere on the order of $500 million next year, or at least
to achieve a $500 million annualized run rate by the end of the year. Now it appears that
projection is up to $850 million. Writes The Information, it isn't clear why the latest projections
are materially higher. The projection underscores the three-year-old startup's remarkable growth
expectations and may offer more evidence that generative artificial intelligence is gaining steam among
enterprises. Now, you'll remember that it was also recently reported that Anthropic is raising a fresh
$750 million led by Menlo Ventures at at least a $15 billion pre-investment valuation. Not at that
heady $100 billion level yet like OpenAI, but certainly nothing to scoff at either. Moving over into
product releases for a moment, one of the most hyped platforms of the last couple months has been
Pika Labs. They've been slowly letting people
into their 1.0 video generation platform, many of the outputs of which have been making their way
to Twitter slash X, and as a special Christmas gift, they've now opened up their platform generally.
Now, one 2024 speculation among many is that in the same way that 2023 was the year that AI
image generation came to the mainstream, perhaps the same will be true for video generation
when it comes to 2024. Certainly Runway has been moving extremely quickly in this space.
Stability AI and Google also released new models recently.
But Pika is definitely a contender here.
Writes VentureBeat, Pika says that the model can produce a wide range of content,
including 3D animations, live action clips, and cinematic videos,
as well as modify moving objects like a horse or an outfit with simple text prompts.
Now, this is still much less precise than image generation,
but the technology is advancing fast.
Speaking of new releases, some eagle-eyed commentators have noticed
that Microsoft has launched Copilot for Android devices.
This is a free way to use GPT-4 and DALL-E 3
on your phone. And one interesting thing that I've been noticing more broadly is that Microsoft
is really liking these ways of experimenting with integrating its products, and by its products,
I also mean OpenAI's products, into mobile experiences in ways that don't necessarily require
subscriptions. I'm not sure exactly what the goal is or the business strategy. Perhaps the logic
is that if people start using these things on their phones, they'll port them over to the main
business use case, which is where Microsoft makes its money, but ultimately, I don't know.
Lastly today is some interesting new data on the most visited AI tools between September
2022 and August 2023.
Now, right at the top, as you might imagine, is ChatGPT with over 14.6 billion visits,
but not as far behind it as you'd think is Character.AI with 3.8 billion visits.
Bard lags behind that at 241.6 million visits.
Still, the survey does note that Bard users spend an average of 10 minutes per session,
and that 67% of them are on mobile.
But I think the one that's surprising most people is Character.AI.
Certainly it surprised me when I started to see statistics like this, but when you dig in, the numbers are even more shocking.
Many of Character.AI's users are spending two hours a day on the site, raising serious questions for some around the implications for the future of social interactions.
What does it mean when people are interacting with AIs for such a big portion of their day?
Will it make their interactions with regular humans better?
Or will it simply be a substitute?
Whatever the case, it is worth keeping an eye on because like I said, I think a lot of you are going
to be surprised by these numbers. However, that will do it for today's AI breakdown brief.
Next up, the main AI breakdown.
Quickly a brief word from today's sponsor.
As a listener of this show, I suspect you like to stay up to date on all things AI and tech,
which is why you have to check out the chart-topping podcast Web3 with a16z crypto.
Produced by venture firm Andreessen Horowitz, Web3 with a16z crypto is the perfect companion podcast to the
AI breakdown. Web3 with a16z crypto is your definitive resource for the future of the
internet, whether you're interested in the convergence of AI and crypto or simply curious about what's
next. If you need a place to start, they recently released an excellent episode with Stanford
cryptography professor Dan Boneh and former Google X engineer Ali Yahya in conversation with
host Sonal Chokshi about the intersection of AI and crypto. From fighting deepfakes and proving
humanity to large language models like ChatGPT, they cover it all. I highly recommend checking
it out, especially if you'd like to learn more about how AI and crypto will impact our everyday
lives. Beyond crypto and AI, the show is for creators seeking more ways to truly own their work,
for business leaders trying to prepare for the future today, and for innovators exploring
trending tech topics. Don't miss out. Follow Web3 with a16z crypto on Apple Podcasts, Spotify,
or your favorite listening app. And now a quick word from today's sponsor. I am a huge
Notion user. We're talking multiple accounts for multiple projects. I use it for everything from
applicant tracking to note taking to project management, to sharing public documents, to frantically
capturing ideas I have while out hiking or just driving around. Given that and given the topic of the
AI breakdown, I was excited to learn that they've launched a new AI tool called Q&A. It's like a personal
assistant that responds in seconds with exactly what you need. Notion AI can give you instant
answers to your questions using information from across your wiki, projects, docs and
For someone like me who makes dozens of notes per day around a huge array of topics,
having a built-in AI tool to help recall that is incredibly useful.
Now beyond that use case, think about this.
Have an urgent question you normally turn to a coworker to answer?
Just ask Q&A instead.
It'll search through thousands of documents in seconds and answer your question in clear language
no matter how large or complex your workspace is.
Plus, you can trust your data is secure because Notion AI is designed to protect your
information.
No AI models are trained with your information, the data is encrypted, and answers will
never use information from pages you don't have access to. With Notion AI, it's even easier to do
your most meaningful work. Try Notion AI for free when you go to notion.com slash AI breakdown.
That's all lowercase letters, notion.com slash AI breakdown to try the powerful, easy to use
Notion AI today. And when you use our link, you're supporting the show. One more time, that's
notion.com slash AI breakdown. Welcome back to the AI breakdown. Today for our main episode,
we have a really interesting set of almost contrasting stories.
And the first of them has to do with yet another lawsuit against OpenAI around copyright and
AI training.
Now, there have been many of this category of lawsuit flying around throughout 2023, but this
one is a little bit bigger because of who's involved.
The New York Times is suing OpenAI and Microsoft, claiming that millions of articles from
the Times were used to train their AI models,
in violation of copyright protections.
Now, from the Times itself, quote,
The Times is the first major American media organization
to sue the companies, the creators of ChatGPT
and other popular AI platforms over copyright issues
associated with its written works.
The suit does not include an exact monetary demand,
but it says the defendants should be held responsible
for billions of dollars in statutory and actual damages
related to the unlawful copying and use of the Times' uniquely valuable works.
It also calls for the companies to destroy any chatbot
models and training data that use copyrighted material from the Times. Says the complaint,
defendants seek to free ride on the Times' massive investment in its journalism by using the Times'
content without payment to create products that substitute for the Times and steal audiences away from it.
Now, obviously, this is going to be a protracted legal battle. I fail to see how it could be resolved
without going all the way to the Supreme Court. I don't think, however, it's a slam dunk.
You heard, in the quote that I just shared, the crux of the Times' argument. It's not
just that OpenAI has used the Times' corpus of work to train its models, but that the product
that it creates with them is a substitute for the Times and steals audiences away from it.
That feels like it's going to be a very hard thing to prove. The main use case of something like
ChatGPT is not to get access to news information, at least not in a way that substitutes
for a daily publication like the Times. Or at least I assume that that's part of what OpenAI's
argument is going to be. It just seems to me like an incredibly tricky thing to prove, but in some
ways it feels like the battle and gumming up the works is part of the strategy. Now, what I mean by
that is that in some ways it feels like a negotiation tactic. And giving more credence to that, we go again
to the New York Times' own reporting of its own lawsuit. Quote, the lawsuit filed on Wednesday
apparently follows an impasse in negotiations involving the Times, Microsoft, and OpenAI. In its
complaint, the Times said it approached Microsoft and OpenAI in April to raise concerns about
the use of its intellectual property and explore, quote, an amicable resolution, possibly
involving a commercial agreement and technological guardrails around generative AI products,
but that the talks reached no resolution. Now, keeping on this theme of negotiation,
another story that is popping up right now is that Apple is exploring AI deals with these big news
publishers. From the Times again, Apple explores AI deals with news publishers. The company has
discussed multi-year deals worth at least $50 million to train its generative AI systems.
Now, it doesn't appear that the New York Times was one of these publications, but the news
organizations that Apple has approached include Condé Nast, NBC News, and IAC, which owns publications
like People, The Daily Beast, and Better Homes and Gardens. Now, of course, Apple is not the only
company that's gone out and tried to do deals with publishers in advance of this copyright issue.
Earlier this month, OpenAI struck a similar deal with European publishing giant Axel Springer
to be able to explicitly use that company's news content in ChatGPT in a way that would actually
reference where that information was coming from. From CNN, as part of the deal,
ChatGPT users will receive summaries of news stories from Axel Springer brands, including Politico,
Business Insider, Bild and Welt, with attribution and links to the original sources of reporting.
No financial terms were disclosed when that deal was announced. Now, what I think many are
recognizing is that this may be a vector for Apple to compete in AI and take back
some of the lead that others like OpenAI have gotten on it. Kyle Russell writes,
Maybe the thing people are underestimating about Tim Cook-era Apple's ability to compete in
AI is how much cash they can drop on buying training data with quote-unquote clean rights.
Chamath Palihapitiya said something very similar.
The interesting thing about this NYT OpenAI lawsuit is the counterfactual.
If Apple is indeed writing substantial checks to media companies to license their content
for training models, the impact of this and other lawsuits against AI companies training
on non-public data will be swift and meaningful.
A very clever move by Apple if this lawsuit goes the way of the New York Times.
When Luke 16 asked,
Wouldn't this move give a huge edge to Google Bard and xAI,
given all the proprietary data they can train on without licensing?
In part, I would think so.
YouTube plus Gmail plus App Store, et cetera, et cetera, et cetera,
only way to compete would be buying access to data.
So the implications here are one, in the context of Apple itself,
using its huge war chest of cash to be able to out-compete someone like OpenAI
for access to data that is not going to get
them into trouble legally speaking, but two, if these copyright lawsuits go the way of the publishers,
one of the second order effects is probably that it's only big tech companies who already
own significant sources of data who will be able to compete in the LLM space. This is why these
lawsuits are so significant and have the potential to have such a big impact on how this
space plays out. Now, meanwhile, a couple more interesting little nuggets surrounding Apple and
AI. People started noticing over the last week that back in October, Apple
quietly released an open-source multimodal LLM called Ferret.
Writes VentureBeat,
With little fanfare, researchers from Apple and Columbia University
released an open-source multimodal LLM called Ferret in October 2023.
At the time, the release, which included the code and weights,
but for research use only, not a commercial license,
did not receive much attention.
But now that may be changing.
With open-source models from Mistral making recent headlines,
and Google's Gemini model coming to Pixel Pro and eventually to Android,
there has been increased chatter about the potential for local LLMs
to power small devices.
So that's the context in which people are situating this Apple news, because in addition to this
model, they also recently released two new research papers whose net impact would be to enable
more advanced experiences to run directly on devices like the iPhone and the iPad without having
to go to the cloud.
Many people have speculated that part of why Apple hasn't jumped into the generative AI race with
two feet is because they're waiting for things to get to the point where hardware advances
plus model size reductions have gotten to the point where they can do things in the on-device
privacy-preserving way that has become such a hallmark of all of their products.
At the same time, though, Apple is losing some amount of talent in this AI race.
According to Bloomberg, the head of iPhone design is leaving Apple to go work at that new AI
device startup led by former Apple designer Jony Ive and Sam Altman.
Writes Bloomberg, as part of the effort, outgoing Apple executive Tang Tan will join Ive's
design firm LoveFrom, which will shape the look and capabilities of the new products,
according to people familiar with the matter.
So I think that to the extent that you're keeping track of those 2024 predictions,
for anyone who had Apple getting more involved in the AI space on their list,
you're off to a good start.
That's going to do it for today's AI breakdown.
Until next time, peace.
