The AI Daily Brief: Artificial Intelligence News and Analysis - For the Hyperscalers, There's No Such Thing as "Spending Too Much on AI"
Episode Date: August 23, 2024
NLW discusses the commoditization of LLMs and reflects on a new essay from VC Sarah Tavel about the logic behind the Foundation Model companies' seemingly endless appetite to spend on the AI build-out. Read the piece: https://www.sarahtavel.com/p/the-big-stack-game-of-llm-poker
Concerned about being spied on? Tired of censored responses? AI Daily Brief listeners receive a 20% discount on Venice Pro. Visit https://venice.ai/nlw and enter the discount code NLWDAILYBRIEF.
Learn how to use AI with the world's biggest library of fun and useful tutorials: https://besuper.ai/ Use code 'podcast' for 50% off your first month.
The AI Daily Brief helps you understand the most important news and discussions in AI.
Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Subscribe to the newsletter: https://aidailybrief.beehiiv.com/
Join our Discord: https://bit.ly/aibreakdown
Transcript
Today on the AI Daily Brief, why the AI hyperscalers are spending billions and even
trillions of dollars building out AI and why their bet might make sense.
Before that in the headlines, and frankly quite related, are AI models getting totally
commoditized? The AI Daily Brief is a daily podcast and video about the most important news
and discussions in AI. To join the conversation, follow the Discord link in our show notes.
Welcome back to the AI Daily Brief Headlines edition, all the daily AI news you need in around
five minutes. We kick off today with a really interesting conversation
that I'm seeing emerging more and more on AI Twitter, which is about model commoditization.
Part of the interesting shift that this discussion represents is that, in the wake of ChatGPT
being launched and a million companies getting funding, there has been a pejorative sense in many
ways of startups that are simply quote-unquote ChatGPT wrappers. The idea here being that if you're
not building your own proprietary model, you have no moat. The interesting question, however,
is how much proprietary models actually do create a moat. Take, for example, this tweet from
Sully Omar, an AI entrepreneur. He writes,
"At the start of 2024, my startup was using 0% Google, 5% Anthropic, 95% OpenAI.
Now it's 35% Google and growing, 35% Anthropic and 30% OpenAI.
We just switched to the cheapest/best model. Maybe no one has a moat after all."
Certainly this is my experience as an individual user.
I am constantly jumping between whatever model I think is the most performant at any given time,
with absolutely nothing resembling any sort of brand loyalty.
Product experience does matter.
For example, I think that the latest version of GPT-4o has in general been more performant for me than
even Claude 3.5 Sonnet, but the interface of Claude, particularly with Artifacts, does make me
in many cases try to use that instead of ChatGPT.
In any case, this question of whether model builders can actually build a moat around what
they're creating, or whether there's just going to be constant shifting sands among people
who are willing to switch models, is a really interesting one.
Of course, enterprise lock-in and things like that could be X-factors,
but it's something that I'm watching closely.
In the meantime, new models keep coming out,
and it's clear that the competition isn't just at the state of the art
in the biggest models,
but is also about more performant, smaller models.
Microsoft has released three new Phi-3.5 models.
There is Phi-3.5-mini-instruct,
with 3.82 billion parameters,
Phi-3.5-MoE-instruct, which is 41.9 billion parameters,
and Phi-3.5-vision-instruct,
which is 4.15 billion parameters.
Now, these are small models that are putting up some really good numbers
on benchmark tests.
Developer Yam Peleg writes,
"How the hell is Phi-3.5 even possible?
Phi-3.5 Mini somehow beats Llama 3.1 8B.
Phi-3.5 MoE somehow beats Gemini Flash.
Phi-3.5 Vision somehow beats GPT-4o.
How?
Lol."
Now, people haven't had that much of a chance
to get their hands on these models yet,
and many cautioned against assuming too much
from self-published benchmarks.
Still, like I said, I think the more interesting thing here,
even outside where this leaves these Microsoft models
in the rankings, is what they say about the state of
competition. Two dimensions of this are interesting for Phi specifically. One is, this is yet another
sign that Microsoft is doing a heck of a lot of hedging when it comes to its approach to AI and is very
clearly not just resting on its OpenAI relationship. And two, once again, it really does suggest
how much of the competition is happening in these smaller models, not just to create the most
powerful large model. NVIDIA and Mistral have also released a new model called Mistral-NeMo-Minitron 8B.
This comes a month after the two companies teamed up to release Mistral NeMo 12B.
This new model was created using something called model pruning and distillation.
They describe pruning as the process of making a model smaller and leaner, either by dropping layers
or by dropping neurons, attention heads, and embedding channels. Model distillation is a technique
used to transfer knowledge from a large complex model, often called the teacher model,
to a smaller, simpler, student model. The goal is to create a more efficient model that retains
much of the predictive power of the original larger model while being faster and less resource
intensive to run. Now again, part of why these things are interesting and why it's relevant
that there is this competition around smaller, more performant models,
is that it suggests that we're moving strongly into a phase
of real practical utility and commercialization of generative AI.
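To make the distillation idea from the Minitron segment concrete, here is a minimal sketch of the classic soft-target formulation: the student is penalized for diverging from the teacher's temperature-softened output distribution. This is an illustration only, not a description of the recipe NVIDIA and Mistral actually used. The function names, the logit values, and the temperature are all assumptions made up for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the softened teacher distribution to the student's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical logits over three classes (illustrative values only).
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # disagrees with the teacher

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

The student that tracks the teacher closely gets the smaller loss, which is the signal a training loop would minimize. A real setup would typically combine this term with the ordinary training loss on labeled data.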
Companies are racing to build models that can operate on devices
and at a cost that works for average consumer use cases.
In other words, from a distribution of efforts and time standpoint,
a lot more emphasis is going into things that could actually show up in consumer products.
And of course, the competition for adoption remains fierce.
The Information today posted an article called
"Meta's Search for AI Clout Takes It to New Terrain." The story is basically all about how
Meta is having to develop a new skill, which is to get big businesses to buy into its software.
The article reads, Zuckerberg wants to turn Meta's LLM, Llama 3, into the industry standard for
AI. Initially, he has relied mostly on other tech companies to handle selling the software to customers
with mixed results so far. Specifically, they point to Amazon Web Services, which through its
Bedrock platform offers a variety of different LLMs to its enterprise customers. Right now, however,
AWS doesn't appear to be a huge channel for them.
According to insiders, Anthropic's Claude is the most popular model on the platform,
which could also reflect preferential treatment from AWS, which has a huge investment
in Anthropic.
When it comes to the Azure marketplace, The Information's sources say that salespeople at Microsoft
typically only pitch Llama to customers that have existing data expertise, rather than
to more general enterprise customers.
The rest of the article is all about the different ways that Meta is trying to resolve
the situation.
And again, for our purposes, it's not so much that there's a big, interesting piece of
news here, but more that this reflects where a lot of generative AI is going to be in the next year or so,
which is much more focused on actual business competition.
Speaking of models and commoditization, there are now so many great image generation models,
and perhaps not surprisingly, part of the battle is moving to user interface.
On that front, Midjourney has announced that their web experience is now open to everyone,
meaning you no longer have to go through Discord to use it, and in addition to that,
they're even turning on temporary free trials, which is something they haven't had for quite some time.
Ideogram, meanwhile, released Ideogram 2.0, which I have not yet had a chance to try, but which has a ton of people so far really impressed.
The AI for Success account on Twitter writes,
Ideogram 2 is by far the best model for handling text in AI images and can easily handle 15 to 20 words.
You can now make memes, posters, and even create YouTube thumbnails, and more in seconds.
Anyways, lots of goodies out there for the Image Generation folks to try.
And finally today, a cool story from ElevenLabs.
The company has announced an impact program with a vision to empower, quote,
1 million new voices to communicate, learn, and experience life without limits. This is basically a
non-profit partner program that provides free licenses for anything from enhancing accessibility
to advancing education for those in need or improving shared cultural experiences. By way of example,
they shared their first initiative, a partnership with Bridging Voice and the Scott-Morgan Foundation,
focused on helping people who are losing their voice to ALS or MND create copies of their voices
that match their natural speech, so they can retain that voice even if the disease progresses to a place
that makes communication in a traditional way impossible.
Pretty cool little initiative.
Glad to see companies like ElevenLabs doing this.
For now, though, that is going to do it for today's AI Daily Brief Headlines.
Next up, the main episode.
Today's episode is brought to you by Plumb.
Want to use AI to automate your work but don't know where to start?
Plumb lets you create AI workflows by simply describing what you want.
No coding or API keys required.
Imagine typing out, "AI, analyze my Zoom meetings and send me your insights in Notion,"
and watching it come to life before your eyes.
Whether you're an operations leader, marketer, or even a non-technical
founder, Plumb gives you the power of AI without the technical hassle. Get instant access to
top models like GPT-4o, Claude Sonnet 3.5, AssemblyAI, and many more. Don't let technology hold
you back. Check out Use Plumb, that's Plumb with a B, for early access to the future of workflow
automation. Today's episode is brought to you by Venice. The leading AI companies store your
entire conversation history and attach it to your identity forever. That's every question you
ask, every answer you receive, every image you generate, every thought you share with the
machine. It's all being spied on.
If you trust all the companies, hackers, and NSA board members that will ever have access to your AI
conversations, then rejoice, for you are well served. For the rest of us, Venice is an alternative.
Venice is a powerful AI app for text, image, and code generation that respects you as a sovereign
individual and believes privacy and free speech are not only human rights, but necessary for
civilizational advancement. Private, permissionless, and uncensored, you can try it for free
without an account. AI Daily Brief listeners receive a 20% discount on Venice Pro.
Visit venice.ai/nlw and enter the discount code NLWDAILYBRIEF. That's NLW Daily Brief, all one word.
Welcome back to the AI Daily Brief. It's slightly quieter today, assuming you're not just in a world of finally using Midjourney's web features.
And I just came across this great essay in VC Sarah Tavel's newsletter that deals with a question that we've been discussing all summer.
That question is, of course, the AI bubble question, specifically when it comes to valuations and money.
Right, as we've discussed lots of times, there is a big difference between discussing whether
Wall Street is pricing AI correctly and whether that is a bubble, versus whether AI itself, as a
technology and as a disruptive technology, is overhyped.
You may remember the essay that I discussed from David Cahn at Sequoia called "AI's $600B Question,"
which by the way wasn't nearly as negative as people tried to make it out.
And then of course there was the Goldman Sachs report, "Gen AI: Too Much Spend, Too Little Benefit?"
I spent a lot of time dissecting those two pieces,
and in particular took umbrage with the interview with Jim Covello, the head of global equity research
at Goldman Sachs, who made the very bold claim that, quote, the tech world is too complacent in its
assumption that AI costs will decline substantially over time, and that the starting point for
costs is also so high that even if costs decline, they would have to do so dramatically to make
automating tasks with AI affordable. Today we are going to read this essay, and it will be me
actually reading, no AI, and then we'll come back and talk about what it adds to this overall discussion.
Sarah writes,
I'm sure you read David Cahn's provocative piece,
"AI's $600B Question,"
in which he argues that,
given NVIDIA's projected Q4 2024 run rate of $150 billion,
the amount of AI revenue required to pay back
the enormous investment being made to train
and run large language models is now $600 billion,
and we are at least $100 billion in the hole on that payback.
The numbers are certainly staggering and are just going to get bigger.
Until we reach an efficient frontier of the marginal value of adding more compute,
or we hit some other roadblock that causes people
to lose faith in the current architecture, this is a contest now of not blinking first. If you're a
big stack player like Microsoft, Google, or any of the other foundation model pure plays, you have no
choice but to keep raising your bet. The prize and power of winning is too great. If you blink,
you are left empty-handed, watching someone else count your chips. It's likely hundreds of billions
will be destroyed and trillions earned. Too early to know who the winners or losers are, but for all
of us in the startup ecosystem, among many things, it's going to create new waves of AI opportunities.
Taking a step back, as LLMs progress, they are able to handle more complicated tasks.
Many of the foundation model companies talk about the amount of time it would take a human
to do the work as a measure of the power of the LLM.
If today, LLMs can handle tasks that would have taken a human five minutes to complete,
as LLMs progress, they'll be able to handle increasingly complicated tasks that would
have taken a human more time.
In the next decade, the belief is that they'll be able to handle tasks that would take
years for a human to do.
Therefore, as the LLMs become more and more sophisticated, the economic value that they
will be able to unlock becomes greater and greater. For example, annually, it is estimated that we spend
$1 trillion on software engineers globally. When people talk about GitHub Copilot, you hear people
throw around numbers like 10 to 20% productivity improvements (of course, GitHub claims higher). That
translates to $100 to $200 billion of value annually, were it to be fully deployed, of which
GitHub would capture some percentage. Indeed, Copilot is likely already a multi-billion dollar revenue line
for Microsoft. As LLMs progress and are able to go beyond code completion, like Copilot, there is almost
no limit in value creation as it would dramatically expand the market, a potential multi-trillion
dollar opportunity if someone emerges as a dominant player. And that's just coding. We've all experienced
the productivity improving benefits of LLMs, or been on the receiving end of an automated customer
support response. The potential value creation and capture with AI is beyond our existing
mental models. The challenge is that the amount of capital required to train each successively more sophisticated
LLM increases by an order of magnitude. And once a model is leapfrogged by another, the pricing power
of the older model quickly falls to zero. There are now more GPT-3.5 equivalents for a developer to
choose from. Not surprisingly, when GPT-3.5 launched in November 2022, it was head and shoulders
ahead of any competitive model, and cost two cents for a thousand tokens. It's now five hundredths of a cent,
2.5% of its original pricing in just one and a half years. I can't remember another technology
that has commoditized as quickly as LLMs. It's a dynamic that makes it almost
impossible to rationalize any ROI at this stage in the game, because any investment in an
LLM is almost instantly depreciated by the next version. But you can't really skip a step. You need to go
through countless worthless versions to get to the ultimate, the idealized AGI. So you have a bit of a
perfect storm. One, the economic value you are able to unlock as models become more sophisticated
should increase significantly with each upgrade of the model. The economic value of AGI is constrained
only by our imaginations. Two, pricing leverage comes from being a step function ahead of the
competition, at least along some dimension. If you fall behind, the value of your model to external
customers gets rapidly commoditized. Of course, there is still value for your internal use cases.
Three, Microsoft, Google, and Meta have core businesses that produce fire hydrants of cash,
Anthropic has found love with Google and Amazon, and OpenAI should continue to be able to
raise money from sovereigns that have their own more physical fire hydrants of cash.
The net result is that in the short term, until an efficient frontier is reached on the
marginal value of continuing to invest in infrastructure, with the existing transformer architecture,
or we run out of electricity, or a group pulls ahead with an untouchable lead thanks to some
smart algorithmic work, investment in this space by these giants should continue to increase
dramatically, and costs necessarily precede revenue. The prize is theoretically so large,
and if a clear winner emerges, their market opportunity so uncapped, you have to keep increasing
your bet. We are all massive beneficiaries of this battle playing out. The extreme pace of investment
in infrastructure, training, etc., combined with the urgency that only comes from intense competition,
is giving us all the gift of an insane pace of innovation
with models that are able to handle increasingly complicated tasks
at bargain basement prices.
Applications that might not be possible today, let alone economic,
such as most voice and video applications,
will be profitable before we know it.
Giddy up.
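The two back-of-the-envelope figures in the essay can be checked with a few lines of arithmetic. This is only a sketch that plugs in the numbers quoted above ($1 trillion of annual engineering spend, a 10 to 20% productivity gain, and a price drop from 2 cents to 0.05 cents per thousand tokens). The function and variable names are made up for the illustration.

```python
def productivity_value(annual_spend, low_gain, high_gain):
    """Range of annual value implied by a productivity gain on a given spend."""
    return annual_spend * low_gain, annual_spend * high_gain


def remaining_price_fraction(launch_price, current_price):
    """Current price as a fraction of the launch price (same units for both)."""
    return current_price / launch_price


# Figures quoted in the essay.
engineer_spend = 1_000_000_000_000  # $1 trillion per year on software engineers
low, high = productivity_value(engineer_spend, 0.10, 0.20)

# Prices in cents per 1,000 tokens, as quoted in the essay.
fraction = remaining_price_fraction(launch_price=2.0, current_price=0.05)

print(f"Implied value: ${low / 1e9:.0f}B to ${high / 1e9:.0f}B per year")
print(f"Current price as a share of launch price: {fraction:.1%}")
```

Both results line up with the essay's stated figures: the $100 to $200 billion range and the 2.5% of original pricing.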
All right, so back to NLW here.
First, thanks to Sarah for a great and provocative piece.
Two things that I want to hone in on.
Sarah breaks this apart into what it means for them
and what it means for us,
which is something that people don't do enough.
When it comes to what this means for them,
specifically the hyperscalers, Sarah argues basically the same thing that I argued in my previous
refutation of those pieces when she writes, the prize is theoretically so large and if a clear
winner emerges, their market opportunity so uncapped, you have to keep increasing your bet.
This is the logic that all of this investment is based on. Part of the reason that I've said
Wall Street is so uncomfortable with trying to price this is that the approach of these companies
is forcing Wall Street to think like a venture capitalist instead of like a Wall Street
investor. How to work backwards from and handicap the odds of reaching some new totally different
economic paradigm is just not an easy thing to do. And so I anticipate that there will continue to be
debates effectively for as long as it takes to get to the other side of AGI around whether this is
money well spent or not. My strong suspicion is in fact that in many cases these debates will tell us
less about how investors are feeling about AI and a lot more about how they're feeling about everything else.
In other words, I think that part of the reason that we're seeing some fatigue in the AI narrative right now,
in fact a lot of the reason, has nothing to do with AI itself and everything to do with the fact
that Wall Street has a new narrative champion in forthcoming Federal Reserve rate cuts.
Remember, the entire period of the post-ChatGPT AI boom on Wall Street has happened during
the Fed's hiking cycle and then higher for longer cycle.
Wall Street has often clung to the AI narrative as a counterbalance to the negative implications
of those higher rates. Now that rates are going to start coming down again, Wall Street feels more
comfortable jettisoning some of those narratives. As is so often the case, the vibe shift potentially tells us
a lot more about the vibe feeler than about the vibe creator. The second piece of this, though,
that I want to hone in on, is the point that she makes in the concluding paragraph, which is so
salient, that we are the beneficiaries of all of this playing out, that the extraordinary amount
of competition, which is driving prices down so quickly and increasing capacity
so quickly, is creating an unbelievably fertile landscape for building.
Solopreneurs who are hacking together applications that never would have been possible before
without venture capital are feeling it. Venture-backed startups who are getting to slosh around
and experiment with totally new paradigms of human-computer interaction are feeling it,
and enterprises, while stumbling over themselves with lots of false starts and proofs of concept
and concerns around ROI, are also for the first time in a long time really starting to sniff out
how a new category of technology can actually transform how they operate and what they can
achieve. In other words, rather than lamenting the gobs and gobs of cash that the foundation
model companies are throwing at this space, there's something to be said for just enjoying and frankly
creating the positive externalities of all that. Anyways, once again, big thank you to Sarah for
her newsletter. If you want to find more, you can go to sarahtavel.com, that's T-A-V-E-L dot com,
and that's going to do it for today's AI Daily Brief. Thanks for listening or watching as always,
and until next time, peace.
