The AI Daily Brief: Artificial Intelligence News and Analysis - The Four Wars of the AI Stack - An Update with Swyx and Alessio of Latent Space
Episode Date: March 28, 2024
NLW is joined today by the hosts of the Latent Space podcast for part one of a wide-ranging conversation about the changes and shifts in the AI market in early 2024. Find our guests online: https://tw...itter.com/swyx https://twitter.com/fanahova *** Be the first to learn about our new AI education platform: https://besuper.ai/ *** Today's Episode Brought to You By: Plumb - Build, test, and deploy AI features with confidence - https://useplumb.com/ ABOUT THE AI BREAKDOWN The AI Breakdown helps you understand the most important news and discussions in AI. Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown Join the community: bit.ly/aibreakdown Learn more: http://breakdown.network/
Transcript
Today on the AI Breakdown, part one of my conversation with Alessio and Swyx from Latent Space.
The AI breakdown is a daily podcast and video about the most important news and discussions in AI.
Go to breakdown.network for more information about our YouTube, our Discord, and our newsletter.
Hello, friends. Right now, I am doing a bit of travel with the family, and so we are doing interviews this week.
And today and tomorrow, I am bringing you two parts of a great conversation with my friends from Latent Space,
Alessio and Sean, better known as Swyx. You guys might remember a show we did together,
back six months ago or so, and this follows a similar format where we are talking about some of
the biggest trends shaping the AI space right now.
Hello, friends. Quick note before we get to the rest of the episode: you have probably
heard me talk about the AI education beta over the past few months. We've had a ton of you
participate, which has been amazing, and now we're almost ready to announce something big and
something new. If you want to be one of the first to hear about our new approach to learning
AI that is hyper-practical, hands-on, immediately relevant, continuously upgrading, and anchored by
community, go to besuper.ai and sign up to be notified when the project goes live. We're getting
there in just a few weeks, and I want all of you along for the journey. Once again, that's
besuper.ai. In this half of the show, we talk about the four wars of the AI stack, what we've
learned about the power of being GPU rich, and some of the realigning battles around big tech.
I don't want to waste any more time before we dive in, so let's listen.
All right, fellas, welcome back to the AI Breakdown.
How are you doing?
Very good.
Yeah.
The last time we did the show, we were like, oh, yeah, let's do check-ins monthly about
all the things that are going on.
And then, of course, six months later, and, you know, the world has changed in a thousand ways.
It's just getting too busy to even think about podcasting sometimes.
But I'm super excited to be chatting with you again.
I think there's a lot to catch up on that's happened
just in the beginning of 2024.
And so, you know, we're going to talk today about just kind of a broad sense of where
things are in some of the key battles in the AI space.
And then one of the big things that I'm really excited to have you guys on here for is to talk
about where, sort of what patterns you're seeing and what people are actually trying to build,
you know, where developers are spending their time and energy and any sort of, you know,
trends there.
But maybe let's start, I guess, by checking in on a
framework that you guys actually introduced, which I've loved and I've cribbed a couple of times now,
which is the sort of four wars of the AI stack. First, since I have you here, I'd love to
hear sort of like where that started jelling. And then maybe we can get into, I think, a couple of
them that are, you know, particularly interesting, you know, in light of some recent news.
Yeah, so maybe I'll take this one. So the Four Wars is a framework that I came up with around trying
to recap all of 2023. I tried to write sort of monthly recap pieces.
And I was trying to figure out, like, what makes one piece of news last longer than another or more significant than another?
And I think it's basically always around battlegrounds.
Wars are fought around limited resources.
And I think probably the most limited resource is talent, but the talent expresses itself in a number of areas.
And so I kind of focus on those areas first.
So the four wars that we cover are the data wars, the GPU rich/poor war, the multimodality
war, and the RAG and Ops war.
And I think you actually did a dedicated episode on that.
So thanks for covering that.
Yeah, yeah.
Not only did I do a dedicated episode, I actually used that.
I can't remember if I told you guys.
I did give you big shoutouts,
but I used it as a framework for a presentation at Intel's big AI event
that they hold each year where they have all their folks who are working on AI
internally.
And it totally resonated.
That's amazing.
That's amazing.
Yeah.
So what got me thinking about it again,
is specifically this Inflection news that we recently had, this sort of, you know, basically,
I can't imagine that anyone who's listening wouldn't have thought about it. But, you know,
Inflection is one of the big contenders, right? I think probably most folks would have put them,
you know, just a half step behind the Anthropics and OpenAIs of the world in terms of labs.
But it's a company that raised $1.3 billion last year, less than a year ago.
Reid Hoffman's a co-founder.
Mustafa Suleyman, who is a co-founder of DeepMind, you know, so it's like, this is not
a small startup, let's say, at least in terms of perception.
And then we get the news that basically most of the team it appears is heading over to Microsoft
and they're bringing in a new CEO.
And, you know, I'm interested in kind of your take on how much that reflects.
Like, hold aside, I guess, you know, all the other things that it might be about,
how much it reflects the stark, brutal reality of competing in the frontier
model space right now and, you know, just the access to compute.
There are a lot of things to say.
So first of all, there's always somebody who's more GPU rich than you.
So Inflection is GPU rich by startup standards.
I think they bought 22,000 H100s, but obviously that pales compared to Microsoft.
The other thing is that this is probably good news maybe for the startups.
It's like being GPU rich is not enough.
You know, like I think they were building something pretty interesting in Pi, their own model,
their own kind of experience, but at the end of the day, the interface that people consume
as end users is really similar to a lot of the others. And we'll talk about GPT-4 and Claude 3 and all
this stuff. Sometimes when you're a startup, you're going to have a lot of success being GPU poor,
doing something that the GPU rich are not interested in. We just had our AI Center of Excellence
at Decibel, and one of the AI leads at one of the big companies
was like, oh, we just saved $10 million and we use these models to do translation,
you know, and that's it.
It's not AGI.
It's just translation.
So I think the Inflection part is maybe a wake-up call to a lot of startups
to say, hey, you know, trying to get as much capital as possible, trying to get as many
GPUs as possible,
it's good, but at the end of the day, it doesn't build a business, you know, and maybe what
Inflection...
Again, I don't know the reasons behind the Inflection choice, but if you say, I don't
want to build my own company that has $1.3 billion and I want to go do it at Microsoft,
it's probably not a resources problem. It's more a strategic decision that you're making as a
company. So yeah, that was kind of my take on it. Yeah. And I guess on my end, two things actually
happened yesterday. There was a little bit quieter news, but Stability AI had some pretty
major departures as well. And you may not be considering it, but Stability is actually also a GPU
rich company, in the sense that they were the first new startup in this AI wave to brag about how many
GPUs they have and that you should join them. And, you know, Emad is definitely a GPU trader in
some sense from his hedge fund days. So Robin Rombach and, like, most of the Stable Diffusion 3
people left Stability yesterday as well. So yesterday was kind of like a big news day for the GPU
rich companies, both Inflection and Stability having sort of the wind taken out of their sails.
I think, yes, it's a data point in favor of, like, just because you have the GPUs doesn't mean you automatically win.
And I think, you know, kind of I'll echo what Alessio says there.
But in general also, like, I wonder if this is like the start of a major consolidation wave just in terms of, you know, I think there was a lot of funding last year.
And, you know, the business models have not been worked out very well.
Even inflection couldn't do it.
And so I think maybe that's the start of a small consolidation wave.
I don't think like that's like a sign of AI winter.
I keep looking for AI winter coming.
I think this is kind of like a brief cold front.
Yeah, it's super interesting.
So I think a bunch of stuff here.
One is I think to both of your points,
in some ways there has already been this very clear demarcation
between these two sides, where, like, the GPU poors, to use the terminology,
just weren't trying to compete on the same level, right?
You know, the vast majority of people who have started something over the last year,
year and a half, call it, were racing in a different direction.
They're trying to find some edge somewhere else.
They're trying to build something different.
If they're really trying to innovate, it's in different areas.
And so it's really just this very small handful of companies that are in this, like, very, you know,
it's like the Coheres and Jaspers of the world, this sort of, you know, set that are
just sort of a little bit less resourced than, you know, than the other set, that I think
this potentially even applies to.
You know, everyone else could be clearly demarcated into these two sides.
And there's only a small handful kind of sitting uncomfortably in the middle, perhaps.
Let's come back to the idea of this sort of AI winter or, you know, a cold front or anything like that.
So this is something that I spent a lot of time kind of thinking about and noticing.
And my perception is that the vast majority of the folks who are trying to call for sort of, you know, a trough of disillusionment or, you know, a shifting of the phase
to that are people who either, A, just don't like AI for some other reason. There's plenty of that,
you know, people who are saying, look, they're doing way worse than they ever thought. You know,
there's a lot of sort of confirmation bias kind of thing going on. Or B, media that just needs a
different narrative, right, because they're sort of sick of, you know, telling the same story.
Same thing happened last summer when every outlet jumped on the "ChatGPT had its first down month"
story to try to really, like, kind of hammer this idea
that the hype was too much.
Meanwhile, you have, you know, just ridiculous levels of investment from enterprises, you know,
coming in.
You have, you know, huge volumes of, you know, individual behavior change happening.
But I do think that there's nothing incoherent, sort of to your point, Swyx, about that and the
consolidation period.
Like, you know, if you look right now, for example, there are, I don't know, probably 25 or 30
credible, like, build your own chatbot platforms that, you know, a lot of which have, you know,
raised funding. There's no universe in which all of those are successful across, you know,
even with a total addressable market of every enterprise in the world, you know, you're just
inevitably going to see some amount of consolidation. Same with, you know, image generators. There are,
if you look at a16z's top 50 consumer AI apps just based on, you know, web traffic or whatever,
there's still like, I don't know, a half dozen or 10 or something, like some ridiculous number of, like, basically things like Midjourney or DALL-E 3. And it just seems impossible that we're going to have that many, you know, ultimately as sort of, you know, going concerns.
So I don't know. I think that there will be inevitable consolidation because, you know, it's also just what venture rounds are supposed to do. Not everyone who gets a seed round is supposed to get to Series A, and not everyone who gets a Series A is supposed to get to Series B. That's sort of the natural
process. I think it will be tempting for a lot of people to try to infer from that something about
AI not being as big or as relevant as it was hyped up to be. But I kind of think
that's the wrong conclusion to come to. I would say the experimentation surface is a little
smaller for image generation. So if you go back maybe six, nine months, most people will tell you
why would you build a coding assistant
when, like, Copilot and GitHub
are just going to win everything
because they have the data and they have all this stuff.
If you fast forward to today,
a lot of people use Cursor.
Everybody was excited about the Devin
release on Twitter.
There are a lot of different ways
of attacking the market
that are not completion of code in
the IDE. And even Cursor,
like, they evolved beyond single line to, like, chat,
to do multi-line edits and all that stuff.
Image generation, I would say, yeah, just from what I've seen, like, maybe the product innovation
has slowed down at the UX level and people are improving the models. So the race is like, how do
I make better images? It's not like, how do I make the user interact with the generation process better?
And that gets tough, you know. It's hard to, like, really differentiate yourself. So yeah, that's
kind of how I look at it. And when we think about multi-modality, maybe the reason why people
got so excited about Sora is, like, oh, this is not a better image model.
This is like a completely different thing, you know? And I think the creative mind is always looking
for something that impacts the viewer in a different way, you know, like they really want something
different, versus the developer mind. It's like, oh, I have this, like, very annoying thing. I want it
better. I have this, like, very specific use case that I want to go after. So it's just different. And
that's why you see a lot more companies in image generation, but I agree with you that if you fast forward, there's not going to be 10 of them.
You know, it's probably going to be one or two.
Yeah, I mean, to me, that's why I call it a war.
Like individually, all these companies can make a story that kind of makes sense, but collectively, they cannot all be true.
Therefore, there is some kind of fight over limited resources here.
Yeah, so it's interesting.
We wandered very naturally into sort of another one of these wars, which is the multimodality kind of idea, which is, you know,
basically a question of whether it's going to be these sort of big everything models that
end up winning or whether, you know, you're going to have really specific things, you know,
like something, you know, DALL-E 3 inside of sort of OpenAI's larger models versus, you know,
a Midjourney or something like that. And at first, you know, I was kind of thinking, like,
for most of the last, call it, six months or whatever, it feels pretty definitively both-and
in some ways, you know, in that you're seeing just, like,
great innovation on sort of the everything models, but you're also seeing lots and lots happen
at sort of the level of kind of individual use cases. But then Sora comes along and just, like,
obliterates what I think anyone thought, you know, about where we were when it comes to video generation.
So how are you guys thinking about this particular battle or war at the moment? Yeah. This was definitely
a both-and story, and Sora tipped things one way for me, in
terms of scale being all you need, and the benefit, I think, of having multiple models being
developed under one roof. I think a lot of people aren't aware that Sora was developed in a similar
fashion to DALL-E 3. And DALL-E 3 had a very interesting paper out where they talked about how they
sort of bootstrapped their synthetic data based on GPT-4 Vision and GPT-4. And it was just all, like, really
interesting, like if you work on one modality, it enables you to work on other modalities,
and all that is more beneficial if it's all in the same house. Whereas the individual
startups who sort of carve out a single modality and work on that definitely, you
know, won't have the state-of-the-art stuff helping them out on synthetic data. So I do think
like the balance is tilted a little bit towards the God model companies, which is challenging
for the dedicated modality companies, but everyone's carving out different niches.
You know, like we just interviewed Suno AI, the sort of music model company.
And, you know, I don't see OpenAI pursuing music anytime soon.
Yeah, Suno's been phenomenal to play with.
Suno has done that rare thing where, which I think a number of different AI product
categories have done, where people who don't consider themselves particularly interested
in doing the thing that the AI enables find themselves doing a lot more of that thing, right?
Like, it'd be one thing if just musicians were excited about Suno and using it.
But what you're seeing is tons of people who just like music all of a sudden, like, playing
around with it and finding themselves kind of down that rabbit hole, which I think is kind of
like the highest compliment that you can give one of these startups at their early days of it.
Yeah.
You know, I asked them directly, you know, in the interview about whether they consider themselves
Midjourney for music.
And he had a more sort of nuanced response there.
But I think that probably the business model is going to be very similar because he's
focused on the B2C element of that.
So yeah, I mean, you know, just to tie back to the question about, you know, large multi-modality
companies versus small dedicated modality companies.
Yeah, I highly recommend people read the Sora blog post and then read through to the DALL-E
blog post, because they strongly correlated themselves with the same synthetic data bootstrapping
methods as DALL-E.
And I think once you make those connections, you're like, oh, like it is beneficial to have
multiple state-of-the-art models in-house that all help each other. And that's the one thing
that a dedicated modality company cannot do.
Today's podcast is brought to you by Plumb. You've probably
noticed by now that many of the AI features that are embedded in your favorite products kind of suck.
They're cool the first time, but pretty soon you're underwhelmed. That's because truly great
AI features require complex pipelines and rigorous testing that most startups simply don't have
time or tooling to get right. That's why Plumb created a collaborative AI app builder that's
purpose-built for product teams. Your users deserve better than a glorified GPT wrapper.
Blow their minds with Plumb. Check out useplumb.com. That's plumb with a B. Send me a note to get
early access.
So I want to jump, I want to kind of build off that and move into the sort of, like,
updated GPT-4 class landscape, because that's obviously been another big change over the last couple
months. But for the sake of completeness, is there anything that's worth touching on with
sort of the quality data or sort of RAG/Ops wars, just in terms of, you know, anything that's,
I guess, changed for you fundamentally in the last couple months
about where those things stand?
So I think we're going to talk about RAG
for the Gemini and Claude discussion later.
And so maybe briefly discuss the data piece.
I think maybe the only new thing
was this Reddit deal with Google
for, like, $60 million
just ahead of their IPO,
very conveniently turning Reddit into an AI data company.
Also very interestingly, a non-exclusive deal,
meaning that Reddit can resell that data
to someone else,
and it probably does become table stakes.
A lot of people don't know, but a lot of the WebText dataset that originally started for GPT-1, 2, and 3
was actually scraped from Reddit, at least the sort of vote scores.
And I think that's a very valuable piece of information.
So, like, yeah, I think people are figuring out how to pay for data.
People are suing each other over data.
This war is definitely very, very much heating up.
And I don't see it getting any
less intense.
Next to GPUs, data is going to be the most expensive thing in a model stack company.
And a lot of people are resorting to synthetic versions of it, which may or may not be kosher
based on how far or long or how commercially blessed the forms of creating that synthetic data
are.
I don't know, unless you have any other interactions with data source companies, but that's my two cents.
Yeah. Yeah, actually,
I saw Quentin Anthony from EleutherAI at GTC this week.
He's also been working on this.
I saw Technium. He's also been working on the data side.
I think especially in open source, people are like, okay,
if everybody is putting the gates up, so to speak, to the data,
we need to make it easier for people that don't have 50 million a year to get access to good data sets.
And Jensen at his keynote, he did talk about synthetic data a little bit.
So I think that's something that we'll definitely hear more and more of in the enterprise, which never bodes well, because then all the people with the data are like, oh, the enterprises want to pay now.
Let me put a "pay here" Stripe link so that they can give me $50 million.
But it worked for Reddit.
I think the stock is up 40% today after opening.
So, yeah, I don't know if it's all about the Google deal, but obviously Reddit has been one of those companies where, hey, you've got all this
great community, but, like, how are you going to make money? And, like, they tried to sell the
avatars. I don't know if that's a great business for them. As an investor,
you know, the data part sounds a lot more interesting than consumer cosmetics. Yeah. Yeah. So I think,
you know, there's more questions around data. You know, I think a lot of people are talking about
the interview that Mira Murati did with the Wall Street Journal, where she, like, just basically
had no good answer for where they got the data for Sora. I think this is where, you know,
it's in nobody's interest to be transparent about data. And it's kind of sad for the state of
ML and state of AI research. But it is what it is. We have to figure this out as a society,
just like we did for music and music sharing, you know, in sort of the Napster to Spotify transition.
And that might take us a decade. Yeah. I agree. I think I think that you're right to identify it,
not just as that sort of technical problem, but as one where society has to have a debate with itself.
Because I think that, if you sit rationally within it, there's great kind of points on all
sides, not to be the sort of, you know, person who sits in the middle constantly.
But it's why I think a lot of these legal decisions are going to be really important because, you know,
the job of judges is to listen to all this stuff and try to come to things and then have other judges
disagree and, you know, and have the rest of us all debate at the same time.
By the way, as a total aside, I feel like synthetic data right now is like eggs
in the 80s and 90s, like whether they're good for you or bad for you.
Like, you know, we get one study that's like, synthetic data, you know, there's model collapse.
And then we have, like, a hint that Llama 2, you know, the most high-performance version of it,
which was one they didn't release, was trained on synthetic data.
So maybe it's good.
I just feel like every other week I'm seeing something sort of different about whether it's
good or bad for these models.
Yeah, the branding of this is pretty poor.
I would kind of tell people to think about it like cholesterol.
There's good cholesterol, bad cholesterol.
and you can have good amounts of both.
But at this point, it is absolutely without a doubt
that most large models from here on out
will all be trained on some kind of synthetic data,
and that is not a bad thing.
There are ways in which you can do it poorly,
whether it's commercial, you know,
in terms of commercial sourcing
or in terms of the model performance,
but it's without a doubt that good synthetic data
is going to help your model.
And this is just a question
of where to obtain it and what kinds of synthetic data are valuable.
Even, like, AlphaGeometry, you know, was a really good example from, like, earlier this year.
If you're using the cholesterol analogy, then my egg thing can't be that far off.
Yeah, exactly.
Let's talk about sort of the state of the art and the GPT-4 class landscape and how that's
changed.
Because obviously, you know, sort of the two big things or a couple of the big things that have
happened since we last talked were one, you know,
Gemini first announcing that a model was coming and then finally it arriving, and then very soon
after a sort of a different model arriving from Gemini, and Claude 3. So I guess, you know,
I'm not sure exactly where the right place to start with this conversation is, but, you know,
maybe very broadly speaking, which of these do you think have made a bigger impact?
Probably the one you can use, right? So, Claude.
Well, I'm sure Gemini is going to be great once they let me in, but so far I haven't been
able to. So I have this small podcaster thing that I built for our podcast, which does
chapter creation, like, named entity recognition, summarization and all of that. Claude 3 is
better than GPT-4. Claude 2 was unusable, so I used GPT-4 for everything. And then when Opus came out,
I tried them again side by side and I posted it on Twitter as well. Claude is very good. You know,
it's much better, it seems to me. It's much
better than GPT-4 at doing writing that is more, you know, I don't know, it's just got good vibes,
you know. Like the GPT-4 text, you can tell it's GPT-4, you know, it's like it always uses
certain types of words and phrases and, you know, maybe it's just me because I've now done it
for 50 podcast episodes, so I've read like 75, 80 generations of these things next to each other,
but Claude is really good. I know everybody is freaking out on Twitter about it.
My only experience of this being much better
has been on the podcast use case.
But I know that, you know,
Karan from Nous Research is a very big Opus,
pro-Opus person.
So I think that's also,
it's great to have people that actually care about other models.
You know,
I think so far to a lot of people,
maybe Anthropic has been the sibling in the corner,
you know, it's like Claude releases a new model
and then OpenAI releases Sora and, like,
you know,
there are like all these different things.
But yeah, the new models are good.
It's interesting.
My perception is definitely that just observationally,
Claude 3 is certainly the first thing that I've seen where lots of people,
no one's debating evals or anything like that.
They're talking about the specific use cases that they have,
that they used to use ChatGPT for every day, you know, day in and day out,
that they've now just switched over.
And that has, I think, shifted a lot of the sort of vibe and sentiment in the space, too.
And I don't necessarily think that it's sort of a full, you know, sort of full knock.
Let's put it this way.
I think it's less bad for OpenAI than it is good for Anthropic.
I think that because GPT-5 isn't there, people are not quite willing to sort of, like, you know, get overly critical of OpenAI,
except insofar as they're wondering where GPT-5 is.
But I do think that it makes Anthropic look way more credible as a player,
you know, as opposed to where they were.
Yeah.
And I would say the benchmarks veil is probably getting lifted this year.
I think last year people were like, okay, this is better than this on this benchmark,
blah, blah, blah, because maybe they did not have a lot of use cases that they did frequently.
So it's hard to like compare yourself.
So you deferred to the benchmarks.
I think now as we go into 2024,
a lot of people have started to use these models from,
you know,
from very sophisticated things that they run in production to some utility
that they have on their own.
Now they can just run them side by side.
And it's like,
hey,
I don't care that, like, the MMLU score of Opus is, like, slightly lower than GPT-4.
It just works for me, you know?
And I think that's the same way that traditional software has been used by people, right?
Like, you just try it for yourself and see which one works best for you. Like, nobody
looks at benchmarks outside of, like, sales white papers, you know? And I think it's great that we're
going more in that direction. We have an episode with Adept coming out this weekend. In some of their
model releases they specifically say, we do not care about benchmarks, so we didn't put them in, you know,
because we don't want to look good on them, we just want the product to work. And I think more
and more people will go that way.
Yeah.
I would say it does take the wind out of the sails for GPT-5,
which I know we're curious about later on.
I think any time you put out a new state-of-the-art model,
you have to break through in some way.
And what Claude and Gemini have done
is effectively take away any advantage to saying
that you have a million-token context window.
Now everyone's just going to be like, oh, okay,
now you just match the other two guys.
And so that puts an insane amount of pressure
on what GPT-5 is going to be, because, like, the only option it has now,
because all the other models are multimodal, all the other models are long context,
all the other models have perfect recall,
is that GPT-5 has to match everything and do more to not be a flop.
