No Priors: Artificial Intelligence | Technology | Startups - Chips, Neoclouds, and the Quest for AI Dominance with SemiAnalysis Founder and CEO Dylan Patel
Episode Date: August 14, 2025
What would it take to challenge Nvidia? SemiAnalysis Founder and CEO Dylan Patel joins Sarah Guo to answer this and other topical questions around the current state of AI infrastructure. Together, they explore why Dylan loves Android products, predictions around OpenAI's open source model, and what the landscape of neoclouds looks like. They also discuss Dylan's thoughts on bottlenecks for expanding AI infrastructure and exporting American AI technologies. Plus, we find out what question Dylan would ask Mark Zuckerberg. Sign up for new podcasts every week. Email feedback to show@no-priors.com Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @dylan522p | @SemiAnalysis_ Chapters: 00:00 – Dylan Patel Introduction 00:31 – Dylan's Love for Android Products 02:10 – Predictions About OpenAI's Open Source Model 06:50 – Implications of an American Open Source Model for the Application Ecosystem 10:48 – Evolution of Neoclouds 17:26 – What It Would Take to Challenge Nvidia 27:43 – What Would an Nvidia Challenger Look Like? 28:18 – Understanding Operational and Power Constraints for Data Centers 34:48 – Dylan's View on the American Stack 43:01 – What Dylan Would Ask Mark Zuckerberg 44:22 – Poker and AI Entrepreneurship 46:51 – Conclusion
Transcript
Hi, listeners. Welcome back to No Priors.
Today I'm here with Dylan Patel, the chief analyst at SemiAnalysis, a leading source for anyone
interested in chips and AI infrastructure. We talk about open source models, the bottlenecks
to building a data center the size of Manhattan, geopolitics, and poker as a tell for
entrepreneurship. Welcome, Dylan. Dylan, thank you so much for being here.
Thank you for having me.
I've been really looking forward to this conversation. You're such
a deep thinker about the space. And then also it's very odd. You clearly have the Samsung watch.
Yeah. I got the foldy phone. I got the blig. And the laptop. The fold. Yeah. Yeah. Tell me more.
So part of the origin story is that I was moderating forums when I was a child. And my dad's first
Android phone was the Droid, right? And for some reason, I was obsessed with, like, messing with it,
like rooting it, like underclocking it, improving the battery life, all these things.
Because when we went on a road trip, there was nothing to do besides, like, mess around on his phone.
So I posted so much about Android that I became a moderator of r/Android on Reddit,
and, like, many other subreddits related to hardware and Nvidia and all this stuff.
But because of that, I've just always had Android.
Now, I've had work iPhones before, but I just really love Android. And it's like,
if you're going to like technology, I'm not, like, someone who pushes it, but, like, get the best stuff.
So I have, like, the Ultra Samsung watch, which I think looks cool, and the foldy phone, right?
It's fun.
It's obviously different and weird.
No iMessage is a travesty.
What does it dominate at?
What is it better at besides the openness of like the hackability?
I don't even hack that much stuff anymore, right?
It's like, what do you use your phone for?
I think the main thing is, like, you can have, like, Slack and email up on two different parts of your phone.
I think that's probably the main thing or like you can actually use like a spreadsheet on a folding phone.
You cannot use a spreadsheet on a regular phone.
Okay.
And that's not even an Android thing.
Like Apple's folding phone next year will be able to do that just fine and I'll have no argument then.
But I just like it.
You know, people have their preferences.
people are creatures of habit.
You got to look at the GPU purchasing forecast on a sheet on your phone.
Yes, I do.
I do, no.
It's like someone's telling you numbers.
You're like, wait, this is like slightly different than my number, right?
Okay, so we have a week of big rumored announcements coming up.
Tell me your, like, reaction to the OpenAI open source model.
In theory, it's going to be amazing, right?
Like, I assume this is releasing after it's released or?
Yes.
So that's okay.
The open source model is amazing, guys.
I think the world is going to be really, like, shocked and excited.
It's the first time America's had the best open source model in six months, nine months, a year.
Llama 3.1 405B was the last time we had the best model.
And then Mistral took over for a little bit, if I recall correctly,
and then the Chinese labs have been dominating for the last, like, six, nine months, right?
So it'll be interesting.
It'll also be funny because, like, the open source model probably won't be the best for just regular chat.
because it is like more reasoning focused and all these things.
But it'll be really good at code, and I'm excited for that.
Yeah, like tool use, although that's like going to be confusing.
Like, how do you use the tools if you don't have access to OpenAI's tool use stuff,
but the model is trained to do so.
That'll be interesting for people to figure out.
I think the last thing is like the way they're rolling it out is really interesting.
They accidentally leaked all the weights,
but no one in the open source community figured out how to actually run inference on it,
because there's just some weird stuff in the model with the architecture,
like the 4-bit stuff and, like, the biases and all this other stuff.
But what's interesting is other companies drop the model weights and say, go,
make your own inference implementation.
But OpenAI is, like, actually, like, dropping the model weights and, like, all these custom
kernels for people to implement in inference.
So everyone has a very optimized inference stack day one.
And they work with partners on it too.
Yeah, working with partners on this.
But this is very interesting, because, like, when DeepSeek drops, it's like, well,
Together and Fireworks are like, yeah, we're the best at inference because we have all
these, like, people who are really good at low-level coding, whether it be, like, Fireworks with
all their, like, former PyTorch Meta people, or Together with, like, you know, Tri Dao and, you know,
Dan Fu and all these, like, super cracked, like, kernel people. They have, like, higher performance,
right? But in this case, like, OpenAI is releasing a lot of this stuff. So it's interesting
for the inference providers too. Like, how do they differentiate now? Yeah. I mean, my premise on this is
in the end, a lot of the model optimization performance layer is open source. And,
it's a commodity. And it will end up being like a fight at the infrastructure level, actually.
Interesting.
And so, you know, all of these inference providers, like as you mentioned, you know, fireworks and
together, base 10 and such, they compete on both dimensions. And the question is, what's going to
matter in the long term?
Why would these model-level software optimizations all be open? They haven't been open so far, and
the advancements are so fast, right? Well, I think a bunch of them have been partially
open. And I think OpenAI is also pushing for them to be open
as well, right? And so I think there's a lot of force in the ecosystem to open source from both
like, the Nvidia level up and from the model providers down.
And so I think today these providers all fight on that dimension. Yeah. And they also fight on
the infrastructure dimension. And I think infrastructure is going to end up being a bigger
differentiator. That makes sense. You can't open source your actual infrastructure, right? You just have
to have the network and you have to run it. Yeah, yeah. That makes a lot of sense. Although like I see
today the inference providers have such a wide variance, right? Like, the ones you mentioned are
on the leading edge. Especially, like, Together and Fireworks, I think, are on the leading
edge with, like, their own custom stacks all the way down. And then there's a lot of people who just
take the out-of-the-box open source software. Yeah, I think there's no market for that. But those guys have,
just, yeah, I agree, there's no market. It's, like, commoditized. Yeah, they have really, really way worse
margins than the people who are very optimized. You see Nvidia trying to open source all the
stuff around Dynamo, and OpenAI and all these other people are trying to open source stuff. But
the level of optimization is also, like, really, really large, like caching between turns
and caching tool use calls and all these other things. And it's not just, like, a single-server
problem. Like, the DeepSeek implementation of inference is, like, 160 GPUs or something
like that. Like, that's over $10 million of hardware. And then that's just one replica. And then
you'll have a lot of replicas, and you share the caching servers between them. So, like,
just the orchestration of that, but also the infrastructure of that, it's a very large amount
of infrastructure. I don't know. That's an interesting thought, that the optimization layer will be completely
commoditized. Well, I think that there's optimization at the single-node level,
and then there's, like, the system software where you can, like, orchestrate this. And I think
that owning the abstractions for it, and having people use your tools and more sophisticated
teams to do that optimization, it's a very ugly distributed systems problem. I think that
will matter. Okay. I could agree with that. I can agree with that. Single node is not necessarily.
Yeah. I agree.
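To make "caching between turns" concrete, here is a minimal sketch of prefix reuse: the KV state computed for a conversation so far is keyed by its token prefix, so the next turn only has to prefill the new tokens. The class and the dict-based store are illustrative assumptions, not any provider's actual implementation; production stacks such as vLLM hash fixed-size token blocks and keep the KV tensors on GPU.

```python
import hashlib

# Minimal sketch of prefix caching between chat turns (illustrative only).
class PrefixKVCache:
    def __init__(self):
        self._store = {}  # prefix hash -> opaque KV state

    @staticmethod
    def _key(token_ids):
        return hashlib.sha256(str(token_ids).encode("utf-8")).hexdigest()

    def longest_cached_prefix(self, token_ids):
        # Walk back from the full sequence until we hit a cached prefix.
        for end in range(len(token_ids), 0, -1):
            kv = self._store.get(self._key(token_ids[:end]))
            if kv is not None:
                return end, kv
        return 0, None

    def put(self, token_ids, kv_state):
        self._store[self._key(token_ids)] = kv_state

cache = PrefixKVCache()
turn1 = [1, 5, 9, 2]        # system prompt + first user message
cache.put(turn1, "KV(turn1)")
turn2 = turn1 + [7, 7, 3]   # the next turn re-sends the whole history
hit, _ = cache.longest_cached_prefix(turn2)
print(f"reuse {hit} cached tokens, prefill only {len(turn2) - hit}")
```

Sharing a store like this across many replicas, as in the DeepSeek-style deployment described above, is exactly the ugly distributed systems problem being discussed.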
Let's move, you know, out a layer. Like, what does
having access to an American open source model mean, or just more and more powerful, like,
open source AI models, mean for the application ecosystem?
I mean, I know, like, a lot of people and some enterprises are really iffy about, like,
using, like, the best open source model.
They're, like, worried.
It's like, there's nothing wrong with them today.
There's nothing in them today, right?
You know, there's the worry that one day they will.
How do you check?
I mean, you don't, but you can just vibes it out.
Like, they're, like, competing with each other to just release as fast as possible, right?
Like, like DeepSeek and Moonshot and all these other labs,
you know, Alibaba, et cetera, like, they're competing to release as fast as they can with each other.
The Alibaba teams in Singapore, like, I don't think that they're, like, putting Trojan horses in
these models, right? And, like, there's some interesting papers that Anthropic did on, like,
you know, trying to embed some stuff in models and ended up, like, being detectable pretty
easily. Again, like, I don't know how to, you know, I'm not, I'm not too much into that space
of interpretability and, like, evals, but I just don't think that they are, right? It's just a
vibes thing. But some people are worried that they could be or they're just like iffy. Like,
oh, I don't want to use a Chinese model. It's like, well, fine, but now you're going to go use
a service that is backed by a Chinese model, which is fine. Like, you know, like, but they, you know,
they're fine with that. They just don't want to directly use the model. I don't know. I think,
I think it's interesting for some enterprises who are still stuck on Llama, but it's mostly
just really interesting because it continues to move the commodity bar up. Now with this tier being
open source. And sure, it probably won't be, like, drastically better than Kimi.
But Kimi is so big, it's so difficult to run, like, people aren't running it, whereas the OpenAI
model is, like, relatively small, so you can run it without being, like, a gigabrain at infrastructure.
You end up with that commoditizing so much more of the closed source API market.
And I think that's just going to be great for adoption, right?
Yeah, one of my hopes is for our companies that are doing more with reasoning, it is like they're
still blocked on cost and latency.
So this is something that I've found very interesting is that we've been trying to build a lot of alternative data sources for token usage.
Who's using what tokens, what models, where, et cetera, why?
And it's very clear that people aren't actually using the reasoning models that much in the API.
Like, Anthropic has eclipsed OpenAI in API revenue, and their API revenue is primarily not thinking.
It's Claude 4, but it's not in the thinking mode.
You know, code being the biggest use case that's skyrocketing.
And the same applies to, like, OpenAI and DeepMind, from what we see querying big users
and other ways of like scraping alternative data because the latency issues, because the cost
issues especially, right?
The cost is just ridiculous.
Exactly.
So I guess my view is, you're not allowed to have a tech podcast without saying the words
Jevons paradox now.
And I think, like, the behavior is going to be, like, we'll see a lot more people use reasoning
because it's so much cheaper to run, if you take out a big piece of the margin layer and you make
it smaller.
And so I think, like, we have a lot of companies that are at scale who are using it, but it's so expensive that they restrain themselves.
For a long time, OpenAI was charging more per token for the reasoning models, right, o1 and o3, than they were for GPT-4o, even though the architecture is, like, basically the same.
It's just the weights are different.
And there's like some reason for it to be a little bit more expensive per token because the context length is on average longer.
But in general, like it made no sense for it to be like, was it like 4X the cost per token?
That didn't make any sense.
And then finally they, like, cut it.
But for a long time, not only was it, like, way more tokens outputted,
it was also a way higher price per token, and they were just taking that as margin.
Because they could, right?
Because they had the only thing out there.
Yeah.
And then, you know, DeepSeek dropped, and Anthropic and Google and others started releasing
models, and it, like, you know, commoditized quite a bit.
But this is going to just like kneecap, like cut everyone off at the hip, right?
And bring margins down again.
So that would be fun.
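To put rough numbers on that margin point: the premium compounded, price per token times tokens emitted. Here is a back-of-envelope sketch with made-up but representative figures; only the roughly 4x per-token multiple comes from the discussion above, everything else is an assumption.

```python
# Illustrative numbers only, not actual OpenAI pricing.
base_price = 10.0         # $ per 1M output tokens, non-reasoning model (assumed)
reasoning_multiple = 4.0  # the ~4x per-token premium discussed above
plain_tokens = 500        # a typical short answer (assumed)
reasoning_tokens = 5000   # same question with a long chain of thought (assumed)

plain_cost = plain_tokens / 1e6 * base_price
reasoning_cost = reasoning_tokens / 1e6 * base_price * reasoning_multiple
print(f"plain: ${plain_cost:.4f}, reasoning: ${reasoning_cost:.4f}, "
      f"{reasoning_cost / plain_cost:.0f}x per request")
```

With these assumptions, a reasoning request costs 40x the plain one, which is why cutting the per-token premium matters so much for adoption.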
Who has an API business, you mean?
Yeah, yeah.
For API, for models that aren't, like, super leading edge.
What do you think evolves in the sort of neocloud layer over time?
It's funny. Every day we still find a new neocloud. We have like 200 now. And still every day
we find new ones, right? Should they all exist? Obviously not, right? So to some extent,
it depends on what the neocloud business is. Like today, there is quite a bit of differentiation
between the neoclouds. It's not just like buy a GPU, put it in a data center. Otherwise,
you wouldn't have some neoclouds with like horrible utilization rate and you wouldn't have
some neoclouds who are, like, completely sold out on four-, five-, six-year contracts, right?
Like CoreWeave, for example.
Who doesn't even quote most startups? Or they just give them a stupid quote because they're just
like, I don't want your business. Or, like, they want a long-term contract, right? Which a lot of
people don't want to sign. And so like there's quite a bit of differentiation in financial
performance of these neoclouds, time to deploy, reliability, the software they're putting
on top, right? Like, many of them can't even install Slurm for you. It's like, what are you doing?
And you should have some sort of like- So very low-level hardware management.
Yeah, yeah. It's like very, and it's like to some extent,
from the investor side, we see a lot more debt and equity flowing in from the commercial real
estate folks. As commercial real estate has been really poor over the last couple years,
few years, they've been starting to pour money into cloud space. Obviously, the return profile is
quite different because it's like a short-lived asset versus like a longer-lived asset. But at the end of
the day, like these companies, they're okay with a 10, 15% return on equity, right? And over time,
that falling. That is not okay for venture capital, right? And yet a lot of these neoclouds are backed by
venture capital. So a lot of these companies will fail either because it no longer makes sense for them
to continue to get venture funding or they end up getting out competed because they just can't get
their utilization up, unlike, you know, some other clouds, right? Like the, uh,
CoreWeaves and Crusoes and such of the world, right? So there's sort of, like, a rock and a hard place for
a hundred of these neoclouds. And there's many of them who are like, oh no, I purchased these
GPUs. I have a loan. It costs me this much. And because my utilization is here, I'm,
like burning cash, right?
And they should at the very least not be burning cash, right?
And so some of them are like, you know, they're desperate to sell the remaining GPUs.
So they go out to like, you know, companies and give them insanely low deals.
There's some startups who I really commend because they like really figured out how to get
the desperate neoclouds to give them GPUs.
But those neoclouds are going to go bankrupt at some point because their cash flow is worse
than their debt payment.
But at the end of the day, like there's going to be a lot of consolidation.
There is going to be differentiation, right?
There's a lot of software today.
But we have this thing called ClusterMax where we review all the neoclouds and major clouds.
And it's like, like actually some of these neoclouds are better than Amazon and Google and Microsoft in terms of software.
In terms of uptime and availability or however you measure that.
Yeah, uptime availability, network performance.
There's just a variety of things that they don't have all the old baggage.
But the vast majority are worse.
And we measure across like a bunch of different metrics, including the ones I mentioned and security and so on and so forth.
But our vision of, like, ClusterMax is that it starts at, like, a really low stage today, which is, like, does the cloud work, and how long does it take the user to, like, get a workload running?
Because you have Slurm installed or you have K8s installed, and your network performance is good, or your reliability is good, and it's secure, right?
Like, these are, like, table stakes.
Like, what we consider gold or platinum tier today will be just, like, table stakes in, like, you know, six months, a year, a couple of years.
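As a sketch of what that day-one bar looks like in practice, here is the kind of minimal multi-GPU acceptance test a renter might run: time a large all-reduce and check the interconnect bandwidth is sane. This assumes PyTorch with NCCL and a launcher like torchrun; it is a toy health check, not SemiAnalysis's actual ClusterMax methodology.

```python
# Rough multi-GPU health check: time a big all-reduce, report bus bandwidth.
# Launch with e.g.: torchrun --nproc_per_node=8 allreduce_check.py
import time
import torch
import torch.distributed as dist

def main():
    dist.init_process_group("nccl")  # rank/world size come from the launcher
    rank = dist.get_rank()
    # Single-node assumption; multi-node setups should use LOCAL_RANK instead.
    torch.cuda.set_device(rank % torch.cuda.device_count())
    x = torch.ones(256 * 1024 * 1024, device="cuda")  # 1 GiB of fp32

    for _ in range(3):  # warm up NCCL before timing
        dist.all_reduce(x)
    torch.cuda.synchronize()

    iters = 10
    t0 = time.time()
    for _ in range(iters):
        dist.all_reduce(x)
    torch.cuda.synchronize()
    per_iter = (time.time() - t0) / iters

    if rank == 0:
        n = dist.get_world_size()
        gib = x.numel() * 4 / 2**30
        # A ring all-reduce moves ~2*(n-1)/n of the buffer per GPU.
        busbw = gib * 2 * (n - 1) / n / per_iter
        print(f"all-reduce {gib:.0f} GiB in {per_iter * 1e3:.1f} ms "
              f"(~{busbw:.0f} GiB/s bus bandwidth)")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```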
There will be a whole layer of like software on top.
And then it's like, do neoclouds build this software, right?
And some of them are, right?
Like, Together, Nebius are offering inference services on top, right?
So they're saying, hey, we actually want to provide an API endpoint, not just rent GPUs.
And CoreWeave, rumored by The Information to be attempting to buy Fireworks, for the same reason, right?
Like, do you move up or do you just slide down into like, I'm making commercial real estate returns?
Or you have to go crazy, right?
Like Crusoe is like, we're going to build gigawatt data centers, right?
like, okay, there's no competition there. There's like a few companies doing that, right? So it's
very different. So either have to go like really, really big or you need to move into the
software layer or you just make commercial real estate or you go bankrupt, right? Like these are
the paths for all neoclouds, I think. I really have to believe there's a reason for being
for these companies. And my, like, simple framework for it is, I think the software layer is
really hard for people coming from this operational side to try and build, right? It's actually
a lot of very specialized software. So I think people will buy or partner into it.
But if you think about other inputs, it could be like, I'm very good at, like, finding and controlling power agreements.
Yeah.
It could be like, I build at scale.
Other people are incapable of doing so.
Yeah, yeah, which is like sort of what like.
Or like Nvidia wants me to exist, right?
I can't like think of like a lot of arguments beyond that.
And so I would agree with you, like eventually we're going to see consolidation either in this layer or, you know, commoditization by the inference providers.
But in the meantime, there is a lot of lunch to eat from
Amazon, who continues to charge, you know, really high margins, and Google and Microsoft, who continue to
charge, like, absurd margins for their compute, because they're just used to doing that in the
CPU world.
Yeah.
Right.
And so, like, their ROIC is, like, extremely high on CPU and storage.
And to assume that it can, like, translate over to GPUs is a bit of a fallacy, which is
why a lot of these companies are moving in, right?
And it's like, okay, in standard cloud, there's a lot more software that, like, people
can't just build out of nowhere. Yes, EC2 is a product that is, like, pretty simple,
but, like, block storage and all these other things are actually quite difficult to do at scale
well, like Amazon does. And that's what makes them able to charge this absurd margin
on standard compute. But now it's like, well, the cloud doesn't actually
create any software that the end user actually uses, right? It's like, sure, I need
Slurm or Kubernetes, but then I'm just using PyTorch, which is open source. And I'm using a bunch of
Nvidia software, maybe, which is open source.
I'm using a bunch of open source models.
I'm using vLLM and SGLang, which are open source.
It's like you just go down the list.
It's like there's actually no software that the cloud can provide to deserve the margins
that Amazon and Google's clouds do have today.
If you're just infrastructure provider.
I think that there is software that the cloud can provide.
Yes.
But the major clouds have not delivered that software.
Agree.
Agree.
Okay.
Same page.
Because it's really hard to do this stuff, right?
Like, there is no reason that every single startup needs to have, like, multiple
people dedicated to infra and, like, figuring out how to run models. And, like, their SLA, their reliability, is just
so low, right? Like, so many random AI SaaS providers, like, they have GPUs,
they have an open source model, it works great, except sometimes it fails and then it's down for eight
hours, and it's like, why? This shouldn't be a problem. It should be something you should just be
able to pay away. I mean, I feel like the multi-trillion dollar question that you have thought about
for perhaps longer than almost anyone else is like, what does it take to actually challenge
Nvidia, you know, asking for a friend, what would it take?
The, like, you know, simple way to put it is, like, it's a three-headed dragon, right?
Like, they're actually just really, really good at, you know, engineering
hardware and GPUs. Like, that is difficult.
They're really, really good at networking. And then, I would actually say
they're, like, okay at software, but everyone else is just terrible.
No one else is even close on software. And I guess in that argument,
you can say they're great at software, but, like, actually, like, you know,
installing Nvidia drivers is not, like, not always easy, right?
Well, there's great, and there's also just, like, well, there's like 20 years plus of work in the ecosystem, right?
Yeah.
There's today's capability and, like, usability and there's just, like, mass of, like, libraries.
Yeah, so I think Nvidia is really hard to take down because of those three reasons.
And it's like, okay, as a hardware provider, can I do the same thing as Nvidia and win?
No, they're an execution machine, and they have these three different pillars, right?
I'm sure they have a lot of margin, but, like, you have to do something different, right?
In the case of the hyperscalers, right, Google with TPUs, Amazon with Trainium, Meta with MTIA, they are making a bet of, I can actually do something pretty similar to Nvidia.
If you squint your eyes now, like, Blackwell and TPU are converging. Like, the Nvidia architecture and the TPU architecture, with, say, memory hierarchies and similar sizes of systolic arrays, it's actually not that different anymore. It's still quite different, right?
But hand-wave view, it's, like, pretty similar. And Trainium and TPUs are very similar
architecturally. The hyperscalers
are not doing anything crazy.
But that's okay, because they can just,
like, play the margin game.
That's fine.
But for a chip company to try and compete,
they must do something very unique.
Now, if you do something unique,
it's like, okay, all your energy is focused
on that one unique thing,
but on every other vector,
you're going to be worse.
Like, are you going to be there
on the latest process node as fast as Nvidia?
No. Okay, that's, like, 20, 30%, right,
on cost slash performance and power, right?
Are you going to be on the latest memory technology
as fast as Nvidia?
No, you'll be, like, a year behind.
Great.
Same penalty.
Are you going to be the same on networking?
No. Okay, you know, you just stack all these penalties up.
It's like, oh, wait, your unique thing can't just be like two to four X faster.
It has to be like way faster.
But then the problem is if you really look at it simplistically, right?
Like a flop is a flop, right?
Again, like this is super simple.
But, like, there is not a 10x you can get out of doing a standard von Neumann architecture
on efficiency of compute.
In which case, all of these things that Nvidia will engineer better than you,
because they have a team of 50 people working on, you know, just memory controllers and
HBM, and just, like, networking, or actually, like, thousands of people working on
networking, but, like, each of these things, do they just cut you by a thousand cuts?
And that's like, oh, actually, what would have been 5x faster is now only, like, 2x faster.
Plus, if I, like, misstep, I'm, like, six months behind and now the new chip is there, right?
And you're screwed.
Or supply chain, or, like, intrinsic, like, challenges with, okay, getting other people to
deploy it, or rack deployments.
Like literally in Amazon's most recent earnings, they said they're like chip architecture is not
aggressive.
Their rack architecture is very simple.
It's not that aggressive.
They're like, yeah, we have rack integration yield issues, which is why we've had,
which they like blamed their miss on AWS for their trading of not coming online fast
enough because of rack integration issues.
And when you look at the architecture, like we have an article on it.
It's like it's not like that crazy.
Like it's like what Google was doing like four or five years ago, right?
It's like, oh, wait, supply chain is hard.
And Amazon couldn't get everything in supply chain to work.
And so therefore they missed their AWS revenue by a few percent, right, which caused the whole stock market to freak out.
But it's like, there are so many things that can go wrong in hardware and the time scales are so long.
And then the last thing is that, like, model architecture is not stagnant.
If it was, Nvidia would have optimized for it.
But model architecture and hardware, right, hardware-software co-design, is the thing that matters, right?
And these two things, you can't just, like, look at one individually, right?
Like, there's a reason why Microsoft's hardware programs suck, right?
Because they don't understand models at all.
Right? Meta, their chips actually work for recommendation systems, and they're deployed
for recommendation systems, because they can do hardware-software co-design. Google is awesome
because they do hardware-software co-design. Why is AMD not catching up despite being
awesome at hardware engineering? Well, yeah, they're bad at networking, but also they suck at software
and they can't do hardware-software co-design. You know, there's, like, much deeper reasons you can
get into here, but you have to understand the hardware and the software, and they move in lockstep.
And whatever your optimization is doesn't end up working, right? So one example is all of the
first wave of AI hardware companies, right? Cerebras, Groq, SambaNova,
yeah, Graphcore. All of them made a very similar bet. Now, they were very different, right?
Some of these are architecturally pretty weird. Right, they're architecturally pretty
weird, but they made the same bet on memory versus compute, right? We're going to have more on-chip
memory and lower bandwidth off-chip, right? Because that was the trade-off they decided to
make. So all of them had way more on-chip memory than Nvidia, right? Nvidia, their on-chip
memory has not really grown much from A100 to H100 to Blackwell, right? It's up 30% in, like, three
generations, whereas these guys had, like, 10x the on-chip memory, right? All the way back in, like,
when they were competing with A100, or even the generation before. But that ended up being
a problem, because they were like, oh yeah, we can just run the model on the chip, right?
You can put all the weights on there. And then, you know, we'll be so much more
efficient. And then the models just got way too big, right? And Cerebras was like, oh, wait,
but our chip is huge. Oh, wait, but still the model's way too big to fit on it. This is, like,
very simple, right?
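The arithmetic behind that squeeze really is simple. Here is a rough sketch with approximate, illustrative capacities; check vendor specs for exact figures, and the model sizes are ballpark.

```python
# Why "fit the whole model in on-chip memory" stopped working.
# Capacities are approximate and illustrative, not exact vendor specs.
GIB = 2**30
onchip_sram = {
    "Nvidia H100": 0.05 * GIB,   # ~50 MB of L2/shared memory
    "Groq LPU": 0.23 * GIB,      # ~230 MB of SRAM per chip
    "Cerebras WSE-2": 40 * GIB,  # wafer-scale on-chip memory
}

def weight_bytes(params_billion, bytes_per_param=2):  # fp16/bf16 weights
    return params_billion * 1e9 * bytes_per_param

for model, b in [("Llama 70B", 70), ("Llama 405B", 405), ("Kimi ~1T", 1000)]:
    need = weight_bytes(b)
    print(f"{model}: ~{need / GIB:.0f} GiB of weights")
    for chip, cap in onchip_sram.items():
        note = "fits" if need <= cap else f"{need / cap:.0f}x over capacity"
        print(f"  {chip}: {note}")
```

Even the largest on-chip memory ever shipped is several times too small for a frontier model's weights alone, which is why the "model lives on the chip" bet broke.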
You know, the same thing's happening in the other direction, right? Like, some companies are like, oh, we're going to make our,
like, systolic array, your compute unit super, super, super, super large because, let's say,
Llama 70B is an 8K hidden dimension, and your batch and all that. Like, it's a pretty large matmul.
Oh, great. Okay, we'll make this chip. And then all of a sudden, all the models get
super, super sparse MoEs, right? Like, the hidden dimension of DeepSeek's models is, like, really
tiny, because they have a lot of experts, right? Instead of one large matmul, it's a bunch of small
ones you route between, right? And all of a sudden, like, if I made a really, really large hardware
unit, but I have all these small experts, how am I going to run it efficiently? You know, no one
really predicted that the models would go that way, but they ended up going that way.
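A toy way to see the shape problem: both layers below push tokens through FFN weights, but the MoE version turns one big matmul into many small routed ones, which a huge fixed compute unit struggles to keep busy. All sizes are made up for illustration.

```python
import numpy as np

# Toy shapes only; the point is the matmul geometry, not exact FLOPs.
batch, d_model = 32, 8192

# Dense FFN: one big matmul per layer, easy to tile onto a huge array.
w_dense = np.zeros((d_model, 4 * d_model), dtype=np.float16)
print("dense matmul:", (batch, d_model), "x", w_dense.shape)

# Sparse MoE: many small experts, only a few active per token after routing.
n_experts, active_per_token, d_expert = 64, 4, 1024
experts = np.zeros((n_experts, d_model, d_expert), dtype=np.float16)
tokens_per_expert = batch * active_per_token // n_experts  # assumes balance
print(f"MoE matmuls: {n_experts} of", (tokens_per_expert, d_model),
      "x", experts.shape[1:])
```

With a batch of 32, each expert sees only a couple of tokens per step, so a compute unit sized for one giant matmul sits mostly idle.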
And this is, like, actually the case with at least two of the AI hardware companies today.
I don't want to call them out, just because, you know, let's be
friendly. But, like, this is, like, clearly what's happening, right? So it's like,
you can make a decision, a hardware bet, that will actually be way better on today's
architectures, but then the architecture evolves. Or you go for the generality of, like, Nvidia's GPUs, or even,
like, TPUs and Trainium, which are more general as an architecture, but then it doesn't
beat Nvidia by that much, right?
In which case, they're just going to destroy you with their six months or a year ahead on
every technology, because they have more people working on it.
And their supply chain is better, right?
So it's kind of really tough to make the architecture bet and have the models not just
go in a different direction that no one predicted, because no one knows where models are headed, right?
Even, like, you know, you can get Greg Brockman, and he might, like, have, like, a good idea, but, like,
I'm sure he doesn't even know what models will look like in two years. So there's got to be a
level of generality, and it's hard to, like, hit that intersection properly. And so I'm very hopeful
people compete with Nvidia. I think it would be a lot more fun. There'd be a lot less margin
eaten up by the infra. There'd just be a lot more deployment of AI, potentially, if someone was able
to compete with Nvidia effectively. But Nvidia charges a lot of money because they're the best.
And, like, if there was something better, people would use it, but there isn't.
And it's just really hard to be better than them.
I mean, you have to give the first-gen AI hardware companies some credit, because they, like,
made a secular, correct decision about the workload.
But then the architectural decisions, like, ended up being hard to predict correctly, right?
Then you have the cycle of Nvidia innovation, which is really hard to compete with,
both hardware and also, as you said, supply chain issues.
Even just putting together servers is hard.
Yes.
I think the thing that you point out that, like, people oversimplified, maybe with the current generation of AI chip startups:
they're like, we're betting on transformers.
And it's a lot more complicated than that, in terms of workload at scale and continued evolution in model architecture.
And it's also not exposed to you if you're not working with the SOTA labs, like, from the beginning.
And then you can't make predictions, because nobody can make a lot of predictions right now.
It's very hard to, like, say, I'm going to be better at the workload
two years from now, in a very comfortable way, with no other changes happening.
Like, I can't make that bet right now.
Yeah, and it's, like, one of the interesting things about OpenAI's open source model.
It's, like, all their training pipelines, but on a quite boring architecture, right?
Like, it's not their crazy, like, cool architecture advantages that they have in their
closed source models, which make it better for long context, or more efficient KV cache, or all
these other things, right?
They're doing it on a standard model architecture that's publicly available.
They, like, intentionally made the decision to open source a model with a boring architecture that's pretty much open source, right, already.
Like, people have already done all these things. And they kept all the secrets internal that they wanted to keep.
And it's like, what's in there, right?
Are they even doing standard scaled dot-product attention?
Probably.
But, like, there's probably a lot of weird things they're doing which don't map directly to hardware.
Like you mentioned, right, like, transformer chip architecture, there's a lot more complicated here than just, like, oh, it's optimized for transformers,
because, like, so is an Nvidia chip, and a TPU, and their next generation is more optimized for it.
Like they take steps towards it.
They don't leap.
But as long as they're like close enough to where you are architecturally optimized for workload,
they'll beat you because of all the other reasons.
And I think your description of like how might a like a chip startup win or any vendor win by specializing.
Like that actually is really hard in this era.
Like generalization may continue to win to a degree.
And it happened with all the edge hardware companies too.
You know, we talk about the first gen AI hardware companies for Data Center.
there were a handful, but for the edge, there were like 40, 50.
And like, none of them are winning because it turns out the edge is just take a Qualcomm chip or an
Intel chip that's made for PC or smartphone and deploy it on the edge, right?
Like, that ended up being way more meaningful.
So it ends up being like the incumbents, they can take steps towards what you're going for.
And if you didn't execute perfectly or if the models didn't change the architecture away from
what you thought it would be, you end up failing.
If you had to make a bet that something becomes competitive, what is the
configuration or company type that does that?
I don't want to shill any company that I've invested in or anything like that.
And so, therefore, this is not investment advice.
No, no, no.
But, like, I would just say, like, I probably think that, like, AMD GPUs or Amazon's
Trainium will probably be more likely to be the best second choice for people, or Google TPU,
of course, but I think Google is just more interested in it for internal workloads.
I just think that those will be much more likely options to succeed than a chip hardware
startup. But, I mean, I really hope they do, because there's some really cool stuff they're doing.
If we zoom out to the macro and we think about just the scale of hardware and data center deployment
for these workloads, people talk a lot about the operational constraint on building data centers
of this size, the power constraints. I think in particular on the power side, it's very interesting
how that practically shows up. Is it generation at scale, at cost? Is it
grid issues? How should, you know, more people in technology understand this?
Yeah. So supply chain is always like fun because like people want to point at one thing is the
issue. But it always ends up being these things are so complicated. Like if one thing was
solved, you could increase production another 20 percent and then something else would be the
issue. You think it's a multi-bottleneck issue. Yeah. Or like, hey, for company A, it's actually
because their supply chain is this, this is the issue. And for company B, it's this is the issue.
But, you know, that's sort of in generalities.
But, like, I think zooming out, right, Noahpinion had a really fun blog about, like, is this AI hardware buildout going to cause a recession?
I think it's actually funny because you can flip the statement and be like, actually, the U.S. economy would not be growing that much this year.
if it weren't for all the AI buildouts. As a result of data center infrastructure buildouts, electricians' wages have soared.
As a result, power deployments and other capital investments, which have 15-30 year lifespans, are being made.
and all of this CAPEX is in turn actually growing the economy and like actually maybe the economy
wouldn't even be growing much or at all if it weren't for all of these investments.
One thing that is perhaps overlooked from the White House AI Action Plan was the view of, like,
we're going to build these AI data centers in the United States.
We're actually going to need like a lot of general investment beyond the GPUs and the power,
which are everybody's first two items into like labor, for example, right?
So if you just, you know, for simplicity's sake, say it's the size
of Manhattan, and we have to run it, and it's a new system with changing topology, and, like, a
very high degree of relatively novel hardware with failures. Yeah. And, like, lots of networking. It
kind of feels like we need to have a bunch of new capacity, like, from a labor or
robotics standpoint. In, like, '23, it was, like, very simple. It's like, Nvidia can't make enough chips. Oh,
okay, why can't Nvidia make enough chips? Oh, CoWoS, right? Chip-on-Wafer-on-Substrate packaging
technology. And it was like, oh, HBM, right? Like, those were, like, it was, like, very,
very simple in '23, '24. Like, yeah, all these tools involved in that supply chain.
It was great.
But then it, like, very quickly became much more murky, right?
Then I was like, oh, data centers are the issue.
Oh, okay, we'll just build a lot of data centers.
Oh, wait, substation equipment and transformers are the issue.
Oh, wait, power generation is the issue.
It's not like the other issues went away, right?
Like, actually, you know, CoWoS is still a bottleneck, and HBM is still a bottleneck.
Optical transceivers are still a bottleneck.
But so is power generation and data center physical real
estate, right? Like, I mentioned Meta is literally building these, like, temporary, like, tent
structures to put GPUs in because building the building takes too long. And it takes too much
labor, right? As you mentioned labor, right? That's like one way they were able to remove a part
of a constraint. They're still constrained on power and they had to delay the bring up of some
GPUs in Ohio, because AEP, the grid in Ohio, like, had some issues, right? The utility, right?
With, like, bringing on a generator or something, right? Oh, okay, great. Well, we'll buy our own
generators and put them on site. Oh, wait. Now there's an eight-year
backlog or whatever for GE's turbines.
Yeah.
Oh, okay, great.
I'm Elon.
I'm going to buy a power plant from overseas that's already existing.
You're going to move it in.
Okay, great.
Now there's like permits and people protesting against me in Memphis.
Like, you know, there's like, there's like a bazillion things that can go wrong.
And labor is a huge one.
I've literally had people in pitches be like, no, no, no.
We've already booked all the contractors.
So no one else is going to be able to build a data center in this entire area of this
magnitude besides us.
Because we took all the people.
We took all the people, they're going to have to fly them in.
But it's like, okay, fine.
Like, you can fly them in, but it's like, there's just, like, not that many electricians in America.
And as a result, we've seen the wages rise a lot for people building data center infra.
There's a group of, like, these Russian guys who used to work for Yandex, Russia's search engine, who, like, wire up data centers who now live in America and they get paid a ton.
Like, and they get paid bonuses for being faster.
And therefore, they do, like, certain drugs to be able to finish the buildouts faster.
Because they get bonuses based on how fast they build it, right?
Like, it's like, there is crazy stuff going on to alleviate bottlenecks, but it's like,
there's bottlenecks everywhere.
And it really just takes a really, really hyper-competent organization tackling each of these
things and creatively thinking about each of these things.
Because if you do it the lame old way, you're going to lose. You're going to, like,
you're going to be too slow, right?
Which is partially why Microsoft is not building Stargate for
OpenAI, right?
It's because it would have just been too slow, and they're doing it the lame old way.
You have to go crazy.
You have to go.
That's why Microsoft rents from CoreWeave a ton, right?
Because, oh, wait, we need someone who can do things faster than us.
And, oh, look, CoreWeave is doing it faster.
And now, like, you know, OpenAI is, like, going to Oracle and CoreWeave and others, right, Nscale
in Finland, and all these other companies all around the world, the Middle East, right, G42,
like, anywhere and everywhere they can get compute, because you put your eggs in many baskets
and whoever executes the best will win.
And this infrastructure is very, very hard.
Software is, like, fast turnaround times. Like, you know, it's still hard.
Software's not easy, but the cycle time is very fast for, like, try something,
fail, right? Try something else. It is not for infra, right? Like, what has xAI actually done to
deserve their prior funding rounds? They haven't released a leading edge model, right? And yet their
valuation is higher than Anthropic's today, right? At least, you know, Anthropic's raising, but whatever,
right? Like, A, it's Elon. And B, they've tackled a problem creatively and done it way faster than
anyone else, which is building Colossus, right? Like, and that's, like, commendable, because that is
part of the equation of being the best at models, right? Yeah, besides the talent. Yeah. And Elon is,
like, known for being able to get talent. So it's like, there's so much complicated stuff on the
infra that, you know, it'd be nice to say there's one thing. But yeah, like the White House
action plan lists a lot of things. But I want, like, you know, how do we concretely, like, solve
the talent issue? It's like, there's not enough people in trade school. The pay will go up, and that'll
help, but the timescales on that are too slow. Like, do we somehow import labor, right? That's how
the Middle East is building all their data centers. They're just importing labor. Or is there
something more intelligent we can do? Robotics, right? I just realized today, you told me just
now, like, a company I angel invested in, you led the round, right? Like, it's really
cool for data center automation, right? Like, there's all sorts of, like, interesting
problems on the infra layer that could be tackled and tackled creatively.
Speaking of, like, the policy and geopolitics implication here, like, what do you think
about the, you know, White House implication that America needs to, like, export the AI stack
or, like, needs to control important components of it? Like, it's better for us to be exporting
Nvidia chips than to foster a new industry, it's better for us to have, like, a globally
leading open source model, et cetera. Like what actually makes sense to you there? I want to tell
a crazy story. I was in Lebanon for a week. This is a good start. Yeah, this is completely
unrelated, but it just popped in my head. I think it'll be entertaining. I was in Lebanon. I was
with a few of my friends. So it was like two Indian people, two Chinese people and then a Lebanese
person, right? And these, like, 12-year-old girls came right up to the Chinese woman that was with us,
like, my friend. And they were like, oh my God, your skin's so beautiful. Do you like sushi? Right?
It's like fine. You're just ignorant. But what was really interesting is like when they asked
where we're from, we're like, San Francisco. They're like, do people get shot in the streets?
Because their entire worldview of politics is built from TikTok. Okay. And it's like, when you think
about the global propaganda machine that is Hollywood and it's not intentional. It's just American
media is pervasive. It built such a positive image of America. Now like with monoculture broken
and it's more social media based. A lot of the world thinks America is like people are getting shot all
time. It's, like, really bad, and it's, like, bad lives and people are working all the
time. It's unsafe. And, like, you know, like, Europe has a certain view of America.
And, like, I don't think it's accurate. Like, random Lebanese, 12-year-old had a really negative
view of some, like, they liked America. They loved Target for some reason because some
influencers posted TikToks about Target, but, like, they had negative views of America.
It's like, from a sense of, like, what is important is, like, the world should still run on
American technology, right? And they generally do still in terms of the web, although, you know,
TikTok has broken that to a large degree. But in this next age, do you want them to run on Chinese
models, which now have Chinese values, which then spread Chinese values to the world? Or do you want
them to have American models, have American values? Like you talk to Claude and it has a
worldview, right? And it's like, I don't know if you want to call that propaganda or what.
There's a worldview that you're pushing, right? And so I think it makes sense that we need
that worldview espouse. Now, how do you do that, right? The prior administration,
current administration had different viewpoints on this, right? Prior administration said,
yes, we would love for the whole world to use our chips, but it has to be run by American
companies. And so it was like Microsoft, Oracle, we're cool with you building shitloads
of capacity in Malaysia. We don't want random other companies doing it in Malaysia. So the prior
diffusion rule had a lot of technical ways in which like, you know, you could be, you can
have these like licenses and all this. It was very hard for like random small companies to build
large GPU clusters, right? But it was very easy for Microsoft and Oracle to do it in
Malaysia. Of course, the current administration tore that up, and they have their own view on
things. I mean, I think there was a lot of things wrong with the diffusion rules, right?
They were just too complicated. They pissed a lot of people off, et cetera. Now they have a different
view, which is like, what did they do in the Middle East, right? With the deal they signed.
Well, actually, most of those GPUs are being operated by American companies or rented to
American companies, right? Either or, right? Like G42 operating them, but renting them mostly
to like Open AI and such for a large part. Or Amazon and Oracle and others are operating the
GPUs themselves in the Middle East. So it's like, okay, that's effectively the same thing,
but in a very different way. That is still, I think, a view, right? Which is like, we want America
to be as high in the value stack as possible, right? If we can sell tokens or if we can sell
services, we should. Okay, but if we can't sell the service, let's at least sell them tokens.
Okay, we can't sell them tokens, at least sell them like infra, right? Whether it'd be
data centers, or renting GPUs, or just the GPUs physically. And it sort of makes
sense, right? In the value chain, like, give them the highest-value, highest-margin thing where we
capture most of the value, and, like, squeeze it down to where, like, actually, for, like, the
bottom of the stack, right, like, the tools to make chips, maybe you shouldn't sell. And so, like,
current export controls and policy dictate that, yes, you know, it's better to sell them services,
but sell them both, right? Like, give the option. Let us compete, and don't let anyone else win.
I think the challenge here is, like, how much are you enabling China by selling them
our GPUs? Like, how much fear-mongering around, like, Huawei's production capacity is
there? Like, how realistic is it versus not, because of the bottlenecks of, like, Korea sanctions
that America's made Korea put on China for memory, or Taiwan on China for chips, or, you know,
U.S. equipment on China, right? Like, there's a lot of different sanctions. Many of these are not
well-enforced slash have holes, but it's sort of, like, a very difficult argument on, like,
how much capacity of GPUs should be sold to China.
A lot of people in San Francisco are, frankly, like, don't sell China any GPUs.
But then they cut off rare earth minerals and, you know, like,
ostensibly most people think that like the deal was that you get,
you get GPUs and also EDA software because the administration banned EDA software
for a little bit, just for like a few weeks, basically,
until China was like, okay, we'll ship rare earth minerals.
You can't just ban everything because China can retaliate.
If they banned rare earth minerals and magnets and such,
car factories in America would have shut down
and the entire supply chain there would have had,
like, hundreds of thousands of people not working,
right? Like, you know, there is a push and pull here. Yeah, there is a push and pull here.
So, like, do I think China should just have
the best Nvidia GPUs? No. Like, that would suck. But, like, you know, can you give
them no GPUs? No, they're going to retaliate.
Like, there is a middle ground. And, like,
Huawei is eventually going to have
a lot of production capacity, but there's ways
to slow them down, right? Like, properly ban the equipment,
because there's a lot of loopholes there.
Properly ban the subcomponents, like, of memory and wafers, because Huawei is still getting,
you know, wafers in Taiwan from TSMC through, like, shell companies, right? Like, you know,
there's a lot of enforcement challenges, because parts of the government are not, like, funded properly,
or not competent enough, and have never been competent, right? So it's like, how do you work within this
framework? Well, like, okay, fine, we should sell them some GPUs, so that, you know, that kind of slows
them down from a Huawei standpoint, although not really, right? But also, like, gets us back
the rare earth minerals. But don't sell them too many, right? Like, how do you find that
massive gray line? That is what the administration's grappling with, in my view.
Implied in that opinion is your belief that they are going to be able to build
Nvidia-equivalent GPUs eventually, if forced. Maybe not equivalent.
Sorry, price performance competitive. There's, like, interesting things here, right? Like,
if China has a chip that consumes 3x the power? But they have 4x the power. But they have 4x
the power then. Yeah, like, who cares, right? Like, you know, obviously there's a lot of supply chain
challenges with building that. And it's like, hey, maybe it's on N-minus-two technology. It's on five-year-old
technology or four-year-old technology. Great. And it only consumes, like, 3x the power, because they
are able to do a lot of software optimization, architecture optimization, et cetera. They end up
with something that maybe costs a little bit more. But, like, what do you think about the value
of a GPU today, right? Like, you know, the GPUs dominate the cost of everything. But over time,
services will be built on top that have high margin, right? And you can go look at Anthropic or
OpenAI fundraising docs and, like, see that their API margins are good. API margins are
nothing compared to what service margins will be for people who use these APIs to build
services. And that's nothing compared to the, like, net good to the economy from how much
automation can happen and how much increased economic activity there is. So this is the argument
of, like, okay, even if their chips cost 3x as much, do you subsidize that? They can
subsidize that rationally, because the end goal is, like, oh, wait, actually, we can
deploy a lot of Chinese AI and make money and gather data.
Because people are sending us their like prompts and all their databases and all this stuff
to our models controlled by our companies, et cetera, right?
Like plus we're just making money off of it.
And they've done this in other industries, right?
They rationally subsidized like solar and now no one can even compete on solar or EV.
And it's like very close to no one can compete on EVs even, right?
Besides like Tesla really.
And even Tesla is adopting a lot of like Chinese supply chain, right?
It is rational to say you want America to have more AI prowess around the world,
you know, so that a random child in Lebanon doesn't think America is, like, bad, or they're using
American products more than Chinese products. But, like, how you get there is very difficult.
And it's a hard thread to weed. Thread. You got it. I don't croquet, you know.
Oh, my God. Crochet.
Crochet. You clearly don't.
Croquet is the game.
I want to ask you, like, a wild card question to finish out. We're trying to get Mark to do
the podcast.
Zuck.
Yes. You can ask him any question.
What would you ask? Mark, you've got to do the podcast.
I thought like the, did you read the page they put up?
I thought that was very interesting that they were like, we want AI to be your companion.
So my question to him is not, like, around his infra stuff, because I feel like I know most everything.
Like you can figure that stuff out from supply chain and like satellites and all this stuff.
But like the interesting thing I'm curious about is philosophically.
What exactly like does the world look like if everyone is talking to AIs more than other people
or if they're interacting socially with the AIs more than other people?
Do we lose our human element?
do we lose our human connection?
It's not the same thing as, hey, I'm posting on social media and we're interacting
with our social media posts, which that already breaks the brain of a lot of people.
What happens when it's, like, always on your face? Like, Meta, you know, his worldview is, like,
Meta Reality Labs makes these, like, devices that you wear, and they're always on, they have all this
AI on them, and you're talking to the AI companion all the time.
How does that change the human psyche?
Like, this human-machine evolution, what are the negative ramifications of it?
What are the positive ramifications? How are you going to make sure that there's more
positive ramifications from this than, like, you know, the sloppification and, like, complete brain rot
of, like, our youth, right? Which, I, like, love my brain rot, right? Like, it's like, okay. Obviously
the coding wars continue to be, like, very central. And we were talking about Cognition's relevance
and, like, how to think about the strategy here. But I do think it's really funny what flipped
your bit on Cognition. Can you tell the story?
I thought Cognition, NGMI, right?
Like, you know, like OpenAI, Anthropic, xAI, etc.
They're just going to make better code models.
Like, you know, they just have way more resources.
General models will win.
You know, I hadn't really met too many people.
It was just, like, a pure vibes-based thing.
And I, you know, I'd used a little bit of Devin, but I was like, whatever, right?
Like, it was like, Claude Code seems better.
And we use that internally.
But, like, I went to Coatue's East Meets West event.
It's an awesome event where there's people from Asia.
Like, there was like, you know, all these like CFOs and CEOs of like major Chinese
companies.
East Coast of U.S., all these finance bros, also West Coast, like a lot of tech people, right?
So you and I were both there.
There were people from governments and major companies.
And Scott was there.
I spoke with them, like, very briefly.
But then what was interesting is like, it's like, you know, they have a poker night one night.
And everyone gets blasted.
The, like, leader of Coatue is, like, very good at poker.
These hedge fund guys are just good at poker generally.
And I love, like, poker as well.
Yeah, it was a big poker culture in the Bay.
I was playing.
I'm okay.
Right.
But I see, I look over
at the super-high-stakes table, and Scott's just dominating everyone, right?
I'm like, what is going on?
Like, how are you, like, taking chips from, like,
the CEO of a major Chinese company?
I don't want to name people's names because I think there's like some terms around
them like naming who's there.
But like, you know, it's like, you're like winning like a lot of chips
from a lot of big people.
And it's like all of a sudden my vibes were like, I don't know,
maybe like maybe he can.
Maybe he can take from the lion, you know?
So I was like very excited about that.
You know, I thought it was funny.
I still have zero, like, I have not done
much due diligence on their code product.
Like, you know, nor have I on, like,
Claude Code, besides the fact that we use it.
But it's like, you know, cool.
Well, I think Windsurf acquisition part 2 is, like, a pretty good hand to play here.
And, you know, as somebody who invests a lot at a, you know, violently competitive application level.
Yeah.
Poker game is live, man.
Everybody there, you just invest in live players.
Exactly.
And it's, I just loved that, you know, that was how he, uh, he dominated everyone.
And it's, like, such a stupid reason, because I pride myself on being analytical and, like, data-driven.
And it's like, you know, vibes.
Correct. For any entrepreneurs listening,
I think, like, you know, Dylan might angel invest, or we might back you fully, if you win the Cognition poker game.
And we'll host it at Conviction.
Okay, if we got it.
Good.
Awesome.
Yeah, thank you.
Find us on Twitter at NoPriorsPod.
Subscribe to our YouTube channel if you want to see our faces.
Follow the show on Apple Podcasts, Spotify, or wherever you listen.
That way you get a new episode every week.
And sign up for emails or find transcripts for every episode at no-priors.com.