Dwarkesh Podcast - @Asianometry & Dylan Patel — How the semiconductor industry actually works
Episode Date: October 2, 2024

A bonanza on the semiconductor industry and hardware scaling to AGI by the end of the decade.

Dylan Patel runs SemiAnalysis, the leading publication and research firm on AI hardware. Jon Y runs Asianometry, the world's best YouTube channel on semiconductors and business history.

* What Xi would do if he became scaling-pilled
* $1T+ in datacenter buildout by end of decade

Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform. Read the full transcript here. Follow me on Twitter for updates on future episodes.

Sponsors:

* Jane Street is looking to hire their next generation of leaders. Their deep learning team is looking for FPGA programmers, CUDA programmers, and ML researchers. To learn more about their full-time roles, internship, tech podcast, and upcoming Kaggle competition, go here.

* This episode is brought to you by Stripe, financial infrastructure for the internet. Millions of companies from Anthropic to Amazon use Stripe to accept payments, automate financial processes and grow their revenue.

If you're interested in advertising on the podcast, check out this page.

Timestamps

00:00:00 – Xi's path to AGI
00:04:20 – Liang Mong Song
00:08:25 – How semiconductors get better
00:11:16 – China can centralize compute
00:18:50 – Export controls & sanctions
00:32:51 – Huawei's intense culture
00:38:51 – Why the semiconductor industry is so stratified
00:40:58 – N2 should not exist
00:45:53 – Taiwan invasion hypothetical
00:49:21 – Mind-boggling complexity of semiconductors
00:59:13 – Chip architecture design
01:04:36 – Architectures lead to different AI models? China vs. US
01:10:12 – Being head of compute at an AI lab
01:16:24 – Scaling costs and power demand
01:37:05 – Are we financing an AI bubble?
01:50:20 – Starting Asianometry and SemiAnalysis
02:06:10 – Opportunities in the semiconductor stack

Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe
Transcript
Today I'm chatting with Dylan Patel, who runs SemiAnalysis, and Jon, who runs the Asianometry YouTube channel.
Does he have a last name?
No, I do not.
No, I'm just kidding.
Jon Y.
Why?
That's right.
Is it?
I'm Jon Y.
Wait, why is it only one letter?
Because Y is the best letter.
Why is your face covered?
Why not?
No, seriously, why is it covered?
Because I'm afraid of watching myself get older and fatter over the years.
But so seriously, it's like anonymity, right?
Anonymity.
Okay.
Yeah.
By the way, so you know what Dylan's middle name is?
Actually, no.
I don't know.
He told me.
What's my father's name?
I'm not going to say, but I remember.
You could say it.
It's fine.
Sanjay?
Yes.
What's his middle name?
Sanjay?
That's right.
Wow.
So I'm Dwarkesh Sanjay Patel.
He's Dylan Sanjay Patel.
It's like literally my white name.
Wow.
It's unfortunate my parents decided between my older brother and me
to give me a white name.
It could have been Dwarkesh Zend...
Like, you know how amazing it would have been if we had the same name?
Like, butterfly effect and all.
It probably all wouldn't have turned out the same way.
But, like...
Maybe it would have been even closer.
We would have met each other sooner, you know?
Yeah, yeah, yeah.
Yeah, yeah.
Yeah.
All right.
First question.
If you were Xi Jinping and you're scaling-pilled, what is it that you do?
Don't answer that question, Jon.
That's bad for AI safety.
I would basically be contacting every foreigner.
I would be contacting every Chinese national with family back home and saying,
I want information.
I want to know your recipes.
I want to know, I want contacts.
What kind? For, like, AI lab foreigners or hardware foreigners?
Honeypotting OpenAI.
I would basically, like... this is totally off topic.
But, like, this is off the reservation.
But like I was doing a video about Yugoslavia's nuclear program.
What?
Their nuclear weapons program started from absolutely nothing.
One guy from Paris.
This one guy from Paris, he showed up, and who knows what he did there,
but he knew a little bit about making atomic weapons.
And he was like, okay, well, I need help.
And then the state secret police was like, we will get you everything.
We will get you everything.
I will get you everything.
And for like a span of four years, they basically, they drew up a list.
What do you need?
What do you want?
What is it going to be for?
And they just state police just got everything.
If I was running a country and I needed to catch up on that, that's the sort of thing that I would be doing.
So, okay, let's talk about the espionage.
So what is the most valuable piece of,
if you could have this blueprint,
like this one megabyte of information?
Do you want it from TSMC?
Do you want it from Nvidia?
Do you want it from OpenAI?
What is the first thing you would try to steal?
I mean, I guess you have to stack every layer, right?
And I think like the beautiful thing about AI
is because it's growing so freaking fast,
every layer is being stressed to some incredible degree.
Of course, China has been hacking ASML for over five years,
and, you know,
ASML is kind of like,
oh, it's fine.
The Dutch government's really pissed off,
but it's fine, right?
I think it's,
they already have those files, right?
In my view,
it's just a,
it's a very difficult thing to build, right?
I think,
I think the same applies
for, like,
fab recipes, right?
They can poach Taiwanese nationals
very, like,
not that difficult,
right,
because TSMC employees
do not make
absurd amounts of money.
You can just poach them
and give them a much better life,
and they have, right?
A lot of SMIC employees
are ex-TSMC,
you know,
Taiwanese nationals.
Right, a lot of the really good ones, the high-up ones especially. And then you go up, like, the next layers of the stack, and it's like, I think,
I think, yeah, of course there's tons of model secrets. But then, like, you know, how many of those model secrets do you not already have and just haven't deployed or implemented, you know, organized? That's the one thing I would say: China just hasn't... they clearly are still not scale-pilled, in my view. So these people are... I don't know if you could, like, hire them. It would probably be worth a lot
to you, right? Because you're building a fab that's worth tens of billions of dollars and this talent
is like, they know a lot of shit. How often do they get poached? Do they get poached by like
foreign adversaries or do they just get poached by other companies within the same industry but
in the same country? And then yeah, well like why doesn't that like sort of drive up their
wages? I think it's because it's very compartmentalized. And I think like back in the 2000s
prior to, before SMIC got big, it was actually much more kind of open, more lax.
I think after that, after the Liang Mong Song thing and after all the Samsung issues and after SMIC's rise, you literally saw...
I think you should tell that story, actually.
The TSMC guy that went to Samsung and SMIC and all that, I think you should tell that story.
There are two stories.
There's a guy who ran a semiconductor company in Taiwan called Worldwide Semiconductor, and this guy, Richard Chang, was very religious.
I mean, all the TSMC people are pretty religious.
But, like, he in particular, was very fervent, and he wanted to bring religion to China.
So after he sold his company to TSMC, a huge coup for TSMC, he worked there for about eight or nine months, and he was like, all right, I'll go to China.
Because back then, the relations between China and Taiwan were quite different.
And so he goes over there, Shanghai says, we'll give you a bunch of money.
And then Richard Chang basically recruits, like, a whole bunch.
It's like a conga line of, like, Taiwanese talent.
Just like, they get on the plane and they fly over.
And generally, that's where a lot of the acceleration points within China's
semiconductor industry come from: talent flowing from Taiwan.
And then the second thing was, like, Liang Mong Song.
Liang Mong Song was a, is a nut.
And I've... I've not met him.
I've met people who work with him.
And they say he is a nut.
He is probably on the spectrum.
And he does not care about people.
He does not care about business.
He does not care about anything.
He wants to take it to the limit.
That's the only thing he cares about.
He worked at TSMC, literal genius, 300 patents or whatever, 285, works all the way up to
like the top, top tier.
And then one day,
he decides, he loses out on some sort of power game
within TSM and gets demoted.
And he was like head of R&D, right, or something?
He was like one of the top R&D. He was like second or third place.
And it was for the head of R&D position,
basically. More of the head of R&D position, he's like,
I can't deal with this. And he goes to Samsung
and he steals a whole bunch of talent from TSMC.
Literally, again, conga line. He goes and just emails
people, says we will pay.
At some point, some of these people,
were getting paid more than the Samsung chairman, which is not really comparable.
But, like, you know what I mean?
So they're going...
Isn't the Samsung chairman usually, like, part of the family that owns Samsung?
Correct.
Okay, so it's, like, kind of relevant.
So he goes over there, and he's like, we will make Samsung
into this monster.
We forget everything.
Forget all of the stuff you've been trying to do, like, incrementally.
Toss that out.
We are going to the leading edge, and that is it.
They go to the leading edge.
The guys, like...
They win Apple's business.
They win Apple's business.
They won it back from TSMC.
Or did they win it back from TSMC?
They had a portion of the...
They had a big portion of it.
And then TSMC, Morris Chang, who at this time was running the company.
And he's like, I'm not letting this happen.
Because that guy is toxic to work for as well, but also goddamn brilliant.
And also, like, very good at motivating people.
He's like, we will work literally day and night, sets up what is called the Nightingale Army,
where you have...
They split a bunch of people.
And they say, you are working R&D night shift.
There is no rest at the TSMC fab.
You will go in, as you go in, there'll be a day shift going out.
They called it the, it's like you're burning your liver.
Because in Taiwan, they say, like, if you get old, like, as you work, you're sacrificing
your liver.
They call it the liver buster.
So they basically did this nightingale army for like a year or two years.
They finished FinFET.
they basically just blow away Samsung.
And at the same time, they sue Liang Mong Song directly for stealing trade secrets.
Samsung basically separates from Liang Mong Song, and Liang Mong Song goes to SMIC.
And so Samsung, like, at one point was better than TSMC.
And then, yeah, he goes to SMIC, and SMIC is now better.
Or not better, but they caught up rapidly as well after.
Very rapid.
That guy's a genius.
That guy's a genius.
I mean, I don't even know what to say about him.
He's like 78, and he's like beyond brilliant, does not care about people.
Like, yeah, what does the research to make the next process node look like?
Is it just a matter of like 100 researchers go in?
They do like the next N plus one.
Then the next morning, the next 100 researchers go in.
It's experiments.
They have a recipe and what they do.
Every recipe, a TSMC recipe, is the culmination of long, long years of research, right?
It's highly secret, and the idea is that you go look at one particular part of it and you say, run an experiment. Is it better or not? Kind of a thing like that.
It's basically a multivariable problem. You're sequentially processing the whole thing through every single tool, and you turn knobs up and down on every single tool. You can increase the pressure on this one specific deposition tool.
And what are you trying to measure? Is it, like, does it increase the yield? It's yield, it's performance, it's power.
It's not just one metric, it's not just better or worse, right?
It's a multivariable search space.
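A toy sketch of the kind of multivariable knob-turning being described here, with entirely made-up knob ranges and a hypothetical scoring function; real recipes involve hundreds of coupled parameters and wafer-level metrology, not a three-line model:

```python
from itertools import product

def evaluate(pressure, temperature):
    """Hypothetical stand-in for running a wafer split and measuring results."""
    yield_pct = 90 - abs(pressure - 2.0) * 5 - abs(temperature - 400) * 0.02
    performance = 1.0 + 0.05 * pressure   # arbitrary: higher pressure, faster device
    power = 1.0 + 0.001 * temperature     # arbitrary: hotter process, leakier chip
    return yield_pct, performance, power

best = None
for p, t in product([1.5, 2.0, 2.5], [380, 400, 420]):
    y, perf, pwr = evaluate(p, t)
    score = y * perf / pwr                # one arbitrary way to trade off the objectives
    if best is None or score > best[0]:
        best = (score, p, t, y)

print(f"best knobs: pressure={best[1]}, temperature={best[2]}, yield={best[3]:.1f}%")
```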
And what do these people know such that they can do this?
Is it that they understand the chemistry and physics?
So it's a lot of intuition, but yeah, it's PhDs in chemistry, PhDs in physics, PhDs in
EE, brilliant genius people.
And they don't even know about, like, the end chip a lot of times.
It's like, oh, I am an etch engineer and all I focus on is how hydrogen fluoride etches this, right?
And that's all I know.
And like, if I do it at different pressures, if I do it at different temperatures, if I do it with a slightly different recipe of chemicals, it changes everything.
I remember, like, asking someone, how did America lose the ability to do this sort of thing, like etching with hydrofluoric acid or whatever?
He told me basically it's very apprentice... master-apprentice.
Like, you know in Star Wars, Sith, there's only one, right?
Master apprentice, master apprentice.
It used to be that there is a master, there's an apprentice, and they pass on this secret knowledge.
This guy knows nothing but etch, nothing but etch.
Over time, the apprentices stopped coming.
And then in the end, the apprentices have moved to Taiwan.
And that's the same way it's still run.
Like you have NTU and NTHU, National Tsing Hua University.
There's a bunch of masters.
They teach apprentices and they just pass this secret knowledge down.
Who are the most AGI-pilled people in the supply chain?
Is there anybody
I gotta have my phone call
with Colette right now.
Okay go for it
Sorry sorry
Can we mention on the podcast that Nvidia
has got to call
to update him on the
earnings call?
Well it's not this
Not exactly
But
Go for it
Go for it
Yeah
So Dylan is back from his call
with Jensen Huang.
It was not with Jensen.
Jesus.
What did they tell you?
Huh?
What did they tell you?
What, next year's earnings?
No.
It's just color around,
like, Hopper, Blackwell,
and, like, margins.
It's, like, quite
boring stuff.
I'm sure
For most people, I think it's interesting, though.
I guess we could start talking about Nvidia.
You know what, before we do?
No, I think we should go back to China.
There's like a lot of points there.
All right, we covered the chips themselves.
How do they get, like, the 10 gigawatt data center off the ground?
What else do they need?
So I think there is a true like question of how decentralized do you go versus centralized, right?
And if you look in the U.S., right, as far as, like, labs and such: you know, OpenAI, xAI, you know, Anthropic,
and then Microsoft having their own effort, Anthropic having their own efforts despite having their
partner, and then Meta.
And you go down the list, it's like there's quite a decentralization.
And then all the startups, like interesting startups that are out there doing stuff,
there's quite a decentralization of efforts.
Today in China, it is still quite decentralized, right?
It's not like Alibaba, Baidu, you are the champions, right?
You have, like, DeepSeek, like, who the hell are you?
Does the government even support you, like, doing amazing stuff?
Right, if you are
Xi Jinping and scale-pilled,
you must now centralize
the compute resources, right? Because
you have sanctions on how many
Nvidia GPUs you can get in. Now,
there's still north of a million a year, right,
even post-October-last-year sanctions.
They still have more than a million
H20s and other
Hopper GPUs getting in through, you know, other means,
but legally, like, the H20s.
And then on top of that, you have
your domestic chips,
right, but that's less than a million chips.
So then when you look at it, it's like, oh, well, we're still talking about a million chips.
The scale of data centers people are training on today slash over the next six months is 100,000 GPUs.
Yeah, right?
Open AI, XAI, right?
These are like quite well documented and others.
But in China, they have no individual system of that scale yet, right?
So then the question is like, how do we get there?
You know, no company has had the centralization push to have a cluster that large and train on it yet, at least publicly.
well known, and the best models seem to be from a company that has got like 10,000 GPUs,
right, or 16,000 GPUs, right? So it's not quite as centralized as the U.S. companies are,
and the U.S. companies are quite decentralized. If you're Xi Jinping and you're scale-pilled,
do you just say, X, Y, Z company is now in charge? And every GPU goes to one place. And then you
don't have the same issues of the U.S. Right? In the U.S., we have a big problem with, like,
being able to build big enough data centers, being able to build substations and transformers
and all this that are large enough in a dense area. China has no issue with that at all because
their supply chain adds like as much power as like half of Europe every year, right? Like,
or some absurd statistics, right? Um, so they're building transformer substations. They're building new
power plants constantly. Um, so they have no problem with like getting power density and you go
look at like Bitcoin mining, right? Um, around the three gorges dam at one point at least,
there was like 10 gigawatts of like Bitcoin mining estimated, right?
which, you know, we're talking about gigawatt data centers coming over, you know, '26, '27 in the U.S., right? Sort of, this is an absurd scale relatively, right? We don't have gigawatt data centers ready, but China could just build it in six months, I think, around the Three Gorges Dam or many other places, right? Because they have the ability to do the substations. They have the power generation capabilities. Everything can be done like a flip of a switch, but they haven't done it yet. And then they can centralize
the chips like crazy. Right now, oh, oh, million chips that Nvidia's shipping in Q3 and Q4,
the H20, let's just put them all in this one data center. They just haven't had that centralization
effort. Well, you can argue that like the more you centralize it, the more you start building
this monstrous thing within the industry, you start getting attention to it. And then suddenly,
you know, lo and behold, you have a little worm in there suddenly, where you're
doing your big training run. Oh, this GPU. Off. Oh, this GPU. Oh, no.
Oh, no.
Oh, no.
I don't know if it's like that easy to hack.
Is that a Chinese accent, by the way?
Just to be clear, Jon is East Asian.
He's Chinese.
I am of East Asian descent.
Half Taiwanese, half Chinese.
Right, that is right.
But, like, I think, I don't know if that's, like, as simple as that to, like,
because training systems are like fire, like, they're water, is it water gated, firewalled?
What is it called?
Not firewalled.
I don't know.
There's a word for that where they're not like.
Air-gapped.
Air-gapped.
I think they are, too.
You're going through, like... they're going through, like,
all the, like, four elements of the Avatar.
They're not dirt fire.
Like dirt protected.
Water. Fire.
If you're Xi Jinping and you're scale-pilled,
you kind of, like, unite the four forces.
Fuck the airbenders.
Fuck the firebenders, you know.
We got the avatar, right?
Like you have to build the avatar.
Okay.
Um, I think, I think that's possible.
Um, the question is like, does that slow down your research?
Do you, like, crush, like, cracked people like DeepSeek, uh, who are clearly, like, not being,
you know, influenced by the government,
and put some, like, you know, idiot bureaucrat at the top? Suddenly he's all thinking
about like, you know, all these politics and he's trying to deal with all these different
things. Suddenly, you have a single point of failure. And that's a, that's bad. But I mean,
on the flip side, right? Like, there is like obviously immense gains from being centralized
because of the scaling laws, right? And then the flip side is compute efficiency is obviously
going to be hurt because you can't experiment and, like, have different
people lead and try their efforts as much if you're more centralized.
So it's like there is a balancing act there.
The fact that they can centralize, I didn't think about this, but that is actually like,
because, you know, even if America as a whole is getting millions of GPUs a year,
the fact that any one company is only getting hundreds of thousands or less means that there's
no one player who can do a single training run as big in America as if, like, China as a whole
decides to do one together.
The 10 gigawatts you mentioned near the Three Gorges Dam: is it, like, literally... how widespread is it? Like a state? Is it, like, one wire? Like, how...
I think, like, between not just the dam itself but also all of the coal (there's some nuclear reactors there, I believe, as well),
between all of that and, like, renewables like solar and wind, between all of that in that region there is an absurd amount of concentrated power
that could be built. I'm not saying it's, like, one button, but it's like,
hey, within X mile radius, right?
Yeah.
It's more of like the correct way to frame it.
And that's how the labs are also framing it, right?
Like, I think in the U.S.
If they started right now, like, how long does it take to build the biggest,
the biggest AI data center in the world?
You know, actually, I think, I think, um, the other thing is like,
could we notice it?
I don't think so because the amount of like factories that are being spun up,
the amount of other construction, manufacturing, etc.,
that's being built,
a gigawatt is actually like a drop in the bucket, right?
Like a gigawatt is not a lot of power.
10 gigawatts is not an absurd amount of power, right?
It's okay, yes, it's like hundreds of thousands of homes, right?
Yeah, millions of people, but it's like you got 1.4 billion people.
You got like most of the world's like extremely energy intensive, like refining and like, you know,
rare earth refining and all these manufacturing industries are here.
It would be very easy to hide it.
It would be very easy to just, like, shut down... I think the largest aluminum mill in the world is there.
It's, like, north of five gigawatts alone.
It's like, oh, could we tell if they stopped making aluminum there
and instead started, like, making AI there?
Like, I don't know if we could tell, right?
Because they could also just easily spawn like 10 other aluminum mills,
make up for the production and be fine, right?
So like there's many ways for them to hide compute as well.
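A rough sanity check on the gigawatts-versus-homes-versus-aluminum-mill comparison above; the per-home figure is an assumed average draw, not a sourced statistic:

```python
AVG_HOME_KW = 1.2  # assumed average continuous draw per US household

def homes_powered(gigawatts: float) -> int:
    return int(gigawatts * 1e6 / AVG_HOME_KW)  # GW -> kW, divided by per-home draw

print(f"1 GW  ~ {homes_powered(1):,} homes")    # ~833,000
print(f"5 GW  ~ {homes_powered(5):,} homes")    # the quoted ~5 GW aluminum mill
print(f"10 GW ~ {homes_powered(10):,} homes")   # ~8.3 million
```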
To the extent that you could just take out a 5 gigawatt aluminum refining center
and like build a giant data center there,
then I guess the way to control Chinese AI has to be the chips
because like everything else,
they... so, like, how do you, like...
just, like, walk me through: how many chips do they have now,
how many will they have in the future?
What will they, like, how many,
is that in comparison to U.S. and the rest of the world?
Yeah, so in the world, I mean,
the world we live in is they are not restricted at all
in like the physical infrastructure side of things
in terms of power, data centers, et cetera,
because their supply chain is built for that, right?
And it's pretty easy to pivot that.
Whereas the U.S. adds so little power each year
and Europe loses power every year.
The Western sort of,
industry for power is non-existent in comparison, right? But on the flip side is,
quote-unquote Western, including Taiwan, chip manufacturing is way, way, way, way, way larger
than China's, especially on leading edge where China theoretically has, you know, depending on the
way you look at it, either zero or a very small percentage share, right? And so there, you have,
you have equipment, wafer manufacturing, and then you have advanced packaging capacity, right? And where
the U.S. can control China, right? So advanced packaging capacity is kind of shot, because the vast
majority, the largest advanced packaging company in the world was Hong Kong headquartered. They just moved to
Singapore, but like, that's effectively like, you know, in a realm where the U.S. can't sanction it, right?
A majority of these other companies are in similar places, right? So advanced packaging capacity is
very hard, right? Advanced packaging is useful for stacking memory, stacking chips on CoWoS, right?
Things like that. And then the step down is wafer fabrication. There is a, there is a
immense capability to restrict China there. And despite the U.S. making some sanctions, China in the most
recent quarters was like 48% of ASML's revenue, right? So, you know, and like 45% of like applied
materials and you just go down the list. So it's like, obviously it's not being controlled that
effectively. But it could be on the equipment side of things. The chip side of things is actually
being controlled quite effectively, I think, right? Like, yes, there is like shipping GPUs through
Singapore and Malaysia and other countries in Asia to China. But, you know, the amount you can smuggle is
quite small. And then the sanctions have limited the chip performance to a point where it's like,
you know, this is actually kind of fair. But there is a problem with how everything is restricted,
right? Because you want to be able to restrict China from building their own domestic chip manufacturing
industry that is better than what we ship them. You want to prevent them from having chips that are
better than what we have.
And then you want to prevent them from having AI's better.
The ultimate goal being, you know, and if you read the restrictions, it's like very clear.
It's about AI.
Yeah.
Even in 2022, which is amazing.
Like, at least the Commerce Department was kind of AI pill.
It was like, you want to restrict them to having AIs worse than ours, right?
So starting on the right end, it's like, okay, well, if you want to restrict them from
having better AIs than us, you have to restrict chips.
Okay.
If you want to restrict them from having chips, you have to let them have at least some level of chip
from the West that is good, better,
than what they can build internally.
But currently, the restrictions are flipped the other way, right?
They can build better chips in China than we restrict them in terms of chips that Nvidia or
AMD or an Intel can sell to China.
And so there's sort of a problem there in terms of the equipment that is shipped can be used
to build chips that are better than what the Western companies can actually ship them.
Jon, Dylan seems to think the export controls are kind of a failure.
Do you agree with him?
That is a very interesting question.
because I think it's like...
Why, thank you.
Like, what do you...
Dwarkesh, you're so good.
Yeah, Dwarkesh, you're the best.
I think failure is a tough word to say
because I think it's like,
what are we trying to achieve, right?
Like, they're talking about AI, right?
Yeah.
When you do sanctions like that,
you need such a deep knowledge of the technologies.
You know, just taking lithography, right?
If your goal is to restrict China
from building chips
and you just, like, boil it down,
to, like, hey, lithography is 30% of making a chip, or 25%.
Cool, let's sanction lithography.
Okay, where do we draw the line?
Okay, let me ask, let me ask, let me figure out what, where the line is.
And if I'm a bureaucrat, if I'm a lawyer at the Commerce Department or what have you,
well, obviously I'm going to go talk to ASML.
And ASML is going to tell me this is the line because they know like, hey,
well, you know, this, this is, you know, there's like some blending over.
There's like, they're looking at like what's going to cost us the most money, right?
And then they constantly say, like, if you restrict us, then China will have their own
industry, right? And the way I like to look at it is, like, chip manufacturing is like, like,
like, 3D chess or like, you know, a massive jigsaw puzzle in that if you take away one piece,
China can be like, oh yeah, that's the piece, let's put it in, right? And currently these export
restrictions year by year by year, they keep updating them ever since like 2018 or so 19, right,
when Trump started and now Biden's, you know, accelerated them. They haven't just,
like, taken a bat to the table and, like, broken it, right? Like, it's...
It's like, let's take one jigsaw piece out, walk away.
Oh, shit.
Let's take two more out.
Oh, shit, right?
Like, you know, instead, you either have to go kind of, like, full bat
to the freaking, like, table slash wall, or chill out, right?
Like, and like, you know, let them, let them do whatever they want.
Because the alternative is everything is focused on this thing and they make that.
And then now when you take out another two pieces, like, well, I have my domestic industry
for this.
I can also now make a domestic industry for these.
Like, you go deeper into the tech tree or what have you.
It's art, right?
In a sense that there are technologies out there that can compensate.
Like, the belief that lithography is the linchpin within the system... it's not exactly true.
Right.
At some point, if you keep pulling a thread, other things will start developing to kind of close that loop.
And, like, that's why I say it's an art, right?
I don't think you can stop the Chinese semiconductor industry from progressing.
I think that's basically impossible.
So the question is the Chinese government believes in the primacy of semiconductor manufacturing.
They've believed it for a long time, but now they really believe it, right?
To some extent, the sanctions have made China believe in the importance of the semiconductor industry more than anything else.
So from an AI perspective, what's the point of export controls then?
Because even if they're going to be able to get these, like, if you were concerned about AI,
and they're going to be able to build...
Well, they're not centralized, though, right?
So that's the big question is, are they centralized?
And then also, you know, there's the belief...
I don't really...
I'm not sure if I really believe it, but like, you know, prior podcast,
there have been people who talked about nationalization, right?
In which case, okay, now you're talking about...
Who are you referring to ambiguously?
Well, I think there's a couple...
My opponent.
No, I love the opponent.
No, but I think there have been a couple where people have talked about nationalization, right?
But, like, if you have, you know, nationalization,
then all of a sudden you aggregate all the flops.
It's like, no, there's no fucking way, right?
China can be centralized enough to compete with each individual U.S. lab.
They could have just as many flops in 25 and 26 if they decided they were scale-pilled, right?
Just from foreign chips, for an individual model.
Like, in 2026, they can train a 1e27... like, they can release a 1e27 model by 2026.
Yeah, and then a '28 model, you know, a 1e28 model in the works, right?
Like, they totally could, just with foreign chip supply, right?
Just a question of centralization.
Then the question is, like, do you have as much innovation in?
compute efficiency wins or what have you get developed when you centralize? Or does, like, Anthropic
and OpenAI and xAI and Google all develop things, and then, like, secrets kind of shift a little bit
in between each other and all that, and, you know, you end up with that being a better outcome in the long
term versus, like, the nationalization of the U.S., right, if that's possible, and, you know, what
happens there. But China could absolutely have it in '26, '27 if they just have the desire to, and that's just
from foreign chips, right?
And then domestic chips are the other question, right?
600,000 of the Ascend 910B, which is roughly like 400 teraflops or so.
You know, so if they put them all in one cluster, they could have a bigger model than any of the labs next year.
Right?
I have no clue where all the Ascend 910Bs are going, right?
But I mean, well, there's, like, rumors about, like, some... they are being divvied up between, like, the majors: Alibaba, ByteDance, Baidu, etc.
And next year, more than a million.
And it's possible that they actually do have, you know, 1e30 before the U.S.
because data center is not as big of an issue.
10 gigawatt data center is going to be, I don't think anyone is even trying to build that
today in the U.S.
Like, even out to 27, 28, really, they're focusing on, like, linking many data centers
together.
So there's a possibility that, like, hey, come 2028, 2029, China can have more flops
delivered to a single model, even ignoring sort of, even once the centralization question
is solved, right?
because that's clearly not happening today for either party.
And I would bet if AI is like as important as, you know, you and I believe that they will
centralize sooner than the West does.
Yeah. So there is a possibility, right?
Yeah.
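A back-of-envelope version of the claim above, using the chip count and per-chip throughput quoted in the conversation; the utilization and training duration are assumptions, so treat the result as an order-of-magnitude sketch:

```python
num_chips = 600_000        # Ascend 910B count quoted above
flops_per_chip = 400e12    # ~400 TFLOP/s per chip, as quoted
mfu = 0.35                 # assumed model FLOPs utilization
days = 100                 # assumed length of the training run

total_flops = num_chips * flops_per_chip * mfu * days * 86_400
print(f"~{total_flops:.1e} FLOPs")  # ~7e26, i.e. on the order of 1e27
```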
It seems like a big question then is how much could SMIC increase production, like, increase
the amount of wafers. Like, how many more wafers could they make, and how many of those
wafers could be dedicated to the 910?
Because I assume there's other things they want to do with these semiconductors.
Yeah.
So there's, like, two
parts there, right?
Like, the way the U.S. has sanctioned SMIC is really, like, stupid, kind of,
in that they've, like, sanctioned a specific fab rather than the entire company.
And so, therefore, right, Smic is still buying a ton of tools that can be used for their
seven nanometer and their, you know, call it 5.5 nanometer process or six nanometer process
for the 910C, which releases later this year, right?
They can build as much of that as long as it's not in Shanghai, right?
And Shanghai has anywhere from 45 to 50 high-end immersion lithography tools is what's, like, believed by intelligence as well as like many other folks.
That roughly gives them as much as 60,000 wafers a month of 7 nanometer, but they also make their 14 nanometer in that fab, right?
And so the belief is that they actually only have about like 25 to 35,000 of 7 nanometer capacity.
Wafers a month, right?
Doing the math, right,
on the chip die size and all these things?
Because Huawei also uses chiplets and stuff,
so they can get away with using less leading edge wafers,
but then their yields are bad.
You can roughly say something like 50 to 80
good chips per wafer
with their bad yield, right?
Why do they have bad yield?
Because it's hard, right?
You know, you're...
Even if it was like, you know, everyone knows the number, right?
It's like a thousand steps,
even if you're 99% for each,
like 98 or 99%,
like, in the end
you'll still get, like, a 40%,
you know, overall.
Interesting
I think it's like,
even if it's like 99...
I think it's like,
I think if it's six sigma
of, like,
perfection, and you have your 10,000-plus steps,
you end up with, like, yield that is still dog shit by the end, right?
Like, yeah.
That is a scientific measure.
Dog shit percent.
Yeah.
Yeah, as a multiplicative effect, right?
Yeah
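The specific percentages thrown around here are loose, but the underlying point is that yield compounds multiplicatively across process steps. A minimal sketch with illustrative numbers; the step counts and per-step yields are assumptions, not SMIC or TSMC figures:

```python
def cumulative_yield(per_step_yield: float, steps: int) -> float:
    """Overall yield if every process step independently succeeds with the same probability."""
    return per_step_yield ** steps

print(f"{cumulative_yield(0.99, 1_000):.1e}")        # 99% per step over 1,000 steps: ~4e-5
print(f"{cumulative_yield(0.9991, 1_000):.2f}")      # ~99.9% per step is what gets you to ~40%
print(f"{cumulative_yield(1 - 3.4e-6, 10_000):.2f}") # "six sigma" (3.4 DPMO) over 10,000 steps: ~0.97
```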
So yields are bad because
They have hands tied behind their back right
Like, A, they are not getting to use EUV, whereas on 7 nanometer Intel never used EUV,
but TSMC eventually started using EUV.
Initially, they used DUV, right?
Doesn't that mean the export controls succeeded?
Because they have bad yield because they have to use, like...
Success is... again, they still are determined.
Success means they stop.
They're not stopping.
Going back to the yield question, right?
Like, oh, theoretically, 60,000 wafers a month times 50 to 100 dies per wafer with
yielded,
yielded dies,
holy shit,
that's,
that's millions of GPUs,
right?
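Spelling out that wafer math with the ranges quoted above; these are the conversation's rough numbers, not verified capacity figures:

```python
wafers_per_month = 60_000        # upper-end 7nm-class capacity estimate quoted above
for good_dies_per_wafer in (50, 100):
    chips_per_year = wafers_per_month * good_dies_per_wafer * 12
    print(f"{good_dies_per_wafer} good dies/wafer -> ~{chips_per_year/1e6:.0f}M chips/year")
# 50  good dies/wafer -> ~36M chips/year
# 100 good dies/wafer -> ~72M chips/year
```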
Now, what are they doing
with most of their wafers?
They still have not
become scale-pilled,
so they're still throwing them
at, like,
let's make 200 million
Huawei phones,
right?
Like, oh,
okay, cool,
I don't care.
Right?
Like, as the West,
you don't care as much,
even though,
like, Western companies
will get screwed,
like Qualcomm and, like,
you know,
and MediaTek, Taiwanese companies.
So,
so obviously there's that.
And the same applies
to the U.S.,
but when you flip to,
like,
sorry,
I don't fucking know what I was going to say.
Nailed it.
We're keeping this in.
That's fine.
That's fine.
That's fine.
Hey, everybody.
I am super excited to introduce our new sponsors, Jane Street.
They're one of the world's most successful trading firms.
I have a bunch of friends who either work there now or have worked there in the past.
And I have very good things to say about those friends.
And those friends have very good things to say about Jane Street.
Jane Street is currently looking to hire its next generation of leaders.
As I'm sure you've noticed, recent developments in AI have totally changed what's possible in trading.
They've noticed this too, and they've stacked a scrappy, chaotic new team with tens of millions of dollars of GPUs
to discover signal that nobody else in the world can find.
Most new hires have no background in trading or finance.
Instead, they come from math, CS, physics, and other technical fields.
Of particular relevance to this episode, their deep learning team is hiring,
CUDA programmers, FPGA programmers, and ML researchers.
Go to janestreet.com/dwarkesh to learn more.
And now back to Dylan and Jon.
2026, if they're centralized, they can have as big training runs as any one U.S. company.
Oh, the reason why I was bringing up Shanghai, they're building seven nanometer capacity
in Beijing.
They're building five nanometer capacity in Beijing, but the U.S. government doesn't care.
and they're importing dozens of tools into Beijing.
And they're saying to the U.S. government and ASML,
this is for 28 nanometer, obviously.
This is not bad.
And then obviously, you know, like, in the background.
Yeah, we're making five nanometer here.
Are they doing it because they believe in AI
or because they want to make Huawei phones?
You know, Huawei was the largest TSMC customer
for like a few quarters, actually, before they got sanctioned.
Huawei makes most of the telecom equipment in the world, right?
You know, phones, of course, modems,
but of course accelerators, networking equipment.
You know, you go down the whole, like, video surveillance chips, right?
Like, you kind of, like, go through the whole gamut.
Yeah.
A lot of that could use seven and five nanometer.
Do you think the dominance of Huawei is actually a bad thing for the rest of the Chinese tech industry?
I think Huawei is so fucking cracked that, like, it's hard to say that, right?
Like, Huawei out competes Western firms regularly with two hands tied behind their back.
Like, you know, like, what the hell is Nokia and, like, Sony Ericsson?
like trash, right?
Like, compared to Huawei. And Huawei is not allowed to sell to, like, European companies
or American companies, and they don't have TSMC, and yet they still destroy them.
Right.
And same applies to like the new phone, right?
It's like, oh, it's like as good as like a year old Qualcomm phone on a process node that's
equivalent to like four years old, right, or three years old.
So it's like, wait, so they actually out engineered us with a worst process node.
You know, so it's like, oh, wow, okay.
Like, you know, Huawei is like crazy cracked.
Do you think that culture comes from the military? Because it's the PLA... it is
generally seen as an arm of the PLA. But, like, how do you square that with the fact that sometimes
the PLA seems to mess stuff up? Oh, like filling water in rockets. I don't know if that was true.
I mean, there is, like, that crazy conspiracy, not-care conspiracy. It's
like, yeah, you don't know what the hell to believe in China, especially as a not-
Chinese person. But, like, nobody knows. Even Chinese people don't know what's going on in China.
There's, like, you know, all sorts of stuff like, oh, they're filling water in their rockets.
Clearly they're like incompetent. It's like, look, if I'm the Chinese military, I want the
Western world to like believe I'm completely incompetent because one day I can just like destroy
the fuck out of everything, right, with all these hypersonic missiles and all this shit, right? Like drones and
like, no, no, no, we're filling water in our missiles. These are all fake. We don't actually have
100,000 missiles that we manufacture in a facility that's like super hyper advanced and Raytheon
is stupid as shit because they can't make, you know, missiles nearly as fast.
Right, like, I think that's also, like, the flip side. It's like, how much false propaganda is there, right?
Because there's a lot of, like, no, SMIC could never, SMIC could never, they don't have the best tools,
blah blah blah. And then it's like, motherfucker, they just shipped 60 million phones last year with this chip
that performs only one year worse than what, like, Qualcomm has. It's like, the proof is in the pudding, right? Like,
you know, there's a lot of, like, cope, if you will. I really do
just wonder where that culture comes from. Like, there's something crazy about them where
they're kind of, like, everything they touch they seem to succeed in. And, like, I kind of wonder why.
They're making cars. I wonder if it's going on there too. I think, like, if, like, supposedly, like, if we
kind of imagine, like, historically, do you think they're getting something from somewhere?
What do you mean? Espionage, you mean? Yeah. Obviously. Like, East Germany and the Soviet industry, it was basically
like a conveyor belt of, like, secrets coming in, and they just used that to run everything.
But the Soviets were never good at it. They could never mass produce it. How would espionage explain how they can
make things with different processes?
I don't think it's just espionage.
I think they're just literally cracked.
They have the espionage without a doubt, right?
Like, ASML has been known to been hacked a dozen times.
Right, right? Or at least a few times, right?
And they've been known to have people sued
who made it to China with a bunch of documents, right?
Not just ASML, but every fucking company in supply chain.
Cisco code was literally in like early Huawei, like routers and stuff, right?
Like you go down the list, it's like everything is.
But then it's like, no, architecturally the Ascend 910B looks nothing like a GPU.
It looks nothing like a TPU.
It is like its own independent thing.
Sure, they probably learn some things from some places, but like, it is just like they're good at engineering.
It's 996.
Like wherever that culture comes from, they do good.
Yeah.
They do very good.
Another thing I'm curious about is like, yeah, where their culture comes from is about like, how does it stay there?
Because with American firms or any other firm, you can have a company that's very good, but over time it gets worse, right?
Like Intel or many others.
I guess Huawei just isn't that old of a company,
but like it's hard to like be a big company and like stay good.
That is true.
I think it's like what I think a lot, a word that I hear a lot in with regards to
Huawei's a struggle, right?
And China has a culture of like the communist parties.
It's like really big on struggle.
I think, like, Huawei, in a sense, sort of brought that culture into
the way they do things, like you said before, right?
They, they go crazy because they think that in five years that they're going to fight the
United States.
and literally everything they do, every second is like their country depends on it, right?
It's like, it's the Andy Grovean mindset, right?
Like, shout out to like the based intel, but like only the paranoid survive, right?
Like paranoid Western companies do well.
Why did Google like really screw the pooch on a lot of stuff?
And then why are they like resurging kind of now?
Because they got paranoid as hell, right?
But they weren't paranoid for a while.
If Huawei is just constantly paranoid about like the external world and like, oh, fuck, we're going to die.
Oh, fuck, like, you know, they're going to beat us.
Our country depends on it.
We're going to get the best people from the entire country that are like, you know, the best at whatever they do.
And tell them, you will, if you do not succeed, you will die.
Not you will die.
Your family will die.
Your family will be enslaved and everything.
Like, it'll be terrible.
By the evil Western pigs, right?
Even Western, like, like capital, or not capitalists.
They don't believe in Canada.
They don't say that anymore.
But it's like, like, you know, everyone is against China.
China is being, it's being defiled, right?
And like they're saying like if you, that is all on you, bro.
Like, if you can't do that, then like you, if you can't get that fucking radio to be slightly less noisy and like transmit like five percent more data.
It's like the Old Summer Palace fire all over again.
The British are coming and they will steal all the all the trinkets and everything.
Like that's on you.
Uh-huh.
Why isn't there more vertical integration in this interconnect industry?
Well, like, why does this subcomponent require this other component from this other company, which requires a subcomponent from another company?
Like, why is more of it not done in-house?
The way to look at it today is it's super, super stratified,
and every industry has anywhere from one to three competitors.
And pretty much the most competitive it gets is like 70% share,
25% share, 5% share in any layer of like manufacturing chips, anything, anything,
chemicals, different types of chips.
But it used to be vertically integrated.
Or the very beginning it was integrated, right?
Where did that stop?
What happened was, you know, the funny thing was, like, you know,
you had companies that used to do it all in one.
And then suddenly, sometimes a guy would be like, I hate this.
I think I know how to do better.
Spins off, does his own thing, starts this company.
Goes back to his old company says, I can sell you a product that's better, right?
And that's the beginning of what we called the semiconductor manufacturing equipment industry.
Like, basically it was the '70s, right?
Like everyone made their own equipment.
60s and 70s.
Like they spin off all these people.
And then what happened was that the companies that accepted, you know, these outside products and equipment got better stuff.
They did better.
Like you can talk about a whole bunch.
Like there are companies that were totally vertically integrated in semiconductor manufacturing
for decades.
And they are still good, but they're nowhere near competitive.
One thing I'm confused about is like the actual foundries themselves,
there's like fewer and fewer of them every year, right?
So there's, like, maybe more companies overall, but, like, the final people who make the,
make the wafer, there's less and less.
And then it's interesting in a way it's similar to like the AI foundation models where
you need to use the revenues from like a previous model in order or like your market share
to like fund the next round of ever more expensive development.
When TSM launched the foundry industry, right?
And when they started, there was a whole wave of like Asian companies that funded
semiconductor foundries of their own.
You had Malaysia with Siltera.
You have Singapore with chartered.
You had, there was a wide semiconductor where I talked about earlier.
There's one from Hong Kong.
Bunch in Japan.
Bunch in Japan.
Like, they all sort of did this thing, right?
And I think the thing was that when you're going to leading edge,
when the thing is that like it got harder and harder,
which means that you had to aggregate more demand
from all the customers to fund the next node, right?
So technically, in a sense, what TSMC is kind of doing
is aggregating all this money, all this profit,
to kind of fund this next node to the point where now,
like, there's no room in the market for an N2 or N3.
Like, technically you could argue that economically,
you can make an argument that, like, N2 is a monstrosity that doesn't make sense economically. It
should not exist, in some ways, without the immense, single, concentrated spend of, like, five players in the
market. I'm sorry to, like, completely derail you, but, like, there's this video where it's like, uh, there's
an unholy concoction of meat slurry. Yes. What? Sorry, there's, like, a video that's like, ham is
disgusting, it's an unholy concoction of, like, meat with no bones or collagen.
And, like, I don't know, the way he was describing two nanometer is kind of like that, right?
It's like the guy who pumps his right arm so much and he's like super muscular.
The human body was not meant to be so muscular.
What's the point?
Like, why is two nanometer not justified?
I'm not saying N2, like, N2 specifically, but I'm saying N2 as a concept.
The next node should technically, like right now, there is a point, there will come a point where economically the next node will not be possible, like at all, right?
Unless more technology spawned, like AI now makes, you know, one nanometer or whatever.
There was a long period of time.
Yeah, yeah.
Viable, right?
So, like, right before AI spawn.
As in, like, it makes it worth it?
Money.
So every two years, you get a shrink, right?
Yeah.
Like clockwork, Moore's Law.
And then five nanometer happened.
It took three years.
Holy shit.
And then three nanometer happened.
It took three years.
Or no, sorry, was it three nanometer or five?
It took three years.
Holy shit.
Like, is Moore's Law dead?
Right?
like, because TSMC didn't... and then what did Apple do?
Even on the third year of three of, uh, or sorry, when three nanometer finally launched,
they still only, Apple only moved half of the iPhone volume to three nanometer.
So this is like, now they did a fourth year of five nanometer for a big chunk of iPhones,
right?
And it's like, oh, is the mobile industry petering out?
Then you look at two nanometer and it's like going to be a similar like very difficult
thing for the, for the industry to pay for this, right?
Apple, of course they have, you know, because they get to make the phone, they have so much
profit that they can funnel into, like, more and more expensive chips. But finally, like, that was
running out, right? It was, how economically viable is two nanometer just for one player, TSMC, you know,
ignore Intel, ignore Samsung, because Samsung is paying for it with memory, not with their actual
profit, and then Intel is paying for it from their former CPU monopoly, um, private equity money,
and now, and now private equity money and debt and, uh, subsidies. People's salaries. Yeah. But, like, anyways, like,
you know, there's a strong argument
that funding the next node
would not be economically viable anymore
if it weren't for AI taking off, right?
And then generating all this humongous demand
for the most leading edge chip.
So how much, how big is the difference
between 7 to 5 to 3 nanometer?
Like is it, like, is it a huge deal
in terms of like who can build the biggest cluster?
So there's the, there's a simplistic argument
that like, oh, moving a process node
only saves me X percent in power, right?
And that has been petering out, right?
You know, when you moved from, like, 90 nanometer to 80-something, right, or 70-something, right?
It was like, you got 2x, right?
Dennard scaling was still intact, right?
But now when you move from 5 nanometer to 3 nanometer, first of all, you don't double density.
SRAM doesn't scale at all.
Logic does scale, but it's like 30%.
So all in all, you only save like 20% in power per transistor.
But because of like data locality and movement of data, you actually get a much larger improvement
in power efficiency by moving to the next node than
just the individual transistor's power efficiency benefit, because, you know, for example,
you're multiplying a matrix that's like, you know, 8,000 by 8,000 by 8,000, and then, like,
you can't fit that all on one chip, but if you could fit more and more, you have to move off chip less,
you have to go to memory less, et cetera, right? So the data locality helps a lot too. But, you know,
the AI really, really, really wants new process nodes because, A, you know, power used is a lot
less now.
Higher density,
higher performance, of course.
But the big deal is like,
well, if I have a gigawatt data center,
I can now, how much more flops can I get?
If I have two gigawatt data center,
how much more flops can I get?
If I have a 10 gigawatt data center,
how much more flops can I get, right?
And like, you look at the scaling,
it's like, well, no, everyone needs to go
to the most recent process node as soon as possible.
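A toy illustration of why a fixed power envelope pushes everyone to the newest node: at a given datacenter power budget, deliverable FLOP/s scales with achievable FLOP/s per watt. All numbers below are placeholders, not real node or GPU figures:

```python
POWER_BUDGET_W = 1e9   # a 1 GW datacenter
IT_FRACTION = 0.7      # assumed share of power that actually reaches the accelerators

# Assumed relative chip-level efficiency by node (arbitrary units of FLOP/s per watt),
# folding in the data-locality gains mentioned above, not just per-transistor power.
flops_per_watt = {"N7": 1.0e11, "N5": 1.4e11, "N3": 1.9e11}

for node, fpw in flops_per_watt.items():
    print(f"{node}: ~{POWER_BUDGET_W * IT_FRACTION * fpw:.1e} FLOP/s at 1 GW")
```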
I want to ask the normie question for, like,
everybody. Let me phrase it that way.
Okay, I want to ask a question that's like,
Normie.
Not for you nerds.
I think,
I think Jon and I could communicate to the point where you wouldn't even know what the fuck we're talking about.
Okay.
Suppose Taiwan is invaded or Taiwan has an earthquake.
Nothing is shipped out of Taiwan from now on.
What happens next?
The rest of the world, how would it feel its impact a day in, a week in, a month in, a year in?
I mean, it's a terrible thing.
It's a terrible thing to talk about.
I think it's like, can you just say it's all terrible?
Everything's terrible?
Because it's not just like leading edge.
People will focus on leading edge, but there's a lot of trailing edge stuff that, like, people depend on every day.
I mean, we all worry about AI.
The reality is you're not going to get your fridge.
You're not going to get your cars.
You're not going to get everything.
It's terrible.
And then there's the human part of it, right?
It's all terrible.
Can we, like, it's depressing.
I think.
And I live there.
I think day one, market crashes a lot, right?
You're going to think about, like... I think the big, like, six biggest companies, the Magnificent Seven, whatever that gets called, are like 60, 75% of the S&P
500, and their entire business relies on chips, right?
Google, Microsoft, Apple, Nvidia, you know, you go down the list, right?
And Meta, right?
They all entirely rely on AI.
And you would have a tech reset, like extremely insane tech reset, by the way, right?
So the market would crash. A day in, a couple weeks in, right?
Like, people are preparing now.
People are like, oh, shit, like, let's start building fabs.
Fuck all the environmental stuff.
Like, war is probably happening.
Yeah, yeah.
But the supply chain is trying to figure out what the hell to do to refix it.
But six months in, the supply of chips for making new cars: gone, or sequestered to make military shit, right?
You can no longer make cars.
And we don't even know how to make non-semiconductor-infused cars, right?
Like this unholy concoction with all these like chips, right?
Cars are like 40% chips now.
Like it's just chips in the tires.
There's like 2,000 plus chips.
Every Tesla door handle has like four chips.
It's like, what the fuck?
Like, why?
Like, but like, it's like, it's like shitty, like microcontrollers and stuff.
But like there's like 2,000 plus chips even in an, in an ice vehicle, like internal combustion engine vehicle, right?
And every engine has dozens of dozens of chips, right?
Anyways, this all shuts down because not all of the production.
There's some in Europe.
There's some in the U.S.
There's some in Japan.
Yeah, they're going to bring in a guy to work on Saturday until four.
Yeah, yeah.
I mean, yeah.
So you have, like, TSMC always builds new fabs.
That old fab, they tweak production up a little bit more and more,
and new designs move to the next, next, next node,
and old stuff fills in the old nodes, right?
So, you know, ever since TSMC has been the most important player,
and not just TSMC, there's UMC there, there's a number of other companies there,
Taiwan's share of, like, total manufacturing has grown every single process node.
So in, like, 130 nanometer, there's a lot,
including, like, many chips from, like, Texas Instruments or Analog Devices or, like,
NXP, like all these companies, 100% of it is manufactured in Taiwan, right, by, you know, either
TSMC or UMC or whatever, but then you like step forward and forward and forward, right?
Like, 28 nanometer, like, 80% of the world's production of 28 nanometer is in Taiwan.
Oh, fuck, right?
Like, you know, and everything in 28 nanometers, like, what's made on 28 nanometer today?
Tons of microcontrollers and stuff, but also, like, every display driver IC.
Like, cool.
Like, even if I can make my Mac chip, I can't make the chip that drives the display.
Like, you know, you just go down the list, like everything.
No fridges, no automobiles, no weedwhackers, because that shit has...
My toothbrush has fucking Bluetooth in it, right?
Like, why?
I don't know, but, like, you know, there's, like, so many things that, like, just, like, poof.
We're tech reset.
We were supposed to do this interview, like, many months ago, and then I kept, like, delaying
because I'm like, ah, I don't understand any of the shit.
But, like, it is, like, a very difficult thing to understand it, where I feel, like, with AI,
it's like...
It's not that...
No, you've just spent time.
You've spent the time to...
Sure, but, like, I also feel like it's, like, less complicated...
It feels like it's a kind of thing.
where, like, in an amateur kind of way,
you can, like, you know, pick up what's going on
in the field.
And this field, like, the thing I'm curious about
is, like, how does one learn
the layers of the stack?
Because the layers of the stack are, like,
there's not just the papers online.
You can't just, like, look up the tutorial
on how the transformer works or whatever.
It's like, it's like many layers of really difficult shit.
There are, like, 18-year-olds
who are just cracked at AI, right?
Already, right?
And, like, there's high school dropouts
that get, like, jobs at open AI.
This existed in the past, right?
Pat Gelsinger, current CEO of Intel,
went straight to work.
He grew up in the Amish area of Pennsylvania
and he went straight to work at Intel, right?
Because he's just cracked, right?
That is not possible in semiconductors today.
You can't even get, like, a job at, like, a tool company
without, like, at least, like, a freaking master's in chemistry, right?
And probably a PhD, right?
Like, of the, like, 75,000 TSMC workers,
it's like 50,000 have a PhD or something insane.
Right?
It's like, okay, there's, like, some,
there's, like, a next-level amount of how specialized everything's gotten.
Whereas today, like, you can take, you know, Sholto. When did he start working on AI? Not that long ago.
Not to say anything bad about Sholto.
No, no, no, no, but he's cracked.
He's like, Omega cracked at, like, what he does.
What he does, you could pick him up and drop them into another part of the AI stack.
First of all, he understands it already.
And then second of all, he could probably become cracked at that too, right?
Whereas that is not the case in semiconductors, right?
You, one, you, like, specialize, like, crazy.
two, you can't just pick it up.
You know, like,
Sholto, I think, what did he say?
He, like, just started, like...
He was a consultant in McKinsey,
and at, like, night,
he would, like, read papers about robotics.
Right.
And, like, run experiments and whatever.
Yeah, and then, like, he, like, was, like,
people noticed who's like,
who's like, who the hell is this guy
and why is he posting this?
Like, I thought everyone who knew about this
was at Google already, right?
It's like, come to Google.
Right?
That can't happen in semiconductors, right?
Like, it's just not, like,
conducively, like, it's not possible, right?
One, arXiv
is, like, a free thing. The paper publishing industry is, like, abhorrent everywhere else.
And you just, like, cannot download IEEE papers or, like, SPIE papers or papers from other organizations.
And then two, at least up until, like, late 2022, or really early 2023 in the case of Google, right?
I think up until the PaLM inference paper, before that, all the
best stuff was just posted on the internet. After that, you know, it's been clamped down a little bit
by the labs. But there's also still all these other companies making innovations
in the public.
And like, what is state of the art is public?
That is not the case in semiconductors.
Semiconductors have been shut down since 1960s, 1970s, basically.
I mean, like, it's kind of crazy how little information has been formally transmitted
from one country to another.
Like, the last time you could really think of this was like 19, maybe the Samsung era, right?
So then how do you guys keep up with it?
Well, we don't know it.
I don't think I know it.
I don't, I mean, I...
If you don't know it, what are you?
It's crazy because, like, there's a guy.
There's like, I spoke to one guy, he's like a PhD in etch or something.
One of the top people in the world in etch, and he's like, man, you really know, like, lithography, right?
I'm just like, I don't feel like I know lithography.
But then you've talked to the people who know lithography.
But then you've done pretty good work in packaging, right?
Nobody knows anything.
They all have Gell-Mann amnesia.
They're all in this like single well, right?
They're digging deep.
They're digging deep for what they're getting at.
But they, but, you know, they don't know the other stuff well enough.
And in some ways, I mean, nobody knows the whole stack.
Nobody knows the whole stack.
The stratification of just like manufacturing is absurd.
Like, the tool people don't even know exactly what Intel and TSMC do in production.
And vice versa.
They don't know exactly how the tools are optimized like this.
And it's like how many different types of tools there are?
Dozens.
And each of those has like an entire tree of like all the things that we've built, all the things
we've invented, all the things that we continue to iterate upon.
And then like here's the breakthrough innovation that happens every few years in it too.
So if that's the case of like nobody knows the whole stack,
then how does the industry coordinate to be like, you know, in two years,
we want them to go to the next process, which has gate all around.
And for that, we need X tools and X technologies developed by whatever.
That's really fascinating.
It's a fascinating social kind of phenomena, right?
You can feel it.
I went to Europe earlier this year.
Dylan was like had allergies.
But like I was like talking to those other people.
Europe is just not going to make it.
It's like gossip.
It's gossip. You start feeling the, you start feeling people are.
coalescing around like a something, right?
Early on we used to have, like, Sematech, where,
all these American companies came together and talked and hammered things out,
right?
But Sematech in reality was dominated by a single company, right?
And but then, you know, nowadays is a little more dispersed, right?
You feel, you feel like it's like, it's like, it's like a, it's a blue moon arising kind of thing.
Like they are going towards something.
They know it.
And then suddenly the whole industry is like, this is it.
Let's do it.
I think it's like God came and proclaimed it.
We will shrink density 2x every two years.
Gordon Moore said, he made an observation and then like it didn't go nowhere.
It went way further than he ever expected because it was like, oh, there's a line of sight
to get to here and here.
And he predicted like seven, eight years out, like multiple orders of magnitude of increases
in transistors.
And it came true.
But then by then the entire industry was like, this is obviously true.
This is the word of God.
And every engineer in the entire industry, tens of millions of people,
like literally, this is what they were driven to do.
No, no, every single engineer didn't believe it.
But like, people were like, yes, to hit the next shrink, we must do this, this, this, right?
And this is the optimizations we make.
And then you have this stratification, every single layer, and abstraction layers, every single layer,
through the entire stack to where people, it's an unholy concoction.
I mean, you keep saying this word.
But like, no one knows what's going on because there's an abstraction layer between every single layer.
And on this layer, the people below you and the people above you know what's going on.
and then like beyond that, it's like, okay, I can like,
try to understand, but like not really like...
But I guess it doesn't answer the question of like,
when IRDS or whatever, I don't know, was it 10, 20 years ago,
I watched your video about it where they're like,
we're going to do EUV instead of the other thing,
and this is the path forward.
How do they do that if they don't have the whole sort of picture
of like different constraints, different tradeoffs,
different blah, blah, blah.
They kind of, they argue it out.
They get together and they
talk and they argue. And basically at some point, a guy somewhere says, I think we can move forward
with this. Semiconductors are so siloed. And the data and knowledge within each layer is, A,
not documented online at all, right? No documentation, because it's all siloed within companies.
B, it is, there's a lot of human element to it because a lot of the knowledge, like as John
was saying, is like apprentice master, apprentice master type of knowledge, or I've been doing this for 30
years and there's an amazing amount of intuition on what to do just when you see something
to where like AI can't just learn semiconductors like that.
But at the same time, there's a massive amount of talent shortage and ability to move forward
on things.
Right?
So, like, the technology used on most of the equipment in semiconductor tool fabs runs
on, like, Windows XP, right?
Like, each tool has, like, a Windows XP server on it.
Or, like, you know, all the chip design tools have, like, CentOS,
CentOS, like, version 6, right?
And, like, that's old as hell, right?
So, like, there's, like, so many, like, areas where, like, why is this so far behind?
At the same time, it's, like, so, like, hyper optimized.
That's, like, the tech stack is so broken in that sense.
They're afraid to touch it.
They're afraid to touch it.
Yeah, because it's an unholy amalgamation.
It's an unholy.
It should not work.
It should not work.
This thing should not work.
It's literally a miracle.
So you have all the abstraction layers, but then it's, like, one is there's a lot of
breakthrough innovation that can happen now stretching across abstraction layers. But two is, because
there's so much inherent knowledge in each individual one, what if I can just experiment and test
at a thousand X velocity or a hundred thousand X velocity? And so some examples of where this is
already like shown true is some of Nvidia's AI layout tools, right? And Google as well,
like laying out the circuits within a small blob of the chip with AI. Some of these like
RL design things, some of the, there's a lot of like various like simulation things.
But is that design or is that manufacturing?
It's all design, right?
Most of it's design.
Manufacturing has not really seen much of this yet, although there is starting to come in.
Inverse lithography, maybe.
Yeah, ILT and SMO, maybe.
I don't know if that's AI.
That's not AI.
Yeah.
Anyways, like, there's like tremendous opportunity to bring breakthrough innovation simply because
there is so many like layers where things are unoptimized, right?
So you see like all these like, oh, single digit, mid, you know, low double digit, like,
advantages just from like RL techniques from like AlphaGo type stuff, like, or like not
from AlphaGo, but like five, six, seven, eight year old RL techniques being brought in.
But, like, generative AI being brought in could, like, really revolutionize the industry,
you know, although there's a massive data problem.
So can you give those, can you give the possibilities here in numbers in terms of maybe like
a flop per dollar or whatever?
The relevant thing here is, like, how much do you expect in the future to come from process
node improvements? How much from just, like, how the hardware is designed, because of AI?
If you had to dis... We're talking specifically about, like, GPUs.
Yeah.
Like if you had to disaggregate future improvements.
I think, you know, first it's important to state that semiconductor manufacturing
and design is the largest search space of any problem that humans work on, because it is the most
complicated thing that humans do. And so, you know, when you think about it,
right, there's, there's 1E10, 1E11, right, 100 billion transistors. Yeah. On, on leading
edge chips, right? Blackwell has 220 billion transistors or something like that. So what is...
And those are just on-off switches. And then think about every permutation of putting those together,
contact, gate, et cetera, drain, source, blah, blah, blah, with wires, right? There's 15 metal layers,
right, connecting every single transistor
in every possible arrangement.
This is a search space that is literally
almost infinite, right?
Like, the search space
is much larger than any other search space
that humanity has known.
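A deliberately toy sketch of why "literally almost infinite" is not much of an exaggeration. The transistor count is a round assumption near the Blackwell-scale figure quoted above, and it only counts crude wire-or-no-wire choices between pairs of transistors, ignoring placement, layers, and sizing:

```python
import math

# Toy back-of-the-envelope for the size of the chip-design search space.
# N is an assumed round number near the ~100-200B transistors mentioned above.
N = 2e11  # transistors on a leading-edge chip (assumed)

# Even the crude yes/no question "is there a wire between transistor i and j?"
# gives 2^(N choose 2) possible connection patterns.
pairs = N * (N - 1) / 2
log10_patterns = pairs * math.log10(2)

print(f"log10(number of connection patterns) is roughly {log10_patterns:.2e}")
# The Go game tree is often quoted at ~10^360 positions; here the *exponent*
# itself is ~10^22, which is the "almost infinite" point being made.
```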
And what is the nature of the search?
Like, what are you trying to optimize over?
Well, useful compute, right?
What is, you know, if the goal is
optimize intelligence per picojoule, right?
And intelligence is some nebulous
nature of what the model architecture is.
Yeah, yeah.
And then a picojoule is, like, a unit of energy, right?
How do you optimize
that? So there are humongous innovations possible in architecture, right? Because the vast majority of the power
on an H100 does not go to compute. And there are more efficient, like, compute, you know,
ALU, arithmetic logic unit, like, designs, right? But even then the vast majority of the power
doesn't go there, right? The vast majority of the power goes to moving data around, right? And then
when you look at what is the movement of data, it's either networking or memory,
you have a humongous amount of movement relative to compute
and a humongous amount of power consumption relative to compute.
And so how can you minimize that data movement
and then maximize the compute?
There are 100x gains from architecture.
Even if we literally stop shrinking,
I think we could have 100x gains from architectural advancements.
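A minimal sketch of why data movement dominates the energy budget. The picojoule figures below are assumed orders of magnitude, not numbers from the conversation; the qualitative point is that a byte moved from memory costs far more than a FLOP performed on it, so the split depends on arithmetic intensity:

```python
# Illustrative energy split between compute and data movement on an accelerator.
E_FLOP_PJ = 1.0           # energy per FP16 FLOP (assumed)
E_HBM_PJ_PER_BYTE = 50.0  # energy per byte read from off-chip memory (assumed)

def energy_shares(flops_per_byte: float) -> tuple[float, float]:
    """Return (compute share, data-movement share) of energy for a kernel
    that performs `flops_per_byte` FLOPs for every byte it moves."""
    e_compute = flops_per_byte * E_FLOP_PJ
    e_memory = E_HBM_PJ_PER_BYTE
    total = e_compute + e_memory
    return e_compute / total, e_memory / total

for intensity in (1, 10, 100, 1000):
    c, m = energy_shares(intensity)
    print(f"{intensity:>5} FLOPs/byte: {c:6.1%} compute, {m:6.1%} data movement")
```

At low arithmetic intensity almost all the energy goes to moving bytes, which is why minimizing data movement is where the claimed architectural gains come from.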
Over what time period?
The question is how much can we advance the architecture, right?
The challenge, the other challenge is like the number of people
designing chips has not necessarily grown in a long time, right? Yeah, like company to company,
it shifts, but within like the semiconductor industry in the U.S. and the U.S. makes, you know,
designs the vast majority of leading edge chips, the number of people designing chips has not
grown much. What has happened is the output per individual has soared because of EDA,
electronic design automation tooling, right? Now, this is all still, like, classical tooling. There's
just a little inkling of AI in there yet, right? What
happens when we bring this in is the question,
and how you can solve this search space somehow
with humans and AI working together to optimize this
so it's not most of the power is data movement,
and then the compute is actually very small.
To flip side, the compute is,
first of all, compute can get like 100x more efficient
just with design changes,
and then you can minimize that data movement massively, right?
So you can get a humongous gain in efficiency
just from architecture itself.
And then process node helps you innovate that there.
right? And power delivery helps you innovate that. System design, chip-to-chip networking helps
you innovate that, right? Like memory technologies, there's so much innovation there. And there's so
many different vectors of innovation that people are pursuing simultaneously to where like
Nvidia gen to gen to gen will do more than 2x performance per dollar. I think that's very clear.
And then like hyperscalers are probably going to try and shoot above that, but we'll see if they can
execute. There's like two narratives you can tell here of how this happens.
One is that these AI companies who are training the foundation models understand the tradeoffs of, like, how much the marginal increase in compute versus memory is worth to them, and what tradeoffs they want between different kinds of memory.
They understand this, and so therefore with the accelerators they build, they can make these sorts of tradeoffs in a way that's, like, most optimal, and also design, like, the architecture of the model itself in a way that reflects
what the hardware tradeoffs are.
Another is Nvidia because it has like, I don't know how this works.
Presumably they have some sort of know-how.
Like they're accumulating all this like knowledge about how to better design this architecture
and like also better search tools or so on.
Who basically has, like, the better moat, in terms of: will Nvidia keep getting better at design
and getting this 100x improvement or will it be like Open AI and Microsoft and Amazon
and Anthropic, who are designing their own accelerators,
keep getting better at, like, designing the accelerator?
I think that there's a few
vectors to go here, right? One is, you mentioned
and I think it's important to note, is that
hardware has a huge influence
on the model architecture that's optimal.
And so it's not a one-way street where better chip
equals better model. You know, the optimal model for Google
to run on TPUs, given a given amount of dollars,
a given amount of compute, is
different architecturally than what it is
for OpenAI with Nvidia stuff, right?
It is like absolutely different.
And then like even down to like network
decisions that different companies do and data center design decisions that people do.
The optimal, like, if you were to say, you know, X amount of compute of TPU versus GPU,
compute optimally, what is the best thing?
You'll diverge in what the architecture is.
And I think that's important to know, right?
Can I ask about that real quick?
So earlier we're talking about how China has the H20s or B20s.
Yeah.
And there, there's like much less compute per memory bandwidth and like the amount of memory, right?
Does that mean that Chinese models will actually have like very different
architecture and characteristics than American models in the future?
So you can take this to like a very like large like leap and it's like all, you know,
neuromorphic computing or whatever is like the optimal path and that looks very different
than like what a transformer does, right?
Or you can take it to, like, a simple thing, which is, like, the level of sparsity and, like,
coarse-grained sparsity, like experts and all this sort of stuff.
The arrangement of like what exactly the attention mechanism is because there are a lot of tweaks.
it's not just like pure transformer attention, right?
Or, like, hey, d_model, like how wide versus tall the model is, right?
That's, like, very important, like, d_model versus, you know, number of layers, right?
These are all, like, things that would be different, and, like, I know they're different between, like, say, a Google and an OpenAI, and what is optimal.
But where it really starts to get interesting is, like, hey, if you were limited on a number of different things. Like, China invests humongously in compute-in-memory, you know, which is, like, basically,
the memory cell is directly coupled to, or is, the compute cell, right? So these are, like, things that,
like, China's investing in hugely, and you go to conferences and, like, oh, there's 20 papers from Chinese
companies slash universities about compute-in-memory. Or, like, you know, hey, because the flop
limitation is here, maybe Nvidia pumps up the on-chip memory and like changes the architecture
because they still stand to benefit tens of billions of dollars by selling chips to China.
Right? Today it's just, like, neutered American chips, right? Neutered chips that go to China,
but, like, it'll start to diverge more and more architecturally, because they'd be stupid not to make chips for China, right?
And Huawei obviously, again, like, has their constraints, right?
Like, where are they limited? On memory.
Oh, but they have a lot of networking capabilities, and they could move to, like, certain optical networking technologies directly onto the chip much sooner than we could, right?
Because that is what's optimal for them within their search space of solutions, right?
Because this whole area is like blocked off.
It's kind of really interesting to think about, like, how Chinese
AI models will develop differently from American AI models because of these changes, or these constraints.
And it applies to use cases. It applies to data, right? Like, American models are very much about, like, let me learn from you, right? Let me be able to be used directly by a random consumer, right? That is not the case for Chinese models, I assume, right? Because there are probably very different use cases for them. China crushes the West at video and image recognition, right? At ICML, like, Albert Gu, you know, of Cartesia,
like state space models. Every single Chinese person was like, can I take a selfie with you, man? He was harassed in the US. Like, you see Albert and he's like, it's awesome, he invented state space models. But it's not like state space models are, like, big here. But that's because state space models potentially have, like, a huge advantage in, like, video and image and audio, which is, like, stuff that China does more of, and is further along in and has better capabilities in, right? So it's like... There are all the surveillance cameras there. Sorry, because of all the surveillance cameras there. Yeah, that's the quiet part out loud, right? But, like, there's already divergence
in, like, capabilities there, right?
Like, you know, if you look to image recognition,
China, like, destroys American companies, right?
On that, right?
Because the surveillance.
You have, like, this divergence in tech tree,
and, like, people can, like,
start to design different architectures
within the constraints you're given.
Yeah, yeah.
And everyone has constraints,
but the constraints different companies have
are even different, right?
And so, like, Google's constraints
have shown them that they built,
they built a genuinely different architecture.
But now if you look at, like, Blackwell,
and then what's, like, said about the TPU,
right, I'm not going to say they're, like, converging, but they are getting a little bit closer
in terms of, like, how big the matmul unit size is, and, like, some of the, like, topology, and, like,
world size of, like, the scale-up versus scale-out network. Like, there is some convergence, slightly.
Like, not saying they're similar yet, but, like, already they're starting to. But then there's
different architectures and paths that people could go down. So you see stuff, like, from all these
startups that are trying to go down different tech trees, because maybe that'll work. But there's a
self-fulfilling prophecy here too, right?
All the research is in transformers
that are very high arithmetic intensity,
because the hardware we have is very high
arithmetic intensity, and transformers
run really well on GPUs and TPUs.
And, like, you sort of have a self-fulfilling prophecy.
If all of a sudden you have an architecture which is,
theoretically, way better,
but you can only get, like, half of the usable
flops out of your chip,
it's worthless. Because even if it's a
30%, you know, compute efficiency win,
it's half as fast
on the chip, right? So there's all sorts of
tradeoffs and, like, self-fulfilling
prophecies of what path people go down.
John and Dylan have talked a lot in this episode about how stupefyingly complex the global
semiconductor supply chain is.
The only thing in the world that approaches this level of complexity is the Byzantine
web of global payments.
You're stitching together legacy tech stacks and regulations that differ in every jurisdiction.
In Japan, for example, a lot of people pay for online purchases by taking a code to their
corner store and punching it into a kiosk.
Stripe abstracts all this complexity away from businesses.
You can offer customers whatever payment experience they're most likely to use wherever they
are in the world.
And Stripe is how I invoice advertisers for this very podcast.
I doubt that they're punching in codes at a kiosk in Japan, but if they are, Stripe will handle it.
Anyways, you can head to stripe.com to learn more.
If you were made head of compute of a new AI lab, if, like, SSI came to you, the Ilya Sutskever
new lab, and they're like, Dylan, we give you $1 billion, you are
our head of compute, like, help us get on the map so we can compete with the frontier labs.
What is your first step?
Okay, so the constraints are a U.S. slash Israeli firm because that's what SSI is, right?
And your researchers are on the U.S. in Israel.
You probably can't build data centers in Israel because power is expensive as hell and
it's probably like risky, maybe, I don't know.
So still in the U.S. most likely.
Most of the researchers are here, or a lot of them are in the U.S., right, like,
Palo Alto or whatever.
So I guess you need a significant chunk of compute.
Obviously, though, like the whole pitch is you're going to make some research breakthrough.
That's like compute efficiency win, data efficiency win, whatever it is.
You're going to make some breakthrough, but you need compute to get there, right?
Because your GPUs per researcher is your research velocity, right?
Obviously, like, data centers are very tapped out, right?
Not in terms of tapped out, but like every new data center that's coming up,
most of them have been sold, which has led people like Elon to go through this like insane thing in Memphis, right?
I'm just trying to like, I'm just trying to square the circle.
Yeah, yeah.
On that question, I kid you not, in my group house, like group chat, like there have been two separate people who have been like I have a cluster of 800s and I have like a long lease on them.
But I don't like I'm trying to sell them off.
Is it like a buyer's market right now?
Because it does seem like people are trying to get rid of them.
So I think, like, for the Ilya
question, a cluster of, like, 256 GPUs or even 4K GPUs is kind of cope, right?
It's not enough, right? Yes, you're going to make compute efficiency wins, but with a billion dollars
you probably just want the biggest cluster in one individual spot. Sure. And so, like, small amounts of
GPUs are probably not, like, you know, useful, right, for them, right? And that's what
most of the sales are, right? Like, you go and look at, like, GPU List or, like, Vast or, like, Foundry, like,
or 100 different GPU resellers,
the cluster sizes are small.
Now, is it a buyer's market?
Yeah, last year you would buy H100s for, like, $4 or $3
an hour, you know, right?
For a shorter term or mid-term deals, right?
Now it's like, if you want a six-month deal,
you can get like $2.15 or less, right?
And like the natural cost, if I have a data center, right,
and I'm paying like standard data center pricing
to purchase the GPUs and deploy them,
it was like $1.40,
and then you add on the debt,
because I probably took debt to buy the GPUs,
or equity, the cost of capital,
and it gets up to like $1.70 or something, right?
And so you see deals that are like
the good deals, right?
Like Microsoft renting from CoreWeave
are like $1.90 to $2, right?
So people are getting closer and closer to like,
there's still a lot of profit, right?
Because the natural rate, even after debt
and all this is like $1.70.
So like there's still a lot of profit
when people are selling in the low twos,
like GPU companies, people are deploying them.
But it is a buyer's market
in a sense that it's gotten a lot cheaper.
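A minimal sketch of the $/GPU-hour arithmetic being walked through here. Every input is an assumption chosen to land near the rough figures quoted (~$1.40 "natural" cost, ~$1.70 once the cost of capital is added), not actual deal terms:

```python
# Assumed inputs, tuned to roughly reproduce the numbers in the conversation.
capex_per_gpu = 32_000          # server + networking capex per H100 (assumed)
life_hours = 4 * 365 * 24       # ~4-year useful life (assumed)
dc_power_ops_per_hr = 0.45      # data center, power, and ops per GPU-hour (assumed)
cost_of_capital = 0.08          # blended annual debt/equity rate (assumed)

natural = capex_per_gpu / life_hours + dc_power_ops_per_hr
financed = natural + capex_per_gpu * cost_of_capital / (365 * 24)

print(f"natural cost   is about ${natural:.2f}/GPU-hr")
print(f"with financing is about ${financed:.2f}/GPU-hr")
# Compare with the rental prices quoted: ~$2.15 for six-month deals and
# ~$1.90-$2.00 for the big Microsoft/CoreWeave-style contracts.
```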
But the cost of compute is going to continue
to tank, right? Because it's, like, sort of like, I don't remember the exact name of the law,
but it's effectively Moore's law, right? Every two years, the cost of transistors halved, and yet the industry grew, right? Every six months or three months, the cost of intelligence halves.
You know, like, OpenAI launched GPT-4, what, early 2023, right? $120 per million
tokens or something like that was roughly the cost, and now it's like 10, right? It's like the cost of
intelligence is tanking, partially because of compute, partially because of the models' compute efficiency
wins, right? I think that's a trend we'll see, and then that's going to drive adoption as you
scale up and make it cheaper and scale up and make it cheaper.
Right, right.
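Taking those quoted prices at face value, the implied rate of decline is roughly this (dates are approximate, so treat the output as an order of magnitude):

```python
import math

p_start, p_now = 120.0, 10.0   # $/M tokens at GPT-4 launch vs. roughly now (as quoted)
years = 1.5                    # ~early 2023 to late 2024 (approximate)

annual_factor = (p_now / p_start) ** (1 / years)
halving_months = 12 * math.log(0.5) / math.log(annual_factor)

print(f"price shrinks to ~{annual_factor:.0%} of itself each year")
print(f"i.e. cost for the same capability halves every ~{halving_months:.1f} months")
```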
Anyways, what were you saying, if you're the head of compute of SSI?
Okay, head of compute of SSI.
That's very intense.
There's obviously no free data center lunch, right? In terms of, you know, you can just,
you know, take that based on, like, the data we have shows that there's no free lunch,
per se, if immediately today you need the compute at a
large cluster size, or even six months out, right? There's some, but, like, not a huge amount, because
of what X did, right? XAI is like, oh shit, we're going to go, like, we're going to go
buy a Memphis factory, put a bunch of, like, generators outside, like mobile generators
usually reserved for, like, natural disasters, a Tesla battery pack, draw as much power as we can
from the grid, tap the natural gas line that's going to the natural gas plant, like, two miles away.
They could build out a natural gas plant, like, just, like, send it, and, like, get a cluster built as fast
as possible. Now you're running 100K GPUs, right?
I know. And that cost about $5 billion, right? Four billion, right? Not
one billion. So the scale that SSI has is much smaller, by the way, right? So
their size of cluster will be, you know, maybe one third or one fourth of the size, right? So now
you're talking about a 25 to 32K cluster, right? Even then, you still don't have that, right? No one is
willing to rent you a 32K cluster today, no matter how much money you have, right? Even if you had more
than a billion dollars. So you now, it makes the most sense to build your own cluster one,
instead of renting it, or get a very close relationship like OpenAI-Microsoft with CoreWeave,
or OpenAI-Microsoft with Oracle slash Crusoe. The next best thing is Bitcoin mining sites, right? So OpenAI has a
data center in Texas, right? Or it's going to be their data center. It's like they've kind of
contracted it and all that, through CoreWeave. There is a 300 megawatt natural gas plant on site, powering
these crypto mining data centers, from the company called Core Scientific.
And so they're just converting that.
There's a lot of conversion, but the power's already there.
The power infrastructure is already there.
So it's really about converting it, getting it ready to be water cooled, all that sort of stuff,
and convert it to 100,000 GB200 cluster.
And they have a number of those going up across the country.
But that's also, like, tapped out to some extent, because Nvidia is doing the same thing in Plano, Texas
for a 32,000 GPU cluster that they're building.
And so Nvidia's doing that?
Well, they're going through partners, right?
Because this is the other interesting thing
is the big tech companies can't do crazy shit like Elon did.
Why?
ESG.
Oh, interesting.
They can't just do crazy shit like, because this-
Actually, do you expect Microsoft and Google and whoever
to drop their net zero commitments
as the scaling picture intensifies?
Yeah, yeah.
So like this like, what XAI is doing, right,
is like it's not that polluting, you know,
on the scheme of things,
but it's like you have 14 mobile generators
and you're just burning natural gas on site
on these mobile generators that sit on trucks, right?
And then you have, like, power coming directly from two miles down the road.
There's no unequivocal way to say any of the power
is green, because two miles down the road
is a natural gas plant as well, right?
There's no way to say this is, like, green.
You go to the CoreWeave thing,
and a natural gas plant is literally on site,
from Core Scientific and all that, right?
And then the data centers around it
are horrendously inefficient, right?
There's this metric called PUE,
which is basically how much power is brought in
versus how much gets delivered to the chips, right?
And, like, the hyperscalers, because they're so efficient or whatever, right,
their PUE is like 1.1 or lower, right?
I.e., if you get a gigawatt in, 900 megawatts or more gets delivered to chips, right?
Not wasted on cooling and all these other things.
This, like, core scientific one is going to be like 1.5, 1.6, i.e.,
even though I have 300 megawatts of generation on site, I only deliver like 180, 200 megawatts to the chips.
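The PUE arithmetic just described, using the numbers quoted in the conversation:

```python
# Facility power in vs. power that actually reaches the chips (the IT load).
def it_power_mw(facility_mw: float, pue: float) -> float:
    return facility_mw / pue

print(f"{it_power_mw(1000, 1.1):.0f} MW to chips")   # hyperscaler: 1 GW in -> ~909 MW
print(f"{it_power_mw(300, 1.55):.0f} MW to chips")   # converted crypto site: 300 MW -> ~194 MW
```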
Given how fast solar is getting cheaper, and also the fact that, like, you know, the reason solar is difficult elsewhere is, like, you know, you've got to power the homes at night.
Here, I guess it's, like, theoretically possible to, like, figure out, you know, only running the clusters in the day or something.
Absolutely not.
Really?
That's not possible.
Because because it's so expensive to have these GPUs.
Yeah.
So, like, when you look at the power cost of a large cluster, it's trivial to some extent, right?
Like, you know, the meme that, like, oh, you know, you can't build a data center in Europe or East Asia because the power is expensive.
That's not really relevant.
Or that power is so cheap in China and the U.S.
that those are the only places you can build data centers.
That's not really the real reason.
It's that the ability to generate new power for these activities is why it's really difficult.
And the economic regulation around that.
But the real thing is, like, if you look at the cost of ownership of a GPU, of an H100, let's just say you gave me, you know, a billion dollars,
and I already have a data center,
I already have all this stuff.
I'm paying regular rates for the data centers.
I'm not paying through the nose or anything.
Paying regular rates for power,
not paying through the nose.
Power is sub 15% of the cost.
And it's sub 10% of the cost, actually, right?
The biggest, like 75% to 80% of the cost
is just the servers, right?
And this is on a multi-year,
including debt financing,
including cost of operation, all that, right?
Like when you do a TCO,
total cost of ownership,
like it's like 80% is the GPUs,
10% is the data center,
10% of the power, rough numbers, right?
So it's like kind of irrelevant, right, whether or not you like, like how expensive the power is, right?
Yeah.
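The rough total-cost-of-ownership split just described, with the $1B only there as the hypothetical budget from the SSI scenario:

```python
# ~80% GPUs/servers, ~10% data center, ~10% power, per the rough numbers above.
budget = 1_000_000_000
shares = {"GPUs / servers": 0.80, "data center": 0.10, "power": 0.10}

for item, share in shares.items():
    print(f"{item:>14}: ${share * budget / 1e6:,.0f}M")

# Doubling the price of power therefore only moves total cost by ~10%,
# which is why the marginal $/kWh matters less than getting power at all.
```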
You'd rather do what Taiwan does, right?
Like, what do they do when there were droughts, right?
They, like, force people to not shower.
When there was a power shortage in Taiwan,
they basically rerouted power from the residential side.
And this will happen in a capitalistic society as well, most likely, because, like, fuck you.
Like, why are you not going to pay X dollars per kilowatt hour?
Because to me, the marginal cost of power is irrelevant.
It really, it's all about the GPU cost and the ability to get the power. I don't want to turn it off
eight hours a day. Maybe let's discuss what would happen if the training regime changes and if it
doesn't change. So like you could imagine that the training regime becomes much more parallelizable
where it's, like, about, like, coming up with some sort of, like, search or synthetic data. Like, most of the
compute for training is used to come up with synthetic data or do some kind of search, and that can
happen across a wide area. In that world, how fast
could we scale? Like, let's go through the numbers, like, year after year. And then suppose it actually has to be,
you would know more than me, but, like, suppose it has to be the current
regime, and, like, just explain what that would mean in terms of, like, how distributed that would have to be, and then
how plausible it is to get clusters of certain sizes over the next two years.
I think, I think it, like, is not too difficult for Ilya's company to get a cluster of, like, 32K of Blackwell.
Uh,
forget about it. Okay, okay, fair enough, fair enough.
Like, 2025,
2026, there's,
before I talk about, like,
the U.S.,
I think it's, like,
important to note
that there's, like,
a gigawatt plus of data center
capacity in Malaysia next year
now.
That's, like,
mostly bite dance.
But, like,
there's, like,
you know,
and power-wise,
there's, like,
there's the humongous damming
of the Nile in Ethiopia,
and the country uses,
like, one-third of the power
that that dam generates.
So there's, like,
a ton of power there.
How much power does that dam generate?
Like,
it's, like,
over a gigawatt.
And the country consumes,
like,
400 megawatts or something trivial.
And is like, are people bidding for that power?
I think people just don't think they can build a data center in fucking Ethiopia.
Why not?
I don't think the dam is filled yet, is it?
I mean, they have to like, the dam could generate that power.
They just don't.
Oh, got it.
Right?
Like, there's a little bit more equipment required, but that's, like, not too hard.
Why don't they?
Yeah.
I think there's, like, like, true security risks, right?
If you're China or if you're the U.S. lab, like, to build a fucking data center with all your
IP in
in Ethiopia.
Like,
you want AGI to be
in Ethiopia?
Like, you want it to be
that accessible.
Like, people,
you can't even monitor,
like,
like, being the technicians
in the fucking data center
or whatever, right?
Or, like,
powering the data center,
all these things.
Like, there's so many, like,
you know,
things you could do to,
like, you could just destroy
every GPU in a data center
if you want,
if you just, like,
fuck with the grid, right?
Like, pretty,
like, easily, I think.
People talk a lot about
the Middle East.
There's a 100K
GB200 cluster going up in the Middle East,
right?
And the U.S.,
like, there's, like, clearly, like, stuff the U.S. is doing, right? Like, uh, the, you know,
um, G42 is the UAE data center company, cloud company. Their CEO is a Chinese national,
or not a Chinese national, he's Chinese, basically Chinese allegiance. But, uh, OpenAI, I think,
originally wanted to use the data center from them, but instead, like, the US forced Microsoft to,
like, I feel like this is what happened, is forced Microsoft to, like, do a deal with them, um,
so that G42 has a 100K GPU cluster, but Microsoft is, like, administering and operating it for security
reasons, right? And there's, like, Omniva in, like, Kuwait, like the Kuwaiti, like, super rich guy spending, like,
five plus billion dollars on data centers, right? Like you just go down the list, like all these
countries, Malaysia has, you know, you know, 10 plus billion dollars of like data center, you know,
AI data center buildouts over the next couple years, right? Like, and you know, go to every country.
It's like this stuff is happening. But on the grand scheme of things, the vast majority of the
compute is being built in the U.S. and then China and then, like, Malaysia, Middle East, and, like, rest of
the world. And if you're in the, you know,
going back to your point, right?
Like you have synthetic data,
you have like the search stuff,
you have like,
you have all these post-training techniques.
You have all this,
you know,
all this ways to soak up flops.
Or you just figure out
how to train across multiple data centers,
which I think they have.
At least Microsoft and OpenAI have figured it out.
What makes you think they figured it out?
Their actions.
So Microsoft has signed deals
north of $10 billion
with fiber companies
to connect their data
centers together. There are some permits already filed to show people are digging, you know,
between certain data centers. So we think, with fairly high accuracy, we can say that
there are five data centers, massive, not just five data centers, sorry, five, like, regions that they're
connecting together, each of which comprises many data centers, right? What will be the total power
usage of that? Depends on the timing, but easily north of a gigawatt, right? Which is like close to a million
GPUs. Well, each GPU is getting higher power consumption, too, right? Like, you know,
the rule of thumb is, like, an H100 GPU is like 700 watts, but then, like, total power per GPU
all in is like 1,300, 1,400 watts. But for next generation Nvidia GPUs,
it's 1,200 watts for the GPU, but then it actually ends up being like 2,000 watts all in, right?
So there's a little bit of scaling of power per GPU. But, like, you already have 100K clusters, right?
OpenAI in Arizona, xAI in Memphis, and many others already building 100K clusters of H100s.
You have multiple, at least five, I believe, GB200 100K clusters being built by Microsoft
slash OpenAI slash partners for them.
And then potentially even more. 500K GB200s, right, is a gigawatt, right?
And that's, like, online next year, right?
And, like, the year after that, if you aggregate all the data center sites and, like, how much power,
and you only look at net adds since 2022 instead of, like, the total capacity at each data
center, then you're still, like, north of multi-gigawatt.
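Converting the all-in watts-per-GPU figures above into accelerators per gigawatt, which is why "a gigawatt" and "close to a million GPUs" get used almost interchangeably here:

```python
def gpus_per_gw(all_in_watts: float) -> float:
    # 1 GW of IT power divided by the all-in power draw per accelerator.
    return 1e9 / all_in_watts

print(f"H100 era, ~1,300 W all-in: {gpus_per_gw(1300):,.0f} GPUs per GW")
print(f"next gen, ~2,000 W all-in: {gpus_per_gw(2000):,.0f} GPUs per GW")
# Roughly 500K-770K accelerators per gigawatt at the power levels quoted.
```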
Right? So they're spending 10 plus billion dollars on these fiber deals with a few fiber companies,
Lumen, Zayo, like, you know, a couple other companies. And then they've got all these data centers
that they're clearly building 100K clusters at, right? Like the old crypto mining site with CoreWeave in Texas,
or, like, this Oracle Crusoe one in Texas, and then, like, in Wisconsin and Arizona and, you know, a couple other
places. There's a lot of data centers being built up, you know, and providers, right, QTS and Cooper.
and like, you know, you go down the list,
there's, like, so many different providers,
and self-build, right?
Data centers, I'm building myself.
So, so, uh, uh,
gigawatts, yes.
Let's just, like, give the number on, like,
okay, 2025,
Elon's cluster is going to be the big,
like, it doesn't matter who it is.
So, so then there's a definition game, right?
Like, Elon claims he has the largest cluster
at 100K GPUs because they're all fully connected.
Rather than who it is, like, I just want to know,
like, how many, like,
I don't know if it's better to denominate in H100s or...
100,000 GPUs this year, right?
Right.
For the biggest cluster.
For the biggest cluster.
Next year.
Next year, 300 to 500,000, depending on whether it's one side or many, right?
300 to like 700,000, I think is upper bound of that.
But anyways, you know, it's about, like, when they're turned on, when they can connect them, when the fiber's connected together.
Anyways, 300 to like 700,000, let's say.
But those GPUs are 2 to 3x faster, right, versus the 100K cluster.
So on an H-100 equivalent basis, you're at a million chips next year.
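The "million H100-equivalents" arithmetic, using only the ranges just quoted (300K-700K GPUs that are each roughly 2-3x an H100):

```python
low, high = 300_000 * 2, 700_000 * 3
print(f"{low:,} to {high:,} H100-equivalents next year")   # 600,000 to 2,100,000
```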
In one cluster?
By the end of the year, yes.
No, no, no, well, so one cluster is like the, but you know what I mean.
The wishy-washy definition, right?
Multi-site, right?
Can you do multi-site?
What's the efficiency loss when you go multi-site?
Is it possible at all?
I truly believe so.
What is it whether it's, what's the efficiency loss is a question, right?
Okay, it would be like 20% loss, 50% loss?
Great question.
This is where like, you know, this is where you need like the secrets, right?
Of, like... And Anthropic's got similar plans with Amazon, and you go down the list, right?
And then the year after that.
The year after that is where,
This is 2026.
2026, there is a single gigawatt site.
And that's just part of the like multiple sites, right?
For Microsoft.
The Microsoft 5 gigawatt thing happens in 20.
One gigawatt one site in, in 2026.
But then you have, you know, a number of others.
You have five different locations, each with multiple, some with multiple sites,
some with single site.
You're easily north of two, three gigawatts.
And then the question is, can you start using the old chips with the new chips?
And like the scaling, I think, is like, you're going to continue to see flop scaling, like, much faster than people expect.
I think, as long as the money pours in, right?
Like, that's the other thing is, like, there's no fucking way you can pay for the scale of clusters that are being planned to be built next year for Open AI.
Unless they raise, like, $50 to $100 billion.
Which I think they will raise that, like, end of this year, early next year.
50 to $100 billion?
Yes.
Are you kidding me?
No.
Oh, my God.
This is like, you know, like, Sam has a superpower, no?
like, it's like, it's like recruiting and like raising money.
That's like what he's like a god at.
Will chips themselves be a bottleneck to the scaling?
Not in the near term.
It's more again back to the concentration versus decentralization point.
Because like the largest cluster is 100,000 GPUs.
Nvidia has manufactured close to 6 million Hoppers, right, across last year and this year.
Right?
So like what? That's fucking tiny, right?
So then why is Sam talking about the 7 trillion to build foundries and whatever?
Well, this is, you know, like,
draw the line, right? Like, log-log lines. Number goes up, right? You know, if you do,
if you do that, right? Like, you're going from 100K to 300 to 500K, where the equivalent is a million,
you just 10x'd year on year. Do that again. Do that again. Or more, right? If you increase the
pace... Sorry, what is "do that again"? So, like, 2026, like, the number of H100 equivalents?
If you, you know, if you increase the globally produced flops by, like, 30x year on year or
10x year on year, and the cluster size grows, or the cluster size grows by,
you know, 3 to 5 to 7x,
and then you do your start,
you get multi-site going better and better and better.
You can get to the point where
multimillion chip clusters,
even if they're like regionally not connected
right next to each other,
are right there.
And in terms of flops,
like it would be 1E, what?
I think 1E30 is like very possible,
like 28, 29.
Wow.
Yeah.
And 1E30, you said,
by 28, 29.
Yeah.
And so that is literally five orders of magnitude,
that's like 100,000 times more compute than GPT-4.
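A sketch of the "1E30 by 28, 29" arithmetic. The GPT-4 training compute here is an assumed ~2e25 FLOP (a widely circulated outside estimate, not a figure from this conversation), and the growth rate is the rough 10x-per-year framing used above:

```python
gpt4_flop = 2e25      # assumed baseline
growth = 10           # assumed annual multiplier of effective compute

flop = gpt4_flop
for year in range(2023, 2030):
    print(f"{year}: ~{flop:.0e} FLOP available for the largest training effort")
    flop *= growth
# 1e30 falls out around 2027-2028 on these assumptions; slower growth (say 5x
# per year) pushes it toward the very end of the decade.
```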
The other thing to say is like the way you count flops on a training run is really stupid.
Like you can't just do like active parameters times tokens times six, right?
Like that's really dumb because like the paradigm as you mentioned, right, is like,
and you've had many great podcasts on this like synthetic data and like RL stuff post-training,
like verifying data and like all these things generating and throwing it away like all sorts of stuff.
Search like inference time compute.
All these things, like, aren't counted in the training flops.
So, like, 1E30 is a really stupid number to say, because by then, you know,
the actual flops of the pre-training may be X, but the compute to generate the data for the pre-training
may be, you know, way bigger, or, like, the search inference-time compute may be, like, way, way bigger, right?
Right.
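For reference, this is the naive pre-training FLOP rule of thumb being pushed back on. The magnitudes in the example are purely illustrative, not any particular model's real numbers:

```python
# FLOPs roughly equal 6 x (active parameters) x (training tokens). This says
# nothing about synthetic-data generation, RL post-training, verification, or
# inference-time search, which is the point being made above.
def naive_pretraining_flops(active_params: float, tokens: float) -> float:
    return 6 * active_params * tokens

print(f"{naive_pretraining_flops(1e12, 2e13):.1e} FLOP")   # 1T active params, 20T tokens
```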
But also, like, because you're doing the sort of adversarial synthetic data, where, like, the thing
you're weakest at, you can make synthetic data for that, it might be way more sample efficient.
So, like, even though...
The pre-training flops will be an
element, right? Like, I actually don't think pre-training flops will be 1E30. I think more reasonably
it would be, like, the total summation of the flops that you deliver to the model. Right.
Across pre-training, post-training, synthetic data for that pre-training data and post-training data, as well as, like, some of the inference-time compute
efficiencies, it could be, like, it's more like 1E30, right? So suppose you really do get to the world where, like, it's worth investing...
Okay, actually, if you're doing 1E30,
how is that, like, a trillion-dollar cluster?
A billion-dollar cluster?
Like,
I think it'll be like,
multi hundred billion dollars.
And then,
and,
but then,
like,
it'll be,
like,
I,
like,
truly believe people are going to be able to use their
prior generation clusters alongside their new generation clusters.
Um,
and obviously,
like,
you know,
smaller batch sizes or whatever,
right?
Like,
or use that to generate and verify data,
all these sorts of things.
And then for 1E30,
um,
right now,
I think 5% of,
uh,
TSMC's N5 is Nvidia, or, like,
whatever percent it is,
by 2028, what percentage will it be?
Again, this is like a question of like how scale pill you are
and how much money will flow into this and how you think progress works.
Like, will models continue to get better or does the line like not,
does the line slope over?
I believe it'll like continue to like skyrocket in terms of capabilities.
In that world.
In that world, why wouldn't, like, not of five nanometer, but of two nanometer,
A16, A14, these are the nodes that will be, in that time frame of 2028, used for AI.
I could see like 60, 70, 80% of it.
Like, yeah.
No problem.
Given the fabs that are like currently planned
and are currently being built,
is that enough for the 1E30, or will it be?
So then, like, the chip part doesn't make any sense.
Sorry.
Like, the chip stuff about, like, us not having enough compute
there doesn't make any sense.
So no, I think like the plans of TSMC on 2 nanometer and such
are like quite aggressive for a reason, right?
Like to be clear,
Apple, which has been TSMC's largest
customer, does not need as much two nanometer capacity as they're building. They will not need A16.
They will not need A14, right? Like you go down the list, it's like, Apple doesn't need this shit,
right? Although they did just hire one of Google's head of system design for TPU. But, you know,
so they are going to make an accelerator. But, you know, that's besides the point,
an AI accelerator, but that's besides the point, like, Apple doesn't need this for their business,
which they have been 25% or so of TSM's business for a long time. And when you, when you just
zone in on just the leading edge, they've been like more than half of the newest node,
or 100% of the newest node almost constantly.
That paradigm goes away, right?
If you believe in scaling and you believe in, like,
the models get better, the new models will generate,
you know, infinite, not infinite,
but like amazing productivity, gains for the world
and such on, so on and so forth.
And if you believe in that world, then, like,
TSMC needs to act accordingly,
and the amount of silicon that gets delivered needs to be there.
So 25, 26, TSMC is, like, definitely there.
And then on a longer time scale,
the industry,
the industry can be ready for it, but it's going to be a constant game of, like, you must convince them
constantly that they must do this. It's not, like, a simple game of, like, oh, you know, if people
work silently, it's not going to happen, right? Like, they have to see the demonstrated growth
over and over and over and over again across the industry. And markets.
Investors or companies or who?
More so, like, TSMC needs to see Nvidia volumes continue
to grow straight up, right? And, oh, and Google's volumes continue to grow straight up.
you know, go down the list.
Chips in the near term, right?
Next year, for example,
are less of a constraint than data centers, right?
And likewise for 2026.
The question for 27, 28 is like,
you know, always when you grow super rapidly,
like people want to say,
that's the one bottleneck,
because that's the convenient thing to say.
And in 2023,
there was a convenient bottleneck, CoWoS, right?
The picture's got much, much cloudier,
not cloudier,
but we can see that like, you know,
HBM is a limiter too.
CoWoS is as well,
CoWoS-L especially, right?
Data centers, transformers, substations,
like power generation, batteries, like UPSs,
like CRAHs, like water cooling stuff.
Like all of this stuff is now limitations
next year and the year after.
Fabs are in 26, 27, right?
Like, you know, things will get like cloudy
because, like, the moment you unlock one,
oh, like, only 10% higher and the next one is the bottleneck.
And only 20% higher and the next one is the bottleneck.
So today, like, data centers are, like, four to five
percent of total US power. When you think about it, like, as a percentage of US power, that's not that
much. But when you think US power has been like this, and now you're like this... But then, on the
flip side, you're like, all this coal's been curtailed, all these, like... Oh, there are so many, like, different
things. So, like, power is not that crazy on a, like, global, on a national basis. On a localized basis, it is,
because it's about the delivery of it. Same with the substation and transformer supply chains, right?
It's like, these companies have operated in an environment where US power is like this, or even
slightly down, right? And it's, like, kind of been like that, you know, because of
efficiency gains, because of, you know... So anyways, like, there has been a humongous, like,
um, weakening of the industry. Um, but now all of a sudden, if you tell that industry,
your business will triple next year if you can produce more. Oh, but I can only produce 50% more.
Okay, fine. The year after that? Now we can produce 3x as much, right? You do that to the industry.
The U.S. industrial base, as well as the Japanese, and, like, you know, all across the world, can
get revitalized much faster than people realize, right? Like,
I truly believe that people can innovate when given the like need to.
It's one thing if it's like this is a shitty industry where my margins are low and we're not growing
really, and, like, you know, blah, blah, blah, to all of a sudden, oh, this is sexy.
I'm in power and I'm like, this is the sexiest time to be alive, and, like, we're going
to do all these different plans and projects and people have all this demand and they're like
begging me for another percent of efficiency advantage because that gives them another percent
to deliver to the chips.
Like all these things where 10 percent or whatever it is, like you see all these things happen.
and innovation is unlocked.
And, you know, you also bring in, like, AI tools.
You bring in, like, all these things.
Innovation will be unlocked.
Production capacity can grow, not overnight, but it will on six months,
18 months, three-year time scales.
It will grow rapidly.
And you see the revitalization of these industries.
So, but I think, like, getting people to understand that,
getting people to believe because, you know, if we pivot to, like,
I'm telling you that Sam's going to raise 50 to $100 billion
because he's telling people he's going to raise this much, right?
Like literally having discussions with sovereigns and like Saudi Arabia and like the Canadian pension fund and like not these specific people, but like the biggest investors in the world.
Of course Microsoft as well, but like he's literally having these discussions because they're going to drop their next model or they're going to show it off to people and raise that money.
But because this is their plan.
If these sites are already planned, and, like, they've already...
The money's not there, right?
So how do you plan?
How do you, like, plan a site without it?
Today Microsoft is taking on immense credit risk, right?
Like, they've signed these deals
with all these companies to do this stuff,
but Microsoft doesn't have...
I mean, they could pay for it, right?
Microsoft could pay for it on the current time scale,
right? Oh, what's, you know,
their capex going from $50 billion
to $80 billion
direct capex, and then another 20 billion
across, like, Oracle, CoreWeave,
you know, and then, like, another, like,
10 billion across their data center partners.
They can afford that, right,
through next year, right?
But then,
that doesn't... You know, like, this is because
Microsoft truly believes in OpenAI. They may have
doubts, like, holy shit, we're taking a lot of credit risk.
You know, obviously, they have to message Wall Street, all these things, but they are not like,
that's like affordable for them because they believe they're a great partner to Open AI that
they'll take on all this credit risk.
Now, obviously OpenA has to deliver.
They have to make the next model, right?
That's way better.
And they also have to raise the money.
And I think they will, right?
I truly believe, from, like, how amazing GPT-4o is, how small it is relative to GPT-4.
The cost of it is so insanely cheap.
It's much cheaper than the API prices lead you to believe.
And you're like, oh, what if you just make a big one?
It's, like, very clear to me what's going to happen on the next jump, that they can then raise this money
and they can raise this capital from the world.
This is intense.
That's very intense.
John, actually, if he's right or I don't know, not him, but like in general, if like the capabilities are there, the revenue is there.
Revenue doesn't matter.
Revenue matters.
Is there any part of that picture that still seems wrong to you in terms of like displacing so much of TSM production,
and like,
power and so forth,
does any part of that seem wrong to you?
I can only speak to the semiconductor part,
even though I'm not an expert,
but I think the thing is, like, TSMC can do it. Like, they'll do it.
I just wonder, though. He's right in the sense that '24, '25, that's covered.
Yeah.
But '26, '27, that's the critical point where you have to say: can the semiconductor industry, and the rest of the industry, be convinced that this is where the money is? Like, where the money is? And that means, is there money? Is there money by '24 or '25?
How much revenue do you think
the AI industry as a whole
needs by 25
in order to keep scaling?
Doesn't matter.
Compared to smartphones.
Compared to smartphones.
I know he says it doesn't matter.
I'll get to it.
You keep... I know.
What do smartphones do? It's like, Apple's revenue is, like, 200-something billion dollars.
So like...
Yeah, it needs to be another
smartphone size opportunity, right?
Like, even the smartphone industry
doesn't drive this sort of growth.
Like, it's kind of crazy, don't you think?
So today it's so far off, right?
The only thing I can really perceive?
Yeah, a girlfriend.
But like, it's like...
But you know what I mean?
It's not there.
I want a real one, damn it.
So, so like, few things, right?
The return on invested capital for all of the big tech firms is up since 2022.
Yeah.
And therefore, it's clear as day that them investing in AI has been fruitful so far, right?
Wait, wait.
For the big tech firms.
Return on invested capital.
Like, financially, you look at the Metas, you look at the Microsofts, you look at the Amazons, you look at the Googles. The return on invested capital is up since 2022.
So it's...
On AI in particular?
No, just generally as a company.
Now, obviously, there's other factors here.
Like, what is meta's ad efficiency?
How much of that is AI, right?
Super messy.
That's a super messy thing.
But here's the other thing.
This is Pascal's wager, right?
This is a matrix of like, do you believe in God?
Yes or no?
If you believe in God, yes or no, like hell or heaven, right?
So if you believe in God and God's real and you go to heaven, that's great.
That's fine.
Whatever.
If you don't believe in God and God is real, then you're going to hell.
This is the deep technical analysis you subscribe to SemiAnalysis for.
I think this is just, this is just me ripping.
Can you imagine what happens to the stock if Satya starts talking about Pascal's wager?
No, no, but this is psychologically what's happening, right?
This is, if I don't... and Satya said it on his earnings call:
The risk of underinvesting is worse than the risk of overinvesting.
He has said this word for word, this is Pascal's wager.
This is, I must believe I am AGI pill because if I'm not and my competitor does it, I'm absolutely
fucked.
Oh, okay. Other than Zuck, who seems pretty convinced?
Sundar said this on the earnings call. So Zuck said it, Sundar said it. Satya's actions on credit risk for Microsoft show it; he's very good at PR and, like, messaging, so he hasn't, like, said it so openly, right? Um, Sam believes it. Dario believes it. You look across these tech titans, they believe it. And then you look at the capital holders: the UAE believes it, Saudi believes it. How do you know the UAE and Saudi believe it? Like, all these major companies and capital holders also believe it, because they're putting their money here. But, but that's...
Like, how can, like, it won't last.
It can't last unless there's money coming in somewhere.
Correct, correct.
But then the question is... The simple truth is, like, GPT-4 cost like $500 million to train.
I agree.
And it has generated billions in recurring revenue.
But in the meantime, OpenAI raised $10 billion or $13 billion and is building a, you know, a model that costs that much, effectively, right?
Right.
And so then, obviously, they're not making money.
So what happens when they do it again?
they release and show GPT-5
with whatever
capabilities that make everyone in the world like
holy fuck, obviously the revenue takes time
after you release the model to show up.
You still have only a few billion dollars or
$5 billion of revenue run rate.
You just raise $50 to $100 billion
because everyone sees this like, holy fuck,
this is going to generate tens of billions of revenue.
But that tens of billions takes time to flow in,
right? It's not an immediate click.
But the time where Sam can convince,
and not just Sam, but people's decisions
to spend the money are being made,
are then, right?
Like, so therefore, like, you look at the data centers, people are building.
You don't have to spend most of the money to build the data center.
Most of the money's the chips, but you're already committed to, like, oh, I'm just going to have so much data center capacity by 2027 or '26 that I'm never going to need to build a data center again for, like, three, four, five years if AI is not real, right?
That's like basically what all their actions are.
Or I can spend over $100 billion on chips in 26, and I can spend over $100 billion on chips
in 27, right?
So these are the actions people are doing.
and the lag on revenue versus when you spend the money, or raise the money, spend the money, build... you know, there's, like, a lag on this.
So, like, you don't necessarily need the revenue in 2025 to support this. You don't need the revenue in 2026 to support this. You need the revenue in '25, '26 to support the $10 billion that OpenAI spent in '23, or Microsoft spent in '23 slash early '24, to build the cluster, on which they then train the model in early-to-mid '24, which they then release at the end of '24, which then starts generating revenue in '25, '26.
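That spend-to-revenue lag can be put into a toy timeline. A minimal sketch in Python, assuming a simple fixed two-year lag and a crude straight-line payback; the $100 billion 2025 figure is a hypothetical placeholder, and only the roughly $10 billion 2023 cluster spend comes from the conversation above:

```python
# Toy model of the capex-to-revenue lag described above: money spent in year N
# builds a cluster, the model trains and ships over year N+1, and the revenue
# that justifies the spend only shows up from year N+2 onward.
# All figures are illustrative assumptions except the ~$10B 2023 spend mentioned above.

CLUSTER_SPEND = {2023: 10e9, 2025: 100e9}  # hypothetical training-cluster capex by year
LAG_YEARS = 2                               # spend -> train -> release -> revenue
PAYBACK_YEARS = 2                           # crude straight-line payback window

def revenue_needed(year: int) -> float:
    """Revenue run-rate needed in `year` to pay back the capex vintage whose
    model is now on the market."""
    vintage = year - LAG_YEARS
    return CLUSTER_SPEND.get(vintage, 0.0) / PAYBACK_YEARS

for year in range(2024, 2029):
    need = revenue_needed(year)
    if need:
        print(f"{year}: ~${need / 1e9:.0f}B/yr needed to cover the {year - LAG_YEARS} cluster")
```

Under these toy numbers the 2023 cluster only needs a roughly $5 billion run-rate by 2025, around the order of magnitude mentioned above; the point is simply that the bar each capex vintage has to clear arrives a couple of years after the check is written.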
I mean, like, the only thing I can say is that you look at a chart with three points on a graph, GPT-1, 2, 3, and even that graph is like... the investment you have to make in GPT-4 over GPT-3 is 100x, and the investment you have to make in GPT-5 over GPT-4 is 100x.
So, like, currently the ROI could be positive, and this very well could be true, I think it will be true, but, like, the revenue has to increase exponentially, not just, like, you know, 10%.
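A minimal sketch of the arithmetic behind that point, assuming the rough 100x-per-generation jump Jon describes and the roughly $500 million GPT-4 training cost mentioned earlier; the revenue level and growth rates below are purely illustrative assumptions:

```python
# If training cost jumps ~100x per generation (Jon's rough claim) while revenue
# grows only modestly, the gap explodes; only exponential revenue growth keeps
# the ratio sane. All figures besides the ~$500M GPT-4 cost are assumptions.

GPT4_COST = 500e6          # rough training cost mentioned above
CURRENT_REVENUE = 3e9      # hypothetical current annual revenue run-rate

next_gen_cost = GPT4_COST * 100  # the claimed 100x jump for the next generation

for label, growth in [("10% revenue growth", 0.10), ("10x revenue growth", 9.0)]:
    next_revenue = CURRENT_REVENUE * (1 + growth)
    ratio = next_gen_cost / next_revenue
    print(f"{label}: cost ~${next_gen_cost / 1e9:.0f}B vs revenue ~${next_revenue / 1e9:.1f}B "
          f"-> cost is {ratio:.1f}x revenue")
```

The exact numbers don't matter; the shape does. Linear revenue growth cannot keep up with a 100x-per-generation cost curve, which is Jon's point.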
I agree with you. I agree with Dylan that it can be achieved, the ROI. Like, TSMC does this: invest $16 billion and expect an ROI down the line, right? That, I understand. That's fine. The lag, all that. The thing that I don't accept is... GPT-5 is not here. It's all dependent on GPT-5 being good. If GPT-5 sucks, if GPT-5 doesn't blow people's socks off, this is all void.
What kind of socks are you wearing, bro?
Show them.
Show them, AWS.
GPT-5 is not here.
It's late.
We don't know.
I don't think it's late.
I think it's late.
I want to zoom out
and like go back
to the end of the decade
picture again.
So if you're... if this picture... We've already lost John. We've already accepted GPT-5 will be good.
Hello?
But yeah,
You got it, you know?
Yeah, you got it.
Bro, like, life is so much more fun when you just, like, are delusionally, like, you know?
We're just ripping bongs, aren't we?
When you feel the AGI, you feel your soul.
This is why I don't live in San Francisco.
I have tremendous belief in, like, the GPT-5 era.
Because, like, what we've seen already.
I think the public signs all show that this is, like,
very much the case, right?
What we see beyond that is more questionable, and I'm not sure, because I don't know what I don't know, right? Like, we'll see how much they progress. But if things continue to improve, life continues to get radically reshaped for, you know, many people. It's also like, every time you increment up the intelligence, the amount of usage of it grows hugely. Every time you increment down the cost of that amount of intelligence, the amount of usage increases massively. As you continue to push that curve out, that's what really matters, right? And it doesn't need to be, today, a revenue-versus-how-much-capex question at any time in the next few years. It just needs to be: did that last humongous chunk of capex make sense for OpenAI, or whoever the leader was? And then how does that flow through, right? Or were they able to convince enough people that they can raise this much money, right? Like, you think Elon's tapped out his network after raising $6 billion? No. xAI is going to be able to raise 30-plus, right? Easily, right? I think so. You think Sam's tapped out? You think Anthropic's tapped out? Anthropic's barely even diluted the company, relatively,
right? Like, you know, there's a lot of capital to be raised just from, call it FOMO if you want, but, like, during the dot-com bubble, the private industry flew through, like, $150 billion a year. We're nowhere close to that yet, right? We're not even close to the dot-com bubble, right? Why would this bubble not be bigger, right? And if you go back to the prior bubbles, the PC bubble, the semiconductor bubble, the mechatronics bubble throughout the U.S., each bubble was smaller. You know, whether you call it a bubble or not, why wouldn't this one be bigger?
How many billions of dollars a year is this bubble right now?
For private capital?
Yeah.
It's like 55, 60 billion so far.
For this year, it can go much higher, right?
And I think it will next year.
Okay, so let me think of it.
You didn't finish the bong rip.
You know, at least, like, finishing up and looping into the next question: prior bubbles also didn't have the most profitable companies humanity's ever created investing, and they were debt-financed. This is not debt-financed yet, right? So that's the last little point on that one. Whereas the '90s bubble was, like, very debt-financed. This is, like, cash flow.
Yeah, sure, but so much was built, right? You know, you've got to blow a bubble to get real stuff built.
It is an interesting analogy
where, like, even though the dot-com bubble obviously burst and, like, a lot of companies went bankrupt, they in fact did lay out the infrastructure that enabled the web and everything. So you could imagine, in AI, it's like a lot of the foundation model companies or whatever, like, a bunch of companies will, like, go bankrupt, but, like, they will enable the singularity.
At the end of the 1990s, at the turn of the millennium, there was an immense amount of money invested in, like, MEMS and, like, optical technologies, because everyone expected the fiber bubble to continue, right? That all ended around 2002, 2003, right? And that started in '94? There hasn't been a revitalization since, right? Like, that's... you could risk the possibility of a...
Lumen, one of the companies that's doing the fiber buildout for Microsoft, the stock, like, fucking 4x'd last month, or this month. And then how has it done since 2002? Oh no, horrible, horrible. But, like, we're going to rip, baby. You could... rip that bong, baby.
You could freeze AI for another two decades.
Sure, sure, possible. Or people can see a badass demo of GPT-5 at release and raise a fuckload of money. It could even be, like, a Devin-like demo, right? Where it's, like, complete bullshit, but, like, it's fine, right? Like... shit, I should...
Edit that out.
No, it's fine. That's fine, dude. I don't really care.
You know, it's, it's, the capital is going to flow in, right?
Now, whether the, whether deflates or not is like an irrelevant concern on the near term,
because you operate in a world where it is happening.
And being... you know, what is the Warren Buffett quote? Which is, like... you can be... I don't even know if it's Warren Buffett.
You don't know who's swimming naked until the tide goes out.
No, no, no, the one about, like, um, the market can stay irrational far longer than you can remain solvent, or something like that. That's not Buffett. That's not Buffett. Yeah, yeah. That's John Maynard Keynes. Oh shit, that's that old? Yeah. Okay. Okay, so Keynes said it, right? So this is the world you're operating in. Like, it doesn't matter, right, what exactly happens? There will be ebbs and flows, but, like, that's the world you're operating in.
Um, I reckon that if the AI bubble pops, each one of these CEOs loses their job.
Sure. Or if you don't invest and you lose... it's the Pascalian wager, and that's much worse. Across decades, the largest companies at the end of each decade... that list changes a lot.
Yeah.
And these companies are the most profitable companies ever. Are they going to let themselves lose it, or are they going to go for it? They have one shot, one opportunity, you know, to make themselves into the whole Eminem song, right?
I want to hear, like, the story of how both of you started your businesses, or, like, the thing you're doing now. Um, John, like...
How did it begin?
What were you doing when you started the podcast?
You're going to tell them about your textile company?
Oh, my God.
No way.
Please, please.
Are you joking?
I guess if he doesn't want to, we'll talk about it later.
Okay, sure.
I think, like, I used to, I mean, the story's famous.
I've told it a million times.
It's like, Asianometry started off as a tourist channel.
Yeah.
So I would go around kind of like, I was, I moved to Taiwan for work, and then...
Doing what?
I was working in cameras.
And then, like, I told...
What was the other?
company you started?
It tells too much about me.
Oh, come on.
I worked in cameras, and then basically, I went to Japan with my mom, and mom was like,
hey, you know, what are you doing in Taiwan?
I don't know what you're doing.
I was like, all right, mom, I will go back to Taiwan and I'll make stuff for you.
And I made videos.
I would, like, go to the Chiang Kai-shek memorial park and be like, hi, mom, this park was this and this. Eventually, at some point, you run out of stuff.
But then it's, like, a pretty smooth transition from that into, like, you know, Chinese history, Taiwanese history. And then people started calling me Chinanometry. I didn't like that, so I moved to other parts of Asia. And now, like... and then...
So what year did you, like, start for...
Like, what year was, like, people started watching your videos,
let's say, like, 1,000 views per video or something?
Oh my gosh, that was not...
I started the channel in 2017, and it wasn't until, like, 2018... 2019 that it actually...
I labored on for, like, the first three years with, like, no one watching.
Like, I'd get, like, 200 views and I'd be like, oh, this is great.
And then, were the videos basically like the ones you have now?
By the way, sorry, backing up for the audience who might not... I imagine basically everybody knows Asianometry, but if you don't: it's, like, the most popular channel about semiconductors, Asian business history, business history in general, even, like, geopolitics, history, and so forth.
And yeah, I mean, honestly, I've done, like, research for different AI guests and different, like, whatever thing I'm trying to... I'm trying to understand, like, how does hardware work? How does AI work? It's like, this is, like, my...
How does the zipper work?
Did you watch that video?
No, I would watch that one.
It was like, I think it was a span of three videos.
It was like, the Russian oil industry in the 1980s and how it, like, funded everything. And then when it collapsed, they were absolutely fucked.
Yeah.
And then the next video was, like, the zipper monopoly in Japan.
Not a monopoly. A strong, strong holding in the mid-tier segment.
There's like the luxury zipper makers.
Asianometry is always just kind of like stuff I'm interested in.
And I'm like interested in a whole bunch of different stuff.
And then the channel... for some reason, people started watching the stuff I do.
And I still have no idea why.
To be honest, I still feel like... I still feel like a fraud.
I sit in front of, like, Dylan and I feel like a fraud, a legit fraud, especially when he starts talking about 60,000 wafers and all that.
I'm just like, I should know this. I should know this.
But, like, you know, in the end it's...
But, but that, you know, I just try my best to kind of bring interesting stories out.
How do you make a video every single week?
Because these are like...
Two a week.
You know how long he had a full-time job?
Five years, six years.
Oh, sorry, a textile business.
And a full-time job.
Wait, no.
Full-time job, textile business, and Asianometry, for a long, long time.
I literally just gave up the textile business this year.
And, like, how are you doing the research and making a video, like, twice a week?
I don't know.
I, like, do these fucking... I'm, like, fucking talking. This is all I do. And I, like, do these, like, once every two weeks.
Sorry.
See, the difference is, Dwarkesh, you go to SF Bay Area parties constantly. And Dwarkesh is... I mean, John is, like, locked in. He's, like, locked in 24/7.
I believe in that TSMC work ethic, and I've got, like, the Intel work ethic.
If I don't... I've got the Huawei ethic. If I do not finish this video, my family... it will be pillaged.
He actually gets really stressed about it, I think, like not doing something like on his schedule.
Yeah.
Is it very much like... I do, I do two videos per week.
I write them both simultaneously.
And how are you scouting out future topics you want to do research?
You just like, you know, you just pick up random articles, books, whatever, and then you just, if you find it interesting, you make a video about it?
Sometimes what I'll do is I'll Google a country, and I'll Google an industry, and I'll Google, like, what a country is exporting now and what it used to export. And I compare that and I say, that's my video.
Or sometimes it's also just as simple as, like, I should do a video about YKK.
And then it's also just... it's also just that simple.
The zipper is nice.
I should do a video about it.
I do.
I do. It literally is. Do you like keep a list of like, here's the next one, here's the one after that?
I have a long list of like ideas. Sometimes it's as vague as like Japanese whiskey.
No idea what Japanese whiskey is about. I'd heard about it before. I watched that movie. And so I was just like, okay, I should do a video about that.
And then eventually, you know, you get to... you move back. How many research topics do you have on the back burner, basically?
Like, you're like, I'm just kind of reading about it constantly, and then, like, in a month or so I'll make a video about it.
I just finished a video about how IBM lost the PC.
Yeah.
So right now, I'm de-stressing about that.
But then I'll kind of move right on to... the videos do kind of lead into others. Like, right now, this one is about the IBM PC, how IBM lost the PC. Next is how Compaq collapsed, how the wave destroyed Compaq. So technically, I'll do that.
At the same time, I'm dual-lining a video about qubits. I'm dual-lining a video about, uh, directed self-assembly for semiconductor manufacturing, which I'll read a lot of Dylan's work for.
But then, like, a lot of that is kind of just in the back of my head, and I'm, like, producing it as I go.
Dylan, how do you work?
How does one go from Reddit shit poster to, like, running a, like, a semiconductor research
and consulting firm?
Yes.
Let's start with the shit posting.
It's a long line, right?
Like, so immigrant parents grew up in rural Georgia.
So when I was eight, I begged for, or seven, I begged for an Xbox.
and when I was eight, I got it, 360, right?
They had a manufacturing defect called the Red Ring of Death.
There were a variety of fixes I tried, like putting a wet towel around the Xbox,
something called the Penny Trick.
Those all didn't work.
My Xbox still didn't work.
My cousin was coming the next weekend and, like, you know, he's like two years older
than me.
I look up to him.
He's, like, in between my brother and me, but I'm like, oh, no, no, we're friends.
You know, you don't like my brother as much as you like me.
My brother's more the jock type. It didn't matter.
So, like, he didn't really care that I broke, that the Xbox was broken.
He's like, you better fix it, though, right?
Otherwise, parents will be pissed.
So I figure out how to fix it online.
I tried a variety of fixes, ended up shorting the temperature sensor.
And that worked for long enough until Microsoft did the recall, right?
But in that, you know, I learned how to do it out of necessity on the forums.
I was a nerdy kid, so I liked games, but whatever.
But then, like, there was no other outlet once I was like, holy shit, this is Pandora's box.
Like, what just got opened up?
So then I just shitposted on the forums constantly, right?
and, you know, for many, many years.
And then I ended up, like, moderating all sorts of Reddits when I was, like, a tween teenager.
And then, like, you know... I grew up in a family business, but didn't get paid for working, right?
Of course, like yourself, right?
But, like, as soon as I started making money, like, when I got my internship, internships, when I was, like, 18, 19, right? I started making money. I started investing in semiconductors, right?
Like, I was like, of course, this is the shit I like, right?
you know, everything from like, and by the way, like the whole way through, like as technology progressed,
especially mobile, right, it goes from like very shitty chips in phones to like very advanced every
generation they'd add something. And I'd like read every comment. I'd read every technical post about it.
And also all the history around that technology. And then like, you know, who's in the supply chain?
And it just kept building and building and building. Went to college, did data-science-y type stuff, went to work on, like, hurricane, earthquake, wildfire simulation and stuff for a financial company.
But during college, I was still... I wasn't posting on the internet as much. I was still posting some, but I was, like, following the stocks and all these sorts of things, the supply chain, all the way from, like, the tool and equipment companies.
And the reason I liked those is because, like, oh, this technology, oh, it's made by them. You know, you kind of...
Did you have, like, friends in person who were into this shit?
Or was it online?
I made friends on the internet, right?
Oh, that's dangerous.
Not.
I've only ever had, like, literally one bad experience.
And that was just because he's drugged out, right?
Like, one bad experience online, or, like, meeting someone from the internet in person. Everyone else has been genuine. Like, you have enough filtering before that point. You're like, you know, even if they're, like, hyper mega, like, autistic, it's cool, right? Like, I am too, right? You know, no, I'm just kidding.
Um, but, like, you know, you go through, like, the layers, and you look at the economic angle, you look at the technical angle. Um, you read a bunch of books just out of, like... you know, you can just buy engineering textbooks, right, and read them, right? Like, what's stopping you, right?
And if you bang your head against the wall, you learn it, right?
And why were you doing this? Was there, like... did you expect to work on this at some point?
Or was it just, like, pure interest.
No, it was like, it was like obsessive hobby of many years.
And it pivoted all around, right?
Like, at some point, I really like gaming.
And then I moved into, like, I really liked phones and, like, rooting them and, like,
underclocking them and the chips there and, like, screens and cameras.
And then back to, like, gaming and then to, like, data center stuff.
Like, because that was, like, where the most advanced stuff was happening.
So it was like, I liked all sorts of, like, telecom stuff for a little bit.
Like it was like it like bounced all around.
But generally in like computing hardware, right?
And I did data science. You know, I said I did AI when I interviewed, but, like, you know, it was, like, bullshit multivariable regression, whatever, right? It was simulations of hurricanes, earthquakes, wildfires for, like, financial reasons, right?
Like, anyways... I moved on. You know, I had a job for three years after college, and I was posting and, like, whatever.
I had a blog, an anonymous blog, for a long time.
even made like some YouTube videos and stuff.
Most of that stuff is scrubbed off the internet, including the Internet Archive, because I asked them to remove it.
But, like, in 2020, I, like, quit my job and, like, started shitposting more seriously on the internet.
I moved out of my apartment and started traveling through the U.S.
And I went to all the national parks like in my truck slash, like tent slash, you know,
also stayed in hotels and motels like three, four days a week.
But I'd like, I started posting more frequently on the internet.
I mean, I'd already had like some small consulting arrangements in the past, but it really started to pick up in mid-2020, like consulting arrangements from the internet from my persona.
Like what kinds of people, investors, hardware companies?
It was, like, people who weren't in hardware that wanted to know about hardware.
It would be, like, some investors, right? A couple of VCs did it, but also some public-market folks.
You know, there were times where, like, companies would ask me about, like, three layers up in the stack from them, because they saw me write some random post, like, hey, can we... blah, blah, blah.
Right?
There's all sorts of, like, random.
It was really small money.
And then in 2020, like, it really picked up.
And I just, like, I was like, why don't I just arbitrarily make the price way higher?
And it worked.
And then I started posting... I made a newsletter as well.
And I kept posting, um, and the quality kept getting better, right?
Because people read it.
They're like, this is fucking retarded.
Like, you know, you're supposed to actually be right.
You know, over more than a decade, right?
And then in 2021, towards the end, I made a paid post, partially paywalled, like, you know, for a report or whatever, right? That ended up doing... I went to sleep that night. It was about photoresist and, like, the developments in that industry, which is the stuff you put on top of the wafer before you put it in the litho tool, the lithography tool. Um, it did great, right? Like, I woke up the next day and I had, like, 40 paid subscriptions. I was like, what? Okay, let's keep going, right?
And, like, let's post more paid... sort of, like, partially free, partially paid. Did, like, all sorts of stuff on, like, advanced packaging and chips and data center stuff and, like, AI chips, like, all sorts of stuff, right, that I was interested in, that was interesting. And I always bridged it economically, because I've read all the companies' earnings since I was, like, 18, and I'm 28 now, right? You know, all the way through to, like, all the technical stuff that I could.
Um, in 2020 I also started to just go to every conference I could, right? Um, so I go to, like, 40 conferences a year. Not, like, trade-show-type conferences, but, like, technical conferences. Like, on chip architecture, photoresist, you know, AI, NeurIPS, right? Like, you know, ICML.
How many conferences do you go to a year? Like, 40. So you, like, live at conferences?
Yes. Yeah. I mean, I've been a digital nomad since 2020, and I've basically stopped, and I moved to SF now, right? But, like, kind of... kind of, not really.
You can't say that. The government... the California government.
I don't live in SF, come on. But I basically do now.
Carrene, Internal Revenue Service.
Oh, do not joke about this, guys.
Like, do not seriously joke about it.
They're going to send you a clip of this podcast, be like 40% please.
I am in San Francisco, like, sub-four-months a year contiguously, you know.
Exactly 100-and-whatever days.
Exactly 179 days, let's go, right?
Like, you know, over the full course of the year.
But no, like, you know, go to every conference, make connections at all these, like, very technical things. Like the International Electron Devices Meeting. Oh, lithography and advanced patterning. Oh, like, VLSI, very large scale integration. Like, you know, the circuits conferences. You just go to every single layer of the stack. It's so siloed. There are tens of millions of people that work in this industry. But if you go to every single one, you try and understand the presentations, you do the required reading, you look at the economics of it. You, like, are just curious and want to learn. You can start to build up more and more, and the content got better, and, like, you know, what I followed got better.
And then, like, started hiring people in 2020... and early 2022 as well. Um, or it might have been, yeah, yeah, like, mid-'22 we started hiring,
got people in different layers of the stack. But now, today, like, you fast-forward to today, right?
Like, uh, almost every hyperscaler is a customer, not for the newsletter, but for, like, data we sell, right? Um, you know, many major semiconductor companies, many investors, right? Like, all these people are, like, customers of the data and stuff we sell.
Um, and the company has people all the way from, like, ex-Cymer, ex-ASML, all the way to, like, ex-Microsoft and, like, an AI company, right? Like, you know, through the stratification. You know, now there are 14 people at the company, all across the US, Japan, Taiwan, Singapore, France... the US, of course, right? Like, you know, all over the world and across many ranges of... and hedge funds as well, right? Ex-hedge-fund people as well, right?
So you kind of have, like, this amalgamation of, like, you know, tech and finance expertise, and we just do the best work there, I think.
So we're talking about a monstrosity?
An unholy concoction of it.
So, so, like... we have data, analysis, consulting, et cetera, for anyone who really wants to get deeper into this, right? Like, we can talk about, like, oh, people are building big data centers, but, like, how many chips are being made every quarter, of what kind, for each company? What are the subcomponents of these chips? What are the subcomponents of the servers? Right? We try and track all of that. Follow every server manufacturer, every component manufacturer, every cable manufacturer, just, like, all the way down the stack to the tool manufacturers, and, like, know how much is being sold where and how, and where things are, and project out, right, all the way out to, like, hey, where is every single data center? What is the pace that it's being built out at? This is, like, the sort of data we want to have and sell. And, you know, the validation is that hyperscalers purchase it and they, like, like it a lot, right? And, like, AI companies do, and, like, semiconductor companies do.
So I think that's sort of, like, how it got from there to where it is: just, like, try and do the best, right? And try and be the best.
If you were an entrepreneur who's like, I want to get involved in the hardware chain somewhere... like, if you could start a business today somewhere in the stack, what would you pick?
John, tell them about your textile business.
I think I'd work in memory.
Yeah.
Something in memory.
Because I think, like, if this concept is there, you have to hold immense amounts of memory. Immense amounts of memory. And I think memory is already tapped out technologically; HBM exists because of limitations in DRAM. Did I say that correctly? I think, fundamentally, we've forgotten about it because it's a commodity, but we shouldn't. I think a breakthrough in memory would change... could change the world in that way.
I think the context here is that Moore's Law was predicted in 1965. Intel was founded in '68 and released their first memory chips in '69 and '70. So a lot of Moore's Law was about memory. And the memory industry followed Moore's Law up until about 2012, where it stopped, right? It's been very incremental gains since then, whereas logic has continued, and people are like, oh, it's dying, it's slowing down. But at least there's still a little bit more, you know, coming, right? You know, still a more-than-10-to-15%-a-year CAGR, right, of growth in density slash cost improvement. Memory has literally been, since 2012, like, really bad.
So when you think about the cost of memory... it's been considered a commodity, but memory integration with accelerators, like, this is something that... I don't know if you can be an entrepreneur here, though. That's the real challenge, because you have to manufacture at some really absurdly large scale, or design something in an industry that does not allow you to make custom memory devices or use materials that don't work that way. So there's a lot of work there. So I, I...
I don't necessarily agree with you, but I do agree it's like one of the most important things for people to invest in.
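As a rough sketch of what those rates compound to, here is a minimal illustration in Python; the 12.5% figure is just the midpoint of the 10-to-15% logic CAGR Dylan mentions, while the 2% memory figure is purely an assumed stand-in for "very incremental gains since 2012," not a number from the conversation:

```python
# Compounding the rough density/cost-improvement rates discussed above.
# 12.5%/yr ~ midpoint of the "10 to 15% a year" logic CAGR; 2%/yr is an
# illustrative assumption for memory's near-flat progress since 2012.

def compound(rate: float, years: int) -> float:
    """Cumulative improvement factor after `years` at a fixed annual rate."""
    return (1 + rate) ** years

years = 12  # roughly 2012 -> 2024
logic = compound(0.125, years)
memory = compound(0.02, years)

print(f"Logic  (~12.5%/yr): {logic:.1f}x improvement over {years} years")
print(f"Memory (~2%/yr, assumed): {memory:.1f}x improvement over {years} years")
# The widening gap between these two curves is why HBM and memory/accelerator
# integration look like such a large opportunity.
```

Under these assumptions logic improves roughly 4x over the period while memory barely moves, which is the gap Dylan and Jon are pointing at.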
You know, I think it's really about where are you good at, where can you vibe, where can you, like, enjoy your work and be productive in society, right? Because there are a thousand different layers of the abstraction stack. Where can you make it more efficient? Where can you utilize AI to build better and make everything more efficient in the world and produce more bounty, and, like, iterate the feedback loop, right? And there is more opportunity today than at any other time in human history, in my view, right? And so, like, just go out there and try, right? Like, what engages you? Because if you're interested in it, you'll work harder, right? If you, like, have a passion for copper wires, I promise to God, if you make the best copper wires, you'll make a shitload of money. And if you have a passion for, like, B2B SaaS, I promise to God you'll make fuckloads of money, right? I don't like B2B SaaS, but whatever, right? It's like, whatever you have a passion for, just work your ass off, try and innovate, bring AI into it, and try and use AI yourself to, like, make yourself more efficient and make everything more efficient. And I promise you will, like, be successful, right?
I think that's really the view. It's not necessarily that there's one specific spot, because every layer of the supply chain has... you go to the conference there, you go talk to the experts there, and it's like, dude, this is the stuff that's breaking and we could innovate in this way. Or, like, these five abstraction layers, we could innovate this way. Yeah, do it. There are so many layers where we're not at the Pareto optimum, right? Like, there's so much more to go in terms of innovation and efficiency.
All right.
I think that's a great place to close.
Dylan, John, thank you so much for coming on the podcast.
I'll just give people the reminder: Dylan Patel, semianalysis.com. That's where you can find the technical breakdowns that we've been discussing today. Asianometry, the YouTube channel. Everybody will already be aware of Asianometry, but anyways.
Thanks so much for doing this.
It was a lot of fun.
Thank you.
Yeah.
Thank you.
