a16z Podcast - Dylan Patel: GPT-5, NVIDIA, Intel, Meta, Apple

Episode Date: August 18, 2025

The AI hardware race is heating up, and NVIDIA is still far ahead. What will it take to close the gap?

In this episode, Dylan Patel (Founder & CEO, SemiAnalysis) joins Erin Price-Wright (General Partner, a16z), Guido Appenzeller (Partner, a16z), and host Erik Torenberg to break down the state of AI chips, data centers, and infrastructure strategy.

We discuss:
- Why simply copying NVIDIA won't work, and what it takes to beat them
- How custom silicon from Google, Amazon, and Meta could reshape the market
- The economics of AI model launches and the shift toward cost efficiency
- Infrastructure bottlenecks: power, cooling, and the global supply chain
- The rise of AI silicon startups and the challenges they face
- Export controls, China's AI ambitions, and geopolitics in the chip race
- Big tech's next moves: advice for leaders like Jensen Huang, Sundar Pichai, Mark Zuckerberg, and Elon Musk

Resources:
Find Dylan on X: https://x.com/dylan522p
Find Erin on X: https://x.com/espricewright
Find Guido on X: https://x.com/appenz
Learn more about SemiAnalysis: https://semianalysis.com/dylan-patel/

Stay Updated:
Let us know what you think: https://ratethispodcast.com/a16z
Find a16z on Twitter: https://twitter.com/a16z
Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
Subscribe on your favorite podcast app: https://a16z.simplecast.com/
Follow our host: https://x.com/eriktorenberg

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.

Transcript
Starting point is 00:00:00 NVIDIA is going to have better networking than you. They're going to have better HBM. They're going to have a better process node. They're going to come to market faster. They're going to be able to ramp faster. They're going to have better negotiations, whether it's with TSMC, or SK Hynix on the memory and silicon side, or all the rack people, or copper cables, everything.
Starting point is 00:00:14 They're going to have better cost efficiency. So you can't just do the same thing as NVIDIA. You have to really leap forward in some other way. You have to be like 5X better. Today we're talking AI hardware, chips, and the infrastructure powering the next wave of models with three people at the center of it all. Dylan Patel, founder and CEO of SemiAnalysis,
Starting point is 00:00:34 one of the sharpest voices on chips, data centers, and the economics driving AI's explosive growth. Erin Price-Wright, general partner at a16z, investing in the technologies and infrastructure shaping the future. Guido Appenzeller, partner at a16z, with decades on the front lines of AI, cloud, and networking. From GPT-5's launch to NVIDIA's dominance, custom silicon, and the global race for compute,
Starting point is 00:00:57 we're covering what's happening behind the scenes. Let's get into it. Dylan, welcome to the podcast. Thank you for having me. We've been trying to get you for a while. You're a busy man, but it worked out. Guido, why don't you introduce why we were so excited to have Dylan on the podcast and what we're excited to discuss.
Starting point is 00:01:13 I think, Dylan, you've done an exceptional job in covering what's happening in the AI hardware space, the AI semi space, and now more and more the data center space as well. And just looking at it, currently the most valuable company on the planet is an AI semi company, right? The biggest IPO so far in AI, I think, was an AI cloud company. This is currently where it's happening, right? In any gold rush, in the early days it's the picks and shovels that make money. And I think this is the stage that we're in. So, super excited to have you here today. Awesome. Thank you. Happy to talk about my favorite topics. Amazing. Well, maybe let's start with GPT-5. We just had some of the researchers, Christina and Isabella, on here last week. You said it was
Starting point is 00:01:50 disappointing. Can you share your reactions, or what capabilities you were hoping to see, or overall? I think it depends on what tier of user you are, right? If you're just using GPT-5, and before you were a $20 or $200 a month subscriber, you no longer have access to 4.5, which in my opinion is still a better pre-trained model for certain things. Or you no longer have access to o3, which would think for 30 seconds on average, maybe, right? Whereas GPT-5, even when you're using thinking, only thinks for like five to 10 seconds on average, right? Which is an interesting sort of phenomenon. But basically, GPT-5 is not spending more compute per se. The model did get a little bit better on a vanilla basis, right?
Starting point is 00:02:31 4o to 5 is actually quite a bit better. But when you think about, you know, what is this curve of intelligence, right? It's like, the more compute you spend, the better the model gets. And that's whether it's a bigger model, which GPT-5 isn't, right? You can see it's not a bigger model. It's roughly the same size, you know. Or you think more, right? But again, this is something where OpenAI's first thinking models, you know, the first few generations, o1 and o3, would think for a long time and waste a lot of tokens, if you will. And when you look at, for example, Anthropic's thinking models, even when you put them in thinking mode, they think a lot less, right, to get to the same results or better results, than OpenAI's did. And so
Starting point is 00:03:09 OpenAI, I think, optimized a lot of, like, well, if I ask... I think the silliest one was, I asked o3 once, is pork red meat or white meat? And it thought for like 48 seconds. It's like, what are you doing? Just tell me the answer. And so the nice thing is that GPT-5 will think a lot less, even if you select thinking manually. But more importantly, they have the sort of auto functionality, the router,
Starting point is 00:03:31 which lets them decide whether or not, hey, do I route to the regular model? Do I route to maybe mini if you're out of rate limits or do I route to thinking, right? And how much do I think? But in general, the thinking model will think less. So there's less compute going into a power user's average query than before. But isn't it even more interesting?
Starting point is 00:03:51 that OpenAI can now control how much compute it wants to allocate to you, right? If we're in a high load situation, maybe tune the router a little bit so it's less, right? Maybe. I have no idea what they're doing behind the curtain. But there's this meme out there at the moment that basically all they did, which is a meme, right? It's not true. But all they did is take o3 plus a couple of smaller models, put a router in front, and offer that at the lower branded price, essentially, right?
Starting point is 00:04:14 I think there's a little bit of that, right? Cost suddenly matters, and they figured out a way how they can steer that. I think, yeah. And they talked about how they've been able to dramatically increase their infrastructure capacity. Because I myself was just regularly using o3 or 4.5, right? And now I'm forced to use auto,
Starting point is 00:04:31 which sometimes gives me the o3-equivalent thinking model, but sometimes gives me just the regular base model, which sucks. But I think for the free user, it's actually quite interesting, right? The free user was not getting thinking models pretty much ever, or not using them. Or in many cases, they just
Starting point is 00:04:47 open the website and ask their query, and now sometimes their query gets routed there. So sometimes they get a way better model. But now sometimes OpenAI can gracefully degrade them if they need to, right? And I think the router points to the future of OpenAI as a business, right? Like, you can look at sort of the model companies, right? Anthropic is fully focused on B2B, right? API, code, et cetera, right? Or Claude Code, whatever it is, right? OpenAI, yes, they have that business, the Codex and API business, but really the majority of their revenue is consumer, right? And it's consumer subscriptions. But they have no way to upsell, you know, to make money off of all the
Starting point is 00:05:21 free users, right? In any other consumer app, the free user still pays via ads. But this is not compatible with AI, right? Like, it's a helpful assistant. You can't just make the user's result worse by injecting ads. Banner ads don't really work in AI either. So it's like, how do you now monetize them? And I think with the router, they're getting really close to figuring out how to monetize that user, right? With the new CEO of applications, if you saw the product that she launched at Shopify, I think it was Shopify, it was an agent for shopping, right? And now this immediately clicks, like, oh, if the user asks a low value query, hey, why is the sky blue? Just route them to mini, right? The model can answer perfectly fine.
Starting point is 00:06:01 And that is a chunk of queries, right? But if they ask, what's the best DUI lawyer near me, right? All of a sudden, this is like, you know, you're in jail. You have one shot. You're like, screw it. Let me ask ChatGPT what the best DUI lawyer is. And now all of a sudden, the model's not capable of it today, but soon enough, it'll be able to contact all the lawyers in the area and figure out what their results are and maybe search their, like, court filings and whatever, right? Book the best lawyer for you, or an airplane ticket. Maybe negotiate a cut as part of that. Yeah, of course they're going to take a cut, right?
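The value-based routing described here, a cheap model for low-value queries and the frontier model plus agents for monetizable ones, can be sketched as a toy dispatcher. This is a hypothetical illustration, not OpenAI's actual router: a real system would score queries with a learned classifier rather than keyword rules, and the tier names and costs below are made up.

```python
# Hypothetical model tiers with illustrative per-query costs; a real router
# would score queries with a learned classifier, not keyword rules.
MODELS = {
    "mini": 0.0001,      # cheapest: graceful degradation for free users
    "base": 0.001,       # default model for paying subscribers
    "thinking": 0.01,    # frontier reasoning model plus agentic tools
}

# Toy stand-in for a query-value estimator.
HIGH_VALUE_HINTS = ("lawyer", "flight", "book", "buy", "insurance")

def route(query: str, is_paid_user: bool) -> str:
    """Pick a model tier based on the estimated monetizable value of the query."""
    if any(hint in query.lower() for hint in HIGH_VALUE_HINTS):
        # A take rate on the resulting transaction justifies heavy compute.
        return "thinking"
    if is_paid_user:
        return "base"
    # Low-value free-tier query: cheapest model that answers fine.
    return "mini"

print(route("what's the best DUI lawyer near me", is_paid_user=False))  # thinking
print(route("why is the sky blue", is_paid_user=False))                 # mini
```

The point of the sketch is the asymmetry: the router can spend ungodly amounts of compute exactly where a transaction cut pays for it.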
Starting point is 00:06:29 But this is a much better way of monetizing the free user. It's like, you know, it's like Etsy: 10% of their traffic now comes from ChatGPT. And OpenAI makes nothing off of that. But they really, really will soon, right? And partially that's because Amazon blocks ChatGPT. But there's a way to make money from shopping decisions, whether it's booking flights or looking for items. And to those users you now say: free user, I don't care, I'm going to send you to my best model, I'm going to send you to agents, I'm going to spend ungodly amounts of compute on you, because I can
Starting point is 00:06:57 make money off of this. But if it's a query that's like, help me with my homework, I'll send you a decent model, right? I don't need to spend money on you. And so this is how I think OpenAI can finally make money off of the free user, and I think that's the biggest thing about the router, right? This is super interesting. I think this is the first time that we've seen a launch of a new model where, to some degree, cost is the headline item, right? I mean, so far it was always, who has the smartest model? Who has the highest MMLU score? Now we suddenly have people who use models for coding for eight hours a day
Starting point is 00:07:26 and are surprised that if you take a large context window and the best model, it creates thousands of dollars of cost a month. So cost matters. And so, to some degree, where you are on the Pareto frontier between cost and performance is the new benchmark for model competitiveness, no longer capability alone. Is that what we're seeing here? I mean, I think definitely, right? Like, OpenAI said they doubled their rate limits
Starting point is 00:07:47 for big amounts of users. They've dramatically increased the number of tokens they're serving from this launch, which effectively says this is an economic release. It also means the tokens are all cheaper, right? Yeah, yeah, for sure, for sure. I think the funniest thing about this whole cost thing you mention is, we've seen this in the code space, right?
Starting point is 00:08:03 Cursor had to pull away the unlimited plan. Claude Code, initially they had this super expensive plan and it had unlimited rates, and then there was only a weekly rate limit. Now they have hour-based rate limits. And I saw this crazy, like, thread on Twitter where this guy said he changed his sleep schedule, right? Modeled after how sailors sleep, because if you're sailing, you can't sleep, right?
Starting point is 00:08:24 Like, solo sailing, they'll take power naps when they get to the right spots so that they can still be safe. In the morning when it's not very windy. Well, but they can't sleep uninterrupted, right? And so because Anthropic had to put rate limits that are not just week-based but a number-of-hours-based, he basically sleeps multiple times a day, in small chunks, just so he can maximize the usage. And there's also a leaderboard on Reddit
Starting point is 00:08:48 where people are like competing to see how many tokens they're using through their subscription. And there's like a dude spending like $30,000 a month. So I'm going to find some developer in India that I can do pair programming with so I can get the day cycle, he can get the night cycle and we both can maximize together the quota for the account. Is that the future then?
Starting point is 00:09:04 I mean, but it's clear like people are taking advantage of the negative gross margin, like sort of subscriptions that are offered. I think Anthropic probably makes a positive gross margin off of my subscription. I don't code enough, but there's plenty of people that are definitely losing money. And so, as you said, it's an economic... It'll push more and more to, I think, just usage-based pricing.
Starting point is 00:09:22 If you have an underlying commodity that you're reselling, to some degree, that is that large a part of your cost of goods, right, you need to go to usage-based pricing. How much do you think the, like, customer capture and stickiness for these code products is? I'm curious what you think on that, right? Once you use an IDE, once you integrate one of the CLI products in,
Starting point is 00:09:40 like, how sticky is it? Or is it that people just switch like that? That is a billion dollar question. That's a very conservative estimate. Look, Andrej Karpathy has this great slide where he basically says, if you're building an agentic system today, fundamentally what it is, is this loop, right? Where half of the loop is the model thinking, right?
Starting point is 00:09:57 And then the other half is the user verifying: what did the agent do? Is it the right thing? Providing feedback and trying to steer it in the right direction. Because it can't run forever; eventually you need to steer it back. One half of that is the model provider, right? They're trying to build the best models. The other half is really about, I think, designing the best possible UI to enable the user to give feedback. And I think there's value in that.
Starting point is 00:10:15 So I think there's a certain amount of stickiness in there. So what are all the different tools, like in terms of visual, like say, take code editing, right? How can I most easily visualize what the code changes are? How can it most easily visualize, you know, what they impact, which files? You know, how can I, for small changes get very quick feedback versus for complex ones, you know, get complex feedbacks. There's some tools that actually draw diagrams for you of what they do, right?
Starting point is 00:10:34 So I think this will be the battle. I think there's stickiness in that, right? How much exactly? So in that sense, people should be doing subscriptions to get people locked in, right, instead of moving to usage-based pricing? Well, I think it's the customers that don't want to do usage-based pricing, because it's so hard to predict, it's so easy for it to get away from them. And you actually want guarantees,
Starting point is 00:10:53 and you're willing to commit to pretty high spend in order to not have usage-based pricing. I think it's the model companies that want usage-based pricing. I think with consumers, it's frankly very hard to not have usage-based pricing, just because the variability is so massive, right? If it's us coding versus somebody who does this as their full-time job, you just have a factor of 20 or so difference in usage.
Starting point is 00:11:15 That costs a lot of money, right? I think for enterprises, we could see more flat-fee pricing, because you can average it out more. You have a developer that's using it all day. You kind of know, in a general sense, how many hours a day they're programming and what that sort of looks like. The vibe quotas are harder.
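The flat-fee versus usage-based tension reduces to a break-even calculation: a subscription is profitable for the provider only while a user's metered serving cost stays under the fee. A toy comparison with illustrative prices, not any vendor's actual rates:

```python
def provider_margin(tokens_m: float, flat_fee: float = 200.0,
                    price_per_m: float = 10.0) -> float:
    """Provider's gross margin on a flat subscription vs. metered serving cost.

    All prices are illustrative: a $200/month flat fee against an assumed
    $10 per million tokens of serving cost.
    """
    metered_cost = tokens_m * price_per_m
    return flat_fee - metered_cost  # negative = provider loses money on this user

print(provider_margin(5))     # casual user, 5M tokens: 150.0 (provider profits)
print(provider_margin(3000))  # all-day agentic coder, 3B tokens: -29800.0
```

With a 20x or more spread in usage between casual and full-time users, a single flat price can't be simultaneously attractive to light users and sustainable for heavy ones, which is why the enterprise averaging argument above works and the consumer one doesn't.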
Starting point is 00:11:30 Yeah. Before we leave OpenAI, I want to ask a broad question, which is: if Sam Altman was sitting here and saying, hey, Dylan, I'll listen to anything you tell me to do, any advice you have, as long as it makes OpenAI more valuable, what would you tell him? I would say: immediately launch a method for you to input your credit card into ChatGPT and agree that for anything it agentically does for you, it'll take an X% cut,
Starting point is 00:11:51 and then launch the product where it does shopping, right? Because everyone knows that Anthropic and OpenAI and all the other labs are buying RL environments of Amazon and of Shopify and of Etsy and of all the different ways to shop on the internet. Oh, and of airline websites, right? Now it's just like, hey, integrate my calendar, I want to fly there on Thursday, make sure I don't miss a meeting, cool, book it, right?
Starting point is 00:12:15 Do that integration super well, know my preferences on whether I like aisle or window, all this stuff, right? And just take a take rate. I think this will make them so much money the moment they launch it, and I think they're working on it already. But I'd like to hear how he thinks about it, because he has shifted his tone massively on ads over the last six months, right?
Starting point is 00:12:33 He used to be like, no way, and now he's like, maybe, you know, there's a way to do it without harming the user. And I think this is how you monetize the free user, right? So I think that's probably what I'd tell him, slash ask him about, like a whole line of questions around this. Well, he's coming on the podcast in a few weeks. So we'll ask him.
Starting point is 00:12:50 I want to shift to NVIDIA. NVIDIA is having a monster year. They're up almost 70%. What are the possible paths from here? How do you see it playing out? Depends, like, how pilled you are on the continued growth. But I think you guys have a good vantage point. We have a good vantage point of how fast revenue is growing for a lot of these
Starting point is 00:13:07 companies, especially the code companies, but even many other applications. I think we can clearly see the demand side is accelerating, right? And then if you look at the training side, I think the race is on. That is upping hugely. Google's upping hugely. If you just look at, again, just OpenAI and Anthropic, and the compute that they have and are getting this year, from Google and Amazon for Anthropic, and from Microsoft, CoreWeave, and Oracle for OpenAI, 30% of the chips are going to them, just those two companies.
Starting point is 00:13:42 But then it's like, okay, well, the other 70% of the stuff, who's that going to? Well, one third of it is like ads, right? Whether it be ByteDance or Meta or many of the other people who are doing ads. So then it's still like, okay, well, where are the rest of these, the last third of the chips, going? Well, mostly to uneconomic providers, where I don't think it's an obvious bet that they're going to, you know, keep raising bigger and bigger rounds. So what happens there? I think with the, you know, we talked about coding earlier, right? Actually, the Qwen3 Coder model is super cheap if you're running it yourself or if you're running it in the cloud with all these inference libraries.
Starting point is 00:14:10 And so, like, there's stuff like that as well. So I think the question is, how much does it keep growing? Because clearly, I think the first third is definitely skyrocketing, right, the OpenAI and Anthropic lab spend. The second third, the ads, is going to grow. It's not going to grow like crazy. But I think there's definitely an inflection point that could be hit with Gen AI ads. I know Meta has been experimenting with it a lot, and I could totally be convinced that there's
Starting point is 00:14:31 going to be a huge inflection in take rate there, right, where you start showing me personalized ads. Like, every person that's in an ad looks like me, and I'll be like, okay, yes, except slightly better, so I feel better, right? And I'm like, I want to buy it, yeah. I have no idea how this is going to scale, right? But if you ask the question, how much could it scale, right? How much value are we creating here?
Starting point is 00:14:49 Can we create enough value to actually keep growing for a long time? If you just take AI software development, right? Yeah. We know we can easily get about 15% more productivity out of a developer. I don't think that's right. I think it's way higher. No, no, but the straight... like, I talked to a lot of enterprises. A classical enterprise, straight-up GitHub Copilot deployment, that gives you about 15%.
Starting point is 00:15:08 We can do much more than that. But, bro, like, you know how bad GitHub Copilot is? Did they look at their revenue ARR chart? It's so funny. It's so funny if you look at the revenue ARR chart. Claude Code in three months has surpassed them. Cursor, you know, easily surpassed them. And then even companies like Replit, and Windsurf slash Cognition,
Starting point is 00:15:27 are going to pass them. Like, you can pretty much see what's going on. So, look, let's assume we can get this to 100%. Yeah. Say we can double the productivity
Starting point is 00:15:38 of a developer, right? About 30 million developers worldwide, give or take. Yeah. Let's say $100K of value add per developer. It might be a little high worldwide; for the U.S. it's low, but worldwide it's high. So it's $3 trillion. Yeah, yeah. Right. So we're probably building technology here, which adds $3 trillion of GDP value.
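The back-of-envelope math here is straightforward; both inputs are the stated assumptions from the conversation, not data:

```python
# Assumptions as stated: double every developer's productivity,
# and value that doubling at $100K per developer.
developers = 30_000_000        # ~30M developers worldwide, give or take
value_add_per_dev = 100_000    # $100K of added value each

total_value = developers * value_add_per_dev
print(f"${total_value / 1e12:.1f} trillion of GDP value")  # $3.0 trillion
```

The conclusion only scales the two inputs, so halving either assumption halves the headline number.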
Starting point is 00:15:51 In theory, we could put that into GPUs, because that's the main cost. Just from a coding model. Just from a coding model. Ignoring every other use case. So at least in theory, the value generation is here to keep growing, right? Now, how that translates through the industry is much more complicated. I think we've already
Starting point is 00:16:06 seen AI's value creation exceed the spend. So there's the whole famous, oh, $300 billion problem, or $200 billion problem; now it's the $600 billion problem, I'm sure Sequoia's going to put out the $1.2 trillion problem soon enough. But there is some reality in that, of course. But it ignores that infrastructure spend today is
Starting point is 00:16:24 accounting for five years of revenue, not one. And the revenue looks like this, not like a flat line. But I think the main thing is that AI is already generating more value than the spend; it's that the value capture is broken, right? Like, I legitimately believe OpenAI is not even capturing 10% of the value they've created in the world already,
Starting point is 00:16:45 just by usage of ChatGPT, right? And I think the same applies to, you know, Anthropic and Cursor and whoever else you're looking at. I think the value capture is really broken. Even internally, I think what we've been able to do with four devs in terms of automation, like, our spend on the Gemini API is absurdly low, and yet we go through every single permit
Starting point is 00:17:08 and regulatory filing around every single data center with AI. And we take satellite photos of every data center, and we're able to label our data set and then recognize what generators people are using, what cooling towers, the construction progress, substations, all this stuff. It's automated, and it's only possible because of Gen AI. And we do it with very few developers. And then the value capture that I'm able to generate by selling
Starting point is 00:17:32 this data, by consulting with it, is so high, but the company making the model gets, like, nothing out of it, right? Like, I think there's a value capture challenge here; the value created far exceeds what's captured, right? And as you get models like GPT-5, or open source models continuing to drive it down, value capture is just harder and harder and harder for these companies, because they're making, you know, 50% gross margin on inference, or less in many cases.
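The pipeline being described, labeling satellite imagery of data centers with a vision model and rolling the labels up into a survey, might look roughly like the sketch below. The `label_image` stub is hypothetical, standing in for whatever vision-model API is actually used; this is not SemiAnalysis's real code, and the category names are just the items mentioned in the conversation.

```python
import json

# Labels mentioned above: generators, cooling towers, substations,
# construction progress. Everything here is an illustrative sketch.
CATEGORIES = ["generators", "cooling_towers", "substation", "construction_progress"]

def label_image(image_path: str) -> dict:
    """Hypothetical stand-in for a vision-model call that returns
    structured labels for one satellite photo."""
    prompt = (
        "Identify the following in this satellite image of a data center, "
        f"answering as JSON with these keys: {CATEGORIES}"
    )
    # response = vision_model.generate(prompt, image=open(image_path, "rb"))
    # return json.loads(response.text)
    raise NotImplementedError("no vision-model client configured")

def survey(sites: dict[str, str]) -> dict[str, dict]:
    """Label every tracked data center site, tolerating missing backends."""
    results = {}
    for name, image_path in sites.items():
        try:
            results[name] = label_image(image_path)
        except NotImplementedError:
            # No model wired up: emit empty labels so the survey still runs.
            results[name] = {c: None for c in CATEGORIES}
    return results

print(json.dumps(survey({"site_a": "imgs/site_a.png"}), indent=2))
```

The design point is that the model call is a tiny, cheap leaf in a loop over every site; the value sits in the aggregated, repeatedly refreshed dataset, not in the per-image inference.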
Starting point is 00:18:02 In so many words, you're saying we're getting commoditized, and therefore you can't capture the value, and thus you should temper your expectations of how much you can spend on GPUs? Well, no, I think you can. I think there's still ways to inflect hugely on value capture, right? Like I mentioned, the ads are a huge value capture.
Starting point is 00:18:20 But that needs to happen before we see a massive increase. No, I think the other thing is there's a lot of capital that's not been spent, right? Like, the hyperscalers can still grow CapEx 20, 30% next year, right, from what they're doing this year. In addition, companies like CoreWeave and Oracle, because they're tapping capital markets,
Starting point is 00:18:40 can raise way more than 20 to 30% CapEx. And then you go down the list further, and it's like, oh, the largest infrastructure funds in the world, like Brookfield and Blackstone, well, actually, they're turning all of their eyes to investing even more into AI infrastructure. And then there are the sovereign wealth funds of the world, like the G42s or, you know,
Starting point is 00:18:59 the Norway one, or GIC in Singapore. These people have barely started touching AI, and so I think there's a whole lot more CapEx that can come without it being necessarily, like, economically motivated day one. I'm also saying economically motivated CapEx can only grow so much, but there's so much other capital where it's not clear, from, you know,
Starting point is 00:19:23 if you have a spreadsheet, you know, and you're basing it on real business, that you should actually spend this much. But people will, because they believe. I believe, I think you believe, like, Infra, you know, people believe that you'll get profit out of it, but there's no 100% certain way to argue it. Yeah. How threatened, if at all, is NVIDIA by custom silicon?
Starting point is 00:19:47 I think that's the biggest thing, right, is when we look at orders from Google and from Amazon especially, and Meta, their custom silicon... not Microsoft, their custom silicon kind of sucks. But the other three, they're really upping their orders massively over the last year. You know, Amazon is making millions of Trainium. Google's making millions of TPUs. TPUs clearly are, like, 100% utilized, right?
Starting point is 00:20:15 Trainium's not there, but I think Amazon will figure out how to do that, and Anthropic will. So I think the biggest threat to NVIDIA is that people figure out how to use custom silicon more broadly. And this sort of becomes, if AI is concentrated, then custom silicon will do better. And that's not even talking about OpenAI's silicon team and stuff, right? Like, if AI is really concentrated, then custom silicon will do better.
Starting point is 00:20:44 But if it gets dispersed broadly, because there's all these open source models from China and there's all these open source software libraries from, you know, NVIDIA and China, and it makes the deployment costs rock bottom, then potentially... If Google's TPU is able to compete with NVIDIA, in theory, it could do it on the open market. NVIDIA is worth more than Google these days.
Starting point is 00:21:08 Shouldn't Google start selling the chips to everyone? I mean, in theory, they should be able to achieve a higher market cap. I absolutely think so. I think Google's even discussing it internally. I think it would require a big reorg of culture, and a big reorg of how Google Cloud works and how the TPU team works,
Starting point is 00:21:27 and how the JAX software team and XLA software teams work. I totally think they could. It would just take them, like, shaking themselves pretty hard to be able to do it. Yeah, but I totally think Google should sell TPUs externally. Not just renting them, but, like, physically. It's kind of funny if a side hobby, in theory, has a higher company value potential than your
Starting point is 00:21:52 entire business, especially as you think about the degradation of search. So I think, yeah, but I think if you were to ask Sergey, right, like, hey, do you think selling chips and racks is more valuable, or a cloud, or Gemini, he'd be like, no, no, no, no, no, Gemini is going to be worth way, way, way more. It's just not yet today, right? And so today you say NVIDIA is the most valuable; again, it's the whole concentration thing, right? If the world is super concentrated in terms of customers,
Starting point is 00:22:22 then NVIDIA will not be the most valuable company in the world, right? But if it gets dispersed more and more, which arguably we're starting to see, with a lot of these open source models getting better and better, and with the ease of deploying them getting better, then, I think you could argue, NVIDIA will remain the most valuable company in the world for a long period of time. Historically, no pun intended, software has eaten the world
Starting point is 00:22:48 in most markets, right? I mean, if you look at the early networking days, Cisco was the most valuable company on the planet, right? For a while; it's no longer, right? The guys that built services on top, like Google or Amazon or Meta, eventually eclipsed it. Which is why NVIDIA is making all these software libraries, right? They're trying to commoditize inference, right? Like, you guys don't, I think, even have an inference API provider investment, do you? Well, we have all kinds of model providers. Model providers, but I'm talking about a
Starting point is 00:23:08 pure API provider investment, I think, right? Is that correct? I think I talked to one of the team members, maybe Rajko or someone, about why you guys didn't invest in, like, a Together or a Fireworks. And sort of the argument was, well, we think just serving models alone, without making
Starting point is 00:23:36 them, will sort of be commoditized. Yeah. Right? We have some in the stable diffusion ecosystem. Like with Fal. With Fal. Yeah. It's a little bit different dynamics there, I think.
Starting point is 00:23:46 They tend to make much more component models than the LLM folks; I think that's a little different. But, like, you guys don't have one of these, like, you know, Baseten or any of these sort of API investments, because you think, this is from someone on the Infra team, that it'll get commoditized, because of the software NVIDIA is making, and because of vLLM and SGLang, which is open-source software
Starting point is 00:24:05 coming out of Berkeley, and now, you know, sort of has their own environments, and is supported by many. This being commoditized means that API providers aren't necessarily worth a ton, right? That is sort of your argument, maybe. I think that's relevant to this whole thing, which is, you know, why, right? Like, why would you do this?
Starting point is 00:24:25 Shifting gears, what about the silicon startups? What's your take on those? I mean, there's a ton of capital flowing into that. We've seen, I don't have numbers, but probably billions being invested in chip startups. Yeah, for sure, for sure. I mean, like, whether you're looking at, like, you know, companies like, I think it's, like, pretty impressive
Starting point is 00:24:44 that a few companies like Etched and Rivos, and a number of other companies, you know, MatX and others, like, have gotten the amount of funding they've had without even launching a chip, right? You know, in the past, like, yes, silicon companies would make money or raise money,
Starting point is 00:25:01 but they would at least launch a chip before they'd get a, you know, a big round. But, like, Etched and Rivos, like, have raised, you know, a lot of money without ever launching a chip publicly, which I think is, I mean, it speaks to, well, like, yes, silicon is super capital-intensive if you're building a chip, an accelerator, which has so many moving pieces.
Starting point is 00:25:22 And there's, like, 10 different AI accelerator companies out there, right? Like, that are newish in the last few years. I think there's a lot more. That are like, yeah, yeah, yeah, that's fair. And then there's the old guard, which continues to raise money, right? Like Groq and Cerebras and SambaNova and Tenstorrent and so on and so forth, right? Or Graphcore getting bought out by SoftBank and SoftBank dumping money into this effort
Starting point is 00:25:45 as well, right? There's a lot of capital being invested to displace sort of NVIDIA's top position. But it becomes challenging, right? It's like, how do you beat NVIDIA, right? Like the hyperscalers, I think, are, like, kind of lucky in that they can do mostly the same thing as NVIDIA, right? They have a customer, which is themselves.
Starting point is 00:26:07 So that's a huge advantage. They can just win on supply chain, right? Like, I'm using cheaper providers. It's a margin compression exercise, essentially. Yeah, yeah. And maybe for certain workloads, like Meta for recommendation systems, they'll have a better, you know, they can specialize more. But for the most part, it's like, no, we're targeting the same workloads. We can just simplify the supply chain or
Starting point is 00:26:25 or in-house a lot of it and compress margin. It'll be fine. But in the case of, you know, these other companies, it's like, well, they don't have a captive customer. So now you have to contend with, well, I'm using the same ecosystem. And either I can use some custom silicon provider who's going to take a margin anyways on top and that's going to compress my like what I can sell for or I can try and in-house everything but then it's like this is really hard right like I'm going to do all the software design I'm going to do all the silicon design I'm going to build all this different IP I'm going to manage the supply chain on chips on racks on everything right ends up being a huge effort in terms of team size um all in the end like hey I make a 75% gross
Starting point is 00:27:09 margin as Nvidia. AMD sells their GPUs for 50% gross margin, and they have a hard time out-engineering Nvidia, and they're great at engineering, right? But they still take more silicon area, more memory to achieve the same performance, and they have to sell for less, so their margin gets compressed.
Starting point is 00:27:27 That makes sense. Look, I think historically, if you look at it, typically new entrants in markets didn't win by marginally improving on something existing. That happens sometimes, but more likely they made some kind of disruptive technology leap, right? It's like, we have a different approach, we have a different technology.
Starting point is 00:27:43 Is that possible here? I mean, to some degree, maybe it's oversimplifying a little bit, but I think part of the reason why the transformer model won was because it runs so incredibly great on GPUs, right? Like, a recurrent neural network is similarly performing, it looks like, but it runs terribly on a GPU. So did we sort of pick the model for an architecture, and now it's hard to come up with an architecture that, you know, really...
Starting point is 00:28:06 Well, it's hard to... Software co-design, right? Like, there's all this hype about neuromorphic computing, right? Like, theoretically, it's amazing and super efficient. It's like, okay, great. Like, there's no ecosystem of hardware. There's no ecosystem of software. It would take, like, you know, tens of thousands of people
Starting point is 00:28:21 who are the best at AI today focusing on that to even prove out if it's worthwhile or not, right? On a hardware side, on a software side, on a model side. And so, like, you look at, like, Groq, Cerebras, SambaNova, they all, like, sort of over-indexed to the models that were leading at the time when they designed their chips. And so they made certain trade-offs, right? They put a lot more memory on chip.
Starting point is 00:28:42 And Nvidia was like, well, we're not going to do that. A lot faster, at least, right? Well, more, like, if you compare the amount of SRAM on Nvidia's chips, it's much, much lower. Yes, correct. They went SRAM instead of DRAM. But then they usually have less DRAM, so there's a trade-off there as well. Right, there's less DRAM, there's more SRAM.
Starting point is 00:28:58 And because there's more SRAM on the chip, you have to have less compute on the chip. And so they ended up losing, right? Because the model sizes got too big and all this, right? And so you have this, like, super weird dynamic where they bet on something that was actually better, right? Like, I have no doubt that Cerebras would run certain types of models better than Nvidia, or Groq, or, hey, Dojo, right? Tesla's Dojo would run certain types of models way better than Nvidia's chips, because they're optimized to that. But then it's like, oh, well, actually, even in vision tasks, we use vision transformers now.
Starting point is 00:29:31 So it's like, okay, cool. Because model sizes grew and all these things. ends up being a, you know, catch-22 in that, like, you optimize for something. And so now, like, today, you have this new age of AI accelerator companies. They're like, okay, we're going to optimize for transformers. But the time they started designing, they're like, okay, transformers are dense models that are this big. What's the best, you know, the hidden dimension is 8K, and your back sizes are this big,
Starting point is 00:29:53 and your sequence lengths are this big. So let's just make a super large systolic array, so you can, you know, get the maximum efficiency. And that turns out, oh, look at DeepSeek, or, you know, go look at what the labs are doing: actually, their shapes are much smaller. Actually, you need to do a bunch of small matrix multiplies, not massive, massive, you know, singular matrix multiplies per layer.
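The shape-mismatch point can be made concrete with a back-of-envelope sketch (my own illustration, not from the episode): a systolic array sized for big dense matmuls sits largely idle when the workload shifts to smaller multiplies. The array width and matrix dimensions below are assumed for illustration only.

```python
# Illustrative sketch: how much of a square systolic array does useful work
# when an (m x k) @ (k x n) matmul is tiled onto it. Numbers are assumptions.
import math

def utilization(m: int, n: int, array_dim: int) -> float:
    """Fraction of a (array_dim x array_dim) array producing useful output
    elements once the matmul is padded out to full tiles."""
    tiles_m = math.ceil(m / array_dim)
    tiles_n = math.ceil(n / array_dim)
    useful = m * n                                   # real output elements
    provisioned = tiles_m * tiles_n * array_dim**2   # padded tile capacity
    return useful / provisioned

# A big dense layer (hidden dim 8K) fills a 256-wide array perfectly...
print(utilization(8192, 8192, 256))  # -> 1.0
# ...but a skinny matmul of the kind newer models favor leaves most of it idle.
print(utilization(8192, 96, 256))    # -> 0.375
```

The sketch ignores dataflow and pipelining details; it only captures the padding loss Dylan is gesturing at when shapes shrink below the array size.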
Starting point is 00:30:13 And then it ends up, you know, oh, well, that chip you're designing is actually not super effective for that. And so the software is evolving constantly because of what works best on Nvidia. And you see that with, you know, whether it be what DeepSeek's doing or Alibaba's doing or what the labs are doing internally.
Starting point is 00:30:30 And you even see this with Google, right? Like, their open source Gemma models make different decisions because the shapes of a TPU are different than a GPU. And the GPU and the TPU are actually not that far apart, right? Like, you would say, yes, they're very different, but, like, Blackwell and TPUs are converging on similar designs, actually. Whereas, like, to beat Nvidia, you can't just win on supply chain, right? You don't have this captive customer.
Starting point is 00:30:57 So now you need to do something, you know, that will give you a 5x advantage, right, in hardware efficiency for a certain type of workload, and then pray the workload doesn't shift, right? Because Nvidia is also optimizing their architecture every generation. They've added a lot of stuff to make their chips way better for the existing models, but it's like, they're taking, you know, large steps every year, every two years, towards something, whereas you have to, like, go way out in left field and hope that models stay over there, right?
Starting point is 00:31:27 Because you have to win by 5x, because Nvidia is going to have supply chain efficiency over you. They're going to have time to market over you in terms of, like, a new process node or new memory or whatever technology, right? Even AMD, right? They got to 2 nanometer before Nvidia. They had higher density HBM. They use 3D stacking, all these things on supply chain
Starting point is 00:31:49 that should be better than Nvidia, and yet they still lose. They're still the software angle, right? Nvidia's fantastic. Yeah, and then there's software as well, right? But it's like, Nvidia's going to have better networking than you. They're going to have better HBM. They're going to have better processed node. They're going to come to market faster.
Starting point is 00:32:01 They're going to be able to ramp faster. They're going to have better negotiations, whether it's TSMC or SK Hynix on the memory and silicon side, or all the rack people, or, like, copper cables, everything. They're going to have better cost efficiency. So you have to be, like, 5x better. But to be fair, if somebody had a viable competitor, which would even be marginally cost competitive, my guess is many of the big consumers of GPUs would immediately shift some revenue there just to have a number two, right?
Starting point is 00:32:18 Just to have a number two. That's AMD today, right? And Microsoft stopped. I mean, like, there is still pretty limited traction, though. Sure, but Meta continues to buy from them,
Starting point is 00:32:33 and Microsoft did buy a bunch and then they stopped because it's like well yes they're you know AMD's giving you all these advantages but ends up still not being better on a performance per lot basis and they have a way bigger software team they're somewhat competitive on like all these dynamics
Starting point is 00:32:47 that I mentioned right so you can't just like do the same thing as Nvidia you really and do it better right or try and execute better like AMD like you have to really leap forward in some other way but that's the design cycle takes so long that models will shift right? Because they're like, oh, what's the next generation
Starting point is 00:33:03 and TPU and GPU look like? Okay, let's optimize for that. And the research path is, you know, like, great. Like, yes, neuromorphic computing could be the most optimal thing for us to do, but no one's working on that because you have to advance in the tech tree you've chosen. Right? If you restart the tech tree, you're going to be like, well, this sucks. And so, like, if it branches this way and you're over here, you're screwed. Because you have to be five X better. There's a mode.
Starting point is 00:33:27 Because the supply chain stuff means that 5x actually turns into a two and a half x. And then Nvidia can compress their margin a little bit if you're actually competitive. And then that two and a half x becomes, like, 50% better. And then, yeah, so it's like, it ends up being way too difficult. And the software stuff, right? Everything, like, takes your 5x and makes it like, oh, you're actually only 50% better. And then the supply chain, for sure. Yeah, the supply chain.
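Dylan's erosion arithmetic (5x on paper shrinking to roughly 1.5x) can be sketched explicitly. The specific penalty factors below are illustrative assumptions chosen to reproduce his rough numbers, not measured figures.

```python
# Sketch of the advantage-erosion arithmetic described above.
# All factors are illustrative assumptions.

def effective_advantage(raw_perf_advantage: float,
                        supply_chain_penalty: float,
                        nvidia_price_cut: float) -> float:
    """How a raw hardware advantage shrinks once Nvidia's supply-chain
    edge and its room to cut prices are factored in."""
    after_supply_chain = raw_perf_advantage * supply_chain_penalty
    # If Nvidia compresses margin, the perf-per-dollar edge shrinks again.
    return after_supply_chain * nvidia_price_cut

# 5x on paper -> 2.5x after supply chain -> ~1.5x after a price cut.
adv = effective_advantage(5.0, 0.5, 0.6)
print(f"{adv:.1f}x")  # -> 1.5x
```

The point of the sketch is that the multipliers compound: each of Nvidia's structural edges discounts the challenger's headline advantage before software is even considered.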
Starting point is 00:33:49 And then, like, they get that, right? So it's like, and Lutnick himself said we had to do this for rare earth minerals. And it's like, interesting. China, there's, like, provinces in China that have, like, rules that say the H20 is not efficient enough to be deployed, which is, like, super bizarre, because it's clearly the best AI chip China has; Huawei is still a little bit behind. Well, what's interesting is that, you know, efficiency is just so much less of an issue in China than here, because they just have the power infrastructure to be able
Starting point is 00:34:21 to support. So even if they're running less powerful chips, you know, you would imagine that it doesn't really matter because China has just such an infinite supply, infinite supply of power that, you know, they'd sort of be okay with it. So it's interesting. Which is, it's a big challenge in America, right? Like, there have been companies that were like, they would, they've like, you know, Jensen keeps saying he couldn't give away H20 in America for free. But I've literally like heard companies like say, like now say like, yeah, no, I mean,
Starting point is 00:34:49 I wouldn't because like I only have this much power. How am I going, you know, in data centers ready to go over the next year? If I bought an H20, I'd literally have less compute capacity. and then I'd lose, right? Even if it was free. Like, it doesn't make sense. Whereas China doesn't care. They can build these things.
Starting point is 00:35:05 and then I'd lose, right? Even if it was free. Like, it doesn't make sense. Whereas China doesn't care. They can build these things. They have the muscle. I'm curious how this all shakes out. You know, China's posturing really hard. They even, like, put out something that was like, we're investigating to see if there are backdoors in the H20. It's like, there's no backdoor on the H20. Like, chill. You know, GPUs are usually, like, firewalled from the public internet anyways.
Starting point is 00:35:25 Like, you step through stuff before you get to the GPU clusters. so like a back door wouldn't even matter. I don't know. I think it'll be interesting to see because China can definitely deploy way, way, way more power to AI the moment they decide to. But there's these like, there's like competing interest, right? Because they want Huawei to be better than NVIDIA.
Starting point is 00:35:52 Yeah. And then this is how NVIDIA argued to the administration. They're like, if we don't do this... Actually, I think it's, like, a very powerful argument. Like, for example, within Triton, which is a common ML library, ByteDance has open-sourced some stuff that plugs into this that is, like, super awesome, and there's, like, all these other libraries. It's not just models that China open sources, it's, like, software for Nvidia that Chinese companies open source. In a sense, Nvidia's argument is that by
Starting point is 00:36:19 selling GPUs, they were able to, you know, stop Huawei from building up a software ecosystem, and the Western ecosystem is better. But then on the flip side, it's like, again, if you believe the models deliver more economic value to society than the hardware, which I actually think they do, it's just there's a value capture problem today, then you're giving China way more by giving them H20s, and soon a version of Blackwell that's cut down, like Trump said, right? Versus, you know, selling them the chips, right? The economic value derived from selling them the chips is not as large as, you know, being able to somehow sell them AI services.
Starting point is 00:36:56 So is China gatekeeping power for AI? I don't think so. I think, again, like, what we see is that, even with the H20 being sold into China, and future versions of the chip, the H20E and other chips, we still see, like, Chinese companies like Alibaba
Starting point is 00:37:20 renting GPUs outside of China, because the GPUs they can get outside of China are just so much better on a dollar-spent-per-performance basis. Renting them, or even going through sort of, like, a Singaporean company that is effectively a Chinese company, and building data centers and putting chips in them. So it's like, I don't think China's limiting the power per se. It's that, you know, Chinese companies
Starting point is 00:37:43 are growing their CAPEX way more than U.S. companies on a percentage basis next year. The absolute dollar number is, you know, obviously the U.S. companies are spending more still on AI. The percentage basis Chinese companies are growing more next year. And you still have the problem of like, well, dollars spend to AI output in tokens or in whatever is going to be lower because these chips are worse. So power is not the gating factor. It's always capital, right? At least today, right?
Starting point is 00:38:13 Now, China can spend a lot more capital if they wanted to. They're subsidizing the semiconductor industry to the tune of like $150, $200 billion a year through SOEs, through CAPEX that's not generating revenue, et cetera. So it's not like they couldn't do this to the AI ecosystem, right, given, you know, meta's CAPEX is like $60 billion, right? And Google's CAPX is like $80 billion, right? Like they could totally spend way more than that on a single effort. They just haven't decided to.
Starting point is 00:38:38 And I just think for the U.S., our buildouts are constrained by power, right? Like Google has a ton of TPUs sitting waiting for data centers to be powered and ready, as does meta with GPUs, right? We posted about how meta is now building these like effectively tents. Isn't this to some degree also coupled to their unwillingness to sell them to a broader ecosystem? I mean, if they want to be confined in their own data centers and there, you know, didn't ramp data center build out for their own hyper, for their own, some hypercala cases quickly enough, right? Then, yes, that constrains them, right? If they were on the open market, will we still be constrained.
Starting point is 00:39:13 Yeah, yeah, for sure. Because, like, companies like CoreWeave, you know, why is CoreWeave valuable is really because they build infrastructure really fast, right? And their software is nice, I think. but like a lot of their customers are bare muddle, right? Just replace the GPUs whenever they're broken and networked properly. They grew more aggressively. And I think they'll go anywhere. Jensen likes them as well.
Starting point is 00:39:32 Yeah, they'll go. Yeah, yeah, that's very important as well. But they'll like go because it's it, it, it, it, it unconstrates the ecosystem, which is better for Nvidia, right? Have we worked at Intel? I know exactly what's going through his mind. Yeah, so I think what's really important is that like, CoreWeave doesn't care, right? They're like, oh, crypto data center, I will convert it to AI data center, right?
Starting point is 00:39:52 They bought a company from, like $10 billion that's doing crypto mining, which is worth like $2 billion, like a couple years ago. And it's not because they're Bitcoin mining business is growing. It's because they have powered data centers, right? Like anywhere and everywhere, people are trying to build power data centers. And companies like Corweave and Oracle are moving to the... Actually, today, Google just didn't bought 8% of a crypto mining company called Terilwuf, right?
Starting point is 00:40:20 Not because they're getting into crypto mining. No, because they need the data centers, right? They need the power, right? And it's like, all the hyperscalers have, like, said, screw off to my sustainability pledges because they need power as fast as possible, right? They're, you know, they're doing things that are not, that they take a little bit longer to move the ship,
Starting point is 00:40:39 but like, even if you didn't do it in your own self-built data centers, there's still a lot of challenges in the open market. There's a deficit, right? And that's, that's contraining American ship buildouts heavily. Yes, others could maybe do it a little bit faster like Corrieve or others, right? Oracle's got an open mind as well. But it's still contraining U.S. buildouts heavily.
Starting point is 00:41:07 Even though the capital has been spent, right, the chips are, you know, 60 to 80 percent of the cost of the cluster, depending on what chips you're getting. So it's like they've already bought the chips. They just can't put them anywhere because the data centers aren't ready. Supplies to Google, applies to Microsoft, applies to meta, applies to a lot of folks. I mean, it's really hard to build power infrastructure in the U.S. Power, great interconnections, transmission,
Starting point is 00:41:29 substations, all of this stuff. Like electrical contractors, electricians in Texas, if you're willing to, like, be a travel electrician, it's like oil pay, right? Like, it used to be that, like, if you're physically adept, you could go make, you know, 100 grand in West Texas, but like, who the fuck wants to do that? Now it's like, well, you could go like 200 miles away from Dallas,
Starting point is 00:41:51 and what's still a reasonable town and build a data center and work on the wiring within the data center and all this other stuff, the transmission stuff, and your pay is up like 2X now versus what it was just a few years ago. This labor problem is a challenge too, and it's, yeah, I think
Starting point is 00:42:07 in China they don't have any of these problems, but they just haven't spent the capital yet, but capital is an issue as well. Because of the scale of what's being spent, right? Like, like, Invidia's revenue this year is going to be like over $200 million, and next year it expects are over 300 billion plus Google's going to spend like 50 billion dollars on TPU data centers right and it's like and Amazon's going to spend tons and tons on Traneum data
Starting point is 00:42:31 centers it's like the scale of dollars is is quickly growing to nation state level stuff and what's more important is being able to decide to spend the dollars and what's cost effective and so to some extent China is still constrained by that but they can smuggle chips in they can build data centers outside of China. They can rent data centers outside of China and have the most cost effective you know, blackwell chips or whatever, right? Bite Dance is, you know,
Starting point is 00:42:59 either the biggest or the second biggest customer of Google Cloud for a reason, right? And they're getting you know, over, you know, they're getting many, many blackwell from them, right? And the same with Oracle and the same with Microsoft and all these other companies are renting tons of chips to China anyways because it's more cost effective to do that than build it
Starting point is 00:43:15 yourself. So it's not like China has this mentality where we only have to, well, the government does, but the infrastructure companies don't. Like Alibaba, 10 cent, bite dance, et cetera. So what's the end game for data centers? I mean, like, we need more power, we need more cooling. Will the end be every, all data systems would be next to a nuclear reactor, lots of solar, you know, next to a deep level, like deep seawater that we use for cooling
Starting point is 00:43:39 or something like that? Or what's, I think that like cooling is, like the physical cooling of a data center are like, you know, there's this whole narrative about like, oh, yeah, I use this so much And it's like not really, you know, farming alfalfa uses like 100x the water of, of AI data centers. Even by the end of the decade, it'll be the same. And it's like, alfalfa is like worth very little. So it's like, there's like, it's like cooling is like not that, you know, people have like experimented with like, you know, undersea data centers to reduce the cooling cost. That doesn't make sense.
Starting point is 00:44:10 It's like five, 10% savings, but then like if you want to get the water out of the ocean then, then with the data center into the ocean. It's like if you want to service it like, you're screwed, right? So like the same with power. It's like, we talk a lot about, like, the power is not actually that expensive. It's just hard to build, right? How to get to the right place? And hard to get to the right space and convert it down to the voltages and all the stuff that chips need. So it's less the magnitude of power and more where it is and how it moves.
Starting point is 00:44:35 Well, the magnitude, too, right? Like, it's going to be. In terms of total world-wild energy consumption. A idea doesn't it is still a 40%. It's a fraction of a percent. Yeah, yeah. Even by the end of the decade, you know, the U.S. will be like 10% of our power will be A data centers,
Starting point is 00:44:51 which is still like... Of electricity. Of our electricity. In terms of energy, that's even a smaller fraction, right? Oh, yeah, yeah, because you think about... But shifting to electric vehicles, also, you can probably make a bigger swing than, you know, with all the A data centers we can build in the...
Starting point is 00:45:03 But outside, like, it's like in Europe, like, that number's not moving up that fast and, like, all these other countries. I think we need to build a lot more power, but it's not like some crazy, like, amount. It's just, like, doing it properly is the hard. And again, like, the cost of power, like, you go look at like these deals people are signing, they're still signing, like, even though the prices skyrocketed from like a few cents a kilowatt hour for these massive, massive purchases to like 10, it's still, you know, when you think about
Starting point is 00:45:34 the full TCL, the cluster, you know, the GPU cost of networking, all of this stuff far outstrips the power. Yeah. And same with cooling. But what percentage is power? Like if you do a four-year amortized GPU data center, what percentage will be power? 80% of the cost of a GPU data center, if you're building Blackwell, is capital, right?
Starting point is 00:45:53 It's the GPU purchases, it's the networking, it's the physical data center conversion, power conversion equipment. All of this stuff is like 80% of the cost. And then 20% is going to be your land and your power and your cooling and your cooling towers and your backup power and your generators and all this stuff.
Starting point is 00:46:11 It's like nothing, which is why it doesn't matter if you spend 10% or 50% more on that. because at the end of the day, the expensive thing, right? Like, this is why what Elon did would seem silly, right? They spent a lot more money on, you know, generators outside the data center and these mobile chillers to cool the water down for their liquid cooling instead of like the more cost-effective option because it got the data center up three months faster.
Starting point is 00:46:36 And so, like, that three months of additional training time is worth way, way, way more on a TCO basis, right? The performance you got out of the chips and the time to market and all this is way, way, way fast. and therefore it was the right decision, even though this part of the data center bloomed at cost, everything else is still there and you're still paying for the chips.
Starting point is 00:46:55 And if they were sitting idle, it's not worth it. Just by like bypassing the grid, bypassing anything to do with interconnect, anything to do with public utilities. Exactly, exactly. What's your take on Intel? Where is Intel going? I think the world, well, the US needs Intel.
Starting point is 00:47:09 I think the world needs Intel. I think the world needs Intel because like Samsung is doing worse than Intel on leading edge process development, in my opinion, based on, even on various customers in the industry, having done test chips at, like, Intel versus Samsung, they think, I think industry generally agrees that Intel is further along, you know,
Starting point is 00:47:28 the sort of the two-nometer class process technology than Samsung is, but both are way behind TSMC. And TSMC is a monopoly in some extent. The number one question always people ask is, like, why is TSM not making more money? Why are they only raising prices, you know, next year, you know, three to 10% depending on what it is.
Starting point is 00:47:48 It's like, TSM's a monopoly. Like, they could raise a lot more, but they're good Taiwanese people rather than like dirty American capitalists. If TSM was owned or was managed by Americans, I think most ownership is actually American in terms of the stock, it's on a New York Stock Exchange and all this, like, you know,
Starting point is 00:48:05 they would have raised prices a lot more. And so, like, there is this, like, difficult, difficult thing to be done that like, hey, there's one island that controls all leading edge semiconductors and not just all leading edge like the majority of trailing edge production as well something needs to be done
Starting point is 00:48:23 Intel is behind but not like absurdly so right like if something were to happen to Taiwan Intel would have the most advanced technology in the world right? It's just it's not economic can you keep Intel as one company if you want them to be competitive? I think the process of splitting it would take so much
Starting point is 00:48:39 executive time and so much executive effort that you would have been bankrupt by then right And that's the big challenge. Like, I think Intel should be separate, right? But to properly split the company and for all the management time that's needed is, like, absurd. And instead, like, what you need is like, you need Lip Bhutan, who's the CEO of Intel. You know, there's a lot of drama going around about him because he's, he's one of the greatest semiconductor investors ever, right?
Starting point is 00:49:05 He's invested in so many different companies first. You know, he was on the board of like SMIC, which is China's TSM, effectively, which is like a big, like, drama or like, some of the biggest tool companies in Chinese, the first investor in them, because, you know, it was a multipolar world there and he's making good investments. But, like, you know, now, like, people are getting mad about that, but it's like, no, he recognizes the companies,
Starting point is 00:49:27 like, he understands the supply chain. He needs to not spend his time on splitting the company because then he never actually fixes the company, right? Intel's problem is that, like, it takes them five to six years to go from design to shipping the product, in some cases more. And when they tape out a chip, right? Like, you know, you send the design to the fab.
Starting point is 00:49:46 The fab brings back the chip. They go through 14 revisions in some cases where it was like the rest of the industry goes through like one to three, right, revisions if they're good of like send the design in, get the chip back, test it, send the design in, right, for a public launch. And they'll launch a chip in three years. So, but if you look at it until today, right, they still don't have a competitive entry on the, on the AI side. And they won't.
Starting point is 00:50:11 Right. Can you? So what does they mean for their offering? I mean, they're still doing great on CPUs. They don't have a good AI AI chip product. Is it long-term sustainable positioning, right? I mean, as a standalone chip company? I mean, IBM still makes more money every launch off of mainframes.
Starting point is 00:50:28 So it's not like X-86 is dead. It's like you don't get the growth rates, but like you could totally run this as a very profitable enterprise. And I think the same with PCs, right? There's some turmoil, there's some arm entry, there's some AMD competition. They're very well, also. Well, like, I think it's a very, it can be a very profitable business if it had, like, one third the people or half the people working on it.
Starting point is 00:50:49 And so, like, Liputan, to fix Intel needs to go into both the design company and lay off a shitload of people, but, like, keep all the good people and make sure that they're designing fast and they're launching from design conception to launches two to three years, not five to six. And that's on the design side and make that profitable. And then on the fabs, you have to do the same thing. There's all these people, like, one of the heads of Fab Automation at Intel, I explicitly told Lip Butan because, you know, we have a couple X Intel people who are actually good in the company that worked on the fab side, and we're like, they were like, who's the worst people and friends that's like, oh, this guy sucks.
Starting point is 00:51:26 I explicitly told Lip-Bu Tan. He'd never talked to the guy because he was like four layers down. The company has absurd amounts of hierarchy. He goes and talks to the guy, and he's out, right? It's like, he figures out who's bad, right, and who's good. And he has to go in and he's got to be like, hey, the vast majority of the team at Intel is the one that led the world in production and process technology for 20 years, right?
Starting point is 00:51:48 But there's a lot of built-up crap. So he has to go figure this out, right? He can't waste his time on, like, all this structuring to split the company. Like, I think it would be better if the company split; I just don't think he can spend the time to do that. And on the design side of the company, you know, you're not really going to get into AI. You're not really going to...
Starting point is 00:52:08 You have to make some money there. But the fabs, I think, could truly become a competitor. But they're going to go bankrupt by the time anything, you know, can happen. So he has to figure out how to get capital. He has to figure out how to clean up all the crap, make the yields go up, right, make the product ship way faster.
Starting point is 00:52:26 Like, all of these things are basic problems. I think the goals are completely correct. I mean, the big challenge, just reflecting back on my time there, is that right now, if you look at Intel, they have essentially software, the chip design, and then the core manufacturing part, right? And those are three very different cultures, and it's very hard to get everything under one umbrella, right?
Starting point is 00:52:46 And so I think that is the big challenge. I think you could even run the companies separately, right? But, like, you can't physically separate them entity-wise because it's going to take so long to sever all these things, and he doesn't have time, right? Like, Intel is literally going to go bankrupt if they don't get a big cash infusion or lay off like half the company, right? Which some could argue
Starting point is 00:53:06 you need to lay off like 30% of the company anyways, but there's a lot of bad things that happen if that happens, right? And they need to spend a lot more on building the next-generation fab even if they fix the current fab, and they don't have money for that, right? So there's a lot of
Starting point is 00:53:21 more important problems than, like, physically separating the company, even though I think long term, yes, the fab has to be separate from the chip design and software part of the company. That's going to make each company much more accountable, able to service their customers better, et cetera. It's just that it's going to take too long, and they're going to go bankrupt by then. Awesome.
Starting point is 00:53:41 But I think, I hope, I pray, someone does something, right? Like, you get a big capital infusion. I don't know. The big hyperscalers get muscled into it, like, oh, okay, wait, if TSM eventually grows their margin to 75% because they're the monopoly, plus they take in all this stuff like co-packaged optics and power delivery, all of a sudden the cost is going to spike. So we should actually just throw $5 billion at Intel each, right? Screw it.
Starting point is 00:54:05 And that could actually give Intel enough of a lifeline to potentially get to something and maybe be competitive. That's the hope. Can we finish by playing out this game that we started when we gave Sam Altman advice? If Jensen was here, what advice would you have for him? If Jensen was here, you know, I think he has a massive, massive balance sheet.
Starting point is 00:54:31 right, Jensen does. His free cash flow is, you know, ridiculous. And the new Trump tax bill institutes something really incredible, which is that you can depreciate all of the GPU cluster cost in year one. We put out a note about how the tax implications to Meta are like $10 billion a year, and across each of the major hyperscalers it's massive. So it's like, well, Nvidia is going to sit on tons and tons of cash, or they're going to pay, you know, tens of billions of dollars of taxes, so why don't you get into the infrastructure game somehow?
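The year-one depreciation point is a time-value-of-money effect: the total deduction is the same either way, but taking it all immediately is worth more than spreading it over a multi-year schedule. A minimal back-of-envelope sketch in Python; every figure below (capex, tax rate, schedule length, discount rate) is a hypothetical placeholder, not a number from the episode or from SemiAnalysis:

```python
# Rough sketch of the value of 100% year-one ("bonus") depreciation
# versus straight-line depreciation. All inputs are illustrative.

def tax_deferral_benefit(capex, tax_rate, years, discount_rate):
    """NPV advantage of deducting capex entirely in year 0 versus
    spreading the deduction evenly over `years`."""
    # Immediate expensing: the full deduction lands in year 0.
    npv_immediate = capex * tax_rate
    # Straight-line: equal deductions each year, discounted back to year 0.
    annual_deduction = capex / years
    npv_straight_line = sum(
        annual_deduction * tax_rate / (1 + discount_rate) ** t
        for t in range(years)
    )
    return npv_immediate - npv_straight_line

# e.g. $40B of GPU-cluster capex, 21% corporate rate,
# 5-year schedule, 8% discount rate (all made up)
benefit = tax_deferral_benefit(40e9, 0.21, 5, 0.08)
print(f"NPV benefit of immediate expensing: ${benefit / 1e9:.1f}B")
```

The deduction itself doesn't change, only its timing, which is why the benefit scales with the discount rate and with the length of the schedule it replaces.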
Starting point is 00:55:06 Now, this is obviously going to be, like, crazy because now they're buying their own GPUs and putting them in data centers and competing with their own customers, but they're already doing that anyways because their customers are trying to make chips. But they should, like, accelerate the data center ecosystem with investments, right?
Starting point is 00:55:23 Because really, we think we can have a very high degree of accuracy on what they're going to do next year in terms of revenue, because it's just the number of data center watts that are being built, right? Like, that's a harder thing to shift up and down, right? Now, there's a little bit of share difference between how much is TPU versus GPU, but like,
Starting point is 00:55:41 it's like, you have to accelerate the infrastructure, and you need to spend all of this capital that you're building up, right? Like, okay, do you want to go the route of doing buybacks and dividends? Great, but you're a loser if you do that, right? You can make more money by reinvesting and building a bigger company that's not just selling chips into the ecosystem, or servers into the ecosystem, but actually, like, controlling the infrastructure end-to-end somehow.
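The "revenue from data center watts" heuristic described above reduces to simple arithmetic: watts coming online, divided by watts per accelerator, times average selling price and vendor share. A sketch, where every constant is a made-up placeholder rather than an actual SemiAnalysis estimate:

```python
# Illustrative version of the watts-to-revenue heuristic.
# All constants are hypothetical placeholders.

def implied_accelerator_revenue(new_capacity_mw, watts_per_accelerator,
                                asp_dollars, vendor_share):
    """Accelerator revenue implied by new data-center capacity."""
    # Convert megawatts to watts, then to an accelerator count.
    accelerators = new_capacity_mw * 1e6 / watts_per_accelerator
    # Revenue is units times average selling price times vendor share.
    return accelerators * asp_dollars * vendor_share

# e.g. 5,000 MW of new AI capacity, ~1,400 W per accelerator including
# overhead, ~$30k ASP, ~80% share to a single vendor (all made up)
rev = implied_accelerator_revenue(5000, 1400, 30_000, 0.80)
print(f"Implied revenue: ${rev / 1e9:.0f}B")
```

The point in the transcript is that the megawatt figure is the slow-moving input: data center construction is hard to shift up or down quickly, so it pins down the revenue estimate more tightly than quarter-to-quarter demand chatter.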
Starting point is 00:56:05 So I think there's something he could do there with this massive war chest. And, like, Nvidia's done some buybacks and some dividend increases, but the cash on their balance sheet keeps growing. They're going to have north of $100 billion of cash on their balance sheet by the end of this year, I think. So it's like, what are you going to do with that? I think there's something, moving into the infrastructure layer much more, that they could do if he really wants to be the king of the world, right?
Starting point is 00:56:33 Which I think he does. Sergey and Sundar? I think they should open up the kimono on TPUs, right? Like, start selling them, open up the software, open-source a lot more of the XLA software, because there's OpenXLA and there's XLA, but the vast majority is closed source. Really open up the kimono on that and be a lot more aggressive, right? They're still pretty not aggressive on data centers.
Starting point is 00:57:01 They're pretty not aggressive on a lot of elements of the company. The TPU team's next-generation designs are pretty not aggressive, partially because a lot of the TPU team has left to go to OpenAI, the best people that I knew. It was actually really annoying. I knew like four or five people and they all went to OpenAI, and it's like, fuck. Now I don't get as much. I've met some other people, right? But, you know, I think they could be a lot more aggressive in many ways across the company. They don't have to be, right? But they could. Because AI, you know, like this ChatGPT take rate, the shift of search queries, the monetizable ones especially, to purchasing agents, is going to really screw Google long term if they don't, you know, get their act together.
Starting point is 00:57:43 I think they've gotten their act together on DeepMind. There's still some inefficiencies, but Sergey works within DeepMind a lot and they're driving hard. They're still a little bit behind. But I think about physical infrastructure, TPUs, how much money they could make, and how much wind they could take out of everyone else's sails if they start selling TPUs externally and reorg around building data centers much faster, so that they do have the most compute in the world. Because they did, but now there's certain companies that are going to surpass them
Starting point is 00:58:11 potentially over the next few years if they don't really get their act together. So I think that's what I would say for them. Yeah, and also, like, learn how to ship product. For Zuck? I think Zuck,
Starting point is 00:58:26 you know, it remains to be seen what goes on with superintelligence, but they're trying to move super fast with the data centers, you know, like, screw it, we'll build tents instead of, like, physical data centers, because we only need these for five years anyways. You know, the superintelligence
Starting point is 00:58:41 moves, you could say whatever you want, but, you know, trying to buy, like, Thinking Machines for like $30 billion or SSI for $30 billion didn't work out, so then they spent, you know, not even that much, not $30 billion, on hiring all these people. So I think that he recognizes the urgency
Starting point is 00:58:56 with the models, with the infrastructure, sure. So I really think, you know, if you read his website post about AI, I think he sees the vision, right? There's the wearables, there's integrating AI into that, there's being your AI assistant to do all this purchasing and stuff. I think he sees the vision, but I think he also needs to focus on actually releasing that faster. But also, the products that they do outside of their core IP,
Starting point is 00:59:24 every time they launch something, it's kind of mid, right? You know, Meta Reality Labs is doing well, but I think they should go more explicit. Like, have a ChatGPT competitor, have a Claude competitor, just start releasing way more products, because they're really just focused on their individual gardens rather than branching outside of them.
Starting point is 00:59:46 Do you think Apple should have that same sense of urgency? If Tim Cook was here, what would you tell him? The funny thing is, some of their best AI people are now at Superintelligence. They're building an AI accelerator, they have AI models, but they're just way slower. They did mention on the last earnings call they're going to allocate more capital to this, but it's like, guys, Apple, you're going to miss the boat if you do not spend, like,
Starting point is 01:00:08 $50, $100 billion on infrastructure. You don't think the concierge will cut it? I think, more and more, you'll see, like, great, Apple has this walled garden, but they can only do so much to protect it, right? IDFA, like, they shut down ads, or data sharing, to Meta, but Meta made better models and now they have way more data and way more power over the user than they ever did before. In a way, it was good that Apple kicked the crutch out from under Meta. But the same applies to AI. Yes, they have access to the texts and they have access to this, but I think other people are going to be able to integrate user data, and agents will be able to integrate all this user data, and Apple will start to lose control of what the user experiences as more and more gets disintermediated by AI being the interface, rather than touch, rather than, you know, touchpad and keyboard. And I don't think they've truly realized what happens when the interface to computing is AI. Like, they market it, but that's
Starting point is 01:01:08 going to shift computing really heavily. They have great hardware, and their hardware teams are working on awesome stuff and form factors. But I just don't know if they get what is actually going to happen to the world in the next five years, truly, well enough. And they're not building fast enough for it. What about Microsoft, to that end? Microsoft has the same problem. I think they were super aggressive in '23 and '24, and then they pulled back heavily, right? Now, like, OpenAI is slipping through their grasp.
Starting point is 01:01:38 There's that whole thing there. They cut back on data center investments heavily. They were going to be the largest infrastructure company in the world by, like, a factor of 2x, which, you could argue, maybe was too much and maybe wouldn't have been economical. But they're losing their grasp on OpenAI. Their internal model efforts are failing spectacularly. Like, they're on LMArena right now, and they're pretty decent there,
Starting point is 01:02:01 but it's like, that's just a sycophantic model. It's under a code name, but, like, whatever. MAI is, like, failing. Azure is losing a lot of share to Oracle and CoreWeave and Google and so on and so forth, right? Their internal chip effort is by far the worst of any hyperscaler. They just, like, mis-execute. Like, GitHub, how is GitHub not the highest-ARR
Starting point is 01:02:26 coding product? They only had the best IDE, the best source code repository, the best enterprise sales force, the best model company as a relationship, and they were the first to market, right? It's like, it's just never really going for them.
Starting point is 01:02:39 And, like, there's just nothing, right? GitHub Copilot is failing. Microsoft Copilot is still crap, right? Like, yeah, it's unusable. It's like, what is going on? You need to shake the crap out of the company. I think they win a lot because they have the best business-to-business
Starting point is 01:02:56 relationship with so many enterprises. The best sales force on the planet. Yeah, but they end up not having the actual product to sell them, which is really scary. So they need to really work on product. Satya has done great on sales and stuff, but, like, yeah. If Elon was here, what advice would you give him? A lot of people at xAI are mad about the porn models, like, the porn stuff.
Starting point is 01:03:18 It's fine. You're going to make a ton of money off of this. This is how you accelerate the revenue of that company. But he's losing a lot of talent and axing a lot of good projects. But Elon is a magnet for amazing talent and building stuff, so I won't bet against him, and it seems like since he left the administration, he's focused on stuff again.
Starting point is 01:03:35 But I think, I don't know, he's focused on a lot of things, and I think, like, Robotaxi is starting to look good again, actually. Like, I haven't ridden one yet, but I have some friends who've ridden one. It, like, looks pretty decent. He could, like, not make these snap decisions,
Starting point is 01:03:48 which often are the reason why he's amazing, but some of these snap decisions are hurting him. I'm not sure if I can give Elon that much great advice. Maybe it's just, like, focus on the products again, right? More. But he is working on that stuff a lot. Yeah.
Starting point is 01:04:02 I think that might be a good place to wrap. Awesome. It was a great discussion. Dylan, thanks so much for joining us. Thank you for having me. Thanks for listening to the A16Z podcast. If you enjoyed the episode, let us know by leaving a review at ratethispodcast.com slash A16Z.
Starting point is 01:04:19 We've got more great conversations coming your way. See you next time. As a reminder, the content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any A16Z fund. Please note that A16Z and its affiliates may also maintain investments in the companies discussed in this podcast. For more details, including a link to our investments, please see A16Z.com forward slash disclosures.
Starting point is 01:04:56 Thank you.
