The a16z Show - Dylan Patel: GPT-5, NVIDIA, Intel, Meta, Apple

Starting point is 00:00:00 Nvidia is going to have better networking than you. They're going to have better HBM. They're going to have better process node. They're going to come to market faster. They're going to be able to ramp faster. They're going to have better negotiations with, whether it's TSMC or SK Hynix on the memory and silicon side or all the rack people or like copper cables, everything, they're going to have better cost efficiency.

Starting point is 00:00:15 So you can't just like do the same thing as Nvidia. You have to really leap forward in some other way. You have to be like 5X better. Today we're talking AI, hardware, chips, and the infrastructure powering the next wave of models with three people at the center of it all. Dylan Patel, founder and CEO of SemiAnalysis, one of the sharpest voices on chips, data centers, and the economics driving AI's explosive growth. Erin Price-Wright, general partner at a16z, investing in the technologies and infrastructure shaping the future. Guido Appenzeller,

Starting point is 00:00:47 partner at a16z with decades on the front lines of AI, cloud, and networking. From GPT-5's launch to Nvidia's dominance, custom silicon, and the global race for compute, we're covering what's happening behind the scenes. Let's get into it. Dylan, welcome to the podcast. Thank you for having me. We've been trying to get you for a while. You're a busy man, but it worked out. Guido, why don't you introduce why we were so excited to have Dylan on the podcast and what we're excited to discuss? I think, Dylan, you've done an exceptional job in covering what's happening in the AI hardware space, AI semi space, and now more and more the data center space as well. And just looking at it, currently the most valuable company on the planet is an AI semi company, right? The I think biggest

Starting point is 00:01:29 IPO so far in AI was an AI cloud company. This is currently where it's happening, right? In any gold rush, in the early days it's the picks and shovels that make money. And I think this is the stage that we're in. So I'm super excited to have you here today. Awesome. Thank you. Happy to talk about my favorite topics. Amazing. Well, maybe let's start with GPT-5. We just had some of the researchers, Christina and Isabella, on here last week. You said it was disappointing. Why don't you share your reactions or what capabilities you were hoping to see or overall. I think it depends on what tier of user you are. Right. If you're just using GPT-5 and before you were a $20 or $200 a month subscriber,

Starting point is 00:02:03 you no longer have access to 4.5, which in my opinion is still a better pre-trained model for certain things. Or you no longer have access to o3, which would think for 30 seconds on average maybe, right? Whereas GPT-5, even when you're using thinking, only thinks for like five to 10 seconds on average, right? Which is an interesting sort of phenomenon, right?

Starting point is 00:02:24 But basically like GPT-5 is not spending more compute per se. The model did get a little bit better on a vanilla basis, right? 4o to 5 is actually quite a bit better. But when you think about, you know, what is this curve of intelligence, right? It's like the more compute you spend, the better the model gets. And that's whether it's a bigger model, which GPT-5 isn't, right? You can see it's not a bigger model. It's roughly the same size, you know, or you think more, right?

Starting point is 00:02:49 But again, like, this is something that OpenAI's first thinking models, you know, the first few generations of o1, o3, would think for a long time and waste a lot of tokens, if you will. And when you look at, for example, Anthropic's thinking models, even when you put them in thinking mode, they think a lot less, right, to get to the same results or better results, right, as OpenAI was.

Starting point is 00:03:08 And so OpenAI, I think, like, optimized a lot of, like, well, if I ask, like, I think the silliest one I had asked was like, I asked o3 once, is pork red meat or white meat? And it thought for like 48 seconds. It's like, what are you doing? Like, they should just like, tell me the answer. And so, like, a nice thing is that GPT-5 will think a lot less,

Starting point is 00:03:25 even if you select thinking manually, but more importantly, they have the sort of auto functionality, the router, which lets them decide whether or not, hey, do I route to the regular model? Do I route to maybe mini if you're out of rate limits or do I route to thinking, right? And how much do I think? But in general, the thinking model will think less. So there's less compute going into a power user's average query than before.

Starting point is 00:03:50 But isn't it even more interesting? OpenAI can now control how much compute it wants to allocate to you, right? If we're in a high load situation, maybe tune the router a little bit so it's less, right? Maybe Skype. I have no idea what they're doing behind the curtain, but there's this meme out there at the moment that basically all they did, which is a meme, right?

Starting point is 00:04:06 It's not true, but all they did is take o3 plus a couple of smaller models, put a router in front and offer that at the lower blended price essentially, right? I think there's a little bit of that, right? Cost suddenly matters and they figured out a way how they can steer that. I think, yeah, I mean, and they talked about how they've been able to dramatically increase

Starting point is 00:04:23 their infrastructure capacity because I myself was just regularly using o3 or 4.5. Right? And now I'm forced to use auto, which sometimes gives me the o3 equivalent thinking model, but sometimes gives me just the regular base model, which sucks. But like, I think for the free user, it's actually quite interesting, right? The free user was not getting thinking models pretty much ever or not using them, or in many cases, they just open the website and asked their query. And now sometimes their query gets routed there. So sometimes they get a way better model. But now sometimes OpenAI can gracefully degrade them if they need to, right? And I think the router points to the

Starting point is 00:04:58 future of OpenAI as a business, right? Like you can look at sort of the model companies, right? Anthropic is fully focused on B2B, right? API, code, et cetera, right? Or Claude Code, whatever it is, right?

Starting point is 00:05:09 OpenAI, yes, they have that business, Codex and API business, but really the majority of the revenue is consumer, right? And it's consumer subscriptions. But they have no way to upsell, you know, or to make money off of all the free users, right? In any other consumer app, the free user still pays via ads.

Starting point is 00:05:26 But this is not compatible with AI, right? Like, it's a helpful assistant. You can't just make the user's result worse by injecting ads. Banner ads don't really work in AI either. So it's like, how do you now monetize them? And I think with the router, they're getting really close to figuring out how to monetize that user, right? With the new CEO of applications, if you saw her product that she launched at Shopify, I think it was Shopify, was an agent for shopping, right? And now this like immediately clicks, like, oh, if the user asks a low value query, hey, why is the sky blue? Just route them to mini, right?

Starting point is 00:05:59 The model can answer perfectly fine, and that is a chunk of queries, right? But if they ask, what's the best DUI lawyer near me, right? All of a sudden, this is like, you know, you're in jail, you have one shot, you're like, screw it, let me ask ChatGPT what the best DUI lawyer is. And now all of a sudden, the model's not capable of it today, but soon enough it'll be able to contact all the lawyers in the area

Starting point is 00:06:18 and figure out what their results are and maybe search their, like, court filings and whatever, right? Book the best lawyer for you or an airplane ticket. Maybe negotiate a cut as part of that. Yeah, of course they're going to take a cut, right? But this is a much better way of monetizing the free user. It's like, you know, it's like Etsy. 10% of their traffic now comes from chat.

Starting point is 00:06:36 And OpenAI makes nothing off of that. But they really, really will soon, right? And partially that's because Amazon blocks chat. But there's a way to make money from shopping decisions, whether it's booking flights or looking for items. And those you now say, free user, I don't care. I'm going to send you to my best model. I'm going to send you to agents.

Starting point is 00:06:54 I'm going to spend ungodly amounts of compute on you because I can make money off of this. But if it's a query that's like, help me with my homework, I'll send you like a decent model, right? I don't need to spend money on you. And so this is how I think like OpenAI can finally make money off of the free user. And I think that's the biggest like thing about the router, right?
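The routing idea described above can be sketched in a few lines. This is only an illustration of the logic as discussed in the conversation, not OpenAI's actual router: the tier names, the `HIGH_VALUE_USD` threshold, and the `est_value_usd` field are all hypothetical assumptions.

```python
from dataclasses import dataclass

# Hypothetical threshold; nothing here comes from OpenAI.
HIGH_VALUE_USD = 5.0


@dataclass
class Query:
    text: str
    est_value_usd: float            # assumed estimate of revenue the query could generate
    out_of_rate_limit: bool = False


def route(query: Query, high_load: bool = False) -> str:
    """Pick a model tier for one query (illustrative only)."""
    if query.out_of_rate_limit:
        return "mini"               # user is out of rate limits: cheapest tier
    if query.est_value_usd >= HIGH_VALUE_USD:
        return "thinking"           # monetizable query: spend compute on it
    if query.est_value_usd <= 0 or high_load:
        return "mini"               # low-value query, or graceful degradation under load
    return "regular"                # everything else gets a decent model
```

With these assumptions, `route(Query("why is the sky blue", 0.0))` goes to the mini tier, while a query with a high estimated value, such as the DUI lawyer example, goes to the thinking tier.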

Starting point is 00:07:11 This is super interesting. I think this is the first time that we've seen that there's a launch of a new model where to some degree cost is the headline item, right? I mean, so far it was always like, who is the smartest model? Who has the highest MMLU score? Now we have suddenly people who use models for coding for eight hours a day and surprise that if you take a large context window

Starting point is 00:07:29 and the best model, it creates thousands of dollars of cost a month. So cost matters. And so to some degree, where you are on the Pareto frontier between cost and performance is the new benchmark for model competitiveness, no longer performance alone. Is that what we're seeing here? I mean, I think definitely, right? Like OpenAI said they doubled their rate limits

Starting point is 00:07:47 for big amounts of users. They've dramatically increased the number of tokens they're serving from this launch, which effectively says this is an economic release. Probably also means the tokens are now cheaper, right? Yeah, yeah, for sure, for sure. I think the funniest thing is this whole cost thing you mentioned is like, we've seen this in the code space, right?

Starting point is 00:08:03 Cursor had to pull away the unlimited Claude Code. Initially, they had this super expensive plan and it had like unlimited rates and then there was only like a weekly rate limit. Now they have like hour-based rate limits. And I saw the craziest like thread on Twitter where this guy said he changed his sleep schedule, right? Modeled after like how sailors in the bay, if you're sailing, you can't sleep, right? Like solo sailing.

Starting point is 00:08:25 They'll take power naps when they get to the right spots so that they can still be safe. In the morning when it's not very windy. Well, but they can't sleep uninterrupted, right? And so because Anthropic had to put rate limits that are like not just week-based, but a number of hours based, and he like basically sleeps multiple times a day, but small chunks just so he can maximize the usage.

Starting point is 00:08:47 And there's also a leaderboard on Reddit where people are like competing to see how many tokens they're using through their subscription. And there's like a dude spending like $30,000 a month. So I'm going to find some developer in India that I can do pair programming with so I can get the day cycle, he can get the night cycle, and we both can maximize together the quota for the account. Is that the future then? I mean, but it's clear like people are taking advantage of the negative gross margin, like sort of subscriptions that are offered.

Starting point is 00:09:11 I think Anthropic probably makes a positive gross margin off of my subscription. I don't code enough, but there's plenty of people that are definitely losing money. And so as you said, it's an economic release. It pushes more and more toward, I think, just usage-based pricing, right? I think if you have an underlying commodity that you're reselling to some degree that is that large a part of your cost of goods, right? You need to go to usage-based pricing. How much do you think the like customer capture and stickiness for these code products is? I'm curious what you think on that, right? Once you use an IDE, once you integrate one of the CLI products in, like, how sticky is it? Or is it? That's a billion dollar question. That's a very conservative estimate. Look, Andrej Karpathy has this

Starting point is 00:09:47 great slide where he basically says, if you're building an agentic system today, right? Fundamentally what it is is this loop, right? Where half of the loop is the model thinking, right? And then trying to do this. And the other half is then the user verifying what did the agent do? Is it the right thing, providing feedback and trying to steer it in the right direction? Because it can't run forever, eventually you need to steer it back. One half of that is the model provider, right? They're trying to build the best model. The other half is really about, I think, designing the best possible UI to enable a user to give feedback. And I think there's value in that. So I think there's a certain amount of stickiness in there. So what are all the different tools like in terms

Starting point is 00:10:18 of visual, like say, take code editing, right? How can I most easily visualize what the code changes are? How can I most easily visualize, you know, what they impact, which files? You know, how can I, for small changes, get very quick feedback versus for complex ones, you know, get complex feedback? And there's some tools that actually draw diagrams for you of what they do, right? So I think this will be the battle. I think there's stickiness in that, right? How much exactly? So in that sense, like, people should be doing subscriptions to get people locked in, right? Instead of moving to usage-based pricing. Well, I think it's the customers that don't want to do usage-based pricing because it's so hard to guarantee, it's so hard for it to get away from

Starting point is 00:10:52 And you actually want guarantees and you're willing to commit to pretty high spend in order to not have usage-based pricing. I think it's the model companies that want usage-based pricing. I think with consumers, it's frankly very hard to not have usage-based pricing just because the variability is so massive. If it's us coding versus somebody who does this as their full-time job, right? You just have a factor of 20 or so difference in usage. That costs a lot of money, right? I think for enterprises, we could see more flat-fee pricing because you can average it out more. You have a developer that's using it all day, you kind of know in a general sense,

Starting point is 00:11:25 like how many hours a day they're programming and what that sort of looks like. The vibe quotas are. Yeah. Before we leave OpenAI, I want to ask a broad question, which is if someone was sitting here and saying, hey, Dylan, I'll listen to anything you tell me to do, any advice you have, as long as it makes OpenAI more valuable, what would you tell them? I would say immediately launch a method for you to input your credit card into ChatGPT and agree that for anything it like agentically does for you, it'll take,

Starting point is 00:11:51 X cut and then launch that product where it does shopping, right? Because like everyone knows that like Anthropic and OpenAI and all the other labs are buying RL environments of Amazon and of Shopify and of Etsy and of all the different ways to shop on the internet. Oh, of airline websites, right? Now just like, hey, integrate my calendar. I want to fly to there on Thursday. Make sure I don't miss a meeting.

Starting point is 00:12:14 Cool book. Right. Do that integration like super well. Know my preferences on whether I like aisle or window, all this stuff, right? and just take a take rate. I think this will make them so much money the moment they launch it. And I think they're working on it already,

Starting point is 00:12:26 but I'd like to hear how he thinks about it because he's shifted his tone massively on like ads over the last six months, right? He used to be like, no way. And now he's like, uh, maybe, you know? There's a way to do it without harming the user. And I think this is how you monetize the free user, right? So I think that's probably what I'd tell him

Starting point is 00:12:44 slash ask him about, like a whole line of questions around this. Well, he's coming on the podcast in a few weeks. Oh, nice. So we'll ask him. I want to shift to Nvidia. It's having a monster year. They're up almost 70%. What are the possible paths from here?

Starting point is 00:12:56 How do you see it playing out? Depends, like, how pilled you are on like the continued growth. But I think you guys have a good vantage point. We have a good vantage point of how fast revenue is growing for a lot of these companies, especially the code companies, but even many other applications. I think we can clearly see the demand side is accelerating, right? And then if you look at the training side, I think the race is on. That is upping hugely.

Starting point is 00:13:19 Google's upping hugely. If you just look at, again, just OpenAI and Anthropic and the compute that they have and are getting this year from Google and Amazon for Anthropic and from Microsoft, CoreWeave, Oracle for OpenAI, 30% of the chips are going to them, just those two companies. But that's actually like, okay, well, like 70% of the stuff, like who's making money off of it? Well, one third of it is like ads, right? Whether it be ByteDance or Meta or many of the other people who are doing ads. So then it's still like, okay, well, where's the rest, this one third of the chips, coming

Starting point is 00:13:49 from? Well, they're like mostly un-economic providers who I don't think it's like an obvious bet that they're going to, you know, keep raising bigger and bigger rounds. So what happens there? I think with the, you know, we talked about like coding, right? Like earlier, actually the Qwen3 Coder model is actually super

Starting point is 00:14:06 cheap if you're running it on-prem or if you're running in the cloud with all these inference libraries. And so like there's stuff like that as well. So I think the question is like how much does it keep growing? Because clearly I think the first third is definitely skyrocketing, right, of OpenAI and Anthropic lab spend. The second third of like ads is going to grow. It's not going to grow like crazy,

Starting point is 00:14:23 but I think there's definitely an inflection point that could be hit with Gen AI ads. I know Meta has been experimenting with it a lot, but I could totally be convinced that there's going to be a huge inflection in take rate there, right, where you start showing me personalized ads. Like every person that's in an ad, like, looks like me and I'll be like, okay, yes. Except like slightly better, so I like feel better, right? And I'm like, I want to buy it. Yeah. I have no idea how this is going to scale, right? But if you ask the question, how much could it scale, right? How much value are we creating here? Can we create enough value to actually keep growing for a long time?

Starting point is 00:14:52 If you just take AI software development, right? Yeah. We know we can easily get about 15% more productivity out of a developer. I don't think that's right. I think it's way higher. No, no. But the straight, like I talked to a lot of enterprises, like a classical enterprise straight up GitHub Copilot deployment,

Starting point is 00:15:07 that gives you about 15%. We can do much more than that. But bro, like, you know how bad GitHub Copilot is? Like, how did they, how, look at the revenue ARR chart. It's so funny. It's so funny if you look at the revenue ARR chart. It's like Claude in three months has surpassed them

Starting point is 00:15:21 and Cursor, you know, easily surpassed them. And then like, even like companies like Replit or like Windsurf slash Cognition are like going to pass them. Like it's like, you're preaching to the choir. So look, let's assume we can get this 100%. Yeah. So we can double the productivity of a developer, right?

Starting point is 00:15:36 About 30 million developers worldwide, give or take. Yeah. Right. Let's say 100K value add per developer. It might be a little high worldwide. The US is low, but worldwide's high. So it's $3 trillion. Yeah.

Starting point is 00:15:46 Right. So we're probably building technology here, which adds $3 trillion of GDP value. In theory, we could put that into GPUs because that's the main cost. Just from a coding model. Just from a coding model. Ignoring every other use case. So at least in theory, the value generation is here to keep growing, right?
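The back-of-the-envelope number above is easy to check. The sketch below uses only the two figures quoted in the conversation (about 30 million developers, about $100K of value add each when productivity doubles). The helper name `value_at_uplift` and the assumption that value scales linearly with the productivity uplift are illustrative, not something the speakers claimed.

```python
# Figures quoted in the conversation (rough, "give or take").
DEVELOPERS_WORLDWIDE = 30_000_000
VALUE_ADD_PER_DEV_USD = 100_000      # value add per developer if productivity doubles

# Total value if every developer's productivity doubles (a 100% uplift).
total_value_usd = DEVELOPERS_WORLDWIDE * VALUE_ADD_PER_DEV_USD


def value_at_uplift(uplift_percent: int) -> int:
    """Assumed linear scaling of value with productivity uplift (hypothetical)."""
    return total_value_usd * uplift_percent // 100


print(f"100% uplift: ${total_value_usd:,}")
print(f" 15% uplift: ${value_at_uplift(15):,}")
```

Under the linear assumption, the 15% uplift cited for a plain GitHub Copilot deployment would correspond to roughly $450 billion, against $3 trillion for a full doubling.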

Starting point is 00:16:01 Now, how that translates to the industry is much more complicated. I think we've already seen AI's value creation. So sort of there's like the whole like the famous like, oh, 300 billion problem or 200 billion problem. Now it's 600 billion problem, I'm sure. Sequoia is going to put out like the $1.1 trillion problem, right, soon enough. But like, there is some, like, reality in that, of course, but, you know, it ignores that, like, infrastructure spend today is accounting for five years of revenue, not, like, one.

Starting point is 00:16:27 And the revenue looks like this, not, like, flat line. But I think, I think the main thing is that AI is already generating more value than the spend. It's that the value capture is broken, right? Like, I legitimately believe Open AI is not even capturing 10% of the value they've created in the world already, just by usage of chat. Yeah. Right?

Starting point is 00:16:47 And I think the, yeah. The same applies to, you know, Anthropic and Cursor and whoever else you're looking at. I think the value capture is really broken. Even like internally, I think like what we've been able to do with like four devs in terms of like automation, like our spend on like Gemini API is absurdly low. And yet we go through every single permit and regulatory filing around every single data center with AI. And it's like, and we take satellite photos of every data center and we like we're able to label

Starting point is 00:17:16 our data set and then recognize what generators people are using, what like cooling towers and the construction progress and substations, all this stuff is like automated and it's only possible because of Gen AI, but we do it with like very few developers and then like the value capture that I'm able to generate by selling this data, by consulting with it, is so high, but the companies making it, it's like they get nothing out of it, right? Like I think this like there's a value capture challenge here that far exceeds the sort of creation, right? And as you get models like GPT-5 or open-source models, like continuing to drive it down,

Starting point is 00:17:52 it's like the value capture is just harder and harder and harder for these companies because they're making, you know, 50% gross margin on inference if they're, you know, or less in many cases. In so many words, you're saying we're getting commoditized and therefore you can't capture the value and thus you should temper your expectations of how much you can spend on GPUs?

Starting point is 00:18:11 Well, no, I think you can, I think there's still ways to like inflect hugely on value capture, right? Like I mentioned, the ads are a huge value capture. But that needs to happen before we see massive income. No, I think the other thing is like there's a lot of capital that's not been spent, right? Like the hyperscalers still can grow CAPEX 20, 30% next year, right, from what they're doing this year. In addition, companies like CoreWeave and Oracle, because they're tapping capital markets, can raise way more than 20 to 30% CAPEX. And then you go down the list further and it's like, oh, the largest infrastructure funds

Starting point is 00:18:46 in the world like Brookfield and Blackstone. Well, actually, they're turning all of their eyes to investing even more into infrastructure, AI Infra. And then you're like the sovereign wealth funds of the world, like the G42s or, you know, the Norway one or GIC in Singapore. Like these people have barely started touching AI. And so I think there's a whole lot more CAPEX that can come without it being necessarily like economically motivated day one.

Starting point is 00:19:14 I'm also saying like economically motivated CAPEX can only grow, like, so much. But there's so much other, like, like, where it's not clear from, you know, if you have a spreadsheet, you know, and you're basing it on real business that you should actually spend this much. But people will because they believe, I believe, I think you believe, like, infra, you know, people believe that this will be, you'll get profit out of it. But there's no, like, 100% certain, like, you know, way to argue it.

Starting point is 00:19:41 Yeah. How threatened, if at all, is Nvidia by custom silicon? I think that's the biggest thing, right, is when we look at orders from Google and from Amazon, right, especially, and Meta, their custom silicon is, not Microsoft, their custom silicon kind of sucks. But the other three, they're really upping their orders massively over the last year. You know, Amazon is making millions of Trainium. Google's making millions of TPUs. TPUs clearly are like 100% utilized, right? Yeah. Trainium's not there.

Starting point is 00:20:16 I think Amazon will figure out how to do that, and Anthropic will. So I think that's the biggest threat to Nvidia is that people figure out how to use custom silicon more broadly. And this sort of becomes the sort of like, if AI is concentrated, then custom silicon will do better. And that's not even talking about OpenAI's silicon team and stuff, right? Like if AI is really concentrated,

Starting point is 00:20:41 then they'll do better, custom silicon. But if it gets dispersed broadly because there's all these open source models from China and there's all these open source software libraries from, you know, Nvidia and China and it makes the deployment costs like rock bottom, then potentially... Hear me out here: if Google's TPU is able to compete with Nvidia, in theory they could do it on the open market. Nvidia is worth more than Google these days. Shouldn't Google start selling the chips to everyone?

Starting point is 00:21:10 I mean, in theory, they should be able to achieve a higher market cap. I absolutely think so. I think Google's even discussing it internally. I think it would require a big reorg of culture and a big reorg of like how Google Cloud works and how the TPU team works and how the JAX software team and XLA software teams work. I totally think they could.

Starting point is 00:21:33 It would just take them like shaking themselves pretty hard to be able to do it. Yeah, but I totally think Google should sell TPUs externally, not just renting, but like physically. It's kind of funny if a side hobby in theory has a higher company value potential than your entire business, especially as you think about the degradation of search as a core business. I mean, yeah, I think, but I think like if you were to ask like Sergey, right, like, hey,

Starting point is 00:22:00 do you think selling chips and racks is more valuable or cloud or Gemini, he'd be like, no, no, no, no, no, like Gemini is going to be worth way, way, way more. It's just not yet today. Right. And so I think like today you say Nvidia is the most, again, it's like a whole concentration thing, right? If the world is super concentrated in terms of customers, then Nvidia will not be the most valuable company in the world, right?

Starting point is 00:22:26 But if it gets dispersed more and more, which arguably we're starting to see with a lot of these open source models getting better and better and better and with ease of deploying them getting better, then you would see, I think you could argue Nvidia will remain the most valuable company in the world for a long period of time. Historically,

Starting point is 00:22:47 no pun intended, software has eaten the world in most markets, right? I mean, like, if you look at early networking days, Cisco was the most valuable company on the planet, right, for a while. It's no longer, right? The guys that built services on top, like Google or Amazon or Meta, eventually eclipsed them. Which is why Nvidia is like making all these software libraries, right? Like that's, that's, and they're trying to commoditize inference, right?

Starting point is 00:23:08 Like, you guys don't, I think, even have an inference API provider investment, do you? Well, we have all kinds of model providers. Model providers, but I'm talking about a pure API provider investment, I think, right? Is that correct? I think I talked to one of the team members, maybe

Starting point is 00:23:25 Rajko or someone about like why you guys didn't invest in like a together or like a fireworks. And sort of the argument was like, well, we think just serving models alone without making them will sort of be commoditized. Yeah. We have some in the stable diffusion ecosystem. Like with like a without, yeah, yeah.

Starting point is 00:23:42 Yeah, it's a little bit different dynamics there, I think. They tend to make much more compound models than the LLM folks. Yeah, yeah. Yeah. But like you guys don't have one of these like, you know, Baseten or any of these like sort of like API investments because you think, this is from someone on the infra team, that you guys think it'll get commoditized because of the software Nvidia is making, because vLLM and SGLang, which is like open source software coming out of Berkeley and now, you

Starting point is 00:24:07 sort of has their own environments now and supported by many. Like, this being commoditized means that, like, API providers aren't necessarily worth a ton, right? Is sort of your argument maybe. I think that's relevant to this whole thing, which is, you know, why, right? Like, why would you do this? Shifting gears, what about the Silicon startups? What's your take on those? I mean, there's a ton of capital flowing into that.

Starting point is 00:24:32 I've seen, I have no numbers, but probably billions being invested in chip startups. Yeah, for sure, for sure. I mean, like, whether you're looking at, like, you know, companies, like, I think it's, like, pretty impressive that a few companies like Etched and Rivos and a number of other companies, you know, MatX and others, like, have gotten the amount of funding they've had without even launching a chip, right? You know, in the past, like, yeah, silicon companies would make money or raise money, but they would at least launch a chip before they get a, you know, a big round. But, like, Etched and Rivos, like, have raised, you know, a lot of money without ever,

Starting point is 00:25:09 launching a chip publicly, which I think is... I mean, it speaks to... Well, like, yes, silicon is super capital-intensive if you're building a chip, especially an accelerator, which has so many moving pieces. And there's like 10 different AI accelerator companies out there, right? Like, that are newish in the last few years. I think there's a lot more.

Starting point is 00:25:28 That are like, yeah, yeah, yeah, that's fair. And then there's the old guard, which continues to raise money, right? Like Groq and Cerebras and SambaNova and Tenstorrent and so on and so forth, right? Like, or Graphcore getting bought out by SoftBank and SoftBank dumping money into this effort as well, right? There's a lot of capital being invested to dispel sort of Nvidia's top dollar or top position. But it becomes challenging, right? It's like, how do you beat Nvidia, right?

Starting point is 00:25:59 Like the hyperscalers, I think, are like kind of lucky in that they can do mostly the same thing as Nvidia. They have a captive customer, which is themselves. So they can just win on supply chain, right? Like I'm using cheaper providers. It's a margin compression exercise essentially. Yeah, yeah. And maybe for certain workloads, like Meta for recommendation systems, they'll have a better, you know, they can specialize more.

Starting point is 00:26:20 But for the most part, it's like, no, we're targeting the same workloads. We can just simplify supply chain or in-house a lot of it and compress margin. And it'll be fine. But in the case of, you know, these other companies, it's like, well, they don't have a captive customer. So now you have to contend with, well, I'm using the same ecosystem and either I can use some custom silicon provider

Starting point is 00:26:43 who's going to take a margin anyways on top and that's going to compress my what I can sell for or I can try and in-house everything but then it's like this is really hard right like I'm going to do all the software design I'm going to do all the silicon design I'm going to build all this different IP

Starting point is 00:26:57 I'm going to manage the supply chain on chips on racks on everything right ends up being a huge effort in terms of team size all in the end like, hey, I make a 75% gross margin as NVIDIA. AMD sells their GPUs for 50% gross margin, and they have a hard time out engineering Nvidia,

Starting point is 00:27:16 and they're great at engineering, right? Like, they're, they, but yet they still take more silicon area, more memory to achieve the same performance, and they have to sell for less, so their margin gets compressed. That makes sense. Look, I think historically, if you look at it, typically new entrants in markets didn't win by marginally improving on something existing.

Starting point is 00:27:35 That happens sometimes, but more likely they jumped on some kind of disruptive technology leap, right? But it's like, we have a different approach, we have different technology. Is that possible here? I mean, to some degree, maybe this is oversimplifying a little bit,

Starting point is 00:27:48 but I think part of the reason why the transformer model won was because it runs so incredibly great on GPUs, right? Like a recurrent neural network is similarly performing, it looks like, but it runs terribly on a GPU. So did we sort of pick the model for an architecture? And now it's hard to come up with an architecture that, you know, really. Well, it's hardware-software co-design, right?

Starting point is 00:28:08 Like there's all this hype about neuromorphic computing, right? Like theoretically, it's amazing and super efficient. It's like, okay, great. Like there's no ecosystem of hardware. There's no ecosystem of software. It would take like, you know, tens of thousands of people who are the best at AI today focusing on that to even prove out if it's worthwhile or not, right? On a hardware side, on a software side, on a model side.

Starting point is 00:28:29 And so like you look at like Groq, Cerebras, SambaNova, they all like sort of over-indexed to the models that were leading at the time when they designed their chips. And so they made certain tradeoffs, right? They put a lot more memory on chip. And Nvidia was like, well, we're not going to do that. A lot faster at least, right? Well, more, like if you compare the amount of memory of SRAM on Nvidia's chips,

Starting point is 00:28:50 it's much, much lower. Yes, correct. They went SRAM instead of DRAM. Right. Then they usually have less DRAM, so there's a trade-off there as well. Right, there's less DRAM, there's more SRAM. And because there's more SRAM on the chip, you have to have less compute on the chip.

Starting point is 00:29:01 And so they ended up losing, right? Because the model sizes got too big and all this, right? And so you have this like super weird dynamic where they bet on something that was actually better, right? Like I have no doubt that Cerebras would run certain types of models better than Nvidia, or Groq, or, hey, Dojo, right? Dojo runs certain, you know, Tesla's Dojo would run certain types of models way better than Nvidia's chips because they're optimized to that. But then it's like, oh, well, actually, even in vision tasks, you use vision transformers now. So it's like, okay, cool. Model sizes grew and all these things.

Starting point is 00:29:35 So it ends up being a catch-22 in that like you optimize for something. And so now like today you have this new age of AI accelerator companies that are like, okay, we're going to optimize for transformers. But at the time they started designing, they're like, okay, transformers are dense models that are this big. What's the best, you know, the hidden dimension is 8K and your batch sizes are this big and your sequences are this big.

Starting point is 00:29:55 So let's just make a super large systolic array. So you can, you know, create the maximum efficiency and then it turns out, oh, look at DeepSeek or go look at what the labs are doing. Actually, their shapes are much smaller. Actually, you need to do a bunch of small matrix multiplies, not massive, massive, massive, you know, singular matrix multiplies per layer. And then it ends up, you know, oh, well, that chip you're designing for that

Starting point is 00:30:15 is actually not super effective for that. And so the software is evolving constantly because of what works best on Nvidia. And you see that with, you know, whether it be what DeepSeek's doing or Alibaba's doing or what the labs are doing internally. And you even see this like for Google, right? Like, their open source Gemma models make different decisions

Starting point is 00:30:36 because the shapes of a TPU are different than a GPU. And those, the GPU and the TPU are actually not that far apart, right? Like you would say, yes, they're very different, but Blackwell and TPUs are very, very, they're converging on similar designs, actually. Whereas, like, to beat Nvidia, you can't just have this supply chain, you know,

Starting point is 00:30:55 with, right? You don't have this captive customer. So now you need to do something, you know, that will give you a 5x advantage, right? In, you know, hardware efficiency for a certain type of workload, and then pray the workload doesn't shift, right? Because Nvidia is also optimizing their architecture generation. They've added a lot of stuff to make their chips way better for the existing models, but it's like they're taking, you know, large steps every year, every two years towards something,

Starting point is 00:31:21 whereas you have to like go way over there in left field and hope that models stay over there, right? Because you have to win by 5x because Nvidia is going to have supply chain efficiency over you. They're going to have time to market over you in terms of like a new process node or new memory or whatever technology, right? Even AMD, right? They got to 2 nanometer before Nvidia. They had higher-density HBM.

Starting point is 00:31:46 They use 3D stacking, all these things on supply chain that should be better than Nvidia, and yet they still lose. There's still the software angle, right? Nvidia's fantastic. Yeah, and then there's software as well, right? But it's like, Nvidia's going to have better networking than you. They're going to have better HBM. They're going to have better process node.

Starting point is 00:32:00 They're going to come to market faster. They're going to be able to ramp faster. They're going to have better negotiations with, whether it's TSM or SK Hynix on the memory and silicon side or all the rack people or, like, copper cables, everything, they're going to have better cost efficiency. So you have to be like 5x better. But to be fair, if somebody had a viable competitor,

Starting point is 00:32:16 which would even be marginally cost competitive, my guess is many of the big consumers of GPUs would immediately shift some revenue there just to have a number two, right? That's AMD today, right? And Microsoft stopped. I mean, like, there's still pretty limited traction, though. Right. Sure, but.

Starting point is 00:32:31 Meta continues to buy from them and Microsoft did buy a bunch and then they stopped because it's like, well, yes, they're, you know, AMD's giving you all these advantages but ends up still not being better on a performance per watt basis and they have a way bigger software team

Starting point is 00:32:44 they're somewhat competitive on like all these dynamics that I mentioned right? So you can't just like do the same thing as Nvidia. You really and do it better right or try and execute better like AMD like you have to really leap forward in some other way but that's the design cycle takes so long that models will shift

Starting point is 00:33:01 right because they're like oh what's the next generation TPU and GPU look like okay let's optimize for that and the research path is you know like great like yes neuromorphic computing could be the most optimal thing for us to do but no one's working on that because you have to advance in the tech tree you've chosen right if you restart the tech tree you're going to be like well this sucks and so like if it branches this way and you're over here you're screwed

Starting point is 00:33:24 because you have to be 5x better. There's a moat, because the supply chain stuff means that 5x actually turns into a 2.5x, and then Nvidia can compress their margin a little bit if you're actually competitive, and then that 2.5x becomes like a 50% better, and then, yeah, so it's like it ends up being way too difficult to

Starting point is 00:33:40 and the software stuff, right, everything like takes your 5x and makes it like, oh, you're actually only 50% better. And defense supply chain for sure. Yeah, defense supply chain, and then like they get, they get that, right? Like so it's like, and Lutnick himself said we had to do this for rare earth minerals
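
The erosion described above (a 5x lead shrinking to 2.5x and then to roughly 50% better) can be written out as simple division. This is only a sketch: the function name and both erosion factors are hypothetical, chosen to reproduce the numbers quoted in the conversation, not measured values.

```python
def effective_advantage(raw_speedup, supply_chain_penalty, margin_response):
    """Divide a raw hardware advantage by each erosion factor in turn."""
    return raw_speedup / (supply_chain_penalty * margin_response)

# Hypothetical factors chosen only to reproduce the figures quoted in the discussion.
raw = 5.0                    # claimed hardware efficiency lead on the target workload
supply_chain_penalty = 2.0   # assumed: incumbent's cost and time-to-market edge (5x -> 2.5x)
margin_response = 2.5 / 1.5  # assumed: incumbent price cut that takes 2.5x -> 1.5x

after_supply_chain = effective_advantage(raw, supply_chain_penalty, 1.0)
after_margin = effective_advantage(raw, supply_chain_penalty, margin_response)

print(f"after supply chain: {after_supply_chain:.2f}x")
print(f"after margin response: {after_margin:.2f}x")
```

Changing `raw` shows why a merely 2x better chip would not survive the same two factors: it would end up below parity.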

Starting point is 00:33:56 that's like interesting China there's like provinces in China that have, like, rules that say the H20 is not efficient enough to be deployed, which is, like, super bizarre because it's clearly the best AI chip China has. Huawei's still a little bit behind. Well, what's interesting is that, you know,

Starting point is 00:34:15 efficiency is just not, is so much less of an issue in China than here because they just have the power infrastructure to be able to support. So even if they're running less powerful chips, you know, you would imagine that it doesn't really matter because China has just such an infinite supply, infinite supply of power that, you know, they'd sort of be okay with it.

Starting point is 00:34:32 So it's interesting. Which is, it's a big challenge in America, right? Like, there have been companies that were like, they would, they've like, you know, Jensen keeps saying he couldn't give away H20 in America for free. But I've literally like heard companies like say, like now say like, yeah, no, I mean, I wouldn't because like I only have this much power. How am I going, you know, in data centers ready to go over the next year? if I bought an H20, I'd literally have less compute capacity,

Starting point is 00:34:59 and then I'd lose, right? Even if it was free. Like, it doesn't make sense. Whereas China doesn't care. They can build these things. They have the muscle. I'm curious how this all shakes out. You know, China's posturing really hard.

Starting point is 00:35:12 They even, like, put out something else like, we're investigating to see if there's backdoors in the H20. It's like, there's no backdoor in the H20. Like, chill. You know, it's like, you know, GPUs are usually like firewalled from the public internet anyways. Like you step through stuff before you get to the GPU clusters. So like a back door wouldn't even matter.

Starting point is 00:35:33 I don't know. I think it'll be interesting to see because China can definitely deploy way, way, way more power to AI the moment they decide to. But there's these like there's like competing interests, right? Because they want Huawei to be better than NVIDIA. Yeah. And then this is how NVIDIA argued to the administration. They're like, if we don't do this.

Starting point is 00:35:56 Actually, I think it's like a very like powerful argument that like, like, for example, within Triton, which is a common ML library. Anyway, like, ByteDance has open-sourced some stuff that plugs into this that is like super awesome. And there's like all these other libraries. It's not just models that China open sources. It's like software for Nvidia that Chinese companies open source. In a sense, like by Nvidia selling GPUs, is Nvidia's argument again, like was like they were able to, you know, stop Huawei from building up a software ecosystem and the Western ecosystem is better.

Starting point is 00:36:29 But then in the flip side, it's like, again, if you believe the models deliver more economic value to society than the hardware, which I actually think they do, it's just there's a value capture problem today, then you're giving China way more by giving them H20s and soon a version of Blackwell that's cut down, like Trump said, right?

Starting point is 00:36:48 Versus, you know, selling them the chips, right? The economic value derived from selling them the chips is not as large as, you know, being able to somehow sell them AI services. So is China gatekeeping power for AI? I don't think so. I think, again, like, there's a lot of, like, what we see is that, like, even with H20 being sold into China and future versions of the chip, H20E and other chips, we still see, like, Chinese companies like Alibaba,

Starting point is 00:37:21 renting GPUs outside of China because the GPUs they can get outside of China are just so much better on a dollar spend per performance basis, renting them or even like going through sort of like a Singaporean company that is effectively a Chinese company and building data centers and putting chips in them. So it's like, I don't think China's limiting the power per se. It's that it's, you know, you can only, if you can spend like your Chinese companies are growing their CAPEX way more than US companies on a percentage basis next year. The absolute dollar number is, you know, obviously the US companies are spending more still on AI. The percentage basis Chinese companies are

Starting point is 00:37:54 growing more next year. And you still have the problem of like, well, dollars spend to AI output in tokens or in whatever is going to be lower because these chips are worse. So power is not the gating factor. It's always capital, right? At least today, right? Now, China can spend a lot more capital if they wanted to. They're subsidizing the semiconductor industry to the tune of like $150, $200 billion a year through SOEs, through CAPEX that's not generating revenue, et cetera. So it's not like they couldn't do this to the AI ecosystem, right, given, you know, Meta's CAPEX is like 60 billion, right? And Google's CAPEX is like 80 billion, right? Like they could totally spend way more than that on a single effort they just haven't decided to.

Starting point is 00:38:38 And I just think for the US, our buildouts are constrained by power, right? Like Google has a ton of TPUs sitting waiting for data centers to be powered and ready, as does Meta with GPUs. Right. We posted about how Meta is now building these like effectively tents. Is this to some degree also coupled to their unwillingness to sell them to a broader ecosystem? I mean, if they want to be confined in their own data centers and they, you know, didn't ramp data center buildout for their own hyper, for their own, some hyperscaler use cases, quickly enough, right?

Starting point is 00:39:08 Then yes, that constrains them, right? If they were on the open market, will we still be constrained? Yeah, yeah, for sure. Because like companies like CoreWeave, you know, why is CoreWeave valuable is really because they build infrastructure really fast, right? And their software is nice, I think, but like, but a lot of their customers are bare metal, right? Just replace the GPUs whenever they're broken and networked properly.

Starting point is 00:39:29 They grew more aggressively. And they'll go anywhere. Jensen likes them as well. Yeah, they'll go. Yeah, yeah, that's very important as well. But they'll like go, because it's, it has been, right? Yeah, so I think what's really important is that, like, CoreWeave doesn't care, right?

Starting point is 00:39:48 They're like, oh, crypto data center. I will convert it to AI data center, right? They bought a company for, you know, like $10 billion that's doing crypto mining, which was worth like $2 billion a couple years ago, and it's not because their Bitcoin mining business is growing. It's because they have powered data centers, right? Like anywhere and everywhere,

Starting point is 00:40:05 people are trying to build powered data centers, and companies like CoreWeave and Oracle are moving to the... Actually, today, Google just bought 8% of a crypto mining company called TeraWulf, right? Not because they're getting into crypto mining. No, because they need the data. They want the power. They need the power, right? And it's like, all the hypers

Starting point is 00:40:27 have like said, screw off to my sustainability pledges because they need power as fast as possible. Yeah. Right? They're, you know, they're doing things that are not, that they take a little bit longer to move the ship, but like,

Starting point is 00:40:40 even if you didn't do it in your own self-built data centers, there's still a lot of challenges in the open market. There's a deficit, right? And that's, that's constraining American chip buildout heavily. Yes, others could maybe do it a little bit faster like CoreWeave or others, right? Oracle's got an open mind as well. But it's still constraining U.S. buildouts heavily. Even though the capital has been spent, right, the chips are, you know, 60 to 80% of the cost

Starting point is 00:41:12 of the cluster, depending on what chips you're getting. So it's like they've already bought the chips. They just can't put them anywhere because the data centers aren't ready. It applies to Google, applies to Microsoft, applies to Meta, applies to a lot of folks. I mean, it's really hard to build power infrastructure in the U.S. Power, grid interconnections, transmission, substations, all of this stuff. Like electrical contractors, electricians

Starting point is 00:41:34 in Texas, if you're willing to be a travel electrician, it's like oil pay, right? Like it used to be that like if you're physically adept, you could go make 100 grand in West Texas, but like who the fuck wants to do that? Now it's like, well, you could go like 200 miles away from Dallas, and what's still a reasonable town and build a data center

Starting point is 00:41:55 and work on the wiring within the data center and all this other stuff, the transmission stuff, and your pay is up like 2X now versus what it was just a few years ago. This labor problem is a challenge too, and it's, yeah, I think in China they don't have any of these problems, but they just haven't spent the capital yet,

Starting point is 00:42:12 but capital is an issue as well. Because of the scale of what's being spent, right? Like, Nvidia's revenue this year is going to be like over $200 billion, and next year's expectations are over 300 billion, plus Google's going to spend like 50 billion dollars on TPU data centers, right? And it's like, and Amazon's going to spend tons and tons on Trainium data centers. It's like the scale of dollars is quickly growing to nation-state level stuff. And what's more important is being able to decide to spend the dollars and what's

Starting point is 00:42:43 cost effective. And so to some extent, China's still constrained by that, but they can smuggle chips in. They can build data centers outside of China. They can rent data centers outside of China and have the most cost-effective, you know, Blackwell chips or whatever, right? ByteDance is, you know, either the biggest or the second-biggest customer of Google Cloud for a reason, right?

Starting point is 00:43:03 And they're getting, you know, over, you know, they're getting many, many Blackwell from them, right? And the same with Oracle and the same with Microsoft and all these other companies are renting tons of chips to China anyways because it's more cost-effective to do that than build it yourself. So it's not like China has this mentality where we only have to,

Starting point is 00:43:19 well, the government does, but the infrastructure companies don't. Like Alibaba, Tencent, ByteDance, et cetera. So what's the end game for data centers? I mean, like, we need more power, we need more cooling. Will the end be every, all data centers will be next to a nuclear reactor, lots of solar, you know, next to a deep level, like deep sea water

Starting point is 00:43:38 that we use for cooling or something like that? Or what's... I think that like cooling is... Like the physical cooling of a data center are like, you know, there's this whole narrative about like, oh, yeah, I use this so much power And it's like not really, you know, farming alfalfa uses like 100x the water of AI data centers. Even by the end of the decade, it'll be the same.

Starting point is 00:44:00 And it's like, alfalfa is like worth very little. So it's like there's like, it's like cooling is like not that. You know, people have like experimented with like, you know, undersea data centers to reduce the cooling cost. That doesn't make sense. It's like five, 10% savings. But then like if you want to get the water out of the ocean then, then put the data center into the ocean. It's like if you want to service it like, yeah, you're screwed. Right.

Starting point is 00:44:17 So like the same with power. It's like, we talk a lot about like the power is not actually that expensive. It's just hard to build, right? How to get to the right place? And hard to get to the right space and convert it down to the voltages and all the stuff that chips need. So it's less the magnitude of power and more where it is

Starting point is 00:44:34 and how it moves. Well, the magnitude too, right? Like, in terms of total worldwide energy consumption, AI data centers are still a small footprint. It's like nothing. Yeah. It's a fraction of a percent. Yeah, yeah. Even by the end of the decade, you know, in the U.S., like 10 percent of our power will be AI data centers, which is still like... Of electricity.

Starting point is 00:44:53 Of our electricity. In terms of energy, that's even a smaller fraction, right? Oh, yeah, yeah, because you think about... But shifting to electric vehicles, also, you can probably make a bigger swing than, you know, with all the AI data centers we can build... But outside, like, it's like in Europe, like, that number's not moving up that fast and, like, all these other countries. I think we need to build a lot more power, but it's not like some crazy, crazy, like,

Starting point is 00:45:15 amount. It's just, like, doing it properly is the hard thing. And again, like the cost of power, like, you go look at like these deals people are signing. They're still signing, like, even though the prices skyrocketed from like a few cents a kilowatt hour for these massive, massive purchases to like 10. It's still, you know, when you think about the full TCO of the cluster, you know, the GPU cost, the networking, all of this stuff far outstrips the power. Yeah. And same with cooling. But what percentage is power from, like if you do a four-year amortized GPU data center, what percentage will be power?

Starting point is 00:45:49 80% of the cost of a GPU data center if you're building Blackwell is capital. Yeah. Right? It's the GPU purchases. It's the networking. It's the physical data center conversion power conversion equipment.

Starting point is 00:46:01 All of this stuff is like 80% of the cost. And then 20% is going to be your land and your power and your cooling and your cooling towers and your backup power and your generators and all this stuff. It's like nothing, which is why it doesn't matter if you spend, you know, 10% or 50% more on that. because at the end of the day, the expensive thing, right? Like, this is why what Elon did would seem silly, right? They spent a lot more money on, you know, generators outside the data center and these mobile chillers to cool the water down for their liquid cooling

Starting point is 00:46:32 instead of like the more cost-effective option because it got the data center up three months faster. And so like that three months of additional training time is worth way, way, way more on a TCO basis, right? The performance you got out of the chips and the time to market and all this is way, way, way faster, and therefore it was the right decision, even though this part of the data center ballooned in cost, everything else is still there and you're still paying for the chips. And if they were sitting idle, it's not worth it.
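
The cost-share argument above can be checked with a few lines of arithmetic. This is a rough sketch, not a real TCO model: the 80/20 split and the four-year horizon come from the conversation, while the function names and the straight-line treatment of idle capital are assumptions made for illustration.

```python
CAPITAL_SHARE = 0.80  # GPUs, networking, data center, power conversion (split quoted above)
OTHER_SHARE = 0.20    # land, power, cooling, backup generators (split quoted above)

def total_cost_increase(overspend_on_other):
    """Fractional rise in total cost when only the 20% slice costs more."""
    return OTHER_SHARE * overspend_on_other

def idle_capital_cost(months_idle, amortization_months=48):
    """Share of total cost tied up in capital that sits unused.

    Assumes straight-line amortization over four years, a simplification.
    """
    return CAPITAL_SHARE * months_idle / amortization_months

for overspend in (0.10, 0.50):
    rise = total_cost_increase(overspend)
    print(f"{overspend:.0%} more on power and cooling -> total cost up {rise:.1%}")

print(f"3 idle months -> {idle_capital_cost(3):.1%} of total cost on unused capital")
```

Under these assumptions even a 50% overspend on the smaller slice moves total cost by a tenth, and the sketch ignores the value of the extra training time, which is the part of the argument that carries the most weight.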

Starting point is 00:46:57 Just by like bypassing the grid, bypassing anything to do with interconnect, anything to do with public utilities. Exactly, exactly. What's your take on Intel? Where is Intel going? I think the world, well, the US needs Intel. I think the world needs Intel.

Starting point is 00:47:11 I think the world needs Intel because like Samsung is doing worse than Intel on leading edge process development, in my opinion, based on, even on various customers in the industry, having done test chips at like Intel versus Samsung, they think, I think the industry generally agrees that Intel is further along, you know, the sort of the two nanometer class process technology

Starting point is 00:47:30 than Samsung is, but both are way behind TSMC. And TSMC is a monopoly to some extent. The number one question always people ask is like, why is TSM not making more money? Why are they only raising prices, you know, next year, you know, three to 10% depending on what it is. It's like TSM's a monopoly.

Starting point is 00:47:50 Like they could raise a lot more, but they're good Taiwanese people rather than like dirty American capitalists. If TSM was owned or was managed by Americans, I think most ownership is actually American in terms of the stock, it's on the New York Stock Exchange and all this like, you know, they would have raised prices a lot more. And so like there is this like difficult, difficult thing to be done.

Starting point is 00:48:14 that like, hey, there's one island that controls all leading edge semiconductors and not just all leading edge, like the majority of trailing edge production as well. Something needs to be done. Intel is behind, but not like, not like absurdly so, right? Like if something were to happen to Taiwan, Intel would have the most advanced technology in the world, right?

Starting point is 00:48:31 It's just, it's not economic. Can you keep Intel as one company if you want them to be competitive? I think the process of splitting it would take so much executive time and so much executive effort that you would have been bankrupt by then. Right. And that's the big challenge.

Starting point is 00:48:45 Like, I think Intel should be separate, right? But to properly split the company and for all the management time that's needed is like absurd. And instead, like, what you need is like, you need Lip-Bu Tan, who's the CEO of Intel. You know, there's a lot of drama going around about him because he's, he's one of the greatest semiconductor investors ever, right? He's invested in so many different companies first. You know, he was on the board of like SMIC, which is China's TSM, effectively, which is like a big like drama. Or some of the biggest tool companies in China, he was the first investor in them,

Starting point is 00:49:17 because, you know, it was a multipolar world there and he's making good investments. But, like, you know, now, like, people are getting mad about that, but it's like, no, he recognizes the companies, like, he understands the supply chain. He needs to not spend his time on splitting the company because then he never actually fixes the company, right? Intel's problem is that, like,

Starting point is 00:49:36 it takes them five to six years to go from design to shipping the product, in some cases more. And when they tape out a chip, right, like, you know, you send the design to the fab. The fab brings back the chip. They go through 14 revisions in some cases, whereas, like, the rest of the industry goes through, like, one to three, right? Revisions, if they're good.

Starting point is 00:49:57 Of, like, send the design in, get the chip back, test it, send the design in, right, for a public launch. And they'll launch a chip in three years. So, but if you look at Intel today, right, they still don't have a competitive entry on the AI side. And they won't. Right. Can you? So what does that mean for their offering? I mean, they're still doing great on CPUs.

Starting point is 00:50:17 They don't have a good AI chip product. Is it long-term sustainable positioning, right? I mean, as a standalone chip company? I mean, IBM still makes more money every launch off of mainframes. So it's not like x86 is dead. It's like you don't get the growth rates, but like you could totally run this as a very profitable enterprise. And I think the same with PCs, right?

Starting point is 00:50:37 There's some turmoil, there's some Arm entry, there's some AMD competition. They're very well. Well, like, I think it's a very, it can be a very profitable business if it had, like, one third the people or half the people working on it. And so, like, Lip-Bu Tan, to fix Intel, needs to go into both the design company and lay off a shitload of people, but, like, keep all the good people and make sure that they're designing fast and they're launching from design conception to launch in two to three years, not five to six. And that's on the design side and make that profitable. And then on the fab side he has to do the same thing. There's all these people, like, one of the heads of Fab Automation at Intel,

Starting point is 00:51:14 I explicitly told Lip-Bu Tan because, you know, we have a couple ex-Intel people who are actually good in the company that worked on the fab side, and we're like, they were like, who's the worst people and friends? It's like, oh, this guy sucks. I explicitly told Lip-Bu Tan, he'd never talk to the guy because it was like four layers down. The company has like absurd amounts of hierarchy. It's like four layers down.

Starting point is 00:51:33 He goes to talk to the guy and he's out, right? It's like, he figures out like who's bad, right? and who's good. And he has to go in and he's to like, hey, the vast majority of the team at Intel is the one who led the world in production and process technology for 20 years. Right?

Starting point is 00:51:48 But there's a lot of like built up crap. So he has to go figure this out. Right? He can't waste his time on like, oh, all this like structuring to split. Like I think it would be better if the company split. I just don't think he can spend the time to do that. And if the design side of the company is,

Starting point is 00:52:06 you know, you're not really going to get into AI. You're not really going to... You have to make some money there. But the fabs, I think, could truly become a competitor. But they're going to go bankrupt by the time anything can happen. So they have to figure out how to get capital. So he has to figure out how to get capital. He has to figure out how to clean up all the crap.

Starting point is 00:52:21 Make the yields go up, right? Make the product ship way faster. Like, all of these things are based on. I think the goals are completely correct. I mean, I think the big challenge, just reflecting back on my time there, right? I think the big challenge is that right now, if you look at Intel, right, they have essentially software, the chip design,

Starting point is 00:52:39 and then there's the core manufacturing part, right? And they have three very different cultures. And it's very hard to get everything under one umbrella, right? And so I think that is the big challenge. I think you should even run the company separately, right? But like, you can't physically separate them entity-wise because it's going to take so long to sever all these things. Because he doesn't have time, right?

Starting point is 00:52:58 Like Intel is literally going to go bankrupt if they don't have a big cash infusion or they lay off like half the company, right, which some could argue you need to lay off like 30% of the company anyways, but there's a lot of bad things that happen if that happens, right? And they need to spend a lot more on building the next generation fab, even if they fix the fab, and they don't have money for that, right? So there's like, there's like a lot more, more important problems than like physically separating the company,

Starting point is 00:53:24 even though I think long term, yes, the fab has to be separate from the chips, design software, right? Or chip design part of the company. Just like, that's going to make each company much more accountable, be able to service their customers better, et cetera. It's just, that's going to take too long and they're going to go bankrupt by then. Awesome.

Starting point is 00:53:41 But I think, I hope, I pray, someone does something, right? Like you get a big capital infusion, I don't know, the big hyperscalers are like muscled into like, oh, okay, wait, if TSM eventually grows their margin to 75% because of the monopoly, plus they intake all this stuff, like co-packaged optics and power delivery and all this, like all of a sudden the cost is going to spike. So we should actually just throw $5 billion at

Starting point is 00:54:03 Intel each, right? Screw it. And that could actually give Intel enough of a lifeline to potentially get to something and maybe be competitive. That's the hope. Can we finish by finishing this game that we started when we gave Sam Altman advice? If Jensen was here, what advice would you have for him? If Jensen was here, you know, I think he has a massive, massive balance sheet, right? Jensen does. His free cash flow is, like, you know, ridiculous. The tax cut, the new Trump, you know, tax bill

Starting point is 00:54:40 institutes something really incredible, which is that you can depreciate all of the GPU cluster cost in year one, which we put out like a note about how like the tax implications to like Meta are like $10 billion a year. And across each of the major hyperscalers, it's like massive. It's like, well,

Starting point is 00:54:56 Nvidia's going to spend tons and tons of cash or they're going to spend like, you know, tens of billions of dollars of taxes. Why don't you get into the infrastructure game somehow? Now, this is obviously going to be like crazy because now they're buying their own GPUs and putting them in data centers and doing stuff and they're competing with their own customers,

Starting point is 00:55:15 but they're already doing that anyways because their customers are trying to make chips. But they should like accelerate the data center ecosystem with investments, right? Because really we think we can like have very high degree of like accuracy on what they're going to do next year in terms of revenue because it's just, the number of data center watts that are being built, right?

Starting point is 00:55:35 Like, this is a harder thing to shift up and down, right? Now, there's a little bit of share difference between how much is TPU versus GPU, but it's like, you have to accelerate the infrastructure and you need to spend all of this capital that you're building, right? Like, okay, do you want to go the route of like doing buybacks and dividends? Like, great, like, you're a loser if you do that, right? Like, you can make more money by reinvesting and building a bigger company that's not just chips into the ecosystem or servers into the ecosystem, but actually like control

Starting point is 00:56:03 the infrastructure end to end somehow. So I think there's something he could do there with this massive war chest. And there's a reason like, Nvidia's done some buybacks and they've done some dividends and increasing, but the cash on their balance sheet keeps growing. And they're going to have north of $100 billion

Starting point is 00:56:19 of cash on their balance sheet by the end of this year, I think. So it's like, what are you going to do with that? I think there's something moving into the infrastructure layer much more that they could do if he really wants to be the king of the world, right, which I think he does. Sergey and Sundar? Cool.

Starting point is 00:56:39 I think they should open up the kimono on TPUs, right? Like start selling them, open up the software, open source a lot more of the XLA software because there's OpenXLA and there's XLA, but the vast majority is closed source. Really, really open up the kimono on that and be a lot more aggressive, right? They're still pretty not aggressive on data centers. They're pretty not aggressive on a lot of elements of the company. The TPU team's next-gen designs are pretty not aggressive, partially because a lot of the TPU team has left to go to OpenAI,

Starting point is 00:57:12 the best people that I knew. It was actually really annoying. I knew like four people or five people, and they all went to OpenAI, and it's like, fuck. Like, now I don't get as much. I met some other people, right? But it's like, you know, I think they could be a lot more aggressive in many ways across the company.

Starting point is 00:57:27 They don't have to be, right? But they could. Because AI, you know, like this ChatGPT take rate, the shift of search queries, the monetizable ones especially, to purchasing agents, is going to really screw Google long term if they don't, you know, get their act together. I think they've gotten their act together on DeepMind. There's still some inefficiencies, but Sergey works on, works within DeepMind a lot and they're driving hard.

Starting point is 00:57:50 They're still a little bit behind, but like, I think like physical infrastructure, TPU and how much money they could make and how much they could take the wind out of everyone else's sails if they start selling TPUs externally and reorg around like building data centers much faster so that they do have the most compute in the world because they did. But now there's certain companies that are going to surpass them potentially over the next few years if they don't really get their act together. So I think that's what I would say for them. Yeah. And also like learn how to ship product. Yeah. Zuck? I think, I think Zuck, you know, it remains to be seen what goes on with superintelligence,

Starting point is 00:58:31 but like they're trying to move super fast with the data centers. You know, like screw it, we'll build tents instead of like physical data centers because we only need these for five years anyways. You know, the superintelligence moves. You could say whatever you want, but like, you know, trying to buy like Thinking Machines

Starting point is 00:58:45 for like $30 billion or SSI for $30 billion didn't work out. So then they spent, you know, not even that much on hiring, not $30 billion on hiring all these people. So I think that he recognizes the urgency with the models, with the infrastructure, So I really think he needs to like, you know, if you read his website post about like AI, like I think, you know, he sees the vision, right?

Starting point is 00:59:08 There's the wearables, there's integrating AI into that. There's being your AI assistant to do all this purchasing and stuff. I think he sees the vision, but I think he also needs to focus on like actually like releasing that faster. But also like the products that they do outside of their core IP, every time they launch something, it's kind of mid, right? You know, Meta Reality Labs is doing well, but I think they should like go more explicit, like have a ChatGPT competitor, have a Claude competitor,

Starting point is 00:59:37 like just start releasing way more products because they're really just focused on their individual gardens rather than like branching outside of it. Do you think Apple should have that same sense of urgency or if Tim Cook was here, what would you tell them? The funny thing is like some of their best AI people are now like at Super Intelligence. They're building an AI accelerator,

Starting point is 00:59:56 They're going to, they're, they have AI models, but they're just like way slower. They did mention on the last earnings call, they're going to allocate more capital with this, but it's like, guys, Apple, like, you guys are going to miss the boat if you do not spend, like, $50, $100 billion on infrastructure. You don't think the concierge will cut it. I think, I think, like, more and more you'll see people, like, you know, great. Apple has this walled garden, but, like, they can only do so much to protect it, right? IDFA, like, they shut down ads to, or data sharing to Meta.

Starting point is 01:00:26 But Meta made better models, and now they have way more data and way more power over the user than they ever did before. It was good that Meta kicked the crutch off of them, or Apple did. But the same applies to like AI. Yes, they have access to the text and they have access to this. But like, I think other people are going to be able to integrate user data. And agents will be able to integrate all this user data. And they'll start to lose control of what the user experience is as more and more gets disintermediated by AI being the interface, rather than touch, rather than, you know, touchpad and keyboard.

Starting point is 01:00:59 And I don't think they've truly realized what happens when the interface to computing is AI. Like, they market it, but like that's going to shift computing really heavily. They have great hardware, and their hardware teams are working on awesome stuff and form factors. But like, I just don't know if they like get what is actually going to happen to the world in the next five years truly well enough, and they're not building fast enough for it. What about Microsoft to that end?

Starting point is 01:01:25 Microsoft has the same problem. I think they were super aggressive in '23 and '24. And then they pulled back heavily, right? Now, like, OpenAI is slipping through their grasp. There's that whole thing there. They cut back on data center investments heavily. They were going to be the largest infrastructure company in the world by like a factor of 2X, which would have been, you know,

Starting point is 01:01:48 you could argue maybe that was like too much and maybe it wouldn't have been economical. But like they're losing grasp on OpenAI, their internal model efforts are failing spectacularly. Like they're on LM Arena right now and they're pretty decent there, but it's like, that's just like a sycophantic model. Like, it's a code name, but like whatever. Like MAI is like failing.

Starting point is 01:02:09 Azure is like losing a lot of share to Oracle and CoreWeave and Google and so on and so forth, right? Their internal chip effort is by far the worst of any hyperscaler. Like they just, like, mis-execute. Like GitHub, how is GitHub not the highest ARR software code model? They only had the best IDE, the best source code repository, the best enterprise

Starting point is 01:02:33 sales force, the best model company as a relationship, and they were the first to market, right? It's like, they had everything going for them. And like there's just nothing, right? It's like, like GitHub Copilot is failing. Microsoft Copilot is like still crap, right? Like, yeah, it's unusable. It's like, what is going on?

Starting point is 01:02:51 Like, you need to shake the crap out of the company. Like, I think they win a lot because they have the best business to business relationship with so many enterprises. That sounds like on the planet. Yeah, but like they end up like not having the actual product to sell them, which is like really scary. So they need to really work on product. Satya has done great on sales and stuff, but like, yeah.

Starting point is 01:03:11 If Elon was here, what advice would you give him? A lot of people at xAI are mad about the porn models, like porn stuff. It's fine. Like you're going to make a ton of money off of this. This is how you accelerate the revenue of that company. But like he's losing a lot of talent and axing a lot of good projects. But Elon is a magnet to amazing talent and building stuff,

Starting point is 01:03:30 so I won't bet against him, but it seems like since he left the administration and focused on stuff again. But I think, I don't know, I think he's focused on a lot of things, and I think, like, Robotaxi is starting to look good, actually, again. Like, I haven't ridden one yet, but I have some friends who've ridden one.

Starting point is 01:03:42 It's, like, looks pretty decent. He could, like, not make these snap decisions, which often are the reason why he's amazing, but, like, some of these snap decisions are hurting him. Yeah. I'm not sure if I can give Elon that much great advice. because I think maybe it's just like focus on like the products again, right? More.

Starting point is 01:04:00 But he is working on that stuff a lot. Yeah. I think that might be a good place to wrap. Awesome. It was a great discussion. Dylan, thanks so much for joining us. Thank you for having me. Thanks for listening to the A16Z podcast.

Starting point is 01:04:13 If you enjoyed the episode, let us know by leaving a review at ratethispodcast.com slash A16Z. We've got more great conversations coming your way. See you next time. As a reminder, the content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security and is not directed at any investors or potential investors in any A16Z fund. Please note that A16Z and its affiliates may also maintain investments in the companies discussed in this podcast. For more details, including a link to our investments, please see A16Z.com forward slash disclosures.
