Dwarkesh Podcast - Satya Nadella — How Microsoft is preparing for AGI

Episode Date: November 12, 2025

As part of this interview, Satya Nadella gave Dylan Patel (founder of SemiAnalysis) and me an exclusive first look at their brand-new Fairwater 2 datacenter. Microsoft is building multiple Fairwaters, each of which has hundreds of thousands of GB200s & GB300s. Between all these interconnected buildings, they'll have over 2 GW of total capacity. Just to give a frame of reference, even a single one of these Fairwater buildings is more powerful than any other AI datacenter that currently exists.

Satya then answered a bunch of questions about how Microsoft is preparing for AGI across all layers of the stack.

Watch on YouTube; read the transcript.

Sponsors

* Labelbox produces high-quality data at massive scale, powering any capability you want your model to have. Whether you're building a voice agent, a coding assistant, or a robotics model, Labelbox gets you the exact data you need, fast. Reach out at labelbox.com/dwarkesh

* CodeRabbit automatically reviews and summarizes PRs so you can understand changes and catch bugs in half the time. This is helpful whether you're coding solo, collaborating with agents, or leading a full team. To learn how CodeRabbit integrates directly into your workflow, go to coderabbit.ai/dwarkesh

To sponsor a future episode, visit dwarkesh.com/advertise.

Timestamps

(00:00:00) - Tour through Fairwater 2
(00:03:20) - Business models for AGI
(00:12:48) - Copilot
(00:20:02) - Whose margins will expand most?
(00:36:17) - MAI
(00:47:47) - The hyperscale business
(01:02:44) - In-house chip & OpenAI partnership
(01:09:35) - The CAPEX explosion
(01:15:07) - Will the world trust US companies to lead AI?

Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe

Transcript
Starting point is 00:00:00 Today, we are interviewing Satya Nadella, we being me and Dylan Patel, who is the founder of SemiAnalysis. Satya, welcome. Thank you. It's great. Thanks for coming over to Atlanta. Yeah, thank you for giving us the tour of the new facility. It's been really cool to see. Absolutely. Satya and Scott Guthrie, Microsoft's EVP of Cloud and AI,
Starting point is 00:00:19 gave us a tour of their brand new Fairwater 2 data center, currently the most powerful in the world. We've tried to 10x the training capacity every 18 to 24 months. And so this would be effectively a 10x increase, 10x from what GPT-5 was trained with. And so to put it in perspective, the number of optics, the network optics, in this building is almost as much as all of Azure across all our data centers
Starting point is 00:00:43 two and a half years ago. It's kind of what, five million network connections. You've got all this bandwidth between different sites in a region and between the two regions. So is this like a big bet on scaling, in that you anticipate in the future there's going to be some huge model that requires two whole different regions to train? The goal is to be able to kind of aggregate these flops for a large training job and then put these things together across sites, right? And the reality is you'll use it for training
Starting point is 00:01:13 and then you'll use it for data gen, you'll use it for inference, in all sorts of ways. It's not like it's going to be used only for one workload forever. Fairwater 4, which you're going to see under construction nearby, yeah, will also be on that one petabit network, so that we can actually link the two at a very high rate. And then basically we do the AI WAN connecting to Milwaukee, where we have multiple other Fairwaters being built. Literally, you can see the model parallelism and the data parallelism. It's kind of built for, essentially, the training jobs, the pods, the super pods, across this campus. And then with the WAN you can go to the Wisconsin data center, and you literally run a training job with all of them getting aggregated.
Starting point is 00:02:00 And what we're seeing right here is this is a cell with no servers in it yet, no racks. How many racks are in a cell? We think about it. We don't necessarily share that per se, but we, let me... It's the reason I ask. You'll see upstairs, you can start counting. You'll start counting. We'll let you start jogging. How many cells are there in this building? That part also I can't tell you.
Starting point is 00:02:18 Well, division is easy, right? My God, it's kind of loud. Are you looking at this? Like, now I see where my money is going. It's kind of like, I run a software company. Welcome to the software company. How big is the design space once we've decided to use GB200s and NVLink? How many other decisions are there to be made?
Starting point is 00:02:41 There is coupling from the model architecture to what is the physical plant that's optimized. And it's also scary in that sense, which is, hey, there's going to be a new chip that will come out, which obviously, I mean, you take Vera Rubin Ultra, I mean, that's going to have power density that's going to be so different, but with cooling requirements that are going to be so different, right? So you kind of don't want to just build all to one spec.
Starting point is 00:03:10 So that goes back a little bit to, I think, the dialogue we'll have, which is you want to be scaling in time as opposed to scale once. And then be stuck with it. When you look at all the past technological transitions, whether it be, you know, railroads or the Internet or, you know, replaceable parts and industrialization, the cloud, all of these things, each revolution has gotten much faster in the time it goes from technology discovery to ramp and pervasiveness through the economy. Many folks who have been on Dwarkesh's podcast believe this is sort of the final technological revolution or transition. And this time is very, very different. And at least so far in the markets, it's sort of, you know, in three years, we've already skyrocketed to, you know, hyperscalers are doing $500 billion of capex next year, which is a scale that's unmatched to prior revolutions in terms of speed.
Starting point is 00:04:02 And the end state seems to be quite different. How do you, your framing of this seems quite different than sort of the, I would say, the AI bro, who is quite, you know, AGI is coming and, you know, I'd like to understand that more. I mean, look, I start with the excitement that I also feel for maybe after the industrial revolution, this is the biggest thing. And so, therefore, I start with that premise. But at the same time, I'm a little grounded in the fact that this is still early innings. We've built some very useful things. We're seeing some great properties.
Starting point is 00:04:39 These scaling laws seem to be working. And I'm optimistic that they'll continue to work, right? Some of it is, you know, it does require real science breakthroughs, but it's also a lot of engineering and what have you. But that said, I also sort of take the view that, you know, even what has been happening in the last 70 years of computing has also been a march that has helped us move, you know, with, as I said, you know, I like one of the things that Raj Reddy has as a metaphor for what AI is, right?
Starting point is 00:05:15 He's a Turing Award winner at CMU. And he's always, I think he had this even pre-AGI, but he had this metaphor that AI should either be a guardian angel or a cognitive amplifier. I love that. It's a simple way to think about what this is. Ultimately, what is its human utility?
Starting point is 00:05:35 It is going to be a cognitive amplifier and a guardian angel. And so if I sort of view it that way, I view it as a tool. But then you can also go very mystical about it and say, wow, this is, you know, more than a tool. It does all these things which only humans did so far. But that has been the case with many technologies in the past. Only humans did a lot of things. And then we add tools that did them. I guess we don't have to get
Starting point is 00:05:58 wrapped up in the definition here. But maybe one way to think about it is like maybe it takes five years, 10 years, 20 years. At some point, eventually a machine is producing Satya tokens, right? And the Microsoft board thinks that Satya tokens are worth a lot. How much economic value are you wasting by interviewing Satya? You could not afford the API cost of Satya tokens. But so, you know, whatever you want to call it, are the Satya tokens a tool or an agent, whatever? Right now, if you have models that cost on the order of dollars or cents per million tokens, there's just an enormous room for expansion, a margin expansion there, where a million tokens of Satya are, like, worth a lot.
Starting point is 00:06:37 And where does that margin go? And what level of that margin is Microsoft involved in, is the question I have? So I think in some sense, this goes back to essentially what's the economic growth picture going to really look like, what's the firm going to look like, what's productivity going to look like. And that, to me, is where, again, with the industrial revolution, after whatever, 70 years of diffusion is when you started seeing the economic growth, right? That's the other thing to remember: even if the tech is diffusing fast this time around, for true economic growth to appear, it has to sort of diffuse to a point where the work artifact and the workflow have to change. And so that's kind of one place where I think the change management required for a corporation to truly change, I think, is something
Starting point is 00:07:30 we shouldn't discount. So I think going forward, do humans and the tokens they produce get higher leverage, right? Whether it's the Dwarkesh or the Dylan tokens of the future. I mean, think about the amount of technology. Would you be able to run SemiAnalysis or this podcast without technology? No chance, right? I mean, the scale that you have been able to achieve, no chance. So the question is, what's that scale? Is it going to be 10xed with something that comes through?
Starting point is 00:07:59 Absolutely. And therefore, whether it's your ramp to some revenue number or your ramp to some audience number or what have you. And so that, I think, is what's going to happen. I mean, the point is that whatever took 70 years, maybe 150 years, 50 years, for the Industrial Revolution may happen in 20 years, 25 years. That's a better way to, like, I would love to compress what happened in 200 years of the industrial revolution into a 20-year period, if you're lucky.
Starting point is 00:08:28 So Microsoft historically has been perhaps, you know, the greatest software company, the largest software-as-a-service company. You know, you've gone through a transition in the past where you used to sell Windows licenses and disks of Windows or Office, and now you sell, you know, subscriptions to Microsoft 365. As we go from sort of, you know, that transition to where your business is today, there's also a transition coming after that, right? Software as a service, incredibly low incremental cost per user. There's a lot of R&D.
Starting point is 00:08:59 There's a lot of customer acquisition cost. This is why, not Microsoft, but the SaaS companies have underperformed massively in the markets, because the COGS of AI is just so high and that just completely breaks how these business models work. How do you, as perhaps the greatest software company, software-as-a-service company, transition Microsoft to this new age where COGS matters a lot and the incremental cost per user is different, right?
Starting point is 00:09:26 Because right now you're charging, hey, it's 20 bucks for a Copilot. Yeah, so I think this is a great question, because in some sense, with the business models themselves, I think the levers are going to remain similar, right? Which is, if you look at the menu of models, starting from, like, say, consumer all the way, right? There will be some ad unit. There will be some transaction.
Starting point is 00:09:48 There will be some device gross margin for somebody who builds an AI device. There will be subscriptions, consumer and enterprise. And then there'll be consumption, right? So I still think that's kind of how it goes; those are all the meters. To your point, what is a subscription? Up to now, people like subscriptions because they can budget for them, right? They are essentially entitlements to some consumption rights that come encapsulated in a subscription. So that, I think, is what, in some sense, becomes a pricing decision.
Starting point is 00:10:20 So how much consumption you're entitled to is, if you look at all the coding subscriptions, that's kind of what they are, right? And they kind of have the pro tier, the standard tier, and what have you. And so I think that's how the pricing, you know, and the margin structures will get tiered. The interesting thing is, at Microsoft, the good news for us is we kind of are in that business across all those meters. In fact, at a portfolio level, we pretty much have consumption, subscriptions, and all of the other consumer levers as well.
Starting point is 00:11:00 And then I think time will tell which of these models make sense in what categories. One thing on the SaaS side, since you brought it up, which I think a lot about, is take Office 365 or Microsoft 365. I mean, man, having a low ARPU is great. Because here's an interesting thing, right? During the transition from server to cloud, one of the questions we used to ask ourselves is, oh my God, if all we did was just basically move the same users
Starting point is 00:11:26 who were using, let's call it, our Office licenses and our servers at that time, Office servers, right, to the cloud. And we had COGS. This is going to basically not only shrink our margins, but we'll be fundamentally an unprofitable, or a less profitable, company. Except what happened was that move to the cloud
Starting point is 00:11:45 expanded the market like crazy. Right, I mean, we sold a few servers in India, didn't sell much, whereas in the cloud, suddenly everybody in India also could afford fractionally buying servers. The IT costs, I mean, in fact, the biggest thing I had not realized, for example, was the amount of money people were spending
Starting point is 00:12:05 buying storage underneath SharePoint. In fact, EMC's biggest segment may have been storage servers for SharePoint. All that sort of dropped in the cloud because nobody had to go buy it. In fact, it was working capital. I mean, basically it is cash flow out, right? And so it expanded the market massively. So this AI thing will be that, right? So if you take coding, what we built with GitHub
Starting point is 00:12:35 and VS Code over whatever decades, suddenly the coding assistant is that big in one year. And so that, I think, is what's going to happen as well, which is the market expands massively. I guess there's a question: the market will expand, but will the parts of the revenue that touch Microsoft expand? So Copilot is an example where, if you look at early this year, I guess, according to Dylan's numbers,
Starting point is 00:13:02 the GitHub Copilot revenue was 500 million or something like that, and then there were like no close competitors. Whereas now you have Claude Code, Cursor, and Copilot with around similar revenue, around a billion, and then Codex is catching up, around $700-800 million. And so the question is, across all the surfaces that Microsoft has access to, what is the advantage that Microsoft's equivalents of Copilot have? Yeah. By the way, I love this chart. You know, I love this chart for so many reasons. One is, we're still on the top. Second is, all these companies that are listed here are all companies that have been born in the last four or five years.
Starting point is 00:13:40 Yeah. That to me is the best sign, right, which is if you have new competitors, new existential problems, when you say, man, who is it now? Claude's going to kill you. Cursor is going to kill you. It's not Borland, right? So thank God. Like, that means we are in the right direction.
Starting point is 00:13:54 But this is it, right? The fact that we went from nothing to this scale is the market expansion. So this is like the cloud-like stuff. Fundamentally, this category of coding and AI is probably going to be one of the biggest categories. It is a software factory category. In fact, it may be bigger than knowledge work. So I kind of want to keep myself open-minded about,
Starting point is 00:14:17 I mean, we're going to have tough competition. I think that's your point, which I think is a great one. But, man, like, I'm glad we parlayed what we had into this. And now we have to compete. And so on the compete side, even in the last quarter we just finished, we did our quarterly announcement, I think we grew from 20 to 26 million subs, right? So I feel good about our sub growth and where the direction of travel
Starting point is 00:14:42 on that is. But the more interesting thing that has happened is, guess where all the repos of all these other guys who are generating lots and lots of code go? They go to GitHub. So GitHub is at an all-time high in terms of repo creation, PRs, everything. So that, in some sense, we want to keep open, by the way. That means we want to have that, right? Because we don't want to conflate that with our own growth, right? Interestingly enough, we're getting one developer joining GitHub a second or something. That is the stat, I think.
Starting point is 00:15:14 And then 80% of them just fall into some GitHub Copilot workflow just because they're there. And by the way, many of these things will even use some of our code review agents, which are on by default just because you can use them. So we'll have many, many structural shots at this. The thing that we're also going to do is what we did with the primitives of GitHub, starting with Git, to issues, to actions. These are powerful, lovely things because they kind of are all built around your repo. So we want to extend that.
Starting point is 00:15:48 Last week at GitHub Universe, that's kind of what we did, right? So we said Agent HQ was the conceptual thing that we said we're going to build out. This is where, for example, you have a thing called Mission Control. And you go to Mission Control and now I can fire off. Sometimes I describe it as the cable TV of all these AI agents, because I'll have essentially packaged into one subscription Codex, Claude, you know, Cognition stuff, anyone's agents, Grok, all of them will be there. So I get one package and then I can literally go issue a task, steer them.
Starting point is 00:16:25 So they will all be working in their independent branches. I can monitor them. Because I think that's going to be one of the biggest places of innovation, right? Because right now, I want to be able to use multiple agents. I want to be able to then digest the output of the multiple agents. I want to be able to then keep a handle on my repo. So there's some kind of a heads-up display that needs to be built for me to quickly steer and triage what the coding agents have generated.
Starting point is 00:16:53 That, to me, between VS Code, GitHub, and all of these new primitives we will build, like Mission Control, with a control plane and observability. I mean, think about it: everyone who is going to deploy all this will require a whole host of observability of what agent did what, at what time, to what codebase. So I feel that's the opportunity. And at the end of the day, your point is well taken, which is we better be competitive and innovate. And if we don't, yes, we will get toppled. But I like the chart, at least as long as we're on the top, even with competition. The key point here is sort of that GitHub will keep growing,
Starting point is 00:17:29 regardless of whose coding agent wins. But that market only grows at, you know, call it 10, 15, 20%, which is way above GDP. It's a great compounder. But these AI coding agents have grown from, you know, call it $500 million run rate at the end of last year, which was basically just GitHub Copilot, to now the current run rate across, you know, GitHub Copilot, Cursor, Cognition, Windsurf,
Starting point is 00:17:52 Replit, OpenAI Codex. That's run-rating at $5-6 billion now for Q4 of this year. That's a 10x, right? And when you look at, hey, what's the TAM of software agents? Is it the $2 trillion of wages you pay people? Or is it something beyond that? Because every company in the world will now be able to develop more software. No question Microsoft takes a slice of that.
Starting point is 00:18:19 But you've gone from near 100%, or certainly way above 50%, to, you know, sub-25% market share in just one year. What is the sort of confidence that people can get that Microsoft will be? Again, it goes back a little bit to: there's no birthright here that we should have any confidence, other than to say, hey, we should go innovate. And the lucky break we have, in some sense, is that this category is going to be a lot bigger than anything we had high share in. Let me say it that way, right? In some sense, you could say, man, we kind of had high share in VS Code. We had high share in the repos with GitHub. And that was a good market, but the point is even having a decent share in what is a much more expansive market, right?
Starting point is 00:19:04 I mean, you could say we had a high share in client-server computing. We have much lower share than that in hyperscale. But is it a much bigger business, by orders of magnitude? So at least there's existence proof of Microsoft being okay, even if our share position has not been as strong as it was, as long as the markets we are competing in are creating more value, and there are multiple winners. So I think that's the stuff. But I take your point that ultimately it all means you have to get competitive. So I watch that every quarter. And so that's why I think I'm very optimistic about what we are going to do with GitHub HQ, or Agent HQ, turning GitHub
Starting point is 00:19:46 into a place where all these agents come. As I said, we'll have multiple shots on goal there. It can be that, hey, some of these guys succeed along with us. And so it doesn't need to be just one winner and one subscription. I guess the reason to focus on this question is that it's not just about GitHub, but fundamentally about Office and all the other software that Microsoft offers. One vision you could have about how AI proceeds is that, look, the models are going to keep being hobbled, and then you'll need this direct, visible observability all the time.
Starting point is 00:20:22 And another vision is, over time, these models can, now they're doing tasks that take two minutes. In the future, they'll be doing tasks that take 10, 30 minutes. In the future, maybe they're doing days' worth of work autonomously. And then the model companies are charging thousands of dollars, maybe, for access to really a coworker, which could use any UI to communicate with their human and so forth, and migrate between platforms. So if we were getting closer to that, why aren't the model companies, that are just getting more and more profitable, the ones that are taking all the margin? Why is the place where the scaffolding happens, which becomes less and less relevant as AI becomes more capable,
Starting point is 00:21:01 going to be that important? And that goes to, you know, Office as it exists now versus coworkers that are just doing knowledge work autonomously. I think that's a great point. I mean, for example, this is where, you know, does all the value migrate just to the model, or does it get split between the scaffolding and the model and what have you? I think that time will tell, but my fundamental point also is the incentive structure gets clear, right? Which is, let's take information work. We'll take even coding. Already, in fact, one of the favorite settings I have in GitHub Copilot is called Auto, right,
Starting point is 00:21:43 which will just optimize. In fact, I buy a subscription. The Auto one will start picking and optimizing for what I am asking it to do. And it could even be fully autonomous, and it could sort of arbitrage the tokens available across multiple models to go get a task done. So that means that if you take that argument, the commodity there will be models. And especially with open-source models, you can pick a checkpoint, and you can take a bunch of your data and train it, right? I think all of us, whether it's Cursor or Microsoft, you'll start seeing some in-house models even, and then you'll offload most of your tasks to them.
Starting point is 00:22:25 So I think that one argument is, if you win the scaffolding, which today is dealing with all the hobbling problems or the jaggedness-of-this-intelligence problems, which you kind of have to, if you win that, then you will vertically integrate yourself into the model, just because you will have the liquidity of the data and what have you, and there are enough and more checkpoints that are going to be available. That's the other thing. So structurally, I think there will always be an open-source model that will be fairly capable in the world, which you could then use, as long as you have something to use it with, which is data and a scaffolding. So I can
Starting point is 00:23:07 make the argument that, oh my God, if you're a model company, you may have a winner's curse. You may have done all the hard work, done unbelievable innovation, except it's kind of like one copy away from being commoditized. And then the person who has the data for grounding and context engineering and the liquidity of data can then go take that checkpoint and train it. So I think the argument can be made both ways. Unpacking sort of what you said,
Starting point is 00:23:37 there's two views of the world, right? One is that models, there's so many different models out there, open source exists, there will be differences between the models that will drive some level of who wins and who doesn't, but the scaffolding is what enables you to win. The other view is that actually models are the key IP, and yes, everyone's in a tight race, and there's some, hey, I can use Anthropic or OpenAI,
Starting point is 00:24:01 and you can see this in the revenue charts, right? Like OpenAI's revenue started skyrocketing once they finally had a code model with similar capabilities to Anthropic's, although in different ways. There's a view that the model companies are actually the ones that garner all the margin, right? Because, you know, if you look across this year, at least for Anthropic, their gross margins on inference went from, you know, well below 40% to north of 60, right, by the end of the year. The margins are expanding there despite, hey, more Chinese open-source models than ever. Hey, OpenAI is competitive. Hey, Google's competitive. Hey, Grok is now competitive, right? All these companies are now competitive.
Starting point is 00:24:36 And yet, despite this, the margins have expanded at the model layer significantly. How do you think about the... It's a great question. I think the one thing is, perhaps a few years ago, people would say, oh, I can just wrap a model and build a successful company. And that, I think, has probably gotten debunked, just because of the model capabilities, and with tool
Starting point is 00:25:00 use in particular. But the interesting thing is, when I look at Office 365, let's take even this little thing we built called Excel Agent. It's interesting, right? Excel Agent is not a UI-level wrapper. It's actually a model that is in the middle tier.
Starting point is 00:25:18 In this case, because we have all the IP from the GPT family, we are taking that and putting it into the core middle tier of the Office system, to teach it what it means to natively understand Excel, everything in it. So it's not just, hey, I have a pixel-level understanding. I have a full understanding of all the native artifacts of Excel, both when I see it, because if you think about it, if I'm going to give it some reasoning task,
Starting point is 00:25:48 I need to even fix the reasoning mistakes I make. And so that means I need to both not just see the pixels, I need to be able to see, oh, I got that formula wrong, and I need to understand that. And then so to some degree, that's all being done, not at the UI wrapper level with some prompt, but it's being done in the middle tier by teaching it all the tools of Excel, right?
Starting point is 00:26:07 So I'm giving it even essentially a markdown to teach it the skills of what it means to be a sophisticated Excel user. So it's a weird thing that goes back a little bit to AI brain, right, which is you're building not just Excel. You're taking the Excel business logic in the traditional sense and wrapping essentially a cognitive layer around it using this model,
Starting point is 00:26:33 which knows how to use the tool. So in some sense, Excel will come with an analyst bundled in, with all the tool use. That's the type of
Starting point is 00:26:54 stuff that'll get built by everybody. So even the model companies have to compete, right? If they price their stuff high, guess what? If I'm a builder of a tool like this, I'll substitute you. I may use you for a while. And so as long as there's competition, is there always a winner-take-all thing? If there's going to be one model that is better than everybody else by a massive distance, yes, that's a winner-take-all. As long as there's going to be competition with multiple models, just like hyperscale competition, and there's an open-source check, there is enough room here to go build value on top of models. But at Microsoft, the way I look at it is, we are going to be in the
Starting point is 00:27:22 hyperscale business, which will support multiple models. We will have access to OpenAI models for seven more years, which we will innovate on top of. So, essentially, I think of ourselves as having a frontier-class model that we can use and innovate on with full flexibility, and we'll build our own models with MAI. And so we will always have a model level. And then we'll build these, whether it's in security,
Starting point is 00:27:50 whether it's in knowledge work, whether it's in coding or in science, we will build our own application scaffolding, which will be model forward, right? It won't be a wrapper on a model, but the model will be wrapped into the application. I have so many questions about the other things you mentioned, but before we move on to those topics,
Starting point is 00:28:08 I still wonder whether this is not forward-looking on AI capabilities, where you're imagining models as they exist today, where, yeah, it takes a screenshot of your screen, but it can't look inside each cell and see what the formula is. And I think the better mental model here is like, look, just imagine that these models actually will be able to use a computer as well as a human. And a human knowledge worker who is using Excel can look into the formulas, can, you know, use alternative software, can migrate data between Office 365 and another piece of software if the migration is necessary, etc. So what is... That's kind of what I'm saying. But if that's the
Starting point is 00:28:45 case, then the integration with Excel doesn't matter that much. Don't worry about the Excel integration. After all, Excel was built as a tool for analysts. Great. So whoever is this AI that is an analyst should have tools that they can use. Just the way a human can use a computer. That's their tool. The tool is the computer. Right. All right. So all I'm saying is I'm building an analyst as essentially an AI agent, which happens
Starting point is 00:29:14 to come with a priori knowledge of how to use all of these analytical tools. But is it something, maybe just to make sure we're talking about the same thing, is it a thing that, like me using Excel as a podcaster, I'm not proficient in Excel? Completely autonomous. So just imagine. We should now maybe sort of lay out how I think the future of the company is, right? The future of a company would be the tools business,
Starting point is 00:29:41 which is: I have a computer, I use Excel, and in fact, in the future, I'll even have a Copilot, and that Copilot will also have agents, right? That's still, you know, me steering everything, and everything is coming back to me. So that's kind of one world. Then the second world is the company just literally provisions a computing resource for an AI agent,
Starting point is 00:30:02 And that is working fully autonomously. That fully autonomous agent will have essentially an embodied set of those same tools available to it. So this AI tool that comes in has not just a raw computer, because it's going to be more token-efficient to use tools to get stuff done. In fact, I kind of look at it and say our business, which today is an end-user tools business, will become essentially an infrastructure business in support of agents doing work. That's another way to think about it. One of the things that you'll see us do, in fact: all the stuff we built underneath M365
Starting point is 00:30:43 still is going to be very relevant. You need some place to store it, some place to do archival, some place to do discovery, some place to manage all of these activities, even if you're an AI agent. So that's, so it's kind of a new infrastructure. So just to make sure I understand,
Starting point is 00:31:01 You're saying, like, look, theoretically, a future AI that has actual computer use, which all these model companies are working on right now, could use Microsoft software even if it's not partnered with Microsoft or under our umbrella. But you're saying, if you're working with our infrastructure, we're going to give you, like, lower-level access that makes it more efficient for you to do the same things you could have otherwise done anyways. 100%.
Starting point is 00:31:26 I mean, the way, you know, like what happened is: we had servers, then there was virtualization, and then there were many more servers. So that's another way to think about this, which is, hey, don't think of the tool as the end thing. What is the entire substrate underneath that tool that humans use? And that entire substrate is the bootstrap
Starting point is 00:31:49 for the AI agent as well, because the AI agent needs a computer. That's kind of one. So in fact, one of the fascinating areas where we're seeing a significant amount of growth is all these guys who are building these office artifacts and what have you, as autonomous agents and so on, wanting to provision Windows 365, right? They really want to be able to provision a computer
Starting point is 00:32:09 for these agents. And so, absolutely. And that's why I think we're going to have, essentially, an end user computing infrastructure business, which I think is going to just keep growing, because guess what, it's going to grow faster than the number of users. So in fact, that's kind of one of the other questions
Starting point is 00:32:25 people ask me is, hey, what happens to the per-user business? At least from the early signs, maybe the way to think about the per-user business is that it's not just per user, it's per agent. And if you sort of say it's per user and per agent, the key is: what's the stuff to provision for every agent? A computer, a set of security things around it, an identity around it. And all those things, observability and so on, are the management layers. And that's, I think, all going to get baked into that. The way to frame it, at least the way I currently think about it, and I'd like to hear your view, is that these model companies
Starting point is 00:33:00 are all building environments to train their models to use Excel or Amazon shopping or whatever it is, book flights. But at the same time, they're also training these models to do migrations, because that is probably the most immediately valuable thing, right? Converting mainframe-based systems to standard cloud systems, converting Excel databases into real databases with SQL, right? Or converting, you know, what is done in Word and Excel to something that is more programmatic and more efficient in a classical sense, that could actually be done by humans as well; it's just not cost-effective for a software developer to do that. That seems to be what everyone is going to do with AI for the next few years at least, to massively drive value. How does Microsoft
Starting point is 00:33:49 fit into that if the models can utilize the tools themselves to migrate to something? And yes, Microsoft has, you know, a leadership position in databases and in storage and in all these other categories, but the use of, say, an Office ecosystem is going to be significantly less, just like the use of a mainframe ecosystem could potentially be less. Now, mainframes have actually grown for the last two decades, even though no one talks about them anymore. 100%, I agree with that. How does that flow forward? Yeah, I mean, at the end of the day, there is going to be a significant amount of time where there's going to be a hybrid world, right? Because people are going to be using the tools that are going
Starting point is 00:34:28 to be working with agents that have to use tools. And by the way, they have to communicate with each other. What's the artifact that gets generated that a human then needs to see? So all of these things will be real considerations in any place; the outputs become inputs. So I don't think it'll just be about, oh, I migrated off, right? The bottom line is I have to live in this hybrid world. But that doesn't fully answer your question, because there can be a real new efficient frontier where it's just agents working with agents and completely optimizing. And even when agents are working with agents, what are the primitives that are needed? Do you need a storage system? Does that storage system need to have e-discovery?
Starting point is 00:35:06 Beyond that e-discovery, do you need to have observability? Do you need to have an identity system, so that you can use multiple models all with one identity system? These are all the core underlying rails we have today for what are office systems or what have you. And that's what I think we will have in the future as well. You talked about databases, right? I mean, you know, man, I would love all of Excel to have a database backend, right? In fact, I would love for all of that to happen immediately, and for that database to be a good database. Databases, in fact, will be a big thing that will grow. In fact, if I think about all of the office artifacts being structured better, the ability to do the joins between structured
Starting point is 00:35:47 and unstructured better because of the agentic world, that'll grow the underlying infrastructure business, as it happens. The consumption of that is all being driven by agents. You could say all of that is just-in-time generated software by a model company. That could also be true. We will be one such model company too, and so we will build it in. So the competition could be that we will build a model plus all the infrastructure and provision it, and then there will be competition between a bunch of those folks who can do that. I guess speaking of model companies, you say, okay, we will also be one of them: not only will we have the infrastructure, we'll have the model itself. Right now, Microsoft AI's most recent model that was released two
Starting point is 00:36:27 months ago is 36 in Chatbot Arena. And, I mean, you obviously have the IP rights to OpenAI. So there's a question of, first, to the extent you agree that it seems to be behind, why is that the case? Especially given the fact that you theoretically have the right to just, like, fork OpenAI's monorepo or distill their models. Yeah, especially if it's part of your strategy that we need to have a leading model company? Yeah, I mean, so first of all, we are absolutely going to use the OpenAI models to the maximum across all of our products, right? I mean, that's, I think, the core thing that we're going to continue to do all the way for
Starting point is 00:37:04 the next seven years, and not just use it, but then add value to it. That's kind of where the analyst and this Excel agent come in; these are all things where we'll do, you know, RL fine-tuning, we'll do some mid-training runs on top of the GPT family, where we have unique data assets, and build capability. On the MAI model, the way we're going to think about it is: the good news here, in fact, with the new agreement, is we can be very clear that we're going to build a world-class superintelligence team and go after it with high ambition. But at the same time, we're also going to use this time to be smart about how to use both
Starting point is 00:37:42 these things. So that means we will on one end be very product-focused, and on the other end be very research-focused. In other words, because we have access to the GPT family, the last thing I want to do is use my flops in a way that is just duplicative and doesn't add much value. So I want to be able to take the flops that we use to generate a GPT family and maximize its value.
Starting point is 00:38:08 While my MAI flops are being used for, let's take the image model that we launched, which I think at launch was number nine in the image arena. You know, we're using it both for cost optimization. It's in Copilot. It's in Bing, and we're going to use that. We have an audio model in Copilot, which has got personality and what have you. We optimized it for our product.
Starting point is 00:38:30 So we will do those. Even on the LM Arena, we started on the text one; I think it debuted at 19. And by the way, it was done only on, whatever, 15,000 H100s. And so it was a very small model. And so it was, again, to prove out the core capability, the instruction following and everything else, where we wanted to make sure we can match what was state of the art. And so that shows us, given scaling laws, what we are capable of doing if we gave more
Starting point is 00:38:58 flops to it. So the next thing we will do is an omni model, where we will take sort of the work we have done in audio, what we have done in image, and what we have done in text. That will be the next pit stop on the MAI side. So when I think about the MAI roadmap, we're going to build a first-class superintelligence team. We're going to continue to drop some of these models, and do some in the open. They will either be in our products, being used because they're going to be latency-friendly,
Starting point is 00:39:23 COGS-friendly or what have you, or they'll have some special capability. And we will do real research in order to be ready for the next five, six, seven, eight breakthroughs that are all needed on this march towards superintelligence. So I think that's it, while exploiting the advantage we have of having the GPT family that we can work on top of as well. Say we roll forward seven years and you no longer have access to OpenAI models. How does one get confidence,
Starting point is 00:39:51 or what does Microsoft do, to make sure they have a leading AI lab, right? Today, you know, OpenAI has developed many of the breakthroughs, whether it be scaling or reasoning, and Google's developed breakthroughs like Transformers, but it is also a big talent game, right? You know, you've seen Meta spend, you know, north of $20 billion on talent, right?
Starting point is 00:40:12 You've seen Anthropic poach the entire Blueshift reasoning team from Google last year. You've seen Meta poach a large reasoning and post-training team from Google more recently. These sorts of talent wars are very capital-intensive. Arguably, you know, if you're spending $100 billion on infrastructure, you should also spend X amount of money on the people using the infrastructure so that they're more efficiently making these new breakthroughs. What confidence can one get that, you know, hey, Microsoft will have a team that's world-class, that can make these breakthroughs? And, you know, once you decide to turn on the money faucet, you know, you're being a bit capital-efficient right now,
Starting point is 00:40:49 which is smart, it seems, to not waste money doing duplicative work. But once you decide you need to, how can one say, oh, yeah, now you can shoot up to where the top five models are? Oh, look. I mean, at the end of the day, we're going to build a world-class team, and we already have a world-class team that's beginning to be assembled, right? With Mustafa coming in, we have Karen; we have Amar Subramanya, who did a lot of the post-training on Gemini 2.5 and who is at Microsoft; Nando, who did a lot of the multimedia work at DeepMind, is there. And so we're going to build a world-class team. And in fact, I think later this week, even Mustafa will publish some, you know, a little more
Starting point is 00:41:25 clarity on what our lab is going to go do. I think the thing that I want the world to know, perhaps, is we are going to build the infrastructure that will support multiple models. Because from a hyperscale perspective, we want to build the most scaled infrastructure fleet that's capable of supporting all the models the world needs, whether it's from open source or obviously from OpenAI and others. And so that's kind of one job. Second is our own model capability: we will absolutely use the OpenAI model in our products, and we will start building our own models. And we may, like in GitHub Copilot,
Starting point is 00:42:04 Anthropic is used. So we will even have other frontier models that are going to be wrapped into our products as well. So I think that's kind of how, at the end of the day, the eval of the product as it meets a particular task or a job is what matters. And we sort of work back from there into the vertical integration needed, knowing that as long as you're serving the market well with the product, you can always cost-optimize. There's a question going forward. So right now we have models that have this distinction between training and inference. And one could argue that there's like a smaller and smaller difference between the different models. Going forward, if you're really expecting
Starting point is 00:42:44 something like human-level intelligence, humans learn on the job. You know, if you think about your last 30 years, what makes Satya's tokens so valuable? It's the last 30 years of wisdom and experience you've gained at Microsoft. And we will eventually have models, if they get to human level, which will have this ability to continuously learn on the job. And that will drive so much value to the model company that is ahead, at least in my view, because you have copies of one model, broadly deployed through the economy, learning how to do every single job. And unlike humans, they can amalgamate their learning back to that model. So there's this sort of continuous learning, sort of exponential feedback loop, which almost
Starting point is 00:43:20 looks like a sort of intelligence explosion. If that happens, and Microsoft isn't the leading model company by that time, doesn't the, you know, "well, we substitute one model for another," et cetera, matter less, because this one model knows how to do every single job of the economy and the other long tail don't? Yeah, no, I think your point is that if there is one model that is the only model that is most broadly deployed in the world, and it sees all the data and it has continuous learning, that's game, set, match, right? I mean, the reality, at least as I see it, is that even today, for all the dominance of any one model, it's not the case. Take coding.
Starting point is 00:44:07 There's multiple models. In fact, every day, it's less and less the case that there is one model that is getting deployed broadly. In fact, there's multiple models that are getting deployed. It's kind of like databases, right? The question is always, hey, can one database be the one that just is used everywhere? Except it's not. There are multiple types of databases that are getting deployed for different use cases. So I think that there is going to be some network effect of continual learning, or data, you know, what I'll call liquidity, that any one model has.
Starting point is 00:44:38 Is it going to happen in all domains? I don't think so. Is it going to happen in all geos? I don't think so. Is it going to happen in all segments? I don't think so. Is it going to happen in all categories at the same time? I don't think so.
Starting point is 00:44:49 So therefore, I feel like the design space is so large that there's plenty of opportunity. But your fundamental point is having a capability at the infrastructure layer, the model layer, and the scaffolding layer, and then being able to compose these things, not just as a vertical stack, but to compose each thing for what its purpose is. You can't build an infrastructure that's optimized for one model. If you do that, what if you fall behind?
Starting point is 00:45:16 In fact, all the infrastructure you built will be a waste. You kind of need to build an infrastructure that's capable of supporting multiple families and lineages of models. Otherwise, if the capital you put in is optimized for one model architecture, that means you're one tweak away, some MoE-like breakthrough that happens for somebody else, and your entire network topology goes out the window. That's a scary thing, right?
Starting point is 00:45:40 So therefore, you kind of want the infrastructure to support whatever may come, in your own model family and in other model families, and you've got to be open. If you're serious about the hyperscale business, you've got to be serious about that, right? If you're serious about being a model company, you've got to basically say, hey, what are the ways people can actually do things on top of the
Starting point is 00:46:08 model, so that I can have an ISV ecosystem, unless I'm thinking I'll own every category. That just can't be; then you won't have an API business, and that by definition will mean you'll never be a platform company that's going to be successfully deployed everywhere. So therefore, the industry structure is such that it will really force people to specialize. And in that specialization, a company like Microsoft should compete in each layer on its merits, but not think that this is all a road to game, set, match, where I just compose all these layers vertically. That just doesn't happen. So according to Dylan's numbers, there's going to be half a trillion in AI CAPEX next year alone, and labs are already spending billions of dollars to snag top researcher talent. But none of that matters if there's not enough
Starting point is 00:46:54 high-quality data to train on. Without the right data, even the most advanced infrastructure and world-class talent won't translate into end value for the user. That's where Labelbox comes in. Labelbox produces high-quality data at massive scale, powering any capability that you want your model to have. It doesn't matter whether you need a coding agent that needs detailed feedback on multi-hour trajectories, or a robotics model that needs thousands of samples on everyday tasks, or a voice agent that can also perform real-world actions for the user, like booking them a flight.
Starting point is 00:47:26 To be clear, this isn't just off-the-shelf data. Labelbox can design and launch a custom production-scale data pipeline in 48 hours, and they can get you tens of thousands of targeted examples in weeks. Reach out at labelbox.com/dwarkesh. All right, back to Satya. So last year, Microsoft was on path to be the largest infrastructure provider by far. You were the earliest in '23, so you went out there, you acquired all the resources in terms of leasing data centers,
Starting point is 00:47:58 in construction, securing power, everything. You guys were on pace to beat Amazon in '26 or '27, and certainly by '28 you were going to beat them. Since then, you know, in, let's call it, the second half of last year, Microsoft did this big pause, right? Where they let go of a bunch of leasing sites that they were going to take, which then Google, Meta, Amazon in some cases, and Oracle took. We're sitting in one of the largest data centers in the world, so obviously it's not everything; you guys are expanding like crazy. But there are sites that you just stopped working on. Why did you do this, right? Yeah, I mean, the fundamental thing,
Starting point is 00:48:36 this goes back a little bit to what the hyperscale business is all about, right? One of the key decisions we made was that if we're going to build out Azure to be fantastic for all sorts of stages of AI, from training to mid-training to data gen to inference, we just need fungibility
Starting point is 00:49:01 of the fleet. And so that entire thing caused us not to basically go build a whole lot of capacity with a particular set of generations. Because the other thing that you've got to realize, having actually up to now 10x'd training capacity every 18 months for the various OpenAI models, is that we realize the key is to stay on that path. But the more important thing is to actually have a balance: to not just train, but to be able to serve these models all around the world. Because at the end of the day, the rate of monetization is what will then allow us to even keep funding.
Starting point is 00:49:41 And then the infrastructure was going to need to support, as I said, multiple models and what have you. So once we said that that's the case, we just course-corrected to the path we're on. The path we're on is we're doing a lot more starts now. We are also buying up as much capacity as we can, whether it's to build, whether it's to lease, or even GPUs as a service. But we are building it for where we see the demand, the serving needs, and our training needs.
Starting point is 00:50:11 And we didn't want to just be a hoster for one company and have just a massive book of business with one customer. That's not a business, right? In that case, you know, you should be vertically integrated with that company. Yeah. And so, given that OpenAI was going to be a successful independent company, which is fantastic, right? I think it makes sense. And even Meta may use third-party capacity, but ultimately they're all going to be first party. For anyone who has large scale,
Starting point is 00:50:42 they'll be a hyperscaler on their own. And so to me it was to build out a hyperscale fleet and our own research compute. That's what the adjustment was, and so I feel very, very good. Oh, by the way, the other thing is, I didn't want to get stuck with massive scale of one generation. I mean, we just saw the GB200s. The GB300s are coming, right? And by the time I get to Vera Rubin and Vera Rubin Ultra, guess what? The data center is going to look very different, because the power per rack, power per row, is going to be so different. The cooling requirements are going to be so different. And that means I don't want to just go build out, like, a whole number of gigawatts that are only for one generation, one family.
Starting point is 00:51:28 And so I think the pacing matters, and the fungibility and the location matter, the workload diversity matters, customer diversity matters, and that's what we're building towards. The other thing that we've learned is that every AI workload requires not only the AI accelerator but a whole lot of other things, right? And in fact, a lot of the margin structure for us will be in those other things. And so, therefore, we want to build out Azure as being fantastic for the long tail of the workloads, because that's the hyperscale business, while knowing that we've got to be super competitive, starting with the bare metal for the highest-end training. But that can't crowd out the rest of the business, right?
Starting point is 00:52:11 Because we're not in the business of just doing five contracts with five customers, being their bare-metal service. That's not a Microsoft business. That may be a business for someone else, and that's a good thing. What we have said is we are in the hyperscale business, which is, at the end of the day, a long-tail business for AI workloads. And in order to do that, we will have some leading bare-metal-as-a-service capabilities for a set of models, including our own. And that, I think, is the balance you see.
Starting point is 00:52:41 Another sort of question that comes around this whole fungibility topic is: okay, it's not where you want it, right? You would rather have it in a good population center like Atlanta, where we are. There's also the question of, well, how much does that matter as the horizon of AI tasks grows? You know, 30 seconds for a reasoning prompt, or 30 minutes for a deep research, and at some point it's going to be hours for software agents, and days and so on and so forth, the time to human interaction. Why does it matter if it's in location A, B, or C? That's a great question. Exactly. So in fact, that's one of the other reasons
Starting point is 00:53:18 why we want to think about, like, hey, what does an Azure region look like, and what is, in fact, the networking between Azure regions. So this is where, I think, the model capabilities evolve, and the usage of these tokens, whether it's synchronous or asynchronous, evolves. And in fact, you don't want to be out of position, right? Then on top of that, by the way, what are the data residency laws, right? Like, I mean, the entire EU thing for us, where we literally had to create an EU data boundary, basically
Starting point is 00:53:47 meant that you can't just round-trip a call to wherever, even if it's asynchronous. And so, therefore, you need to have maybe regional things that are high-density, and the power costs and so on. But you're 100% right in bringing up that the topology, as we build out, will have to evolve: one, for tokens per dollar per watt, what are the economics; overlay that with what is the usage pattern, in terms of synchronous versus asynchronous; but also where are the compute and storage, because the latencies may matter for certain things. The storage better be there. If I have a Cosmos DB close to this for session data, or even for an autonomous thing, then that also has to be
Starting point is 00:54:30 somewhere close to it, and so on. So I think all of those considerations are what will shape the hyperscale business. You know, prior to the pause, versus what we had forecasted for you, by '28 you were going to be at like 12, 13 gigawatts, and now we're at, you know, nine and a half or so, right? But, you know, something that's even more relevant, and I just want you to, like, more concretely state that this is the business you don't want to be in: Oracle's going from, like, one-fifth your size to bigger than you by end of
Starting point is 00:55:00 2027. And while it's not a Microsoft-level quality of return on invested capital, they're still making 35% gross margins, right? Sort of the question is, you know, hey, maybe it's not Microsoft's business to do this. But you've created a hyperscaler now by refusing this business, by giving away the right of first refusal, etc. First of all, I don't want to take away anything from the success Oracle has had in building their business, and I wish them well. And the thing that I think I've answered for you is that it didn't make sense for us to go be a hoster for one model company with limited-time-horizon RPO. Let's just put it that way, right? The thing that you have to think through is not what you do in the next five years, but what do you do for the next 50?
Starting point is 00:55:51 That's kind of how we made our set of decisions. I feel very good about our OpenAI partnership and what we're doing. We have a decent book of business. We wish them a lot of success. In fact, we are even buyers of Oracle capacity. We wish them success. But at this point, I think this:
Starting point is 00:56:13 The industrial logic for what we are trying to do is pretty clear, which is, it's not about, like, chasing. First of all, I track, by the way, your numbers, whether it's AWS's or Google's and ours, which I think is super useful. But that doesn't mean I've got to chase those, and not just for the gross margin that they may represent in a period of time. You know, what is the book of business that Microsoft uniquely can go clear, which makes sense for us to clear? And that's what we'll do. I guess I have a question, even
Starting point is 00:56:42 stepping back from this: okay, I take your point that it's a better business, ultimately, to have a long tail of customers you can earn higher margin from, rather than serving bare metal to a few labs. But then there's a question of, okay, which way is the industry evolving? If we believe we're on the path to smarter and smarter AIs, then why isn't the shape of the industry that the OpenAIs and Anthropics and DeepMinds are the platform which the long tail of enterprises are actually doing business with, where they need bare metal but they are the platform? What is the long tail that is directly using Azure? Because, you know, you want to use the general cognitive core. But those are going to be available on Azure, right? So any workload
Starting point is 00:57:26 that says, hey, I want to use, you know, some open-source model and an OpenAI model. Like, I mean, if you go to Azure Foundry today, you have all these models that you can provision by PTUs, get a Cosmos DB, get a SQL DB, get some storage, get some compute. That's what a real workload looks like. A real workload is not just an "I did an API call to a model." A real workload needs all of these things to go build an app or instantiate an application. In fact, the model companies need that, right? To build anything, it's just not, like, I have a token factory; I have to have all of these things. That's the hyperscale business. And it's not any one model, but all these models. And so if you want Grok plus, let's say, OpenAI plus an open-source model, come to Azure Foundry,
Starting point is 00:58:12 provision them, build your application. Here is a database. That's kind of what the business is. There is a separate business called just selling raw bare-metal services to model companies, and there's the argument about how much of that business you want to be in and not be in, and what that is. It's a very different segment of the business, which we are in, and we also have limits to how much of it is going to crowd out the rest. But that's, at least, the way I look at it. So there's sort of two questions here, right? Like, why couldn't you just do both is one. And then the other one is, given our estimate that your capacity in 2028 is three and a half gigawatts lower, sure, you could have dedicated that to OpenAI training and inference
Starting point is 00:58:56 capacity, but you could have also dedicated that to, hey, this 3.5 gigawatts is actually just running Azure, is running Microsoft 365, it's running GitHub Copilot. I could have built it and not given it to OpenAI. Or I may want to build it in a different location. I may want to build it in the UAE. I may want to build it in India. I may want to build it in Europe, right? So one of the other things is, as I said, where we have real capacity constraints right now is, given the regulatory needs and the data sovereignty needs, we've got to build all over the world. First of all, stateside capacity is super important and we're going to build everything. But one of the things is, when I look out to 2030, I have a sort of a global
Starting point is 00:59:32 view of what is Microsoft's shape of business, by first party and third party. Third party segmented by the frontier labs and how much they want, versus the inference capacity we want to build for multiple models, and our own research compute needs. So that's all what's going into my calculus, versus saying, hey, I think you're rightfully pointing out the pause. But the pause was not done because we said, oh my God, we don't want to build that. We realized that, oh, we want to build what we want to build slightly differently, by both workload type as well as geo type and timing as well. Like, we'll keep ramping up our gigawatts. And the question is, at what pace and in what location, and in what sort of, how do I ride even the Moore's Law on it, right?
Starting point is 01:00:23 Which is, do I really want to overbuild three and a half in '27, or do I want to spread that across '27, '28, knowing even one of the biggest learnings we had, even with Nvidia, is their pace increased in terms of their model, I mean, their migrations. So that was a big factor. I didn't want to go get stuck for four years, five years of depreciation on one generation. And I wanted to just basically buy. In fact, Jensen's advice to me was two things. One is, hey, get on the speed-of-light execution. That's why I think even the execution in this Atlanta data center, I mean, like, 90 days, right, between when we get it to handoff to a real workload. That's sort of real speed-of-light execution on their front. And so I wanted to get good at that. And then,
Starting point is 01:01:04 that way, then I'm building each generation and scaling. And then every five years, you have a much more balanced... so it becomes really, literally, like a flow for a large-scale industrial operation like this, where you suddenly are not lopsided, where you've built up a lot at one time and then you take a massive hiatus, because you're stuck with all this, to your point, in one location,
Starting point is 01:01:28 which may be great for training, may not be great for inference, because I can't serve, even if it's all asynchronous, Europe ain't going to let me route traffic to Texas. So that's all of the things. How do I rationalize this statement with what you've done over the last few weeks?
Starting point is 01:01:41 You've announced deals with Iris Energy, with Nebius and Lambda Labs, and there's a few more coming as well. You're going out there and securing capacity that you're renting from the neoclouds, rather than having built it yourself. What was the... I think it's fine for us, because we now have...
Starting point is 01:02:00 When you have line of sight to demand, which can be served where people are building it, it's great. In fact, we'll even have... I would say, you know, we will take leases, we will take build-to-suit, we'll take even GPUs as a service where we don't have capacity,
Starting point is 01:02:15 but we need capacity, and someone else has that. And by the way, I would even sort of welcome every neocloud to just be part of our marketplace. Because again, guess what? If they go bring their capacity into our marketplace, that customer who comes through Azure will use the neocloud, which is a great win for them, and we'll use compute, storage, databases,
Starting point is 01:02:35 all the rest from Azure. So I'm not at all thinking of this as just, you know, hey, I should just gobble up all of that myself. So you mentioned how you're depreciating this asset over five, six years, and this is the majority, you know, 75% of the TCO of a data center. And Jensen is taking a 75% margin on that. So what all the hyperscalers are trying to do is develop their own accelerator so that they can reduce this overwhelming cost for equipment, to increase their margins. Yeah, and then, like, you know, when you look at where they are, right, Google's way ahead of everyone else, right? They've been doing it for the longest. They're going to make something
Starting point is 01:03:16 like five to seven million chips, right, of their own TPUs. You look at Amazon, they're trying to make three to five million. But when we look at what, you know, Microsoft is ordering of their own chips, it's way below that number. You've had a program for just as long. What's going on with your internal chips? Yeah, it's a good question. So, a couple of things. One is, the thing that is the biggest competitor for any new accelerator is even the previous generation of Nvidia, right? I mean, in a fleet, what I'm going to look at is the overall TCO. So the bar I have even for our own, and which, by the way, I was just looking at the data for Maia 200, which looks great, except that one of the things that we learned even on the compute side, right,
Starting point is 01:03:57 which is, we had a lot of Intel, then we introduced AMD, and then we introduced Cobalt. And so that's kind of how we scaled it. And so we have a good sort of existence proof, at least in core compute, of how to build your own silicon and then manage a fleet where all three are at play in some balance. Because, by the way, even Google's buying Nvidia, and so is Amazon. It makes sense, because Nvidia is innovating and it's the general-purpose thing:
Starting point is 01:04:22 all models run on it, and customer demand is there. Because if you build your own vertical thing, you better have your own model, which is either going to use it for training or inference, and you have to generate your own demand for it or subsidize the demand for it. So therefore you want to make sure you scale it appropriately. So the way we are going to do it is have a closed loop between our own MAI models and our silicon,
Starting point is 01:04:49 because I feel like that's what gives you the birthright to really do your own silicon, right, where you literally have designed the microarchitecture with what you're doing, and then you keep pace with your own models. In our case, the good news here is OpenAI has a program, which we have access to. And so, therefore, to think that Microsoft is not going to have something that's... What level of access do you have to that? All of it. You just get the IP for all of that.
Starting point is 01:05:18 So the only IP you don't have is the consumer hardware. That's it. Oh, wow. Okay. Yeah. Interesting. Yeah. And by the way, we gave them a bunch of IP as well to bootstrap them, right? So this is one of the reasons why they had a massive...
Starting point is 01:05:34 Because we built all these supercomputers together, or we built it for them, and they benefited from it, right? So. And now, as they innovate even at the system level, we get access to all of it. And we first want to instantiate what they build for them, but then we'll extend it. And so to think that we don't have... And so, if anything, the way I think about it, to your question, is: Microsoft wants to be a fantastic, I'll call it, speed-of-light execution partner for NVIDIA, because quite frankly, that fleet is life itself. I'm not worried about, I mean, obviously Jensen's doing super well with his margins, but the TCO has many dimensions to it, and I want to be great at that TCO.
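The point that "the TCO has many dimensions" can be made concrete with a toy model. This is purely an illustrative sketch; every number below is a made-up assumption, not a figure from Microsoft, NVIDIA, or the interview. It shows why chip list price alone doesn't settle the build-your-own-accelerator question: the depreciation schedule, power, utilization, and software-driven throughput all move the effective cost per token.

```python
# Toy total-cost-of-ownership (TCO) model for an AI accelerator fleet.
# All numbers are illustrative assumptions, not real figures.

def cost_per_million_tokens(
    capex_per_gpu: float,       # purchase price per accelerator, $
    depreciation_years: float,  # straight-line depreciation horizon
    power_kw: float,            # average draw per GPU incl. cooling overhead
    power_cost_kwh: float,      # electricity price, $/kWh
    tokens_per_sec: float,      # sustained serving throughput per GPU
    utilization: float,         # fraction of wall-clock time doing useful work
) -> float:
    hours_per_year = 24 * 365
    annual_capex = capex_per_gpu / depreciation_years
    annual_power = power_kw * hours_per_year * power_cost_kwh
    annual_tokens = tokens_per_sec * 3600 * hours_per_year * utilization
    return (annual_capex + annual_power) / annual_tokens * 1e6

# Same hardware, but software optimization doubles sustained throughput:
baseline = cost_per_million_tokens(40_000, 5, 1.2, 0.08, 5_000, 0.6)
optimized = cost_per_million_tokens(40_000, 5, 1.2, 0.08, 10_000, 0.6)
print(f"baseline:  ${baseline:.4f} per million tokens")
print(f"optimized: ${optimized:.4f} per million tokens")
```

In this sketch, doubling sustained throughput through software halves the cost per token just as effectively as halving the chip price would, which is one way to read the claim that the previous Nvidia generation plus better software is a real competitor to any new accelerator.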
Starting point is 01:06:19 On top of that, I want to be able to really work with the OpenAI lineage and the MAI lineage and the system design, knowing that we have the IP rights on both ends. Speaking of rights: you had an interview a couple of days ago where you said that, under the new agreement you've made with OpenAI, you have exclusivity on the stateless API calls that OpenAI makes. And we were sort of confused about whether there's any state whatsoever. I mean, you were just mentioning a second ago that all these complicated workloads that are coming up
Starting point is 01:06:55 are going to require memory and databases and storage and so forth. And is that now not stateless, if ChatGPT is storing stuff across sessions? But that's the reason why. So the thing, the business, the strategic decision we made, also accommodating for the flexibility OpenAI needed in order to be able to procure compute: essentially, think of OpenAI as having a PaaS business and a SaaS business. The SaaS business is ChatGPT; their PaaS business is their API. That API is Azure-exclusive. The SaaS business, they can run anywhere. And they can partner with anyone they want to build SaaS products.
Starting point is 01:07:34 So if they want a partner, and that partner wants to use a stateless API, then Azure is the place where they can get the stateless API. It seems like there's a way for them to, you know, build the product together where it's a stateful thing. No, even then, they'll have to come to Azure. Okay. So if it is any partner. And so fundamentally, you know, again, this is done in the spirit of what it is that we valued as part of our partnership. And we made sure of that, while at the same time we were good partners to OpenAI,
Starting point is 01:08:03 given all the flexibility they need. So, for example, Salesforce wants to integrate OpenAI. It's not through an API. They actually work together, train a model together, deploy it on, let's say, Amazon. Now, is that allowed? Or do they have to use it? No, for any custom agreement like that,
Starting point is 01:08:18 they will have to come run it. There are some few exceptions for the U.S. government and so on that we made. But other than that, they'll have to come to Azure. So as Satya explained, as AI agents get more capable, you're going to need more and more observability into what they're doing. You're going to need to catch them
Starting point is 01:08:33 when they're making mistakes, you're going to need high-level summaries of what they're doing, and you're going to need a picture of how everything that they're doing fits together. This is exactly what CodeRabbit provides. You just make a normal pull request, and CodeRabbit automatically reviews the PR. It generates a summary of changes so you can understand exactly what the PR's author was intending, and it uses the context from your full code base to provide line-by-line feedback on how things could be improved. This is helpful whether you're reviewing a PR from a coworker or an agent. In either case, CodeRabbit will write up its thoughts and flag any issues so that your teammate or your agent can go fix them.
Starting point is 01:09:10 I've noticed that when I'm coding with agents, CodeRabbit catches a lot of mistakes that the models make by default. For example, the models have a bad habit of using old versions of libraries. So in one session, I watched CodeRabbit catch a call to an old model, figure out what the new version was, and then suggest that improvement. Go to coderabbit.ai/dwarkesh to learn more. Stepping back, a question I have is, you know, when we were walking back and forth through the factory, one of the things you were talking about is, you know, Microsoft, you can think of it as a software business, but now it's really becoming an industrial business. There's all this CAPEX, there's all this construction.
Starting point is 01:09:50 And if you just look over the last two years, your sort of CAPEX has, like, tripled, and maybe you extrapolate that forward, and it just actually becomes this huge industrial explosion. Other hyperscalers are taking loans, right? Meta's done a $20 billion loan in Louisiana. They've done a corporate loan. It seems clear everyone's free cash flow is going to zero, which I'm sure Amy is going to beat you up
Starting point is 01:10:14 if you even try to do that. But, like, what's happening? I mean, I think the structural change is what you're referencing, which I think is massive, right? Which is, I describe it as: we are now a capital-intensive business and a knowledge-intensive business. And in fact, we have to use our knowledge to increase the ROIC on the capital spend, right? Because that's kind of, you know, look, the hardware guys have done a great job of marketing
Starting point is 01:10:42 Moore's Law, which I think is unbelievable, and it's great. But if you even look, I think, at some of the stats I cited in my earnings call: for a given GPT family, right, the improvement, the software improvements in throughput, in terms of tokens per dollar per watt, that we're able to get, you know, quarter over quarter, year over year, is massive, right? So it's 5x, 10x, maybe 40x in some of these cases, right? Just because of how you can optimize. That's sort of knowledge intensity coming in to bring out capital efficiency.
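Those 5x/10x/40x figures are easier to parse once you see how per-quarter optimization gains compound over a fleet's depreciation life. A minimal sketch, where the 15%-per-quarter software gain is an assumed, hypothetical rate, not a number quoted in the conversation:

```python
# How per-quarter software throughput gains compound over a hardware
# generation. The 15%/quarter rate below is an illustrative assumption.

def compounded_gain(per_quarter_gain: float, quarters: int) -> float:
    """Multiplier on tokens per dollar per watt after compounding."""
    return (1 + per_quarter_gain) ** quarters

# One year, two years, and a full five-year depreciation window:
for q in (4, 8, 20):
    print(f"after {q:2d} quarters: {compounded_gain(0.15, q):5.1f}x tokens/$/W")
```

At that assumed rate you land around 16x over a five-year (20-quarter) window, in the same ballpark as the range mentioned above; the exact multiple obviously depends entirely on the rate you pick.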
Starting point is 01:11:17 So at some level, that's what we have to master. What does it mean? Like, some people ask me, what was the difference between, you know, a classic old-time hoster and a hyperscaler? It was software. So yes, it is capital-intensive, but as long as you have systems know-how, software capability to optimize by workload, by fleet,
Starting point is 01:11:40 that's why I think when we say fungibility, there's so much software in it. It's just not about the fleet, right? It's the ability to evict a workload, you know, and then schedule another workload. Can I, like, manage that scheduling algorithm? That is the type of stuff that we have to be world-class at. And so, yes, I think we'll still remain a software company. But yes, this is a different business. And we're going to
Starting point is 01:12:06 manage, look, at the end of the day, the cash flow that Microsoft has allows us to have both these arms firing, you know, well. It seems like in the short term, you have more sort of credence on things taking a while, being more jagged. But maybe in the long term, you think the people who talk about AGI and ASI are correct; like, Sam will be right, but eventually. And I have a broader question about what makes sense for a hyperscaler to do, given that you have to invest massively in this thing which depreciates over five years. So if you have 2040 timelines to the kind of thing that somebody like Sam anticipates in three years, you know, what is a reasonable thing for you to do in that world?
Starting point is 01:12:52 There needs to be an allocation to, I'll call it, research compute, that needs to be done like you did R&D. So that's the best way to even account for it, quite frankly. We should think of it as just R&D expense, and you should say, hey, what's the research compute and how do you want to scale it? And let's even say it's an order-of-magnitude scale-up in some period, pick your thing. Is it two years, is it 16 months, what have you? So that's sort of one piece, which is kind of table stakes; that's R&D expense.
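The "order of magnitude in some period, is it two years, is it 16 months" framing implies very different annualized growth rates for the research-compute budget. A quick, hypothetical illustration of that arithmetic:

```python
# Implied compound annual growth rate if research compute must grow 10x
# over a given period. Purely illustrative arithmetic for the
# "two years vs. 16 months" question above.

def implied_annual_growth(multiple: float, months: int) -> float:
    """Annualized growth rate implied by hitting `multiple` in `months`."""
    years = months / 12
    return multiple ** (1 / years) - 1

for months in (16, 24):
    g = implied_annual_growth(10, months)
    print(f"10x over {months} months -> {g:.0%} per year")
```

Compressing the same 10x from 24 months into 16 roughly doubles the implied annual growth rate, which is why the choice of period matters so much for how the R&D line scales.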
Starting point is 01:13:28 And the rest is all demand-driven, right? I mean, ultimately, you'll have to build ahead of demand, but you better have a demand plan that doesn't go completely off-kilter. Do you buy... so these labs are now projecting revenues of $100 billion in '27, '28, and they're projecting revenue keeps growing at this rate
Starting point is 01:13:48 of, like, 3x, 2x a year? In the marketplace, right, there's all kinds of incentives right now, and rightfully so, right? I mean, what do you expect an independent lab that is sort of trying to raise money to do? They have to put some numbers out there such that they can actually go raise money, so that they can pay their bills for compute and what have you. And it's a good thing. I mean, someone's going to take some risk and put it in there, and they've shown traction.
Starting point is 01:14:14 It's not like it's all risk without seeing the fact that they've been performing, whether it's OpenAI, whether it's Anthropic. So I feel great about what they've done. And we have a massive book of business with these labs. So therefore, that's all good. But overall, ultimately there's two simple things. One is, you've got to allocate for R&D. You brought up even talent.
Starting point is 01:14:35 You've got to... like, the talent for AI is at a premium. You've got to spend there. You've got to spend on compute. So in some sense, researcher-to-GPU ratios have to be high. That is sort of what it takes to be a leading R&D company in this world. And that's something that needs to scale, and you have to have a balance sheet that allows you to scale that long before it's conventional wisdom, and so on. So that's kind of one thing. But the other is all about knowing how to forecast.
Starting point is 01:15:07 As we look across the world, right, America has dominated many tech stacks, right? The U.S. owns Windows, right, through Microsoft, which is deployed even in China, right? That's the main operating system. Of course, there's Linux, which is open source. But, you know, Windows is deployed everywhere in China on personal computers. You look at Word, it's deployed everywhere. You look at all these various technologies. It's deployed everywhere.
Starting point is 01:15:29 The thing that is quite unique: Microsoft and other companies have grown elsewhere, right? They're building data centers in Europe and India and all these other places, you know, in Southeast Asia and Latam and Africa, right? All of these different places, you're building capacity. But this seems quite different, right? You know, today, the political aspect of technology, of compute... you know, the U.S. administration didn't care about the dot-com bubble, right? It seems like the U.S. administration, as well as every other administration around the world, cares a lot about AI. And the question is, you know, we're in a sort of a bipolar world, at least with the U.S. and China, but Europe and India and all these other countries are saying, no, actually, we're going to have sovereign AI as well. How does Microsoft navigate, you know, the difference between the 90s, where it's like there's one
Starting point is 01:16:17 country in the world that matters, right? It's America, and our companies sell everywhere, and therefore Microsoft benefits massively, to a world where it is bipolar, where, hey, Microsoft can't just necessarily have the right to win all of Europe or India or, you know, Singapore. There are actually sovereign AI efforts. What is your thought process here? How do you think about this? It's, I think, a super, you know, critical piece, which is, I think the key priority for the U.S. tech sector and the U.S. government is to ensure that we not only do leading innovative work, but we also collectively build trust around the world in our tech stack, right? Because I always say the United States is just an unbelievable place. It's just unique in history, right? It's 4% of the world's population, 25% of the GDP, and 50% of the market cap. And I think you should think about those ratios and really reflect on it: that 50% happens because, quite frankly, of the trust the world has in the United States, whether it's its capital markets or whether it's its technology and its stewardship of what matters at any given time in terms of the leading sector. So if that is broken, then that's not a good day for the United States. And so if we start with that, which I think the
Starting point is 01:17:42 you know, President Trump gets, the White House, David Sacks, everyone really, I think, gets it. And so therefore, I applaud anything that the United States government and the tech sector jointly do to, quite frankly, for example, put our own capital at risk collectively as an industry in every part of the world. So I would like, in fact, the USG to take credit for foreign direct investment by American companies all over
Starting point is 01:18:34 the world, right? It's kind of the least talked about, but the best marketing that the United States should be doing is: it's not just about all the foreign direct investment coming into the United States, but the most leading sector, which is these AI factories, are all being created all over the world by whom? By America and American companies. And so you start there, and then you even build other agreements around it, which are around their continuity, their legitimate sovereignty concerns, around whether it's data residency, whether it's even what happens for them to have real agency and guarantees on privacy, and so on. And so, in fact, our European commitments, I think, are worth reading, right? So we made a series of commitments to Europe on how we will really govern our hyperscale investment there, such that really the European Union and, you know, the European countries have sovereignty. We're also building sovereign clouds in France and in Germany.
Starting point is 01:19:16 We have something called sovereign services on Azure, which literally give people key management services along with confidential computing, including confidential computing in GPUs, which we have done great innovative work on with NVIDIA. And so I feel very good about being able to build, both technically and through policy, this trust in the American tech stack. And how do you see this shaking out? As, you know, you do have this network effect with continual learning and things at the model level.
Starting point is 01:19:49 Maybe you have equivalent things at the hyperscaler level as well. And do you expect that the countries will say, look, it's clearly one model or a couple models that are the best, and so we're going to use them, but we're going to have some laws around, well, the weights have to be hosted in our country? Or do you expect that there will be this push that it has to be a model trained in our country?
Starting point is 01:20:08 Maybe an analogy here is, like, you know, semiconductors are very important to the economy, and people would like to have their sort of sovereign semiconductors, but TSMC is just better. And so semiconductors are so important to the economy that you will just go to Taiwan and buy the semiconductors; you have to. Will it be like that with AI, or is there...?
Starting point is 01:20:26 Ultimately, I think what matters is the use of AI in their economy to create economic value, right? I mean, that's the diffusion theory, which is, ultimately, it's not the leading sector, but it's the ability to use the leading technology to create your own comparative advantage, right? So that, I think, will fundamentally be the core driver. But that said, they will want continuity of that, right?
Starting point is 01:20:51 So in some sense, that's one of the reasons why I believe there is always going to be a check, a little bit, to sort of some of your points on, hey, can this one model have all the runaway deployment? That's why open source is always going to be there. There will be, by definition, multiple models. That'll be one way, you know, for people to sort of demand continuity,
Starting point is 01:21:14 and not having concentration risk is another way to say it, right? And so you say, hey, I'll want multiple models, and then I want an open source one. So I feel, as long as that's there, every country will feel like, okay, I don't have to worry about deploying the best model and broadly diffusing, because I can always take what is my data and my liquidity and move it to another model, whether it's open source, or from another country, or what have you. Concentration risk and sovereignty, right, which is really agency: those are the two things.
Starting point is 01:21:48 I think that will drive the market structure. The thing about this is that this doesn't exist for semiconductors, right? You know, all refrigerators, cars have chips made in Taiwan. It didn't exist until now. Even now, right? America, you know, if Taiwan is cut off, there are no more cars, no more refrigerators. TSMC Arizona is not replacing any real fraction
Starting point is 01:22:08 of the production. Like, the sovereignty is a bit of a scam, if you will, right? I mean, it's worthwhile having it. It's important to have it, but it's not real sovereignty, right? And we're a global economy, we don't, we... I think it's kind of like Dylan's saying, hey, at this point, we've now learned something about sort of what resilience means and what one needs to do, right? So it's kind of, any nation state, including the United States, at this point, will do what it takes to be more self-sufficient on some of these critical supply chains.
Starting point is 01:22:46 So I, as a multinational company, have to think about that as a first-class requirement. If I don't, then I'm not respecting what is in the sort of policy interests of that country long term, right? And I'm not saying they won't make practical decisions in the short term, right? Absolutely. I mean, globalization can't just be rewound, right? I mean, all these capital investments cannot be made at the pace at which... But at the same time, you have to, kind of, like, if I think about it, right, if somebody showed
Starting point is 01:23:17 up in Washington and said, hey, you know what, we're not going to build any semiconductor plants, they're going to be kicked out of the United States. And the same thing is going to be true in every other country, too. And so therefore, I think we have to, as companies, respect what the lessons learned are. You know, whether it's, you know, you could say the pandemic woke us up or whatever, but nevertheless, people are saying, look, globalization was fantastic. It helped supply chains be globalized and be super efficient, but there's such a thing called resilience, and we want resilience, and so therefore that feature will get built. At what pace is, I think, the
Starting point is 01:23:57 point you're making. It can't be... like, you can't snap your fingers and say all the TSMC plants are now all in Arizona with all of the capability. They're not going to be. But is there a plan? There will be a plan. And should we respect that? Absolutely. And so I feel that's the world. I want to meet the world where it is and where it wants to go, as opposed to saying, hey, we have a point of view that doesn't respect your view. So just to make sure I understand: the idea here is each country will want some kind of data residency, privacy, et cetera. And Microsoft is especially privileged here because you have relationships with these countries. You have expertise in setting up these kinds of sovereign data centers.
Starting point is 01:24:41 And therefore, Microsoft is uniquely fit for a world with more sovereignty requirements. Yeah, I mean, I don't want to describe it as somehow we are uniquely privileged. I would just say, I think of that as a business requirement that we have been doing all the hard work on, all these decades, and we plan to. And so my answer to Dylan's previous question was: I take these seriously. You know, whether it's in the United States, quite frankly, when the White House and the USG say, hey, we want you to allocate more of your, I don't know, wafer starts to fabs in the U.S., we take that seriously. Or whether it is data centers in the EU boundary, we take that seriously. So to me,
Starting point is 01:25:27 respecting what I think are legitimate reasons why countries care about sovereignty, and building for it in both software and physical plant, is what I would say we are going to do. And as we go to the bipolar world, right, U.S., China, there is a lot around, you know, American tech... it's not just you versus Amazon, or you versus, you know, Anthropic, or you versus Google. Yeah. There is a whole host of competition. How does America rebuild the trust?
Starting point is 01:26:00 What do you do to rebuild the trust, to say, actually, no, American companies will be the main provider for you? And how do you think about competition with up-and-coming Chinese companies, whether it be, you know, ByteDance and Alibaba, or DeepSeek and Moonshot? And just to add to the question, one concern is, look, we're talking about how AI is becoming this sort of industrial CAPEX race, where you're just rapidly having to build across the whole supply chain. When you hear that, at least up until now, you just think about China, right? This is, like, their comparative advantage. And especially if we're not going to moonshot to ASI next year, but it's going to be decades of buildouts and infrastructure and so forth, how do you deal with Chinese competition now? Are they privileged in that world?
Starting point is 01:26:42 Yeah. So it's a great question. I mean, in fact, you just made the point of why I think trust in American tech is probably the most important feature. It's not even the model capability, maybe. It is: can I trust you, the company? Can I trust you, your country, and its institutions, to be a long-term supplier? Maybe that's the thing that wins the world. I think it's a good note to end on.
Starting point is 01:27:12 Satya, thank you for doing this. Thank you so much. Thank you. It's such a pleasure. It's awesome. It's like, man, you two guys are quite the team. Everybody, I hope you enjoyed that episode. If you did, the most helpful thing you can do is just share it with other people who you think might enjoy it. It's also helpful if you
Starting point is 01:27:32 leave a rating or a comment on whatever platform you're listening on. If you're interested in sponsoring the podcast, you can reach out at dwarkesh.com/advertise. Otherwise, I'll see you on the next one.
