Big Technology Podcast - AWS CEO Matt Garman on Amazon's Big AI Chips Bet, Working With OpenAI, and Nuclear Energy

Episode Date: December 4, 2024

Matt Garman is the CEO of Amazon Web Services. Garman joins Big Technology Podcast to discuss Amazon's AI strategy and future plans. Tune in to hear Garman's insights on AWS's AI chips, partnerships with leading AI companies including OpenAI, the ROI of generative AI, and how AWS is helping customers leverage AI while managing costs. We also cover AWS's investment in nuclear energy, the current state of the cloud market, and Amazon's culture challenges. Hit play for a wide-ranging and in-depth conversation with the executive leading the world's largest cloud computing operation. Transcript: https://www.bigtechnology.com/p/aws-ceo-matt-garman-talks-amazons If you like what we're doing here, please rate the show five stars on Apple Podcasts and Spotify. Thanks for listening!

Transcript
Starting point is 00:00:00 AWS CEO Matt Garman is here to talk about AI, the state of the economy, and Amazon culture. That's coming up right after this. Welcome to Big Technology Podcast, a show for cool-headed, nuanced conversation of the tech world and beyond. We are here in Las Vegas, Nevada at Amazon's re:Invent conference with the CEO of AWS, Matt Garman. Matt, so great to see you. Welcome to the show. Yeah, thank you for having me. Let's talk a little bit about infrastructure.
Starting point is 00:00:26 You're the kings of building data centers, right? There's no one that does it better than AWS. But there are headlines in the AI world. Elon Musk took 122 days to build a 100,000-plus GPU data center. Does this show that scaling data centers is now core to competing in AI? Is it validation? What do you think about it? Well, look, we've been building data centers for almost two decades now.
Starting point is 00:00:51 And so this is something that we spend a lot of time on. And it's less that we're out there kind of bragging to the press about. But what we do is we provide... You can brag. Look at the size of yours. Well, more so, what we do is provide infinite scale for customers. And so our goal is largely for customers to not have to think about these things, right? And so we want them across their compute, across their storage, across their databases,
Starting point is 00:01:12 to be able to scale to any size. And so take something like S3 as an example. It's an incredibly complex, very detailed system that keeps your data, keeps it durable, and scales infinitely, right? And customers largely just put data in there and don't have to think about it. And so today, S3 actually stores 400 trillion objects. It's an enormous number that's hard to even get your head around. But it's just something where we just keep scaling and we keep growing for our customers.
Starting point is 00:01:38 As you think about AI now, these are power-hungry, massive data centers for sure. And AWS is adding tons and tons of compute all the time for our customers. Largely what we think about, though, is less about how fast you can build one particular cluster. The absolute size of AWS dwarfs any other particular cluster out there. But we're focused on how do we deliver the compute that customers need to go build their applications? Take somebody like Anthropic as an example. Anthropic has what are widely considered to be the most powerful AI models out there today in their Claude set of models. We're building together with them what we call Project Rainier.
Starting point is 00:02:16 And so it's using our next generation Trainium 2 chips. And this cluster that we're building for them in 2025 will be five times the size, in the number of exaflops, of what they used to train the current generation of models, which are by far the most powerful ones out there. It's going to be five times that size next year, all built on Trainium 2, delivering hundreds of thousands of chips in a single cluster for them so they can train the next generation. That's the type of thing where we work with customers,
Starting point is 00:02:42 understand what's interesting for them, and then help them scale to whatever level they need. And that's just one of our customers, of course. We have hundreds and hundreds of other customers as well. Here's my point. You're so good at this. Look at what you just talked about in terms of Anthropic, being able to help them scale the way that you are. And that would lead me to believe that Amazon would have its own cutting-edge, state-of-the-art model, one that would lead, you know, and be better than the OpenAIs and the Anthropics.
Starting point is 00:03:08 This is your core competency, and this is what makes these models run. So why hasn't that happened? Our core competency is about delivering compute power for all of the people that need it. And, you know, for a long time, we've been very focused on how do we build the capabilities to let our customers build whatever they want. And sometimes there are areas that Amazon also builds in, and other times there are areas that Amazon doesn't build in. And so you think about whether it's in the database world or the storage world or the data analytics world or the ML world, we build this underlying compute platform that everybody can go build upon. And sometimes we build services that compete with others out there in the market. Think about Redshift competing with Snowflake, who's also a very important partner of ours and a big customer of ours and somebody that we do a lot of partnering together with.
Starting point is 00:03:52 And then there's other times where there's applications that people build on top of AWS that Amazon doesn't go and build. And so we operate across that whole swath of area, and sometimes we'll build and sometimes we won't. But that's kind of the beauty of AWS: our goal is to build that infrastructure so that sometimes we build those things and sometimes we won't, but we want a platform where everybody can go build
Starting point is 00:04:14 the broadest set of applications possible out there. But I'm thinking about it for AI specifically. And in the world that you play in, you have Google, they have their own model, they sell cloud services. You have Microsoft, okay, they don't have their own models necessarily, but they have this deal with OpenAI.
Starting point is 00:04:27 Pretty sure that OpenAI is exclusive on Azure. Now, this is where a lot of the growth is coming from. And so... And I think it's a mistake, actually. So the interesting thing there is, and this is where a lot of people started, I just think it's fundamentally the wrong way of thinking about it. A lot of times people are thinking
Starting point is 00:04:43 about there's just going to be this one model. And I want to have the one model that's going to be the most powerful, the one model to rule them all. And as you've seen over the last year, there isn't one model that's the best at everything. There's models that are really good at reasoning. There are models that provide open weights so that people can bring their own data and fine-tune them
Starting point is 00:05:01 and distill them and create kind of completely new models from that, models that are completely custom for customers. And for that, you may want to use a Llama model or you may want to use a Mistral model. There's customers who really want to build the world's best images, and they might use something like a Stability model or something like a Titan model.
Starting point is 00:05:16 There's customers that need really complex reasoning, and they might use an Anthropic model. There's a whole ton of these operating out there. And our goal is, how do we help customers use the very best? It doesn't have to be one thing. It's not just one. And we don't think that there's one best database.
Starting point is 00:05:30 We don't think there's one best compute platform or processor. We don't think that there's one best model. It's across that whole set. And that's been our strategy. And customers have really embraced that strategy. As you talk to them, and they're thinking about how they go build production applications,
Starting point is 00:05:46 They want the stability, the operational excellence, and the security that they get with AWS. But they also want that choice. It's incredibly important for them. And I think choices. is important for customers, no matter if they're building generative AI applications, no matter if they're picking a database offering,
Starting point is 00:06:01 no matter if they're picking a compute platform, they want that choice. And that is something that AWS, from the very earliest day, has really leaned into. And I think it's an important part of our strategy. And it's maybe not the strategy that others have. Maybe others say it's just this one, and this is the one that we're going to lean into,
Starting point is 00:06:15 but it's not the strategy that we've picked. Our strategy is around choice. And it's part of why we have the broadest partner ecosystem as well. It's why, as you walk the halls here at re:Invent, it's filled with partners who are building their business on top of AWS, where we're leaning in and helping our joint customers accelerate their journeys to the cloud, their modernization efforts, their AI efforts. It's because of that, I think, that is a lot of what makes AWS special.
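To make the choice strategy concrete, here is a minimal sketch of what swapping models looks like from a developer's seat, using boto3's Bedrock Converse API. The specific model IDs are illustrative assumptions; check the current Bedrock catalog before relying on them.

```python
# A minimal sketch of Bedrock's "choice" strategy: the same API call,
# swapping only the model ID across providers. Model IDs below are
# illustrative; check the Bedrock console for the current list.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

CANDIDATE_MODELS = [
    "anthropic.claude-3-5-sonnet-20240620-v1:0",  # strong complex reasoning
    "meta.llama3-1-8b-instruct-v1:0",             # open weights, cheap to run
    "mistral.mistral-large-2402-v1:0",            # another frontier option
]

def ask(model_id: str, prompt: str) -> str:
    """Send one user prompt to the given model and return its text reply."""
    response = bedrock.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 256, "temperature": 0.2},
    )
    return response["output"]["message"]["content"][0]["text"]

# Try the same question against each candidate to compare quality and cost.
for model_id in CANDIDATE_MODELS:
    print(model_id, "->", ask(model_id, "Summarize our Q3 outage postmortem."))
```

The point of the design is that the application code stays identical while the model becomes a configuration decision, which is what lets one workload use a reasoning-heavy model and another use a small open-weights one.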
Starting point is 00:06:33 Okay, and I'm going to move off this in a moment. But the reason why I'm asking these questions is because you do have at least a bet that big foundational models are going to matter. That's the $4 billion you just invested in Anthropic. And I think that the strategy that AWS has makes a lot of sense, right? This Bedrock strategy, there's a lot of different models in there. People have their data in the cloud.
Starting point is 00:06:56 They're going to build with the data they have within AWS, using Bedrock, picking models. But you also are limited in the fact that OpenAI is not there. I don't think Google's there. So wouldn't it make sense, in parallel to the bring-your-own-model strategy, to also use this capacity that you have to scale infrastructure to get in the game yourself? Look, what I will say is never say never. It's an interesting idea, and we never close any doors. I think we're always open to, frankly, a whole host of things. We're always open to having OpenAI be available in AWS someday
Starting point is 00:07:27 or having Gemini models be available in AWS someday. And maybe someday we will spend more time focused on our own models, for sure. I think all of that is open. And it's part of what I think makes AWS AWS. Take our announcement earlier this year about partnering deeply with Oracle, about making Oracle databases available in AWS. Lots of people would have said, oh, that's never going to happen.
Starting point is 00:07:46 And it's against your strategy. Our strategy is to embrace all technologies, because if there's anything that customers want to use, we want it to be available to use inside of AWS. And look, sometimes it happens today. Sometimes it happens tomorrow. Sometimes it happens weeks from now, months from now,
Starting point is 00:08:00 years from now. But that is our goal: to make all of those technologies available for our customers. OK, I'm going to parse your language a little bit, because you said that you might be open to having OpenAI on Bedrock within AWS. Are you talking to them? Would you want to ask them to come on?
Starting point is 00:08:17 There's nothing to announce there today. But I'm saying if customers want that, that's something we would want. And we'd love to make it happen at some point. Yeah. Well, maybe they're listening and they want to make that move. Yeah. But let's speak to the one that I think is the biggest challenger to them, the one that you
Starting point is 00:08:30 have all this money in, which is Anthropic. So what does the $4 billion that you just invested in Anthropic get you? And how does that make you differentiated from other cloud providers? Well, there's a couple things I'd say. One is, you know, we make the investments in Anthropic because we think it's a good bet. They have a very good team. They've made some incredible traction in the market. And we really like where they're innovating.
Starting point is 00:08:54 Yeah, we're definitely Claude heads on the show. Yeah, it's a fantastic product, right? And Dario and team are very good. And they continue to actually attract some of the best talent out there in the market today. You know, the other thing that we get from that is a deep collaboration on Trainium. And, you know, we've made a big bet on Trainium as an additional option for customers. You know, the vast majority of... We should define it.
Starting point is 00:09:17 Yeah, I was going to get there. Oh, sorry, go ahead. I mean, I'll just say it. The chips that people can... that companies can use to train their own models with at AWS. That's right. And so today, the vast majority of AI processing, whether it's inference or training, is done on Nvidia GPUs.
Starting point is 00:09:31 And we're a huge partner of Nvidia. We will be for a long time. And I think that, and they make fantastic products, by the way. And they continue to do that. And when Blackwell chips come out, I think people are very excited about that next generation platform. But we also think that customers want choice. And we've seen that time and time again.
Starting point is 00:09:46 We've done that with general purpose processors. We have our own custom general purpose processor called Graviton. And so we actually went and built our own AI chips. The first version was called Trainium. We launched Trainium 1 a couple years ago, in 2022. And we're just launching Trainium 2 here at re:Invent. So that's news that's happening this week. That will be announced in my keynote.
Starting point is 00:10:11 Yes. Remember, this is filming ahead; it may have already happened. We're going to release the transcript as your keynote hits and the podcast the day later. Great. But this is brand new news, fresh off the press, folks. Fresh off the press. And so we'll have Trainium 2, and Trainium 2 gives really differentiated performance.
Starting point is 00:10:27 We see 30 to 40% price performance gains versus our instances that are GPU-powered today. So we're very excited about Trainium 2, and customers are really excited about that. And what Anthropic gives us, back to your question, is a leading frontier model provider that can really work deeply with us to build the very largest clusters that have ever been built with this new technology, where we can learn from them, right? And just learn what's working, what's not, what are the things you need accelerated, so that Trainium 3 and Trainium 4 and Trainium 5 and Trainium 6 can all get better
Starting point is 00:10:55 as we continue to go. And the software associated with the GPUs, or the accelerators, gets better as well. I think that's one of the things where people who've tried to build accelerated platforms before have fallen down: the software support has not been as good. Nvidia's software support is fantastic. And so that's a big area where they're helping us as well,
Starting point is 00:11:15 as we help iron out the creaks and the kinks and try to figure out how we make sure that developers can start to use these Trainium 2 chips in a very seamless and high-performance way. So we learn a lot from them as big users that are really leaning in and helping us learn. And they get benefits from that: they get the scale and cost benefit of running on this price performance platform, which gives them a huge win. And we think then, from that investment, we can both benefit as they deliver better and better models over time.
Starting point is 00:11:42 There's an interesting thing that happens when I speak with people who are working in cloud or working to train models, working to build their own chips. There's always a preface: we love working with Nvidia, and we're also building chips that compete with what they do. So how does that relationship work out? They don't get upset that you're trying to build the same thing? I mean, they have a supply issue, but how does it work with them? No, I have a great relationship with Nvidia and Jensen.
Starting point is 00:12:06 And this is a thing that we've done before. We have a fantastic relationship with Intel and AMD, and we produce our own general purpose processors. And it's a big world out there, and there's a lot of market for lots of different use cases. It's not that one is going to be the winner, right? There's going to be use cases where people are going to want to use GPUs, and there's going to be use cases where people are going to find Trainium to be the best choice. There are use cases where people find that our Intel instances are the best choice for them. There are ones where they find that the AMD instances are the best choice for them.
Starting point is 00:12:37 And there's increasingly a large set where they find Graviton, which is our purpose-built, general purpose processor, is the right fit for them. And it doesn't mean that we don't have great relationships with Intel and AMD, and it means we'll continue to have a great relationship with Nvidia, because for them and for us, it's incredibly important for Nvidia processors and GPU-powered instances to perform great on AWS. And so we are doubling down on our investment
Starting point is 00:13:00 to make sure that Nvidia performs outstandingly in AWS. We want it to be the best place for people to run GPU-based workloads, and I expect it will continue to be for a really long time. What's the buying process like with Nvidia? Because you want as many chips as you can get. I would imagine you have, you know, Elon, who buys them by the truckload.
Starting point is 00:13:19 You have Zuckerberg, who has been buying lots. And I think he wants to power them with a nuclear submarine or something like that. So do you have to jostle with the other companies to get Nvidia chips? Or do you get every exact chip you want? Nvidia is very fair about how they go about it. And so how do they do that?
Starting point is 00:13:35 I mean, you can ask them about how they internally allocate. That's not really a question for me. It's for them. But they're very fair in dealing with us. And we give long-term forecasts. And they tell us what they can supply. And, you know, there have been shortages in the last couple of years, specifically as demand has really ramped up.
Starting point is 00:13:52 And they've been great about ensuring that we get, you know, enough to support our joint customers as much as possible. What about your inference chips? Inferentia? Yeah. Because last time I heard you speak, you said that the activity within AI right now, Gen AI, is 50% training, 50% inference. Does that ratio still hold?
Starting point is 00:14:13 And how are you going to put the chips out there to allow companies to do cheaper inference? Because that's the issue with generative AI. It works well, but it's so expensive that companies take proof of concepts and only one in five actually makes it into production. Yeah, it's absolutely the case. And I think we're still probably seeing about that ratio of 50-50. I think more and more it's more inference than training. And increasingly we'll see more and more of the workload shift that way.
Starting point is 00:14:40 Cost is a super important factor that many of our customers are definitely worried about and thinking about on a daily basis. You know, if you think about where a lot of people were, they went and did a bunch of these Gen AI capabilities, or tests, where they did proof of concepts, and they launched hundreds of proof of concepts across the enterprise without really paying attention to what the value was going to be or anything like that.
Starting point is 00:15:01 And now they're looking at them and they say, well, the ROI is not really there. They're not really integrated into my production environment. They're just kind of these POCs that I'm not getting a lot of value out of, and they're expensive, as you mentioned. So two things that people are thinking about: one, how do I lower the cost so that it's much cheaper to run, and that's the point about cost of inference.
Starting point is 00:15:19 And two, how do I actually get more value out of that? So the ROI equation just completely shifts and it makes more sense. And it turns out it's probably not all hundred of those. It's probably two or three or five of those that are really valuable. And so there's a couple things. Number one, on the cost side, there are a few things that we're doing to help people lower costs.
Starting point is 00:15:36 Number one is, I think Trainium 2 will have a material impact there. And as these models have gotten bigger and bigger, you mentioned Inferentia. Originally, we had a small chip called Inferentia that would run really fast, lightweight inference. Now, as you're running models that have billions, tens of billions, hundreds of billions, trillions of parameters, they're way too big to fit on these small inference chips. And effectively, they're running on the same training chips.
Starting point is 00:15:59 Like, they're the exact same things. And so you run inference today on H100s, H200s, or you run inference today on Trainium 2s or Trainium 1s. And so we may come out over time with other Inferentia chips, if you will, but they're really using a lot of that same architecture, and they're still really large servers. And so we actually expect that Trainium 2 is going to be a fantastic inference platform. Our naming is not necessarily always suited to what these chips are for. But it's going to be a fantastic inference platform.
Starting point is 00:16:25 We actually think it'll be, as you think about that 30 to 40% price performance benefit that customers are going to get, now if you can run inference 30 to 40% cheaper compared to the leading GPU-based platforms, that's a pretty big price decrease. And then, also announced at re:Invent, we're launching automated model distillation inside of Bedrock. And what that lets you do is you can take one of these really large models that's really good at answering questions, you can feed it all your prompts and things for the specific use case you're going to want, and it'll automatically tune a smaller model based on those outputs and kind of teach a smaller model to be an expert only in the area that you want with
Starting point is 00:17:03 regards to reasoning and answering. So you can get these smaller, cheaper models, say a Llama 8B model as opposed to a Llama 405B model, cheaper to run, faster to run, and you can still get it to be an expert at the narrow use case that you want it to be. And so that, combined with cheaper infrastructure, we think is one of the things that is really going to help people lower their costs and be able to do more and more inference in production.
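For a feel of the mechanics behind what Garman describes, here is a minimal sketch of the underlying distillation idea: collect a large teacher model's answers to your domain prompts and turn them into a fine-tuning dataset for a small student model. This is the general technique written against boto3's Converse API, not Bedrock's managed distillation feature itself; the model ID, file name, and dataset shape are illustrative assumptions.

```python
# A minimal sketch of model distillation: collect a large "teacher" model's
# answers to your domain prompts, then use them as supervised fine-tuning
# data for a small "student" model (e.g. Llama 8B). Model ID, path, and
# record shape are illustrative; Bedrock's managed distillation automates
# these steps for you.
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

TEACHER = "meta.llama3-1-405b-instruct-v1:0"  # big, expensive, accurate
domain_prompts = [
    "Is hail damage to a parked car covered under a standard auto policy?",
    "Does our homeowners policy cover a burst pipe in an unheated garage?",
]

def teacher_answer(prompt: str) -> str:
    """Ask the teacher model and return its text reply."""
    response = bedrock.converse(
        modelId=TEACHER,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 512, "temperature": 0.0},
    )
    return response["output"]["message"]["content"][0]["text"]

# Write prompt/completion pairs in a common JSONL shape for fine-tuning jobs.
with open("distillation_dataset.jsonl", "w") as f:
    for prompt in domain_prompts:
        record = {"prompt": prompt, "completion": teacher_answer(prompt)}
        f.write(json.dumps(record) + "\n")
```

The cost logic follows directly: you pay the teacher's price once per training example, then serve the tuned student, which is far cheaper per inference on the narrow use case.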
Starting point is 00:17:27 Yeah, those small models seem to be the cost solution. Sounds like you're a believer. So one more question about Nvidia. You've tested the new Blackwell chip. Is it the real deal? We have. You know, they're working on getting the yields up and getting it into production. But we're excited about that.
Starting point is 00:17:43 And also, we're going to announce the P6, which is the Blackwell-based instance that's coming early next year. And we're excited about that. I think we are expecting about two and a half times the compute performance out of a Blackwell chip that you get out of an H100. And so that's a pretty big win for customers. So you're on board with Jensen's "the more you spend, the more you save." That's right.
Starting point is 00:18:07 That team has executed quite well, and they continue to deliver huge improvements in performance, and we're happy to make those available for customers. Okay, should we talk about ROI? Sure. All right, two-year anniversary of ChatGPT. All these companies have rushed to put generative AI in their products. To this point, there's a couple of things that I've heard that have worked well.
Starting point is 00:18:31 Yeah. AI for coding. AI that is a customer service chatbot with a little more juice. AI that can read unstructured documents and make a little sense of them. Those are the three big ones. I haven't heard much more outside of that. Yeah. We're talking about something that's added trillions of dollars potentially to public company market caps, something that has had the largest VC funding round and then probably the subsequent three after that.
Starting point is 00:18:57 Yeah. Are the three examples that I listed enough to make this worth the money? No, definitely not. But they're super valuable right now, and they're just the tip of the iceberg. And that's the thing. You just have to look at the rest of the iceberg to realize where the opportunity is going. And on those three, look, I think those are actually massive opportunities by themselves. We have a number of announcements here at re:Invent around Q Developer, making developers
Starting point is 00:19:19 and their whole life cycle more valuable. You think about the first generation, just using this as an example: the first generation of Q Developer was just code suggestions, right? Inline code suggestions. Super valuable, actually. It made developers much more efficient writing all that code.
Starting point is 00:19:32 It turns out, also, that developers on average code about one hour a day. The rest of their day is spent doing documentation, writing unit tests, doing code reviews, going to meetings, doing upgrades, updating existing applications, doing all that stuff that's not writing code.
Starting point is 00:19:47 Maybe some ping pong in there. And so as part of that, we're actually launching a bunch of new agents that do all of those things for you. You can just type in /test and it'll actually automatically write unit tests for you as you're sitting there coding. You can have a Q Developer agent write documentation for you as you're writing code. And so you can have really well documented code when you're done, and you don't have to go think about it. It'll even do code reviews and look for where you have risky parts of your code, where you maybe have open source, or parts that you should go look at and think about what the licensing rules are around,
Starting point is 00:20:18 or, even from a deployment perspective, where you may want to think about how you're deploying stuff: things that you would expect out of somebody doing a code review for you before you go do a deployment. Q can now do all that for you. Same on the contact center side, right? We're doing a ton of announcements around Connect, which is our contact-center-in-the-cloud offering,
Starting point is 00:20:36 making it much more efficient for customers to get a ton out of that contact center, all powered by generative AI. And to your point, those use cases, I think, get more and more valuable as you add more capabilities. And if you think about where things are going, if you think
Starting point is 00:20:57 about how I talked about code generation, a bunch of the value is adding agents in there so they can do a bunch of these things. Right now it's not just giving you code suggestions. It's actually going and doing stuff for you. It's writing documentation for you. It's helping you identify and troubleshoot where you have operations issues.
Starting point is 00:21:13 And it says, ooh, you have an operations issue. And it can look and understand your whole environment. You can interact with it. And you and Q together can go and look and say, ooh, it looks like some permissions over here were broken. And if you go fix those, maybe this is something that, you know,
Starting point is 00:21:27 it'll fix your application. So you're saving tons of time across that whole development lifecycle. And I think that's where, as AI gets to be more integrated into the core of what a business is, or the core of what you do, that's where you get the value. There are a number of startups doing that, but there's one that we work with called Evolutionary Scale. And they use AI to try to discover new proteins and
Starting point is 00:21:53 molecules that may be more applicable to solving certain diseases. Now, AI is not just generating stuff; instead of being able to find tens or hundreds of new molecules a year, you can now find hundreds of thousands of different proteins and test all of these and figure out which are the most likely to be successful, and get drugs to market much faster, and that's a huge amount of additional revenue. So if you think about models and capabilities that can do that, whether it's in healthcare and life sciences, whether it's in financial services, whether it's in manufacturing and automation, every single
Starting point is 00:22:28 industry, in our view, is going to be completely remade by generative AI at its core. And that's where you get that huge value. I have a question about this. I was speaking with a developer friend who said, yes, AI can code, it can do all these things, probably looking at the different things that these agents can do. The problem is, and this applies probably across the board when you trust things to generative AI, something breaks, and then you've lost the skill set to go in and fix it because you've relied so much on the artificial intelligence. What do you think about that?
Starting point is 00:22:58 Isn't that a problem? No, what's three times four? 12? Yeah, you have Excel, but you still know how to multiply. I would say maybe, but, you know, you're still able to do those things. This is different. It is different, but I think the key parts of coding are not the semantics of writing a language. The key parts of coding are thinking about how you break down a problem, how you creatively come up with solutions.
Starting point is 00:23:20 And I think that doesn't change. The tools change. They can make you more efficient. But the core of what the developer actually does is not going to change. You're going to want to think about... there are not a lot of developers today that know a lot about garbage collection. It's just true. They don't, right? Because Java just does it for them and they just don't have to worry about that.
Starting point is 00:23:36 It doesn't mean that all of a sudden, if it breaks, people can't go figure out garbage collection and do it. They just don't do it as part of their daily jobs, because it's not fun and it's not value-added. And they can focus more on writing code, right? This is what new languages have done. And so increasingly, I think developers are going to get to do the things that are exciting.
Starting point is 00:23:53 They're going to get to do creative work. They're going to get to figure out how to go solve those interesting problems. And they're going to be able to move much faster, because they don't have to worry about writing documentation. And someday, if it breaks, they probably will still know how to write documentation, and they'll figure it out. It's not rocket science.
Starting point is 00:24:08 It's just things they don't necessarily want to do. So you're a believer in reasoning. I know that AWS has some news also this week that you're going to have automated reasoning checks, where it checks for hallucinations before an answer goes out. Another issue when it comes to ROI is, again, how can I trust it? It always comes out with wrong answers. So talk a little bit about your announcement this week and how reasoning can solve some of these issues.
Starting point is 00:24:32 It's a different reasoning than you might be thinking about, too. So automated reasoning is a form of artificial intelligence that has been around for a while. And it's a thing that Amazon has adopted pretty significantly across a number of different places. What it does is it uses mathematical proofs to prove that something is operating as you intended. That's the historical way.
Starting point is 00:24:54 And an example of that is we actually use it internally to make sure that our permissioning system, when you change permissions, is actually behaving as expected. And so this AI system has a mathematical proof that can go say, OK, all the places that permissions are applied, across a surface area that's too large for you to actually go check everything, it can prove that they're applied in the way they should be, because it knows how the system is supposed to operate.
Starting point is 00:25:17 And it can kind of mathematically prove, yes, your IAM permissions mean you can access this bucket or you can't access this bucket. We took that and we said, can we apply that to AI to eliminate hallucinations? And it turns out you can't do it universally, but for selected use cases where it's important that you get the answer right, you can. And so take an example: say you're an insurance company, right? And you want to be able to answer questions from people who say, hey, I have this problem, is it covered? And you don't want to say yes when the answer is no, or vice versa, right? And so that is one where it's pretty important to get that right.
Starting point is 00:25:51 What you do is you upload all your policies and all your information into the system, and we'll automatically create these automated reasoning rules. And then there's a process that you go through, kind of 10, 15, 20, 30 minutes, where you as the developer answer questions about how it's supposed to interact, right? And you tune it a little bit.
Starting point is 00:26:12 And then it goes, OK, now I have a tuned model. Now, if you go ask it a question, you say, hey, I ran my car through my garage door, is that covered by my insurance policy? It'll go and it will actually produce a response for you. And then it'll tell you that yes, this is provably correct that the answer is yes. And here are the reasons why in the documentation I have and why I feel confident in that.
Starting point is 00:26:34 Or it'll tell you, actually, I don't know the answer. Here are some suggested prompts that I recommend you put back into the engine to see if you can get the answer correct. Because I can't tell you. I came up with yes, but I actually don't know for sure that is the right answer. Change the prompts. And it'll give you kind of tips and hints on how you can re-engineer your prompts or ask additional questions, until you get an answer that's provably correct by automated reasoning. So by this kind of mechanism, you're systematically able to actually mathematically prove that you got the right answer coming out of this, and completely eliminate hallucinations for that area.
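To give a flavor of what "mathematically prove" means here, below is a minimal sketch of the general automated reasoning idea using the open-source Z3 solver: encode the policy as logical rules, then accept the model's answer only if its negation is unsatisfiable. The rules and variables are made-up illustrations, not Bedrock's actual implementation.

```python
# A minimal sketch of automated reasoning: encode policy rules as logic,
# then check whether a model's claimed answer is entailed by those rules.
# Uses the open-source Z3 solver; not AWS's actual implementation.
from z3 import Bool, Implies, And, Not, Solver, unsat

# Hypothetical insurance policy facts and rules.
collision_coverage = Bool("collision_coverage")   # policy includes collision
own_property = Bool("own_property")               # damage was to own property
covered = Bool("covered")                         # the claim is covered

rules = And(
    # Rule: collision coverage pays even for damage to your own property.
    Implies(And(collision_coverage, own_property), covered),
)
facts = And(collision_coverage, own_property)     # this customer's situation

def provably(claim):
    """A claim is proved if rules + facts make its negation unsatisfiable."""
    s = Solver()
    s.add(rules, facts, Not(claim))
    return s.check() == unsat

# The model answered "yes, it's covered" -- verify before sending it out.
if provably(covered):
    print("Provably correct: the claim is covered.")
else:
    print("Cannot prove it -- flag the answer and suggest rephrasing.")
```

This is why the guarantee is scoped: the proof only covers questions expressible in the encoded rules, which matches Garman's caveat that hallucinations are eliminated "for that area" rather than in general.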
Starting point is 00:27:05 It doesn't mean that we've eliminated hallucinations across the board. Just for that area. Yeah. If you go ask it who's the best pitcher on the Mets, it may or may not answer your question reasonably. But let me ask you: what you're talking about is also very similar to what Marc Benioff talked about on the show last week, where he said that because companies have large stores of information within his platform,
Starting point is 00:27:28 agents will be able to go in and pull it out and then present it and sort of help create the linkage to go from step A to step B. And it was interesting to me, because I had always thought agents were going to be something maybe built by Anthropic, where it's my individual agent that goes out into the world and does what I need. And I think both you and Benioff, correct me if I'm wrong, have this idea that the agent is going to be something that I'm going to interact with when I'm speaking with a company, or that's actually going to perform tasks at work. Maybe that's going to happen before consumers get them. Yeah, I think that that's right. I think that agents are going to be a really powerful tool. Actually, another thing that we're launching this week is, you know,
Starting point is 00:28:05 one of the things about agents today is that they're quite good at doing relatively simple tasks. What they're very good at, actually, is tasks that are pretty well defined in a particular narrow slice, where they go accomplish something. And so what a lot of people are doing is starting to launch a bunch of agents, right? One that's very good at going and doing one particular task, another one that's good at another task. But increasingly, you actually need those agents to interact with each other, right? So we have an example in my keynote where we talk about,
Starting point is 00:28:35 if you're thinking about, should I launch a coffee shop? Say you're a global coffee chain. You want to say, I'm going to launch a new location here. You might have an agent that goes out and investigates what the situation is in a particular location. You might have another agent that goes and looks at what the competitors are in that area. You may have another agent that goes and does a financial analysis of that particular area, another one that looks at the demographics of that zone, et cetera. And that's great.
Starting point is 00:29:00 So now you have half a dozen agents that go and do a bunch of these things. That saves you some time, but they actually kind of interact with each other, right? The demographics may change your financial analysis, as an example. And so that's super hard. And then if you want to do it across 100 different locations to see where the best one is, that's also hard to do. It's super hard to coordinate, because those may also be interrelated. You know, putting a coffee shop here and then another one two blocks down may interact
Starting point is 00:29:28 with each other. They can't be independent. So we launched a multi-agent collaboration capability where you basically have this kind of super agent brain that can actually help collaborate across all of them, break ties between them, and help pass data back and forth between them. And so we think that this is going to be a really powerful way for people
Starting point is 00:29:45 to really accomplish much more complicated things out there in the world, where, again, there's a foundation model under the covers that's driving a bunch of this reasoning and breaking these jobs into individual parts, and then the agents go and actually accomplish a bunch of this work.
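As a rough sketch of the supervisor pattern Garman describes, the snippet below has specialist agents that are each just a narrowly scoped model call, plus a supervisor that orders the calls and threads earlier results into later ones (so the financial analysis can use the demographics, for instance). The model ID, agent roles, and prompts are illustrative assumptions, not AWS's actual multi-agent collaboration implementation.

```python
# A minimal sketch of the supervisor ("super agent brain") pattern:
# specialist agents each handle one narrow task, and a supervisor
# sequences them, passing outputs forward so dependent analyses
# can build on earlier results. Roles and model ID are illustrative.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
MODEL = "anthropic.claude-3-5-sonnet-20240620-v1:0"  # example model ID

def run_agent(role: str, task: str, context: str = "") -> str:
    """One specialist agent = one narrowly scoped prompt to the model."""
    prompt = f"You are a {role}.\nContext: {context}\nTask: {task}"
    response = bedrock.converse(
        modelId=MODEL,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 512},
    )
    return response["output"]["message"]["content"][0]["text"]

def supervisor(location: str) -> str:
    """Supervisor: orders the sub-tasks and threads results between agents."""
    site = run_agent("site researcher", f"Describe foot traffic near {location}.")
    rivals = run_agent("competitor analyst", f"List coffee shops near {location}.")
    demo = run_agent("demographics analyst", f"Summarize demographics of {location}.")
    # The financial agent depends on the other agents' outputs.
    return run_agent(
        "financial analyst",
        f"Estimate first-year viability of a coffee shop at {location}.",
        context=f"{site}\n{rivals}\n{demo}",
    )

print(supervisor("5th & Main, Las Vegas"))
```

A production version would also handle the tie-breaking and parallel fan-out across 100 locations that Garman mentions, which is exactly the coordination work the "super agent brain" is meant to absorb.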
Starting point is 00:30:03 we think into this? Yes, sure. This is the ultimate number of keynote announcements that have been introduced into a podcast. So thank you for that. All right. Welcome to reinvent. Exactly. All right.
Starting point is 00:30:11 We're going to take a quick break and come back with Matt Garman, the CEO of AWS. Hey, everyone. Let me tell you about The Hustle Daily Show, a podcast filled with business, tech news, and original stories to keep you in the loop on what's trending. More than 2 million professionals read The Hustle's daily email for its irreverent and informative takes on business and tech news. Now, they have a daily podcast called The Hustle Daily Show, where their team of writers break down the biggest business headlines in 15 minutes or less and explain why you should care about them.
Starting point is 00:30:41 So, search for The Hustle Daily Show in your favorite podcast app, like the one you're using right now. And we're back here on Big Technology Podcast with Matt Garman of AWS. Let's talk about some earth-centric topics, starting with nuclear. You have invested $500 million in a company called X-energy to do nuclear. You're also part of, I would say, a wave of companies that are reanimating nuclear energy in the United States. And part of that is because these nuclear plants had excess capacity that they needed to get off their hands. I just want to ask you a broad question about whether we should really believe in this moment for nuclear. Because on one hand, it's for the moment cleaner than fossil fuels.
Starting point is 00:31:22 On the other hand, we don't really know what happens with nuclear waste. We can't get rid of it. It has to sit in silos that could be damaging for the planet over time. So is it really, and this is sort of a sensitive one, but is it really an improvement to go to nuclear, and how can we be sure, because of the long-term effects here? Yeah, look, I think nuclear is a fantastic option
Starting point is 00:31:41 for clean energy. It is a carbon-zero energy source that has a ton of potential. And as you look at the energy needs over the next couple of years, and really the next couple of decades, whether it's from technology or broadly in the world, whether it's electric cars or just the general electrification of lots of things in our world, we're going to need a lot more energy.
Starting point is 00:31:59 And we at Amazon are one of the biggest investors in renewable energy in the world. In the last five years, we've done over 500 renewable projects where we've added and paid for new energy to the grid, whether they're solar or wind or others. And so we'll continue to invest in those projects. I think they're super valuable. But there's probably not going to be enough of those
Starting point is 00:32:25 soon enough for us to really get to where we want to get from a clean energy perspective. And so I think nuclear is a huge portion of that. You know, look, there's always the fearmongering from back in the 60s and 70s about what nuclear used to be. Nuclear is an incredibly safe technology today. It's much different today. It turns out technology has changed in the last 50 years. And it's improved a lot.
Starting point is 00:32:44 And so there's been a ton of improvement in that space. And we think that it is both a very safe and very eco-friendly energy source that is going to be critical for our world as we keep ramping our energy needs. And we think that as part of that portfolio, right, you're going to have solar, you're going to have wind, you're going to have others, but nuclear is going to play an important role in that. And we're excited about what that potential looks like. You mentioned X-energy. We do think that, you know, starting somewhere in 2030 and beyond, these small modular
Starting point is 00:33:19 reactors, which is what X-energy builds, are going to be a huge component of this. And they'll be part of that portfolio of offerings. But today, all these nuclear plants that people build are really large implementations. They're multi-billion-dollar projects to go build these energy plants. And they produce lots of energy, which is great. But obviously all that energy is in one location. And then you have to invest in a ton of transmission to get the energy to the actual place it needs to go. And they're big projects.
Starting point is 00:33:46 These small modular reactors are much smaller. You can actually produce them almost like you produce gas turbines, in a factory-type setting, eventually. And you can put them where you need them, right? So you can actually put them next to a data center, where transmission is not going to have to be an important factor. And so we think that that's a great solve
Starting point is 00:34:04 for a portion of the world's energy needs as we continue to evolve over time. And so it's one of the components of an energy portfolio that we're very excited about. Okay, so we'll be watching that closely. On the state of the economy: AWS had a few quarters of stagnant growth. It was still impressive growth,
Starting point is 00:34:21 but it flatlined for a moment. And part of that was because customers... Not quite flatlined, but it was down from where it had been. Okay, but I'm just talking about the percentage. Yeah. Part of that was because the economy was in a rough moment. Everybody was looking for efficiency. And so what you did was, I think, you made some deals with customers to help get their bills down, or to help them get the most out of what they were doing, so they could, you know, effectively live that efficiency motto. What does it look like right now? Is the economy back, or are
Starting point is 00:34:50 people still in efficiency mode? Yeah, I'd say, and by the way, it wasn't even just deals. We went and proactively jumped in with our customers and helped them figure out how they could reduce their bills. And we looked at where they could consolidate resources, where they could move to cheaper offerings, where they could maybe do more with less. And we were really proactive about helping customers reduce those costs, because from our view it was important for them as they thought about getting their economics in the right place, and it was the right thing to do for them and built that long-term trust. Now, customers, I think, number one, a lot of them have been optimized, right? And there's only so much you can kind of squeeze out of an optimized place.
Starting point is 00:35:30 And customers are still looking for optimizations, but a lot of that work has been done. And they're using some of that optimization to help fund some of the new development that they want to do. A lot of that is in the area of AI. Much of that is in the area of migration and modernization, where they're moving from on-prem into a cloud world. So some of those optimizations they did are helping them fund some of that work that's moving more of their workloads to the cloud and letting them go and build new AI experiences in AWS. So that is where you've seen our growth start to come back up on a percentage basis. Some of that is customers leaning into those new experiences and doing
Starting point is 00:36:06 more of those modernization migrations. Okay, I want to wrap on a culture question. Okay. Andy Jassy recently emailed the company, and he said that, as a consequence of scale, and I'm going to get it exactly as he said it, he says there have been pre-meetings for the pre-meetings for the decision meetings, a longer line of managers feeling like they need to review a topic before it moves forward, owners of initiatives feeling less like they should make recommendations because the decision will be made elsewhere. Was that going on within AWS, and what is the process to change that? Yeah, I think it's across Amazon. So it wasn't
Starting point is 00:36:43 specific to AWS or the rest of Amazon; it was definitely inside of AWS too. And, you know, look, it's kind of a natural evolution. We have these leadership principles inside of Amazon, things like being customer obsessed and really understanding the customer. And in order to really understand the customer, you've got to be close to the customer. And so you want a flatter organization; the more layers you have,
Starting point is 00:37:02 the more removed you are from customers. And we were fundamentally growing. We went through a period of explosive growth in just the number of people and the size of the company and the size of the business. And so throughout that, we just didn't always have the organizational structure exactly right. And so, you know, we believe that a flatter organizational structure is better.
Starting point is 00:37:21 The closer you are to the customers, the better decisions you're going to make, and the faster decisions you're going to make. And you really want ownership to be pushed down to the people who really are making some of those decisions. And when you have a very hierarchical organization where people don't feel like they have that ownership to make decisions, you go slow. And for us, speed really matters. And so I think Andy was just highlighting some observations we had. He's incredibly thoughtful on these points, which I appreciate, and we've had a lot of debate here. It's nothing that's broken, but you could see really early warning signs or stress around it.
Starting point is 00:37:55 And for us, culture is so important, and doing things in the right way, being that customer obsessed, having the right level of ownership, is so important to what makes Amazon so special. And so we're kind of getting ahead of there being any problems. It's not like there was any burning problem, and we obviously could have just done nothing and kind of let things go for a while. But for us, it's not the Amazon way. It's not the Amazon way.
Starting point is 00:38:16 And so we're just being proactive, identifying that, like, hey, look, this is super important for us. So let's just be aware of it. Let's be upfront about it, think about it, and be very intentional as we think about organizational structures and where we can land. And I think all of that has been received really well, because it turns out these are not things that people complain about. They are really focused on ownership. They love being customer obsessed.
Starting point is 00:38:41 And most of that has been quite well received. So you can be a big company, but not have big company culture. That's right. Okay, last one before we go. You've said that less than 20% of all workloads have moved to the cloud so far. What is the max number? Can that be 100% in time? I was going to say 100. Is that smart?
Starting point is 00:39:01 No, but what's realistic? Yeah, you know, I think it's a good question. If you think about how many workloads are out there, I don't know what the max is. I'm very bad at picking the maximum size of AWS. You'd have to have some sort of total addressable market number. But I actually think that, at a minimum, that percentage could flip, and it could be 80-20 versus the 20-80 where it is today,
Starting point is 00:39:21 or even less. I think there's a massive number of applications that just haven't moved. And if you think about line of business applications, as you think about workloads that are in telco networks, if you think about workloads that are running inside of hospitals, if you think about, like, it's not even just traditional data center workloads,
Starting point is 00:39:36 but there's a lot of these other workloads that would be much more valuable, they'd be much more connected, they'd be much more able to to take advantage of advancements in AI if they were connected into the cloud world and running there. And so I think that there's a huge opportunity for us to continue to expand what it means to be in the cloud
Starting point is 00:39:52 and to continue to migrate many of these workloads that just haven't moved. And so there's a massive opportunity. I think kind of flipping that percentage over time could be an interesting opportunity for us. And the size of the pie is getting bigger too. I think that's the other exciting thing about generative AI is that the total amounts of compute workloads
Starting point is 00:40:10 are actually significantly accelerating too. timeline to flip? Decades? I still think we're still ways out for the whole thing to flip. There's just a massive amount of workloads out there. But we'll keep working on them and keep going as fast as we can. Mike Arvin, great to meet you. Thanks, nice to coming on the show.
Starting point is 00:40:26 Yeah, thank you. All right, everybody, thank you for listening. We'll be back on Friday breaking down the news and we'll see you next time on Big Technology Podcast. All right, Matt. Cool. Thank you. Yeah, you're here a week.
Starting point is 00:40:37 It's great. Until Wednesday. Okay, cool. Hope you enjoy it.
