Big Technology Podcast - AWS CEO Matt Garman on Amazon's Big AI Chips Bet, Working With OpenAI, and Nuclear Energy
Episode Date: December 4, 2024
Matt Garman is the CEO of Amazon Web Services. Garman joins Big Technology Podcast to discuss Amazon's AI strategy and future plans. Tune in to hear Garman's insights on AWS's AI chips, partnerships with leading AI companies including OpenAI, the ROI of generative AI, and how AWS is helping customers leverage AI while managing costs. We also cover AWS's investment in nuclear energy, the current state of the cloud market, and Amazon's culture challenges. Hit play for a wide-ranging and in-depth conversation with the executive leading the world's largest cloud computing operation. Transcript: https://www.bigtechnology.com/p/aws-ceo-matt-garman-talks-amazons If you like what we're doing here, please rate the show five stars on Apple Podcasts and Spotify. Thanks for listening!
Transcript
AWS CEO, Matt Garman, is here to talk about AI, the state of the economy and Amazon culture.
That's coming up right after this.
Welcome to Big Technology Podcast, a show for cool-headed, nuanced conversation of the tech world and beyond.
We are here in Las Vegas, Nevada at Amazon's re:Invent conference with the CEO of AWS, Matt Garman.
Matt, so great to see you.
Welcome to the show.
Yeah, thank you for having me.
Let's talk a little bit about infrastructure.
You're the kings of building data centers, right?
There's no one that does it better than AWS.
But there are headlines in the AI world.
Elon Musk took 122 days to build a 100,000 plus GPU data center.
Does this show that scaling data centers is now core to competing in AI?
Is it validation?
What do you think about it?
Well, look, we've been building data centers for almost two decades now.
And so this is something that we spend a lot of time on.
And it's less that we're out there kind of bragging to the press about.
But what we do is we provide...
You can brag.
You're the size of yours.
Well, more so, what we do is provide infinite scale for customers.
And so our goal is largely that customers not have to think about these things, right?
And so we want them, across their compute, across their storage, across their databases,
to be able to scale to any size.
And so take something like S3 as an example.
It's an incredibly complex, very detailed system that keeps your data, keeps it durable, and
scales infinitely, right?
And customers largely just put data in there and don't have to think about it.
And so today, S3 actually stores 400 trillion objects.
It's an enormous number that's hard to even get your head around.
But it's just something where we just keep scaling and we keep growing for our customers.
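To put that 400 trillion in perspective, a quick back-of-envelope calculation (the population figure is a rough outside assumption, not from the conversation):

```python
# Rough scale check on "S3 stores 400 trillion objects."
s3_objects = 400e12        # figure quoted in the conversation
world_population = 8.1e9   # rough 2024 estimate (an assumption)

print(f"~{s3_objects / world_population:,.0f} objects per person on Earth")
# -> roughly 49,000 objects for every person alive
```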
As you think about AI now, these are power-hungry, massive data centers for sure.
And AWS is adding tons and tons of compute all the time for our customers.
Largely what we think of though is less about how fast can you build one particular cluster.
The absolute size of AWS dwarfs any other particular cluster out there.
But we're focused on how do we deliver the compute the customers need to go build their applications?
Take somebody like Anthropic as an example.
Anthropic has what are widely considered to be the most powerful AI models out there today in their Claude set of models.
We're building together with them what we call Project Rainier.
And so it's using our next-generation Trainium 2 chips.
And this cluster that we're building for them in 2025 will be five times the number of exaflops
that they used to train the current generation of models,
which are by far the most powerful ones out there.
It's going to be five times that size next year,
all built on Trainium 2, delivering hundreds of thousands of chips
in a single cluster for them so they can train the next generation.
That's the type of thing where we work with customers,
understand what's interesting for them,
and then help them scale to whatever level they need.
And that's just one of our customers, of course.
We have hundreds and hundreds of other customers as well.
Here's my point. You're so good at this.
Look at what you just talked about in terms of Anthropic,
being able to help them scale the way that you are.
And that would lead me to believe that Amazon would have its own cutting-edge, state-of-the-art model, one that would lead, you know, and be better than the OpenAIs and the Anthropics.
This is your core competency, and this is what makes these models run.
So why hasn't that happened?
Our core competency is about delivering compute power for all of the people that need it.
And, you know, for a long time, we've been very focused on how do we build the capabilities to let our customers build whatever they want.
And sometimes there are areas that Amazon also builds, and other times they're not areas that Amazon builds.
And so whether you think about the database world, or the storage world, or the data analytics world, or the ML world, we build this underlying compute platform that everybody can go build upon.
And sometimes we build services that compete with others out there in the market.
Think about Redshift competing with Snowflake, which is also a very important partner of ours and a big customer of ours and somebody that we do a lot of partnering together with.
And then there's other times where there's applications that people build on top of AWS that Amazon doesn't go and build.
And so we operate across that whole swath of area
and sometimes we'll build and sometimes we won't.
But that's the kind of the beauty of AWS
is that our goal is to build that infrastructure
so that sometimes we can build those,
sometimes we won't build them,
but we want this platform that everybody can go build
the broadest set of applications possible out there.
But I'm thinking about it for AI specifically.
And in the world that you play in,
you have Google, they have their own model,
they sell cloud services.
You have Microsoft, okay, they don't have their own models
necessarily, but they have this deal
with OpenAI.
Pretty sure that OpenAI is exclusive on Azure.
Now, this is where a lot of the growth is coming from.
And so.
And I think it's a mistake, actually.
So the interesting thing there is, and this
is where a lot of people started,
I just think it's fundamentally the wrong way of thinking
about it. A lot of times people are thinking
there's just going to be this one model.
And I want to have the one model that's going to be the most
powerful and the one model to rule them all.
And as you've transitioned, as you've seen over the last year,
there isn't one model that's the best at everything.
There's models that are really good at reasoning.
There are models that provide open weights
so that people can bring their own data and fine-tune them
and distill them,
and create kind of completely new models from that
that are completely custom for customers.
And for that, you may want to use a Llama model
or you may want to use a Mistral model.
There's customers who really want to build the world's best images
and they might use something like a Stability model
or they might use something like a Titan model.
There's customers that need really complex reasoning
and they might use an Anthropic model.
There's a whole ton of these operating out there.
And our goal is how do we help customers
use the very best.
It doesn't have to be one thing.
It's not just one.
And we don't think that there's one best database.
We don't think there's one best compute platform
or processor.
We don't think that there's one best model.
It's across that whole set.
And that's been our strategy.
And customers have really embraced that strategy.
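To make that choice concrete: in Bedrock, different providers' models sit behind the same runtime API, so switching is roughly a one-string change. A minimal sketch using boto3's Converse API; the model IDs are illustrative and regional availability varies:

```python
import boto3

# One client, many models: Bedrock exposes different providers' models
# behind the same Converse API, so "choice" is mostly a modelId string.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

model_ids = [
    "anthropic.claude-3-5-sonnet-20240620-v1:0",  # strong reasoning
    "meta.llama3-8b-instruct-v1:0",               # open weights, cheap to run
    "mistral.mistral-7b-instruct-v0:2",           # another open-weight option
]

messages = [{"role": "user", "content": [{"text": "Summarize re:Invent in one line."}]}]

for model_id in model_ids:
    reply = bedrock.converse(modelId=model_id, messages=messages)
    print(model_id, "->", reply["output"]["message"]["content"][0]["text"])
```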
As you talk to them, and they're thinking about how they go
build production applications,
They want the stability, the operational excellence,
and the security that they get with AWS.
But they also want that choice.
It's incredibly important for them.
And I think choice
is important for customers,
no matter if they're building generative AI applications,
no matter if they're picking a database offering,
no matter if they're picking a compute platform,
they want that choice.
And that is something that AWS, from the very earliest day,
has really leaned into.
And I think it's an important part of our strategy.
And it's maybe not the strategy that others have.
Maybe others say it's just this one,
and this is the one that we're going to lean into,
but it's not the strategy that we've picked.
Our choice is around choice.
And it's part of why we have the broadest
partner ecosystem as well.
It's why, as you walk the halls here at re:Invent,
they're filled with partners who are building
their business on top of AWS, and we're leaning in and helping our joint customers accelerate their
journeys to the cloud, their modernization efforts, their AI efforts.
That, I think, is a lot of what makes AWS special.
Okay, and I'm going to move off this in a moment.
But the reason why I'm asking these questions is because you do have at least a bet that
big foundational models are going to matter.
That's the $4 billion you just invested in Anthropic.
And I think that the strategy that AWS has makes a lot of sense, right?
This Bedrock strategy, there's a lot of different models in there.
People have their data in the cloud.
They're going to build with the data they have within AWS, using Bedrock, picking models.
But you also are limited in the fact that OpenAI is not there.
I don't think Google's there.
So wouldn't it make sense in parallel to the bring your own model strategy to also use this capacity that you have to scale infrastructure to get in the game yourself?
Look, what I will say is never say never.
It's an interesting idea, and we never close any doors.
I think we're always open to, frankly, a whole host of things.
We're always open to having OpenAI be available in AWS someday
or having Gemini models be available in AWS someday.
And maybe someday we will spend more time focused on our own models, for sure.
I think all of that is open.
And part of what I think makes AWS special,
take our announcement earlier this year
about partnering deeply with Oracle,
about making Oracle databases available in AWS.
Lots of people would have said, oh, that's never going to happen.
And it's against your strategy.
Our strategy is to embrace all technologies,
because anything that customers want to use,
we want them to be able to use inside of AWS.
And look, sometimes it happens today.
Sometimes it happens tomorrow.
Sometimes it happens weeks from now, months from now,
years from now.
But that is our goal is to make all of those technologies
available for our customers.
OK, I'm going to parse your language a little bit,
because you said that you might be open
to having OpenAI on Bedrock within AWS.
Are you talking to them?
Would you want to ask them to come on?
There's nothing to announce there today.
But I'm saying if customers want that,
that's something that we would want.
And we'd love to make it happen at some point.
Yeah.
Well, maybe they're listening and they want to make that move.
Yeah.
But let's speak to the one that I think is the biggest challenger to them, the one that you
have all this money in, which is Anthropic.
So what does the $4 billion that you just invested in Anthropic get you?
And how does that make you differentiated from other cloud providers?
Well, there's a couple things I'd say.
One is, you know, we make the investments in Anthropic because we think it's a good bet.
They have a very good team.
They've made some incredible traction in the market.
And we really like where they're innovating.
Yeah, we're definitely Claude heads on the show.
Yeah, it's a fantastic product, right?
And Dario and team are very good.
And they continue to actually attract some of the best talent out there in the market today.
You know, the other thing that we get from that is a deep collaboration on Trainium.
And, you know, we've made a big bet on Trainium as an additional option for customers.
You know, the vast majority of...
We should define it.
Yeah, I was going to get there.
Oh, sorry, go ahead.
I mean, I'll just say it.
The chips that people can...
that companies can use to train their own models with at AWS.
That's right.
And so today, the vast majority of AI processing, whether it's inference or training, is done on
Nvidia GPUs.
And we're a huge partner of Nvidia.
We will be for a long time.
And I think that, and they make fantastic products, by the way.
And they continue to do that.
And when Blackwell chips come out, I think people are very excited about that next
generation platform.
But we also think the customers want choice.
And we've seen that time and time again.
We've done that with general purpose processors.
We have our own custom general purpose processor called Graviton.
And so we actually went and built our own AI chips.
The first version is called Trainium.
We launched Trainium 1 a couple years ago, in 2022.
And we're just launching Trainium 2 here at re:Invent.
So that's news that's happening this week.
That will be announced in my keynote.
Yes.
Remember, we're filming this early.
It may have already happened.
We're going to release the transcript as your keynote hits and the podcast the day later.
Great.
But this is brand new news, fresh off the press folks.
Fresh off the press.
And so we'll have Trainium 2, and Trainium 2 gives really differentiated performance.
We see 30 to 40% price-performance gains versus our instances that are GPU-powered today.
So we're very excited about Trainium 2, and customers are really excited about that.
And what Anthropic gives us back to your question is a leading frontier model provider
that can really work deeply to build the very largest clusters that have ever been built
with this new technology where we can learn from them, right?
And just learn what's working, what's not, what are the things
you need accelerated, so that Trainium 3 and Trainium 4
and Trainium 5 and Trainium 6 can all get better
as we continue to go.
And the software associated with the accelerators
gets better as well.
I think that's one of the things where people who've tried
to build accelerated platforms before have fallen down:
the software support has not been as good.
Nvidia's software support is fantastic.
And so that's a big area where they're helping us as well,
as we help iron out the kinks
and try to figure out how we make sure that developers can
start to use these Trainium 2 chips in a very seamless and high-performance way.
So we learn a lot from them as big users that are really leaning in and help us learn.
And they get benefits from that scale and the cost benefit of running on this
price-performance platform, which gives them a huge win.
And we think then from that investment we can both benefit as they deliver better and
better models over time.
There's an interesting thing that happens when I speak with people who are working
in cloud or working to train models, working to build their own chips.
There's always a preface.
We love working with Nvidia, and we're also building chips that compete with what they do.
So how does that relationship work out?
They don't get upset that you're trying to build the same thing?
I mean, they have a supply issue, but how does it work with them?
No, I have a great relationship with Nvidia and Jensen.
And this is a thing that we've done before.
We have a fantastic relationship with Intel and AMD, and we produce our own general purpose processors.
And it's a big world out there, and there's a lot of market
for lots of different use cases. It's not that one is going to be the winner, right?
There's going to be use cases where people are going to want to use GPUs,
and there's going to be use cases where people are going to find Trainium to be the best choice.
There are use cases where people find that our Intel instances are the best choice for them.
There are ones where they find that the AMD instances are the best choice for them.
And there's increasingly a large set where they find Graviton,
which is our purpose built, general purpose processor, is the right fit for them.
And it doesn't mean that we don't have great relationships with Intel and AMD,
and it means we'll continue to have a great relationship with
Nvidia, because for them and for us,
it's incredibly important for Nvidia processors
and GPU-powered instances to perform great on AWS.
And so we are doubling down on our investment
to make sure that Nvidia performs outstanding in AWS.
We want it to be the best place
for people to run GPU-based workloads,
and I expect it will continue to be for a really long time.
What's the buying process like with Nvidia?
Because you want as many chips as you can get.
I would imagine you have Elon, who buys them by the truckload.
You have Zuckerberg, who has been buying lots.
And I think he wants to power them with a nuclear submarine
or something like that.
So do you have to jostle with the other companies
to get Nvidia chips?
Or do you get every exact amount you want?
Nvidia is very fair about how they go about it.
And so how do they do that?
I mean, you can ask them about how they internally allocate.
That's not really a question for me.
It's for them.
But they're very fair in dealing with us.
And we give long-term forecasts.
And they tell us what they can supply.
And, you know, there have been shortages
in the last couple of years, specifically as demand has really ramped up.
And they've been great about ensuring that we get, you know,
enough to support our joint customers as much as possible.
What about your inference chips?
Inferentia?
Yeah.
Because last time I heard you speak, you said that the activity within AI right now,
Gen AI, is 50% training, 50% inference.
Does that ratio still hold?
And how are you going to put the chips out there to allow companies to be able to
do cheaper inference because that's the issue with generative AI.
It works well, but it's so expensive that companies take proof of concepts and only one fifth
actually make them out into production.
Yeah, it's absolutely the case.
And I think we're still probably seeing about that ratio of 50-50.
I think more and more it's more inference than training.
And increasingly we'll see more and more of the workload shift that way.
Cost is a super important factor that many of our customers are definitely worried about
and thinking about on a daily basis.
You know, if you think about where a lot of people were,
they went and did a bunch of these Gen AI capabilities,
or tests, where they did proof of concepts,
and they launched hundreds of proof of concepts
across the enterprise without really paying attention
to what the value was going to be or anything like that.
And now they're looking at them and they say, well,
the ROI is not really there.
They're not really integrated into my production environment.
They're just kind of these POCs that I'm not
getting a lot of value out of, and they're expensive, as you mentioned.
So two things that people are thinking about are: one,
how do I lower the cost, so that it's much cheaper
to run, and that's the point about cost of inference.
And two, how do I actually get more value out of that?
So the ROI equation just completely shifts
and it makes more sense.
And it turns out it's probably not all hundred of those.
It's probably two or three or five of those that are really valuable.
And so there's a couple things.
Number one is on the cost side, there are a few things
that we're doing to help people lower costs.
Number one, I think Trainium 2 will have a material impact there.
And as these models have gotten bigger and bigger,
you mentioned Inferentia.
Originally, we had a small chip called Inferentia that would run
really fast, lightweight inference.
Now, as you're running models that have billions, tens of billions, hundreds of billions,
trillions of parameters, they're way too big to fit on these small inference chips.
And effectively, they're running on the same training chips.
Like, they're the exact same things.
And so you run inference today on H100s, H200s, or you run inference today on Trainium
2s or Trainium 1s.
And so we may come out over time with other Inferentia chips, if you will, but they're really
using a lot of that same architecture, and they're still really large servers.
And so we actually expect that Trainium 2 is going to be a fantastic inference platform.
Our naming doesn't necessarily always suit what these chips are for.
But it's going to be a fantastic inference platform.
We actually think it'll be, as you think about that 30 to 40% price performance benefit
that customers are going to get, now if you can run inference at 30% to 40% cheaper compared
to the leading GPU-based platforms, that's a pretty big price decrease.
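As a back-of-envelope illustration of what that means for an inference bill (the dollar figures here are invented; only the 30 to 40% range comes from the conversation):

```python
# Hypothetical monthly GPU-based inference spend; only the 30-40% savings
# range comes from the conversation, the bill itself is made up.
monthly_inference_bill = 250_000.0  # dollars

for saving in (0.30, 0.40):
    new_bill = monthly_inference_bill * (1 - saving)
    print(f"{saving:.0%} cheaper: ${new_bill:,.0f}/month "
          f"(${monthly_inference_bill - new_bill:,.0f} saved)")
```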
And then, also announced at re:Invent, we're launching automated model distillation
inside of Bedrock. What that lets you do is take one of these really large models
that's really good at answering questions, feed it all your prompts and things for the
specific use case you're going to want, and it'll automatically tune a smaller model based on those
outputs, and kind of teach a smaller model to be an expert only in the area that you want with
regards to reasoning and answering. So you can get these smaller, cheaper models, say a Llama
8B model as opposed to a Llama 405B model: cheaper to run, faster to run, and you can still get it
to be an expert at the narrow use case
that you want it to be.
And so that, combined with cheaper infrastructure,
we think is one of the things that is really
going to help people lower their costs
and be able to do more and more inference in production.
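The distillation loop he describes, reduced to its core idea: a large teacher model labels your real prompts, and a small student model is fine-tuned on those outputs. This is a conceptual sketch, not Bedrock's actual distillation API; the teacher and the fine-tuning step are stand-ins:

```python
import json
from typing import Callable

def build_distillation_set(prompts: list[str],
                           teacher: Callable[[str], str],
                           path: str = "distill.jsonl") -> str:
    """Label domain prompts with a large teacher model and write a
    fine-tuning dataset for a small student model."""
    with open(path, "w") as f:
        for prompt in prompts:
            record = {"prompt": prompt, "completion": teacher(prompt)}
            f.write(json.dumps(record) + "\n")
    return path

# Stand-ins: in practice `teacher` would call a 405B-class model endpoint,
# and the JSONL would feed a fine-tuning job for an 8B-class student model.
prompts = ["Is water damage from a burst pipe covered?",
           "What is the deductible for windshield repair?"]
print("wrote", build_distillation_set(prompts, teacher=lambda p: f"(teacher answer to: {p})"))
```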
Yeah, those small models seem to be the cost solution.
Sounds like you're a believer.
So one more question about Nvidia.
You've tested the new Blackwell chip.
Is it the real deal?
We have. You know, they're working on getting the yields up
and getting it into production.
But we're excited about that.
And we're going to announce the P6, which is the Blackwell-based instance, coming early next year.
And we're excited about that.
I think customers, I think we are expecting about two and a half times the compute performance
out of a Blackwell chip that you get out of an H100.
And so that's a pretty big win for customers.
So you're on board with Jensen's "the more you spend, the more you save."
That's right.
That's, you know, that team has executed quite well.
And they continue to deliver huge improvements
in performance, and we're happy to make those available for customers.
Okay, should we talk about ROI?
Sure.
All right, two-year anniversary of ChatGPT.
All these companies have rushed to put generative AI in their products.
To this point, there's a couple of things that I've heard that have worked well.
Yeah.
AI for coding.
AI that is a customer service chat bot with a little more juice.
AI that can read unstructured documents and make a little sense of them.
Those are the three big ones.
I haven't heard much more outside of that.
Yeah.
We're talking about something
that's added trillions of dollars potentially to public company market caps, something that
has had the largest VC funding round, and then probably the subsequent three after that.
Yeah.
Is there, are the three examples that I listed enough to make this worth the money?
No, definitely not. But they're super valuable right now and they're just the tip of the iceberg.
And that's the thing. You just have to look at the rest of the iceberg to realize
where we're going and how big the opportunity is. And on those three, look, I think those are actually
massive opportunities by themselves.
We have a number of announcements here at re:Invent
around Q Developer and making developers
and their whole life cycle more valuable.
You think about the first generation,
just using this as an example,
the first generation of Q Developer
was just code suggestions, right?
Super valuable, actually.
It made developers much more efficient writing all the code.
It turns out also developers on average code
about one hour a day.
The rest of their day is spent doing documentation,
it's spent writing unit tests,
it's spent doing code reviews,
it's spent going to meetings,
it's spent doing upgrades
of existing applications, doing all that stuff that's not writing code.
Maybe some ping pong in there.
And so as part of that, we're actually launching a bunch of new agents that do all of those things for you.
You can just type in slash test and it'll actually automatically write unit tests for you as you're sitting there coding.
You can have a Q developer agent build, write documentation for you as you're writing code.
And so you can have really well documented code and you're done and you don't have to go think about it.
It'll even do code reviews and look for where you have risky parts of your code, where you maybe have open source
parts that you should go look at
and think about what the licensing rules are around,
and even, from a deployment standpoint,
where you may want to think about how you're deploying stuff.
Things that you would expect out of somebody
doing a code review for you before you go do a deployment,
Q can now do all that for you.
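For flavor, here is the kind of artifact a slash-test-style agent produces: given a small function, it drafts unit tests covering the edge cases. This example is written by hand to illustrate the shape of the output, not generated by Q Developer:

```python
# A small function a developer might have just written...
def apply_discount(price: float, percent: float) -> float:
    """Return price reduced by percent (0-100)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

# ...and the pytest suite a test-generation agent might draft for it.
import pytest

def test_typical_discount():
    assert apply_discount(100.0, 25) == 75.0

def test_zero_and_full_discount():
    assert apply_discount(80.0, 0) == 80.0
    assert apply_discount(80.0, 100) == 0.0

def test_invalid_percent_raises():
    with pytest.raises(ValueError):
        apply_discount(50.0, 120)
```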
Same on the contact center side, right?
We're doing a ton of announcements around Connect,
which is our contact center in the Cloud offering,
making it much more efficient for customers
to get a ton out of that contact center,
all powered by generative AI.
And to your point, that's just the start.
So those use cases, I think, get more and more valuable
as you add more capabilities.
And I think if you think about where things are going,
it is a lot more about, if you think
about how I talked about code generation moving
to a bunch of the value, it's adding agents in there
so they can do a bunch of these things.
Right now it's not just giving you code suggestions.
It's actually going and doing stuff for you.
It's writing documentation for you.
It's helping you identify and troubleshoot
where you have operations issues.
And it says, ooh, you have an operations issue.
And it can look and understand your whole environment.
You can interact with it.
And you and Q together can go and look and say,
ooh, it looks like some permissions over here
were broken.
And if you go fix those, maybe, you know,
it'll fix your application automatically.
So saving tons of time across that whole development
lifecycle.
And I think that's where, as AI gets to be more integrated
into the core of what a business is, or the core of what you do,
and really learns it, that's where you get the value.
There's a number of startups doing that, but there's one that we work
with called EvolutionaryScale. And they use AI to try to discover new proteins and
molecules that may be more applicable to solving certain diseases. Now,
AI is not just generating stuff; instead
of being able to find tens or hundreds of new molecules a year, you
can now find hundreds of thousands of different proteins and test all of these
and figure out which are the most likely to be successful and get drugs to market much faster,
and that's a huge amount of additional revenue.
So if you think about models and capabilities that can do that, whether it's in healthcare and life sciences,
whether it's in financial services, whether it's in manufacturing and automation, every single
industry, in our view, is going to be completely remade by generative AI at its core.
And that's where you get that huge value.
I have a question about this.
I was speaking with a developer friend who said, yes, AI can code,
it can do all these things, probably looking at the different things that these agents can do.
The problem is, and this applies probably across the board when you trust things to
generative AI, something breaks, and then you've lost the skill set to go in and fix that
because you've relied so much on the artificial intelligence. What do you think about that?
Isn't that a problem?
No. What's three times four?
12?
Yeah. You have Excel, but you still know how to multiply.
Like I would say that, you know, you're still able to do those things.
This is different.
It is different, but, again, I think
the key parts of coding are not the semantics around writing language.
The key parts about coding are thinking about how you break down a problem,
how you creatively come up with solutions.
And I think that doesn't change.
The tools change.
They can make you more efficient.
But the developer, the core of what the developer actually does is not going to change.
You're going to want to think about, there's not a lot of developers today that know a lot about garbage collection.
It's just true.
They don't, right?
Because Java just does it for them and they just don't have to worry about that.
It doesn't mean that all of a sudden, if it breaks, people don't know how to do garbage collection
and can't go figure it out and do it.
They just don't do it as part of their daily jobs,
because it's not fun and it's not value added.
And they can focus more on writing code, right?
This is what new languages have done.
And so increasingly, I think developers are going to get
to do the things that are exciting.
They're going to get to do creative work.
They're going to get to figure out how to go solve
those interesting problems.
And they're going to be able to move much faster,
because they don't have to worry about writing documentation.
And someday, if it breaks, they probably will
know how to write documentation and will figure it out;
that is not rocket science.
It's just things they don't necessarily want to do.
So you're a believer in reasoning.
I know that AWS has some news also this week that you're going to have automated reasoning
tests where it checks for hallucinations before an answer goes out.
Another issue when it comes to ROI is again, how can I trust it?
It always comes out with wrong answers.
So talk a little bit about your announcement this week and how reasoning can solve some
of these issues.
It's a different reasoning than you might be thinking about too.
So automated reasoning is a form of artificial intelligence that's been around for a while.
And it's a thing that Amazon has adopted pretty significantly
across a number of different places.
And we use it.
It's actually, what it does is it uses mathematical proofs
to prove that something is operating as you intended.
That's the historical way.
And an example of that is we actually use it internally
to make sure that our permissioning system,
when you change permissions,
is actually behaving as expected.
And so this AI system
has this mathematical proof that can go say, OK, across all the places
that permissions are applied, a surface area that's too large for you to actually go check everything,
it can prove that they're applied in the right way, because it knows how the system is supposed to operate.
And it can kind of mathematically prove, yes, your IAM permissions mean you can access this bucket or you can't access this bucket.
We took that and we said, can we apply that to AI to eliminate hallucinations?
And it turns out you can't do it universally, but for selected use cases where it's important that you get the answer right, you can.
And so, take an example: say you're an insurance company, right?
And you want to be able to answer questions about people that say,
hey, I have this problem, is it covered?
And you don't want to say yes when the answer is no or vice versa, right?
And so that is the one where it's pretty important to get that right.
What you do is you upload all your policies and all your information into the system,
and we'll automatically create these automated reasoning rules.
And then there's a process that you go through that's a couple minutes, kind of 10, 15, 20, 30 minutes,
where you as the developer answer questions of how it's supposed to interact, right?
and you tune it a little bit.
Say, yep, that's how you'd answer that type of question
or no, or that's what this means.
It'll ask you questions, and you kind of interact with it.
And then it goes, OK, now I have a tuned model.
Now, if you go ask it a question, you say, hey, I ran my car
through my garage door, is that covered by my insurance policy?
It'll go and it will actually produce a response for you.
And then it'll tell you that yes, this is provably correct
that the answer is yes.
And here are the reasons why in the documentation I have
and why I feel confident in that.
Or it'll tell you, actually, I don't know the answer.
Here's some suggested prompts that I recommend you put back into the engine to see if you can get the answer correct.
Because I can't tell you.
I came up with yes, but I actually don't know for sure that is the right answer.
Change the prompts.
And it'll give you kind of tips and hints on how you can re-engineer your prompts or ask additional questions to come back until you get an answer that's a for sure answer
that's provably correct by automated reasoning.
So by this kind of mechanism, you're systematically able to actually mathematically prove that you got the right answer coming out of this,
and completely eliminate hallucinations for that area, right?
It doesn't mean that we've eliminated hallucinations overall.
Just for that area.
Yeah.
If you go ask it then, you know, who's the best pitcher on the Mets,
it may or may not answer your question reasonably.
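AWS has not published the internals of these checks, but the underlying idea, encoding policy rules as logic and asking a solver whether an answer is entailed, can be sketched with an off-the-shelf SMT solver such as Z3. The insurance rules below are invented for illustration:

```python
# Toy automated-reasoning check with Z3 (pip install z3-solver): if no
# counterexample exists, the answer is provably entailed by the policy.
from z3 import Bools, Implies, And, Not, Solver, unsat

collision, garage_door, own_property, covered = Bools(
    "collision garage_door own_property covered")

rules = [  # invented policy rules
    Implies(And(collision, own_property), covered),  # collision with own property is covered
    Implies(garage_door, own_property),              # a garage door counts as own property
]

# The model's answer to "I ran my car through my garage door, am I covered?"
claim = Implies(And(collision, garage_door), covered)

s = Solver()
s.add(rules)
s.add(Not(claim))  # search for a world where the rules hold but the claim fails
if s.check() == unsat:
    print("Provably correct: the policy rules entail the answer.")
else:
    print("Not provable; counterexample:", s.model())
```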
But let me ask you, what you're talking about also is very similar to what Mark Benioff talked about on the show last week,
where he said that because companies have large stores of information within his platform,
agents will be able to go in and pull it out and then present it and sort of help create the linkage to go from step A
to step B. And it was interesting to me because I had always thought agents are going to be
something that may be built by Anthropic, where it's my individual agent that goes out into the
world and does what I need. And I think both you and Benioff, correct me if I'm wrong,
have this idea that the agent is going to be something that I'm going to interact with when I'm
speaking with the company or actually is going to perform tasks at work. Maybe that's going to happen
before consumers get them. Yeah, I think that that's right. I think that agents are going to be
a really powerful tool. Actually, another thing that we're launching this week is, you know,
one of the things is that agents today are quite good at doing relatively simple tasks.
What they're very good at, actually, is tasks that
are pretty well defined in a particular narrow slice, where they go accomplish something.
And so what a lot of people are doing is starting to launch a bunch of agents, right?
One that's very good at going and doing one particular task, another one that's good at another
task.
But increasingly, you actually need those agents to interact with each other, right?
So we have an example in my keynote where we talk about
if you're thinking about, should I launch a coffee shop?
And you actually, you're say you're a global coffee chain.
You want to say, I'm going to launch a new location here.
You might have an agent that goes out and investigates what the situation is in a particular location.
You might have another agent that goes and looks at what are the competitors in that area.
You may have another agent that goes and does a financial analysis of that particular area,
another one that looks at the demographics of that zone, et cetera.
And that's great.
So now you have like half a dozen agents that go and do a bunch of these things.
That saves you some time, but they actually kind of interact with each other, right?
Like the demographics may imply, like they may change your financial analysis as an example.
And so that's super hard.
And then if you want to do it across 100 different locations to see where the best one is,
that's also hard to do.
Like it's super hard to coordinate because actually those also may be interrelated too.
Like, you know, putting a coffee shop here and then another one two blocks down may interact
with each other.
They can't be independent.
So we launched a multi-agent collaboration capability where you basically have
this kind of super agent brain that can actually help collaborate
across all of them, break ties between them,
help like pass data back and forth between them.
And so we think that this is gonna be
a really powerful way for people
to really accomplish much more complicated things
out there in the world with, again,
there's a foundation model under the covers
that's driving a bunch of this reasoning
and breaking these jobs into individual parts
and then the agents go and actually accomplish a bunch of this work.
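A toy version of that supervisor pattern (a generic orchestration sketch, not Bedrock's multi-agent collaboration API; plain functions stand in for model-backed agents):

```python
# A "supervisor" coordinating narrow agents: each agent does one
# well-defined task, and the supervisor wires their outputs together
# and breaks ties across candidate locations.

def demographics_agent(location: str) -> dict:
    # Stand-in for an agent that researches an area.
    return {"location": location, "foot_traffic": 500 + 90 * len(location)}

def competitor_agent(location: str) -> dict:
    # Stand-in for an agent that counts nearby competitors.
    return {"location": location, "competitors": len(location) % 4}

def finance_agent(demo: dict, comp: dict) -> dict:
    # Depends on the other agents' outputs; the supervisor passes them in.
    score = demo["foot_traffic"] / (1 + comp["competitors"])
    return {"location": demo["location"], "score": round(score, 1)}

def supervisor(locations: list[str]) -> dict:
    """Break the job down, dispatch to agents, and pick the best result."""
    analyses = [finance_agent(demographics_agent(loc), competitor_agent(loc))
                for loc in locations]
    return max(analyses, key=lambda a: a["score"])

print(supervisor(["5th & Main", "Harbor District", "Old Town"]))
```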
Okay, I'm just gonna say before we go to break,
I appreciate how much news you're weaving into this.
Yes, sure.
This is the ultimate number of keynote announcements that have been introduced into a podcast.
So thank you for that.
All right.
Welcome to re:Invent.
Exactly.
All right.
We're going to take a quick break and come back with Matt Garman, the CEO of AWS.
Hey, everyone.
Let me tell you about The Hustle Daily Show, a podcast filled with business, tech news,
and original stories to keep you in the loop on what's trending.
More than 2 million professionals read The Hustle's daily email for its irreverent and
informative takes on business and tech news.
Now, they have a daily podcast.
called The Hustle Daily Show, where their team of writers break down the biggest business headlines in 15 minutes or less and explain why you should care about them.
So, search for The Hustle Daily Show in your favorite podcast app, like the one you're using right now.
And we're back here on Big Technology podcast with Matt Garman of AWS.
Let's talk about some earth-centric topics and starting with nuclear.
You have invested $500 million in a company called X-energy to do nuclear.
You're also part of, I would say, a wave of companies that are reanimating nuclear energy in the United States.
And part of that is because these nuclear plants had excess capacity that they needed to get off their hands.
I just want to ask you a broad question about should we really believe in this moment for nuclear?
Because on one hand, it's for the moment cleaner than fossil fuels.
On the other hand, we don't really know what happens with nuclear waste.
We can't get rid of it.
It has to sit in silos
that could be damaging for the planet over time.
So is it really, and this is sort of a sensitive one,
but is it really an improvement to go to nuclear
and how can we be sure, given the long-term effects here?
Yeah, look, I think nuclear is a fantastic option
for clean energy.
It is a carbon zero energy that has a ton of potential.
And as you look at the energy needs over the next couple of years
and really the next couple of decades,
whether it's from technology or broadly in the world
or there's electric cars or just the general electrification
of lots of things in our world.
We're going to need a lot more energy.
And we at Amazon are one of the biggest investors
in renewable energy in the world.
In the last five years, we've done over 500 renewable projects
where we've added and paid for new energy to the grid,
whether they're solar or wind or others.
And so we'll continue to invest in those projects.
I think they're super valuable.
And there's probably not going to be enough of those
soon enough for us to really get to where we want to get from a clean energy perspective.
And so I think nuclear is a huge portion of that.
You know, look, there's always the fearmongering from like back in the 60s and 70s of what
nuclear used to be.
Nuclear is an incredibly safe technology today.
It's much different today.
It turns out technology has changed in the last 50 years.
And it's improved a lot.
And so there is a ton of improvements in that space.
And we think that it is both a very safe, very eco-friendly energy source that is going to be
critical for our world as we keep ramping
our energy needs. And we think that as part of that portfolio, right, you're going to
have solar, you're going to have wind, and you're going to have others, but nuclear
is going to play an important role in that. And we're excited about what that
potential looks like. You mentioned X-energy. We do think that, you know, over the
next, probably starting somewhere in 2030 and beyond, these small modular
reactors, which is what X-energy builds, are going to be a huge component of this.
And so they'll be part of that portfolio of offerings.
But today, all these nuclear plants that people build are really large implementations.
They're multi-billion-dollar projects to go build these energy plants.
But there are obviously all that energy is in one location.
And then you have to invest in a ton of transmission to get the energy to the actual place you need to go.
And they're big projects.
These small modular reactors are much smaller.
You can actually produce them almost like you produce gas turbines,
in a factory-type setting, eventually.
And you can put them where you need them, right?
So you can actually put them next to a data center
where transmission is not gonna have to be
an important factor.
And so we think that that's a great solve
for a portion of the world's energy needs
as we continue to evolve over time.
And so it's one of the components
of an energy portfolio that we're very excited about.
Okay, so we'll be watching that closely.
On the state of the economy,
AWS had a few quarters of stagnant growth.
It was still impressive growth,
but it flatlined for a moment.
Not quite flatlined, but it was down from where it had been.
Okay, but I'm just talking about the percentage growth.
Yeah.
Part of that was because the economy was in a rough moment.
Everybody was looking for efficiency. And so what you did was, I think, you made
some deals with customers to help get their bills down, or helped them get the
most out of what they were doing, so they could, you know, effectively live that
efficiency motto. What does it look like right now? Is the economy back, or are
people still in efficiency mode?
Yeah.
I'd say, and by the way, it wasn't even just deals.
We went and proactively jumped in with our customers and helped them figure out how they could reduce their bills.
And we looked at where they could consolidate resources, where they could move to cheaper offerings, where they could maybe do more with less.
And we were really proactive about helping customers reduce those costs because, from our view, it was important for them as they thought about how they got their economics in the right place.
And it was the right thing to do for them and built that long-term trust.
Now, customers, I think, number one, a lot of them have been optimized, right?
And there's only so much you can kind of squeeze into an optimized place.
And customers are still looking for optimizations, but a lot of that work has been done.
And they're using some of that optimization to help fund some of the new development that they want to do.
A lot of that is in the area of AI.
Much of that is in the area of migration and modernization where they're moving from on-prem into a cloud world.
So some of those optimizations they did are helping them fund some of that work, moving more of their workloads
to the cloud and letting them go and build new AI experiences in
AWS. So that is where you've seen our growth start to come back up on a percentage
basis. Some of that is customers leaning into those new experiences and doing some of
those, more of those modernization migrations.
Okay, I want to wrap on a culture question.
Okay.
Andy Jassy recently emailed the company, and he said that as a consequence of scale, and I'm going to get it exactly as he said it, he says, there have been "pre-meetings for the pre-meetings for the decision meetings, a longer line of managers feeling like they need to review a topic before it moves forward, owners of initiatives feeling less like they should make recommendations because the decision will be made elsewhere." Was that going on within AWS, and what is the process to change that?
Yeah, I think it was across Amazon. So it wasn't specific to the rest of Amazon; it was definitely inside of AWS too.
And, you know, look, it's kind of a natural evolution.
We have these leadership principles inside of Amazon.
Things like being customer obsessed
and really understanding the customer.
And in order to really understand the customer,
you've got to be close to the customer.
And so you want a flatter organization; the more layers you have,
the more removed you are from customers.
And we just kind of fundamentally, as we were growing,
and then we went through an era of explosive growth
of just the number of people and the size of the company
and the size of the business.
And so throughout that, we just didn't always
have the organizational structure exactly right.
And so it's, you know, we believe that a flatter organizational structure is better.
The closer you are to the customers, the better decisions you're going to make.
The faster decisions you're going to make.
And you really want ownership to be pushed down to the people who really are making some of those decisions.
And when you have a very kind of hierarchical organization where people don't feel like they have that ownership to make decisions, you go slow.
And for us, speed really matters.
And so I think Andy was just highlighting some observations we had, where, you know, I think he's incredibly thoughtful
on these points, which I appreciate. We've had a lot of debate here, where it's nothing
that's broken, but you could see really early warning signs of stress around it.
And for us, culture is so important, and doing things in the right way, being
customer obsessed, having the right level of ownership, is so important
to what makes Amazon so special.
And so we're kind of getting ahead of there being any problems. It's not like there was any
burning problem, and we obviously could have just done nothing and kind of let things
go for a while.
But for us, it's not the Amazon way.
It's not the Amazon way.
And so we're just being proactive, identifying that, hey, look, this is super important for us.
And so let's just be aware of it.
Let's be upfront about it, think about it, and be very intentional as we think about
organizational structures and where we can land.
And I think all of that has been received really well, because it turns out not many of those
things do people complain about.
They are really focused on ownership.
They love being customer obsessed.
And most of that has been quite well received.
So you can be a big company, but not have big company culture.
That's right.
Okay, last one before we go.
You've said that less than 20% of all workloads have moved to the cloud so far.
What is the max number that that can be? 100%, in time?
I was going to say 100?
Is that smart?
No, but what's realistic?
Yeah, you know, I think it's a good question.
I think if you think about how many workloads are out there, I don't know what the max is.
I'm very bad about picking the maximum size of AWS.
You have to have some sort of total addressable market in mind.
But I do think, I actually think that at a minimum,
I think that that percentage could flip
and it could be 80-20 versus 20-80 where it is today,
or even less.
I think there's a massive number of applications
that just haven't moved.
And if you think about line of business applications,
as you think about workloads that are in telco networks,
if you think about workloads that are running inside of hospitals,
if you think about, like, it's not even just
traditional data center workloads,
but there's a lot of these other workloads
that would be much more valuable,
they'd be much more connected,
they'd be much more able to
take advantage of advancements in AI if they
were connected into the cloud world and running there.
And so I think that there's a huge opportunity
for us to continue to expand what it means to be in the cloud
and to continue to migrate many of these workloads
that just haven't moved.
And so there's a massive opportunity.
I think kind of flipping that percentage over time
could be an interesting opportunity for us.
And the size of the pie is getting bigger too.
I think that's the other exciting thing about generative AI
is that the total amounts of compute workloads
are actually significantly accelerating too.
What's the timeline to flip?
Decades?
I think we're still a ways out from the whole thing flipping.
There's just a massive amount of workloads out there.
But we'll keep working on them and keep going as fast as we can.
Matt Garman, great to meet you.
Thanks for coming on the show.
Yeah, thank you.
All right, everybody, thank you for listening.
We'll be back on Friday breaking down the news and we'll see you next time on
Big Technology Podcast.