Big Technology Podcast - Amazon's Longterm AI Vision — With Matt Wood
Episode Date: July 10, 2024

Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice. --- Matt Wood is the VP of AI Products at Amazon Web Services (AWS). Wood joins Big Technology Podcast to discuss the current state and future potential of AI, according to Amazon. Tune in to hear insights on how customers are adopting AI, the importance of model choice and specialization, and the evolution of AI in the near-term. We also cover AWS' AI platform, Amazon's Alexa assistant, and the cultural shifts needed for organizations to successfully leverage AI. Hit play for an engaging and informative conversation on the cutting edge of AI with one of the industry's leading experts. --- For weekly updates on the show, sign up for the pod newsletter on LinkedIn: https://www.linkedin.com/newsletters/6901970121829801984/ Want a discount for Big Technology on Substack? Here’s 40% off for the first year: https://tinyurl.com/bigtechnology Questions? Feedback? Write to: bigtechnologypodcast@gmail.com
Transcript
The VP of AI products at Amazon Web Services joins us to discuss what people are actually
building with the technology and whether it's worth the investment. All that and more is coming up
right after this. Welcome to Big Technology Podcast, a show for cool-headed, nuanced conversation
of the tech world and beyond. Well, a year later, we have Matt Wood back with us today.
He's the VP of AI products at Amazon Web Services. Last year, we spoke at the AWS Summit in
New York City, all about Amazon's AI strategy. And we have a great opportunity now to talk a little
bit more about where the AI field is heading. Matt, welcome back to the show. Great to see you.
Good to see you, too. Thanks for having me back. This is awesome. And congrats on the growth of the show.
It's been amazing. I listen every week. So it's a pleasure to be here.
Thanks so much. Oh, that's awesome. So you'll have some context here. So let me ask you,
I think, the most pressing question that I have first, which is when we spoke last year, you said,
I wouldn't be surprised if just the AI part of our cloud computing business was larger than the
rest of AWS combined in a couple years. So I'm actually curious where it is today,
but before we get into that, here's the sort of disconnect I have. So obviously we spoke last year,
there was all this potential with AI. We've been talking about it on the show a lot. And yet I was
just speaking with a colleague of yours who referenced this Gartner study that said only 21%
of AI proofs of concept, so the different programs and products
within companies, actually go into production. That's a one in five rate, which is not great
given how much effort and money it takes to get these things going. So talk a little bit about
like where the potential and the state of AI building stand today, why there's that disconnect
and where we might be heading. Yeah, I'm happy to talk through it from my perspective. I've been
very fortunate over the past year or two to talk to literally hundreds of customers in every
single industry. And I have honestly not seen this level of energy and enthusiasm for any
technology, probably since the advent of the cloud from customers. Most customers are investing
very diligently. They're making good progress. There is a group which is moving slightly
faster than the average, which is somewhat counterintuitive. And that group is actually
the regulated industries. And so it's folks like in financial services and insurance and
health care and life sciences and manufacturing and they're able to move a little bit faster
in part because all the regulations they've had to comply with over the past
20 years, which probably felt at the time like a bit of a headwind, have actually driven the right
set of behaviors for that group to be successful with generative AI. And so they have, you know, all
of the governance of their data figured out. They understand the quality of their data.
They understand which data can be used where, by whom, and for what purpose.
They have very, very large amounts of private text data, exabytes of the stuff in some cases,
which are market reports or clinical trial results or insurance documents, life insurance documents,
those types of things that the models have never seen before,
but are really good at looking at and reading and summarizing and connecting the dots and finding disconnects.
And they're just earlier in their kind of digital transformation journey.
And so they've probably looked across and felt like they were kind of sitting to the side
as other areas like retail and transportation and hospitality and media kind of went through
this very aggressive digital transformation over the past 10 years or so, driven by the web
and driven by mobile and a lot of other factors, including the cloud.
And these organizations are looking to not just use generative AI to,
catch up, but to actually leapfrog ahead using the data that they have, which is privately
held. So that's one area that I think is probably a little counterintuitive. I don't think
I would have guessed, you know, even a year ago or two years ago, that a, you know, 160-year-old
life insurance company would be in the vanguard of really delivering value through
generative AI. But they have these very, very large document stores of, you know, 90-year-old life
insurance documents, which are probably going to pay out in the next decade or so. And they've been
scanned at some point, but no one's ever read them, and they're not sure what level of risk
is associated to those documents inside their business. And so they're able to use generative AI
to be able to piece that risk together and understand it more completely. And so I think...
So I see what you're going to say: like, the companies with structured data,
who have it very organized and have this partnership already, are going to be the ones that are going
to benefit the most, which makes sense. But then there's also, like, some of the more glaring
issues. Change management is difficult. The model still costs too much to run. They're not quite good
enough yet. Like last year, we're going to get to it, but last year we were talking about agents and
all these other, you know, advanced use cases. And they've clearly not hit the way that they
are supposed to. So what do you think about these limitations? Aren't they the main things that are
holding back the field versus just like getting their data in order? Well, I think those, there are limitations
to the technology today, and part of being successful with the technology is understanding
those limitations at a deep level.
And you referenced, I'm not familiar with the details, but you referenced kind of 20%
of prototypes going into production.
Honestly, that sounds pretty good to me.
If you think of just the amount of experimentation that is happening inside organizations around
generative AI, just the number of experiments that are being
run on AWS for different companies in the regulated industries and all the other industries
that I mentioned, you know, Bedrock, which is the service that we make available to customers
to build generative AI applications, that's one of our fastest growing services ever.
And all up, AI machine learning at AWS is already a multi-billion dollar business in terms
of ARR. So there is a lot happening. And I think that that 20% is actually pretty good
because the denominator is absolutely massive.
And when technology shifts happen, you really do want customers to be able to innovate,
to be able to experiment, really safely, really quickly with that technology to find out what
works and what doesn't work.
And we're dealing here with a technology which is just in its very earliest days. It's
much more like a discovery than it is an invention.
We discovered that if you build these very sophisticated mathematical models, that there is
emergent behavior within them that resembles reasoning, that resembles intelligence.
And we're applying that in some places, and some of those applications will turn out to be
successful. And it is no surprise to me at all that some of those experiments turn out not to be
successful, because if you're experimenting in the right way, a lot of those experiments are going to
fail. And so it's why customers in part turn to AWS for running some of these workloads,
the majority of these workloads, because they're able to broadly democratize the way that
these applications are built using generative AI, and they're able to validate the ones that work
really, really quickly, and then when they find that 20% that works, they're able to take it
into production very quickly, and at very, very large scale with the right cost structure
around it as well. And so I think that the 20% is a little misleading if you think the denominator
is small, but that denominator is massive because there's just so much
experimentation happening. And we see it on AWS and inside Amazon as well. Right. Right. And look,
this is where I always kind of get tripped up because we talk about these, you know, these big,
uh, emergent, big things like emergent behaviors and models being able to do reasoning and how
it's a discovery. And then we talk about, okay, so what practically are they doing? And it's like,
well, they're combing through insurance documents. You know, shout out to all the folks working in
insurance. And I'm sure we have some listening to the show. But I'm like, man, if we
made this discovery, I mean, if people in the tech field made this discovery that models are
intelligent and can think for themselves. And, like, that's one side of this.
But then we ask where it's applied. And it's like, well, insurance adjusters are a little
bit more efficient. Because the market has valued, and the industry
has sort of started building around, these discoveries and use cases, the reasoning, the
emergent behaviors. But then we ask what's practical, and it's like the most boring applications
you could possibly imagine. So is that going to change? It's interesting you say boring
because boring workloads are boring because there's so freaking many of them. They're just
everywhere. And so yes, I absolutely believe that there will be large step function changes in a
significant number of industries that are going to drive orders of magnitude improvements for
the organizations that work on them and society at large. One example is just computational biology
and we can talk about that in more detail, but the work that's going on there in terms of
using generative AI by the likes of the Dana-Farber Cancer Institute or Genomics England or
Pfizer, or the work we've done with a startup called EvolutionaryScale, to be able to use
generative AI to be able to design entirely new molecules to design entirely new antibodies
that are manufacturable, that can go on and find new drug targets.
That is a major opportunity and a step function.
It's early, for sure.
The company just went out of stealth.
They just published their paper, which is a great paper.
Recommend everybody read it just for background on what's happening in that field.
But I absolutely believe that there will be many different step functions forward
in multiple different industries of that format.
I also think that there is a huge number.
Some of it's going to be long tail, but just a huge number of, you know, what you call boring workloads
that are going to be completely reimagined through the use of generative AI, and that's okay.
You actually want a lot of that boring work to be automated. You want a lot of that work to be improved.
You want to be able to channel the boring work, which maybe inside some organizations is seen as
a bit of a cost center, and to be able to turn that on its head and channel it into something which drives invention and
drives growth. And this is exactly what we saw with cloud computing in the early days as well.
I literally could have said that sentence. In fact, I think I did, with the advent of cloud
computing, that there is a huge number of workloads inside many, many enterprises that can
take advantage of not just the cost savings in the cloud, but can take advantage of the agility
in the cloud, and take something which is traditionally considered a cost center, building out
data centers, which offer no differentiated value, and turn it on its head and drive the right
cost structure and the right agility to be able to use that infrastructure to drive new
product creation, new invention, and reimagination of all of these different products.
And so what we consider boring today is going to be rechanneled, in my opinion, into
much more high leverage growth opportunities for many organizations.
There's such a big change management component to it as well, right? We talk about the models, right?
There's a cost, there's a capability of the model. But also, you know, one thing about trying to
reimagine how boring work is done is there's a lot of people who are sort of used to that
work. What percentage of the workplace do you think is really ready to, let's say this
AI can revolutionize the way they do work, what percentage of the workforce do you think is
ready to take advantage of it? It's a good question. I'm not sure I would peg it as kind of
ready. I suspect that whilst there will be these step function changes over the long period,
I think in the shorter term, in the shorter outlook, it's going to feel a lot more incremental
than we're probably used to. There's an old adage, a story that folks tell, that when we finally
discover that there's life on another planet in another galaxy. Yeah, we all
have this idea that this will be a huge, you know, societal shifting event for the planet
that we discover there's life on another planet. But in reality, I suspect there's just
going to be lots and lots and lots of small iterative announcements and that when the NASA
press release comes out that there's life on another planet, it will seem really obvious at that
point. And it'll, from now to when that eventually happens, yeah, that's a really big jump. But
incrementally, we'll get there incrementally, not in one big shift. And I think the same thing
will apply here. There will be over a long term, like big incremental shifts in how we deliver
products, in how we deliver technology, and how we interact with data and information and each other.
But it'll probably appear kind of incrementally. And having patience and having a long-term
view allows you to drive more of that value incrementally and allows you to experiment more.
And it allows you to have big goals and kind of, you know, iterate yourself to iterate your way to greatness.
And having that long-term view, I think, is to go back to your question, one of the most important cultural shifts that organizations will need to make.
You're going to need to have the right teams, for sure.
You're going to need to have the right talent.
You're going to need to have the right technology.
You're going to need to partner with the right organizations to be able to drive that technology.
But having the ability to be able to take a long-term view so that you can allow those creative, inventive builders to be able to use that technology to be able to iterate and improve an experiment and invent.
That requires discipline from a leadership perspective.
It requires you to set up kind of small blast radius experiments.
And it requires the organizations to be very tolerant to that failure.
Because when experiments fail, you've learned something
there, if you've set it up right. And that learning is disproportionately valuable at this point
in the kind of technology cycle. And so that cultural element that you outline is absolutely
critical. I'd actually say it's more like 50% technical, 50% cultural in terms of the weighting
of the elements of investment that are going to be required to be successful. So I'm not sure
exactly what percentage right now is kind of ready. I would guess if I had to put a number
on it, I would say it's probably 25% to 35% in most large-sized enterprises. But over time,
you know, if you look three years out, five years out, 10 years out, whatever it might be,
with that long-term horizon, my guess is going to be 100%. Yep. Okay, I'm going to ask a follow-up
on that, but first, you believe in aliens? I think you have to believe in aliens if you
understand just how big the universe is. It just seems
incredibly unlikely that we have hit the absolute only magical sweet spot in the whole universe
to encourage carbon to be able to animate and dance around as we do as humans every day. So
the probability of it just being limited to Earth seems very, very unlikely, although I acknowledge
the paradox that if there's life out there, you know, where is it? So that's why I kind of like
that. Or, I mean, yeah, they could also all be dead, and we might be, like, right now, like, the only
living... I mean, I think there probably are some sort of life forms out there that have existed in the
universe, you know, either before us or will come after, but to have them exist concurrently, yeah, I agree,
that's the question. All right, let's... Yeah, go ahead. I just want to get back to the AI stuff.
I guess we could do another show on it. I would love it. Goodness. All right, so what you're
saying about patience, incrementality, you know, 25% of the organizations being ready and
replacing the boring stuff, that all sounds good. But it also makes me wonder if we're going to
end up in a sort of trough of disillusionment with this technology because there's been so much
hype and so much money that have poured into it that are demanding almost a revolution now.
And what you're describing isn't a revolution or isn't a quick moving revolution. It might be a
slow-moving incremental sea change, but not something that happens immediately. It's not something
that, you know, the Wall Street types, for instance, will be, like, thrilled to know that it's
just going to take a while because they think in quarters. So do you think there's a risk here
and, like within the next few years, sort of the public perception of this technology turning
a little bit because of the incremental nature of it? I think it would be a possibility if
and it's a huge if
if the technology wasn't
poised to improve.
So if what we,
if you believe that what we have today
is pretty much what we're going to have to work with
with only incremental small improvements
over the next three, five years,
you know,
then I suspect that, you know,
folks will feel like, you know,
the promise on this occasion
hasn't been delivered on.
But, you know, technology tends to follow
an S curve over time.
And, you know, you,
you get to the top right-hand corner of that S-curve and you end up with the technology,
with the capability, and you get these, you know, just decreasing improvements over time.
You never really know where you're at on the S-curve until you're looking backwards.
And so it's kind of hard to judge where we're at.
I think most people would think we're probably in that kind of middle section high-gradient
piece just because there's so much happening and there's so many improvements.
There's new models and new
techniques and new technologies from academia and the public sector, private sector.
And I have no doubt that by the time we finish this conversation, there'll be another
technique out there that is worthy of our attention. But my guess is that it's probably
more likely that we're at the bottom left-hand corner. I don't think we've hit the kind of
hockey stick inflection point yet of what this technology is capable of. It's still very, very,
very early. So at some point we're going to hit that hockey stick inflection
point. And it always happens with different technology shifts. It can take more or less time
depending on the shift and the speed of the technology. And the thing
that triggers the S-curve bend is different in a number of different ways. So if you kind of look
at the maturation of the internet itself, you know, that hockey stick inflection point, it really,
I think, landed with the development of kind of SaaS-style web 2.0 applications, whether it
was, whether it was webmail or whether it's finance systems, whether it's hotel booking
systems, whatever it was, that capability of being able to have access to those types of
services, that made, and the fact that you could integrate those services kind of through
APIs and do interesting things with them, that meant that every new service that was added
to the internet made all of the other services more valuable. And that's kind of what pushes
you up the S-curve many times. You see the same thing with kind of the mobile transformation
where we had these remarkable new devices,
we had these applications that more and more people invested in,
more and more organizations invested in,
they became more and more sophisticated.
And over time, the operating systems on which those applications ran
allow those applications to interoperate and interact in interesting ways,
both with the operating system and with each other.
So every net new application added makes all of them more useful,
makes the whole system more useful.
The whole device in your pocket gets better over time
without you having to do anything.
And so that pushes you up the S curve as well.
And I don't think we're at that point
with generative AI yet.
We have a really robust set of really interesting,
really powerful models which are going to mature over time.
But customers will, I'm sure, find interesting ways
to combine those different models.
There isn't one model to kind of rule them all.
Each different model has different sweet spots.
And it's my expectation that most customers will invest
in not building the foundation
models, but will invest in fine-tuning and improving those individual models and customizing
them in interesting ways for their own use case.
And those capabilities are interesting in isolation, but part of what will push us up
the S curve that we're seeing with customers at AWS and at Amazon is that combining those
models, leaning into the sweet spot of all these different models, allows you to build systems
that in aggregate have a compounding effect on intelligence.
It's not additive, it's a multiplier.
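The "map the model to the mission" idea Wood describes can be sketched as a tiny router that scores each model profile against a workload's requirements. Everything here, model names, trait scores, and weights, is a hypothetical illustration of the pattern, not Bedrock's actual API or real benchmark data.

```python
# Hypothetical model profiles: how strong each model is on traits that
# matter to different workloads (reasoning, speed, cost efficiency).
# All names and numbers are illustrative placeholders.
MODEL_PROFILES = {
    "deep-reasoner":   {"reasoning": 0.9, "speed": 0.3, "cost_efficiency": 0.2},
    "fast-summarizer": {"reasoning": 0.5, "speed": 0.9, "cost_efficiency": 0.8},
    "budget-small":    {"reasoning": 0.3, "speed": 0.8, "cost_efficiency": 0.95},
}

def pick_model(requirements: dict) -> str:
    """Pick the model whose traits best match the workload's weighted needs."""
    def score(profile: dict) -> float:
        return sum(weight * profile.get(trait, 0.0)
                   for trait, weight in requirements.items())
    return max(MODEL_PROFILES, key=lambda name: score(MODEL_PROFILES[name]))

# A reasoning-heavy workload and a cost-sensitive one land on different models.
print(pick_model({"reasoning": 1.0}))                      # deep-reasoner
print(pick_model({"cost_efficiency": 1.0, "speed": 0.5}))  # budget-small
```

In a real system the scoring step would come from evaluations against each workload, but the design choice is the same: no single Swiss Army knife model, just a dispatch to whichever specialized model fits.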
And so that's going to push us a little bit further up the S-curve.
I think another really interesting area,
and the one that's probably closest to SaaS applications and mobile apps,
is what you mentioned earlier, is agents.
I think agents have a good chance of being the apps
for the generative AI world and the generative AI era.
And that as we add more of those and we find ways to orchestrate multiple agents together,
and there's already customers that are building multi-agent systems on AWS today,
that combine specialties and combine agents that can goal-seek on your behalf and collaborate or
contest with each other in interesting ways, that means that every new agent that's added to the
system drives you up the S-curve. It makes all the other agents more useful at the same time
without you having to do anything. And that's a really big part of it.
But are agents an actual thing in production now? Yeah, I think so. You know, we have
Because, like, I'm just going to say, like, last year we spoke about, you made an announcement
about how agent-building technology was on its way, and it's just, a full year has
gone by and I haven't seen one example of, like, a realistic agent going out there and taking
action for people. Well, I think I've certainly seen some. I use some on a day-to-day basis.
Let's hear it. Yeah, that's why we do these discussions. I'd recommend you check out a couple things
that may be interesting to you in the audience.
One is a startup company called Ninja Tech.
You can check them out at ninjatech.ai.
They have an assistant, an assistive system
you can interact with in natural language, as you may be familiar with.
But they also have, under the hood, a set of specialized agents
that can perform different tasks on your behalf.
So they have a researcher agent, they have a scheduling agent,
they have a web agent, all sorts of different agents.
And just by asking your question, they interpret the question, and then they have an agent which
looks at the response and says, hey, this looks like you're doing some research.
Let me ask my researcher how I can best help you.
And I'll set a problem to my research agent.
And that research agent runs off and does its thing.
And they say, oh, there may be some web data that will be useful here.
I'll set my web agent off to go and collect that data for me and so on and so forth.
and it pulls back all the information together and allows you to interact with your calendar
and with your schedule or your email or whatever it might be in levels which are much more
automated than you could do with just a standard assistive chatbot.
So that's one example.
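The orchestration pattern described here, a coordinator that interprets the request, dispatches to specialized agents, and merges their results, can be sketched in a few lines. The agent names, routing keywords, and canned responses are all hypothetical stand-ins; a production system would use a model call for classification and real tools behind each agent, and this is not Ninja Tech's actual implementation.

```python
# Toy specialized agents: each "does its thing" and returns a result.
def research_agent(task: str) -> str:
    return f"research notes for: {task}"

def web_agent(task: str) -> str:
    return f"web data for: {task}"

def scheduling_agent(task: str) -> str:
    return f"calendar update for: {task}"

AGENTS = {
    "research": research_agent,
    "web": web_agent,
    "schedule": scheduling_agent,
}

def classify(request: str) -> list[str]:
    """Stand-in for the model call that decides which specialists to involve."""
    keywords = {"research": "research", "web": "latest", "schedule": "meeting"}
    return [name for name, kw in keywords.items() if kw in request.lower()]

def coordinator(request: str) -> dict:
    """Fan the request out to the relevant agents and gather their answers."""
    return {name: AGENTS[name](request) for name in classify(request)}

results = coordinator("Research the latest chip announcements and set up a meeting")
print(sorted(results))  # ['research', 'schedule', 'web']
```

The interesting property, echoing the S-curve point above, is that each new entry in `AGENTS` makes every request the coordinator can handle richer, without changing the coordinator itself.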
But if this stuff is so useful, then why hasn't it broken out into the public?
I just think it's very early.
Today, agent systems, or agentic systems as they're sometimes called, it's still relatively early.
But I think they are breaking out, to be fair.
I think Ninja Tech is seeing remarkable growth.
They have hundreds of thousands of monthly active users.
We've also built some really powerful and popular agents on AWS.
So we have an assistant for builders that we call Q, Amazon Q.
And Amazon Q allows you to generate code if you're building software.
It will take a question and give you answers and give you guidance on how to build on
AWS and all the things you would expect.
And that's useful, that gets you a bump in productivity.
We've seen some customers get, you know, in terms of just the amount of code that is
automatically generated that they accept,
it's usually between 35 and 50 percent.
It's higher on Q than any other comparable service.
But the thing that drives productivity for developers is what we call the developer agents
inside Q.
And so with the developer agents, you don't just ask a question about what code to write
or write a comment and get the function back,
you actually set a task to Q.
You say to Q, hey, I want to add this feature to my software.
Q looks at the software across your repository.
It looks at the changes that you've made inside your development environment.
It understands the type of change or the type of feature that you want to make.
And it goes off and it looks at all of that information and it makes a strategy.
It doesn't just generate the code, it makes a strategy for how to add that feature.
So it picks which functions need to be updated,
which modules need to be added,
which tests need to be run,
which documentation needs to be added.
And you get a chance to review that strategy.
And at some point, you can just say, hey, Q, go for it.
And Q will go off and work through diligently
through its to-do list to create a set of software changes
that you can choose to commit, which add that feature
to your code.
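The plan, review, execute loop described for these developer agents can be sketched as follows. The step names, the `Strategy` type, and the approval gate are hypothetical illustrations of the workflow Wood outlines, not Amazon Q's real interface.

```python
from dataclasses import dataclass, field

@dataclass
class Strategy:
    """The agent's to-do list for a feature, held for human review."""
    steps: list[str] = field(default_factory=list)
    approved: bool = False

def plan_feature(feature: str) -> Strategy:
    """Stand-in for the analysis pass over the repository and dev environment."""
    return Strategy(steps=[
        f"update functions touched by '{feature}'",
        f"add module implementing '{feature}'",
        f"run tests covering '{feature}'",
        f"write docs for '{feature}'",
    ])

def execute(strategy: Strategy) -> list[str]:
    """Work through the to-do list, but only after explicit sign-off."""
    if not strategy.approved:
        raise PermissionError("strategy must be reviewed and approved first")
    return [f"done: {step}" for step in strategy.steps]

plan = plan_feature("dark mode")
# ...a human reviews plan.steps here, then says "go for it":
plan.approved = True
changes = execute(plan)
print(len(changes))  # 4
```

The key design choice is the gate between planning and execution: the agent proposes a complete strategy up front, and nothing is committed until a person approves it.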
And so if you can imagine a developer going from having
to just write or generate that code manually to having tens or dozens or, over time, hundreds
of those developer agents running around doing the work on their behalf, you get this,
you know, combinatorial explosion of productivity. We do the same thing for code transformation.
And so if you want to move between different versions of Java, you know, we support that today.
You just say, hey, update this to be compatible with Java 17, whatever you're running.
It will go off and make that same strategy.
It will work diligently through it
and then allow you to review the results
and you can choose to accept those
and commit them back.
And that's a fixed cost effort
that most organizations have to go through.
We need to move Software Project A
from Java X to Java Y.
And it's going to take 10 people
and it's going to take three months.
And we're just going to have to,
it's just a cost of doing business.
We're just going to have to pay that cost.
Pay that cost in people, pay that cost in productivity.
And this is a task that no developer really likes to do.
It's kind of toil work and the very best outcome.
That's definitely boring.
It's boring, exactly.
But it's super impactful because there's so much of it.
And so you move from a world where you have this fixed cost that you just have to pay,
a cost center, just like we were talking about earlier,
and you move it to a point where that is just taken off the table.
It's completed automatically.
And those same developers get back to actually move into doing things which are much more
productive instead of that work. So we have Java to Java. And you named it, you named it
Q because of Q from Star Trek, not QAnon, right? It's neither, it's neither. But
what was the inspiration? It's based on a quartermaster, the idea of a quartermaster, where you get your
gadgets. Okay, you guys couldn't have picked a different letter. It's a very controversial letter these
days. I think it'll work out okay. We're here with Matt Wood. He's the VP of AI products at
Amazon Web Services.
On the other side of the break,
we're going to talk a little bit
about Amazon's products
and also where the models are going next.
So stay tuned.
We'll be back right after this.
Hey, everyone.
Let me tell you about The Hustle Daily Show,
a podcast filled with business,
tech news, and original stories
to keep you in the loop on what's trending.
More than 2 million professionals
read The Hustle's daily email
for its irreverent and informative takes
on business and tech news.
Now, they have a daily podcast called
The Hustle Daily Show,
where their team of writers
break down the biggest business headlines in 15 minutes or less and explain why you should care
about them. So search for The Hustle Daily Show in your favorite podcast app, like the one you're
using right now. And we're back here on Big Technology Podcast with Matt Wood, the VP of AI products
at Amazon Web Services. All right, Matt, so last year we were talking a little bit about
Bedrock, which is basically a tool that Amazon Web Services customers can use to develop
AI applications. And the idea that you explained to me was basically Amazon's play for generative AI
was that people who want to develop on AI could go in and pick their own models through
Bedrock. It could be Meta's Llama or Amazon proprietary models or any host of other
models and then they could build that way. But Bedrock has not integrated OpenAI's GPT models yet
or Google's Gemini models yet.
And I was speaking with someone in the know who was basically like, look, like what they're
offering is not really choice.
It's like one model that works well, which is Anthropic's.
And they're leaving out the other state-of-the-art models, which is, you know, OpenAI's GPT-4o
and then Gemini.
And ultimately that means that the offering is limited and in some ways behind.
And I'm curious what you think about that argument.
I would obviously disagree that it's behind.
I think, you know, the interesting thing about these models is that, you know, they can be very seductive when you look at a model in isolation.
You know, you can read the benchmarks and, you know, tribes are forming around these models and all those sorts of things.
But what we see time and again with customers, enterprises, startups who are actually building with this in meaningful ways, is that they have a huge number of different workloads.
I work with some customers, and they're very generous, and they send me their roadmap of all the things across the company that they want to be able to apply generative AI to.
And it's a spreadsheet of, you know, five, six hundred rows of all the different things that they want to do with generative AI.
And, you know, it's kind of intuitive if you play that out: it seems very unlikely that there's going to be a single model that's going to be the best fit for all of those different workloads.
You know, some of those different workloads have different requirements.
Some have requirements that are, you know, heavy on reasoning or heavy on the ability to be able to do analysis.
Others need to be really good at summarization.
Others need to be really, really fast.
Others need to be very low cost.
And so there's this multiplicity of use cases that have different operational characteristics, whether it is intelligence or latency or cost, whatever it might be.
And customers usually want to be able to map the model to the mission.
They want to be able to find the right model for their use case
because if you have a small number of models or just a single model available to you,
it ends up having to play the role of kind of a Swiss Army knife.
And a Swiss Army knife sounds great.
It's great in a pinch.
But in reality, you almost never want a Swiss Army knife.
What you actually want is a broad tool belt with all of the specialized tools in there
that are a perfect fit for what you're trying to do.
If a contractor turned up at your home to do some renovations
and all they had was a Swiss Army knife,
I think you'd be pretty disappointed with their preparation,
probably pretty disappointed with their work quality as well.
That's right, exactly.
Same thing with AI models.
You want to be able to match the right model
to what it is that you're trying to do
so you can lean into the advantage of that model
in whatever it might be.
Now, some of those models, you really do
want as much intelligence and as much reasoning capability as possible.
And on Bedrock, we make available the Anthropic models, particularly Claude 3 and
the new Claude 3.5 improvements, which drive not just a great experience for high
intelligence requirements, but are the best performing models out there.
You know, Claude 3.5 Sonnet outperforms all other models on the planet.
And so that's great.
And you also want models which are really, really specialized for a specific task.
And so we're making available the EvolutionaryScale models that I talked about earlier.
They're available on AWS today, and we're going to bring them to Bedrock later this year.
We have summarization models.
We have models which are specifically tuned to build agentic systems.
We have models that are specifically tuned to work with reasoning.
We have other models that are just really, really, really, really cheap.
We have models that are multi-modal and will handle different modalities.
We have single modality models.
We have large models.
We have small models.
And time and time again, we have seen at AWS, and this is an insight that I think maybe
some other providers have not yet had,
but because of our background in cloud computing, we really recognize the value of optionality
for customers.
Every single time we have ventured into a new domain, customers have time and
again told us that they value the optionality of having purpose-built solutions.
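The "map the model to the mission" idea described above can be sketched as a simple router that assigns each workload to the model tier that fits its requirements. The workload fields and model names below are illustrative assumptions for the sketch, not actual Bedrock model identifiers:

```python
# Sketch: route each workload to the model whose strengths fit its
# requirements, rather than forcing one "Swiss Army knife" model.
# Model names are illustrative stand-ins, not real Bedrock model IDs.
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    needs_deep_reasoning: bool = False
    latency_sensitive: bool = False
    cost_sensitive: bool = False

def pick_model(w: Workload) -> str:
    """Return an (assumed) model name suited to the workload's profile."""
    if w.needs_deep_reasoning:
        return "anthropic.claude-3-5-sonnet"   # high-intelligence tier
    if w.latency_sensitive or w.cost_sensitive:
        return "anthropic.claude-3-haiku"      # fast, low-cost tier
    return "amazon.titan-text-express"         # general-purpose default

# A miniature version of the "spreadsheet of 500 rows" of use cases.
roadmap = [
    Workload("contract analysis", needs_deep_reasoning=True),
    Workload("chat autocomplete", latency_sensitive=True),
    Workload("ticket summarization", cost_sensitive=True),
]
assignments = {w.name: pick_model(w) for w in roadmap}
```

In practice the routing criteria would include accuracy benchmarks on your own data, cost per token, and latency budgets, but the shape is the same: many workloads, one model per workload.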
Right. Like being model agnostic is definitely a crucial aspect of development. Basically,
being able to swap in any model. Swapping models, I don't
think is quite the same thing. I look at it more like, for each individual use case, you want
to find the right model. Picking the right one. And that's working well? It's working well.
Well, Bedrock is one of our fastest growing services ever.
We have tens of thousands of customers that are using it today.
It's growing like crazy.
And it's really based on this observation that we carried over from our cloud computing work.
When we started, when we launched EC2, which is our Elastic Compute Cloud, it's our compute platform on AWS.
We launched with just a single compute type in a single availability zone.
Just one.
That was it.
That's all you could use.
But, you know, we saw it internally at Amazon, and customers very quickly
told us, that one single choice was not what they needed.
And so today we have over 400 different instance types in...
Right.
But if this choice is working so well, I want to ask you, then there's a question I've been
meaning to ask you for quite some time, which is that maybe it's limitations of the models
on the platform or maybe it's the evolution of the models.
but Amazon, I mean, Bloomberg worked on something called BloombergGPT on AWS.
And this is from Ethan Mollick.
He's a professor at Wharton who studies this stuff.
He says, remember BloombergGPT, which was a specially trained finance LLM,
drawing on all of Bloomberg's data,
made a bunch of firms decide to train their own models
to reap the benefits of their special information and data.
Here's what he says.
You may not have seen that GPT-4, the
old pre-turbo version with a small context window, without specialized finance training or special
tools, beat it on almost all finance tasks. So I guess I'm curious, from your perspective,
is it that you didn't have the right models, or that the models are advancing so fast
that something that took that much effort to train, through a process that makes a lot of
sense, could then eventually be surpassed by the next evolution of model from OpenAI?
Well, for context, those two models were, what, 12, maybe 18 months apart, something like that.
And today it looks like models have a shelf life of probably about six months if you're training
on kind of open web data.
And it's partly why we like working with our friends at Anthropic so much; they are
committed to continual and consistent improvement of all of their different models.
And, you know, they launched the Claude 3 set of models.
But if I'm a Bloomberg, though, then why would I develop this, you know, bespoke model if I could be then surpassed by an off-the-shelf model?
Well, again, I suspect that, I don't know for sure, but I suspect that for general world knowledge questions, you actually do want a model which is trained on world knowledge. That's really, really useful.
But that world knowledge is very, very, very broad, but it's not particularly deep.
And most organizations operate at depth.
And so there will be questions for sure that you can pose to multiple different models
and larger, more modern, world models.
I'm sure you can find examples that they will outperform specialized models.
And I am absolutely positive that the inverse is also true.
That you can find older, smaller, specialized models that will offer much better, higher quality,
lower hallucination results on specific tasks at the depth that most organizations need to operate at.
And so it's an "and," not an "or." And so if you follow this train of thought where there is
a single model that is going to quote-unquote win, I just think it's self-limiting because
you'll always end up with that being the Swiss Army knife. That becomes the denominator on your
capability. And that denominator is not guaranteed to grow to the depth that most
organizations need to be able to operate in. And so, world models are great. They're super exciting.
You want them and you want the opportunity to specialize those models and fine-tune them.
You want to be able to build your own models. You want to be able to take existing models
and continue to train them. You want to be able to layer in your existing data using retrieval
augmentation. You want to be able to adjust the alignment and style and tone of these models
in interesting ways. You want to be able to quantize those models if you want to
run them at lower cost or on different environments.
So there's all sorts of value in optionality and all sorts of reasons why you might choose
a different model.
And so that is a really good example of where an "and" of having different models is a really
good opportunity for customers.
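The "layer in your existing data using retrieval augmentation" step mentioned above reduces to: retrieve the most relevant private documents for a query, then prepend them to the prompt so the model is grounded in your data. A minimal sketch, with keyword overlap standing in for the embedding similarity a real system would use:

```python
# Minimal retrieval-augmented generation sketch: ground a general model
# in private data by retrieving relevant documents and prepending them
# to the prompt. Keyword overlap is a toy stand-in for vector search.
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    q_terms = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_terms & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    context = "\n".join(retrieve(query, documents))
    return f"Use only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Q3 revenue grew 12 percent on cloud demand.",
    "The cafeteria menu changes on Mondays.",
    "Cloud margins improved as revenue scaled.",
]
prompt = build_prompt("How did cloud revenue change?", docs)
```

The irrelevant document never reaches the model, which is the mechanism behind the "depth" argument: the model's broad world knowledge is combined with narrow, deep organizational data at query time rather than at training time.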
And you must have a good insight into like where the next level of models are going to go.
I mean, being so close with Anthropic, ear to the ground.
There's a lot of expectation that the next set, the GPT-5s,
maybe the Anthropic Claude 4s, are going to have sort of, I don't know, godlike capabilities.
That's how I like to refer to it on the show.
But that's the anticipation.
What is the realistic expectation for what's coming next on the model front?
I think it's a good question.
I think we'll see a couple of different things.
I think we'll continue to see improved reasoning capabilities, the ability to be able to take
in larger amounts of data, reason across it with very, very high accuracy, to be able
to answer increasingly complex questions,
to be able to apply logic to those questions.
We'll continue to see improvement in that.
I think that improvement will come iteratively,
kind of every six months, and probably much more quickly,
because different model providers are on slightly different schedules.
And so I think those will continue to improve.
I also think that there is an undervalued asset
in the fact that these models will continue
to get better, for sure.
But you also want to be able to layer in your own data in order to be able to get the model grounded at the right level for your organization.
And so the world as we see things going forwards is that the models will continue to get better, more capable, more reasoning capabilities, and specialization and customization of the systems built with those models will become increasingly important.
And there will be a more sophisticated set of guardrails
which are mediating what the models receive
and what they generate on the outside.
And so you're going to end up in a world, I think,
where you're going to have a set of models
which are going to continue to improve.
Combining those models is going to become
disproportionately advantageous.
You're going to have a set of data inside your organization,
some of which you're going to generate,
which is fresh, to be able to fine-tune those models,
some of it which many organizations already have,
which they're going to use to ground the models
in the reality of their business, and you're going to need a set of capabilities that allow you to
bring those components together, as well as kind of manage the generative AI applications.
And it's those capabilities that we're kind of focused on building across the board at
AWS.
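The guardrails described above, mediating what the models receive and what they generate, can be sketched as a wrapper around the model call. The blocked-term list and the redaction rule here are illustrative assumptions, not any particular product's policy:

```python
# Sketch of input/output guardrails: screen the prompt before it
# reaches the model, and scrub the response on the way out.
# The blocked terms and redaction pattern are illustrative only.
import re

BLOCKED_INPUT = re.compile(r"\b(ssn|credit card)\b", re.IGNORECASE)
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def guarded_invoke(prompt: str, model) -> str:
    """Mediate both sides of a model call."""
    if BLOCKED_INPUT.search(prompt):
        return "Request declined by input guardrail."
    raw = model(prompt)                      # the underlying model call
    return EMAIL.sub("[redacted]", raw)      # scrub emails from output

# A stand-in model so the sketch runs without a real endpoint.
echo_model = lambda p: f"Answer for: {p}. Contact bob@example.com for more."
safe = guarded_invoke("summarize the report", echo_model)
blocked = guarded_invoke("find the customer's SSN", echo_model)
```

A production guardrail layer would use classifiers rather than regexes, but the architecture is the same: a mediation layer that sits between the application and whichever model the workload routed to.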
A lot of stakes have been put into what's going to happen in the next 18 months in this gen AI world.
I mean, basically, from my understanding, there's billions of dollars being put into training
this next set of models. Everything that you said definitely implies that. But it's also just like
there are going to be companies that live and die based off of their next iteration of model.
So what do you think is a best case scenario and what is a worst case scenario for generative AI
18 months from now? I think that there's not going to be hundreds of world model providers.
I think that there's likely to be maybe a dozen, two dozen, something of that order of magnitude.
I think that, you know, Anthropic will be one, meta will be one, Amazon will be one. There'll be
there'll be others. But I don't think there'll be hundreds of these providers. I think there'll be
a small number of providers. And I think over time they will offer a broader, you see this
happening already, a broader family of models which offer different opportunities for optimization.
So some of those models will be, hey, the question I am asking is incredibly valuable to my
organization. I want to be able to pad it with as much context from my private repository as
possible and I want the best possible answer at any cost.
That's how valuable that query, that that prompt is to me.
I think there'll be a lot of that.
I also think that you're going to want to run, you know, a set of less capable models
at much, much lower cost and everything in between.
And so my guess is that these models will not, you know, kind of commodify.
My guess is that they will diversify increasingly over time. The idea that
these models will become commodities, defined as, you know, you can hot-swap them and
their economics are primarily driven by, you know, supply and demand? Yeah, I don't see that
happening. And you can see the beginnings of that now, as, you know, providers like Anthropic
are offering Claude 3 not as a single model, but as a model which has, you know,
the sliders on its configuration moved into slightly different positions and offers three different
models within a family. I could see that becoming 10 different models inside a family
with a more fine-tunable set of levers around cost and intelligence and capability and latency,
those sorts of things. And so I think that there'll be a larger number of models in aggregate,
but that the pool of providers probably won't grow much larger than a dozen or two.
Okay, but what is the best case scenario 18 months from now, and what is the worst case scenario?
Oh, the best case scenario is exactly what I laid out. That is the best case scenario for customers. That offers customers the broadest possible choice. It allows them, you know, by proxy, to be able to address the broadest number of use cases inside their organization. And by proxy, derive the scale which will deliver return on investment commensurate with the value that they're investing. So that's the best case.
But it doesn't, it doesn't seem like you're anticipating, in the best case scenario, models that will really be able to, like, dramatically
outperform what we have today? No, I think there will be. I think that if you look at the differences
between, you know, Claude 3 and Claude 3.5, you know, the way that you measure the improvement
is going to become increasingly nuanced. And so today, there is a, in my opinion, misguided
belief that, you know, the king of the hill will basically win. There's going to be a single winner
here. I don't think that's going to be the case because there is so much value in addressing all
of these different use cases.
And so the best performing model today
also has a really great cost profile
for the intelligence that it provides.
That was part of the innovation
between Claude 3 and Claude 3.5.
Now, over time, the intelligence will continue to go up
and there'll be different optionality
within the spectrum
so the customers can find that sweet spot.
That's a very interesting idea.
By the way, has Amazon put all $4 billion into Anthropic now?
I know that there was a promise that that was going to happen or an upper bound.
Yep, we've completed that investment, yeah.
Okay.
So then worst case scenario, what are we, like, let's say everything doesn't live up to expectations.
Like, you must be game planning this out.
Yeah, I mean, what do we end up with in the worst case scenario?
I think the worst case scenario is there's probably two pieces.
One, and this goes back to what we were saying earlier, I think.
The worst case scenario number one is we've just misjudged where we are on the S curve,
and we're actually in the top right-hand corner.
And the capabilities of the core technology, the models, the ability for the models to be
able to work with data at scale, the capabilities to be able to merge those two things
responsibly together, they don't mature and improve at the pace that we expect.
I think that would be a disappointing outcome.
I think it's pretty low probability at this point given the trajectory that we're on, but that could be one.
And the other is, again, going back to something we talked about earlier, is that the readiness of organizations slows down the opportunity to deliver on this technology because they are struggling to manage the change or they're struggling to really drive reinvention through some of their sort of cultural biases.
And so I could imagine that that is playing out.
And I think that's at least as large a challenge for most customers:
the way in which you structure and organize and drive and deliver and measure
how exactly you're going to kind of operationalize from a business perspective
this new technology discovery.
So that's the worst case.
Not every, yeah, not every company reinvents like Amazon.
This is true.
We are uniquely designed for speed,
which makes it an exciting place to work.
Yeah, okay, so on that note,
and I think we'll come bring it home with this one,
Amazon AI guy, I got to ask about Alexa.
I know it's a different division,
but maybe there is some collaboration going on today.
Everything I've heard about the limitations of Alexa
has been that the intelligence within Alexa
is effectively hard-coded in there,
that there's like hundreds or thousands of different queries
that it's prepared to
respond to, based off of like a database that it pulls from. And there's been a question of whether
Amazon is going to move from that style to a more large-language-model-powered Alexa that will
require effectively a rewrite. And so I'm curious if you think that the question is grounded in fact,
and what's going to happen inside the Alexa division of Amazon. Well, look, Alexa is, you know, an
extremely successful personal assistant and has been well received by customers.
We have hundreds of millions of Alexa endpoints out there that customers love to use.
What's really interesting about the future of Alexa is that part of the success of Alexa
has been that the way that Alexa works is that we're very, very accurate at matching the intent of the
user to actioning that intent.
So that may be simple things like telling a joke,
or getting the weather, or it could be more serious things
like smart home use cases.
Now, some of those are turning lights on and off,
but some of them are locking and unlocking doors,
setting burglar alarms and those sorts of things.
And so it's really important, a really important capability
of Alexa is the ability to be able to perform that mapping.
That is very, almost entirely, complementary to
the kind of revolution that we're seeing with large language models today,
which allow us to create a much more natural, much more fluid,
much more human sounding, much more intuitive interface to those intents.
And so that's what we're working on. We're working on marrying the capability of this
remarkable ability to be able to pair an intent to an action with the large language
model interfaces that have become very popular and allow us to kind of unlock entirely new ways for Alexa
to provide assistance for our customers.
And so I think it's a complementary marriage
between the two technologies,
and we're hard at work on that.
Is there an LLM in there today?
Alexa has over a dozen machine learning AI models under the hood,
including large language models.
And is that going to expand the LLM use cases within the device?
Yes. Part of what we're working on
is the ability to be able to take more modern LLMs
that have this very natural, easy, intuitive back and forth.
That's a really important part of building an assistant
and combining that, marrying it with the technical underpinnings,
which allow us to do this intent mapping under the hood very, very accurately.
Now, what's funny, the reason it's complementary is LLMs today
are not very good at doing that intent mapping.
They make mistakes, you need to be able to check them,
all those sorts of things.
And so, you know, LLMs are good at, you know, providing that natural language, that very intuitive interface in ways that is better than Alexa provides today.
And we want to take advantage of that.
But Alexa today also provides a lot of advantages that LLMs are not good at doing today.
And so, yeah, that's part of what we're doing, part of what we're working on.
Yeah.
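The "complementary marriage" described here, exact intent-to-action mapping for high-stakes commands with an LLM handling open-ended conversation, might look like this routing sketch. The intents, actions, and fallback are hypothetical:

```python
# Sketch: deterministic intent-to-action mapping for high-stakes
# commands (locks, alarms), with an LLM fallback for open-ended
# requests. All intents and actions here are hypothetical examples.
INTENTS = {
    "lock the front door": "smart_home.lock(door='front')",
    "set the alarm": "security.arm()",
    "turn off the lights": "smart_home.lights(on=False)",
}

def handle(utterance: str, llm) -> str:
    action = INTENTS.get(utterance.strip().lower())
    if action is not None:
        return f"EXECUTE {action}"       # exact, verified mapping
    return llm(utterance)                # natural-language fallback

# Stand-in LLM so the sketch runs without a real model.
mock_llm = lambda u: f"LLM reply to: {u}"
door = handle("Lock the front door", mock_llm)
chat = handle("Tell me something fun about space", mock_llm)
```

The point of the split is the one made in the conversation: the command that unlocks a door must never depend on a probabilistic generation, while the chit-chat benefits from it.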
And so is it going to require a full rewrite of the stuff under the hood of these assistants?
No, because we want to retain the core capability of Alexa,
which is this intent-to-action mapping.
Okay.
Time frame for that?
Nothing to announce today.
Matt Wood, always great to speak with you.
Thanks for coming on the show.
All right, everybody.
Thank you so much for listening.
We'll be back on Friday,
breaking down the news as usual.
Also, Matt is about to hit the stage
at AWS's New York Summit,
so I'm sure you can find the news
that he's going to be making
shortly after this podcast hits.
All right.
Thank you so much for listening.
And we'll see you next time on Big Technology Podcast.