Big Technology Podcast - Amazon's Longterm AI Vision — With Matt Wood
Episode Date: July 10, 2024

Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice. --- Matt Wood is the VP of AI Products at Amazon Web Services (AWS). Wood joins Big Technology Podcast to discuss the current state and future potential of AI, according to Amazon. Tune in to hear insights on how customers are adopting AI, the importance of model choice and specialization, and the evolution of AI in the near-term. We also cover AWS' AI platform, Amazon's Alexa assistant, and the cultural shifts needed for organizations to successfully leverage AI. Hit play for an engaging and informative conversation on the cutting edge of AI with one of the industry's leading experts. --- For weekly updates on the show, sign up for the pod newsletter on LinkedIn: https://www.linkedin.com/newsletters/6901970121829801984/ Want a discount for Big Technology on Substack? Here’s 40% off for the first year: https://tinyurl.com/bigtechnology Questions? Feedback? Write to: bigtechnologypodcast@gmail.com
Transcript
The VP of AI products at Amazon Web Services joins us to discuss what people are actually
building with the technology and whether it's worth the investment. All that and more is coming up
right after this. Welcome to Big Technology Podcast, a show for cool-headed, nuanced conversation
of the tech world and beyond. Well, a year later, we have Matt Wood back with us today.
He's the VP of AI products at Amazon Web Services. Last year, we spoke at the AWS Summit in
New York City, all about Amazon's AI strategy. And we have a great opportunity now to talk a little
bit more about where the AI field is heading. Matt, welcome back to the show. Great to see you.
Good to see you, too. Thanks for having me back. This is awesome. And congrats on the growth of the show.
It's been amazing. I listen every week. So it's a pleasure to be here.
Thanks so much. Oh, that's awesome. So you'll have some context here. So let me ask you,
I think, the most pressing question that I have first, which is when we spoke last year, you said,
I wouldn't be surprised if just the AI part of our cloud computing business was larger than the
rest of AWS combined in a couple years. So I'm actually curious where it is today,
but before we get into that, here's the sort of disconnect I have. So obviously we spoke last year,
there was all this potential with AI. We've been talking about it on the show a lot. And yet I was
just speaking with a colleague of yours who referenced this Gartner study that said only 21%
of AI proofs of concept, so the different programs and products
within companies, actually go into production. That's a one in five rate, which is not great
given how much effort and money it takes to get these things going. So talk a little bit about
like where the potential and the state of AI building stand today, why there's that disconnect
and where we might be heading. Yeah, I'm happy to talk through it from my perspective. I've been
very fortunate over the past year or two to talk to literally hundreds of customers in every
single industry. And I have honestly not seen this level of energy and enthusiasm for any
technology, probably since the advent of the cloud from customers. Most customers are investing
very diligently. They're making good progress. There is a group which is moving slightly
faster than the average, which is somewhat counterintuitive. And that group is actually
the regulated industries. And so it's folks like in financial services and insurance and
health care and life sciences and manufacturing and they're able to move a little bit faster
in part because all the regulations they've had to comply with over the past
20 years, which probably felt at the time like a bit of a headwind, have actually driven the right
set of behaviors for that group to be successful with generative AI. And so they have, you know, all
of the governance of their data figured out. They understand the quality of their data.
They understand which data can be used where, by whom, and for what purpose.
They have very, very large amounts of private text data, exabytes of the stuff in some cases,
which are market reports or clinical trial results or insurance documents, life insurance documents,
those types of things that the models have never seen before,
but are really good at looking at and reading and summarizing and connecting the dots and finding disconnects.
And they're just earlier in their kind of digital transformation journey.
And so they've probably looked across and felt like they were kind of sitting to the side
as other areas like retail and transportation and hospitality and media kind of went through
this very aggressive digital transformation over the past 10 years or so, driven by the web
and driven by mobile and a lot of other factors, including the cloud.
And these organizations are looking to not just use generative AI to,
catch up, but to actually leapfrog ahead using the data that they have, which is privately
held. So that's one area that I think is probably a little counterintuitive. I don't think
I would have guessed, you know, even a year ago or two years ago, that a, you know, 160-year-old
life insurance company would be in the vanguard of really delivering value through
generative AI. But they have these very, very large document stores of, you know, 90-year-old life
insurance documents, which are probably going to pay out in the next decade or so. And they've been
scanned at some point, but no one's ever read them, and they're not sure what level of risk
is associated to those documents inside their business. And so they're able to use generative AI
to be able to piece that risk together and understand it more completely. And so I think...
So I see what you're going to say: like, the companies with structured data,
who have it very organized and have this partnership already, are going to be the ones that are going
to benefit the most, which makes sense. But then there's also, like, some of the more glaring
issues. Change management is difficult. The model still costs too much to run. They're not quite good
enough yet. Like last year, we're going to get to it, but last year we were talking about agents and
all these other, you know, advanced use cases. And they've clearly not hit the way that they
are supposed to. So what do you think about these limitations? Aren't they the main things that are
holding back the field versus just like getting their data in order? Well, I think those, there are limitations
to the technology today, and part of being successful with the technology is understanding
those limitations at a deep level.
And you referenced, I'm not familiar with the details, but you referenced kind of 20%
of prototypes going into production.
Honestly, that sounds pretty good to me.
If you think of just the amount of experimentation that is happening inside organizations around
generative AI, just the number of experiments that are being
run on AWS for different companies in the regulated industries and all the other industries
that I mentioned, you know, Bedrock, which is the service that we make available to customers
to build generative AI applications, that's one of our fastest growing services ever.
And all up, AI machine learning at AWS is already a multi-billion dollar business in terms
of ARR. So there is a lot happening. And I think that that 20% is actually pretty good
because the denominator is absolutely massive.
And when technology shifts happen, you really do want customers to be able to innovate,
to be able to experiment, really safely, really quickly with that technology to find out what
works and what doesn't work.
And we're dealing here with a technology which is just in its very earliest days. It's
much more like a discovery than it is an invention.
We discovered that if you build these very sophisticated mathematical models, that there is
emergent behavior within them that resembles reasoning, that resembles intelligence.
And we're applying that in some places, and some of those applications will turn out to be
successful. And it is no surprise to me at all that some of those experiments turn out not to be
successful, because if you're experimenting in the right way, a lot of those experiments are going to
fail. And so it's why customers in part turn to AWS for running some of these workloads,
the majority of these workloads, because they're able to broadly democratize the way that
these applications are built using generative AI, and they're able to validate the ones that work
really, really quickly, and then when they find that 20% that works, they're able to take it
into production very quickly, and at very, very large scale with the right cost structure
around it as well. And so I think that the 20% is a little misleading if you think the denominator
is small, but that denominator is massive because there's just so much
experimentation happening. And we see it on AWS and inside Amazon as well. Right. Right. And look,
this is where I always kind of get tripped up because we talk about these, you know, these big,
uh, emergent, big things like emergent behaviors and models being able to do reasoning and how
it's a discovery. And then we talk about, okay, so what practically are they doing? And it's like,
well, they're combing through insurance documents. You know, shout out to all the folks working in
insurance. And I'm sure we have some listening to the show. But I'm like, man, if we
made this discovery, I mean, if people in the tech field made this discovery that models are
intelligent and can think for themselves. And, like, that's one side of this.
But then we ask where it's applied. And it's like, well, insurance adjusters are a little
bit more efficient. Because the market has valued, and the industry
has sort of started building around, these discoveries and use cases, the reasoning, the
emergent behaviors. But then we ask what's practical, and it's like the most boring applications
you could possibly imagine. So is that going to change? It's interesting you say boring
because boring workloads are boring because there's so freaking many of them. They're just
everywhere. And so yes, I absolutely believe that there will be large step function changes in a
significant number of industries that are going to drive orders of magnitude improvements for
the organizations that work on them and society at large. One example is just computational biology
and we can talk about that in more detail, but the work that's going on there in terms of
using generative AI by the likes of the Dana-Farber Cancer Institute or Genomics England or
Pfizer, or the work we've done with a startup called EvolutionaryScale, to be able to use
generative AI to be able to design entirely new molecules to design entirely new antibodies
that are manufacturable, that can go on and find new drug targets.
That is a major opportunity and a step function.
It's early, for sure.
The company just went out of stealth.
They just published their paper, which is a great paper.
Recommend everybody read it just for background on what's happening in that field.
But I absolutely believe that there will be many different step functions forward
in multiple different industries of that format.
I also think that there is a huge number.
Some of it's going to be long tail, but just a huge number of, you know, what you call boring workloads
that are going to be completely reimagined through the use of generative AI, and that's okay.
You actually want a lot of that boring work to be automated. You want a lot of that work to be improved.
You want to be able to channel the boring work, which maybe inside some organizations is seen as
a bit of a cost center, and to be able to turn that on its head and channel it into something which drives invention and
drives growth. And this is exactly what we saw with cloud computing in the early days as well.
I literally could have said that sentence. In fact, I think I did, with the advent of cloud
computing, that there is a huge number of workloads inside many, many enterprises that can
take advantage of not just the cost savings in the cloud, but can take advantage of the agility
in the cloud, and take something which is traditionally considered a cost center, building out
data centers, which offer no differentiated value, and turn it on its head and drive the right
cost structure and the right agility to be able to use that infrastructure to drive new
product creation, new invention, and reimagination of all of these different products.
And so what we consider boring today is going to be rechanneled, in my opinion, into
much more high leverage growth opportunities for many organizations.
There's such a big change management component to it as well, right? We talk about the models, right?
There's a cost, there's a capability of the model. But also, you know, one thing about trying to
reimagine how boring work is done is there's a lot of people who are sort of used to that
work. What percentage of the workplace do you think is really ready to, let's say this
AI can revolutionize the way they do work, what percentage of the workforce do you think is
ready to take advantage of it? It's a good question. I'm not sure I would peg it as kind of
ready. I suspect that whilst there will be these step function changes over the long period,
I think in the shorter term, in the shorter outlook, it's going to feel a lot more incremental
than we're probably used to. There's an old adage, a story that folks tell, that when we finally
discover that there's life on another planet in another galaxy. Yeah, we all
have this idea that this will be a huge, you know, societal shifting event for the planet
that we discover there's life on another planet. But in reality, I suspect there's just
going to be lots and lots and lots of small iterative announcements and that when the NASA
press release comes out that there's life on another planet, it will seem really obvious at that
point. And it'll, from now to when that eventually happens, yeah, that's a really big jump. But
incrementally, we'll get there incrementally, not in one big shift. And I think the same thing
will apply here. There will be over a long term, like big incremental shifts in how we deliver
products, in how we deliver technology, and how we interact with data and information and each other.
But it'll probably appear kind of incrementally. And having patience and having a long-term
view allows you to drive more of that value incrementally and allows you to experiment more.
And it allows you to have big goals and kind of, you know, iterate yourself to iterate your way to greatness.
And having that long-term view, I think, is to go back to your question, one of the most important cultural shifts that organizations will need to make.
You're going to need to have the right teams, for sure.
You're going to need to have the right talent.
You're going to need to have the right technology.
You're going to need to partner with the right organizations to be able to drive that technology.
But having the ability to be able to take a long-term view so that you can allow those creative, inventive builders to be able to use that technology to be able to iterate and improve an experiment and invent.
That requires discipline from a leadership perspective.
It requires you to set up kind of small blast radius experiments.
And it requires the organizations to be very tolerant to that failure.
Because when experiments fail, you've learned something
there, if you've set it up right. And that learning is disproportionately valuable at this point
in the kind of technology cycle. And so that cultural element that you outline is absolutely
critical. I'd actually say it's more like 50% technical, 50% cultural in terms of the weighting
of the elements of investment that are going to be required to be successful. So I'm not sure
exactly what percentage right now is kind of ready. I would guess if I had to put a number
on it, I would say it's probably 25% to 35% in most large-sized enterprises. But over time,
you know, if you look three years out, five years out, 10 years out, whatever it might be,
with that long-term horizon, my guess is going to be 100%. Yep. Okay, I'm going to ask a follow-up
on that, but first, you believe in aliens? I think you have to believe in aliens if you
understand just how big the universe is. It just seems
incredibly unlikely that we have hit the absolute only magical sweet spot in the whole universe
to encourage carbon to be able to animate and dance around as we do as humans every day. So
the probability of it just being limited to Earth seems very, very unlikely, although I acknowledge
the paradox that if there's life out there, you know, where is it? So that's why I kind of like
that. Or, I mean, yeah, they could also all be dead, and we might be, like, right now, like, the only
living... I mean, I think there probably are some sort of life forms out there that have existed in the
universe, you know, either before us or will come after, but to have them exist concurrently, yeah, I agree,
that's the question. All right, let's... Yeah, go ahead. I just want to get back to the AI stuff.
I guess we could do another show on it. I would love it. Goodness. All right, so what you're
saying about patience, incrementality, you know, 25% of the organizations being ready and
replacing the boring stuff, that all sounds good. But it also makes me wonder if we're going to
end up in a sort of trough of disillusionment with this technology because there's been so much
hype and so much money that have poured into it that are demanding almost a revolution now.
And what you're describing isn't a revolution or isn't a quick moving revolution. It might be a
slow-moving incremental sea change, but not something that happens immediately. It's not something
that, you know, the Wall Street types, for instance, will be, like, thrilled to know that it's
just going to take a while because they think in quarters. So do you think there's a risk here
and, like within the next few years, sort of the public perception of this technology turning
a little bit because of the incremental nature of it? I think it would be a possibility if
and it's a huge if
if the technology wasn't
poised to improve.
So if what we,
if you believe that what we have today
is pretty much what we're going to have to work with
with only incremental small improvements
over the next three, five years,
you know,
then I suspect that, you know,
folks will feel like, you know,
the promise on this occasion
hasn't been delivered on.
But, you know, technology tends to follow
an S curve over time.
And, you know, you,
you get to the top right-hand corner of that S-curve and you end up with the technology,
with the capability, and you get these, you know, just decreasing improvements over time.
You never really know where you're at on the S-curve until you're looking backwards.
And so it's kind of hard to judge where we're at.
I think most people would think we're probably in that kind of middle section high-gradient
piece just because there's so much happening and there's so many improvements.
There's new models and new
techniques and new technologies from academia and the public sector, private sector.
And I have no doubt that by the time we finish this conversation, there'll be another
technique out there that is worthy of our attention. But my guess is that it's probably
more likely that we're at the bottom left-hand corner. I don't think we've hit the kind of
hockey stick inflection point yet of what this technology is capable of. It's still very, very,
very early. So at some point we're going to hit that hockey stick inflection
point. And it always happens with different technology shifts. It can take more or less time
depending on the shift and the speed of the technology. And the thing
that triggers the S-curve bend is different in a number of different ways. So if you kind of look
at the maturation of the internet itself, you know, that hockey stick inflection point, it really,
I think, landed with the development of kind of SaaS-style web 2.0 applications, whether it
was, whether it was webmail or whether it's finance systems, whether it's hotel booking
systems, whatever it was, that capability of being able to have access to those types of
services, that made, and the fact that you could integrate those services kind of through
APIs and do interesting things with them, that meant that every new service that was added
to the internet made all of the other services more valuable. And that's kind of what pushes
you up the S-curve many times. You see the same thing with kind of the mobile transformation
where we had these remarkable new devices,
we had these applications that more and more people invested in,
more and more organizations invested in,
they became more and more sophisticated.
And over time, the operating systems on which those applications ran
allow those applications to interoperate and interact in interesting ways,
both with the operating system and with each other.
So every net new application added makes all of them more useful,
makes the whole system more useful.
The whole device in your pocket gets better over time
without you having to do anything.
And so that pushes you up the S curve as well.
And I don't think we're at that point
with generative AI yet.
We have a really robust set of really interesting,
really powerful models which are going to mature over time.
But customers will, I'm sure, find interesting ways
to combine those different models.
There isn't one model to kind of rule them all.
Each different model has different sweet spots.
And it's my expectation that most customers will invest
in not building the foundation
models, but will invest in fine-tuning and improving those individual models and customizing
them in interesting ways for their own use case.
And those capabilities are interesting in isolation, but part of what will push us up
the S curve that we're seeing with customers at AWS and at Amazon is that combining those
models, leaning into the sweet spot of all these different models, allows you to build systems
that in aggregate have a compounding effect on intelligence.
It's not additive, it's a multiplier.
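The "map the model to the mission" idea Wood describes can be sketched as a tiny router that scores each model profile against a workload's requirements. Everything here, model names, trait scores, and weights, is a hypothetical illustration of the pattern, not Bedrock's actual API or real benchmark data.

```python
# Hypothetical model profiles: how strong each model is on traits that
# matter to different workloads (reasoning, speed, cost efficiency).
# All names and numbers are illustrative placeholders.
MODEL_PROFILES = {
    "deep-reasoner":   {"reasoning": 0.9, "speed": 0.3, "cost_efficiency": 0.2},
    "fast-summarizer": {"reasoning": 0.5, "speed": 0.9, "cost_efficiency": 0.8},
    "budget-small":    {"reasoning": 0.3, "speed": 0.8, "cost_efficiency": 0.95},
}

def pick_model(requirements: dict) -> str:
    """Pick the model whose traits best match the workload's weighted needs."""
    def score(profile: dict) -> float:
        return sum(weight * profile.get(trait, 0.0)
                   for trait, weight in requirements.items())
    return max(MODEL_PROFILES, key=lambda name: score(MODEL_PROFILES[name]))

# A reasoning-heavy workload and a cost-sensitive one land on different models.
print(pick_model({"reasoning": 1.0}))                      # deep-reasoner
print(pick_model({"cost_efficiency": 1.0, "speed": 0.5}))  # budget-small
```

In a real system the scoring step would come from evaluations against each workload, but the design choice is the same: no single Swiss Army knife model, just a dispatch to whichever specialized model fits.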
And so that's going to push us a little bit further up the S-curve.
I think another really interesting area,
and the one that's probably closest to SaaS applications and mobile apps,
is what you mentioned earlier, is agents.
I think agents have a good chance of being the apps
for the generative AI world and the generative AI era.
And that as we add more of those and we find ways to orchestrate multiple agents together,
and there's already customers that are building multi-agent systems on AWS today,
that combine specialties and combine agents that can goal-seek on your behalf and collaborate or
contest with each other in interesting ways, that means that every new agent that's added to the
system drives you up the S-curve. It makes all the other agents more useful at the same time
without you having to do anything. And that's a really big part of it.
But are agents an actual thing in production now? Yeah, I think so. You know, we have
Because, like, I'm just going to say, like, last year we spoke about, you made an announcement
about how agent-building technology was on its way, and it's just, a full year has
gone by and I haven't seen one example of, like, a realistic agent going out there and taking
action for people. Well, I think I've certainly seen some. I use some on a day-to-day basis.
Let's hear it. Yeah, that's why we do these discussions. I'd recommend you check out a couple things
that may be interesting to you in the audience.
One is a startup company called Ninja Tech.
You can check them out at ninjatech.ai.
They have an assistant, an assistive system
you can interact with in natural language, as you may be familiar with.
But they also have, under the hood, a set of specialized agents
that can perform different tasks on your behalf.
So they have a researcher agent, they have a scheduling agent,
they have a web agent, all sorts of different agents.
And just by asking your question, they interpret the question, and then they have an agent which
looks at the response and says, hey, this looks like you're doing some research.
Let me ask my researcher how I can best help you.
And I'll set a problem to my research agent.
And that research agent runs off and does its thing.
And they say, oh, there may be some web data that will be useful here.
I'll set my web agent off to go and collect that data for me and so on and so forth.
and it pulls back all the information together and allows you to interact with your calendar
and with your schedule or your email or whatever it might be in levels which are much more
automated than you could do with just a standard assistive chatbot.
So that's one example.
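The orchestration pattern described here, a coordinator that interprets the request, dispatches to specialized agents, and merges their results, can be sketched in a few lines. The agent names, routing keywords, and canned responses are all hypothetical stand-ins; a production system would use a model call for classification and real tools behind each agent, and this is not Ninja Tech's actual implementation.

```python
# Toy specialized agents: each "does its thing" and returns a result.
def research_agent(task: str) -> str:
    return f"research notes for: {task}"

def web_agent(task: str) -> str:
    return f"web data for: {task}"

def scheduling_agent(task: str) -> str:
    return f"calendar update for: {task}"

AGENTS = {
    "research": research_agent,
    "web": web_agent,
    "schedule": scheduling_agent,
}

def classify(request: str) -> list[str]:
    """Stand-in for the model call that decides which specialists to involve."""
    keywords = {"research": "research", "web": "latest", "schedule": "meeting"}
    return [name for name, kw in keywords.items() if kw in request.lower()]

def coordinator(request: str) -> dict:
    """Fan the request out to the relevant agents and gather their answers."""
    return {name: AGENTS[name](request) for name in classify(request)}

results = coordinator("Research the latest chip announcements and set up a meeting")
print(sorted(results))  # ['research', 'schedule', 'web']
```

The interesting property, echoing the S-curve point above, is that each new entry in `AGENTS` makes every request the coordinator can handle richer, without changing the coordinator itself.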
But if this stuff is so useful, then why hasn't it broken out into the public?
I just think it's very early.
Today, agent systems, or agentic systems as they're sometimes called, it's still relatively early.
But I think they are breaking out, to be fair.
I think Ninja Tech is seeing remarkable growth.
They have hundreds of thousands of monthly active users.
We've also built some really powerful and popular agents on AWS.
So we have an assistant for builders that we call Q, Amazon Q.
And Amazon Q allows you to generate code if you're building software.
It will take a question and give you answers and give you guidance on how to build on
AWS and all the things you would expect.
And that's useful, that gets you a bump in productivity.
We've seen some customers get, you know, in terms of just the amount of code that is
automatically generated that they accept,
it's usually between 35 and 50 percent.
It's higher on Q than any other comparable service.
But the thing that drives productivity for developers is what we call the developer agents
inside Q.
And so with the developer agents, you don't just ask a question about what code to write
or write a comment and get the function back,
you actually set a task to Q.
You say to Q, hey, I want to add this feature to my software.
Q looks at the software across your repository.
It looks at the changes that you've made inside your development environment.
It understands the type of change or the type of feature that you want to make.
And it goes off and it looks at all of that information and it makes a strategy.
It doesn't just generate the code, it makes a strategy for how to add that feature.
So it picks which functions need to be updated,
which modules need to be added,
which tests need to be run,
which documentation needs to be added.
And you get a chance to review that strategy.
And at some point, you can just say, hey, Q, go for it.
And Q will go off and work through diligently
through its to-do list to create a set of software changes
that you can choose to commit, which add that feature
to your code.
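The plan, review, execute loop described for these developer agents can be sketched as follows. The step names, the `Strategy` type, and the approval gate are hypothetical illustrations of the workflow Wood outlines, not Amazon Q's real interface.

```python
from dataclasses import dataclass, field

@dataclass
class Strategy:
    """The agent's to-do list for a feature, held for human review."""
    steps: list[str] = field(default_factory=list)
    approved: bool = False

def plan_feature(feature: str) -> Strategy:
    """Stand-in for the analysis pass over the repository and dev environment."""
    return Strategy(steps=[
        f"update functions touched by '{feature}'",
        f"add module implementing '{feature}'",
        f"run tests covering '{feature}'",
        f"write docs for '{feature}'",
    ])

def execute(strategy: Strategy) -> list[str]:
    """Work through the to-do list, but only after explicit sign-off."""
    if not strategy.approved:
        raise PermissionError("strategy must be reviewed and approved first")
    return [f"done: {step}" for step in strategy.steps]

plan = plan_feature("dark mode")
# ...a human reviews plan.steps here, then says "go for it":
plan.approved = True
changes = execute(plan)
print(len(changes))  # 4
```

The key design choice is the gate between planning and execution: the agent proposes a complete strategy up front, and nothing is committed until a person approves it.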
And so if you can imagine a developer going from having
to just write or generate that code manually to having tens or dozens or, over time, hundreds
of those developer agents running around doing the work on their behalf, you get this,
you know, combinatorial explosion of productivity. We do the same thing for code transformation.
And so if you want to move between different versions of Java, you know, we support that today.
You just say, hey, update this to be compatible with Java 17, whatever you're running.
It will go off and make that same strategy.
It will work diligently through it
and then allow you to review the results
and you can choose to accept those
and commit them back.
And that's a fixed cost effort
that most organizations have to go through.
We need to move Software Project A
from Java X to Java Y.
And it's going to take 10 people
and it's going to take three months.
And we're just going to have to,
it's just a cost of doing business.
We're just going to have to pay that cost.
Pay that cost in people, pay that cost in productivity.
And this is a task that no developer really likes to do.
It's kind of toil work and the very best outcome.
That's definitely boring.
It's boring, exactly.
But it's super impactful because there's so much of it.
And so you move from a world where you have this fixed cost that you just have to pay,
a cost center, just like we were talking about earlier,
and you move it to a point where that is just taken off the table.
It's completed automatically.
And those same developers get back to actually move into doing things which are much more
productive instead of that work. So we have Java to Java. And you named it, you named it
Q because of Q from Star Trek, not QAnon, right? It's neither, it's neither. But
what was the inspiration? It's based on a quartermaster, the idea of a quartermaster, where you get your
gadgets. Okay, you guys couldn't have picked a different letter. It's a very controversial letter these
days. I think it'll work out okay. We're here with Matt Wood. He's the VP of AI products at
Amazon Web Services.
On the other side of the break,
we're going to talk a little bit
about Amazon's products
and also where the models are going next.
So stay tuned.
We'll be back right after this.
Hey, everyone.
Let me tell you about The Hustle Daily Show,
a podcast filled with business,
tech news, and original stories
to keep you in the loop on what's trending.
More than 2 million professionals
read The Hustle's daily email
for its irreverent and informative takes
on business and tech news.
Now, they have a daily podcast called
The Hustle Daily Show,
where their team of writers
break down the biggest business headlines in 15 minutes or less and explain why you should care
about them. So search for The Hustle Daily Show in your favorite podcast app, like the one you're
using right now. And we're back here on Big Technology Podcast with Matt Wood, the VP of AI products
at Amazon Web Services. All right, Matt, so last year we were talking a little bit about
Bedrock, which is basically a tool that Amazon Web Services customers can use to develop
AI applications. And the idea that you explained to me was basically Amazon's play for generative AI
was that people who want to develop on AI could go in and pick their own models through
Bedrock. It could be Meta's Llama or Amazon proprietary models or any host of other
models and then they could build that way. But Bedrock has not integrated OpenAI's GPT models yet
or Google's Gemini models yet.
And I was speaking with someone in the know who was basically like, look, like what they're
offering is not really choice.
It's like one model that works well, which is Anthropic's.
And they're leaving out the other state-of-the-art models, which is, you know, OpenAI's GPT-4o
and then Gemini.
And ultimately that means that the offering is limited and in some ways behind.
And I'm curious what you think about that argument.
I would obviously disagree that it's behind.
I think, you know, the interesting thing about these models is that, you know, they can be very seductive when you look at a model in isolation.
You know, you can read the benchmarks and, you know, tribes are forming around these models and all those sorts of things.
But what we see time and again with customers, enterprises, startups who are actually building with this in meaningful ways, is that they have a huge number of different workloads.
I work with some customers, and they're very generous, and they send me their roadmap of all the things across the company that they want to be able to apply generative AI to.
And it's a spreadsheet of, you know, five, six hundred rows of all the different things that they want to do with generative AI.
And, you know, it's kind of intuitive if you play that out: it seems very unlikely that there's going to be a single model that's going to be the best fit for all of those different workloads.
You know, some of those different workloads have different requirements.
Some have requirements that are, you know, heavy on reasoning or heavy on the ability to be able to do analysis.
Others need to be really good at summarization.
Others need to be really, really fast.
Others need to be very low cost.
And so there's this multiplicity of use cases that have different operational characteristics, whether it is intelligence or latency or cost, whatever it might be.
And customers usually want to be able to map the model to the mission.
They want to be able to find the right model for their use case
because if you have a small number of models or just a single model available to you,
it ends up having to play the role of kind of a Swiss Army knife.
And a Swiss Army knife sounds great.
It's great in a pinch.
But in reality, you almost never want a Swiss Army knife.
What you actually want is a broad tool belt with all of the specialized tools in there
that are a perfect fit for what you're trying to do.
If a contractor turned up at your home to do some renovations
and all they had was a Swiss Army knife,
I think you'd be pretty disappointed with their preparation,
probably pretty disappointed with their work quality as well.
That's right, exactly.
Same thing with AI models.
You want to be able to match the right model
to what it is that you're trying to do
so you can lean into the advantage of that model
in whatever it might be.
Now, some of those models, you really do
want as much intelligence and as much reasoning capability as possible.
And on Bedrock, we make available the Anthropic models, particularly Claude 3 and
the new Claude 3.5 improvements, which drive not just a great experience for high
intelligence requirements, but are the best performing models out there.
You know, Claude 3.5 Sonnet outperforms all other models on the planet.
And so that's great.
And you also want models which are really, really specialized for a specific task.
And so we're making available the EvolutionaryScale models that I talked about earlier.
They're available on AWS today, and we're going to bring them to Bedrock later this year.
We have summarization models.
We have models which are specifically tuned to build agentic systems.
We have models that are specifically tuned to work with reasoning.
We have other models that are just really, really, really, really cheap.
We have models that are multi-modal and will handle different modalities.
We have single modality models.
We have large models.
We have small models.
And time and time again, we have seen at AWS, and this is an insight that I think maybe
some other providers have not yet had,
but because of our background in cloud computing, we really recognize the value of optionality
for customers.
Every single time we have ventured into a new domain, customers have time and
again told us that they value the optionality of having purpose-built solutions.
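The "map the model to the mission" idea described above can be sketched as a simple router that assigns each workload to the model tier that fits its requirements. The workload fields and model names below are illustrative assumptions for the sketch, not actual Bedrock model identifiers:

```python
# Sketch: route each workload to the model whose strengths fit its
# requirements, rather than forcing one "Swiss Army knife" model.
# Model names are illustrative stand-ins, not real Bedrock model IDs.
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    needs_deep_reasoning: bool = False
    latency_sensitive: bool = False
    cost_sensitive: bool = False

def pick_model(w: Workload) -> str:
    """Return an (assumed) model name suited to the workload's profile."""
    if w.needs_deep_reasoning:
        return "anthropic.claude-3-5-sonnet"   # high-intelligence tier
    if w.latency_sensitive or w.cost_sensitive:
        return "anthropic.claude-3-haiku"      # fast, low-cost tier
    return "amazon.titan-text-express"         # general-purpose default

# A miniature version of the "spreadsheet of 500 rows" of use cases.
roadmap = [
    Workload("contract analysis", needs_deep_reasoning=True),
    Workload("chat autocomplete", latency_sensitive=True),
    Workload("ticket summarization", cost_sensitive=True),
]
assignments = {w.name: pick_model(w) for w in roadmap}
```

In practice the routing criteria would include accuracy benchmarks on your own data, cost per token, and latency budgets, but the shape is the same: many workloads, one model per workload.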
Right. Like being model agnostic is definitely a crucial aspect of development. Basically,
being able to swap in any model. Swapping models, I don't
think is quite the same thing. I look at it more like, for each individual use case, you want
to find the right model. Picking the right one. And that's working well? It's working well.
Well, Bedrock is one of our fastest growing services ever.
We have tens of thousands of customers that are using it today.
It's growing like crazy.
And it's really based on this observation that we carried over from our cloud computing work.
When we started, when we launched EC2, which is our Elastic Compute Cloud, it's our compute platform on AWS.
We launched with just a single compute type in a single availability zone.
Just one.
That was it.
That's all you could use.
But, you know, we saw it internally at Amazon, and customers very quickly
told us, that one single choice was not what they needed.
And so today we have over 400 different instance types in...
Right.
But if this choice is working so well, I want to ask you, then there's a question I've been
meaning to ask you for quite some time, which is that maybe it's limitations of the models
on the platform or maybe it's the evolution of the models.
but Amazon, I mean, Bloomberg worked on something called BloombergGPT on AWS.
And this is from Ethan Mollick.
He's a professor at Wharton who studies this stuff.
He says, remember BloombergGPT, which was a specially trained finance LLM,
drawing on all of Bloomberg's data,
made a bunch of firms decide to train their own models
to reap the benefits of their special information and data.
Here's what he says.
You may not have seen that GPT-4, the
old pre-turbo version with a small context window, without specialized finance training or special
tools, beat it on almost all finance tasks. So I guess I'm curious, from your perspective,
is it that you didn't have the right models, or that the models are advancing so fast
that something that took that much effort to train, through a process that makes a lot of
sense, could then eventually be surpassed by the next evolution of model from OpenAI?
Well, for context, those two models were, what, 12, maybe 18 months apart, something like that.
And today it looks like models have a shelf life of probably about six months if you're training
on kind of open web data.
And it's partly why we like working with our friends at Anthropic so much; they are
committed to continual and consistent improvement of all of their different models.
And, you know, they launched the Claude 3 set of models.
But if I'm a Bloomberg, though, then why would I develop this, you know, bespoke model if I could be then surpassed by an off-the-shelf model?
Well, again, I suspect that, I don't know for sure, but I suspect that for general world knowledge questions, you actually do want a model which is trained on world knowledge. That's really, really useful.
But that world knowledge is very, very, very broad, but it's not particularly deep.
And most organizations operate at depth.
And so there will be questions for sure that you can pose to multiple different models
and larger, more modern, world models.
I'm sure you can find examples that they will outperform specialized models.
And I am absolutely positive that the inverse is also true.
That you can find older, smaller, specialized models that will offer much better, higher quality,
lower hallucination results on specific tasks at the depth that most organizations need to operate at.
And so it's an "and," not an "or." And so if you follow this train of thought where there is
a single model that is going to quote-unquote win, I just think it's self-limiting because
you'll always end up with that being the Swiss Army knife. That becomes the denominator on your
capability. And that denominator is not guaranteed to grow to the depth that most
organizations need to be able to operate in. And so, world models are great. They're super exciting.
You want them and you want the opportunity to specialize those models and fine-tune them.
You want to be able to build your own models. You want to be able to take existing models
and continue to train them. You want to be able to layer in your existing data using retrieval
augmentation. You want to be able to adjust the alignment and style and tone of these models
in interesting ways. You want to be able to quantize those models if you want to
run them at lower cost or on different environments.
So there's all sorts of value in optionality and all sorts of reasons why you might choose
a different model.
And so that is a really good example of where an "and" of having different models is a really
good opportunity for customers.
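The "layer in your existing data using retrieval augmentation" step mentioned above reduces to: retrieve the most relevant private documents for a query, then prepend them to the prompt so the model is grounded in your data. A minimal sketch, with keyword overlap standing in for the embedding similarity a real system would use:

```python
# Minimal retrieval-augmented generation sketch: ground a general model
# in private data by retrieving relevant documents and prepending them
# to the prompt. Keyword overlap is a toy stand-in for vector search.
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    q_terms = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_terms & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    context = "\n".join(retrieve(query, documents))
    return f"Use only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Q3 revenue grew 12 percent on cloud demand.",
    "The cafeteria menu changes on Mondays.",
    "Cloud margins improved as revenue scaled.",
]
prompt = build_prompt("How did cloud revenue change?", docs)
```

The irrelevant document never reaches the model, which is the mechanism behind the "depth" argument: the model's broad world knowledge is combined with narrow, deep organizational data at query time rather than at training time.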
And you must have a good insight into like where the next level of models are going to go.
I mean, being so close with Anthropic, ear to the ground.
There's a lot of expectation that the next set, the GPT-5s,
maybe the Anthropic Claude 4s, are going to have sort of, I don't know, godlike capabilities.
That's how I like to refer to it on the show.
But that's the anticipation.
What is the realistic expectation for what's coming next on the model front?
I think it's a good question.
I think we'll see a couple of different things.
I think we'll continue to see improved reasoning capabilities, the ability to be able to take
in larger amounts of data, reason across it with very, very high accuracy, to be able
to answer increasingly complex questions,
to be able to apply logic to those questions.
We'll continue to see improvement in that.
I think that improvement will come iteratively,
kind of every six months, and probably much more quickly,
because different model providers are on slightly different schedules.
And so I think those will continue to improve.
I also think that there is an undervalued asset
in the fact that these models will continue
to get better, for sure.
But you also want to be able to layer in your own data in order to be able to get the model grounded at the right level for your organization.
And so the world as we see things going forwards is that the models will continue to get better, more capable, more reasoning capabilities, and specialization and customization of the systems built with those models will become increasingly important.
And there will be a more sophisticated set of guardrails
which are mediating what the models receive
and what they generate on the outside.
And so you're going to end up in a world, I think,
where you're going to have a set of models
which are going to continue to improve.
Combining those models is going to become
disproportionately advantageous.
You're going to have a set of data inside your organization,
some of which you're going to generate,
which is fresh, to be able to fine-tune those models,
some of it which many organizations already have,
which they're going to use to ground the models
in the reality of their business, and you're going to need a set of capabilities that allow you to
bring those components together, as well as kind of manage the generative AI applications.
And it's those capabilities that we're kind of focused on building across the board at
AWS.
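The guardrails described above, mediating what the models receive and what they generate, can be sketched as a wrapper around the model call. The blocked-term list and the redaction rule here are illustrative assumptions, not any particular product's policy:

```python
# Sketch of input/output guardrails: screen the prompt before it
# reaches the model, and scrub the response on the way out.
# The blocked terms and redaction pattern are illustrative only.
import re

BLOCKED_INPUT = re.compile(r"\b(ssn|credit card)\b", re.IGNORECASE)
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def guarded_invoke(prompt: str, model) -> str:
    """Mediate both sides of a model call."""
    if BLOCKED_INPUT.search(prompt):
        return "Request declined by input guardrail."
    raw = model(prompt)                      # the underlying model call
    return EMAIL.sub("[redacted]", raw)      # scrub emails from output

# A stand-in model so the sketch runs without a real endpoint.
echo_model = lambda p: f"Answer for: {p}. Contact bob@example.com for more."
safe = guarded_invoke("summarize the report", echo_model)
blocked = guarded_invoke("find the customer's SSN", echo_model)
```

A production guardrail layer would use classifiers rather than regexes, but the architecture is the same: a mediation layer that sits between the application and whichever model the workload routed to.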
A lot of stakes have been put into what's going to happen in the next 18 months in this gen AI world.
I mean, basically, from my understanding, there's billions of dollars being put into training
this next set of models. Everything that you said definitely implies that. But it's also just like
there are going to be companies that live and die based off of their next iteration of model.
So what do you think is a best case scenario and what is a worst case scenario for generative AI
18 months from now? I think that there's not going to be hundreds of world model providers.
I think that there's likely to be maybe a dozen, two dozen, something of that order of magnitude.
I think that, you know, Anthropic will be one, meta will be one, Amazon will be one. There'll be
there'll be others. But I don't think there'll be hundreds of these providers. I think there'll be
a small number of providers. And I think over time they will offer a broader, you see this
happening already, a broader family of models which offer different opportunities for optimization.
So some of those models will be, hey, the question I am asking is incredibly valuable to my
organization. I want to be able to pad it with as much context from my private repository as
possible and I want the best possible answer at any cost.
That's how valuable that query, that that prompt is to me.
I think there'll be a lot of that.
I also think that you're going to want to run, you know, a set of less capable models
at much, much lower cost and everything in between.
And so my guess is that these models will not, you know, kind of commodify.
My guess is that they will diversify increasingly over time. The idea that
these models will become commodities, defined as, you know, you can hot-swap them and
their economics are primarily driven by, you know, supply and demand? Yeah, I don't see that
happening. And you can see the beginnings of that now, as, you know, providers like Anthropic
are offering Claude 3 not as a single model, but as a model which has, you know,
the sliders on its configuration moved into slightly different positions and offers three different
models within a family. I could see that becoming 10 different models inside a family
with a more fine-tunable set of levers around cost and intelligence and capability and latency,
those sorts of things. And so I think that there'll be a larger number of models in aggregate,
but that the pool of providers probably won't grow much larger than a dozen or two.
Okay, but what is the best case scenario 18 months from now, and what is the worst case scenario?
Oh, the best case scenario is exactly what I laid out. That is the best case scenario for customers. That offers customers the broadest possible choice. It allows them, you know, by proxy, to be able to address the broadest number of use cases inside their organization. And by proxy, derive the scale which will deliver return on investment commensurate with the value that they're investing. So that's the best case.
But it doesn't, it doesn't seem like you're anticipating, in the best case scenario, models that will really be able to, like, dramatically
outperform what we have today? No, I think there will be. I think that if you look at the differences
between, you know, Claude 3 and Claude 3.5, you know, the way that you measure the improvement
is going to become increasingly nuanced. And so today, there is a, in my opinion, misguided
belief that, you know, the king of the hill will basically win. There's going to be a single winner
here. I don't think that's going to be the case because there is so much value in addressing all
of these different use cases.
And so the best performing model today
also has a really great cost profile
for the intelligence that it provides.
That was part of the innovation
between Claude 3 and Claude 3.5.
Now, over time, the intelligence will continue to go up
and there'll be different optionality
within the spectrum
so the customers can find that sweet spot.
That's a very interesting idea.
By the way, has Amazon put all $4 billion into Anthropic now?
I know that there was a promise that that was going to happen or an upper bound.
Yep, we've completed that investment, yeah.
Okay.
So then worst case scenario, what are we, like, let's say everything doesn't live up to expectations.
Like, you must be game planning this out.
Yeah, I mean, what do we end up with in the worst case scenario?
I think the worst case scenario is there's probably two pieces.
One, and this goes back to what we were saying earlier, I think.
The worst case scenario number one is we've just misjudged where we are on the S curve,
and we're actually in the top right-hand corner.
And the capabilities of the core technology, the models, the ability for the models to be
able to work with data at scale, the capabilities to be able to merge those two things
responsibly together, they don't mature and improve at the pace that we expect.
I think that would be a disappointing outcome.
I think it's pretty low probability at this point given the trajectory that we're on, but that could be one.
And the other is, again, going back to something we talked about earlier, is that the readiness of organizations slows down the opportunity to deliver on this technology because they are struggling to manage the change or they're struggling to really drive reinvention through some of their sort of cultural biases.
And so I could imagine that that is playing out.
And I think that's at least as large a challenge for most customers:
the way in which you structure and organize and drive and deliver and measure
how exactly you're going to kind of operationalize from a business perspective
this new technology discovery.
So that's the worst case.
Not every, yeah, not every company reinvents like Amazon.
This is true.
We are uniquely designed for speed,
which makes it an exciting place to work.
Yeah, okay, so on that note,
and I think we'll come bring it home with this one,
Amazon AI guy, I got to ask about Alexa.
I know it's a different division,
but maybe there is some collaboration going on today.
Everything I've heard about the limitations of Alexa
has been that the intelligence within Alexa
is effectively hard-coded in there,
that there's like hundreds or thousands of different queries
that it's prepared to
respond to, based off of like a database that it pulls from. And there's been a question of whether
Amazon is going to move from that style to a more large-language-model-powered Alexa that will
require effectively a rewrite. And so I'm curious if you think that the question is grounded in fact,
and what's going to happen inside the Alexa division of Amazon. Well, look, Alexa is, you know, an
extremely successful personal assistant and has been well received by customers.
We have hundreds of millions of Alexa endpoints out there that customers love to use.
What's really interesting about the future of Alexa is that part of the success of Alexa
has been that the way that Alexa works is that we're very, very accurate at matching the intent of the
user to actioning that intent.
So that may be simple things like telling a joke,
or getting the weather, or it could be more serious things
like smart home use cases.
Now, some of those are turning lights on and off,
but some of them are locking and unlocking doors,
setting burglar alarms and those sorts of things.
And so it's really important, a really important capability
of Alexa is the ability to be able to perform that mapping.
That is very, almost entirely, complementary to
the kind of revolution that we're seeing with large language models today,
which allow us to create a much more natural, much more fluid,
much more human sounding, much more intuitive interface to those intents.
And so that's what we're working on. We're working on marrying the capability of this
remarkable ability to be able to pair an intent to an action with the large language
model interfaces that have become very popular and allow us to kind of unlock entirely new ways for Alexa
to provide assistance for our customers.
And so I think it's a complementary marriage
between the two technologies,
and we're hard at work on that.
Is there an LLM in there today?
Alexa has over a dozen machine learning AI models under the hood,
including large language models.
And is that going to expand the LLM use cases within the device?
Yes. Part of what we're working on
is the ability to be able to take more modern LLMs
that have this very natural, easy, intuitive back and forth.
That's a really important part of building an assistant
and combining that, marrying it with the technical underpinnings,
which allow us to do this intent mapping under the hood very, very accurately.
Now, what's funny, the reason it's complementary is LLMs today
are not very good at doing that intent mapping.
They make mistakes, you need to be able to check them,
all those sorts of things.
And so, you know, LLMs are good at, you know, providing that natural language, that very intuitive interface in ways that is better than Alexa provides today.
And we want to take advantage of that.
But Alexa today also provides a lot of advantages that LLMs are not good at doing today.
And so, yeah, that's part of what we're doing, part of what we're working on.
Yeah.
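The "complementary marriage" described here, exact intent-to-action mapping for high-stakes commands with an LLM handling open-ended conversation, might look like this routing sketch. The intents, actions, and fallback are hypothetical:

```python
# Sketch: deterministic intent-to-action mapping for high-stakes
# commands (locks, alarms), with an LLM fallback for open-ended
# requests. All intents and actions here are hypothetical examples.
INTENTS = {
    "lock the front door": "smart_home.lock(door='front')",
    "set the alarm": "security.arm()",
    "turn off the lights": "smart_home.lights(on=False)",
}

def handle(utterance: str, llm) -> str:
    action = INTENTS.get(utterance.strip().lower())
    if action is not None:
        return f"EXECUTE {action}"       # exact, verified mapping
    return llm(utterance)                # natural-language fallback

# Stand-in LLM so the sketch runs without a real model.
mock_llm = lambda u: f"LLM reply to: {u}"
door = handle("Lock the front door", mock_llm)
chat = handle("Tell me something fun about space", mock_llm)
```

The point of the split is the one made in the conversation: the command that unlocks a door must never depend on a probabilistic generation, while the chit-chat benefits from it.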
And so is it going to require a full rewrite of the stuff under the hood of these assistants?
No, because we want to retain the core capability of Alexa,
which is this intent-to-action mapping.
Okay.
Time frame for that?
Nothing to announce today.
Matt Wood, always great to speak with you.
Thanks for coming on the show.
All right, everybody.
Thank you so much for listening.
We'll be back on Friday,
breaking down the news as usual.
Also, Matt is about to hit the stage
at AWS's New York Summit,
so I'm sure you can find the news
that he's going to be making
shortly after this podcast hits.
All right.
Thank you so much for listening.
And we'll see you next time on Big Technology Podcast.