Big Technology Podcast - Amazon Reveals Its AI Master Plan — With Matt Wood
Episode Date: August 2, 2023. Matt Wood is the VP of product at Amazon Web Services. He joins Big Technology Podcast live at the AWS Summit in New York City for a conversation about Amazon's plan to compete — and thrive — in the AI race. In this conversation, Wood describes Amazon's plan to support all AI models, not just one, and support builders via its computing infrastructure. Tune in for the most extensive comments from Amazon on its AI strategy to date. Stay tuned for the second half where we discuss ethics, company culture, and Jeff Bezos. You can read Alex's Substack story about the conversation here: Inside Amazon’s Low Key Plan To Dominate AI --- Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice. For weekly updates on the show, sign up for the pod newsletter on LinkedIn: https://www.linkedin.com/newsletters/6901970121829801984/ Questions? Feedback? Write to: bigtechnologypodcast@gmail.com
Transcript
An Amazon VP working directly on the company's AI initiatives joins us live from Amazon's
AWS Summit in New York City, all that and more coming up right after this.
LinkedIn Presents.
Welcome to Big Technology Podcast, a show for cool-headed, nuanced conversation of the tech world
and beyond.
And we have a special show for you today.
We are live here at Amazon's AWS Summit at the Javits Center
in New York City, and we are joined by Matt Wood. He's the VP of Product working on AI products
inside AWS. We're going to talk about generative AI. Welcome, Matt.
Thank you so much for having me. It's a pleasure.
So just so folks know, we have a live audience in front of us. I want to make sure that you
all get on the recording. So you're going to let the people at home hear you and you got to be
loud. Let's hear it. That was pretty good.
Solid clap.
A pretty solid clap, yeah.
So, so far so good.
So, Matt, I did my research about where Amazon fits in the generative AI space.
And I looked at, like, the ChatGPT model, and I looked at chips.
Okay, you're dabbling in that area, in those areas.
But I don't know if you're a standout there yet, but where you are really trying to
compete is in this space where companies, the big companies, bring their models inside
AWS, and then anybody that wants to build something with an LLM can build it through your products.
So talk to us a little bit about that initiative, and why did Amazon pick that in particular?
Sure. Number one, I would say we're doing a lot more than dabbling. I think we've got a very
meaningful focus and investment just across the company on generative AI. Myself, I think the rest of the team,
I think like a lot of other people here, probably the light bulb came on when we started, you know, playing with ChatGPT when that first came out.
And we really got excited and inspired by the capability here.
And so what we're trying to do is maybe a little different from what some other folks are doing.
What we want to do is take this technology and make it as broadly available as possible.
There's a lot of these kind of magical, interesting technologies like cloud computing
20 years ago, like machine learning 10 years ago, and now artificial intelligence, that have
traditionally been only available to the very, very largest technology companies, to the biggest
governments and academic agencies.
And so my mission and our approach is that we want to make that as broadly distributed as
possible. We want every builder and everyone to have access to the same capabilities that were
once, you know, very, very limited. And so I think... Okay, wait, let me challenge you on that right
off the bat. I mean, there's a huge open source movement in this area. In fact, Facebook just
released this Llama 2 model, open source, you can use it, customize it. You don't have to pay them
a thing. So there is access. So where is that gap that you're seeing between, you know, the
high barrier and what is available on the market?
Absolutely. So Llama 2 is an excellent, very capable model. But there is a long way to go from having the model weights, which are what comprise the neural network, to actually building out an artificial intelligence system. And just having the model weights is super useful. But it's like having the source code to some software. Yes, it will give you some capability. But you still need to be able to deploy that somewhere. You still need to be able to understand it enough to be able to make changes to it. You need tooling that understands how it works.
So you can actually take it as an engine and put it inside the car that you're building.
And so what we're trying to do at AWS is make it really, really easy to take Llama 2 and other models like it,
whether they're open source or proprietary, and use them as engines inside cars and boats and planes and all sorts of things.
So tell me a little bit about the process.
So Facebook, or Meta, comes to you and says, hey, Matt, we have this cool open source model,
we'd like to make it available to your clients through AWS.
Is that sort of how the process goes?
I mean, pretty much.
And what we do then is we take the model and the weights
and we put it inside a capability, our machine learning service,
that we have called SageMaker.
And SageMaker lets any builder build, train, and deploy
machine learning models.
And Llama 2 is one of several dozen large language models
that are available on SageMaker today.
So that's almost exactly how it happened.
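For builders curious what that flow looks like in practice, here is a minimal sketch using the SageMaker Python SDK's JumpStart interface. The model ID, payload shape, and parameters are illustrative assumptions rather than details from the episode; check the JumpStart catalog for current values.

```python
# Minimal sketch (not from the episode): deploying Llama 2 from the
# SageMaker JumpStart catalog and invoking it. The model ID and payload
# shape below are illustrative assumptions; consult the catalog.
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id="meta-textgeneration-llama-2-7b")

# Deploying provisions a managed endpoint; Llama 2 requires accepting
# Meta's license at deploy time.
predictor = model.deploy(accept_eula=True)

# Common text-generation payload shape for JumpStart LLMs.
response = predictor.predict({
    "inputs": "Explain vector embeddings in one sentence.",
    "parameters": {"max_new_tokens": 128, "temperature": 0.2},
})
print(response)
```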
Right. So SageMaker is your managed service that allows people to build, to shape models. But you have a newer product that's called Bedrock, which allows people to build things like agents, for instance. And they get to, it's very interesting, they get to pick the different model that they want. So talk a little bit about how that works. And I'd really love to hear, you know, about Amazon's progression from SageMaker to Bedrock. Sure. And why that puts you in a better strategic position?
Sure. I think we launched SageMaker in 2017, and it's been very successful; we're very happy with the business. Customers love it. Many of our customers have standardized on SageMaker for their machine learning workloads. But one of the super interesting things about generative AI is that, inherently, because you're not training the models yourself, you're taking models from Amazon and Meta and a whole host of others, Stability AI, and just building on top of them.
it makes machine learning far more accessible.
And so while SageMaker is great at building and training and deploying those models,
we wanted to find a way which gave the maximum leverage of that accessibility.
So you could just, instead of having to figure out how the model worked
and fine-tune your data and all those sorts of things,
just give a prompt, just tell us what, tell the system what you want,
choose the model that you want to run it against.
And we have about half a dozen models,
some from partners and some from Amazon.
And then we just give you the result.
There's no servers.
There's no infrastructure to manage.
You don't have to worry about using data or labeling data or worrying about GPUs or capacity
or any of those things.
Just give us a prompt, choose the model, and we'll give you that output.
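That prompt-in, result-out flow maps to a single API call. Here is a hedged sketch using boto3's Bedrock runtime client as it shipped at general availability, which came after this conversation; the model ID and request body differ per provider and are shown here as placeholders.

```python
import json
import boto3

# Sketch of Bedrock's "give us a prompt, choose the model" flow.
# Request/response shapes differ per model provider; these are placeholders.
bedrock = boto3.client("bedrock-runtime")

resp = bedrock.invoke_model(
    modelId="anthropic.claude-v2",  # swap for any model Bedrock offers
    contentType="application/json",
    accept="application/json",
    body=json.dumps({
        "prompt": "\n\nHuman: Draft a one-paragraph product blurb for a hiking boot.\n\nAssistant:",
        "max_tokens_to_sample": 300,
    }),
)
print(json.loads(resp["body"].read())["completion"])
```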
So it's pretty cool.
Like if you wanted to make a chat bot, for instance, like people think, all right, I want
to make a bot.
I go to OpenAI and I build it with them.
Well, what you could do actually in Amazon's technology is go ahead and build an agent
or a bot, and then pick whether you want OpenAI or Claude, right?
It just depends.
It runs the gamut.
That's the strategic bet for Amazon.
That's right.
And then our approach is to find areas that are really, really valuable
to customers, real problems the customers are trying to solve, and then add capabilities
to Bedrock to make those problems smaller.
So a good example would be a chatbot.
So chatbots, like you may have played with ChatGPT,
they're very capable. They can understand what you're talking about. They have context. You can go back and forth. And they give the appearance of intelligence. But they actually are not very good today at completing complex tasks. And so let's say you wanted to create a retirement plan. You could ask your chatbot, build me a retirement plan. And it would go off and it would build a very kind of reasonable approach to retirement planning.
I'd bet my retirement on it.
I would vet that very carefully.
But it will give you a pretty good strategy, a pretty good starting point.
But it doesn't know about your personal finances.
It doesn't know about the state of the markets.
It doesn't know what products are available.
And so one of the things we're adding to Bedrock
is the ability to be able to provide that information to the language model,
using your own private data inside the applications
that already run in the Amazon cloud,
and to be able to extend the language model's capability
with that data in a couple minutes
so that the model can produce not just a strategy,
but it can actually help you complete a task,
and that's something that hasn't been possible before.
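Mechanically, the simplest version of feeding the model your own private data is retrieval-augmented prompting: look the facts up in your own systems, then put them in the prompt. The helper function below is hypothetical, and this is a sketch of the general pattern rather than of Bedrock's agents feature itself.

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def fetch_account_context(user_id: str) -> str:
    # Hypothetical helper: in a real system this would query your own
    # databases (balances, holdings, products) inside your network.
    return "Balance: $52,000. Holdings: 60% index funds, 40% bonds. Age: 41."

def grounded_answer(user_id: str, question: str) -> str:
    context = fetch_account_context(user_id)
    prompt = (f"\n\nHuman: Answer using only the account data below.\n"
              f"Account data: {context}\nQuestion: {question}\n\nAssistant:")
    resp = bedrock.invoke_model(
        modelId="anthropic.claude-v2",  # placeholder model ID
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"prompt": prompt, "max_tokens_to_sample": 400}),
    )
    return json.loads(resp["body"].read())["completion"]

print(grounded_answer("user-123", "Can I retire at 60?"))
```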
You know, Matt, this all sounds good,
and then I go back to the release of Llama 2,
well, a couple of weeks ago.
And, okay, I saw it was definitely on AWS,
but Azure is the preferred partner there.
So, yes, you're doing this. It's a strategy that makes sense in my mind,
but you're also, your competitors are doing it as well. So where's the distinction there?
I think the distinction is that we have a slightly different approach to some other providers.
We want to broadly democratize this technology. We want to do that because we think that
there's not going to be a single model to rule them all. And so others are talking
about, well, our stated goal is that we want an artificially generally intelligent system.
That is not our stated goal. Our stated goal is that we want to just be very pragmatic,
meet customers where they're at today, and then provide capabilities like agents,
provide the option of different models, and then allow customers to deploy those capabilities
in a way which is very low cost, high availability, and low latency. And that operational
performance is, I think, something that's going to really set folks apart in the future.
Okay, but I have to get back to this Azure example.
I mean, they're doing the same thing.
So are you and Microsoft...
I think what you're basically saying is you're not going to have this field to yourself.
You realize that you're going to come up against competitors doing the same thing.
Or maybe I'm putting words in your mouth.
Well, I think it's safe to say there's going to be a very competitive space for a very, very long time.
The opportunity is enormous.
The capabilities are incredibly early.
And so when you match early capabilities with a huge opportunity, you're naturally going to get a ton of different competition and ideas and thoughts.
And we have our own and who knows if they'll turn out to be right or not.
But our key point of differentiation is to be able to allow builders to be able to build these systems with their own data privately and securely and to be able to leverage the investments that they've already made in that data on AWS,
using completely novel capabilities, in addition to existing models and novel models as well.
So there's a lot of focus on models today, but those models are going to remain important,
but over time there's going to be additional capabilities like agents,
like reinforcement learning, like the ability to be able to understand and vet the responses
that come out of these things, or accuracy, that are going to be as important as the models over time.
One more Microsoft question, if I may.
Sure.
They're so deeply invested in OpenAI and those GPT models.
Does that make you distinct from them?
Like, are there, okay, put me in the seat of a customer who's evaluating these two solutions.
OpenAI and Microsoft got all the buzz in the beginning.
But Microsoft is totally sort of all in on this model.
You have your own models.
We're going to talk about them.
but Amazon seems to have less at stake in terms of making yours work.
So is there a point of differentiation there if I'm a customer?
Like, for instance, is it more neutral?
Like, how do I think about that?
How do they think about that?
Yeah, I think we're probably a little bit more pragmatic.
We're a little bit more neutral.
We have our own models, yes, and we think that they're going to be very capable.
But we also recognize that different models will have different sweet spots.
Some are going to be really good at managing data.
Some are going to be really good at translating languages.
Some are going to be really good at reasoning.
And we expect that most customers are going to want access to not a single model that tries to do it all,
but a range of models that are good at different things.
And I think today we're the only place where you can take data.
And we have customers with exabytes of data.
You'd be surprised how many customers have exabytes of data on AWS.
And they can take that data that they've invested in and they can use it with these models.
to create a net new asset for their organization that is valuable and unique and private.
And you can only do that on AWS.
So I have a note here and it's just like, man, you guys did Alexa first.
So why didn't you end up leading this LLM conversation?
I mean, the fact that like we thought we were going to be talking to intelligent assistants everywhere is kind of an Amazon idea.
I mean, it was an Apple idea, but their execution was bad.
yours was better. You guys understood that we'd want to be talking to computers. And yet, you know,
we just mentioned that you have your own LLM, your own model that's right there for people to access
in Bedrock. Trust me, when I said I was going to be speaking with you, most of the general public
was like, what's their strategy? Where's their model? You have one. So just talk a little bit about
what happened there. Well, I think, number one, we're very happy with Alexa. Alexa is available to
billions of customers, with requests across hundreds of millions of endpoints.
So all listeners with one of those devices in your home, I apologize.
We apologize.
We're triggering the woman, or the man, if you've got it set that way.
So I think we're incredibly proud of the progress we've made with Alexa.
If you look at the way that it has been used in the real world, the number of endpoints
that it's available in, if you'd have told me six, seven years ago that you could
put Alexa in a microwave, put Alexa in a car.
Everywhere.
The microwave is a great idea.
Don't put your echo in the microwave.
No, do not do that, just to be clear.
Alexa, the service can live inside the microwave.
The devices should stay outside of the microwave.
So, yeah, I think, yeah, we're incredibly proud of that.
I think our vision for Alexa that we've always been striving for
has been to provide a truly personal assistant.
Yeah, we talk about this idea of like the inspiration.
of coming from Star Trek
and talking to the computer
and all those sorts of things.
That's great.
But I think our actual long-term vision
is a really personal assistant,
not an all-seeing controlling system.
It's not that Alexa should have been this.
It's that you had visibility
into what this could be
and didn't release a ChatGPT first.
So was that an oversight or...
Well, only one person released ChatGPT first.
Everybody else didn't.
So I think that that...
I think we should take inspiration from that.
It is a fantastic...
probably one of the most remarkable technology demonstrations that I have ever seen.
Calling it a demo.
It's a great research tool.
It's developed beyond that.
There are plugins inside ChatGPT right now that allow you to do more.
Some people have access to plugins.
There's Code Interpreter that does some of the things that your models are working on,
which is like speaking to data that it's not trained on and saying, bring me some results.
Well, Code Interpreter actually just executes code, which is generated by the model.
It doesn't have access to net new information, doesn't have access to your own private information.
And on that point, nobody is putting their private information into ChatGPT.
There are hundreds, maybe thousands of CIOs that are telling their whole organizations not to use ChatGPT.
What's the concern there?
The concern is accidental data exfiltration.
When you're using ChatGPT, the service that exists on the web that we've all played with, whatever you type into that,
is being used to train and improve the models, which makes sense.
As a research tool, that makes a ton of sense.
But if you're an organization and you start to want to reason and understand
or develop your own IP, that IP is no longer differentiated.
It goes into the model and it gets exfiltrated.
And we've seen actual customers see their own IP come back to them from the model.
And that is terrifying to enterprise customers, where the IP is the crown jewels.
And so, it is, I don't mean it in a dismissive way.
It is an excellent technology demo.
It is an excellent research tool from an exceptional research company.
I'm not taking anything away from what they have achieved,
and I think they will continue to do wonderful work.
However, it is not the way that most organizations, in my opinion,
are going to actually build and develop their generative capabilities.
And that's your bet, is to help them build these models.
Correct.
and do it in a, I imagine, privacy-safe way.
I'm making the pitch here, but I'm just trying to figure, like, navigate this conversation
and figure out what you guys are doing.
Yeah.
And that's what it is.
A company comes to you and says, we want to incorporate LLMs in our business.
You help them shape that.
And so they don't end up dropping everything in chat GPT and having that spit out to their competitors.
Correct.
And the models have a similar level of capability, so you're not really losing anything there.
But what we provide is security.
what we provide is privacy.
So none of the information that is used with our Bedrock service is used to improve the underlying models.
In fact, none of that information even leaves your network.
You can see exactly what happens to that data on where it goes.
And then when you want to improve those models and customize them, specialize them for your own use cases internally,
you don't want that specialization to be available to your competitors.
You want it to be private and secure.
And so we make it really, really easy to specialize those models in a way which gives you leverage against your own data and allows you to do that in a way which is completely private.
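For a sense of what specializing a model privately looks like in code, here is a hedged sketch using Bedrock's model-customization API, which shipped after this conversation took place. Every job name, role ARN, model identifier, and S3 path below is a placeholder for illustration.

```python
import boto3

# Control-plane client for Bedrock (distinct from "bedrock-runtime").
bedrock = boto3.client("bedrock")

# All identifiers below are placeholders, not real resources.
bedrock.create_model_customization_job(
    jobName="retirement-assistant-ft-001",
    customModelName="retirement-assistant",
    roleArn="arn:aws:iam::123456789012:role/BedrockFineTuneRole",
    baseModelIdentifier="amazon.titan-text-express-v1",
    # Training data stays in your own S3 bucket; it is not used to
    # improve the base model for anyone else.
    trainingDataConfig={"s3Uri": "s3://my-private-bucket/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-private-bucket/output/"},
    hyperParameters={"epochCount": "2"},
)
```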
What's the business opportunity there?
Because let me just tell you, Amazon, I'm sorry, Andreessen Horowitz, I just read they think that cloud, on top of AI, the opportunity is 10 to 20% of this generative AI spend.
That sounds small.
What do you think?
I think that two things.
if you look at the broader opportunity, I think that this technology is as transformational
as the very earliest internet and the web browsers that allowed us to access it.
And it was that early internet and those early web browsers that gave inspiration and
motivation and growth to companies like Amazon and Netflix and Airbnb.
And I think that there is going to be a wave of similarly Amazon-sized companies that evolve
out of the generative AI opportunity generally.
And so I think we're going to see multiple Amazon-sized organizations
develop and grow over the next, who knows?
20 years, 10 years, 10 weeks, anything seems possible.
And you're content with 10 to 20% of that next boom?
I don't know if that's true.
It seems very low to me,
but I wouldn't be at all surprised if just the AI part of our cloud computing business
was larger than the rest of AWS combined in a couple years.
Okay.
No consumer product coming out from you guys.
Nothing to announce today, that's for sure.
Big smile.
Why not do it?
Do what?
A chat GPT style search.
I mean, we don't really operate in the search space.
We don't have a deep investment in web search.
I mean, you have voice computing.
It doesn't have to be search.
Right.
I'm just... I think you can
expect, for sure, to see new invention and innovation coming from Alexa and devices and our retail
stores and our ads business. There isn't a team that I've spoken to at Amazon in the past
six to ten months that isn't really focused on understanding this technology and where it can
be applied to their own business inside the company. Let me ask you this. Microsoft recently said
that they have 11,000 customers who are using their OpenAI software building service.
Do you think, how important is it for a company to establish a lead in this moment?
And can you sit here and tell me with a straight face that Amazon is ahead of Microsoft?
Well, number one, it is incredibly early.
And we are three steps into a marathon race.
And I don't think anybody, without a smile on their face, could call a winner three
steps into a marathon race. But Amazon had this amazing moment where it got ahead. Now, it wasn't
everybody going at the same time, and you were very early in AWS, so you know this, but
AWS kind of ran away with the cloud computing field or cloud services field until other
companies started to figure it out. And by establishing itself so early, built that market
dominance. But I'm curious if you think we're going to see something like that in
this moment or it just doesn't apply? I think that there is going to be multiple very credible
options for builders. And I am strongly convinced that AWS, if it is not the leader after
today's announcements, which I would say it is, you could argue that it would be. But even if you take
that as read, I think it's going to be hard to argue that, by the end of the year, you know,
AWS isn't in the top one or two providers.
Right. Okay. Let's talk, you might enjoy the segment a little more. Let's talk a little bit about...
I enjoy that, quite a lot, but... Okay, good. Well, actually, I'm going to go back to my page of difficult questions.
Let's talk a little bit about building, right? We're in front of a room of developers, at least I hope, or otherwise really very serious Big Technology fans. So, thank you for showing up.
They look like builders. I mean that with the greatest respect.
Yeah, it's great. So, talk practically
about what people have already done with your technology.
I mean, you had an announcement today that's kind of interesting about agents, right?
They can build agents.
So I'd like to hear a little bit more about like the practical level of,
maybe you can go step by step of like what, and briefly, but like what people would build with the AWS services.
Sure.
I think the ones that I've seen that are the most compelling, number one,
just generative responses. So the sort of blog posts, advertising copy, you know, 3D meshes,
those sorts of things, where you're an expert and you just want a starting point.
You just want, instead of starting with an empty word document, just give me a first pass and let me iterate on it.
It's way easier, a huge time saver. You do that all day long, very, very popular.
The next area, which is less sexy, but in my opinion may be an even larger opportunity, is using
this technology to improve search results, improve ranking, relevance, personalization,
those sorts of use cases where you don't even know that you're working with a large language
model. It's all in the background. But they are remarkable at boosting the accuracy of those
sorts of results. Then you've got knowledge discovery. So that's the sort of chatbot example.
And the one that I'm most excited about is collaborative problem solving. So working with, this is a bit more
science fiction, but I think we've materially advanced the state of the art this morning with our
agents announcement, where you are able to, as an individual or another artificially intelligent
system, interact with an artificially intelligent system to solve complex problems.
That is a very interesting area. Talk about what that means.
Well, it means that imagine you've got any sort of business problem that you can imagine.
Super simple. I've got $1,000. I want to turn it into
$2,000. How do I do that? You may have a set of artificially intelligent capabilities
that will help advise you as to how to turn that $1,000 into $2,000. And you can interact
with them one by one and build the strategy yourself. Or you can have them operate as a swarm
of agents collaborating with themselves in order to be able to build the best possible strategy
and form, like, a to-do list.
And for each item on the to-do list,
they can recommend the specific tasks
that you need to go do
in order to be able to complete that.
So similar to, like, the BabyAGIs?
That sort of idea, exactly.
Yeah, that auto-GPT approach
of using large language models
to rationalize with other large language models.
That doesn't freak you out a bit?
I don't think it freaks me out.
No, I think we've seen tremendous opportunity,
but here's why it doesn't freak me out.
because it works best in highly constrained domains, where you put so many constraints around what it is
you're trying to solve that, of all the agents, none of them are running amok, none of them
are running off and doing things you didn't tell them to do. You put the constraints on,
and constraints are probably the single largest force that we have to improve the capabilities of
these LLMs. Give a practical example. A practical example would be, in my $1,000 to $2,000 example,
you constrain it to a set of markets, you constrain it to a set of stocks, you constrain it to a set of
financial products, you constrain it to a set of operations, buy and sell operations, that you
would do in a particular period of time. And every layer of constraint that you add reduces the
chance that the language model will just create a spurious, erroneous output, but also just keeps the
whole thing grounded and keeps the whole thing focused. And when we've looked at these,
these approaches inside the company, you know, they end up kind of acting like humans. Like,
they argue. And sometimes they get stuck in a loop and you need to go in there and intervene.
And other times, you need a tiebreaker because there's two equally good ideas and two equally
good options. And you need some, you need another agent or person to come in and tiebreak.
And so it's very interesting watching these very tightly constrained ants run around
trying to build things on your behalf. And you think this is going to be useful?
Absolutely, because it drives levels of automation for solving complex problems, either completely
automatically in cases where you would want that, or in tandem with one or more people where you
would want that.
Yep.
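As a toy illustration of the dynamics just described, a constrained action set, a check for agents stuck in a loop, and an explicit tiebreaker, here is a model-free sketch. It involves no LLM at all and every name in it is hypothetical; it only shows the control flow.

```python
import random

# Toy, model-free sketch of the swarm dynamics described above: agents may
# only propose actions from a constrained set, a repeat of the previous
# round counts as being "stuck in a loop," and ties get broken explicitly.
ALLOWED_ACTIONS = ["buy_index_fund", "buy_bonds", "rebalance", "hold"]

def propose(rng: random.Random) -> tuple[str, ...]:
    plan = tuple(rng.sample(ALLOWED_ACTIONS, k=2))
    assert all(a in ALLOWED_ACTIONS for a in plan)  # enforce the constraint
    return plan

def run_swarm(num_agents: int = 3, max_rounds: int = 10, seed: int = 7):
    rng = random.Random(seed)
    previous = None
    for round_no in range(max_rounds):
        plans = [propose(rng) for _ in range(num_agents)]
        votes = {p: plans.count(p) for p in plans}
        top = max(votes.values())
        leaders = [p for p, v in votes.items() if v == top]
        if len(leaders) == 1:
            return round_no, leaders[0]            # consensus reached
        if plans == previous:
            return round_no, rng.choice(leaders)   # stuck in a loop: tiebreak
        previous = plans                           # "argue" another round
    return max_rounds, None                        # human intervention needed

print(run_swarm())
```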
Let's talk a little bit more.
You said there's some chat applications.
You helped Bloomberg work on BloombergGPT, which is their chatbot that queries financial data.
So talk a little bit about that process.
Like, is that the same?
Are they coming in and saying, we're going to pick our own
model, but we're going to use your software to fill in the blanks? Yeah, that's right. So Bloomberg has,
like a lot of customers, actually, they have huge amounts of text information. And so they were
able to take all of that text data, their natural language data from market reports and analyst reports
and everything that they've used, everything they've accumulated over however long Bloomberg
has been operating, 50 years, I don't know. They're able to take all of that and then
load it into the cloud on AWS and then use a machine learning algorithm to build their own
chatbot, which is BloombergGPT. And they ran all of that inside our cloud computing
infrastructure. Interesting. So where does Amazon get paid in that loop? We get paid by providing
compute capacity to actually do the model training and for providing the access to the large
amounts of storage that are needed to store the data and then get that data into the
machine learning models. And so we provide that capability on a pay-as-you-go basis, as if it was a
utility. And so you only pay for what you use, and we meter it by the second. So for one second,
you pay us a certain amount. Yeah, so this sounds pretty expensive to me. I'm curious, I mean,
and it does seem like it's the province of companies that have more money. I mean,
the fact that we're talking about Bloomberg, I mean, is sort of, okay, that's sort of indicative of
the companies with the resources to build these models. So tell us a little bit about
the cost factor and, you know, who can actually afford this stuff? Yeah, I think training net new
models is not going to be very common. It's very complicated. It is expensive to your point. You need
a lot of compute capacity, a lot of data, a lot of expertise. Some folks that have differentiation
in one of those three things will want to invest there. I think it makes sense. But the vast
majority will not want to invest there. Now that said, we want to, from the Amazon side,
make it as cheap and easy as possible to train those models in the first place. And so we've
been investing in custom silicon in order to be able to accelerate that process with chips
that are specifically designed and built for large-scale machine learning training. And then once
you've got the model, you want to be able to operate it at as low a cost as possible. And whilst a lot of
focus is put on training, if you think about it, you may train a model once a month,
once a week, let's say. But you're going to be running predictions and inference and chatting
with that model hundreds, thousands, tens of thousands of times a day. And so if you're not
careful, actually the vast majority of that cost isn't in the training, although it can be
expensive. It is in the operationalizing of the model for actually doing the chat. And that's why
we have a second chip, which is specifically designed for low-cost, low-latency inference:
Inferentia. And I'm paying for the compute on that. What do you think the cheapest is for someone who
wants to build their own bot with AWS? Like, what's the entry level price? If you use an existing
foundational model? Oh, you know, 10 cents. Really? Just to build it? Ten cents to build it,
and then the cost of running it is priced per token. Yeah.
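To make that pricing shape concrete, here is a back-of-the-envelope sketch. Every rate below is a made-up placeholder, not an AWS price; real Bedrock rates vary by model, region, and input versus output tokens.

```python
# Hypothetical back-of-the-envelope; all rates are placeholders, not AWS prices.
build_cost = 0.10                    # the "ten cents to build it" figure

price_per_1k_input_tokens = 0.008    # assumed rate
price_per_1k_output_tokens = 0.024   # assumed rate

requests_per_day = 10_000
avg_input_tokens, avg_output_tokens = 500, 200

daily_inference = requests_per_day * (
    avg_input_tokens / 1000 * price_per_1k_input_tokens
    + avg_output_tokens / 1000 * price_per_1k_output_tokens
)
print(f"one-time build: ${build_cost:.2f}")
print(f"daily inference at these assumed rates: ${daily_inference:,.2f}")  # $88.00
```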
I was speaking to Mark Shmulik from Bernstein, he's a financial analyst, he follows Amazon closely. And I was like, Mark,
what should I ask? And he said, well, look, this is going to cost a lot of money. Microsoft
just said they're making something like a $3 billion infrastructure investment
in this. Is Amazon thinking about making an investment anywhere in that range, or has it already?
Well, I don't think we're going to release the size of the investment, but when you're thinking
about the size and scope of possible investments, they don't get much larger
than custom building and fabricating chips.
And so that is a huge investment
we've been making at AWS for nearly a decade now.
You know, we're on our second generation of our inference chips.
We're on our first generation of our training chips.
We're going to keep investing in those.
And we're going to see, I don't think we're anywhere near the point of diminishing returns
in terms of the capability and price performance improvements that we can provide through
those chips.
And so I'd say, I don't know that the raw number is actually all that
interesting. What's more interesting is: what's the outcome? And is that outcome truly benefiting
this broad democratization that we're seeing? Okay. So you've mentioned chips. We've talked about
your own model. I want to take a break quickly and then come back to talk about those two things.
And I have another post-it that I'm holding with me. And the headline is fun. So stay tuned.
I'll be back right after this.
Hey, everyone. Let me tell you about the Hustle Daily Show, a podcast filled with business, tech news,
and original stories to keep you in the loop on what's
trending. More than 2 million professionals read The Hustle's daily email for its
irreverent and informative takes on business and tech news. Now, they have a daily podcast called
The Hustle Daily Show, where their team of writers break down the biggest business headlines
in 15 minutes or less and explain why you should care about them. So, search for The Hustle Daily
Show in your favorite podcast app, like the one you're using right now.
And we're back here with Matt Wood.
He's a VP of product at AWS focused on AI.
So, mention that you have your own models, your own LLMs.
And that's actually something that's available if people want to build within Bedrock.
They can pick, it's Titan.
Or they can pick something from OpenAI or Llama.
Anthropic.
Or AI21 Labs.
We added Cohere this morning.
So why build your own?
I mean, it seems so good up until the point where you start building your own model.
And now all of a sudden, you're running into the same problem that we talked about earlier that Microsoft has.
We're like, if I'm building something, you know, I don't know if Amazon really is neutral.
So talk a little bit about why you built your own one.
And Shmulik looks at the same thing, too.
He's like, you don't need an 81st model.
So why did Amazon build it?
Well, on the 81st model, I think you do need an 81st model right now.
It would be completely arbitrary to decide right now, at this point in time,
that we need to limit the model or that we've got enough.
There is so much opportunity.
It is so early.
I think there's going to be no end of invention in the foundational models going forward.
So that said, that's why we took our approach of making all of them available,
because who knows which one is going to have a breakout capability.
who knows which one is going to be the best fit for a particular use case.
Our approach has been that we've trained
a set of our own foundational models,
we have a language model,
and we have a vector embedding model,
and each of those models is actually a family of models.
So the customers can choose the right model for their use case,
not just for the capability,
but also for the latency,
and also for the price.
And so you may have a use case for a very,
very simple, small model that you want to operate with very, very low latency. And so that's an
option that you have with Titan. You don't have that option with some of these extraordinarily
large models, particularly that are hosted in who knows where, where the latency is just what
you get. With Titan, you can choose the right trade-off for capability and latency. You can choose
the right trade-off for capability and price. And for each of those models, you can add your own
data to the model to improve it privately.
Yeah. Now, I'm going to get to chips, but it's always at this point in the conversation, a little bit more than halfway in, that a thought pops up in my head, which is: we've been talking about generative AI, wanting to talk to computers, as a given, as if we actually want to interact with them in natural language. We want to have them as chatbots. We want to talk to them. But even the Alexa example shows that there originally was this whole range of things we wanted to do, and then the use narrowed.
Actually, let me ask you: are you using LLMs, like ChatGPT and consumer bots, as much now as you were at the beginning?
I use, we have an internal tool that we actually built for ourselves, primarily so that our engineers and our own builders could get familiarity and practice with prompt engineering.
And so it's super simple, internal tool.
What's it called?
It's called the LLM playground.
And it is literally a playground.
You can build little one-page mini-apps
where you can provide a prompt
and you can chain that prompt
into another prompt
and you can play with the parameters
and the different models
and then you can arrange the widgets on the screen
to build out little applications.
You can build a chat application that way.
You can provide it a URL
and it will fetch the website
and then use that as the part of the context
for the prompt.
So you can reason and ask questions
about the website using this stuff.
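The pattern behind that internal playground, chaining one prompt into the next and stuffing a fetched page into the context, is easy to sketch. Everything below, including the model ID and URL, is an illustrative assumption, not a description of Amazon's actual tool.

```python
import json
import urllib.request
import boto3

bedrock = boto3.client("bedrock-runtime")

def ask(prompt: str) -> str:
    # One playground "widget": prompt in, completion out.
    resp = bedrock.invoke_model(
        modelId="anthropic.claude-v2",  # placeholder; any Bedrock text model
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
                         "max_tokens_to_sample": 300}),
    )
    return json.loads(resp["body"].read())["completion"]

# Fetch a page and use it as context for the first prompt...
page = urllib.request.urlopen("https://example.com").read().decode()[:4000]
summary = ask(f"Summarize this page:\n{page}")

# ...then chain that output into a second prompt.
questions = ask(f"Given this summary, list three follow-up questions:\n{summary}")
print(questions)
```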
So, okay, that's great. That's good.
It's the most fun I have all week, honestly.
Really? Yeah. How many hours do you spend doing it? How many hours? I don't know. I probably
spend at least 30 minutes a day. Oh, wow. Okay. Just exploring what the team is identifying as kind
of emerging capabilities. Okay, so now that I've effectively sabotaged my own question for like the last
minute and a half, I'm going to ask it. Hit it. Which is, is this something that people actually want
to do? Like, do they want to talk to computers? I mean, it sounds good. It's really, it's really,
freaking cool when you use it. But even now, people are saying ChatGPT is getting
dumber. Largely people believe that's because the novelty has worn off. So we talk about
this generative AI moment that you're saying it's going to be bigger than the internet.
What makes you so convinced that this time is for real? What makes me so convinced is that
the level of invention and efficiency and automation that we've seen inside the company
and that our customers are experiencing, chat being just one modality.
So what else do we have?
Well, we have the other ones that I mentioned earlier, like the generative pieces,
the search pieces, but the collaborative problem solving pieces,
the automation pieces, completing complex tasks.
That, I think, is where the majority of the value is going to be.
And I think that chat is a great user interface.
It's a great way to explore some knowledge and a domain.
It's great for new users that are getting up to speed on a particular product.
all of that. So it's a really useful use case, but it is very hard for me to imagine that we
nailed the use case first time right out of the gate with chat. I think that there is going to be
all manner of model improvements and supporting engine improvements that allow us to deliver
customer experiences that we just haven't even imagined yet. And we are doing that imagination
and going through that process inside the company now, like a lot of our customers at
AWS, and it is inspiring. Anything cool from inside Amazon you can share? I think some of the early
stuff that we're looking at, which we announced today, is around generative BI. So being able to...
Business intelligence. Thank you. To be able to ask and interact with your data just
using a chat interface, one. But also to be able to create dashboards, two. To be able to
understand and find insights, number three. And then number four, when you've found those insights,
to be able to quickly just summarize your narrative, immediately create, like, a business report
that includes all of the summaries and all of the reports and charts that you might need,
and then email that around to your colleagues.
Like, if you imagine the level of what you would have to have done before these capabilities
were available, to be able to enable that, you would have needed to have teams of business analysts
to connect to the data.
They would have had to spend time, you know, investigating what to build, and then building
the dashboard and setting it up.
and then you have to train everybody in order to be able to do the work with the data,
and then you have to find the insights, which is very, very difficult.
It just shortens that whole path to discovery through automation in a way which is unprecedented.
So who's learning from who here, and please don't say both.
Is it AWS learning from the rest of the workforce inside Amazon, or people inside Amazon
learning from AWS?
I think it's honestly true that, I mean, Amazon is a very big company. I think we're taking
inspiration and we're kind of organizing ourselves internally, deliberately to take inspiration
where we find it. And so number one, we're enabling all of our builders, all of our software
engineers in which we have a pretty large number, to be able to experiment and try out large
language models through bedrock. So everybody has that capability.
And then we're finding ways that they can show their thinking and their invention to each other with something as simple as a demo day.
So internally, we have multiple different teams, not a lot of them, but multiple different teams, and they will proactively reach out and we have a schedule of people that are bringing their demos.
And sometimes it's just slideware.
Sometimes it's an idea, but more often than not, it is running software that we can take a look at, and they share their thought process and their implementation
techniques, and that gets everybody else excited. And then the next iteration, we're building on top
of that, and round and round it goes. And so those kinds of idea nucleation points across the
company have proven to be very inspiring to our developers. And number two, enables us to share
our knowledge and our discovery and our thinking very, very broadly. And number three, honestly,
prevented us from building the same thing twice. Let's say we've got a thousand development
teams. If you start them all off from the start line at the same time, a lot of them are going to
come up with the same idea.
So we've avoided that as well.
Let's talk briefly about chips.
Sure.
It's interesting because I think that there's minimal awareness that Amazon has its own LLM in Titan.
And there's even, I think, less awareness in the general public.
I'm not talking about the folks sitting here.
I'm sure they all know about it.
But we hear so much about Nvidia chips.
I swear, I feel like every day I hear Nvidia, like it's just a marching, you know, chant in my
head: Nvidia chips, Nvidia chips. But you have your own chips. How are you making them,
and are they serving the same purpose or something slightly different? Well, we acquired a...
This is all chips for training AI? That's right. Yeah, that's right. We acquired a chip
design company called Annapurna probably about 10 years ago now. And since then, we've been on a path
to build Arm processors for general purpose computing. We have
Arm processors specifically for high-performance computing, and to find very specific, not a large
number, but very specific use cases that we could accelerate in silicon.
And the machine learning use cases very quickly rose to the top.
And so we've been investing there in terms of building out custom silicon that you can deploy
on AWS today for building out your own large language models and for running the inference
against the large language models.
And you design your own chips as well?
That's right, yeah.
And it's not just Arm that builds it, right?
Arm's just a blueprint.
It's a starting point.
But with Trainium and with Inferentia, those are totally custom designed.
Who's building it?
I think they're constructed in Asia somewhere.
I don't exactly know where.
Taiwan?
Most likely, yeah.
Okay.
Let's go to the fun post-it because I feel like people are getting restless.
Okay.
Now I'm nervous.
No, it's all good.
I think a fun post-it for you is a nervous-inducing post-it for you.
That is how it should be.
Oh, ethics. All right. So you have your own LLM, Titan. What was it trained on? And how can I be sure, as a writer, that it wasn't trained on my work?
Titan was trained on publicly available data. And that's a very squishy phrase.
It's very precise. Okay. And proprietary data that we had licensed specifically for the purpose of training.
Okay. So can you say definitively that
there's no chance that this model was trained on, like, for instance, Substack articles?
If they're publicly available, there's a good chance that they were part of the web crawl that we would have used.
If they were part of or had been licensed to a large proprietary set of natural language,
we would have licensed that with the right permissions to be able to use them for training.
So my Substack stories are publicly available.
They're available on the internet.
I guess I didn't really opt in for them to be used as part of training.
Should I have the ability to decide whether or not they're going to be part of LLM training or not, even though they are live on the web?
It seems like, I mean, it's your content, you own it, you can do as you please, but I think it seems like a strange use case to identify and single out.
Who's to say what this data can be used for when it's publicly available?
You know, you chose to make it publicly available.
If you want to put some permissions around it, you want to take it private,
it's totally up to you. You still own the data. But it is public. And as a result, it can be
used for things that you may not have thought through initially or that weren't possible
early on. Yeah. And I'm not sitting here and saying, you know, how dare you take it? I don't know if you
have. But it is interesting. It's a trade-off. It's something that people who are producing content
have to think through. We always thought it would just be Google, for instance, that crawled our stuff.
But clearly it's going to be more. And so... Yeah, I mean, the open web
is available as open data, for example; you can just go and look at what's in there.
It's maintained and kept up to date. It's called the Common Crawl. You can check that out.
Sergey Brin is back inside Alphabet. He's called generative AI something like the most
exciting technology or moment of his entire life. Jeff Bezos, do you know what his feeling is
about this stuff?
I do. I think he feels
that it is... I wouldn't want to speak on his behalf, of course, but, you know, I think he feels the
same, though. This is the single largest transformational step in how we interact with data and
information and each other, you know, since the very earliest web browsers. And so I think
I probably stole that from him at some point. How do you think you would feel if people
inside Amazon started using generative AI for their six-pagers? That ship has sailed, I can tell you.
People are doing it? For sure, yeah.
Of course.
Okay, but hold on.
It's a starting point.
Wait a second, because the whole point, if I have it right from Bezos, is that when you write
something, you have to think it through super deeply and make sure every idea connects one to one.
If you turn that over to AI, you're not really going through the process.
Well, I don't think that's true because what you're getting back is just a first draft.
And so that's actually a really good way to explore your idea.
You can get a gut check as to whether your idea kind of tracks, whether it's got legs,
and you can start to poke and prod at the idea, and all of our ideas get better
through that poking and prodding and the discussions that we have around those ideas.
And so to be able to do more of that early on, you actually front load a lot of the product
development work and you can do some of that with your team.
You can do some of it on your own, do some of it with an LLM.
I think it makes perfect sense.
It's a huge efficiency gain.
When you read one of these six pages written with an LLM,
can you tell that it's been involved in the process?
I don't know if I've read one that was completely written autonomously with no edits.
But even with a little bit of help.
I think, for sure, that I have read paragraphs,
maybe even pages, that were automatically generated, with probably some pretty heavy editing,
that I did not notice.
That's good.
It's very encouraging.
Two more for you.
Okay.
Amazon's culture.
The whole point, I mean, I wrote a book, it's called Always Day One.
So the point of the book is that the company operates as if it's a startup on its first day.
And the culture has been extremely intentionally built that way by Bezos.
And there was a story recently about how Amazon has more of a big company feel lately.
And you even have Adam from AWS, I want to make sure I get the language right, saying, you know, we're going to be insurgents.
And you only say this, we're going to be insurgents, when you feel like you need that rallying cry.
What's the story?
I see your theory.
It's an interesting theory.
I personally have not seen any, well, number one, I don't really know what big company stuff looks like.
I've only ever really worked in Amazon, and so I've been here for a long time.
I'm pretty well entrenched in the culture, but I haven't seen many elements of big company
slowness.
I haven't seen many elements of big company politics.
I haven't seen many elements of big company lack of frugality or wastage.
And so I think, I'm sure there's many more that you could list off that would be qualities
of big companyness that would be negative.
What I have seen is a continued focus on working backwards from the customer.
And what I have seen is a continued focus on scrappiness.
And a continued focus on doing what we need to do in order to be able to solve real problems on behalf of customers.
I think you'll see that in our approach to generative AI.
You'll see it in our approach to analytics and satellites and all sorts of things.
Yeah, because, I mean, you really need that sort of scrappiness if you're going to be able to compete.
I mean, I feel stupid even saying this out loud to someone who's worked at Amazon for as long as you have.
But this is going to be a fight, man.
It's going to be a very, very interesting time for sure.
And, you know, I would also say that the sort of cultural norms that we have, they don't exist and they're not maintained without some energy and without some effort.
And anyone that has had a conference call with me from my own
office will have seen that behind me on my office wall, I fly a pirate flag, partly an homage to
Steve Jobs and the early... You have the pirate flag? Okay. When I was writing the book, I heard about this
pirate mentality. Yes. And I could never nail it down. Talk a little bit about it. Yeah,
this is good. It's part homage to Steve Jobs and the early Mac team. Very early day one,
driving tons of transformation. I'm a huge fan of Apple, huge fan of the Mac. I've used Macs
all my life. So part of it is an homage to that. But part of it is,
in periods of discontinuous change, you just can't operate like a big supertanker. You've got to operate
like a small merry band of pirates that are just cruising and adventuring around every cove that you
can think about and just staying scrappy and nimble. And so for my part, such as it is,
I fly the flag in my office as a reminder to myself and any of the teams that I'm working with
that this is a period of discontinuous change. And this is a time in which we need to be scrappy,
and resilient and explorers and missionaries.
And the people are listening?
So far, so good.
People seem to like my flag.
All right.
This is not the last one,
but I have to ask you about this.
The news is that the FTC is gearing up to bring a lawsuit to break up Amazon.
Obviously it hasn't happened yet.
It's all speculation.
But it seems like it will.
And it's going to be like potentially the biggest government action
against a U.S. company since Microsoft, maybe even Ma Bell.
So do you think about that at all?
Is it even something that you pay attention to?
That one is above my pay grade.
Okay.
Last question for you.
You know, you have a very interesting position within Amazon because you're like really
working industry by industry and helping them imagine how they're going to transform
with the latest technology.
But we're in the middle of this really unbelievable moment in technology where we're starting to really get a chance to imagine things we couldn't before.
One example, you guys have released a medical note-taking generative AI application, health scribe.
So I'm a son of a foot doctor, and my dad spent too big a chunk of his life writing notes.
And just think about all the hours like he could have had back.
I went to med school before I joined Amazon.
And it's just, think about how much better care you can provide to patients if you're actually focused on that versus doing these things that generative AI applications can do.
So why don't you take us like on a top three interesting things in different industries that you could imagine generative AI having a real impact?
And I think that's the medical one is interesting.
That's a really good one.
What else? Where are some other examples that we're just not looking at yet?
Well, number one, I am sure that for everything that I'm going to touch on,
there is a startup or even a large organization
that's already working on it, and they'll probably be getting ready to ship as we speak.
There's just so much investment and activity happening in this area.
I think that a couple that spring to mind, the first one is cybersecurity.
There seems to be such an opportunity to employ these language models in the identification
of the very subtle signals that have become harder and harder to identify,
which indicate some sort of vulnerability or threat.
And so being able to identify those threats across multiple different sources with better
precision is going to be better for everybody.
I think that's one.
It's not really an industry, but I think it's one that's going to be important.
That counts.
That still counts.
Okay, good. I think another one is just going to be developer productivity, like code generation.
We didn't really talk about that yet. Such a large accelerant.
Whenever I talk about that with CEOs, they're like, yeah, we're doing it, but they haven't
really seen the productivity increase. Now, I know you have seen it internally, but.
We for sure have seen it internally, and we've heard from our customers.
There's definitely... Yeah, it's also early, and I wouldn't be at all surprised if customers
are still trialing it out and getting a sense for it. Developers are very used to a
particular workflow and changes to that workflow can take time to get right. And they should be
thoughtful about it. But I think that's another one that's really going to drive change. And then
just more generally, I think any industry that has access to very, very large volumes of text
is going to be the first places that we see this sort of change. And what's interesting about that
is there's some areas like healthcare that are steeped in natural language, but they don't
usually have, like, the best reputation for being at the vanguard of technology adoption, yet
we're seeing so much interest and so much excitement.
Healthcare.
Lawyers.
Legal, healthcare, life sciences, clinical trials, drug discovery, all these areas, plus financial
services, insurance, like the oldest, stodgiest industries that you can imagine, they've
got so much natural language.
It's such a large opportunity.
I think that's where we'll see the earliest returns potentially on generative AI.
Matt, can you believe this?
I mean, what a great crowd.
I love these guys.
So for listeners at home, what we're looking at is a silent disco type of conversation
where everybody is wearing headphones.
We're not even using any amplified sound at all.
And I've just been looking at this crowd as we've gone,
and they've sat here and hung on every word.
So thanks to you guys.
Thank you so much.
Thank you for being here.
And I'm going to thank Matt in a second, but I'd be remiss if I didn't mention that the
Big Technology Podcast airs every Wednesday and Friday: Wednesday a flagship interview
like this, Friday we cover the news.
All right, everybody, thank you so much.
Thank you to you.
Thank you to Matt.
Thank you.
And I hope you enjoy the rest of your day.
Thanks.