Everyday AI Podcast – An AI and ChatGPT Podcast - EP 244: Accelerate your GenAI journey with AWS

Episode Date: April 5, 2024

If you're a fast-growing company needing to leverage all that Generative AI has to offer, the last thing you want is a vendor lock-in. Or, finding out the AI solution you thought you needed doesn...'t actually have the juice you need. One solution? Amazon Web Services (AWS). We sit down and talk all things AI models with Shruti Koparkar, Product Marketing Lead, AI/ML Acceleration at AWS. Newsletter: Sign up for our free daily newsletterMore on this Episode: Episode PageJoin the discussion: Ask Jordan and Shruti questions on GenAI and AWSRelated Episodes:Ep 238: WWT’s Jim Kavanaugh Gives GenAI Blueprint for BusinessesEp 232: Creating and Capturing Business Value with GenAI – Insights From HPEUpcoming Episodes: Check out the upcoming Everyday AI Livestream lineupWebsite: YourEverydayAI.comEmail The Show: info@youreverydayai.comConnect with Jordan on LinkedInTimestamps:01:20 About Shruti Koparkar and AWS02:50 More on AWS and Generative AI09:54 X-rated computing used to fine-tune models. 12:46 Adobe focused on AI tools, using AWS.17:50 AI evolution and AWS preparedness for influx.21:01 Experienced ARM engineer excited about NVIDIA's potential.24:08 Identify use cases for small-scale applications.Topics Covered in This Episode:1. AWS and its Role in Generative AI2. AWS and Foundation Models2. AWS's Involvement with Companies Implementing Generative AI4. Future Preparations of AWS for Generative AI DevelopmentsKeywords:generative AI, industry insider, Everyday AI, podcast, livestream, newsletter, AWS, Shruti Koparkar, product marketing, x rated computing, NVIDIA GPUs, Amazon Web Services, cloud computing, foundation models, Amazon Bedrock, API, Amazon SageMaker, customizing models, data security, ChatGPT, Perplexity, Amazon Code Whisperer, Adobe, ARM, Leonardo dot ai, apps, NVIDIA GTC conference, accelerated computingSend Everyday AI and Jordan a text message. (We can't reply back unless you leave contact info) Start Here ▶️Not sure where to start when it comes to AI? Start with our Start Here Series. You can listen to the first drop -- Episode 691 -- or get free access to our Inner Cricle community and all episodes: StartHereSeries.com Also, here's a link to the entire series on a Spotify playlist. 

Transcript
Discussion (0)
Starting point is 00:00:00 This is the Everyday AI Show, the Everyday Podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. Meet Firefly AI Assistant, now live and Adobe Firefly, the All In One Creative AI Studio. Just describe what you want to create and the assistant handles the rest, orchestrating multi-step workflows across Photoshop, Premiere Express, and more in one conversational interface. You direct the outcome. The assistant accelerates execution. So many businesses are trying to figure out how they can accelerate their journey with generative AI.
Starting point is 00:00:52 And you can look in all different directions. And sometimes the more you look, the more confusing it might be. It seems like there's so much information, so many new pieces of software and services that we should be looking at. So today, we're going to be talking with an industry insider on the correct. way and one of the best ways that I think that you can accelerate your generative AI journey. Thank you for joining us. My name is Jordan Wilson and I'm the host of Everyday AI, where your daily live stream, podcast, and free daily newsletter helping everyday people like you and me, not just learn generative AI, but how we can all actually leverage it. And if you are joining us on the podcast,
Starting point is 00:01:30 thank you, as always, make sure to check out your show notes for more information. And if you're joining us live, you probably see something different. Yes, we are live here in person at InVidia's GTC. conference where our guest today, AWS, is one of the big, big partner booths. I can see it from here. So without further ado, very excited to introduce our guest for today. Shruti Kuparker is the senior product manager at AWS. Shruti, thank you for joining us. Thank you. Thank you for having me, Jordan. All right. Can you tell us a little bit about what your role at AWS kind of is made up up. Yeah. So my role is leading product marketing for XRated Computing at AWS. And that means basically helping our customers explore, evaluate, and adopt XRated Computing solutions powered by
Starting point is 00:02:18 Nvidia GPUs as an example to help power their AIML applications, their graphics applications, their high-performance computing applications. Yeah. And so that's my role at AWS. And for your listeners, and, you know, hopefully a lot of them know about who AWS is, but AWS is Amazon Web Services. And so in simple terms, we're basically the cloud computing division of Amazon. Yeah. And probably, yeah, probably anyone watching or listening to this, whether you know it or not, AWS is probably involved in this process somewhere, right?
Starting point is 00:02:54 Depending on where you're listening, how you're listening, you're probably getting a lot of this through AWS. You just may not know it. So, Shudie, I'd love to talk a little bit. about how AWS shows up in generative AI, right? Because a lot of people may not fully understand, you know, how big of a footprint AWS actually has in the generative AI landscape. Can you tell us a little bit about how AWS actually shows up in today's version of generative
Starting point is 00:03:23 AI? Absolutely. You know, and this is something that we get asked about by our customers as well, because they are trying to figure out how to adopt generative AI, how to take advantage of this technology and get started quickly. And so I think what I would like to do is talk about AWS and what we are doing, but from the lens of four important considerations that we've identified through our conversations with customers, through our conversations with our internal experts, these are the four considerations that are important when getting started.
Starting point is 00:04:02 with generative AI. And so the first one is that there is no one foundation model that is going to rule the world. That is the right fit for every use case. And, you know, again, for the listeners, foundation models are these big, you know, the large language models, the LLMs. These are all foundation models that are pre-trained on perabytes of data. And there are so many of them out there. There's Lama 2 from meta. There is close. from Anthropic, you know, every week there is a new model. Literally. Yeah.
Starting point is 00:04:37 I mean, literally, right? And so customers really need to identify which of those models are the right fit for their use case. And so for that, we have a service called Amazon Bedrock, which makes a diverse set of foundation models available via a single API. So it's just a simple API call, and you can choose which model you want and test it. out with the application you're building for your specific use cases. So that's sort of, you know, that's the easiest place to start, especially for developers, because all they have to do is,
Starting point is 00:05:13 you know, make a call to this API and they can get going. Now, the second important consideration is that customers need to differentiate with their own data. Because if you think about it, like all these models are available to everyone. So how do customers differentiate and gain competitive advantage. It's with their own data. And so this is where again we have managed services such as Amazon Bedrock or Amazon SageMaker, which allow our customers to customize their models, their applications on their own domain specific data. So think financial data or, you know, legal tech, legal data. In some cases, healthcare and life sciences. That's a completely different modality of data. It's, you know, it's language, but it's the sort of the language of life
Starting point is 00:06:02 through genetics and things like that. So allowing customers to fine-tune and customize their models to their own data is something we do really well in a very private and secure manner. Because security, we often say security is job number one at AWS because customers are trusting us with their applications and their workloads and their data. So we take security really, really seriously. And so that's sort of the second consideration. The third consideration is that customers may just want to use out-of-the-box applications, right?
Starting point is 00:06:41 Like just like every people, right, like I use a lot of generative AI applications. I didn't build those. I'm just using those. There's lots of people using chat GPT. One of my favorite applications is perplexity. So similarly, our customers might also want just out-of-the-box applications. And this is where we have Amazon Code Whisperer, which is basically a coding assistant. It's your coding companion.
Starting point is 00:07:06 It will help coders, developers, write code, be much more effective and focus on the innovation and not the role task of writing the code. And then finally, the fourth consideration, and this is more applicable to people who are building, you know, generative AI pipelines with us. But it's important for, you know, every epochs to know is that ultimately what powers all of this is reliable and scalable infrastructure. And AWS excels at this. We have infrastructure that delivers the highest performance while keeping costs as low as possible and, you know, helping customers achieve their goals in terms of the services or the applications they are trying to build. And then,
Starting point is 00:07:52 Vindvia, of course, is a huge partner for us in this space. And that's actually the X-rated computing portfolio that I focus on. So, yeah, so those are sort of, you know, those are a few ways in which we show up. There are, you know, AWS has 200 plus services. So, and honestly, each of them, I would imagine touch generative AI in one way or the other. So it's really hard to pick my favorites. But these were some of the ones that came to mind. And I think that lens of like the full considerations maybe helps sort of make it a little bit easier to follow all along.
Starting point is 00:08:29 Yeah. And you name some very well-known, you know, large language models there, chat GPT, perplexity as an example, which is actually one of your customers, right? So I'm wondering if you can kind of walk us through, you know, an example of maybe perplexity or something like that of how AWS is actually powering that. because I think a lot of our listeners, myself included, I use perplexity every single day. So I'm even curious, how does AWS accelerate, as an example, perplexity's journey in Gen. Adobe just introduced an entirely new way to create,
Starting point is 00:09:10 bringing the power and precision of its creative suite into one conversational experience. Meet Firefly AI Assistant, now live in the Adobe Firefly app, the All In One Creative AI Studio. Powered by Adobe's Creative Agent, Firefly AI Assistant lets you start with your vision, just describe what you want, and shape the outcome as it takes form with the assistant.
Starting point is 00:09:32 The assistant orchestrates multi-step workflows, drawing on 60-plus pro-grade tools across Adobe Creative Cloud apps, including Photoshop, Illustrator, Premiere, Lightroom Express, and more to help bring your ideas to life. You can also get started with creative skills, a growing library of pre-built workflows for common creative tasks, like batch editing photos, creating mood boards, portrait. retouching and creating social variations. Every step the assistant takes is visible so you can refine, redirect, or take over at any time. You stay in the driver's seat as the creative director.
Starting point is 00:10:07 Adobe Firefly AI assistant now in public beta. See it today at firefly.adop.com. Yes, absolutely. Happy to share it. So Perplexity actually spoke at our flagship event, the AWSV invent event last year. And they shared their journey. So for all of you folks, like, go check out that video. Honestly, that will do so much more justice than I can.
Starting point is 00:10:34 But I'm happy to talk about it. So perplexity, you know, for, again, for some of the listeners who may not know, it's basically like an alternative to a traditional search engine, right? So instead of when you search and when you want to learn about something, you right now have to like go through many links and figure out which one of them has the right information and all of that. Perplexity has simplified it where you can ask the app a question and it'll come back to you with a really well-structured answer with curated sources. And all of this is powered by their large language models. So how we help them is that they've, as an example, they've done many things.
Starting point is 00:11:15 So again, I'm just going one example. They fine-tuned models like Lama 2 or Mistral, which are these open-source models that are available. and they fine-tuned it on our P-4D and P-4-D instances. These names sound really complicated, what they basically mean. These are servers that are powered by AWS technology as well as the Nvidia GPUs. So they use this extraceated computing
Starting point is 00:11:42 to fine-tune those models for their own application, for what they were trying to do. Another service as an example that they used was Amazon SageMaker HyperPort. Because when you are fine-tuning or training these really large models, you have to do that. You can't fit on one server. You have to do it across many, many, many nodes, right? And to be able to do it well in a way that where a particular node goes down, it needs to be brought back up really quickly or be replaced with something else.
Starting point is 00:12:14 SageMaker HyperPort makes that really easy. It makes this like multi-node training very, very easy. resilient because it auto detects any failures, it replaces with a new node, it makes the distribution easy, it optimizes performance. So those are just some ways in which, you know, we basically help our customers do sort of, we take care of the heavy lifting so they can focus on their innovation and their use cases. But again, yeah, perplexity. I just love the app, use it so much. And please go check out. I think their CEO spoke at Reinvest. Yeah, yeah. And even, and even kind of, you know, maybe we'll zoom in and then zoom out on this perplexity example, right?
Starting point is 00:12:57 It's, I think it's one of these startups talk about being accelerated, right? It's gone from, you know, launch to one of the most visited generative AI websites in the world in just about a year, give or take. But I'm sure a lot of that is being able to scale on AWS. Can you talk a little bit about, maybe especially, you know, I know we have a lot of people in startups and who work at, you know, now well-funded startups in their, you know, series B, C, D, etc. Can you talk maybe a little bit about some of those other examples and you talk so many different AWS services, you know, that companies and enterprise can take it, can leverage to really grow. But maybe walk us through, you know, let's say a perplexity or another big company like that,
Starting point is 00:13:43 how important is it to have something like AWS that you can start small and as you grow and as you grow, you can instantly start using all of these other different services and scale. Can you kind of walk us through what that looks like? Yeah, yeah, absolutely. So, you know, when you talked about scaling, the first example that came to my mind is Adobe. Adobe is also a really important AWS customer. And their VP of Genitive AI, Alexandra Koston, also spoke at Reinvent.
Starting point is 00:14:15 This is where I learned. Like, I learn about our own stuff at our event. that it's really great. It's such an educational event. So he spoke there and he talked about their journey on AWS because Alexander's team had invested in machine learning for a long time. And even before, you know, sort of generative AI became a term, he had, they were already using some techniques, right, to create tools for creators in Photoshop where, you know, people could have this generative field or things like that. But then, once, of course, some of these generative AI models came on the market, they off doubled down on it.
Starting point is 00:14:55 And Alexander talked about basically what he called building an AI superhighway, right? Internally within Adobe so that the infrastructure, the services are all in place so that his teams could innovate. Like he wanted to also take the heavy lifting off of his teams. And he used AWS for that, right? He used AWS to build that AI superhighway. And what is that superhighway? It's like I said, it's the servers at the very foundation, right? But then it's all of the other things.
Starting point is 00:15:26 Like it's storage, it's networking. Because where is the data living? It's living somewhere. It's in storage. You need storage solutions that can feed these models as fast as possible because they are processing data as fast as possible. And you don't want, GPUs are expensive. You don't want them to sit idle.
Starting point is 00:15:42 You want them to be working. So you want to make sure that your storage solutions can feed them data as fast as as fast as possible too. You also want really good networking, as I mentioned, because we have to distribute it across multiple workloads. So they use all of this. They use the compute services, the storage services, the networking services, many of our orchestration services,
Starting point is 00:16:03 which is like how do you make all of this, you know, work together, come together, work really well. So Adobe Firefly is a really, really great example. And he also talked about how when they started out, you know, they were, they thought like, oh, this is how much compute capacity. we will need, right? Because this is how much user response we would get.
Starting point is 00:16:24 But it just went viral. People loved the product, right? Creators love the product, being able to just bring their ideas to life so quickly. And this is where the beauty of AWS comes in, is that once they realized that there were a lot more users they needed to serve, they needed to scale quickly. And we worked closely with them. I think 20x their capacity.
Starting point is 00:16:47 and all of it, or rather most of it anyway, on GPUs, and they deployed. The other thing that AWS obviously also does is provide a lot of optionality on all levels, on the type of compute solutions you have. So we have GPU-based solutions, but we also have our own chips, Traneum and Infinshire. And they use, for example, Adobe used Infringia-powered solutions as well.
Starting point is 00:17:16 And then we have a lot of optionality on the storage side. We have a lot of optionality from a managed services perspective. You can use Sage Maker or you can build your own machine learning pipes, so to speak. So bringing it back
Starting point is 00:17:32 a little bit to sort of the everyday application, it's like you said earlier, you may not even know, but AWS is powering it. Like so many creators today are using Photoshop. Another example,
Starting point is 00:17:46 is Deonautau.a.i. It's a startup outside of Australia. And they are doing something very similar, image generation tools, generative tools for creators. And they are, you know, they're a smaller company, but similar to Adobe, they're using a lot of our infrastructure as well. You know, and even when you talk about generative AI, I think it's hard to keep up. Like even for me, right, I talk about this every day. And every day you see a new, you know, not quite a new Leonardo AI every day, but you see, you know, new AI image generators and new language models all the time. You know, I'm curious, you know, what is, you know, AWS doing or to kind of prepare for this influx, right?
Starting point is 00:18:35 Because we talk about compute and we talk about not wanting, you know, all these, you know, GPUs to go to waste and to sit idly by. So, you know, what is AWS working? on or offering to kind of prepare for this influx because it is getting easier and easier, right, to build, you know, generative AI software. So I'm sure you guys are seeing this, this nonstop, you know, influx of new customers. So what is AWS doing or what can it do to really prepare for this influx of generative AI companies wanting to grow and scale? Right, right. Great question. And again, like, I think I will have to bring it back to sort of that for people who are familiar with AWS,
Starting point is 00:19:18 they'll have seen this what we like to call the three-layer cake. And at the very foundation, as I mentioned, is the infrastructure. And so in terms of preparing for that, our partnership with Nvidia is a really important piece of that, right? I mean, we've heard the announcement at GTC this year with the Grace Blackwell 200. That's coming to AWS. You know, that's going to definitely power a lot of generative AI innovation. we also do, like I said, continue to invest in our own silicon to offer sort of more options,
Starting point is 00:19:52 more sort of optionality to optimize cost and performance. And so that's sort of what we are doing on that infrastructure side, in addition to, you know, many innovations in the storage and networking piece. But then it's also going to be about trying to make it really easy to use. And this is, again, where Amazon bedrock comes in. We also have actually this service called Amazon Party Rock. And I highly encourage all of your users to check it out. It is literally like it's basically a service through which anybody can build an app.
Starting point is 00:20:27 You just have to go there and talk to it as if you are as if you're talking to a friend and basically say, hey, this is the app I want to build. And just try it out. Give it a shot. It's fun. It's so much. That's why we call it Party Rock. So I highly encourage you and your listeners to check it out. So Amazon Bedrock, of course, for developers who want to access foundation models to a single API.
Starting point is 00:20:51 But check out Amazon Party Rock and have some fun with it. And so that's sort of what we are doing, right? Party Rock is also a way for us to do that, to educate a larger set of people about this technology, about what it can do. and sort of pull them in the orbit so that they feel empowered, and then they feel ready to go to, say, a bedrock and start building their own applications. Yeah, and then finally, you know, we talked about Amazon Q and Amazon Code Whisperer.
Starting point is 00:21:24 These are some of the applications where you don't need to do anything. They're already generative AI applications. You just need to use them. So you're just a user, but, hey, they increase your productivity. They help you interact. Like Amazon Q, for example, can help you interact. with your business data can, you know, can really help you even navigate AWS services. So, so those are sort of the three ways in which we are, you know,
Starting point is 00:21:50 innovating across those three layers to help prepare for this influx because you're absolutely right. We see a lot of interest coming in. And, you know, you kind of talked earlier about, you know, at the AWS conference, you know, being able to to learn from so many of your customers. and here we are at Nvidia GTC. And if you're listening on the podcast, you probably can't see this. But over here we have the entire exhibit hall, right? And we can see probably hundreds of trillions of dollars in market cap.
Starting point is 00:22:20 Just sitting right there. And the future is really being built just out this window that we kind of are overlooking here. You know, as we talk about the GTC conference, you know, even for you personally, what's kind of catching your eyes, you know, specifically when it comes, you know, you know, to the future of accelerated computing because there's so much going on. But what's kind of catching your eye this week at the Nvidia GTC conference? Yeah, that's a really great question. Well, you know, so my background, just a little bit, you know, about myself,
Starting point is 00:22:53 I worked at Arm for eight years. So for those of who you don't know, Arm is basically a really big company in the semiconductor space. they've pioneered the risk architecture, the arm architecture that is basically, in a way, a competition to the Intel, you know, grown X-86 architecture. Anyway, point being, I come from this really hardware background, and I, you know, I am a big fan of Nvidia in that sense that their roots lie in that sort of chip design and, you know, GPU design space as well. And so what is really has been exciting for me, not just about GT, but in general in this space. And it's not just Nvidia or even us with our own chips. It's the idea that a lot of this software innovation is interlocked now with the hardware innovation. There's so much synergy between what happens at your chip level, at your system level,
Starting point is 00:23:53 and what happens at your software stack and application level and user experience. There's so much codependence and synergy. Maybe it's always existed, but I feel like it's true a lot more. I mean, I was in a talk earlier where Jensen hosted the authors of the paper attention is all you need. These are basically, it was a paper that changed the world, right? It was a paper where the transformer architecture, that is the foundation for gender to AI, was proposed. And he just interviewed everyone. And they joked about how, you know, Jensen was like, we are building the next.
Starting point is 00:24:32 GPU to be the size of your next model. And then the authors joke back, well, we are building the models to be the size of your GPU. So this sort of innovation, which is bringing together the hardware architects with the deep learning engineers, with the machine learning scientists, like all of them coming together, that's just to me, that synergy is very exciting to me. So I don't know if that's one thing. I guess that was a cop out. I'm basically saying it's all of it, but yeah. Yeah, and there's no doubt, you know, there's always so much excitement, you know, at big events like this. But, you know, Shruti, we've talked about a lot, you know, when it comes to being able to accelerate your generative AI journey. You know, as we wrap up, what's maybe the one
Starting point is 00:25:18 important take home that you really want listeners and viewers to take away from our time here when it comes to really helping them leverage generative AI in their journey? Yeah, I think would, my first thing would be learn about it, which they're already doing by listening to this podcast, right? So I would say be curious, learn about it, start in small ways, figure out what you can do today to learn more about it, to maybe try it out, you know, start using perplexity. Maybe go check out party rock that I just mentioned. So start small. And then once you sort of figure out what are the right use cases, now these could be use
Starting point is 00:25:59 cases in your business applications and you know you could suggest that to your colleagues you could brainstorm you could think about how you could advance those within your profession or it could be use cases in your own personal lives you know you could use for example like i use perplexity all the time to learn i mean all of this stuff that i know and is a lot of it not related to a ws but in general related to this field is through that app right so you could you could just identify you use cases within your sort of personal life as well. I've used generative AI services to think about ideas for my child's birthday party. Like it's the the opportunity is endless. So start small and then go from there. I love it. And there's no there's no greater advice because yeah, we we covered so much
Starting point is 00:26:51 here. But I love it. Just start small, go from there. Well, thank you for tuning in. Shruti, thank you so much for joining the everyday AI show. We really appreciate your time. you again for having me, Jordan. All right. And as a reminder, there's going to be a lot more. So make sure to go to your everyday AI.com. We talked about a lot. We're going to be recapping it all in our newsletter as we always do. So thanks for tuning in today. We hope to see you back tomorrow and every day for more everyday AI. Thanks y'all. Meet Firefly AI assistant. Now live in Adobe Firefly, the Allman One Creative AI studio. Just describe what you want to create in your own words and the assistant handles the rest, orchestrating multi-step workflows across Adobe Creative Cloud
Starting point is 00:27:34 apps, including Photoshop, Premiere Express, and more in one conversational interface. You direct the outcome while the assistant accelerates execution. Stand control with the ability to step in and refine at any time. See it today at firefly.adobie.com. And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going.
Starting point is 00:28:06 For a little more AI magic, visit Your Everyday AI. and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.