In The Arena by TechArena - Compute Requirements in the AI Era with Intel’s Lisa Spelman

Episode Date: March 26, 2024

TechArena host Allyson Klein chats with Intel's Lisa Spelman about how compute requirements are changing for the AI era, where we are with broad enterprise adoption of AI, and how software, tools, and standards are required to help implement solutions at scale.

Transcript
Starting point is 00:00:00 Welcome to the Tech Arena. My name is Allyson Klein. It's a really exciting day on the Tech Arena. We've got Lisa Spelman here, Corporate Vice President from Intel. Hey, Lisa, welcome to the program. Oh, it's good to be here, Allyson. Always a pleasure when we get to hang out together. Lisa, you and I have talked to each other many times in the past about technology, but this is the first time you're on the Tech Arena. So why don't you go ahead and introduce yourself and your history and role at Intel? Sure. You know, I'm Lisa Spelman. I have been at Intel for 23 years. It feels like it happened in the blink of an eye, but through my time here, I've had the chance to participate in a lot of different areas of the industry. I've been in our Intel IT infrastructure,
Starting point is 00:01:07 engineering and operations, which was great experience. And I've been in this product group and had the honor and privilege to lead the Xeon team over the last several years. And it's been an exciting ride, not only as Intel has been working on improving products and solving customer problems, but also as this AI explosion has happened. And for those of us that have been in the
Starting point is 00:01:32 industry for a long time, we lived through that huge HPC supercomputer blow-up, then some conversation about HPC transitioning into AI and where it would go. And now we're part of living this AI going absolutely mainstream. So it's a pretty exciting time, honestly. You know, in all the years that Xeon has driven the world's data centers, we really haven't seen a moment like what started in November of '22 when ChatGPT was released. And since then, everyone is talking about generative AI. And I really want to talk to you about how you're seeing this momentum around large language models trending. Obviously, Intel has been talking about AI for a number of years and developing AI acceleration into your processors.
Starting point is 00:02:28 But how do you look at this latest opportunity for the enterprise? Yeah, it's so interesting. I was reflecting on 2023, thinking we started the year in January launching a new Xeon generation. And at that launch, we were talking about recommendation systems, image recognition, all of the AI use cases that the industry had spent decades getting into production and getting real value out of. At the end of 2023, in December, we launched the next generation of Xeon. And just the change in that one year in the pervasiveness of large language models was phenomenal. One of the fastest tech adoptions that I have seen in my career. And you look at what's being done and the ability for so many enterprises and so many, you know, customers in the ecosystem, their ability to
Starting point is 00:03:23 now train on their own data with all these large language models, their ability to verticalize their solutions and really focus, you know, ChatGPT-like capabilities for their employees, for their practitioners, specifically in their domain. So not everybody needs to be able to do what OpenAI is doing, which is build massive models that can answer every question in the world. But you look at healthcare and see the phenomenal potential to be able to answer very specific questions around cancer diagnostics and treatment based off of very real data from patients all over the world. So still a big model, but narrowed into a very specific use case. Same thing for manufacturing companies and for quality control. You see the same thing in the automotive industry. So it's one of those capabilities that is touching every single industry and giving the opportunity for very specific utilization and very good depth of answers to common questions.
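To make that concrete, here is a rough sketch of what "training on your own data" can look like in practice, using the Hugging Face stack. Everything here is a placeholder rather than a specific enterprise setup: the base model (gpt2), the corpus file, and the hyperparameters. A real healthcare or manufacturing deployment would layer evaluation and data governance on top.

```python
# A minimal sketch of "verticalizing" a small causal LM by continuing its
# training on an in-domain text corpus. Model name, file path, and
# hyperparameters are illustrative placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
model = AutoModelForCausalLM.from_pretrained("gpt2")

# One in-domain document per line of the corpus file (hypothetical path).
ds = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
ds = ds.map(lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
            batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="domain-lm",
                           per_device_train_batch_size=4,
                           num_train_epochs=1),
    train_dataset=ds,
    # mlm=False sets up plain next-token (causal) language-modeling labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```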
Starting point is 00:04:12 When you talk to enterprise customers, and when you think about enterprise, they tend to be a bit risk-averse. So when you talk to them about the opportunity to inject large language models into their applications, what are you seeing them doing in terms of ensuring that these new applications align with their business priorities, align with their security policies, align with their privacy policies, and ensure a great experience for their customers? I think you'll see enterprises move toward more RAG-like implementations and models, retrieval-augmented generation, with this
Starting point is 00:05:15 concept of a need for ensuring that they are protecting themselves against things like hallucination. So if you think of, you know, a teenager using ChatGPT-4 to help with writing a paper, and they get a hallucination about a data source, the consequence of that might be a not-great grade, or, if they didn't double-check it, they might learn a lesson there. But there's no life-or-death consequence. The consequence of hallucination in an enterprise, customer-facing, customer-service type of LLM implementation can actually be quite high. You can end up having committed your company to something it wasn't intending to do. You could give an answer that is not acceptable in a healthcare environment. Many of them are just at that early stage of figuring out how to actually do this.
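As a rough illustration of the RAG pattern being described, here is a minimal retrieval sketch. The documents, the embedding model, and the final LLM call are all placeholders, not any particular vendor's stack; a production system would add a vector database, reranking, and citation of sources.

```python
# A minimal sketch of the RAG pattern: retrieve an enterprise document, then
# constrain the model's answer to it. All names here are placeholders.
from sentence_transformers import SentenceTransformer, util

docs = [
    "Refund policy: purchases can be returned within 30 days with a receipt.",
    "Support hours: weekdays 8am to 6pm in all regions.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(docs, convert_to_tensor=True)

def grounded_prompt(question: str) -> str:
    """Retrieve the closest document and build a context-constrained prompt."""
    q_vec = embedder.encode(question, convert_to_tensor=True)
    best = int(util.cos_sim(q_vec, doc_vecs).argmax())
    return (f"Answer ONLY from this context:\n{docs[best]}\n\n"
            f"Q: {question}\nA:")

# Hand the prompt to whatever LLM you deploy; keeping the model on retrieved,
# vetted text is what limits hallucination in customer-facing settings.
print(grounded_prompt("Can I return an item after two weeks?"))
```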
Starting point is 00:06:21 You can't go anywhere right now without hearing about this insatiable demand for compute capacity to fuel AI, and tons of conversations about AI acceleration. How is Intel looking at this challenge and opportunity across the portfolio? And how do you see Xeon continuing to play a role in this AI era? Yeah, it's a compute explosion out there. And it's really moved mainstream. Even people that are not in the technology industry are well aware of the compute capacity that's required for all of this AI. So it's a real moment for the entirety of the industry. With the Xeon product line, we've been investing for several generations in continuing to add in acceleration for AI workloads. But we're also recognizing that in this large language
Starting point is 00:07:14 model moment, there is such a high demand for memory bandwidth and memory capacity and capability, just given the way the workload is structured. So the way we view the world is really that for all of that really large training, you absolutely do need accelerator products, whether it's an accelerator, a GPU, some sort of dedicated compute capacity and capability, in order to actually train the model. Then you move down into fine-tuning and you move into inference. And there you have a little bit more choice, depending on the size of that model. So our goal on the Xeon side is to continue to provide really capable hardware that is optimized for AI workloads while also still delivering that absolute general-purpose performance and capability, so that you have the world's most flexible
Starting point is 00:08:13 infrastructure. And then we know that every enterprise, every cloud service provider, every SaaS company will go through their own model and their own math about when it really makes sense to move to more acceleration. So we have Xeon for a lot of that inference work and that foundational general-purpose work. We have Xeon being the head node for the vast majority of AI deployments using accelerators and GPUs. And then we have our Gaudi accelerators that are really specializing in that high-performance inference and in that fine-tuning of models, and that have the opportunity to help both enterprises and cloud service providers alike.
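For the "Xeon handles a lot of inference" piece, here is a hedged sketch of what CPU-side inference commonly looks like with Intel's PyTorch extension. It assumes the intel_extension_for_pytorch package is installed; the ResNet is just a stand-in workload, not a specific customer model.

```python
# A minimal sketch of Xeon-side inference with Intel Extension for PyTorch
# (ipex). The model and input shape are placeholders.
import torch
import intel_extension_for_pytorch as ipex
import torchvision.models as models

model = models.resnet50(weights=None).eval()
model = ipex.optimize(model, dtype=torch.bfloat16)  # operator tuning for Xeon

x = torch.randn(1, 3, 224, 224)
with torch.no_grad(), torch.autocast("cpu", dtype=torch.bfloat16):
    y = model(x)
print(y.shape)
```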
Starting point is 00:09:27 Now, Lisa, I recently saw you talking to many of the leading players in the AI arena. Intel sponsored an entire day, and you brought some friends with you, including a speaker from an agricultural company talking about a very real-world use case of utilizing AI to help them mature their crops and pick their crops at the optimal time. And one of the things that I thought about is, you know, the importance of software and enabling real-world customers with software solutions that can deliver the core capabilities for them and integrate into their environments. Can you talk a little bit about what Intel's doing in this space? Yes. I just have to say, though, I love the example. An agriculture company that is using AI and large language models, it just starts to show the pervasiveness and the interest in the explosion of use cases where LLMs can help. So I think stuff like that's pretty cool. When we look at software, the number one feedback we get from customers, pretty much regardless of size, is the complexity and where to start.
Starting point is 00:10:21 It's not that it can't be done. It's the requirement it puts on their developer community, on their technical community, to be out looking for the right models, looking for the right frameworks, pulling it all together, hunting and searching. And honestly, that's not really work developers like to do. I mean, they love to work on optimizing and improving the performance by 4x on something. But just hunting down the basics off of GitHub or here or there, it's not that exciting. So we've been making a big effort. We are open-source first and really trying to engage with that community and help put together more standardized stacks to make it just
Starting point is 00:11:07 easier to grab what you need. So we're really focused on frameworks like PyTorch, and also OpenVINO, especially for the edge. And then building into that, you know, what we think of as the enterprise journey. So, you know, enterprises have a tremendous amount of infrastructure that is built on top of their VMware or Red Hat or Nutanix, you know, all of these companies. So we're working with them to build a software stack that's in recognition of both the infrastructure and the software layer that they're already working with, and helping them put that together in a cleaner way. So this is what I would call definitely a work in progress.
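As one concrete piece of that stack story, here is a minimal OpenVINO sketch for edge inference. It assumes a model already converted to OpenVINO IR; "model.xml" is a placeholder path, and the input shape is illustrative.

```python
# A minimal OpenVINO sketch for edge inference. "model.xml" is a placeholder
# for a model already converted to OpenVINO IR.
import numpy as np
import openvino as ov

core = ov.Core()
model = core.read_model("model.xml")
compiled = core.compile_model(model, "CPU")  # "GPU" or "AUTO" on many edge boxes

x = np.random.rand(1, 3, 224, 224).astype(np.float32)
result = compiled([x])                       # CompiledModel is directly callable
out = result[compiled.output(0)]
print(out.shape)
```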
Starting point is 00:11:52 And you'll hear more about this from us throughout the year, starting soon and over the next couple of months, because we think this is the ultimate deployability challenge to solve. And we're pretty excited about some of the work that we're doing inside the industry. Now, I've got to ask about the latest acceleration capabilities that you've baked into Xeon with your last generation Sapphire Rapids processor. AMX was a technology that you delivered to the market. Can you tell me exactly why this technology is pretty foundational for many of your customers? So AMX is one of our newest features in the AI space on Xeon, but it's actually built on a foundation of multiple years of integrating AI features into Xeon. And what we're always trying to do is stay, you know, one step ahead of the curve on what's going to be needed because
Starting point is 00:12:53 hardware is a long game. You know, you can't just decide today and get it tomorrow. You have to do a little bit of that planning ahead of time. But AMX stands for Advanced Matrix Extensions, and it's built into the core. It's an accelerator inside that core that delivers higher performance for both inference and training. Now, of course, there isn't a lot of training done on Xeon, but there will be fine-tuning, and there can be opportunities where you might train a small model if you have a bunch of excess compute capacity, you know, all of that. And it's really great for things like natural language processing, recommendation systems, and image recognition. And we see customers, you know, adopting it quite quickly, because it doesn't require a bunch of software work to take advantage of that performance. So that's kind of our best day: when we can put in a feature that the software, or the software ecosystem, doesn't need to do extra work to realize that advantage.
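A quick, hedged way to see that "no extra software work" point for yourself: on a Linux host, the kernel reports AMX capabilities as CPU flags, which you can check without any vendor tooling.

```python
# Sketch: check whether a Linux host reports AMX support. Sapphire
# Rapids-class Xeons expose amx_tile / amx_bf16 / amx_int8 flags.
def amx_flags():
    with open("/proc/cpuinfo") as f:
        return sorted({tok for tok in f.read().split() if tok.startswith("amx")})

print(amx_flags())  # e.g. ['amx_bf16', 'amx_int8', 'amx_tile'] on supported parts
```

When those flags are present, frameworks built on oneDNN (PyTorch among them) can generally route bfloat16 and int8 matrix math to the AMX tile units without model-code changes, which is the advantage described above.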
Starting point is 00:13:46 So obviously Sapphire Rapids is out in the market. What comes next? As far as looking out into the future, a lot of what we're doing right now, across our entire portfolio, like I said, Xeon, Gaudi, you know, our GPU product lines, is getting into the space where we're looking at what the constraints are at the system level. So not just at the chip level, but at the system level, which takes you out to the rack level, really thinking of this system of systems: what is it that's holding people back from achieving their best performance? So you'll see investments from us in, again, that improved
Starting point is 00:14:39 memory bandwidth and memory capacity, both the speed and the size of the memory and the data sets that you can manage. You'll also see us working on solutions that use Xeon for a lot of the data pre-processing and data preparation, and making sure that we're feeding the accelerator in the, you know, best, fastest, most efficient, most cost-effective way, because you don't want your most expensive compute resources sitting underutilized. So anyway, that's what you're going to see from us: trying to take this view that's much more at that enterprise-class, system-scale level, and giving people both the hardware tools at the system level as well as the software, in order to just, again, fundamentally make all of this easier across every industry.
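The "keep the accelerator fed" pattern she describes has a familiar shape in code: CPU worker processes handle data preparation in parallel while the device computes. A hedged sketch follows, with a toy dataset standing in for real decode and augmentation work, and illustrative parameter values.

```python
# Sketch: CPU-side workers prepare batches while the accelerator computes.
import torch
from torch.utils.data import DataLoader, Dataset

class ToyImages(Dataset):  # stand-in for real decode/augmentation work
    def __len__(self):
        return 10_000
    def __getitem__(self, i):
        return torch.randn(3, 224, 224), i % 10  # fake image, fake label

loader = DataLoader(
    ToyImages(),
    batch_size=64,
    num_workers=8,      # CPU-side parallel preprocessing
    pin_memory=True,    # faster host-to-device transfers
    prefetch_factor=4,  # queue batches ahead of the device
)

device = "cuda" if torch.cuda.is_available() else "cpu"
for x, y in loader:
    x = x.to(device, non_blocking=True)
    # ... the accelerator does the expensive forward/backward here ...
    break
```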
Starting point is 00:15:35 Lisa, you sit at a really foundational part of the industry. You're talking every day to ISVs, to cloud service providers, to enterprises, and of course, to those platform companies that are delivering solutions based on your processors. When you look out at 2024, and I don't even know how long to make this question, given how fast the industry is moving, but let's just say 2024, and you think about where we may exit the year in terms of capability from the industry and what customers will be doing with AI: what do you see in that time frame? And what are you most excited about? Yep. It's funny, because I will confess, Allyson, I chuckle sometimes when I talk to folks in the industry who say, oh, I totally knew this
Starting point is 00:16:34 was coming and it was going to go this fast. I'm like, well, you should have said something. There are a lot of people that can predict the future once it's the past. So I'll say, you know, cautiously, as I look forward, I do see enterprises moving into real deployment. They have to get actual business value out of these investments. But I alluded to it earlier: I see this RAG concept as really getting legs underneath it. There have already been examples in the industry of businesses that have felt harm due to commitments their chatbots have made, for example. And nobody wants to end up in that position, but they cannot afford to sit out this revolution. And so that RAG model, that concept, for enterprise AI specifically, I think is going to become the standard, the foundation, whether it's, again, for your internal knowledge base that you're using to help ramp and train employees, or whether you're using something for
Starting point is 00:17:40 customer-facing decisions, customer-facing recommendations. So I see a lot more focus on what data I am going to use in order to deliver the best capabilities for my enterprise. And then how am I going to ensure that I'm delivering it in a way that absolutely minimizes any type of hallucination and gives that opportunity. I had the chance to go to the World AI Forum a few weeks ago and speak there and met with so many customers, government agencies, every industry you can imagine. It was a really great event. And this was so top of mind for every chief data officer, for every head of AI,
Starting point is 00:18:27 you know, regardless of the size of company or the industry they were in. And I could tell they are actively shifting resources in this direction to solve this problem for their business. Lisa, it was so much fun having you on the Tech Arena and learning a little bit more about what Intel is doing in the AI arena. You're welcome back anytime. One final question for you: where can folks go to learn more about what Intel is delivering on the portfolio and engage your team? Oh, of course. You know, we're out there in all the standard ways, on LinkedIn and on X, engaging with the
Starting point is 00:19:08 audience. We have a ton of, you know, content that we're putting out, whether it's on our intel.com, just trying to give people a chance to, you know, hear from us, see from us, see the work that we're doing in the software space, and find ways to easily get to, again, their solutions. And I think you'll see Intel seeking to continue to be a lot more visible in the industry. This AI world is not a one-company game. It's a worldwide challenge. And we absolutely, as you would say, want to be in the arena for it, and are pursuing all the right conversations to be part of. So we do look forward to that customer engagement, and from the ecosystem. So looking forward to continuing the conversation. Thanks so much for being on the program.
Starting point is 00:19:59 It was a lot of fun. Well, it's great to be here with you, too. So we'll have to talk again soon. Thanks for joining the Tech Arena. Subscribe and engage at our website, thetecharena.net. All content is copyright by the Tech Arena.
