In The Arena by TechArena - Compute Requirements in the AI Era with Intel’s Lisa Spelman
Episode Date: March 26, 2024
TechArena host Allyson Klein chats with Intel’s Lisa Spelman about how compute requirements are changing for the AI era, where we are with broad enterprise adoption of AI, and how software, tools, and standards are required to help implement solutions at scale.
Transcript
Welcome to the Tech Arena. My name is Allyson Klein. It's a really exciting day on the Tech Arena. We've got Lisa Spelman here, Corporate Vice President from Intel. Hey, Lisa, welcome to the program.

Oh, it's good to be here, Allyson. Always a pleasure when we get to hang out together.

Lisa, you and I have talked to each other many times in the past about technology, but this is the first time you're on the Tech Arena. So why don't you go ahead and introduce yourself and your history and role at Intel?

Sure. You know, I'm Lisa Spelman. I have been at Intel for 23 years. Feels like it
happened in a blink of an eye, but through my time here, I've had the chance to participate in a lot
of different areas of the industry. I've been in our Intel IT infrastructure,
engineering and operations, which is great experience.
And I've been in this product group
and had the honor and privilege to lead the Xeon team
over the last several years.
And it's been an exciting ride,
not only as Intel has been working on improving products
and solving customer problems.
But also as this AI explosion has happened. And for those of us that have been in the industry for a long time, we lived through that kind of HPC, huge-supercomputer blow-up, then the conversation about HPC transitioning into AI and where it would go. And now we're part of living this AI going absolutely mainstream. So it's a pretty exciting time, honestly.
You know, in all the years that Xeon has driven the world's data centers, we really haven't seen
a moment like what started in November of '22 when ChatGPT was released. And since then, everyone is talking about generative
AI. And I really want to talk to you about how you're seeing this momentum around large language
models trending. Obviously, Intel has been talking about AI for a number of years and developing AI
acceleration into your processors.
But how do you look at this latest opportunity for the enterprise?
Yeah, it's so interesting.
I was reflecting on 2023, thinking we started the year in January launching a new Xeon generation. And at that launch, we were talking about recommendation systems, image recognition, all of the AI use
cases that the industry had spent decades getting into production and getting real value out of.
At the end of 2023 in December, we launched a next generation Xeon. And just the change in that one
year of the pervasiveness of large language models was phenomenal. One of the fastest
tech adoptions that I have seen in my career. And you look at what's being done and the ability for
so many enterprises and so many, you know, customers in the ecosystem, their ability to now train with their own data using all these large language models, their ability to verticalize their solutions and really focus, you know, ChatGPT-like capabilities for their employees, for their practitioners, specifically in their domain. So not everybody needs to be able to do what OpenAI is doing,
specifically in their domain. So not everybody needs to be able to do what OpenAI is doing,
which is build massive models that can answer every question in the world. But you look at
healthcare and see the phenomenal potential to be able to answer very specific questions around
cancer diagnostics and treatment based off of very real data from patients all over the world.
So still a big model, but narrowed into a very specific use case.
Same thing for manufacturing companies and for quality control.
You see the same thing in the automotive industry.
So it's one of those capabilities that is touching every single industry and giving the opportunity for very specific utilization and very good depth of answers to common questions.
When you talk to enterprise customers and when you think about enterprise, they tend to be a bit risk averse. So when you talk to them about the opportunity to inject large language models
into their applications, what are you seeing them doing in terms of ensuring that these new
applications align with their business priorities, align with their security policies, align with
their privacy policies, and ensure a great experience for their customers?
I think you'll see enterprises move towards more RAG-like implementations and models, with this concept of a need for ensuring that they are protecting themselves against things like hallucination. So if you think of, you know, a teenager using ChatGPT-4 to help with writing a paper, and they get a hallucination about a data source, the consequences of that might be a not-great grade, or, if they didn't double-check it, they might learn a lesson there. But there's no life-or-death consequence. The consequence of hallucination in an enterprise, customer-facing, customer-service type of LLM implementation can actually be quite high. You can end up having committed your company to something it wasn't intending to do. You could give an answer that is not acceptable in a healthcare environment.
Many of them are just at that early stage of figuring out how to actually do this.
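To make the RAG pattern Lisa describes concrete, here is a minimal sketch in Python. It is illustrative only, not Intel's implementation: a toy keyword retriever pulls the most relevant passage from a small store of vetted enterprise documents, and the prompt constrains the model to answer only from that context, which is what narrows the room for hallucinated commitments. All names and documents here are hypothetical.

```python
from collections import Counter
import math

# Toy document store; in practice these would be chunks of the
# enterprise's own vetted data (policies, product docs, case notes).
DOCS = {
    "returns": "Customers may return products within 30 days with a receipt.",
    "warranty": "Hardware is covered by a one-year limited warranty.",
}

def _vec(text):
    # Bag-of-words term counts; a real system would use embeddings.
    return Counter(text.lower().split())

def _cosine(a, b):
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * \
          math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def retrieve(query, k=1):
    # Rank stored passages by similarity to the query; keep the top k.
    q = _vec(query)
    ranked = sorted(DOCS.values(), key=lambda d: _cosine(q, _vec(d)), reverse=True)
    return ranked[:k]

def build_prompt(query):
    context = "\n".join(retrieve(query))
    # Constraining the answer to retrieved context is the core of RAG:
    # the model grounds its response in data the business has approved.
    return ("Answer using ONLY the context below. If the answer is not "
            f"in the context, say you don't know.\n\nContext:\n{context}"
            f"\n\nQuestion: {query}")

print(build_prompt("How long do customers have to return a product?"))
```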
You can't go anywhere right now without hearing about this insatiable demand for compute capacity
to fuel AI and tons of conversations about AI acceleration.
How is Intel looking at this challenge and opportunity across the portfolio?
And how do you see Xeon continuing to play a role in this AI era?
Yeah, it's a compute explosion out there.
And it's really moved mainstream.
Even people that are not in the technology industry are well aware of the compute capacity that's required for all of this AI.
So it's a real moment for the entirety of the industry. With the Xeon product line, we've been investing for several generations in continuing to
add in acceleration for AI workloads. But we're also recognizing that in this large language
model moment, there is such a high demand for the memory bandwidth and that memory capacity
and capability, just given the way the workload is structured. So the way we view the world is really for all of that, you know, really large training,
you absolutely do need accelerator products, you know, again, whether it's an accelerator, a GPU,
some sort of dedicated compute capacity and capability in order to actually train the model. Then you move down
into fine tuning and you move into inference. And there you have a little bit more choice,
depending on the size of that model. So our goal on the Xeon side is to continue to provide
really capable hardware that is optimized for AI workloads while also still delivering that absolute
general purpose performance and capability so that you have the world's most flexible
infrastructure. And then we know that every enterprise, every cloud service provider,
every SaaS company will go through their own model and their own math about when does it
really make sense to move to more
acceleration. So we have Xeon for a lot of that inference work and that foundational general
purpose work. We have Xeon being the head node for the vast majority of AI deployments using
accelerators and GPUs. And then we have our Gaudi accelerators that are really specializing in that high performance inference in those fine tuning of models and have the opportunity to help both enterprises and cloud service providers alike. talking to many of the leading players in the AI arena. Intel sponsored an entire day and you
brought some friends with you, including a guy that was from an agricultural company talking
about a very real world use case of utilizing AI to help them mature their crops and pick their crops at the optimal time.
And one of the things that I thought about is, you know, the importance of software, and enabling real-world customers with software solutions that can deliver the core capabilities for them and integrate into their environments.
Can you talk a little bit about what Intel's doing in this space?
Yes.
I just have to say, though, I love the example.
An agriculture company that is using AI, large language models, it just starts to show the
pervasiveness and the interest in the explosion of use cases where LLMs can help.
So I think stuff like that's pretty cool.
When we look at software, the number one feedback we get from customers, pretty much regardless of size, is the complexity and where to start.
It's not that it can't be done. It's the requirement it takes on their developer community, on their technical community to
be out looking for the right models, looking for the right frameworks, pulling it all together,
hunting and searching.
And honestly, that's not really work developers like to do.
I mean, they love to work on optimizing and improving the performance by 4x on something.
But just hunting down the basics off of GitHub or here or there, it's not that exciting.
So we've been taking a big effort.
We are open source first and really trying to engage with that community and help put together more standardized stacks to make it just
easier to grab what you need. So really focused on frameworks like PyTorch and also OpenVINO,
especially for the edge. And then building into that, you know, what we think of as the enterprise
journey. So, you know, enterprises have a tremendous amount of infrastructure that is built on top of their VMware or Red Hat or Nutanix, you know, all of these companies. So working with them to build a software stack in recognition of both the infrastructure and the software layer that they're already working with, and helping them put that together in a cleaner way.
So this is what I would call definitely work in progress.
And you'll hear more about this from us throughout the year,
starting soon and over the next couple of months,
because we think this is the ultimate deployability challenge to solve.
And we're pretty excited about some of the work that we're doing inside the industry.
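As one concrete illustration of the kind of standardized stack Lisa mentions, here is a short sketch of loading and compiling a model with OpenVINO's Python API. This is a generic usage sketch, not anything Intel described on the show, and the model file path is hypothetical.

```python
import openvino as ov

core = ov.Core()

# Assumes a model already converted to OpenVINO IR format;
# "model.xml" is a hypothetical path used for illustration.
model = core.read_model("model.xml")

# Compile for the local CPU. Other device strings ("GPU", "AUTO")
# retarget the same application code to other hardware.
compiled = core.compile_model(model, "CPU")

# One synchronous inference; input_tensor would be a NumPy array
# shaped to match the model's input.
# result = compiled.infer_new_request({0: input_tensor})
```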
Now, I've got to ask about the latest acceleration capabilities that you've baked into Xeon with your latest-generation Sapphire Rapids processor.
AMX was a technology that you delivered to the market. Can you tell me exactly why this technology is pretty foundational for many of your customers?
So AMX is one of our newest features in the AI space on Xeon, but it's actually built on a foundation of multiple years of integrating AI features into Xeon. And what we're always
trying to do is stay, you know, one step ahead of the curve on what's going to be needed because
hardware is a long game. You know, you can't just decide today and get it tomorrow. You have to do
a little bit of that planning ahead of time. But AMX stands for Advanced Matrix Extensions,
and it's built into the core. And it's an accelerator inside that core that delivers
higher performance for both inference and training. Now, of course, there isn't a lot of
training done on Xeon, but there will be fine tuning and there can be opportunities where
you might train a small model if you have a bunch of excess compute capacity, you know, all of that. And it's really great for things like natural language processing,
recommendation systems, image recognition. And we see customers, you know, using it quite quickly,
because it doesn't require a bunch of software work to take advantage of that performance. So that's kind of our best day, when we can put a feature in that the software, or the software ecosystem, doesn't need to do extra work in order to realize that advantage.
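A minimal sketch of what "no extra software work" can look like in practice, assuming a 4th-gen (Sapphire Rapids) Xeon and a recent PyTorch build: running inference in bfloat16 lets the oneDNN backend dispatch the matrix math to AMX automatically, with no AMX-specific application code. The model here is a stand-in, not a real workload.

```python
import torch
import torch.nn as nn

# Stand-in model; any PyTorch module follows the same path.
model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 1024))
model.eval()

x = torch.randn(8, 1024)

# Under CPU autocast, linear layers run in bfloat16. On a Sapphire
# Rapids Xeon, oneDNN routes those bf16 matmuls to the AMX tiles;
# the application code itself never references AMX.
with torch.inference_mode(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    y = model(x)

print(y.dtype)  # torch.bfloat16
```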
So obviously Sapphire Rapids is out in the market. What comes next?

As far as looking out into the future, a lot of what
we're doing right now is across our entire portfolio. Like I said, Xeon, Gaudi, you know, our GPU product lines. We're getting into the space where we're looking at what the constraints are at the system level. So not just at the chip level, but at the system level, which takes you out to the rack level, really thinking of this system of systems: what is it that's holding people back
from achieving their best performance? So you'll see investments from us in, again, that improved
memory bandwidth and memory capacity, both the speed and the size of the memory and
the data sets that you can manage. You'll also see us working on solutions that use Xeon for a lot
of the data pre-processing, data preparation, and making sure that we're feeding the accelerator in the, you know, fastest, most efficient, most cost-effective way, because you don't want your most expensive compute resources sitting underutilized.
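One common shape of this CPU-feeds-accelerator pipeline, sketched with PyTorch's DataLoader; the dataset is synthetic and the worker count is illustrative, not a recommendation from Intel.

```python
import torch
from torch.utils.data import DataLoader, Dataset

class SyntheticDataset(Dataset):
    # Stand-in for real data; __getitem__ is where CPU-side decode,
    # augmentation, or tokenization work would actually happen.
    def __len__(self):
        return 1024

    def __getitem__(self, idx):
        return torch.randn(3, 224, 224), idx % 10

device = "cuda" if torch.cuda.is_available() else "cpu"

# num_workers parallelizes preprocessing across CPU cores, and
# pin_memory speeds host-to-accelerator copies, so the accelerator
# is not left idle waiting for input batches.
loader = DataLoader(SyntheticDataset(), batch_size=64,
                    num_workers=4, pin_memory=(device == "cuda"))

for images, labels in loader:
    images = images.to(device, non_blocking=True)
    # model forward/backward would run here
    break
```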
So anyways, what you're going to see from us is trying to take this view that's much more at that enterprise-class, system-scale level, and give people both the hardware tools at the system level as well as the software in order to just, again, fundamentally make all of this easier across every industry.

Lisa, you sit at a really foundational part of the industry. You're
talking every day to ISVs, to cloud service providers, to enterprises,
and of course, to those platform companies that are delivering solutions based on your processors.
When you look out at 2024, and I don't even know how long to make this question, given
how fast the industry is moving, but let's just say 2024, and you think
about where we may exit the year in terms of capability from the industry and what customers
will be doing with AI, what do you see in that time frame? And what are you most excited about?

Yep. It's funny because I will confess, Allyson, I chuckle sometimes when I talk to folks in the industry who say, oh, I totally knew this was coming and it was going to go this fast. And I'm like, well, you should have said something. There are a lot of people who can predict the future once it's the past. So I'll say, you know, cautiously, as I look forward, I do see enterprises moving into real deployment. They have to get actual business value out of
these investments. But I alluded to it earlier: I see this RAG concept as really getting legs underneath it. There have already been examples in the industry of businesses that have felt harm due to commitments their chatbots have made, for example. And nobody wants to end up in that position, but they cannot afford to sit out this revolution. And so that RAG model, that concept for enterprise AI specifically, I think
is going to become the standard, the foundation, whether it's, again, for your internal knowledge
base that you're using to help ramp and train employees, or whether you're using something for
customer-facing decisions, customer-facing recommendations. So I see a lot more focus
on what data I am going to use in order to deliver the best capabilities for my enterprise. And then
how am I going to ensure that I'm delivering it in a way that absolutely minimizes any type of hallucination while still giving me that opportunity.
I had the chance to go to the World AI Forum a few weeks ago and speak there and met with
so many customers, government agencies, every industry you can imagine.
It was a really great event.
And this was so top of mind for every chief data officer, for every head of AI,
you know, regardless of size of company industry they were in. And I could tell
they are actively shifting resources in this direction to solve this problem for their business.
Lisa, it was so much fun having you on the Tech Arena and learning a little bit more about what Intel is doing in the AI arena.
You're welcome back anytime.
One final question for you.
Where can folks go to learn more about what Intel is delivering on the portfolio and engage your team?
Oh, of course.
You know, we're out there in all the standard ways, on LinkedIn and on X, engaging with the audience. We have a ton of, you know, content that we're putting out, whether it's on intel.com or elsewhere, just trying to give people a chance to, you know, hear from us, see the work that we're doing in the software space, find ways to easily get to, again, their solutions. And I think you'll see Intel seeking to continue to be a lot more visible
in the industry. This AI world is not a one-company game. It's a worldwide challenge. And we absolutely, as you would say, want to be in the arena for it and are pursuing all the right conversations to be part of. So we do look forward to that engagement from customers and from the ecosystem.
So looking forward to continuing the conversation.
Thanks so much for being on the program.
It was a lot of fun.
Well, it's great to be here with you, too.
So we'll have to talk again soon.
Thanks for joining the Tech Arena. Subscribe and engage at our website,
thetecharena.net. All content is copyright by the Tech Arena.