In The Arena by TechArena - The Quest for Broad Data Center Advancement with Arm’s Eddie Ramirez
Episode Date: November 28, 2023
TechArena host Allyson Klein chats with Arm's Eddie Ramirez at the Open Compute Summit about the architecture's progress in the data center, growing a thriving ecosystem, and the sustainability advantages of Arm's design making it even more attractive to data center operators.
Transcript
Welcome to the Tech Arena,
featuring authentic discussions between
tech's leading innovators and our host, Allyson Klein.
Now, let's step into the arena.
Welcome to the Tech Arena.
My name is Allyson Klein. We're recording at OCP Summit in
San Jose, California. And right now I'm very delighted to be joined by Eddie Ramirez from
ARM. Welcome to the program, Eddie. Glad to be here and glad to be at OCP Summit 2023.
So you are a first-time guest on the Tech Arena. So why don't you just take a moment and
introduce yourself and your role at Arm.
I'm Vice President of Marketing for the Infrastructure line of business at Arm.
So at Arm, we tend to focus on four segments of the market and have lines of business associated with each of those.
Mobile and client would be one, automotive would be another, IoT, and then infrastructure, which is everything data center, 5G infrastructure, networking infrastructure, and even HPC. We had Arm on the podcast earlier this year. A gentleman from your
edge group came on and talked about what you were doing in networks. And I knew as I started looking
at data center sustainability and data center efficiency that I wanted to have you guys back on
around data center. And I'm so glad that you're here. You've made incredible headway in data centers. Can you provide some perspective
on where you are on the journey? Five years ago, people were questioning whether Arm would ever make it
in the data center. Now we're at a moment where Arm is becoming a viable alternative.
And what are you doing with customers in terms of considering Arm
as a viable platform choice for their data center workloads? Yeah, that's a great way to start. So
I think what some people may not realize is that ARM is prevalent in a lot of places in the data
center. I mean, we've been in SSDs and hard drives for many years. And so we started with traction in peripherals.
We've moved up into accelerators.
So now you're seeing Arm even in network acceleration devices.
And really, the breakthrough for Arm was being able to offer solutions in the server space as well.
And so then the server SoC was probably the last area where we needed to be able to have performant Arm solutions,
and we now are there.
Five years ago, I would say the big catalyst for us was getting AWS Graviton launched into the market
and showing that a hyperscaler could now go and vertically integrate from software to hardware
and deliver compelling solutions.
And we're now seeing more proliferation of ARM in the cloud.
Now, the topic of the day at OCP has been the AI era
and what is needed for the AI era.
And I just want to hit that accelerator topic right out of the gates.
All of the keynotes seem to have touched on AI in some fashion.
Exactly.
So you mentioned accelerators and ARM being a real key ingredient to those accelerator solutions.
Tell me what those look like and how you're playing with the rest of the ecosystem in delivering core capabilities.
Well, I'll talk about two partners that are actually in our booth here at OCP, right?
The first being NVIDIA.
So NVIDIA's got a great demo around Grace and Grace Hopper. For users who aren't familiar with that solution: effectively, they married a very high-performance server SoC with GPUs, right, and are
now co-designing those two together to be able to deliver the best in class performance for an AI
system. And so Arm now gives tools for companies like NVIDIA to be able to do that co-development.
The second partner we have in our booth is actually a startup called NeuReality.
And what NeuReality is doing, effectively, you can think about it as an AI DPU card.
So they're able to manage all sorts of accelerators,
be able to manage the ingestion of models, transforming those models
so they run as easily and efficiently as possible on the compute.
And I think one of the talks that really resonated with me here at OCP was the one
from Meta, where they came at it and said, hey, there isn't a one-size-fits-all solution for AI.
For some models, we're going to need a lot of memory; for other models, we're going to need more compute.
So one-size-fits-all AI hardware is very difficult to try to crack.
And so we love that, because at Arm, we can give partners the ability to design customized silicon for whatever use case they want to optimize.
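To make the "ingest and transform a model" step concrete, here is a minimal sketch of exporting a trained model to a portable format (ONNX) that a downstream accelerator compiler can target. It assumes PyTorch is installed; the tiny model and file names are hypothetical, and it illustrates the general technique, not NeuReality's actual toolchain.

```python
# A generic illustration of model ingestion and transformation: export a
# trained PyTorch model to ONNX, a portable graph format that accelerator
# compilers can consume. Hypothetical model and names, for illustration only.
import torch
import torch.nn as nn

# Stand-in for an ingested customer model.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
model.eval()

example_input = torch.randn(1, 128)
torch.onnx.export(
    model,
    example_input,
    "model.onnx",                              # portable artifact for downstream compilers
    input_names=["features"],
    output_names=["logits"],
    dynamic_axes={"features": {0: "batch"}},   # allow variable batch sizes at inference
)
```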
These AI accelerators are a perfect example of a broader trend around heterogeneous computing, whether it's a chiplet interface, multiple chips on a board, or multiple solutions in a rack.
What do you see in terms of broad market applicability of heterogeneous configurations?
And how is the software ecosystem keeping up with these core capabilities? Yeah.
So, you know, we're now seeing kind of this explosion of specialized compute
and silicon that's being designed for specific workloads,
because that's the easiest way to accelerate them
and also, in many cases, to deliver it with an efficient TCO, right?
So how do you make sure that it's energy efficient at the same time?
And so for us at Arm, what we're now trying to do is make it as easy as possible to design
custom silicon. And so we've kind of grown our approach from a company that would provide,
you know, pieces of IP that somebody would then go and design with, to now asking how we make the
building blocks even easier for companies to consume. So yesterday we announced our Arm Total Design program, where we're working with an
ecosystem of partners.
We signed up 13 different partners, everything from foundries like IFS and TSMC, to third-party
IP vendors whose IP is complementary to our compute subsystem IP, to ASIC
design houses who can then integrate
custom acceleration onto these designs. Very cool. And the firmware companies who now build
firmware. So you're looking at the end-to-end design flow. How do we preemptively work with
these partners to make that design of custom silicon as easy and as fast as possible, and at
the same time, de-risk it.
Because designing custom silicon,
especially on leading edge process nodes,
requires a lot of investment.
How do you make sure that we accelerate that
and de-risk that?
And that was part of the program
that we launched yesterday at OCP.
That's fantastic.
I'm really excited to hear about that.
That's exactly what's needed in this moment.
And just to answer the other part of your question,
because I didn't touch on this,
but the software piece is super important, right?
Because you can have all of this customized silicon,
but they all need to be able to run software
in a very similar manner, right?
Because you're not going to have software developers
that want to have to tune their software
for so many different types of silicon
and system solutions out there. And so
for us, the software piece is just as important. We've been investing in the software ecosystem
for over 10 years, particularly in the data center space. We've got programs like Arm SystemReady
that are geared to providing standards so that software just runs on our solutions.
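As a small illustration of what "software just runs" means in practice, the sketch below checks which ISA a Python process landed on. Well-behaved portable software rarely needs even this much, which is the point of standardization efforts like SystemReady; the snippet is a generic example, not part of any Arm program.

```python
# A trivial portability check: pure-Python code like this runs unchanged on
# x86 and Arm, and most software never needs to ask which ISA it is on.
import platform

machine = platform.machine()
if machine in ("aarch64", "arm64"):
    print(f"Running on a 64-bit Arm system ({machine})")
else:
    print(f"Running on {machine}")
```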
Obviously, we're at OCP Summit. Sustainability is a key topic.
OCP announced sustainability as a core tenet last year at the summit.
You guys are known for energy sipping cores.
How should we view ARM within the broader context of sustainability?
And what are you doing with the industry to drive more efficiency?
Well, that's a key part of the value that Arm brings in terms of solutions in the data center space.
We've taken technology that for years was being applied to mobile devices that had to be very efficient in computing because they ran on batteries.
And we were able to scale that to high levels of performance, yet be able to maintain the energy efficiency in those solutions.
And so many of our partners that are going to market, whether it's in the accelerator space, the AI space, or in the server space, they are all showing the advantages of energy efficiency and how they can actually drive compute up, but yet maintain the power
efficiency that Arm is known for. And so we've got, for example, solutions like Ampere's,
which now have 192 cores on a single server SoC, but they're all doing that within a power budget
that people can actually deploy without liquid cooling. They even have some solutions that are
fanless out there.
So I think that's an important piece, right?
Because I think you're now in a time where you're seeing in some parts of the world
that energy costs are rising sharply,
and energy is becoming the main factor in deciding where you're going to build a data center, right?
Where can I get clean energy, and where can I get enough of it to drive the compute?
Now, I'm glad you brought up Ampere.
One question that I have, for those in the virtual room who don't think about TDPs all the time:
can you give some comparison to x86 alternatives in terms of what the energy footprint looks like in an Arm world versus other worlds?
Oh, yeah.
TDP, thermal design power, is effectively the total power that is going to get consumed by a processor.
It's very important when you design a server, because you need to figure out how you're going to feed and cool that processing power.
So now we were getting to solutions that on average were about 200 to 300 watts TDP.
And that's what most of the market was designing towards, right? And if you wanted to increase and add more cores, like do two-socket systems,
well, guess what? Now you're doubling the TDP. You go to four-socket systems with four SoCs,
and now you're really dealing with very power hungry systems. One of the advantages with what
Ampere is bringing to market is they've increased the
core count to such a high number, right, that you're able to effectively get servers that are
single-socket that have as great a performance as you would in the past with two- or four-socket
systems. That alone is a fantastic TDP savings, right? And they've been able to do that without
blowing up the TDP budget.
So we're really excited to continue down this path
of having more solutions
that do not explode
the TDP budget in systems.
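A quick back-of-envelope sketch of the socket math Eddie walks through. The 250 W figure is the midpoint of the 200-300 W range he cites; the 48-core legacy socket and the assumption of per-core performance parity are illustrative assumptions, not vendor numbers.

```python
# Back-of-envelope TDP math: multi-socket systems versus a single-socket,
# high-core-count part. Wattage is the rough figure from the conversation;
# core counts and per-core parity are simplifying assumptions.
SOCKET_TDP_W = 250          # midpoint of the ~200-300 W range mentioned
CORES_PER_SOCKET = 48       # assumed core count for the legacy socket

def system_tdp(sockets: int, tdp_per_socket: float) -> float:
    """Total processor TDP for an n-socket system (interconnect ignored)."""
    return sockets * tdp_per_socket

for sockets in (1, 2, 4):
    cores = sockets * CORES_PER_SOCKET
    print(f"{sockets}-socket: {cores} cores, {system_tdp(sockets, SOCKET_TDP_W):.0f} W")

# A single-socket 192-core part (like the Ampere design mentioned) matches the
# core count of the 4-socket build within one socket's power budget, assuming
# roughly comparable per-core throughput.
print(f"1-socket, 192 cores: {system_tdp(1, SOCKET_TDP_W):.0f} W")
```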
When you look at AI,
and there's been so much talk about it,
and this is a weird question
to ask at OCP
because we are in a conference
of the haves versus the have-nots.
But we expect massive growth in AI
in the years ahead, both in training and in inference. How do we work together as an industry
to ensure democratization of infrastructure? And how does ARM contribute to that?
Yeah, that's a good question. And I think you see the excitement, especially what ChatGPT unleashed on the world, around what AI
could do. And, you know, the first area of focus that I think people landed on was the cost of
training these models, right? As you start getting to trillion-parameter models, the amount of
compute cycles to do the training of those models was, I think, the initial area where
people were kind of concerned, right? There's this talk that just doing a single training run on a
large language model consumed as much electricity as five or six automobiles
driven around for a whole year. And that's just one training run, right? And so that definitely
was the initial problem. But I think now what you're seeing is that the models have reached
a level of intelligence where they're able to be applied, and now people are looking at how do
we optimize these models for specific use cases. And now you're going to start seeing the shift
towards inference, right? And we think that 80% of compute cycles is actually going to come from inference.
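The automobile comparison is easy to sanity-check with rough numbers. Every figure in the sketch below is an assumption chosen for illustration (real training runs vary by orders of magnitude with model and cluster size); with these particular inputs the result lands in the same ballpark as the five-or-six-cars figure quoted above.

```python
# A back-of-envelope version of the "one training run = several cars driven
# for a year" comparison. Every number is an illustrative assumption, not a
# measurement of any particular model or cluster.
ACCELERATORS = 128            # assumed accelerator count for one run
AVG_POWER_KW = 0.7            # assumed average draw per accelerator, kW
RUN_DAYS = 30                 # assumed wall-clock duration of the run

MILES_PER_YEAR = 13_000       # typical annual mileage
MPG = 30                      # assumed fuel economy
KWH_PER_GALLON = 33.7         # energy content of a gallon of gasoline

training_kwh = ACCELERATORS * AVG_POWER_KW * RUN_DAYS * 24
car_kwh_per_year = (MILES_PER_YEAR / MPG) * KWH_PER_GALLON

print(f"One training run: ~{training_kwh:,.0f} kWh")
print(f"Car-years of driving energy: ~{training_kwh / car_kwh_per_year:.1f}")
```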
Yeah, I agree.
There's been a lot from training, because the models have had to kind of step up in their level of intelligence.
But now that we're at a certain point, you're really going to see inference be a key part.
And so we definitely think that from the ARM side, we can play a key
role in enabling inference, not just in the cloud, but also at the edge, and even
in mobile phones as well. Right. So I think that's the next area where you'll see a lot
of discussions, right, around making sure that we can run inference in a manner that can happen in
the cloud, that can happen in an edge server,
it can happen in a networking gateway device, and even on a mobile phone.
That's fantastic.
And I want to go under the covers a little bit on the Arm roadmap and what you have
aimed at the data center.
What's exciting in the portfolio for data center workloads?
And what do you expect for innovation heading into 2024?
Oh, yeah, good question.
For us at Arm, all of the products that we do for infrastructure and data center
are branded as Neoverse.
So we have a Neoverse roadmap.
We have three different kinds of cores: the V-series, N-series, and E-series.
And along with that, we have a high-speed interconnect
to be able to do 100-plus core designs very efficiently.
And what we've added to the roadmap is what I mentioned:
we've increased the integration of this IP
into what we call our compute subsystem. So at Hot Chips, actually,
in August, we introduced our first compute subsystem publicly. It got quite a lot of
fanfare. But now we're talking to customers about these integrated design points,
right? Being able to offer those and eventually being able to offer those
with chiplet type interfaces.
Sure.
Because that really is going to be
the next frontier of chip design:
people integrating chiplets
to build a lot of these specialized SoCs.
And so we're trying to get ahead of that
and be able to establish now a chiplet ecosystem
with ARM solutions and with
Neoverse CSS as part of it. Interesting. I love this conversation, Eddie. Thank you so much for
taking time out from your busy OCP schedule. One final question for you. Where can folks find out
more about the products that you talked about today and engage with the ARM team? Yeah. So again,
for anybody that's here at OCP, come by our booth and see some of the demos
and meet some of the team. For those folks that want to see online, obviously arm.com is a great
place. We've also got developer.arm.com. So if you're a software engineer and want to learn
how to program on Arm and how the software ecosystem is enabled, that's another great place to
look at as well.
Thank you so much for being with me today.
Thank you.
Thanks for joining the Tech Arena.
Subscribe and engage at our website, thetecharena.net.
All content is copyright by The Tech Arena.