In The Arena by TechArena - Data Insights: Arm’s Eddie Ramirez on Data Center Innovations

Episode Date: October 16, 2024

Join Allyson Klein and Jeniece Wnorowski as they chat with Eddie Ramirez from Arm about how chiplet innovations and compute efficiency are driving AI and transforming data center architecture....

Transcript
Starting point is 00:00:00 Welcome to the Tech Arena, featuring authentic discussions between tech's leading innovators and our host, Allyson Klein. Now, let's step into the arena. Welcome to the arena. My name is Allyson Klein. And today we're coming to you from the Open Compute Project Summit in San Jose, California. And it's another Data Insights podcast. So that means Jeniece Wnorowski is with me. Welcome to the program, Jeniece. Thank you. Thank you, Allyson. It's great to be back.
Starting point is 00:00:39 I am so excited for the interviews that we're doing this week at OCP. And how has the show gone for you so far? Oh, my gosh. I've been here, what, three years now in a row? And this one has exceptional energy. Just really interesting topics this year. Yeah, I've heard that it's the largest OCP Summit in history. Over 7,000 people, which is incredible.
Starting point is 00:00:59 I remember back in the day when it was like 50. And so this is a stark change. Why don't you introduce our guest? I'm excited to introduce Eddie Ramirez, who is the VP of marketing for Arm. Welcome to the show, Eddie. Thank you for having me. It's great to be here again for a second year in a row. I know. Incredible. And I was thinking about that, Eddie, that it's been a year since we first met on this podcast. And you had talked about the launch of a new ecosystem program at the time. It was really cool to see this from Arm. And I just want to say welcome back to the program. And I want to understand the progress
Starting point is 00:01:34 that Arm is making in the data center and what has happened since last year. It feels like so much has happened in a year. Just for Arm within the infrastructure space, we had a lot of partners who launched custom silicon, including Microsoft with the server CPU announcement that came to the market shortly after OCD last year, and then Google announcing their own custom ARM processor. So it basically now makes all the major hyperscalers have ARM-based silicon in their fleets today. They're using them for internal workloads and they're using them for customer instance types as well. So really exciting the progress that we've made.
Starting point is 00:02:10 Some of those engagements have leveraged some of our key technologies, like our compute subsystems, which I talked about with you last year. So it's good to now see the proof points of how the compute subsystem products are helping customers deliver Arm-based silicon. So, Eddie, the discussion here at OCP has really been around the emphasis on compute efficiency. Can you tell us
Starting point is 00:02:32 a little bit about how Arm contributes to the overall efficient delivery of systems? Yeah, it's been in our DNA to really provide compute-efficient architectures. We've seen partners like AWS, on their Graviton products, be able to achieve 50% to 60% better performance efficiency from Arm-based servers versus x86 servers. And that's huge because, at the end of the day, power is probably the biggest cost of maintaining and running these servers. So in a big cloud environment, at the scale those major cloud providers operate, that provides significant benefits in the overall TCO.
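[Editor's note: to ground the TCO point, here is a back-of-the-envelope sketch of per-server power cost. The wattages, electricity price, and PUE below are illustrative assumptions, not Arm or AWS figures.]

```python
# Rough annual electricity cost per server; all numbers are
# illustrative assumptions, not vendor-published figures.
HOURS_PER_YEAR = 24 * 365

def annual_power_cost(watts, price_per_kwh=0.10, pue=1.4):
    """Yearly power cost for one server, with a PUE multiplier
    to account for cooling and power-distribution overhead."""
    kwh = (watts / 1000) * HOURS_PER_YEAR * pue
    return kwh * price_per_kwh

x86_cost = annual_power_cost(watts=400)  # assumed average draw
arm_cost = annual_power_cost(watts=250)  # assumed average draw

print(f"x86-class server: ${x86_cost:,.0f}/yr")
print(f"Arm-class server: ${arm_cost:,.0f}/yr")
print(f"Savings: ${x86_cost - arm_cost:,.0f}/yr per server")
# Multiplied across a fleet of hundreds of thousands of servers,
# the efficiency delta dominates total cost of ownership.
```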
Starting point is 00:03:05 So we're really seeing that that's the key to Arm's success. Now, as Arm makes its entry into broader data center deployments, if you go into the expo hall, everybody's showing off these massive GPU clusters, and you might think that the era of CPU-centric computing in the data center is dead.
Starting point is 00:03:26 How do you view that? Do you think that is true? Or do you think it's very much overblown? Where are we with the use of CPUs within data center workloads? And how do we look at that in the era of AI? Yeah, I think accelerated computing has been here for a while, right? And GPUs as accelerators in the data center, that's really just grown with AI. And I think the way folks are implementing GPUs has been a reaction to how we get enough compute for AI training, particularly for these really big models. And that's been the effective way to do it. But going forward, more and more people are actually going to look at, okay, how do I monetize these models? And it's going to be through inferencing. And so inferencing is not going to need the same 200- to 250-kilowatt rack systems filled with GPUs that you needed for training.
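[Editor's note: a minimal sketch of the CPU-only inference Eddie describes, using ONNX Runtime pinned to its CPU execution provider. The model file, input shape, and input name lookup are hypothetical placeholders; this is an illustration, not Arm's published benchmark setup.]

```python
# CPU-only inference sketch (illustrative; "model.onnx" and the
# 1x3x224x224 input shape are hypothetical placeholders).
import numpy as np
import onnxruntime as ort

# Restrict execution to the CPU provider -- on an Arm server this
# runs entirely on the CPU cores, with no GPU involved.
sess = ort.InferenceSession(
    "model.onnx", providers=["CPUExecutionProvider"]
)

x = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = sess.run(None, {sess.get_inputs()[0].name: x})
print("output shape:", outputs[0].shape)
```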
Starting point is 00:04:13 We're actually going to be able to do a lot of inferencing using ARM-based compute servers as well. So I wouldn't say it's overblown. I just think it's dominated the conversation to where people associate that as the only option. And we don't really feel that's the case. In fact, a lot of what we've published in blogs within our ARM properties has really been about showcasing the performance and power efficiency of ML inferencing on CPU. So as you work with different players in the industry, how do you see the entire infrastructure foundation transforming across networking, compute, and storage?
Starting point is 00:04:49 Yeah, that's an exciting thing. I think at this show in particular, you're now seeing all of these elements as areas within an AI server that need to scale very quickly to handle the data demands of AI. So obviously the compute side of it, the networking side, they're all having to scale much faster than before. For us, it's exciting because, guess what? There are Arm cores in all of those devices. We have Arm cores in the top-of-rack switch. I talked about the server CPU. Networking equipment like DPUs are deploying Arm cores. Even BMC chips, right, which are
Starting point is 00:05:23 how you access manageability features, have Arm cores. So it's been great for us, right? Because we're now having discussions with folks on how to get more Arm cores. One way to do that is through chiplets. Chiplets are what we really focused on within Arm Total Design: how do we take that ecosystem and have it be the ecosystem where you know you can get chiplets that you can integrate into these other designs very quickly? Now, we talked about last year's news, but that's last year's news.
Starting point is 00:05:52 And today you made a big announcement. Can you tell me about that? Yeah. So along the same lines, right, showing the power of chiplets to accelerate design was really what we focused this year's Arm Total Design announcements around. So we have eight different partners who, within Arm Total Design, have announced chiplet projects that they've kicked off. Everything from 16-core to 64-core chiplets that you can use in a variety of products. And more exciting, we announced one particular partnership where we brought together
Starting point is 00:06:20 Samsung Foundry, AD Technology, a Korean ASIC design partner, as well as Rebellions AI, a startup doing NPU AI accelerators. And we're showing how, through the Arm Total Design partnership, we're able to actually provide an integrated design that delivers three times better performance efficiency than GPUs for large-scale AI workloads. And so you're talking about the leading-edge products from NVIDIA, how we can actually do better from a performance and power standpoint, all leveraging chiplet-based designs with these partners and showing that we can actually integrate these. And the reason that's a big deal is because today, yeah, there are chiplet designs, but it's all done by one company. And this concept of being able to leverage chiplets between companies, reuse those components,
Starting point is 00:07:07 we're now proving that out in an AI use case. That's fascinating. I've learned a lot about chiplets now, and hearing about some of that partnership is amazing. Can you tell us a little bit more, though, about how Arm is innovating its processor design to keep pace with the changing landscape? Yeah. What's exciting is we have our Neoverse platform, which is the products, the processor cores, and the interconnect IP that we are developing specifically for the infrastructure market.
Starting point is 00:07:35 And so a lot of these partners are now getting access to our newest Neoverse V3 platform as well. So it's a completely integrated design that they can take and augment, adding I/O and different capabilities to it. And it's exciting that, for these kinds of partnerships, we can bring our latest technology and actually apply it to these large-scale AI designs. Now, one final question for you.
Starting point is 00:07:59 Where can folks engage with you and learn all about the ecosystem programs, the new products, everything that you've talked about today, and engage your team? First of all, if you're at OCP, find us in either the expo hall or some of the sessions. I think we've got our people in about 12 different sessions. Everything from how to leverage chiplets to TCO case studies to open firmware, the broad variety of areas where we're at. And also, you can learn more on our website.
Starting point is 00:08:30 I just did a blog today with some of the announcements. I encourage people to go on our newsroom blog and learn more as well. And feel free to reach out on LinkedIn. I'd love to talk to more people who are interested. Thank you so much for spending a little bit of your very busy OCP schedule with us. It was a real pleasure. Always a pleasure for me as well. Thank you again.
Starting point is 00:08:46 Thank you so much. Thanks for joining The Tech Arena. Subscribe and engage at our website, thetecharena.net. All content is copyright by The Tech Arena. Thank you.
