In The Arena by TechArena - Optimizing AI Scale-Out: Cornelis Networks’ Vision with Lisa Spelman
Episode Date: September 12, 2024
Lisa Spelman, CEO of Cornelis Networks, discusses the future of AI scale-out, Omni-Path architecture, and how their innovative solutions drive performance, scalability, and interoperability in data centers.
Transcript
Welcome to the Tech Arena,
featuring authentic discussions between
tech's leading innovators and our host, Alison Klein.
Now, let's step into the arena.
Welcome into the arena.
My name is Alison Klein, and today I've got a really
exciting guest with me. I'm at the AI Hardware Summit in the Bay Area, and Lisa Spelman,
CEO of Cornelis Networks, is here. Welcome, Lisa. Thanks for having me, Alison. It's super
exciting to be here together with you and in the arena with you. What a difference a
few months makes. You were on the program a few months ago with a completely different company. You've made a tremendous career move and are now the CEO
of Cornelis Networks, steering that ship. I have so many questions to ask you about this, but let's
just start with an introduction of Cornelis. Cornelis has never been on the program before.
Yeah, thank you again for having me. So I'm excited to be your first guest from the company. So Cornelis Networks is a company that specializes in data center fabric and interconnects.
And we are really at the intersection of the growth in massive scale-out systems, how
to unlock more performance out of AI systems that are being built all over the world, and
high-performance computing.
So really seeing that convergence come together. And we think we have a really unique position, some great durable
architectural advantage and the opportunity to scale and just solve this next frontier
of system optimization that's ripe for innovation. And when I say that next frontier, so much has
been done at the compute layer and so many really great investments have been made and so many
people are solving these huge problems of how to handle these massive training models.
And there's opportunity yet still at the network layer to actually improve so much of the performance of those GPUs, of those AI acceleration ASICs,
of the whole system. And that's where we come in. We're ready to solve that challenge for our
customers. You know this market really well. I've watched you engage in this market for a long time.
You know these players intimately. I feel like watching the cloud providers build out AI clusters is like watching a never-ending season of The Amazing Race as they look for AI dominance and their path to AGI.
How much do supply chain constraints impede their progress? And does this open the door
for consideration of new technology entrants like Cornelis?
Yep. So I just want to start by saying, if we're ever on Amazing Race together,
I'm the driver. Okay. So I claim that you can be the navigator. I drive an M2. I'm just saying. I'm
a good driver. You have to eat all the bugs. Okay. But now that we've got that cleared up,
I do think supply chain constraints have had an impact. But when we look at the arc of what we're
doing or the arc of computing and what's happening,
this is all stuff that's going to get solved. And it's not that we're necessarily focused on,
again, addressing a specific supply chain constraint. We're addressing a gap in the
capability of that really large scale out. And so what Cornelis has that's unique to us and what we can offer to our customers
is this ability to not only have competitive bandwidth, which is very important for artificial
intelligence, but also getting into extremely low latency, improving the message rates and
driving that ability to add GPU after GPU while not impacting the performance.
So that scalability differentiation makes a huge impact.
So you can take your idle GPUs, you can take your unused compute capacity and put it to work because you can feed it with the data.
Now let's take a step back.
Cornelis' product portfolio is based on a technology I know really well, OmniPath.
How does this technology differ from other high-speed fabric technologies like InfiniBand?
And why do you see this as a winning solution for AI fabrics?
Yep.
And the OmniPath architecture, we do really believe, offers a durable competitive advantage for our customers.
And it gets into some of what I was saying around, it's not just throwing more
bandwidth at the problem. And as you get to larger scale, you do start to run into some of those,
again, latency, message rate, those scalability issues. And there's really three architectures
that are deployed currently in the market. There's the InfiniBand architecture, there's
the Ethernet architecture, and there's the OmniPath architecture, which up until now has been deployed primarily in high-performance computing.
Yeah, and we bring some of those
capabilities that were at first only important in high-performance computing, but actually now are
quite important in AI scale out. And, you know, we've seen this market. Yes, the hyperscalers are
big consumers of technology. They're going to set the pace for the frontier models. And I see that continuing into
the future as we see it for now. But this enterprise AI and kind of public cloud next wave
AI is going to be a huge marker of investment for the industry as well. And there's going to be
customers that are driving tremendous innovation and for a variety of reasons may not be consuming all of that training
or that inference from a cloud provider.
So we see the market as really split
across multiple segments,
but having the technology portfolio
to address all of them.
And again, I know from my previous experience
how much a small improvement in utilization can make a complete difference to your total cost of ownership.
I know how much it can save you when you can pull a watt of power out of a system.
So you look at these optimizations that we're offering to these massive scale-out systems, and I think it's a pretty compelling story. The other thing we're doing is that our products
are going to support a very interoperable
and multi-vendor environment.
NVIDIA GPUs have been deployed as a standard.
AMD is making tremendous progress with their GPU products.
Companies like Google investing in their TPU,
Microsoft investing in their own
custom silicon. And we're fully prepared to support any and all of it across the board. So that offers
a really nice environment for customers that aren't going to just be one single vendor for all
of their AI training, inference, and high-performance computing. You know, I think it's something that
we both know
well that customers do not like vendor lock-in and they don't want a single source. So this opens up
an incredible opportunity. Now, one thing that we've been covering on Tech Arena that I need to
ask you about is Ultra Ethernet. All of the major cloud providers have thrown their hats into the ring to
support Ultra Ethernet. And it feels like another day of the same story with Ethernet of we are
just going to get more out of this technology and we're going to throw more development at it.
And we've done this many times in the industry.
How does Cornelis embrace Ultra Ethernet?
You guys are part of the initiative.
And how do you see that working hand in hand with OmniPath?
Well, I'm glad you asked, actually, because this is one of the most exciting things about our roadmap.
Ethernet is 45 years old.
That's amazing.
A technology that has done so much for the world of computing.
But it does have, again, some of those what I'll call architectural limitations.
And on the whole, we are absolutely supportive of this move towards Ultra Ethernet and are helping shape what it actually looks like and how it is defined.
What it offers us is the ability
to pair it with our architecture. So if you think about a training cluster or even an inference
cluster, you're going to be able to, within that cluster, get the absolute peak performance,
price performance, power savings, GPU utilization, again, latency, message rates, all of that at the absolute max of what the OmniPath architecture can deliver.
And then we're going to have this Ultra Ethernet compatibility that allows that cluster and that system to appear to the rest of your data center as like-for-like.
And so it'll make interoperability so much easier. So you get the best of the performance where you absolutely require it and need it and the price performance and then to the rest of your data center.
So as you connect to your CRM or your ERP or whatever it is where the data is coming in and out of, you have that opportunity to look like just another system within the data center. So I think that interoperability that
we're pursuing with both Ethernet and Ultra Ethernet is going to be a game changer for the way that the
OmniPath architecture can fit into a large data center. And one of the things that you spoke about
on that answer that I think is so important, all of these organizations, whether they're
hyperscalers or enterprises, are still running legacy applications and need heterogeneous solutions to drive that.
So this makes a lot of sense, gives a lot of extensibility in terms of the technology.
That leads me to my last question.
What can we expect from Cornelis now that you've taken the reins of leadership?
That's one that we can unpack over a glass of wine at some point.
But what does 2025 look like for the company?
Yeah, we're super excited about the future.
It's just such an exciting place to be.
And it's really cool to be part of this build out and all the innovation that is being
unlocked right now.
So for 2025, it's a big year for us.
We're bringing out our next generation of products.
We're building a world-class engineering and execution team
that's excited to take on these challenges
of this Ultra Ethernet definition
and integration into the products.
And we're really ramping up our go-to-market
and sales efforts
because we're getting out in front of customers
and making sure everyone knows who we are
and what we have coming next.
And I think just as a company, we're feeling so much energy and momentum about this place
where we sit right now, where we have the opportunity to really help solve our customers'
biggest problems.
I mean, I said it earlier, and I can't think of a better way to say it.
We really feel like we're sitting on the edge of the next optimization frontier in the AI scale-out system. And it is fully our intention to be the absolute best option for our customers. Lisa, this is a technology that I've known for a long time and a leader that I believed in for a long time coming together. I'm still back on your
comment that I have to eat all the bugs, but thank you so much for being here with us today. We'd
love to have you back on the show. Yeah. Thank you, Alison. It's great to be here. Great to be
connected. I appreciate your kind words and we can negotiate over maybe there's a certain bug I'll
eat. Sounds good. Thanks so much. And thanks for this episode of the Tech
Arena. Thanks for joining the Tech Arena. Subscribe and engage at our website,
thetecharena.net. All content is copyright by the Tech Arena.