In The Arena by TechArena - Cornelis and OCP Examine Why AI Infrastructure Must Evolve
Episode Date: January 16, 2026
From the OCP Global Summit, hear why 50% GPU utilization is a “civilization-level” problem, and why open standards are key to unlocking underutilized compute capacity....
Transcript
Welcome to Tech Arena, featuring authentic discussions between tech's leading innovators and our host, Allyson Klein.
Now, let's step into the arena.
Welcome in the arena. My name's Allyson Klein. We are coming to you from the OCP Summit in San Jose,
and I am so delighted because we are with Lisa Spelman, CEO of Cornelis Networks, and Zane Ball, CTO of OCP,
for a fun and delightful conversation about everything around AI computing.
Welcome to the program, guys.
Thank you.
Yeah, thanks for having us.
OCP Summit is all about the future of AI.
When you think about leading versus following in AI infrastructure,
what does this distinction mean to you?
Maybe I'll go first on this one.
But in the world of AI, I feel like, gosh, I hate to say that there almost is no follow.
You have to be leading, you have to be foot forward.
and you have to be solving not just today's pressing problem,
but tomorrow's pressing problem.
The hardest part, I think, that we're all facing
is seeing that future in a world that moves so fast.
Anyone can think of two years ago,
what we were talking about a year ago, six months ago,
and that pace just keeps accelerating.
So for those of us working on long-term projects,
Silicon Innovation to support AI,
it really is a big thinking challenge
to stay on that front foot and be leading.
I think the follow is hard.
From my perspective, it feels like up until now,
in the infrastructure world, AI has been kind of inflicted on everyone.
Yeah, exactly.
I was talking in the keynote yesterday about how, you know, we've run out of words,
because for years, people have come to these conferences and talked about unprecedented, exponential.
And then this is like a hundred times bigger than what we were describing.
So, yeah, you've been saying unprecedented for years.
But actually, no, it's really unprecedented.
And it's like civilizational infrastructure building out.
And it's the degree of it: it's really hard to put a supercomputer on the ground every week at some of these companies.
And leading on AI means getting on top of this for a while.
It's like, you know, especially at the facility and data center level, we've got to change our minds.
No, we've got to think very differently about how this is done.
And instead of running around with our hair on fire, it's time to get really proactive and lead on these things.
and do it in collaboration.
Yeah, I like what Zane said,
but I think there's so much pressure
when you're an operator in these roles.
So separating out from the hair on fire,
which has led to amazing advancements,
even, like I said, over the last six months
in what we're able to do,
and finding the time and the brain power
to step back from that
and think about it more systemically
is a real challenge
because the hair on fire doesn't stop.
And so, yeah,
you're trying to create space
for that type of thinking
inside of an incredibly constrained talent world as well.
So it's a big civilization challenge.
I wanted to ask you this question because I know that you worked very closely at a previous company,
building foundational technology with the industry and delivering this broad proliferation
of data center capability.
When you look at this moment, what are we doing right and what do we still need to work on in terms of
showing up for this moment in civilization, if you will?
The way I can think about it, like when we go through big technological changes,
they're almost always built on top of the last one.
Like, you know, we didn't just get automobiles.
They would have never been possible without railroads.
But the ecosystem had to evolve very significantly to do that,
and you wouldn't have had the ability to build roads or finance things
had you not gone through that one, and you couldn't have done mobile had you not done
PCs. And I don't think you could do AI if you hadn't built out cloud infrastructure before.
But that's not to say that it's at all the same. AI infrastructure is really quite different
than cloud infrastructure. And so while we build on all the smart things we did in the past,
we have to solve a bunch of new problems. But we wouldn't even be where we are today if we
weren't using the supply chain and the ecosystem that has already been built and the problems
that have already been solved. We're doing things right
today that we did right before, because we're still doing a lot of those same things.
We just have a bunch of new problems that we've got to solve on top of it.
I reflect back on our time working together to build out that cloud infrastructure.
And the pace felt immensely challenging.
It pushed us.
It pushed our teams every single day to stay on top of it.
And it pales in comparison to what we're doing today.
I remember working on design cycles for silicon, trying to get them within
five years from concept through to production. And then I look at Cornelis now: much smaller
team, much more focused in a singular area of networking. And our design cycles are 12 to 18 months.
And so it's a whole pace change that goes along with it. But I do have to agree with what Zane
said about the foundations of each prior experience being what give you the launching pad for the
next one, even if the technological problem of a scale-out, respond-to-anything model versus a highly
parallel computing challenge is different architecturally. There's a lot of foundational work in there.
Now, you mentioned that you're working in networking. I think that one of the most interesting
things about AI compute is that it really takes the concept of balanced computing and dialed in
performance equilibrium between compute, network, and storage to the next level. Why is the network becoming so
critical at this point? And how do you see the industry working on that together to deliver
core capabilities within the network? Yeah, if I'm being fair, when I was really focused on
compute, I really said compute was the center of the system. And now that I spend all my days
in networking, I say networking's the center of the system. So that's either a me problem or that's
actually really a part of what's happening. But I just think that in the world we're living in,
compute advancements have had a tremendous amount of attention and resourcing and dollars
and innovation put towards them and have shown just massive scale improvements, like X factors of
improvements in order to keep up with the challenge. But networking, a lot of it's still running
on architectures that were built earlier. They're super proficient, but were designed for totally different
problems. And so this parallel computing opens up a new set of requirements
that we haven't fully adapted to.
And that's why you see all these systems
with 50% utilization of the compute.
And for me, when you step back from that,
it's a business challenge, it's an ecosystem challenge.
But actually, it is a fundamental civilization challenge
to build again on what Zane said,
because we simply can't afford from a power perspective,
from an infrastructure perspective,
from a supply chain perspective, from a minerals perspective,
we can't afford to leave that much
capacity underutilized, not helping solve these problems. So the thing that gets me excited
every day is working with this ecosystem to actually get after it and solve that and drive
that utilization of compute forward. And I see the network as one of the biggest unlockers
of that potential. I think the physics is on the side of the comms network becoming the center
of it. So it's not just me. Okay. Cool. Cool. It's like, how much power is in the network versus in
the compute, right? I mean, it's enormous, of course.
Exactly. People throw out like 30%. Yeah, it's a lot. But if you look at the very nature of the
transformer model and how you train and how you do inference, you have multiple networking problems
you have to solve. You have a scale-up network where you have to get all these GPUs acting
like they are one machine, fully coherent. That's a very difficult challenge, but it's also
different than a scale out problem where we have to still maintain some level of coherence,
but in a more relaxed latency domain. And, you know, it's like, okay, that's
getting to be a bigger and bigger set of challenges.
And then now we realize, oh, now I need to think about
the data hall and multi-hall networking.
And now I'm actually looking at scale-across
so that I can marshal global compute resources
to solve one problem and make that problem incredibly large.
And those connections, I think,
are a little bit like neural connections, like, you know, in your brain;
your brain is a connectivity problem.
And AI is fundamentally a connectivity problem.
If you can do the networking better,
you're going to save tons of power.
You're going to get more interesting AI possibilities.
So it's just a very natural place for the center of gravity of the industry to be, right?
I think so many people don't realize how small a scale
the challenge of scale actually shows up at in AI systems and in networking.
And you hear in the news about people putting together 500,000-GPU systems,
building to a million GPU systems.
But the real problems at scale literally start once you put eight together.
And it only gets worse from there.
It's shocking how small it starts at and how much opportunity there is to improve and the complexity of scale up as well.
So, yeah, it's an exciting problem to be a part of.
And one of the things I talk about, too, not just with my team at Cornelis but with all of us
in this domain and space and area we play in: like, you know your work matters.
You have that significance of we can really make a difference and impact here by improving the scalability of these AI systems.
I know that you guys understand market segmentation really well.
You understand the difference between enterprise and hyperscale intimately.
Enter the neocloud, providing new challenges in terms of the ways they operate
and the ways they want to engage with tech vendors.
How do you look, from your perspectives,
at serving these different deployment models, these different types of customers,
and the broad proliferation of AI adoption that's coming in the enterprise?
as well. And what are the key points that you think about in terms of delivering infrastructure
capabilities that address those markets? I think the neoclouds, and it's an interesting term because
what is OpenAI? Is that really a neocloud? Is that a hyperscaler now? Or what is that definition
anymore? But it's interesting having been in the industry for as long as we have, that no matter
how big the incumbents are, there's always a challenger coming with a slight take, a pivot
off of both the business model as well as the underlying infrastructure model. And I think that's
what we've seen with the neoclouds. Like what happens when you are building infrastructure first
and foremost and solely for this problem, for this application, not saying, oh, I have this
massive book of business and how do I also do this? Right. Right. There's a focus thing there that is
interesting because you're not trying to bolt a solution onto a bunch of existing infrastructure
and then only partially optimize it. It's very focused, and we see that a lot. The enterprise is
a bit different in that what we're seeing a lot with our customer base is a lot of folks that have
HPC applications in their business model. So whether that's oil and gas or automotive and drug
discovery, all those types of things, those are the ones that are moving most quickly towards
also having AI integrated into how they deliver their business.
And I think some of it, yeah, comes from their history of using compute as a fundamental
resource to drive their business. And so they have that mindset built in. And that's been an
interesting transformation as a lot of those companies are choosing to actually drive that
portion of their business in an on-prem or colo model because of the power
of the data.
I think from an OCP perspective,
it's a community.
We have lots of members of the community come together to work out solutions
in a lot of different segments, including enterprise,
which surprises people sometimes.
At an engineering level, I think what's being deployed in neoclouds
and what's being deployed in hyperscale increasingly look very similar.
And as we saw with the open letter call to action yesterday,
it's like there's a real desire out there to standardize elements of these really large-scale,
liquid-cooled, DC-voltage-distribution, massive systems.
I don't think you're going to see lots of those systems in some small enterprise data
center, right?
That's like a completely different thing.
And it doesn't mean there isn't going to be AI there because AI is just like a very big
thing.
It sort of divides into the frontier model world and other AI applications.
I think there's still a lot of ink to be written on what that tail of applications is and
what role smaller models play, what kind of inference solutions are going to be out there closer to
deployment and lower latencies to end users and those kinds of things. And I think we'll see a lot more
experimentation. I feel like enterprises are moving from a lot of experimentation more into like,
no, I'm really getting business value in the next 12 months. So I think we will see that segmentation
come into a little bit more clarity. But in the short term, you're still going to
read about massive gigawatt data centers. And it could be built by a neocloud. It could be built by
Oracle, it could be built by Google, but you may see people deploying their infrastructure
in all kinds of different modes.
We're already seeing that, obviously, and I think we'll see more and more of that, because
finding the power is the thing and finding a place where you can deploy and scale up or scale
down your capacity.
People are going to want lots of options for doing that.
And people, I don't know if they fully realize how much of the hyperscale and neocloud
is a shared business model.
I mean, the neoclouds are being used to provide excess capacity
and capabilities to try out new infrastructure styles.
I do think as an industry,
we are relying a lot more on standards than ever before.
OCP, the Ultra Ethernet Consortium, UALink, things like that,
coming together.
It's like these problems are too big to go out alone.
And so bringing people towards standards
and much more open, open source, open methodologies
will allow for faster solving and scaling of these problems.
So I think that trend will continue.
You know, it's interesting. I was preparing for our interviews this week, and I was thinking about
what was OCP Summit like last year versus this year? And there were seeds of exciting things
happening in the industry. I recently had a conversation with you, Lisa, where you said, we're doing
this with brute force. We need to do it more elegantly. That line has stuck with me. You mentioned
the open letter of needing to move faster. So as we look forward, and there's this strong imperative
to move faster as an industry, to work together,
to utilize the things that we've started
to actually drive broad proliferation of technology.
What do you think we're going to be talking about in 26?
I've called this wrong pretty much every year.
And he's the CTO.
I know what we're not.
It does feel like every year has a little zeitgeist to it, right?
A lot of it, like the open letter,
feels like that's kind of the zeitgeist
of this show.
I feel like we're going to decisively pivot
from a training-dominated ecosystem
conversation to an inference dominated ecosystem.
And I think that's going to create new products,
new opportunities, new standards that need to be developed.
And we're going to see a lot more conversation about that, because that's going to
become the first-order problem.
You already see people with workload splitting and some of the cool
technical advances where people are really architecting for that already.
And I think that's going to move center stage.
And then like I mentioned in my last comment, I think we're really going to see a lot
more enterprise use cases and people really making money with AI.
That's going to go hand in hand with that inference,
and that may surface new problems and opportunities.
Hopefully, we're celebrating that building out this extraordinary infrastructure
has gotten so much easier because of the collaboration across the different companies
that everybody seems so eager to do.
Yeah.
He gave two and he was supposed to give one.
So I'll share the one I have in mind, which I think in a year we are going to be talking
about: the rise of the enterprise.
And that's an area where I see so many advancements.
And I look at, again, just even at Cornelis: we're not just encouraging employees to try out new AI tools.
We've redefined the workflow of how work gets done.
And so that utilization of AI is a part of it.
And so, yeah, we're a silicon and a system and a solution vendor, but we're an enterprise ourselves.
And I see that.
And I see the way in the last even, I'll say six months, we've reshaped our workforce, our workforce expectations.
and how the actual work gets done.
And I think that we can probably move a bit faster
because we are smaller,
but the enterprise is right behind us in general.
It has to happen and it will.
I want to add a bonus question in here.
You guys have talked about the move from training to inference.
How do we look at agentic computing within that context?
And where do you think it fits in terms of changing workload dynamics?
Maybe I'm going to say something contrarian,
but actually Agentic, I see it as a continuation of this momentum that's already started
versus a complete redo where you used to have your entire workflow without AI in it,
and then you added it.
And then Agentic goes on top.
So I don't think it's as big of a step function change as no AI to AI built into everything you do.
So I see it as additive.
Like, we're already seeing adopters taking steps there.
Okay.
Yeah.
So much possibility,
but I think it's more of an evolution than a revolution.
I tend to agree, but I think one implication that it's going to have,
and I don't know if this is really going to be true or not,
but in my gut, I think that we've maybe neglected a little bit
the front side of the machine, the front-end network, the CPU, you know,
as we've engineered this incredible high-bandwidth fabric around the backside,
the GPU side of the machine.
You know, we have things like RAG, and we have all kinds of interaction
between that machine and traditional databases.
And I think as the agentic world builds out,
there's going to be more bandwidth on the front end.
Security is going to be really challenged.
And I think we're going to have to re-engineer the front end of the machine.
And I don't know, just a faster NIC
feels to me like an insufficient response to the moment.
I think we'll rethink the architecture of the front end of the machine quite a bit.
And probably smart people are working on that.
But I haven't seen a lot of discussion in the open yet.
But I think it's going to demand more of that.
And it's going to create interesting opportunities for that side of the
equation, in that part of the ecosystem, which means some players that maybe haven't been in that
LLM training ecosystem are going to get some opportunities.
Yep.
The demand profile is changing.
So, yeah, it'll be interesting.
So I can't wait to start our interview next year with a summary of these prognostications.
Only if we were right.
If I score too badly, I might not be able to.
So one final question for both of you.
I'm sure that people online want to engage with you and learn more about what you're doing
and how they can get involved in things that you're doing.
How can they engage with you and learn more?
Just ping me on LinkedIn.
Okay.
And for me, find us at cornelisnetworks.com or also on LinkedIn, both the company and myself;
we welcome the conversation on building the future of AI.
Well, Lisa and Zane, it's been a pleasure.
Thanks so much.
All right.
Thank you.
Thanks for joining Tech Arena.
Subscribe and engage at our website, TechArena.ai.
All content is copyright by TechArena.
