Utilizing Tech - Season 7: AI Data Infrastructure Presented by Solidigm - 4x18: The Past and Future of CXL with Allyson Klein
Episode Date: March 6, 2023
CXL grew out of a long history of Intel platform technology and promises to revolutionize this platform. This episode of Utilizing CXL features Allyson Klein of TechArena, who shares her vision for the future of disaggregated servers with Craig Rodgers and Stephen Foskett. The foundation for CXL was laid with coherent interconnects and fabrics in previous decades, and the concept of tiered memory draws heavily on the history of Intel and Micron 3D XPoint and Optane Persistent Memory. As was the case for virtualization, CXL has an initial value proposition and a revolutionary future. CXL opens up virtually unlimited memory, pools of processors and offload engines, and unprecedented flexibility. Ultimately CXL will allow us to deliver compute resources that are truly balanced and matched to the needs of each workload. We also discussed UCIe, which promises to enable mix and match chiplets within future processors in a way that is compatible with CXL. Hosts: Stephen Foskett: https://www.twitter.com/SFoskett Craig Rodgers: https://www.twitter.com/CraigRodgersms Guest Host: Allyson Klein: https://www.twitter.com/TechAllyson Follow Gestalt IT and Utilizing Tech Website: https://www.UtilizingTech.com/ Website: https://www.GestaltIT.com/ Twitter: https://www.twitter.com/GestaltIT LinkedIn: https://www.linkedin.com/company/1789 Tags: #UtilizingCXL #DisaggregatedServers #CXLFabrics #MemoryExpansion #CXL @UtilizingTech
Transcript
Welcome to Utilizing Tech, the podcast about emerging technology from Gestalt IT.
This season of Utilizing Tech focuses on Compute Express Link, or CXL,
a new technology that promises to revolutionize enterprise computing.
I'm your host, Stephen Foskett, organizer of Tech Field Day and publisher of Gestalt IT.
And joining me today on Utilizing Tech is my co-host, Craig Rodgers.
Welcome to the show, Craig.
Thanks, Stephen. Great to be here.
It's good to have you once again. So, Craig, we've been talking here on Utilizing Tech,
Utilizing CXL this season about the challenges of basically, where does all this go? Like,
what's the point? You know, we're, yeah, more memory. Memory is cool, right?
But is there more than memory?
Yeah, we want a lot more than memory.
We want storage.
We want TPUs.
We want APUs.
We want everything.
We want everything.
Yes, yes, we want everything.
Now, I think the important thing is that I think that we can see potential in this thing,
that it could be really revolutionary.
And I don't blame companies for talking about what's on the truck.
Because if they have memory expansion cards and memory expansion capability,
and that provides a useful capability,
it allows architects to do things like right-sizing memory and stuff. That's cool.
I like it. I like it. But I think that this thing could be a lot more.
I completely agree. I completely agree. The future use cases are much more exciting than
our current capabilities. So that's why, when I was talking to Allyson Klein, who was formerly at Intel and Micron,
and I learned about her background with CXL and her overall strategic vision of where
the industry is headed, I knew that we had to invite her on the show.
So Allyson, welcome to the broadcast.
Thank you so much, Stephen.
And thanks for mentioning my background in CXL.
It's a technology that I have a lot of passion for, and I've been covering it on my own platform,
the Tech Arena.
Yeah, tell us briefly, what is the Tech Arena?
So you mentioned that I used to be at Intel and Micron.
I had this crazy idea last year to finally pursue my dream to go out alone and establish my own media platform,
the Tech Arena. I felt like there was a need for additional discussion on tech innovation from
edge to cloud. And I felt that my background provided me a good way of being a voice in that
space. Yeah, I absolutely agree with you on that. And so far, I think that you have become
a voice in that space. And I can't wait to hear where you go with this in the future.
Since you were at Intel, maybe you can give us a little bit of a look into the past.
Where did CXL come from? What's the history of this technology?
Well, it's an interesting question, and it depends on
how far back you want to go. The first technology that I worked on at Intel was called InfiniBand.
You probably remember that. It's a fabric technology in the HPC space. And if you go
back to the early 90s, and I won't spend a lot of time here, but it's relevant, you can see that there has been a vision for channel-based,
coherent connections for server platforms for quite a long time. We didn't quite achieve it
with InfiniBand, given the advent of PCI Express, but move forward in time and you get to CXL. And what I think is the impetus is pushing up against the server platform as the determinant of how you deliver balanced computing in data center environments.
There are constraints to utilizing the processor, memory, and IO on a motherboard as the determinant of what you can do for an
application. And CXL enables architects and engineers to envision different ways to deliver
computing in those data center environments. It's funny, in the future, I don't know what we're going to call a server.
What part of the rack is going to be the server?
Is it going to be where the compute lives or is it going to be where the RAM lives?
It's going to become this abstract thing that exists.
Is the whole rack now a server?
Well, I think that maybe what you should think about is, is there a server? Or is there an orchestrator of resources? And I think that CXL is one of the many technologies that's being worked on today to really reframe how computing is delivered within a data center environment and how resources are used, whether they're,
you know, a standard CPU or another logic processor or memory or even storage. How does the application get what it needs in the time frame that it needs?
And how can you compose the right resources for that application in that time?
It's a very different computing model than what we think about today.
Yeah. And yet, you know, in a way, I think that it's sort of reassuring because it's not like
we're upending the whole von Neumann model. I mean, we have processors, we have short-term memory, we have long-term storage,
we have IO. It's just that those things don't necessarily need to be fixed resources that
exist on a single motherboard and within a single chassis, right? That's correct. And I don't think
that we're saying, you know, what we know about computing, let's just throw that all away.
I think what we're saying is how do we remove the constraints of computing?
And one of the constraints that we're coming up at is the physical box that has been the primary determinant of a server
since the birth of servers and data centers many, many
decades ago. I think it's another evolution
as well.
If we look at the changes and optimizations that were brought by, say,
virtualization, where we were making more efficient use of resources
within a server, there was much less wasted RAM, storage, everything.
Now we're just going to have more options around how we're actually
composing these servers, so it'll bring in more efficiencies. You know, wasted RAM is obviously a huge one in the HPC
world, and they'll be early adopters, but I won't go down into the RAM modules too much right now.
For the future use cases, it is going to be very hard to define a server.
But we also saw the industry look at virtualization and say,
what else can we do with that? And, you know,
that was where the birth of cloud computing came in and, you know,
and the concept of more efficient utilization of platforms,
self-provisioning of services, et cetera. I don't think that if you go back to the
early stages of VMware, that was really the vision of what was the problem statement, right? It was
really how do we get more work out of these servers that are sitting at, you know, 8% to 10%
capacity, and how can we, you know, how can we utilize our resources better, but not necessarily where
we went.
I think CXL is going to be the same thing.
You start with the platforms that exist today, AMD Genoa and Intel Sapphire, and they've
got CXL 1.1.
They're doing some interesting things with CXL, but those use cases are maybe, you know, baby steps towards
this grand vision. We've started with, you know, expansion of memory, and everybody's talking
about expansion of memory and memory pooling, and I think that the reason why they're talking about
it is because applications are requiring more memory, and app developers have been for years having to write
around the capacity constraints of memory and doing some esoteric things in order to, you know,
whether it's an AI to train a model utilizing more memory or, you know, advanced analytics where you
want to have a larger memory capacity of a database that you're referencing or whatever the case may
be, we're working around this problem. So that's the easiest one to solve.
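As a purely illustrative aside, here is a minimal C sketch of the kind of gymnastics being described: when a working set exceeds installed DRAM, a developer often backs it with a file and memory-maps it so the operating system pages data in and out on demand, rather than simply allocating the whole thing. The file path and the 64 GiB size are hypothetical placeholders, not anything from the conversation; memory expansion over CXL is aimed at making this workaround unnecessary.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    /* Hypothetical working set larger than installed DRAM. */
    size_t size = 64UL << 30;                      /* 64 GiB, illustrative */
    int fd = open("/mnt/scratch/workset.bin", O_RDWR | O_CREAT, 0600);
    if (fd < 0 || ftruncate(fd, (off_t)size) != 0) {
        perror("backing file");
        return 1;
    }

    /* The OS pages this region to and from the file on demand,
     * which is how applications "write around" DRAM capacity limits today. */
    char *buf = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    buf[0] = 1;                                    /* touch the mapping */
    munmap(buf, size);
    close(fd);
    return 0;
}
```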
Yeah, it's interesting that you bring up the memory question because one of the things that
people did, well, that Intel did to try to increase system memory was Optane. And of course,
that was the, you know, Intel and Micron were working together to come up with it. But basically a new tier of memory that would allow you to deploy more memory, even if it wasn't as fast, and even if it had different characteristics, and bring that to bear for computation.
Now, we saw what happened, and you saw closer than we did what happened to that product. But that being said,
as we've talked about previously on utilizing CXL, Optane really opened the door conceptually
to this whole idea of tiered memory and basically laid the groundwork for everything that's going
to come next, right? Yeah. And I think that, you know, I got the distinct pleasure of launching Optane Persistent Memory at Intel and then killing the Micron equivalent 3D XPoint memory while I was at Micron. Once CXL came out, with its capability to drive larger memory sizes, support persistence,
and deliver memory pooling, it became clear that having a different media with 3D XPoint
was not going to have the advantages and the market that maybe had been envisioned, and it never really took off.
CXL was a much more scalable route to market for these types of configurations.
And it's no surprise that we're seeing mass industry shift towards the CXL alternatives based on an industry standard that everybody can innovate around.
And one thing I know, after spending decades in this industry,
is that an industry standard out there can focus innovation.
We saw that at the start of the CXL Consortium, with the who's who of industry players
on the Consortium Steering Committee, and we knew that CXL was going to be
that type of technology that coalesced industry and innovation and really accelerated movement
towards these new types of models.
Do you think Optane was too early?
That's a great question, Craig.
I don't think it was too early.
I think that there are challenges with it, right? First of all, Intel talked about it well before it was delivered.
So I think that maybe expectations were set too high for what was actually delivered.
But then I think that one of the challenges was as a proprietary technology
that required software changes up the stack
to get the full value out of the technology,
maybe it was a bit more difficult to get off the ground
than what was originally envisioned.
And I think that also with CXL coming up, offering an alternative kind of
at the time that deployment would have started enjoying that nice hockey stick of growth,
it was, you know, fighting against kind of a tsunami of innovation that was coming from
the industry.
Intel actually being one of the main proponents of CXL from its inception tells you all you need to know about the value of an industry standard model for this type of technology.
Well, I think another thing that really got in the way of Optane adoption was frankly
the fact that the server platform that
was supposed to go with it was delayed. And that really interrupted quite a lot of things. But
from the outside, that's how it would appear to me. But then again, back to the question of
virtualization or the pattern of virtualization, I think there what you have, and this is something
that we're seeing powerfully with CXL, is that you need sort of an initial value prop for the technology. So for example,
as you said, with virtualization, that was basically consolidation of underutilized
physical servers. And then you need the big market changing value prop, which for VMware
and virtualization was disaster recovery and high availability. And we've seen the same thing
happen again and again. And with CXL, we're seeing that with memory expansion. I think
right-sizing memory is a valuable initial product. But the thing that gets me excited,
and I think the thing that gets you excited about this too, from what we've spoken about previously,
is what comes next.
What is the ultimate value prop?
And to me, the ultimate value prop is what you were mentioning earlier, which is basically, what is a server?
You know, maybe we need to get out of the box a little bit with server, literally out of the box when it comes to server architecture, because if the server doesn't need to have
all its memory on a local parallel bus or local parallel channels, if it doesn't need
to have all its storage and all its CPU and everything, all its IO, all bound up together
physically, then we can envision some pretty radical things.
Is that what you're seeing here?
Well, I think that even going a bit deeper,
we have the problem of the slowing of Moore's law.
And you have applications and developers
that are not going to stop wanting more compute power
to fuel innovative applications.
So what do you do?
You look at the vectors by which you can release more compute capability.
And that always gets to, you know, I always think about it as chasing the bottlenecks.
There's always going to be a bottleneck in a system.
Where is the bottleneck? Today, one of the answers is CXL, which lets us go beyond the esoteric designs that are out in the market
today to enable pools of DPUs, GPUs, and CPUs, so that applications can have access to the type of logic
that would be best to drive that application. And to do that in a way that a developer doesn't really need to think about it.
It's abstracted away from that app developer and an orchestrator is really taking over and assigning what resources are required.
I think that at the end of the day, getting to a balanced compute model that enables these applications is going to require us to
break apart that system. You know, we've got a motherboard, it's got a CPU, maybe multiple CPUs, it's got some
memory, it might have a GPU in it, it's got a
bunch of IO. That's what we've known since we turned a PC on its side and made it a pizza box and put it in a rack and started, you know, x86 data center
computing. And it's time for a change, I think. And I think that's, you know, that's, that is a
welcome thing for the industry. And I think that it's driven, as most hardware changes are, based on what software is asking of it.
And software is asking some things today that we are keeping together with masking tape and baling wire to deliver.
I think this will allow us to do things that won't put so much stress on the industry,
both from a standpoint of the hardware ecosystem and the software ecosystem. I really like this concept of balance because we've talked about that before,
that ultimately that's sort of the holy grail for server design, for system design, any kind
of system design. You don't want it to be unbalanced because of external factors. You want to have just the right resources.
But that also requires flexibility because not only do you have to have the ability to provision
the right resources for the system, but you have to have the ability to change those as the needs
of that system change. And I think that you're completely right that, you know, having this sort of fabric available really opens that up.
But there's another thing that we haven't talked as much about on Utilizing CXL, another technology that is related. Most of the processor vendors, as well, are looking at
tile-based architectures and processor interconnects. UCIe is an interesting one because
it's designed, in my mind at least, it's designed to be somewhat analogous to CXL in that it
leverages some of the same protocols and technologies, and that it would theoretically
enable people to get that same kind of balance within the processor, not just external to the
processor. I know that you know something about this. Am I off base? No, I think you're on base.
I think that UCIe is a really exciting technology. I actually wrote about it
as being one of the most disruptive technologies of 2022. It was announced at Supercomputing last
year. And the reason, and I know that I sound like a complete geek when I say that, but the reason
why I think this is so important is getting back to that concept of the slowing of Moore's law and companies really taking a look at
chiplet architectures as a path forward.
And what I see in UCIe, and they are compatible, CXL and UCIe, and if you look at the architects,
many of the same architects that work on CXL have worked on UCIe, what you're looking at is basically enabling an industry standard of an interprocessor
connection for chiplets. And that vision is one that's consistent with CXL, but taken much further where a company could choose to have, you know, an Intel CPU core
combined with an NVIDIA GPU core on a piece of silicon and dial up exactly the type of logic
that they want in that chip and share IO and share memory space and all
of the things that go into a nice CPU architecture design. That's the end vision of UCIe. It's very
exciting. And I think it's going to, you know, enable, especially in the near term, large CSPs to develop some interesting
chip architectures and not necessarily be confined to whatever portfolio any particular chip vendor
has at its, you know, at its offering. I think that there's no doubt that CSPs are very interested
in it. In fact, many of them are on the board of the UCIe Consortium showing their commitment to
movement in this space. And it also plays into the role of foundries, because somebody's going
to have to actually build this. And then you get into some
interesting thoughts on, you know, how do these chips come to be? What is the licensing of these
chiplets? And how do they get constructed? But UCIe will live completely compatible with CXL. It does not cancel out the reason why CXL will exist. I think it
gives just more flexibility to platform designers on how they want to deliver logic
into the data center and maybe beyond. But CXL is definitely the interconnect of the future
for box-to-box consideration or, you know, different subsystem consideration
of these composable pools of resources. Yeah, I agree. And I feel like that's,
you know, to the contrary, I think UCIe and CXL are really friends.
And I could definitely see a future where Intel and AMD and NVIDIA and ARM are all out there developing special processors for special big companies, like you mentioned, cloud services and so on. And there's like the Amazon processor and there's the Dell processor and there's the, you know, the Lenovo processor or the NVIDIA processor.
And they're all composed of chiplets that have the right components for their particular use case.
And again, that's all about system balance as well.
And I think it's about best of breed, right? You may love the Zen core for CPU, but you might not be in love with AMD's GPU with
whatever generation of technology we're talking about. You want to use the NVIDIA GPU. And that
gives flexibility to the consumer, which I think just brings innovation forward and puts pressure
on the industry to deliver technology so that they can compete
in a new way.
That's exciting to me because the customer wins.
It's also great that it's opening up high bandwidth memory on the chiplets as well.
They've worked very closely with HBM, which is yet another tier of RAM now that developers have access to for their applications.
How many tiers of RAM have we now at this point?
Really?
Four, five?
Easily?
It's fantastic to see options there.
There's some use case that needs really, really fast memory,
but it doesn't need a lot of it.
That CPU and chiplet design is going to deliver all of that.
I think that's a really good point, Craig, because you can see
HBM within a UCIe construct,
and then you can also see,
at the other extreme, a DDR4 pool of resources for low-cost, high-capacity memory, so you can construct the type of infrastructure that is going to fuel
the applications required, and it gives some cost trade-offs as well to be as efficient as possible.
You don't necessarily need HBM memory, which is, you know, a relatively expensive option
in the data center, and you only want to use it where you really need it. But getting that tier of memory in, you know, nearer term
you may have DDR5 closer to that CPU, but then you've got another, you know, tier
of DDR4 that you can pull from, and the fact that that's done in a coherent fashion, to enable applications
to just use that extra capacity to write to, is fantastic.
I can't wait to see what applications do.
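To make that coherent far tier a bit more concrete, here is a minimal C sketch assuming the CXL-attached capacity shows up to Linux as a CPU-less NUMA node (node 1 is an assumed number; on a real system you would check numactl --hardware). Hot data stays in ordinary near memory, while a large, colder buffer is placed on the assumed CXL node with libnuma (link with -lnuma):

```c
#include <numa.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "libnuma: NUMA is not available on this system\n");
        return 1;
    }

    size_t hot_sz  = 1UL << 20;   /* 1 MiB of hot data kept in near DRAM */
    size_t cold_sz = 4UL << 30;   /* 4 GiB of colder data for the far tier */
    int cxl_node   = 1;           /* assumed node number for the CXL expander */

    char *hot  = malloc(hot_sz);                       /* default policy: local memory */
    char *cold = numa_alloc_onnode(cold_sz, cxl_node); /* bind pages to the far node */
    if (hot == NULL || cold == NULL) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }

    memset(hot, 0, hot_sz);
    memset(cold, 0, cold_sz);     /* plain loads and stores, cache coherent */

    numa_free(cold, cold_sz);
    free(hot);
    return 0;
}
```

Because the far tier is just cache-coherent memory, the application touches it with ordinary loads and stores; only the placement policy changes.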
Not to mention that then you also could have a tier of persistent memory addressable somewhere
on the other end of the CXL network as well, out in the fabric. I know that
there are companies working on basically enhanced flash or other kinds of, you know, another tier
of memory out there as well. I want to get to something else here, though, and that's that,
although this sounds great, it doesn't always work out very well. And you've been in the industry a while.
You've seen this happen where you've got a great technology and it doesn't happen.
I don't want to be raining on the parade here, but there were cache-coherent interconnects.
There was pooling and fabrics.
There were chiplets.
There have been all sorts of things and they don't always work out.
And in fact, we've already seen a little bit of a rough road for Intel's new foundry chiplet vision.
I know it's really early, and I don't want to judge them based on like a month.
But, you know, we've already seen some kind of pushback here. And honestly, some of the companies, even the companies that are committed to CXL, are doing non-CXL things in their own, you know, product lines currently.
There are many ways that this whole thing can be derailed. With a protocol, a committee, a consortium, consensus, all of this, you're slower to market and you
can come to market with an inferior product compared to a proprietary solution.
That happens all the time in computing.
That could happen here.
I think that, you know, I just finished a cloud computing 2023 view. And one of the things that I think we who talk about technology innovation are very comfortable with is that the future vision always takes way longer to arrive than we imagine.
You know, full automated provisioning of multi-cloud solutions,
which was a vision that, you know, was expressed many, many years ago.
Not to mention flying cars.
Exactly. I'm still waiting for that. I would love that. So I think that that's why those
beachhead markets and those beachhead use cases are so important because they establish momentum.
You know, if there wasn't the initial beachhead markets for virtualization,
we may not have seen that technology take off.
If you make it too hard or too complex, you may not see it.
So when we talk about these early use cases of CXL, we look at them and we say,
oh, that's exciting. That's nice. That's fine. But look at
all of these things that we can do long-term. I think that it's important that those technologies
exist because it gets the industry momentum going. If you went to supercomputing or OCP,
you saw that there were, you know, over a dozen companies that were demonstrating early instantiations of CXL, even prior to when, you know, Genoa
and Sapphire hit the market.
And really, you know, we still exist in a world, and will for a long time, where the
CPU is the determinant of a technology actually taking off in the marketplace.
So those platforms were really important. You know,
we've seen Samsung come out with their 512 gig CXL solution. Hynix has a solution in market.
You've seen some disruptors, and I know you've talked to some of the disruptors in this space
with their new technologies and their new switches. And, you know, it's exciting to see these solutions come to market
and early instantiations of what could be.
I think the most important thing to look at in these early stages are,
are they interoperable?
Are they plug and play?
Is the consortium doing its job with compliance testing, and are vendors delivering to that specification, to ensure that full interoperability of solutions and that full dream of memory pools and memory expansion can work regardless of what platform you choose? That's why an industry standard exists.
Then as we move into CXL 2.0 and 3.0,
are the new use cases being adopted?
I think that's what I'll be looking for as we move forward.
And then the most exciting part,
and I think about this all the time,
and the technology that always brings it into my
mind is USB. You know, USB was not invented to give me a fan that, you know,
plugs into my laptop or into my car. It was invented to be a peripheral for things like mice,
but I love where the industry goes with technologies like this. And not that I'm saying that there's going to be a CXL individual fan anytime soon.
But I do think, but yeah, wouldn't that be cool?
But I do think that the most exciting thing about an industry standard like this is what are the companies in stealth right now that are envisioning use cases that we haven't even thought about?
And what will we see coming into the market in 2024 as CXL 2.0 comes to the market and the industry gets to be able to do some really creative things?
I think you're right.
The exciting times are going to start coming around CXL 2.0,
but with CXL 3.0, it's going to get really, really interesting.
I think there'll be unicorn companies that don't even exist right now, potentially.
If somebody nails it, corners the market,
there's going to be a lot of growth, you know, a lot of growth.
You know, our data usage requirements and just the growth in everything computing, everything worldwide.
It's just phenomenal.
So it is good to see that we are trying to optimize it as best we can using technologies like CXL. And I think that in that time period, and I agree,
3.0 is where it gets really exciting because you get to thousands of nodes
within a multi-tiered, I don't even know what you call it, a cluster, a pool,
whatever, we'll come up with terms that feel natural.
But at that point, I think
that you're going to see, can this technology be broadly adopted?
Or is this too complex?
And are we going to see an unequal adoption of it where the most sophisticated data center managers take advantage of these new capabilities and maybe broad enterprise slows?
And what does that mean for the purveyors of technology
as they navigate new composable infrastructure
and traditional infrastructure?
And how does that come to market?
How does that squeeze them? Who can navigate that? Legacy servers. Hey. Yeah, exactly. You heard it here first.
So I'm excited to see it. I'm excited to see what the industry does. I'm excited to see the new
use cases that come to market and the sparks of innovation that are likely
happening in labs today.
It'll be an exciting time for the data center, and it's going to give us so much to talk
about.
Absolutely.
And I think that this is going to be the exciting thing to watch how this changes.
Now, a CXL connected fan does not sound like a good idea.
But CXL, I think, is going to open up a lot of other avenues and things that we haven't really thought about.
One of the things that I've been pointing out on this show is the fact that it will upend a lot of the assumptions that the chip designers are making when they're designing CPUs.
Not just the system designers, but the CPU designers. And I think that CXL and UCIe
will really change their priorities. And I think in a couple of generations,
we're going to see some things that are going to be very different from today's Xeon or EPYC
CPUs. And I think that's going to be pretty exciting.
Absolutely. And I think that we look at the landscape of CPU competition today.
You know, it's the Intel-AMD show when it comes to the competition.
What does this do for the ARM ecosystem, which is also very invested in CXL?
What does it do for something like RISC-V? Does this provide them an avenue
to start scaling initial solutions into the data center?
And how does all of that disrupt, you know,
what we think about in terms of a data center CPU,
data center compute?
How do applications adjust to that?
The application developer is going to have a lot to think about. Do we need more abstraction in the way that we're writing code so all of this innovation and disruption can be kept away from how applications are being composed?
I think that that's going to be an exciting thing to see come to fruition.
But I haven't seen an exciting time like this in data center infrastructure in years in terms of what is an opportunity space for innovation and design. So I'm excited.
I think we're still going to be waiting a while on liquid cooling,
but CXL is at least around the corner.
And flying cars.
Yeah, liquid cooling and flying cars. Which one comes first?
Yep, just as soon as the superconducting magnets are ready.
Exactly.
Well, thank you so much, Allyson. This has been a really interesting conversation and a really thought-provoking one. And it's kind of fun to step away from products and companies and just
talk about what the technology is all about and what it means to people who are trying to,
well, we're trying to make sense of it. So I really appreciate this conversation.
As we wrap up this episode, where can people continue this conversation with you?
Where can they find you?
Where can they follow the other platforms you have?
Well, I've written about CXL and UCIe and had some guests talking about those technologies
on my platform.
You can find me at www.thetecharena.net or at TechAllyson on Twitter and Allyson Klein on LinkedIn.
And I'm always happy to engage and talk about technology and what questions you may have in terms of what these technologies mean for the data center and take the conversation further.
Yep, absolutely. And those of you who have enjoyed these conversations as well,
you'll find us at the Tech Field Day events that I run.
So we are hosting Edge Field Day at the end of February,
which I think is going to be behind us
by the time this episode gets published.
Also, a CXL-themed Tech Field Day event in March,
which some of us are going to be involved in.
So I
urge you all to check out techfieldday.com and learn more about those events and check out the
videos of those on YouTube and LinkedIn. Thank you for listening to Utilizing CXL, part of the
Utilizing Tech podcast series. If you enjoyed this discussion, please do give us a subscription.
You'll find us in your favorite podcast player. You'll find us on YouTube as well. And give us a rating and a review. That really helps. And maybe
a comment. We would love to hear from you. This podcast was brought to you by gestaltit.com,
your home for IT coverage from across the enterprise. For show notes and more episodes,
go to utilizingtech.com or find us on Twitter or Mastodon as Utilizing Tech.
Thanks for listening and we'll see you next time.