Utilizing Tech - Season 7: AI Data Infrastructure Presented by Solidigm - 4x18: The Past and Future of CXL with Allyson Klein
Episode Date: March 6, 2023
CXL grew out of a long history of Intel platform technology and promises to revolutionize this platform. This episode of Utilizing CXL features Allyson Klein of TechArena, who shares her vision for the future of disaggregated servers with Craig Rodgers and Stephen Foskett. The foundation for CXL was laid with coherent interconnects and fabrics in previous decades, and the concept of tiered memory draws heavily on the history of Intel and Micron 3D XPoint and Optane Persistent Memory. As was the case for virtualization, CXL has an initial value proposition and a revolutionary future. CXL opens up virtually unlimited memory, pools of processors and offload engines, and unprecedented flexibility. Ultimately CXL will allow us to deliver compute resources that are truly balanced and matched to the needs of each workload. We also discussed UCIe, which promises to enable mix and match chiplets within future processors in a way that is compatible with CXL. Hosts: Stephen Foskett: https://www.twitter.com/SFoskett Craig Rodgers: https://www.twitter.com/CraigRodgersms Guest Host: Allyson Klein: https://www.twitter.com/TechAllyson Follow Gestalt IT and Utilizing Tech Website: https://www.UtilizingTech.com/ Website: https://www.GestaltIT.com/ Twitter: https://www.twitter.com/GestaltIT LinkedIn: https://www.linkedin.com/company/1789 Tags: #UtilizingCXL #DisaggregatedServers #CXLFabrics #MemoryExpansion #CXL @UtilizingTech
Transcript
Welcome to Utilizing Tech, the podcast about emerging technology from Gestalt IT.
This season of Utilizing Tech focuses on Compute Express Link, or CXL,
a new technology that promises to revolutionize enterprise computing.
I'm your host, Stephen Foskett, organizer of Tech Field Day and publisher of Gestalt IT.
And joining me today on Utilizing Tech is my co-host, Craig Rodgers.
Welcome to the show, Craig.
Thanks, Stephen. Great to be here.
It's good to have you once again. So, Craig, we've been talking here on Utilizing Tech,
Utilizing CXL this season about the challenges of basically, where does all this go? Like,
what's the point? You know, we're, yeah, more memory. Memory is cool, right?
But is there more than memory?
Yeah, we want a lot more than memory.
We want storage.
We want TPUs.
We want APUs.
We want everything.
We want everything.
Yes, yes, we want everything.
Now, I think the important thing is that I think that we can see potential in this thing,
that it could be really revolutionary.
And I don't blame companies for talking about what's on the truck.
Because if they have memory expansion cards and memory expansion capability,
and that provides a useful capability,
it allows architects to do things like right-sizing memory and stuff. That's cool.
I like it. I like it. But I think that this thing could be a lot more.
I completely agree. I completely agree. The future use cases are much more exciting than
our current capabilities. So that's why, when I was talking to Allyson Klein, who was formerly at Intel and Micron,
and I learned about her background with CXL and her overall strategic vision of where
the industry is headed, I knew that we had to invite her on the show.
So Allyson, welcome to the broadcast.
Thank you so much, Stephen.
And thanks for mentioning my background in CXL.
It's a technology that I have a lot of passion for, and I've been covering it on my own platform,
the Tech Arena.
Yeah, tell us briefly, what is the Tech Arena?
So you mentioned that I used to be at Intel and Micron.
I had this crazy idea last year to finally pursue my dream to go out alone and establish my own media platform,
the Tech Arena. I felt like there was a need for additional discussion on tech innovation from
edge to cloud. And I felt that my background provided me a good way of being a voice in that
space. Yeah, I absolutely agree with you on that. And so far, I think that you have become
a voice in that space. And I can't wait to hear where you go with this in the future.
Since you were at Intel, maybe you can give us a little bit of a look into the past.
Where did CXL come from? What's the history of this technology?
Well, it's an interesting question, and it depends on
how far back you want to go. The first technology that I worked on at Intel was called InfiniBand.
You probably remember that. It's a fabric technology in the HPC space. And if you go
back to the early 90s, and I won't spend a lot of time here, but it's relevant, you can see that there has been a vision for channel-based,
coherent connections for server platforms for quite a long time. We didn't quite achieve it
with InfiniBand, given the advent of PCI Express, but move forward in time and you get to CXL. And what I think is the impetus is pushing up against the server platform as the determinant of how you deliver balanced computing in data center environments.
There are constraints to utilizing the processor, memory, and IO on a motherboard as the determinant of what you can do for an
application. And CXL enables architects and engineers to envision different ways to deliver
computing in those data center environments. It's funny, in the future, I don't know what we're going to call a server.
What part of the rack is going to be the server?
Is it going to be where the compute lives or is it going to be where the RAM lives?
It's going to become this abstract thing that exists.
Is the whole rack now a server?
Well, I think that maybe what you should think about is, is there a server? Or is there an orchestrator of resources? And I think that CXL is one of the many technologies that's being worked on today to really reframe how computing is delivered within a data center environment and how resources are used, whether they're,
you know, a standard CPU or another logic processor or memory or even storage. How does the application get what it needs in the time frame that it needs?
And how can you compose the right resources for that application in that time?
It's a very different computing model than what we think about today.
Yeah. And yet, you know, in a way, I think that it's sort of reassuring because it's not like
we're upending the whole von Neumann model. I mean, we have processors, we have short-term memory, we have long-term storage,
we have IO. It's just that those things don't necessarily need to be fixed resources that
exist on a single motherboard and within a single chassis, right? That's correct. And I don't think
that we're saying, you know, what we know about computing, let's just throw that all away.
I think what we're saying is how do we remove the constraints of computing?
And one of the constraints that we're coming up at is the physical box that has been the primary determinant of a server
since the birth of servers and data centers many, many
decades ago. I think it's another evolution
as well.
If we look at the changes and optimizations that were brought by, say,
virtualization, where we were making more efficient use of resources
within a server, there was much less wasted RAM, storage, everything.
Now we're just going to have more options around how we're actually
composing these servers, so it'll bring in more efficiencies. You know, wasted RAM is obviously a huge one in the HPC
world, and they'll be early adopters, but I won't go down into the RAM modules too much right now.
For the future use cases, it is going to be very hard to define a server.
But we also saw the industry look at virtualization and say,
what else can we do with that? And, you know,
that was where the birth of cloud computing came in and, you know,
and the concept of more efficient utilization of platforms,
self-provisioning of services, et cetera. I don't think that if you go back to the
early stages of VMware, that was really the vision of what was the problem statement, right? It was
really how do we get more work out of these servers that are sitting at, you know, 8% to 10%
capacity, and how can we, you know, how can we utilize our resources better, but not necessarily where
we went.
I think CXL is going to be the same thing.
You start with the platforms that exist today, AMD Genoa and Intel Sapphire, and they've
got CXL 1.1.
They're doing some interesting things with CXL, but those use cases are maybe, you know, baby steps towards
this grand vision. We've started with, you know, expansion of memory, and everybody's talking
about expansion of memory and memory pooling, and I think that the reason why they're talking about
it is because applications are requiring more memory, and app developers have been for years having to write
around the capacity constraints of memory and doing some esoteric things in order to, you know,
whether it's an AI to train a model utilizing more memory or, you know, advanced analytics where you
want to have a larger memory capacity of a database that you're referencing or whatever the case may
be, we're working around this problem. So that's the easiest one to solve.
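As a purely illustrative aside, here is a minimal C sketch of the kind of gymnastics being described: when a working set exceeds installed DRAM, a developer often backs it with a file and memory-maps it so the operating system pages data in and out on demand, rather than simply allocating the whole thing. The file path and the 64 GiB size are hypothetical placeholders, not anything from the conversation; memory expansion over CXL is aimed at making this workaround unnecessary.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    /* Hypothetical working set larger than installed DRAM. */
    size_t size = 64UL << 30;                      /* 64 GiB, illustrative */
    int fd = open("/mnt/scratch/workset.bin", O_RDWR | O_CREAT, 0600);
    if (fd < 0 || ftruncate(fd, (off_t)size) != 0) {
        perror("backing file");
        return 1;
    }

    /* The OS pages this region to and from the file on demand,
     * which is how applications "write around" DRAM capacity limits today. */
    char *buf = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    buf[0] = 1;                                    /* touch the mapping */
    munmap(buf, size);
    close(fd);
    return 0;
}
```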
Yeah, it's interesting that you bring up the memory question because one of the things that
people did, well, that Intel did to try to increase system memory was Optane. And of course,
that was the, you know, Intel and Micron were working together to come up with it. But basically a new tier of memory that would allow you to deploy more memory, even if it wasn't as fast, and even if it had different characteristics, and bring that to bear for computation.
Now, we saw what happened, and you saw closer than we did what happened to that product. But that being said,
as we've talked about previously on utilizing CXL, Optane really opened the door conceptually
to this whole idea of tiered memory and basically laid the groundwork for everything that's going
to come next, right? Yeah. And I think that, you know, I got the distinct pleasure of launching Optane Persistent Memory at Intel and then killing the Micron equivalent 3D XPoint memory while I was at Micron. Once CXL came out, with its capability to drive larger memory sizes, support persistence,
and deliver memory pooling, it became clear that having a different media with 3D XPoint
was not going to have the advantages and the market that maybe had been envisioned, and it never really took off.
CXL was a much more scalable route to market for these types of configurations.
And it's no surprise that we're seeing mass industry shift towards the CXL alternatives based on an industry standard that everybody can innovate around.
And one thing I know, after spending decades in this industry,
is that an industry standard out there can focus innovation.
We saw that at the start of the CXL Consortium, with the who's who of industry players
on the Consortium Steering Committee, and we knew that CXL was going to be
that type of technology that coalesced industry and innovation and really accelerated movement
towards these new types of models.
Do you think Optane was too early?
That's a great question, Craig.
I don't think it was too early.
I think that there are challenges with it, right? First of all, Intel talked about it well before it was delivered.
So I think that maybe expectations were set too high for what was actually delivered.
But then I think that one of the challenges was as a proprietary technology
that required software changes up the stack
to get the full value out of the technology,
maybe it was a bit more difficult to get off the ground
than what was originally envisioned.
And I think that also with CXL coming up, offering an alternative kind of
at the time that deployment would have started enjoying that nice hockey stick of growth,
it was, you know, fighting against kind of a tsunami of innovation that was coming from
the industry.
Intel actually being one of the main proponents of CXL from its inception tells you all you need to know about the value of an industry standard model for this type of technology.
Well, I think another thing that really got in the way of Optane adoption was frankly
the fact that the server platform that
was supposed to go with it was delayed. And that really interrupted quite a lot of things. But
from the outside, that's how it would appear to me. But then again, back to the question of
virtualization or the pattern of virtualization, I think there what you have, and this is something
that we're seeing powerfully with CXL, is that you need sort of an initial value prop for the technology. So for example,
as you said, with virtualization, that was basically consolidation of underutilized
physical servers. And then you need the big market changing value prop, which for VMware
and virtualization was disaster recovery and high availability. And we've seen the same thing
happen again and again. And with CXL, we're seeing that with memory expansion. I think
right-sizing memory is a valuable initial product. But the thing that gets me excited,
and I think the thing that gets you excited about this too, from what we've spoken about previously,
is what comes next.
What is the ultimate value prop?
And to me, the ultimate value prop is what you were mentioning earlier, which is basically, what is a server?
You know, maybe we need to get out of the box a little bit with server, literally out of the box when it comes to server architecture, because if the server doesn't need to have
all its memory on a local parallel bus or local parallel channels, if it doesn't need
to have all its storage and all its CPU and everything, all its IO, all bound up together
physically, then we can envision some pretty radical things.
Is that what you're seeing here?
Well, I think that even going a bit deeper,
we have the problem of the slowing of Moore's law.
And you have applications and developers
that are not going to stop wanting more compute power
to fuel innovative applications.
So what do you do?
You look at the vectors by which you can release more compute capability.
And that always gets to, you know, I always think about it as chasing the bottlenecks.
There's always going to be a bottleneck in a system.
Where is the bottleneck? Today, one of the answers is CXL, which lets us go beyond the esoteric designs that are out in the market
today to enable pools of DPUs, GPUs, and CPUs, so that applications can have access to the type of logic
that would be best to drive that application. And to do that in a way that a developer doesn't really need to think about it.
It's abstracted away from that app developer and an orchestrator is really taking over and assigning what resources are required.
I think that at the end of the day, getting to a balanced compute model that enables these applications is going to require us to
break apart that system. You know, we've got a motherboard, it's got a CPU, maybe multiple CPUs, it's got some
memory, it might have a GPU in it, it's got a
bunch of IO. That's what we've known since we turned a PC on its side and made it a pizza box and put it in a rack and started, you know, x86 data center
computing. And it's time for a change, I think. And I think that's, you know, that's, that is a
welcome thing for the industry. And I think that it's driven, as most hardware changes are, based on what software is asking of it.
And software is asking some things today that we are keeping together with masking tape and baling wire to deliver.
I think this will allow us to do things that won't put so much stress on the industry,
both from a standpoint of the hardware ecosystem and the software ecosystem. I really like this concept of balance because we've talked about that before,
that ultimately that's sort of the holy grail for server design, for system design, any kind
of system design. You don't want it to be unbalanced because of external factors. You want to have just the right resources.
But that also requires flexibility because not only do you have to have the ability to provision
the right resources for the system, but you have to have the ability to change those as the needs
of that system change. And I think that you're completely right that, you know, having this sort of fabric available really opens that up.
But there's another thing that we haven't talked as much about on Utilizing CXL, another technology that is related. Most of the processor vendors, as well, are looking at
tile-based architectures and processor interconnects. UCIe is an interesting one because
it's designed, in my mind at least, it's designed to be somewhat analogous to CXL in that it
leverages some of the same protocols and technologies, and that it would theoretically
enable people to get that same kind of balance within the processor, not just external to the
processor. I know that you know something about this. Am I off base? No, I think you're on base.
I think that UCIe is a really exciting technology. I actually wrote about it
as being one of the most disruptive technologies of 2022. It was announced at Supercomputing last
year. And the reason, and I know that I sound like a complete geek when I say that, but the reason
why I think this is so important is getting back to that concept of the slowing of Moore's law and companies really taking a look at
chiplet architectures as a path forward.
And what I see in UCIe, and they are compatible, CXL and UCIe, and if you look at the architects,
many of the same architects that work on CXL have worked on UCIe, what you're looking at is basically enabling an industry standard of an interprocessor
connection for chiplets. And that vision is one that's consistent with CXL, but taken much further where a company could choose to have, you know, an Intel CPU core
combined with an NVIDIA GPU core on a piece of silicon and dial up exactly the type of logic
that they want in that chip and share IO and share memory space and all
of the things that go into a nice CPU architecture design. That's the end vision of UCIe. It's very
exciting. And I think it's going to, you know, enable, especially in the near term, large CSPs to develop some interesting
chip architectures and not necessarily be confined to whatever portfolio any particular chip vendor
has at its, you know, at its offering. I think that there's no doubt that CSPs are very interested
in it. In fact, many of them are on the board of the UCIe Consortium showing their commitment to
movement in this space. And it also plays into the role of foundries, because somebody's going
to have to actually build this. And then you get into some
interesting thoughts on, you know, how do these chips come to be? What is the licensing of these
chiplets? And how do they get constructed? But UCIe will live completely compatible with CXL. It does not cancel out the reason why CXL will exist. I think it
gives just more flexibility to platform designers on how they want to deliver logic
into the data center and maybe beyond. But CXL is definitely the interconnect of the future
for box-to-box consideration or, you know, different subsystem consideration
of these composable pools of resources. Yeah, I agree. And I feel like that's,
you know, to the contrary, I think UCIe and CXL are really friends.
And I could definitely see a future where Intel and AMD and NVIDIA and ARM are all out there developing special processors for special big companies, like you mentioned, cloud services and so on. And there's like the Amazon processor and there's the Dell processor and there's the, you know, the Lenovo processor or the NVIDIA processor.
And they're all composed of chiplets that have the right components for their particular use case.
And again, that's all about system balance as well.
And I think it's about best of breed, right? You may love the Zen core for CPU, but you might not be in love with AMD's GPU with
whatever generation of technology we're talking about. You want to use the NVIDIA GPU. And that
gives flexibility to the consumer, which I think just brings innovation forward and puts pressure
on the industry to deliver technology so that they can compete
in a new way.
That's exciting to me because the customer wins.
It's also great that it's opening up high bandwidth memory on the chiplets as well.
They've worked very closely with HBM, which is yet another tier of RAM now that developers have access to for their applications.
How many tiers of RAM have we now at this point?
Really?
Four, five?
Easily?
It's fantastic to see options there.
There's some use case that needs really, really fast memory,
but it doesn't need a lot of it.
That CPU and chiplet design is going to deliver all of that.
I think that's a really good point, Craig, because you can see
HBM within a UCIe construct,
and then you can also see,
at the other extreme, a DDR4 pool of resources for low-cost, high-capacity memory, so you can construct the type of infrastructure that is going to fuel
the applications required, and it gives some cost trade-offs as well to be as efficient as possible.
You don't necessarily need HBM memory, which is, you know, a relatively expensive option
in the data center, and you only want to use it where you really need it. But getting that tier of memory in, you know, nearer term
you may have DDR5 closer to that CPU, but then you've got another, you know, tier
of DDR4 that you can pull from, and the fact that that's done in a coherent fashion, to enable applications
to just use that extra capacity to write to, is fantastic.
I can't wait to see what applications do.
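To make that coherent far tier a bit more concrete, here is a minimal C sketch assuming the CXL-attached capacity shows up to Linux as a CPU-less NUMA node (node 1 is an assumed number; on a real system you would check numactl --hardware). Hot data stays in ordinary near memory, while a large, colder buffer is placed on the assumed CXL node with libnuma (link with -lnuma):

```c
#include <numa.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "libnuma: NUMA is not available on this system\n");
        return 1;
    }

    size_t hot_sz  = 1UL << 20;   /* 1 MiB of hot data kept in near DRAM */
    size_t cold_sz = 4UL << 30;   /* 4 GiB of colder data for the far tier */
    int cxl_node   = 1;           /* assumed node number for the CXL expander */

    char *hot  = malloc(hot_sz);                       /* default policy: local memory */
    char *cold = numa_alloc_onnode(cold_sz, cxl_node); /* bind pages to the far node */
    if (hot == NULL || cold == NULL) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }

    memset(hot, 0, hot_sz);
    memset(cold, 0, cold_sz);     /* plain loads and stores, cache coherent */

    numa_free(cold, cold_sz);
    free(hot);
    return 0;
}
```

Because the far tier is just cache-coherent memory, the application touches it with ordinary loads and stores; only the placement policy changes.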
Not to mention that then you also could have a tier of persistent memory addressable somewhere
on the other end of the CXL network as well, out in the fabric. I know that
there are companies working on basically enhanced flash or other kinds of, you know, another tier
of memory out there as well. I want to get to something else here, though, and that's that,
although this sounds great, it doesn't always work out very well. And you've been in the industry a while.
You've seen this happen where you've got a great technology and it doesn't happen.
I don't want to be raining on the parade here, but there were cache-coherent interconnects.
There was pooling and fabrics.
There were chiplets.
There have been all sorts of things and they don't always work out.
And in fact, we've already seen a little bit of a rough road for Intel's new foundry chiplet vision.
I know it's really early, and I don't want to judge them based on like a month.
But, you know, we've already seen some kind of pushback here. And honestly, some of the companies, even the companies that are committed to CXL, are doing non-CXL things in their own, you know, product lines currently.
There are many ways that this whole thing can be derailed. With a protocol, a committee, a consortium, consensus, all of this, you're slower to market and you
can come to market with an inferior product compared to a proprietary solution.
That happens all the time in computing.
That could happen here.
I think that, you know, I just finished a cloud computing 2023 view. And one of the things that I think we who talk about technology innovation are very comfortable with is that the future vision always takes way longer to arrive than we imagine.
You know, full automated provisioning of multi-cloud solutions,
which was a vision that, you know, was expressed many, many years ago.
Not to mention flying cars.
Exactly. I'm still waiting for that. I would love that. So I think that that's why those
beachhead markets and those beachhead use cases are so important because they establish momentum.
You know, if there wasn't the initial beachhead markets for virtualization,
we may not have seen that technology take off.
If you make it too hard or too complex, you may not see it.
So when we talk about these early use cases of CXL, we look at them and we say,
oh, that's exciting. That's nice. That's fine. But look at
all of these things that we can do long-term. I think that it's important that those technologies
exist because it gets the industry momentum going. If you went to supercomputing or OCP,
you saw that there were, you know, over a dozen companies that were demonstrating early instantiations of CXL, even prior to when, you know, Genoa
and Sapphire hit the market.
And really, you know, we still exist in a world, and will for a long time, where the
CPU is the determinant of a technology actually taking off in the marketplace.
So those platforms were really important. You know,
we've seen Samsung come out with their 512 gig CXL solution. Hynix has a solution in market.
You've seen some disruptors, and I know you've talked to some of the disruptors in this space
with their new technologies and their new switches. And, you know, it's exciting to see these solutions come to market
and early instantiations of what could be.
I think the most important thing to look at in these early stages are,
are they interoperable?
Are they plug and play?
Is the consortium doing its job with compliance testing, and are vendors delivering to that specification, to ensure that full interoperability of solutions and that full dream of memory pools and memory expansion can work regardless of what platform you choose? That's why an industry standard exists.
Then as we move into CXL 2.0 and 3.0,
are the new use cases being adopted?
I think that's what I'll be looking for as we move forward.
And then the most exciting part,
and I think about this all the time,
and the technology that always brings it into my
mind is USB. You know, USB was not invented to give me a fan that, you know,
plugs into my laptop or into my car. It was invented to be a peripheral for things like mice,
but I love where the industry goes with technologies like this. And not that I'm saying that there's going to be a CXL individual fan anytime soon.
But I do think, but yeah, wouldn't that be cool?
But I do think that the most exciting thing about an industry standard like this is what are the companies in stealth right now that are envisioning use cases that we haven't even thought about?
And what will we see coming into the market in 2024 as CXL 2.0 comes to the market and the industry gets to be able to do some really creative things?
I think you're right.
The exciting times are going to start coming around CXL 2.0,
but with CXL 3.0, it's going to get really, really interesting.
I think there'll be unicorn companies that don't even exist right now, potentially.
If somebody nails it, corners the market,
there's going to be a lot of growth, you know, a lot of growth.
You know, our data usage requirements and just the growth in everything computing, everything worldwide.
It's just phenomenal.
So it is good to see that we are trying to optimize it as best we can using technologies like CXL. And I think that in that time period, and I agree,
3.0 is where it gets really exciting because you get to thousands of nodes
within a multi-tiered, I don't even know what you call it, a cluster, a pool,
whatever, we'll come up with terms that feel natural.
But at that point, I think
that you're going to see, can this technology be broadly adopted?
Or is this too complex?
And are we going to see an unequal adoption of it where the most sophisticated data center managers take advantage of these new capabilities and maybe broad enterprise slows?
And what does that mean for the purveyors of technology
as they navigate new composable infrastructure
and traditional infrastructure?
And how does that come to market?
How does that squeeze them? Who can navigate that? Legacy servers. Hey. Yeah, exactly. You heard it here first.
So I'm excited to see it. I'm excited to see what the industry does. I'm excited to see the new
use cases that come to market and the sparks of innovation that are likely
happening in labs today.
It'll be an exciting time for the data center, and it's going to give us so much to talk
about.
Absolutely.
And I think that this is going to be the exciting thing to watch how this changes.
Now, a CXL connected fan does not sound like a good idea.
But CXL, I think, is going to open up a lot of other avenues and things that we haven't really thought about.
One of the things that I've been pointing out on this show is the fact that it will upend a lot of the assumptions that the chip designers are making when they're designing CPUs.
Not just the system designers, but the CPU designers. And I think that CXL and UCIe
will really change their priorities. And I think in a couple of generations,
we're going to see some things that are going to be very different from today's Xeon or EPYC
CPUs. And I think that's going to be pretty exciting.
Absolutely. And I think that we look at the landscape of CPU competition today.
You know, it's the Intel-AMD show when it comes to the competition.
What does this do for the ARM ecosystem, which is also very invested in CXL?
What does it do for something like RISC-V? Does this provide them an avenue
to start scaling initial solutions into the data center?
And how does all of that disrupt, you know,
what we think about in terms of a data center CPU,
data center compute?
How do applications adjust to that?
The application developer is going to have a lot to think about. Do we need more abstraction in the way that we're writing code so all of this innovation and disruption can be kept away from how applications are being composed?
I think that that's going to be an exciting thing to see come to fruition.
But I haven't seen an exciting time like this in data center infrastructure in years in terms of what is an opportunity space for innovation and design. So I'm excited.
I think we're still going to be waiting a while on liquid cooling,
but CXL is at least around the corner.
And flying cars.
Yeah, liquid cooling and flying cars. Which one comes first?
Yep, just as soon as the superconducting magnets are ready.
Exactly.
Well, thank you so much, Allyson. This has been a really interesting conversation and a really thought-provoking one. And it's kind of fun to step away from products and companies and just
talk about what the technology is all about and what it means to people who are trying to,
well, we're trying to make sense of it. So I really appreciate this conversation.
As we wrap up this episode, where can people continue this conversation with you?
Where can they find you?
Where can they follow the other platforms you have?
Well, I've written about CXL and UCIe and had some guests talking about those technologies
on my platform.
You can find me at www.thetecharena.net or at TechAllyson on Twitter and Allyson Klein on LinkedIn.
And I'm always happy to engage and talk about technology and what questions you may have in terms of what these technologies mean for the data center and take the conversation further.
Yep, absolutely. And those of you who have enjoyed these conversations as well,
you'll find us at the Tech Field Day events that I run.
So we are hosting Edge Field Day at the end of February,
which I think is going to be behind us
by the time this episode gets published.
Also, a CXL-themed Tech Field Day event in March,
which some of us are going to be involved in.
So I
urge you all to check out techfieldday.com and learn more about those events and check out the
videos of those on YouTube and LinkedIn. Thank you for listening to Utilizing CXL, part of the
Utilizing Tech podcast series. If you enjoyed this discussion, please do give us a subscription.
You'll find us in your favorite podcast player. You'll find us on YouTube as well. And give us a rating and a review. That really helps. And maybe
a comment. We would love to hear from you. This podcast was brought to you by gestaltit.com,
your home for IT coverage from across the enterprise. For show notes and more episodes,
go to utilizingtech.com or find us on Twitter or Mastodon as Utilizing Tech.
Thanks for listening and we'll see you next time.