Utilizing Tech - Season 7: AI Data Infrastructure Presented by Solidigm - 4x1: What's Next for CXL after Memory Expansion?

Episode Date: October 24, 2022

The first CXL products have emerged, with Samsung delivering memory and storage expanders and MemVerge supporting big memory with its software. Stephen Foskett discusses current and emerging CXL products with Julie Choi of Samsung, Steve Scargall of MemVerge, Shalesh Thusoo of Marvell, and George Apostol of Elastics.cloud. This special episode of Utilizing CXL was recorded live at CXL Forum in New York, with the entire industry watching. Once memory expansion is delivered, where do we go next? Marvell is working to support the new protocol in chipsets, and Elastics.cloud is developing CXL fabric switches. Everyone is ready for Intel and AMD to release their next-generation server chips, which natively support CXL, and the CXL Consortium is already working on the next release!

Links: UtilizingTech.com, Utilizing Tech on Twitter, GestaltIT.com

Guests and Hosts:
Stephen Foskett, Publisher of Gestalt IT and Organizer of Tech Field Day. Find Stephen's writing at GestaltIT.com and on Twitter at @SFoskett.
Julie Choi, Head of New Product Business Marketing, Samsung Memory. Connect with Julie on LinkedIn.
Steve Scargall, Senior Product Manager and Software Architect, MemVerge. Connect with Steve on LinkedIn.
Shalesh Thusoo, VP of CXL Product Development, Marvell. Connect with Shalesh on LinkedIn.
George Apostol, CEO and Cofounder, Elastics.cloud. Connect with George on LinkedIn.

Date: 10/24/2022, @SFoskett, @MemVerge, @Elastics_cloud, @Samsung, @MarvellTech

Transcript
Starting point is 00:00:00 Welcome to Utilizing CXL, the podcast about this emerging CXL technology. I'm your host, Stephen Foskett from Gestalt IT, and we are meeting with a number of exciting people in the industry here at the CXL Forum in New York City. We had the opportunity to sit down and have a bit of a discussion about what's happening with CXL, where it stands, and where it's going. And so we just decided to jump on stage and record this live. So we are in the World Trade Center area of New York. We are at the CXL Forum. We will be doing more of these CXL Forum events in the future. And, of course, this is the first real episode of Utilizing CXL, the CXL-focused podcast season from Utilizing Tech.
Starting point is 00:00:53 So if you'd like to learn more about this podcast or see our previous seasons focused on AI, just go to UtilizingTech.com. But before we get into the conversation, let's meet who's on the panel today. Good to meet you. This is Julie Choi from Samsung Memory Business Unit based in Korea. I am head of new product business marketing. Yes. I'm Steve Scargall. I'm the senior product manager and software architect at MemVerge. Yeah, I'm Shalesh Thusoo. I'm the VP of CXL product development at Marvell. And I'm George Apostol, founder and CEO at Elastics.cloud. So this is a really unusual opportunity for us because, as Frank Berry from MemVerge just put us together here, essentially I just gave an opening introduction here to the CXL forum
Starting point is 00:01:49 and kind of laid out where CXL is from an end user perspective and where it's going. And the goal of this podcast this whole season is going to be to try to explain CXL to the IT architect audience and to help them to understand where this technology is going. And yet, the opportunity to get all of you together for a conversation was too good to pass up. So which of you, I guess I don't want to make you fight, but which of you is working on the thing that's most real in the market? I think it might be Samsung, right? You guys are in the market with a product. So tell us a little bit about where you stand
Starting point is 00:02:28 with your product now. Well, in fact, I think Samsung is leading the industry. On the CXL interface, we actually launched the first product, a memory expander, last year. And then this year, we have also launched the CXL Semantic SSD as a storage device as well. So I think we are in first place, as a pioneer developing this whole market.
Starting point is 00:03:01 Yes. So one of the first hardware products to support CXL. That's right, yes. A CXL expander with 512 gigabytes of memory, yeah. And I know that MemVerge is here already with software that supports CXL as well, right? Correct, yeah.
Starting point is 00:03:15 So we've been doing memory tiering and memory snapshotting for a number of years now, initially with the Optane persistent memory, but yeah, CXL just dovetails right in from our perspective, right? We can continue doing the tiering across that for unmodified applications to give those applications the bandwidth performance that they're looking for
Starting point is 00:03:33 on the capacity as well. And of course, we wouldn't get very far without you guys as well. So tell us where you stand. Yeah, so Marvell is already demonstrating in an FPGA CXL-based memory pooling. So we're doing expansion pooling with a unified architecture. And coming up soon, we'll be doing it with the silicon.
Starting point is 00:03:53 But today, we are demonstrating in an FPGA. At Elastics.cloud, we're focused on a smart, intelligent switch system-on-a-chip. And we are leveraging the capabilities of CXL to extend the capabilities of the switch. So today we have things running in FPGA. We can demonstrate symmetric multiprocessor memory pooling. And at later shows, we'll start to unveil more and more of our functionality in FPGA, with the expectation of having silicon next year. Yeah. So from the external perspective, then,
Starting point is 00:04:32 we actually pretty much went in order here, right? Because we can't do anything without hardware. And so we have a soon-to-ship product from Samsung for memory expansion and storage. We have a soon-to-ship product from Samsung for memory expansion and storage. We have a shipping product in terms of software that enables that product to work. We also have support in, I know that there's support in the Linux kernel, or at least developing support in the Linux kernel. We're getting support from the server
Starting point is 00:05:01 and CPU platforms as well. We heard this morning that we're gonna have support in the next and CPU platforms as well. We heard this morning that we're going to have support in the next generation server platforms from most of the, well, from the two big CPU makers in Q1. And of course, for the folks here on my left, you're working on basically moving this forward into the next generation. So we hear, you know, Marvell, you are doing the chips that are going to glue the next generation products together,
Starting point is 00:05:33 and you're prototyping those in FPGA, and then we're hearing that we're also working on switching fabrics. So essentially, right here, we have a cross-section of the current state and the future state, and of course, all of these products are going to be working together. From the perspective of the future, let me go backwards here and ask you, when do we get CXL fabrics in the market? So I believe we'll start to see solutions come out in the second half of next year. And will those require, those will be CXL 3 version 3 products, right? Well, so the product that we're creating, we're calling it 2.x because it's more than 2, but not all of the features of 3.
Starting point is 00:06:19 The 3 spec is still evolving, and as you know, it takes time to do silicon. But one of the things that we're also tracking is the processors and the speed of those processors coming out. So even though 3.0 is 64 gigatransfers per second, I think most of the devices coming out you'll see in the next year or so are going to be at 32 gigatransfers because that's where the processors are today. Yeah, and I think that that's important. You know, we're excited about CXL, the prospects of CXL.
Starting point is 00:06:50 And certainly, memory expansion is really happening and is really important for people who need to write size memory or need to access more and more memory in their servers. But in the future, I think that we need to kind of temper our enthusiasm and realize that it's gonna be a little while before some of this rack scale architecture and disaggregated systems and composable systems
Starting point is 00:07:14 and all that comes. And in order to get there, we need what basically you guys are working on as well. You know, Marvell, we need controllers. We need chips that support this thing. How does the development process work to create sort of a generic product that can be used by a lot of different products, a lot of different OEMs? How does the development process work, and how are you doing?
Starting point is 00:07:40 Yeah, given CXL is a brand-new technology, and everyone is really looking at different use cases for it, and the use cases are evolving by the month. It's really – so what we did is at Marvell is we actually created an architecture that is really looking at it from an expansion to a pooling to a switching and something that can scale from CXL 1.1 all the way to future CXL devices. There are some thoughts on how it even goes to CXL 3.0. Now, what we have found is, okay, so we need to show how does this technology really help end users and their applications, because it was not very well understood even six months ago in what CXL can do. And in fact, six months from now, we predict that what people will be doing with it is
Starting point is 00:08:27 something that no one's talking about today, right? So we are providing various tools to enable the technology and enable use case, even creation or understanding of CXL going forward. These tools start with FPGA, but the FPGA really is representing our silicon architecture and our silicon. We're not ready to announce our silicon yet, but it's coming up pretty soon associated with that. And we feel that's the best way to really enable
Starting point is 00:08:58 both the early adapters, as you said, is basically the cloud data centers, but also the rest of the industry beyond that, the cloud data centers, but also the rest of the industry beyond that, beyond the cloud data centers, that need to actually try something out, feel it out, before they can actually get more advanced use cases out there. Yeah. Well, it's very smart, and I think that most of the companies in this space are implementing
Starting point is 00:09:18 an FPGA before they try to produce ASICs for these things, because it's just, you know, it's still fairly new, but the FPGA gives you the ability to have real product that really works, at least, you know, in labs and testing and kind of building out those products and yet react as demands change or as the protocols change, right? Right, and then given new technology and, you know, multiple host companies are all signed up for the CXL, everyone has its first generation implementation for everyone. So one of the things that we have done is ensure that we actually can operate across the industry. And not only with multiple hosts, but also with partnering with memory vendors like Samsung to understand whether it will operate or not operate with our solutions in the future.
Starting point is 00:10:09 And I want to talk to you a little bit about software support, because as I said in my introduction, I see software as the potential key that unlocks CXL, the entire CXL market, or the roadblock that stops it from being implemented. because if we don't have software support for this new hardware, essentially what we're going to end up with is point solutions for certain specific use cases that are supported by software, by proprietary software for that solution. But if we have a general software layer
Starting point is 00:10:42 that allows us to, for example, in the case of MemVerge, to unlock the potential for hierarchical or tiered memory, then that means that a whole category of the industry has opened up, right? I mean, you can essentially, MemVerge would be essentially able to support any kind of tiered or hierarchical memory system, right? That's exactly right, yeah. So our vision is to work with all the hardware vendors, whether it be switch guys, device guys, and unify this from an application or a software perspective, right? So, you know, the operating system
Starting point is 00:11:11 Linux has got some CXL tiering built into it today, although it needs to mature a little bit, but our tiering solution, snapshotting solution will work across any of this stuff. And it goes beyond that, right? I mean, it's not just at the device level, from the host level.
Starting point is 00:11:27 You've also got the orchestration, or I guess what the CXL spec calls the fabric manager, right? So somebody has to tell the device guys, the switch guys, what devices, what ports I need for a particular instance, and go make that request, go make it happen, and then present that up to the application. Now, could that be an external host? Sure, that could be done.
Starting point is 00:11:48 Should it be the application itself? More likely, probably, right? So the application requests the resources, sends the request out, and it all magically happens under the covers, right? And that's really where we are with MemVerge. And initially, MemVerge was enabling Optane-based memory, but of course, it seems to me that the product was always designed with the idea in mind that there could be all sorts of different types of memory on different buses and different ways of connecting and so on. And, you know, where do you think memory will reside in future systems? Do you think that it's going to be an expansion card in the server?
Starting point is 00:12:25 Do you think we're going to have a tray of memory in a rack? I mean, is it going to be connected via fabric? Is it going to be shared? What does the future of big memory with CXL look like? I think it's going to be a combination of all of it, right? I mean, depending on the latencies, as you add switches, right, we add that latency factor, which might be okay for a lot of applications, but for some it's not.
Starting point is 00:12:47 So I think we're still going to need the local DRAM, we're still going to need some local high bandwidth memory, maybe some local CXL, but then we can definitely expand out over the fabric to pools and sharing and everything else. So I think that's going to be an exciting thing. So to your point, you know, we may end up with memory trays or arrays, memory arrays, right,
Starting point is 00:13:09 versus disk-based arrays, right, where we just plug it into the fabric somewhere and we can access it from anywhere. Yeah, and that's where I wanted to go here with Samsung as well. Memory trays, memory arrays, that probably sounds pretty good to a company that's focused on memory chips. Yeah. But as Steve mentioned, at this moment, there will be a different market positioning of the memory expander. Maybe some of customers, they will be requiring higher capacity. In that case, we will support as an expander type. And some of devices or some of customers, they will be requiring persistent characteristics. In that case, we will support the dual-mode interface that can support the persistent memory
Starting point is 00:14:09 characteristics. So we are looking at all those types together, yes. Yeah, and I was actually going to go there next. So Samsung is, of course, a huge maker of DRAM, but also a huge maker of flash memory. And you are clearly focused on bringing both of these products to market, which are two somewhat different products, but could be used in, well, it's hard to think about exactly how all that's going to play out. But how do you see this? Do you see it more as DRAM or more as persistent or a combination for different use cases? We are actually looking at all three types, I should say. So this is quite strategic questions in our place, from our perspective, because CXL memory or CXL expander, some customers are asking low-cost version. In that case,
Starting point is 00:15:07 that will also cannibalize our market. So we have some threats as well in memory perspective. However, at the same time, customers are asking more memories whenever they want to adopt flexibly when they need with the AI and ML processing with mem larger memory sizes they will they will have more controllability on their side so as a premium type we are looking at the 512 gigabytes or memory size of the memory expander. And then in the persistent memory wise, we are looking at the CXL semantic SSDs.
Starting point is 00:15:52 And also for some customers who really want low cost version of the memories, we are also looking at whether we can support CXL memory with a low cost version. Yes, so I can say we have all three different products at this moment. Yeah. Yeah, and that's interesting. You know, low cost would be a very, it's interesting to think of that as a different market, but the high capacity might lead us to one of the things that you're enabling down here,
Starting point is 00:16:23 which is sharing memory, sharing access to memory between different CPUs and accelerators and CPUs and so on. And that's something that you guys are enabling in software, right? Yeah, yeah. So we'll be, you know, working with the vendors to bring memory pooling, memory sharing to market. But, you know, again, the software has to be enabled to do that, right? You know, the locking that needs to occur, multi-reader, multi-writer,
Starting point is 00:16:46 all that stuff needs to occur. You could go off and modify the application to go do that, but that's a lot of effort, given the millions of applications that are out there, right? So if we can add that shim in between the unmodified application and the hardware and allow the applications to do that natively and transparently, that's one thing that we're looking at at MemVirtual. So when do you suppose that we're going to get to this point
Starting point is 00:17:10 where we have shared memory? In other words, multi-access, multi... I don't know what the right word for it is, but when do we get there versus just having it be an expansion? Yeah, so I think we will be enabling all of those aspects of it starting coming up next year. And starting with FPGAs to actually show the concept on how it can be enabled.
Starting point is 00:17:39 I do think that deploying it at scale is probably about a couple of years away, because people will have to get comfortable with all the RAS capabilities, the security capabilities. Although the silicon is now being built for that, the silicon actually has a responsibility to be even more RAS capable than a single server, because the blast radius is higher. So we are putting in lots of functionality in there and
Starting point is 00:18:07 security functionalities. Now you specifically are sharing 3.0 does define some ways of doing it, which are really well defined in 3.0. But before 3.0, 2.0, with some software, and VenWord is one of the examples and other people working on it, it can be done, but does require a lot more software and careful orchestration with the hardware. And I would say that you're working on CXL switching, but I think when people hear this,
Starting point is 00:18:39 they may think that that is only this future use case in terms of sharing with multiple nodes and rack scale and everything. But CXL switches don't just live in rack scale architecture, right? I mean, we're going to see these things pretty much everywhere from inside servers as well as outside servers, right?
Starting point is 00:18:58 Yes, and that's what we're driving for. I think one of the advantages of being in a startup is we can move quickly. So we have an FPGA product today where we are demonstrating sharing. We have customers that will be deploying that FPGA using the expanded and extended memory and the ability to share that memory amongst multiple processors. So that's the beginning of it. But in my slide deck, I have a slide deck
Starting point is 00:19:26 called the evolution of composability. And it starts with memory that we have today. It starts then at the persistent memory that she talked about is the next thing. And then coming from that is going to be the processors and accelerators. So CPU, GPU, DPU, whatever XPU comes next is part of the heterogeneous architecture that you need today in order to be able to process those large AI ML workloads that we're seeing.
Starting point is 00:19:56 And so we'll be able to enable that as well, and then to be able to scale that both inside the box and inside the rack and rack to rack. So we're looking at all those solutions to get to that ultimate level of rack level composability. And as you talked about in your presentation, we view composability as the way to create these right size virtual servers that fit the workload. And so we're on a path towards getting there. Well, I think that that's actually a really great summary of sort of the state of where we're at right now. So, you know, it's great
Starting point is 00:20:37 that we were able to have you all join us here for what is, I guess, the first episode of where we're going with this podcast, because we've gotten a pretty good look at sort of where CXL is today with memory expansion products shipping today from Samsung, with software enablement from MemBridge today, and then where it's going in the future, whether it's controllers or switch hardware, whether that enables eventually, as you just said, to get to some sort of rack scale or
Starting point is 00:21:06 even multi-rack scale solution. And I think that those of you listening, keep tuned into this. There's a lot of exciting development happening here. And again, as I said in my presentation today, and as I'm going to say basically on every episode of this, this is an industry-wide consortium. There is basically no company in the industry that isn't involved in this, or at least looking at being involved in this. There's no company in the industry that's not looking to adopt CXL. And I think that what that means is that we could end up with a really transformative technology thanks to the efforts of the people in the industry that are working on developing products but also based on end users who are excited about the possibilities that this brings in terms of transforming the
Starting point is 00:21:59 architecture of their servers, transforming their data centers and delivering products at a, you know, and delivering products at the right product at the right cost, right-sizing memory and that sort of thing, as well as the potential in the future of using this technology to deliver capabilities that they've never had in terms of shared memory and enabling XPUs and rack scale and everything.
Starting point is 00:22:22 So we're going to be talking about this on the Utilizing CXL podcast going forward. Please do subscribe. You'll find us in all of your favorite podcast applications, as well as on YouTube at utilizingtech.com, where, as I said, we've got three seasons talking about artificial intelligence and machine learning. And now we're going to have at least one season talking about CXL technology. So if you'd like to be part of this, if you'd like to follow along,
Starting point is 00:22:49 please do check out utilizingtech.com. We're, of course, also talking about it on Gestalt IT. But for those of you here with me on the panel, where can we continue the conversation with you relative to your products and thoughts on the industry? Yeah, so you can follow us on our website. We've got a lot of content and information, and we'll be appearing at various CXL get-togethers in the coming years.
Starting point is 00:23:13 So elastics.cloud is where you can find us on the web. You'll be seeing basically a demonstration at the OCP forum coming up next week. And you can also get a lot of information from marvell.com from a perspective of where we are going in the future. But coming up, you'll see a lot more information from Marvell coming up. Great.
Starting point is 00:23:39 Thank you. Yeah, and you'll find a lot of CXL-focused blogs, white papers, solution briefs, et cetera, on memverge.com. So I encourage you to get in contact and see where we're going from the software ecosphere perspective. Yeah, it will be a great honor to support all of you who is interested with the CXL memory expander. Thank you.
Starting point is 00:24:01 And we'll include links to all these things in the show notes for the episode as well. So thank you very much for listening to this, the first full episode of Utilizing CXL. If you enjoyed this discussion, please do subscribe because we'll be talking to folks like this in the industry on a weekly basis at utilizingtech.com. This podcast is brought to you by gestaltit.com, your home for IT coverage from across the enterprise, including a weekly tech news show called The Rundown, hosted by me.
Starting point is 00:24:31 For show notes and more episodes, go to utilizingtech.com or follow us on Twitter at Utilizing Tech. I'm Stephen Foskett. You can find me on Twitter at sfoskett. Thanks for listening, and we'll see you next week.
