Utilizing Tech - Season 7: AI Data Infrastructure Presented by Solidigm - 4x06: Enabling CXL with VMware with Arvind Jagannath of VMware
Episode Date: December 5, 2022. A lot of VMware's initiatives are in the same sphere as CXL, and now everybody is wondering if VMware is officially camp CXL. In this episode of Utilizing CXL, Stephen Foskett and co-host Craig Rodgers join Arvind Jagannath from VMware to hear it straight from the horse's mouth. Learn if CXL is on the horizon for VMware as it is for a lot of IT companies of its stature, and what VMware's role is in the enablement of the technology. Watch the full episode to find out how VMware is participating in the CXL revolution and the way its most recent projects support and enable implementation of CXL. Hosts: Stephen Foskett: https://www.twitter.com/SFoskett Craig Rodgers: https://www.twitter.com/CraigRodgersms Guest: Arvind Jagannath, Product Management Lead at VMware https://www.linkedin.com/in/arvindjagannath/ Follow Gestalt IT and Utilizing Tech Website: https://www.UtilizingTech.com/ Website: https://www.GestaltIT.com/ Twitter: https://www.twitter.com/GestaltIT LinkedIn: https://www.linkedin.com/company/1789 #UtilizingCXL #CXL #VMware
Transcript
Welcome to Utilizing Tech, the podcast about emerging technology from Gestalt IT.
This season of Utilizing Tech focuses on CXL, a new technology that promises to revolutionize enterprise computing.
I'm your host, Stephen Foskett, organizer of Tech Field Day and publisher of Gestalt IT.
Joining today as my co-host is Craig Rodgers.
Hi, I'm Craig Rodgers. You can find me on Twitter at CraigRodgersMS.
So, Craig, we were talking quite a lot about CXL,
and we, of course, were part of a panel at the recent OCP Summit
where we did an all-day CXL forum presentation.
One of the things that occurred to us then,
and that we were excited to see that day, was a presentation from VMware, which, of course, is an important prospective part of the CXL world, wouldn't you say?
I wholeheartedly agree.
For CXL to succeed, we need companies like VMware on board supporting that software layer.
Absolutely. And I feel like if CXL is well
supported in the hypervisor, then it basically makes all of this stuff possible because
everything from memory expansion to really all the way up to composability and rack scale
architecture, all of that, it makes a lot of sense to support it in the hypervisor,
maybe even more than the OS,
though, of course, supporting the OS is still really important.
So that's why we decided to invite as our guest in this episode,
the presenter from the CXL forum, Arvind Jagannath from VMware.
Arvind, welcome to the podcast.
Thank you, Stephen, Craig.
It's my pleasure to be here.
Well, it's our pleasure to have you because, you know, although we haven't yet heard exactly what VMware is going to be doing in terms of CXL support officially, having you there really speaks volumes in my mind.
Even just having you, you know, join us on the podcast and join us at CXL Forum.
It says that this is on the VMware roadmap and radar, at least.
Yes, absolutely. So VMware has been working on a lot of memory initiatives and has been enabling a lot of the newer platforms, such as Sapphire Rapids and Genoa. And we definitely are totally into enabling CXL as part of some of those platform enablements.
And what would that support look like? I mean, obviously, if it's supported, sort of passed through to the OS, that's one thing. But would it be the case where this would basically be a tier of memory available for virtual machines to use, just kind of like
what you guys were doing with Optane as well? Yeah, yeah. So we are thinking about several
different plays here. So one is, you know, just a basic enablement, like the one we demoed
during VMware Explore. So there, you know, the CXL-based memory is presented as a separate NUMA node.
And then, you know, we access it as flat memory.
It's basically a memory expansion use case where we showed that, you know, the applications could use additional bandwidth.
But at the same time, we are definitely looking into some other use cases, such as the memory tiering use case, where we could provide better TCO for our customers and reduce overall cost.
And we also have other use cases, such as with accelerators, that I can definitely talk about. It's interesting to think that VMware
are going to be a disaggregated memory hypervisor.
Tiering is one of the obvious things you see.
Obviously, VMware has been able to tier with vSAN,
but now we'll have these potential memory tiers
within the hypervisor.
Are VMware doing anything or can you share anything about tiering specifically within workloads?
Yeah, yeah, definitely.
So VMware has been in this journey over the last few years where we have worked on Intel Optane. So we have enabled the various hardware modes available within
Intel Optane, such as the memory mode and the app direct mode. But at the same time,
VMware and Intel actually supported us and collaborated with us in the software-based
memory tiering architecture, which means that we could actually make use of, I mean, we could be
more aware of some of the latency characteristics of Optane. And we could use, for example, the DAX
mode and sort of treat Optane as a separate, slower, higher-latency tier, and start doing some intelligence around page movements, such as, you know, hot-page and cold-page replacements, which means that you could move pages between the hot and cold tiers and actually make this seamless,
which means that a lot of the workloads start getting sort of a uniform-like performance across the board,
and they don't see a whole lot of performance degradation.
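VMware has not published the internals of this tiering engine, so as a purely illustrative aside: the mechanism Arvind describes, classifying pages as hot or cold from observed accesses and migrating them between a fast DRAM tier and a slower tier, can be sketched roughly like this. The class names, window-based access counting, and threshold below are assumptions for the sketch, not ESXi code.

```python
from dataclasses import dataclass, field

@dataclass
class Page:
    page_id: int
    tier: str = "slow"      # "fast" = DRAM; "slow" = Optane/CXL-like tier (illustrative labels)
    accesses: int = 0       # accesses observed in the current sampling window

@dataclass
class TieringEngine:
    fast_capacity: int      # how many pages fit in the fast tier
    hot_threshold: int = 8  # accesses per window to count as "hot" (assumed value)
    pages: dict = field(default_factory=dict)

    def touch(self, page_id: int) -> None:
        # In a real hypervisor this signal would come from access-bit scans or sampling.
        self.pages.setdefault(page_id, Page(page_id)).accesses += 1

    def rebalance(self) -> None:
        # Promote the hottest pages into the fast tier and demote everything else.
        ranked = sorted(self.pages.values(), key=lambda p: p.accesses, reverse=True)
        for i, page in enumerate(ranked):
            is_hot = i < self.fast_capacity and page.accesses >= self.hot_threshold
            page.tier = "fast" if is_hot else "slow"
        for page in self.pages.values():
            page.accesses = 0  # reset the window so old history does not dominate

engine = TieringEngine(fast_capacity=2)
for _ in range(10):
    engine.touch(0)
    engine.touch(1)
engine.touch(2)
engine.rebalance()
print({p.page_id: p.tier for p in engine.pages.values()})  # pages 0 and 1 land in "fast"
```

The point of making the movement periodic and counter-based is exactly the "seamless" behavior described above: workloads keep a uniform view of memory while the engine quietly shuffles pages underneath them.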
So kind of like DRS for memory?
Yeah, it's interesting you say that, because one of the things we did do is sort of make DRS aware of these different
tiers.
Which means that as we start using one or more of these tiers, DRS is aware across
the cluster and we can make use of that component to make sure that the right host is chosen whenever, you know, applications get into performance kind of degradations.
So, yeah, it is kind of DRS, but this can be taken as something that is built into the hypervisor, which means that software tiering is completely part of our ESX kernel hypervisor operating system.
And it can also make sure that, you know, there are other benefits such as, you know,
we can make sure that across the board when we run multiple VMs,
we can place some guarantees around fairness of the VMs, which
means that we don't see necessarily a case where, you know, one VM can be starved or,
you know, a rogue VM occupies the hottest tier all the time.
And we can sort of control and ensure that, you know, all the workloads across the board
run consistently.
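The fairness guarantee is easy to picture as a per-VM quota on the hottest tier. A minimal sketch, assuming a simple proportional-share policy (the function and share values are hypothetical, not VMware's actual scheduler):

```python
def fast_tier_quota(fast_tier_pages: int, vm_shares: dict) -> dict:
    """Split the fast tier among VMs in proportion to their shares, so that one
    rogue VM with many hot pages still cannot monopolise the hottest tier."""
    total = sum(vm_shares.values())
    return {vm: (fast_tier_pages * share) // total for vm, share in vm_shares.items()}

# Equal shares: the rogue VM is capped at a third of the fast tier, like everyone else.
print(fast_tier_quota(1000, {"vm-a": 100, "vm-b": 100, "rogue-vm": 100}))
```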
Yeah. So on that note, I think that's one of the most important aspects here: basically everything
in vSphere is going to be fundamentally a shared resource.
And we're used to having memory be a really unified homogenous thing. Essentially,
any memory on the system is the same for any process running.
But that's not necessarily the case, right? Because, I mean, VMware vSphere is designed
to recognize that under NUMA, there could be some differences already in terms of memory access.
And of course, as you mentioned, with Optane persistent memory, there's quite a big difference
between different memory regions. I think we're assuming that CXL memory expansion will look somewhere in between NUMA
and Optane, in that it'll still be maybe not as high latency as Optane, maybe a little higher
performance, but not quite system memory. Is that right? Yeah, yeah, that's a good point, Stephen.
So the VMware philosophy is to make sure
that the underlying operating system
handles the different latency characteristics
of these different devices
and just presents a plain sort of uniform address space
across the board.
So basically what we do is, you know,
and this philosophy is followed
for several of our hardware enablement kind of,
you know, efforts.
So what we are doing is, you know,
we are taking all these different types of memory
and it's not necessary that we use, you know,
a strict software tiering mechanism for different memories.
I mean, there are still certain memory expanders that can actually fall within, like you said, between the DRAM and Optane buckets, which we could handle by doing smaller optimizations with NUMA. But for overall software tiering, we are targeting that for a memory-semantic SSD,
for example, that kind of use case where you might see
a whole lot of larger latencies.
And I think we also might get into a use case
where we see pooling as one of the use cases where
you know, hosts share memory.
So that could also become one of the things that we start handling where you know, multiple
hosts start sharing memory and accessing memory simultaneously from the same pool.
So that is another thing that we are looking into. There is yet another
thing that we want to look into, which is, you know, what we have seen across our
customer base is a lot of stranded memory. So what happens is, you know,
when our customers provision hosts, there is a lot of memory that's sitting idle on these hosts.
So what we want to be able to do is even look into schemes
where we could potentially access this memory from other hosts
and share memory from these other hosts.
So this is pooling of a different kind, if you see.
So we are also looking into such use cases.
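As a rough illustration of the stranded-memory point, the sketch below computes idle memory per host and naively plans to borrow it for a host that needs more. The host names, capacities, and the planning function are made up for illustration; this is not how ESXi or a CXL pooling device actually works.

```python
def stranded_gib(hosts: dict) -> dict:
    """Memory installed but not consumed on each host; hosts maps name -> (installed, used) in GiB."""
    return {name: installed - used for name, (installed, used) in hosts.items()}

def plan_borrowing(hosts: dict, demand_gib: int, requester: str):
    """Naively borrow stranded memory, largest donors first, until the request is met."""
    plan, remaining = [], demand_gib
    for name, free in sorted(stranded_gib(hosts).items(), key=lambda kv: kv[1], reverse=True):
        if name == requester or remaining <= 0:
            continue
        take = min(free, remaining)
        if take > 0:
            plan.append((name, take))
            remaining -= take
    return plan, remaining

hosts = {"esx-01": (1024, 900), "esx-02": (1024, 300), "esx-03": (512, 200)}
print(plan_borrowing(hosts, demand_gib=600, requester="esx-01"))
# ([('esx-02', 600)], 0): esx-02 has 724 GiB sitting idle and can cover the whole request
```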
Yeah, I think that there's a whole spectrum of things from the very simplest initial use case,
which, essentially, just for those who haven't kind of caught on to this, has already been introduced
by SK Hynix and Samsung and is already coming to market with the next generation server platforms. And
that is basically adding a memory expansion card essentially on the CXL slash PCIe bus that would
allow you to, for example, add a half a terabyte of RAM to a system that already has all the memory
channels filled up, just so that you have a little bit more RAM. It's going to be a little higher latency, but you can use it. And obviously that would be a very useful thing.
But what you're describing is sort of what comes next after that. So one of the next things was,
well, what about an external chassis with RAM that could be shared among different hosts?
And then the next thing was, what about pooling memory, where you have different hosts dynamically sharing memory?
And then the next thing was, well, what if you have hosts that can access each other's memory?
And beyond that was this idea of what if you can cooperatively access memory from host to host?
Now, that's a really interesting use case in a vSphere cluster.
I assume that's on your radar as some future enhancements. Absolutely. And I can talk
about that in some form of a disclaimer or a disclosure, I mean, under NDA, which is, you know,
we are looking at several different use cases. While, you know, we see massive proliferation
with respect to, you know, just the basic use cases with memory expansion and perhaps the lower TCO option with
memory semantic SSD. But the DRAM host sharing is something that we already do in some respect.
For example, if you look at vMotion, vMotion is sort of a subset of this use case where we
actually track pages across two hosts and make sure that, you
know, pages can be brought back from the other host if need be, right? And so there is actually
a lot of IP within VMware engineering where, you know, VMware has worked on some of this stuff,
these memory issues. There is another use case that we are targeting, which is, you know,
using an accelerator. So with an accelerator, it opens up yet another kind of interesting use case,
which is, you know, we will be able to sort of look into doing vMotion enhancements,
which means that, you know, for example, if you see,
like I mentioned, vMotion is really a subset of memory access or, you know, memory tracking.
So when we actually do vMotion and we actually actively track pages and start moving the pages
to the target host, you will see faster vMotions happening.
And accelerators are, you know, sort of,
this requires special IP within the accelerator,
which means that, you know,
these are some technologies that serve VMware customers really well
because, you know, vMotion is a huge pain point
and host evacuation is a pain point for a lot of the VMware customers.
So what we can do is, you know, we can leverage accelerators and we can actually see if, you know,
we can track this memory and choose good target hosts into which we could easily vMotion.
And so vMotion becomes less of a pain point for these larger-memory hosts, which is something a lot of our customers experience.
The other use cases that we can look into with the accelerator are, you know, really,
it opens up a whole new gamut of use cases. Things like, you know, encryption or dedupe,
you know, or, you know, tracking memory resiliency. Because, you know, this is all about memory.
We could look into the accelerator sort of, you know, acting on behalf of the CPU and
looking into and tracking, you know, different memory regions of applications, making sure
that it's resilient and protecting against, for example, even hardware failures. So we can look at this as a complete hardware virtualization,
memory hardware virtualization.
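The vMotion internals referenced here are not spelled out, but the page-tracking idea maps onto the classic iterative pre-copy pattern, which is the kind of work an accelerator could take off the CPU. A minimal model, assuming a fixed redirty ratio per round (an assumption for the sketch, not measured behavior):

```python
def precopy_migration(total_pages: int, redirty_ratio: float = 0.05,
                      switchover_threshold: int = 64, max_rounds: int = 30):
    """Textbook iterative pre-copy: send every page once, then keep re-sending whatever
    the guest dirtied while the previous round was in flight. When the copy rate beats
    the dirty rate, the set shrinks each round until it is small enough to pause the VM
    briefly and transfer the remainder (the switchover)."""
    dirty = total_pages   # round 1 has to send everything
    copied = 0
    for round_no in range(1, max_rounds + 1):
        copied += dirty
        dirty = int(dirty * redirty_ratio)  # pages re-dirtied during this round's copy
        if dirty <= switchover_threshold:
            break
    return {"rounds": round_no, "pages_copied": copied, "switchover_pages": dirty}

print(precopy_migration(total_pages=1_000_000))
# Converges in a handful of rounds because a 5% redirty ratio shrinks the set geometrically.
```

Faster page tracking, for example offloaded dirty-page detection, shortens each round, which is why Arvind frames accelerators as a way to make host evacuation less painful.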
VMware have always been very good at giving insight
into the underlying hardware,
and they've also helped very well
so we can track it and monitor it, and build operations around it.
And have you faced any challenges around diving into CXL?
Because there's no, I don't know, are there standards there?
How are you monitoring and diving into that then?
Yeah, that's a good point, Craig. So VMware has been heavily focused on sort of the operational aspects of, you know, any new hardware technology that we bring.
We believe that operations is one of the core pain points that a lot of our customers need to solve, which we need to look into.
In terms of making sure that, you know,
we get the smoothest possible operations for our customers, we want to make sure that we provide the appropriate
monitoring and measurement of,
you know, memory across the board. So,
for example, I mentioned DRS as, you know, one of the places where we look to actually do proper measurement and load balancing across the cluster. But monitoring is useful as well
because even with our current latest release, we introduced something called vMMR, which actually
makes sure that we can track the different tiers' memory usage, which means we can track
bandwidth and miss rates and current latencies for the different tiers.
Customers are able to make a better judgment of how they want to use these different
tiers.
And they could even build some sort of, you know, proactive alerting and proactive load
balancing capabilities with DRS using these measured kind of monitoring statistics.
And that, I believe, is going to help customers a lot.
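The metrics vMMR actually exposes are not enumerated here, so the following is only a hypothetical illustration of the pattern described: sample per-tier bandwidth, miss rate, and latency, then raise proactive alerts that something like DRS or an operator could act on. The tier labels and thresholds are assumptions.

```python
from dataclasses import dataclass

@dataclass
class TierSample:
    tier: str              # e.g. "DRAM" or "CXL" (illustrative labels)
    bandwidth_gbps: float
    miss_rate: float       # fraction of accesses falling through to a slower tier
    latency_ns: float

THRESHOLDS = {"miss_rate": 0.20, "latency_ns": 400.0}  # hypothetical alert limits

def proactive_alerts(samples):
    """Flag tiers whose stats suggest workloads should be rebalanced before users notice."""
    alerts = []
    for s in samples:
        if s.miss_rate > THRESHOLDS["miss_rate"]:
            alerts.append(f"{s.tier}: miss rate {s.miss_rate:.0%} above target")
        if s.latency_ns > THRESHOLDS["latency_ns"]:
            alerts.append(f"{s.tier}: latency {s.latency_ns:.0f} ns above target")
    return alerts

print(proactive_alerts([
    TierSample("DRAM", bandwidth_gbps=180.0, miss_rate=0.02, latency_ns=90.0),
    TierSample("CXL",  bandwidth_gbps=45.0,  miss_rate=0.31, latency_ns=250.0),
]))  # -> ['CXL: miss rate 31% above target']
```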
Absolutely. And VMware will be the people to deliver that for sure, given the sheer scale
of VMware deployments worldwide. People have had that insight on everything else; they're
going to want it on CXL, if not more. If we look at adding memory, say, to a host,
are you working with any vendors
to let that happen dynamically,
where you could move memory from one host to another
based on current scenarios?
How dynamic do you think that will be?
Yeah, so currently we are not
really working with a specific vendor.
So VMware is really an ecosystem player.
We view all vendors the same.
There isn't a specific vendor we are working with on this specific architecture.
I mean, we are looking at it as an infrastructure piece, which we could enable in the sense that, you know, when memory has to be moved dynamically, we will make those changes that can enable this in our hypervisor, independent of any specific vendor.
So these types of innovations should be available as we progress in the CXL journey.
One of the things you mentioned there was, and I hadn't really thought about this, the fact that vMotion helped sort of set the table a little bit for some more advanced memory manipulation. Another thing that I know is in vSphere is the Bitfusion product, which would allow you to use external
GPUs dynamically within vSphere. And of course, that's very valuable in a lot of use cases.
But it strikes me that that also overlaps with the future CXL mission, which would be to share
devices like GPUs as well in a more dynamic way.
Does that also, is that same technology going to be able to kind of port forward into the CXL world?
Yeah, we definitely are, you know, excited about the possibilities with the GPUs and sharing memory for the GPUs, etc. And, you know, we already have
technologies, like you mentioned, Bitfusion already sort of enables this composability
and disaggregation of resources. So definitely, when CXL comes in, and then we also have, you know, GPUDirect RDMA,
which is sort of another version where we can offload
some of the resources from the CPU.
And we have the SmartNIC play where we can use SmartNIC to offload some of the processing
that CPUs normally would do.
So definitely, like you mentioned, Stephen, Bitfusion is an area we want to look into
in the future. Just in general, IO devices themselves could, you know, become more and more
sort of, you know, aligned with the CXL kind of roadmap. I mean, we already enable, you know,
a lot of the networking offloads, for example, I mean,
flow-based offloads, more intelligent offloads into the network, so on the NICs, for example. So we can
definitely look into CXL as, you know, sort of offering another level of, you know, doing
better offloads and better flows, and doing more of a proactive kind of offload.
And Bitfusion could definitely be part of that.
I guess another sort of future here is, of course, the big question,
which is composable infrastructure.
And I know that there have been many proposals
on how that might be enabled.
It is pretty exciting to think that we could deploy really a custom hardware platform to
match our custom software platform, our custom virtual machines.
I think some people might say, oh, well, composable infrastructure doesn't need vSphere.
But I think that's really not the case.
In fact, I think one of the coolest things about composability is the idea that you can put together a big server that has a configuration you could never achieve in sort of off-the-shelf
hardware. In other words, maybe I need, I don't know, 39 CPUs. And maybe I need, I don't know, 724 gigabytes of memory and maybe I need a half a GPU.
Go build me that guy. Well, with composability, theoretically, you could build me that guy.
And that would actually be really cool with vSphere because you could basically deploy
a piece of physical infrastructure that is impossible in the physical world.
That's pretty cool.
Is that how VMware sees it?
Or might we see actually a different kind of situation where maybe you guys are the
arbiters of composability?
Yeah, absolutely.
So we see these use cases, Stephen, already with hyperscalers, right?
So a lot of the cloud providers, or even on-prem customers,
are trying to provide sort of the cloud-like operating model, you know, trying to provide sort of a DevOps
kind of deployment. So we definitely see that, you know, there is a lot of value in sort of doing
such composability and disaggregation of resources. So this takes
sort of the hardware abstraction to a completely different level. When you think about bare metal
provisioning, you can actually think about creating a bare metal instance or a server
dynamically. And CXL actually enables this; memory used to be sort of the last frontier here, you know.
Of course, CPU is also one of the issues here that we need to tackle with composability,
but memory is one of the bigger sort of challenges.
And CXL definitely helps with that, where you could dynamically create this
hardware abstraction. And then you could dynamically create a server for your specific
customer or a tenant or a use case that such customers can use. So I guess it's probably too
early to guess what exactly this is going to look like, because of course none of this hardware even exists yet. But it is pretty cool to think about a future where
the hardware is as dynamic and configurable as vSphere, you know. Yeah, yeah. And I think,
in terms of the hardware architecture and, you know, the fabric and how things can get assembled,
it could be a combination of, you know, using specialized devices like accelerators and, you know, special switches and, you know, fabric managers.
On the software side, we already have, you know, DRS sort of, you know, in the same mode or method as a fabric manager. And we could definitely
think about enhancing the capabilities of DRS to actually act like a fabric manager across
multiple hosts in a cluster. But at the same time, we are seeing that it's easy to build
the infrastructure pieces like the switches
or the pooled, shared appliances.
But at the same time,
how do you operationalize it
and make it very simple for customers
to deploy, configure, manage, and use,
that still has to be solved.
And I think VMware can play a significant role
in making sure the operations problem is solved.
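To make Stephen's earlier "39 CPUs, 724 gigabytes, half a GPU" example concrete, here is a minimal sketch of what a fabric-manager-style composition request might look like. The pool sizes, resource names, and function are hypothetical, not any VMware or CXL Consortium API.

```python
def compose_server(pools: dict, request: dict) -> dict:
    """Carve an oddly shaped 'server' out of disaggregated rack pools, the way a CXL
    fabric manager plus something DRS-like might, reserving capacity as it goes."""
    if any(request.get(r, 0) > pools.get(r, 0) for r in request):
        raise ValueError("not enough free capacity in the rack for this composition")
    for resource, amount in request.items():
        pools[resource] -= amount   # reserve it out of the shared pools
    return dict(request)

rack = {"cpus": 256, "memory_gib": 4096, "gpus": 8.0}
custom = compose_server(rack, {"cpus": 39, "memory_gib": 724, "gpus": 0.5})
print(custom)  # a 39-CPU / 724 GiB / half-a-GPU instance you could never buy off the shelf
print(rack)    # the rack pools shrink by exactly what was composed
```

The operational problem Arvind points to is everything around this call: inventory, placement, failure handling, and making the composed instance look like an ordinary host to the rest of the stack.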
And have you been looking at any specific use cases
within the VMware ecosystem? A couple that spring to mind would be Tanzu,
VDI, Horizon. Do you see CXL making any impact there or adding any operational gains in terms
of efficiency? Yeah, absolutely, Craig. It's interesting you bring this up because when we were actually working on Project Capitola and
Intel Optane and the software tiering, VDI emerged as
one of the prime use cases because
customers want to scale up their memory and they want
to consolidate their servers. They don't want extra hosts
to be deployed necessarily. And,
you know, it also helps them with some of their green initiatives and power cooling and other
management. So definitely, you know, in terms of such manageability, it definitely helps when VDI-like workloads come into the picture.
It helps when you have, you know, different tiers and you have different types of memory being used.
And, you know, VMware providing this single access or a uniform access across these different tiers and scaling up that memory definitely helps for such workloads.
That's exciting because, you know,
technologies like this that force you to change how you architect solutions
are going to have huge impact, massive impact.
And if VMware are on board, letting companies make better use of their equipment and giving them the control that they're used to having and the insight to run it operationally well, I think it could be successful.
So I really think it's great that VMware is involved here.
And I think it's actually reassuring to the industry,
just to have you here say this. Now, we know that you haven't announced this support. I imagine
that'll come at a VMware Explore event pretty soon. But, you know, I really appreciate you
being willing to come on here and talk to the world and just share your enthusiasm for the
technology personally, as well as,
you know, from a company perspective. One of the things that occurred to me as soon as I saw the
first CXL announcements was that this is going to require especially VMware, Microsoft, Linux
support. We've seen some Linux support. We've seen some third-party drivers and software
supporting it. But the fact that VMware is there, that you are going to be
eagerly working on this stuff is really reassuring. You can't give us any clues,
I know, about when this stuff might happen. But I guess, is this a major thing?
Does this have to wait for some major revision of vSphere?
Or is this the kind of thing we might see come along sooner than that?
What we are currently doing is, we are actually starting our enablement journey with CXL with 1.1.
So we are already talking to a lot of vendors,
including the CPU vendors,
and testing CXL on some of their platforms. But really, what we see as, you know, a productized version is, you know,
more on the CXL 2.0 kind of timeframe, with Granite Rapids, because we feel that, you know, the ecosystem will evolve and, you know, it
will become more mature, and use cases will become more clear. Currently we do
have certain target use cases in mind that we are focusing on and that engineering
is working on. But, you know, as we mature and as these technologies become more and more sort of baked, I think CXL 2.0 seems to be the right kind of fit, or the environment where we fully start supporting CXL.
And just as a reminder to the audience as well, so you mentioned, you know, so the first
CPU platforms that really support CXL are coming or have been announced by AMD and are widely
expected to be announced by Intel very soon. The next generation, as you mentioned, is widely
expected to follow that fairly quickly. So,
you know, next year at this time, we'll probably see this, at least from a hardware perspective,
being rolled out. And as for software, of course, you need software to make
the hardware run. And I think it would be really exciting to see an announcement from somebody like VMware in coordination with those things.
So this sounds great.
And of course, beyond that, you know, we've got more and more coming.
So we heard again at the CXL forum from the head of the CXL Consortium a lot of excitement about PCI Express 5, PCI Express 6, CXL 3, and CXL beyond 3.
And there is a huge roadmap of support in the industry from a software and hardware perspective for basically everything we've been talking about.
So for the purposes of this podcast, this may sound a little pie in the sky, a little futuristic, but I wouldn't bet on that.
This is really happening. This is really coming out and we're going to start seeing people deploying this stuff now in the
first quarter of 2023. And I think that there's a good chance that we're going to see more support
from companies like VMware in the following year. So thank you for that update. Before we go,
where can people connect with you and continue this conversation, Arvind? You can all reach me on LinkedIn, and you can just search for me and
please feel free to reach out and I'll be able to provide more information on our CXL journey. And at the same time, we are always
doing more blogs and white papers on CXL. We already did a few on memory in general, but
going forward, please look out for more blogs about our CXL enablement and happy to share more when you connect with me.
I'll also point out that the presentation you mentioned from Flash Memory Summit is available
on YouTube. If people look that up, it's called Towards a CXL Future with VMware,
and they could probably find it. We'll put it in the show notes here too.
So is the OCP one. Yeah. Great. Craig, you and I just published a big white paper that we
contributed to for Intel. If you go to gestaltit.com, you'll see that in the sidebar.
What else is going on? Geez, too much. Lots of podcasts, lots of writing. I have a lot of writing to do this week, actually. But the Intel white paper was fantastic to see coming out. Obviously we touched on VMware a lot in that white paper, and we made some observations of where we think some things might be going,
and we called out the perfect configuration.
Yeah, it was great to work on. It was a good team.
Thank you very much, Craig, for that.
And, of course, also you did a presentation for Tech Field Day on CXL,
and we'll include that in the show notes too.
So thanks for joining us for the Utilizing CXL podcast,
part of the Utilizing Tech podcast series. If you enjoyed this discussion, please do subscribe in your favorite podcast application and give us a review. This podcast is brought to you by gestaltit.com, your home for IT coverage from across the enterprise. For show notes and more episodes, go to utilizingtech.com or find us on Twitter at UtilizingTech. Thanks for listening, and we'll see you next week.