Storage Developer Conference - #177: NVM Express State of the Union

Episode Date: December 5, 2022

...

Transcript
Starting point is 00:00:00 Hello, everybody. Mark Carlson here, SNIA Technical Council Co-Chair. Welcome to the SDC Podcast. Every week, the SDC Podcast presents important technical topics to the storage developer community. Each episode is hand-selected by the SNIA Technical Council from the presentations at our annual Storage Developer Conference. The link to the slides is available in the show notes at snia.org slash podcasts. You are listening to SDC Podcast, episode number 177. Welcome, everybody. We're going to talk today about NVM Express, kind of where we're at as an organization, what we've got going in terms of the standard, just kind of give a state of the union. My name is Nick Adams.
Starting point is 00:00:57 I'm a principal engineer at Intel. I do a lot of firmware architecture work, and I also am on the NVMe board and do quite a bit of work with the NVMe working group, the technical working group. So we'll kind of get started just with a little bit of a background of NVM Express. I think that probably a lot of you are very aware of this specific graphic. But one of the key things here is that even up to 2017, 2018, people were just wondering about, how's NVMe going to do? Is it going to be kind of growing
Starting point is 00:01:35 and kind of take over in terms of storage or not? And as you can see here, NVMe growth in the marketplace in terms of SSD capacity is just outpacing everything. As you go out to 2025 and 2026, it's nearly ubiquitous. This is just a testament to the ability to standardize an interface to a good technology and see the results of that work. Do you have a similar graph for SME over cover? Not yet.
Starting point is 00:02:14 Something that we should do as we go forward, there's no question. Work in progress. You know, but as we look here, right, at kind of the breakdown across enterprise, cloud, and client, you can see just that continued growth in every one of these categories. You don't see that it's only growing in one particular place. You see a lot of growth across client, enterprise, and cloud in terms of units here. This is number of units, not necessarily size specifically. You know, and one of the key things about NVMe is that, you know, we're kind of across many different types of devices, right, with the intent of getting to the bigger and bigger devices, clear up to, like, warehouse size, all the way down into small stuff like tablets and even to cell phones.
Starting point is 00:03:03 And so, you know, ubiquity and storage is something that, you know, there's a lot of value in. If you go to past, you know, storage standards, there's always differences between client, server, and, you know, kind of outside of those, you know, kind of, you know, historically PC-centric spaces. But now we're seeing a lot more consistency across that. And I think that that's something that really bodes well for at least the storage space. So then as we kind of go forward, you know, what did we do with NVMe 2.0? NVMe 2.0 was released this last year. You know, one of the big things that we really drove was the refactoring of the specifications. And people are like, you know,
Starting point is 00:03:51 why would you do refactoring of the specification? What really drove that? You know, what we're really looking for is to make sure that we had a clear definition of, like, the foundation of what made NVMe NVMe. We had a kind of a mix in terms of PCIe was really built into the base specification initially, and then we had added fabrics as something kind of off to the side, but that didn't really reflect what was going on from an architectural standpoint. You know, we really needed to be able to kind of readjust the specification to reflect the architecture to allow us to be able to expand and extend in the way that the industry needed. And, you know, part of that was kind of splitting
Starting point is 00:04:31 out that PCI Express portion, making that more of a transport as opposed to something fundamental to the actual NVMe specification, and then moving those kind of the fabric aspects, not any particular transport, but kind of the fundamental pieces of what was fabrics into the base specification so that we had one architecture, whether it's a fabric-based or, you know, on-device type-based interface. So that was a lot of what drove us to do that refactoring. As we look at how that has worked out, we now have this NVMe-based specification that really defines the core architecture of what is NVMe. It has all of the extensions
Starting point is 00:05:16 in terms of optional features as well as that base foundational architecture. One of the changes you'll see in the spec itself is that there's a very kind of robust architectural section that really describes like how this functions together, like how is it intended to work together. You know before that had been kind of disparate throughout the spec and so that was one of the key things that really came in to the base spec itself. The next thing that you'll see is that we've created these command set
specifications. The NVM command set, that's what we've always had there. It's the block storage. The zoned namespace command set, the key value command set, these are now really, you know, they're kind of cordoned off as separate command set specifications, so that if you're going to do an implementation of a particular command set, you know exactly what you need to get out of the specification, and you know what you don't need out of the specification. And that's something that's really helpful when it comes to design or implementation. Then when we go over and we talk about transports, this is really like, what is the medium that we're going to be having that controller on that we're going to be talking to
Starting point is 00:06:25 from whatever that host system is. So here, the NVMe over PCI Express, it's pulled out as a separate, consistent, standalone specification. NVMe over RDMA for the Ethernet-based, and then also NVMe over TCP, again, Ethernet-based, but for TCP. And, you know, kind of outside of NVM Express, there's also support for Fibre Channel that's also been defined, again as a separate transport specification. Beyond this, then we still have the NVMe Management Interface specification. So as you can see, we now have a family of specs that really make up what is NVMe 2.0. And as we go forward, you'll see from our roadmap that this even expands further based on what we're expanding within the architecture.
Starting point is 00:07:15 So since the NVMe 2.0 family of specs was released, we currently have, well, I say currently, as of the end of May, we have 27 authorized technical proposals. This means active solutions that NVME as an organization is working on. We've got 30 technical proposals that are ratified and publicly posted to our website, as well as five ratified ECNs. So each of these documents is effort that's going on since the release of the NVMe 2.0 family of specs. That effort has continued. These numbers are now even higher. So the point is that there's active work going on
Starting point is 00:08:00 within the NVM standards community. So the next thing I'd like to talk about is really our specification roadmap. You know, in the past, when we've talked about roadmaps, we've really focused on kind of, you know, our 1.0 release or a 1.1 release or specification, you know, revisions. But one of the things is with the amount of content that we're processing at this
Starting point is 00:08:27 point and how it works within the various specifications, we've transitioned to a new way of talking about our roadmap. Our roadmap is now being talked about in terms of features. And so when we look at this, when we look at the green boxes that we've got here, these are ratified features that are already released as technical proposals, and those technical proposals are independently downloadable from our website. So we have dispersed namespaces and the NVMe over Fabric automated discovery. There's a couple of TPs around that area. This is features that we've released that are already ratified.
We have a set of planned features over the rest of this year and into early next year. Scalable resource management, a network boot and UEFI specification. I'll come back to that in a second. Cross-namespace copy, which has a couple of different features, and we'll go through that further in the rest of the presentation. Some enhancements to NVMe over Fabrics discovery. Flexible data placement. Key Per I/O. And, of course, computational programs, which also is kind of partnered with something called subsystem local memory.
Starting point is 00:09:39 You know, one of the things that you'll see here is that none of these are tied to a specific specification. That's because some of our technical proposals are really contained within one specification, and some of them cross different specifications. You know, some of the examples, like on the computational programs or subsystem local memory, these are things that are going to have some impact on the core architecture of what we're doing as NVMe. There'll be base spec changes as well as the planned new specifications, even potentially some changes to other specs within that family of specifications. So there's just different impact of some of these different features. The other thing I really wanted to point out here is with regard to that network boot. Network boot is a new specification that we will release that really talks to the details of how you do booting of NVMe over fabrics in a consistent kind of industry standard type of way.
Starting point is 00:10:36 You know, there's work with UEFI and kind of that standards body as well to make sure that there's alignment. And, you know, this is going to be released as a separate spec, and it will be able to be implemented even before kind of the rest of the family of specs gets out because it's basically standalone and can be done independently. So we go forward. I wanted to just kind of talk about some of the features that we've added to the NVMe specification. The first one that I want to talk about is dispersed namespaces. But before I go there, I really want to give a little bit of background,
just to give some context before we talk about what has actually changed. So one of the things that you have to understand is, with regard to what is an NVM subsystem, as you can see in the picture on the right, historically an NVM subsystem contains one or more controllers, some number of namespaces, zero or more, and then one or more ports,
Starting point is 00:11:34 a way for the host to talk to the controller. With that, basically the controller, that's that interface between the host and the NVM subsystem. And the namespace is some set of non-volatile memory that's available to be used for data storage. A dispersed namespace, it's a shared namespace that may be concurrently accessed by controllers from two or more NVM subsystems. And I'll show you a picture of this, a couple of usage models, on the next slide. And a couple of key things there, though, is that a log page provides a list of NQNs, a way of being able to identify all of the NVM subsystems that contain
Starting point is 00:12:18 controllers that can be accessed, or controllers that can access that dispersed namespace. Also, the NVM subsystem may support reservations on dispersed namespaces. So we've added the capability of supporting reservations in this new environment. So let's talk about a couple of usage models that actually make use of or can make use of dispersed namespaces. So as you'll notice in the picture on the left, we're talking about like an online data migration scenario. Different from the picture before, there's actually two NVM subsystems in this picture. You've got one that has a controller, multiple namespaces,
Starting point is 00:13:00 one local namespace, and one dispersed namespace, shared namespace. And then on the right of that picture, you have a separate controller that has, again, a private namespace, the NS1, and then NSID3, which is, again, that shared dispersed namespace that you can see as NSB. It's a little bit hard to picture, but that's the shared one down here.
Starting point is 00:13:24 And the intent here is that you can have this new controller have the ability to migrate data from the first controller to the second controller, and you've got one host that can access these things independently. Even though they're two distinct controllers, they're sharing one namespace across the two of them. That's one of the uses for dispersed namespaces. Another one is about data replication. So in this case, you know, you have the ability to have the same data available and accessible. You know, from a use model standpoint, maybe we've got an office that's here in California and another office that's over in the East Coast. And you want to be able to have a local copy of the exact same data. But you need to be able to keep those things in sync.
Starting point is 00:14:15 This is a scenario that allows you to be able to do that. You have one namespace shared across NVM subsystems that are actually in two physically separated locations. And you're able to get hosts that can talk to that local server to be able to do that. The third usage is really about high availability. This is where you want to make sure that no matter if there is any area of your overall system that comes down, that you have access to your data. So whether that's a connection between the host and one of the controllers, whether one of the controllers goes down or one of the hosts goes down, you always have access to your data very, very, very quickly, right? And so again, you've got just multiple different scenarios that we're now able to cover because of the kind of the extension to the specification around dispersed namespaces.
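To make those relationships easier to picture, here is a minimal C sketch of the hierarchy described above: controllers, ports, and namespaces inside an NVM subsystem, with a flag for a namespace that is dispersed across subsystems. The type and field names are invented for illustration; they are not structures defined by the NVMe specifications.

```c
#include <stdint.h>

#define MAX_NAMESPACES 32

/* Illustrative model only; names are not from the NVMe specifications. */
struct namespace_info {
    uint32_t nsid;          /* namespace ID within one subsystem                   */
    uint64_t capacity_lbas; /* some set of non-volatile memory used for storage    */
    int      shared;        /* accessible by more than one controller              */
    int      dispersed;     /* concurrently accessible from two or more subsystems */
};

struct nvm_subsystem {
    char subsys_nqn[224];   /* NQN identifying this NVM subsystem                */
    int  num_controllers;   /* one or more controllers (host-facing interfaces)  */
    int  num_ports;         /* one or more ports the host connects through       */
    int  num_namespaces;    /* zero or more namespaces                            */
    struct namespace_info ns[MAX_NAMESPACES];
};

/* A dispersed namespace is the same logical namespace appearing in two or more
 * NVM subsystems; the log page mentioned above lists the NQNs of all subsystems
 * whose controllers can access it. */
```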
Starting point is 00:15:11 So the next thing that we'd like to talk about today is about NVMe over Fabric's discovery enhancements. We've got a couple of different things that we are going to discuss here. Automated discovery of Fabric's discovery controllers for IP networks. So how do we find these discovery controllers? What are they? We'll talk about that a little bit. The NVMe over Fabric's centralized discovery controller. What's a centralized discovery controller and why do I care? We'll go through that. And then some activity that's
Starting point is 00:15:35 still going on inside of the standards body around discovery enhancements. Subsystem-driven zoning with pull registrations. And we'll talk to all three of these things in just a little bit more detail. First, let's walk through what it actually means. What is a discovery controller? Just want to give a little bit of context. So if we're a host and we want to be able to understand what fabric-based controllers are out there. The way that NVMe is set up and architected, you have to be aware of a discovery controller.
Starting point is 00:16:12 And historically, that discovery controller has been known either through administrative configuration, a priori knowledge, or some mechanism that's outside of the specification. What we've done here is we've added a new common way of being able to automate that discovery, but we'll come back to that in a minute. Once we're able to actually get access to that discovery service that contains the discovery controller, the host is able to get a log page. It's called the discovery log page, and it basically has a list of entries of what are the NVMe over Fabrics controllers available
Starting point is 00:16:48 inside of this network. Then you're able to go out, establish a connection with the specific I/O controller that you're looking to establish that connection with. But when we're talking about this new added functionality, what we've done is we've used something called Domain Name System Service Discovery, or DNS service discovery, and the idea is that we're using the standard DNS framework to be able to actually go out and discover what are the available NVMe over Fabrics discovery controllers on the network.
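As a rough sketch of the host side of that flow, the following C program issues a Get Log Page admin command for the Discovery Log Page (log identifier 70h) through the Linux NVMe passthrough ioctl. It assumes a discovery controller is already connected and visible as /dev/nvme0 (for example via nvme-cli), it only reads the first 4 KiB of the log, and it skips the parsing; treat it as an illustration rather than production code.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/nvme_ioctl.h>

#define NVME_ADMIN_GET_LOG_PAGE 0x02  /* admin opcode */
#define NVME_LOG_DISCOVERY      0x70  /* Discovery Log Page identifier */

int main(void)
{
    /* Assumes /dev/nvme0 is the character device of an already-connected
     * (discovery) controller; on a fabrics setup you would typically create
     * the connection first with nvme-cli. */
    int fd = open("/dev/nvme0", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    uint32_t len  = 4096;             /* read the first 4 KiB of the log */
    void    *buf  = calloc(1, len);
    uint32_t numd = (len / 4) - 1;    /* number of dwords, zero-based    */

    struct nvme_passthru_cmd cmd = {
        .opcode   = NVME_ADMIN_GET_LOG_PAGE,
        .nsid     = 0xffffffff,       /* this log page is not namespace-specific */
        .addr     = (uint64_t)(uintptr_t)buf,
        .data_len = len,
        .cdw10    = NVME_LOG_DISCOVERY | ((numd & 0xffff) << 16),
        .cdw11    = numd >> 16,
    };

    if (ioctl(fd, NVME_IOCTL_ADMIN_CMD, &cmd) < 0) {
        perror("NVME_IOCTL_ADMIN_CMD");
        return 1;
    }

    /* The buffer now holds the discovery log header followed by one entry per
     * NVMe over Fabrics controller known to this discovery controller; real
     * code would check the generation counter and re-read if it changed. */
    printf("discovery log fetched, first bytes: %02x %02x\n",
           ((uint8_t *)buf)[0], ((uint8_t *)buf)[1]);
    free(buf);
    close(fd);
    return 0;
}
```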
Starting point is 00:17:31 So as we look at this and we look at, okay, so let's define what do we mean by these discovery controllers. We have a centralized discovery controller now, and we have a direct discovery controller. Let's talk about what each of those things are. So, you know, in the past, you had to always know exactly which discovery controller you wanted to talk to, and that discovery controller had to be aware of all of the controllers in the network that you may, as a host, need to be able to access. Well, now what we've done is we've created a mechanism to centralize how a host goes and gets that collection of information. That's what we're calling a centralized discovery controller.
Starting point is 00:18:07 That discovery controller reports discovery information that's been registered by what we call direct discovery controllers. These direct discovery controllers, as you see over here on the right, they contain information about a set of other NVMe over Fabric controllers. A discovery controller, the DDC, is capable of registering discovery information with a CDC. So it only has to contain a part of the information. And then it has a couple of different mechanisms
Starting point is 00:18:38 for being able to register that with the centralized discovery controller. And one of the really big benefits of this is now you've created a mechanism where the host doesn't have to be aware of every individual discovery controller a priori if you've got a very large set of these discovery controllers. And we have a mechanism for being able to create a central one that they can find automatically, go out and talk to, and then be able to establish and get the information from all of these direct discovery controllers.
Starting point is 00:19:10 So the mechanisms that we have to be able to do the registration of the DDC, the direct discovery controller, with the centralized discovery controller, there's a couple. One is a push registration where the discovery, basically the direct discovery controller runs a command on the CDC to do the registration. So it's pushing its information to the CDC. The other is a notification that goes from the direct discovery controller to the CDC. And then the CDC, that centralized discovery controller, actually pulls the information from the DDC. The third mechanism is administrative configuration a priori. But again, the thing is essentially that we have a mechanism to be able to do that in a much more automated
Starting point is 00:19:59 fashion than what's been historically necessary to do. The third concept, and this is the TP that's kind of outstanding still, and we're still taking through the standardization process. It's actually almost done at this point. But we'll talk about this one here. So it's called fabric zoning. So the NVMe architecture is adding support for this fabric zoning concept. Using fabric zoning, a centralized discovery controller can filter the log page information so that a host only has namespaces that are allocated to that host.
Starting point is 00:20:33 So if you think about this, what you're doing is you're taking this centralized discovery controller that's got all this information across your entire network. And what you've done is you say, okay, host A, you're able to only see some subset of the information that the overall centralized discovery controller is actually making available to the network as a whole. And this concept we call a zone group. So this zone group is a set of access control rules that are enforced by that
Starting point is 00:21:05 centralized discovery controller, and the zone group contains zones, and each zone contains a set of hosts and namespaces. And again, so the idea is you have the ability to control who has access to which of these zones, which in turn gives access to particular namespaces for certain hosts. That zoning database, which is basically a listing of these different zone groups, is maintained by the CDC. And the DDC, the direct discovery controller, can provide fabric zoning information to the CDC using, again, those push or pull notification methodologies that we talked about on the previous slide.
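To help picture what that zoning database holds, here is a small C sketch of zone groups and zones; the layout and names here are my own illustration of the concepts just described, not the actual data formats defined by the fabric zoning TP.

```c
#include <stdint.h>

#define MAX_MEMBERS 16
#define MAX_ZONES    8

/* Illustrative only: a zone member is either a host, or a subsystem/namespace
 * that hosts in the zone are allowed to reach. */
struct zone_member {
    char     nqn[224];  /* host NQN or NVM subsystem NQN            */
    uint32_t nsid;      /* namespace ID, when the member is one     */
    int      is_host;   /* nonzero if this member is a host         */
};

struct zone {
    char name[32];
    int  num_members;
    struct zone_member members[MAX_MEMBERS];
};

/* A zone group is a set of access control rules enforced by the centralized
 * discovery controller; its zoning database is essentially a list of these. */
struct zone_group {
    char originator_nqn[224];  /* e.g. the DDC that registered this group */
    int  num_zones;
    struct zone zones[MAX_ZONES];
};
```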
Starting point is 00:21:56 The next thing that we want to talk about is scalable resource management. So scalable resource management is really an NVMe over Fabrics kind of management infrastructure that we created so that you can dynamically configure and construct an exported NVM subsystem from an underlying set of resources in the NVM subsystem. So the idea here is that we're creating an appearance of a controller with certain ports and certain namespaces that we call an export. I say a controller; I mean an exported NVM subsystem as opposed to a controller.
Starting point is 00:22:33 We're creating this subsystem so that it looks a certain way and it has access to certain kind of actual physical resources. And what this is allowing us to do is it allows us to create something that is exposed to the network in a certain way with a certain set of ports and namespaces that have made it available via the content that is physically present on that local server. And so, you know, as we look over here on the right, you can see that the underlying NVM subsystem, this is the actual physical content that we have on a local server. We've got four ports and eight namespaces here. When we create, using admin commands, this exported NVM subsystem, we're able to assign a couple of those ports and a few of those namespaces to a particular exported NVM subsystem. And that exported NVM subsystem becomes available over Fabrics to a host. One other note on this is that there's the ability to manage host access
Starting point is 00:23:37 to the exported NVM subsystem using an allowed host list. So you essentially can say only these hosts have access to this new subsystem that I've created.
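As a way to visualize what gets assembled here, the following C sketch models the configuration of one exported NVM subsystem: a subset of the local ports and namespaces plus an allowed host list. The structure is purely illustrative; the real admin commands and data structures are defined by the scalable resource management work, not by this sketch.

```c
#include <stdint.h>

#define MAX_EXPORTED_PORTS 4
#define MAX_EXPORTED_NS    8
#define MAX_ALLOWED_HOSTS  8

/* Illustrative configuration for one exported NVM subsystem, carved out of the
 * underlying (physical) NVM subsystem on the local server. */
struct exported_subsystem_config {
    char     exported_nqn[224];                     /* NQN presented over the fabric    */
    uint16_t port_ids[MAX_EXPORTED_PORTS];          /* e.g. 2 of the 4 physical ports   */
    int      num_ports;
    uint32_t underlying_nsids[MAX_EXPORTED_NS];     /* e.g. 3 of the 8 local namespaces */
    int      num_namespaces;
    char     allowed_hosts[MAX_ALLOWED_HOSTS][224]; /* host NQNs permitted to connect   */
    int      num_allowed_hosts;
};
```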
Starting point is 00:24:18 So the next thing is network boot and this kind of partnership with UEFI. NVM Express over Fabrics hosts require being able to identify the host via an NQN and a host ID. Currently, that has to be set up by an administrator, and so for something like a BIOS to be able to boot, it requires customization of that BIOS to be able to do that. This isn't a scalable solution. And so one of the things that's been done here is we've created a mechanism by which we can put that data into SMBIOS, which is a structure inside of system firmware, so that we then have a standard way of being able to go and get that information and use it as we're trying to boot to some network location. Because it's defined inside of this network boot spec, it allows BIOSes to be written in such a way that they can take advantage of this infrastructure. And so now we have a consistent way of being able to do a fabrics-based boot. Of course, I'm simplifying the specifics of what's
Starting point is 00:24:59 required here, but in general, the idea is how do we create a generic way of being able to advertise to system firmware that we have a bootable device on our network? The other key thing about this capability is that we're creating a new NVM Express boot specification, and this will be kind of released separately from our base specification or any of our command or transport specs. It's a different way that we're releasing content here. But the idea, again, is that we're defining something that can be used by the industry and incorporated.
This allows, again, open source implementation for things like UEFI or any of the system BIOSes that are dependent on UEFI. Another feature that we've been working on is cross-namespace copy. So what cross-namespace copy does is basically it allows us to enhance the copy command that exists in the current NVMe specification. So we always had the ability to copy one or more source logical block ranges in a namespace to a single contiguous destination in the same namespace. So we could collect a bunch of kind of random content and then put it into one block of content in the same namespace. But now what we've enhanced that copy command to do is one or more source logical blocks
Starting point is 00:26:27 in one or more namespaces can then point to a single consecutive destination logical block range in a destination namespace that may be different from the source namespaces. This is a useful thing to be able to kind of move and collect data from various namespaces into a single place. One of the restrictions of the current definition in this TP
Starting point is 00:26:56 is that the copy command doesn't reformat the data, so both the source and the destination formats need to be the same. So the format associated with a namespace must be consistent between the source and destination locations. That is a restriction that we currently have. The end-to-end data protection type and size also have to be the same. So again, we're not reformatting data. The logical block storage tag mask and storage tag size must be the same as well. So these are some restrictions on it, but it does allow us to be able to kind of move in the direction of having a more robust copy command.
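For a feel of what the host side of the existing Copy command looks like, here is a hedged C sketch using the Linux NVMe passthrough ioctl: it builds one format-0h source range descriptor and copies eight blocks to a new destination LBA in the same namespace. The end-to-end protection fields in the descriptor are simply zeroed, and the cross-namespace variant described above (which adds descriptor formats carrying a source NSID) is not shown; verify the details against the NVM Command Set specification before relying on this.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/nvme_ioctl.h>

#define NVME_CMD_COPY 0x19   /* NVM command set Copy opcode */

/* Source Range Entry, descriptor format 0h (32 bytes); the trailing bytes hold
 * end-to-end protection fields, left zeroed in this sketch. */
struct copy_range_f0 {
    uint8_t  rsvd0[8];
    uint64_t slba;       /* starting source LBA               */
    uint16_t nlb;        /* number of logical blocks, 0-based */
    uint8_t  rsvd18[2];
    uint8_t  e2e[12];    /* protection-info fields, zeroed    */
} __attribute__((packed));

int main(void)
{
    int fd = open("/dev/nvme0n1", O_RDWR);    /* assumed namespace block device */
    if (fd < 0) { perror("open"); return 1; }

    struct copy_range_f0 range = { .slba = 0, .nlb = 7 };  /* copy LBAs 0..7    */
    uint64_t sdlba = 1024;                                 /* destination start */

    struct nvme_passthru_cmd cmd = {
        .opcode   = NVME_CMD_COPY,
        .nsid     = 1,                               /* destination namespace    */
        .addr     = (uint64_t)(uintptr_t)&range,
        .data_len = sizeof(range),
        .cdw10    = (uint32_t)(sdlba & 0xffffffff),  /* SDLBA, lower 32 bits     */
        .cdw11    = (uint32_t)(sdlba >> 32),         /* SDLBA, upper 32 bits     */
        .cdw12    = 0,                               /* NR=0 (one range), fmt 0h */
    };

    int err = ioctl(fd, NVME_IOCTL_IO_CMD, &cmd);
    if (err < 0)       perror("NVME_IOCTL_IO_CMD");
    else if (err != 0) fprintf(stderr, "NVMe status: 0x%x\n", err);
    else               printf("copy completed\n");

    close(fd);
    return 0;
}
```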
Computational storage. There's a lot of conversations around computational storage at the event this week. So a couple of the key things that NVMe is doing here, right? We have the high performance and reduced latency that are promises of computational storage. The idea of reducing power due to the reduction in data movement, and high performance and reduced latency due to the elimination of processor and I/O bottlenecks, these are all concepts that we understand and that we're pushing and driving for inside of NVMe. So right now what we're looking at is
Starting point is 00:28:20 how do we create an infrastructure that allows us to move that kind of processing down into the device. I think that you all are aware of kind of what is being looked at there, but in terms of NVMe, we are looking at more of the detail of what's necessary at that device level. So as we look here into computational programs, this is one of two TPs that are really being driven through NVMe at this point. We're first looking at how do we standardize the framework for computational storage at that device level. We have a new command set that we're working on for
Starting point is 00:29:00 operating on computational namespaces. These computational namespaces provide the ability to have a set of fixed function programs. So this is something where you're calling a function that's fixed inside of the device. We're also looking at the ability to do downloadable programs, something that's vendor-specific, or potentially we're looking at longer-term how we standardize that framework as well of the actual program. But as we walk through, you know, kind of what's necessary in terms of a device kind of going through this flow in NVMe, you know, first we need to be able to get the data off of a traditional namespace, you know, some kind of block storage namespace. Two, we need to be able to execute a program against that data.
So that data has to be in some kind of local memory of some sort. Then the program reads the data from that memory namespace, does some amount of work on it, and puts it back into that memory namespace. So with that idea, we already have a traditional namespace that's for storage, and we've created the concept of a memory namespace and also the concept of a computational namespace.
Starting point is 00:30:20 Computational namespaces hold programs. Memory namespaces have the ability to be accessed. Actually, we'll talk about the actual memory namespace aspect. Computational programs, they operate on byte-addressable memory. We didn't have this concept inside of a storage device from NVMe, you know, from a standards standpoint. So what we've had to do is create this thing called a memory namespace that is byte-addressable. We have a specific command set, we call it the memory command set, that we're working on associated with this memory namespace. This memory command set and these memory namespaces will be required for computational programs, but one of the reasons that we've split this out is that it's a generic concept that's not necessarily tied only to the computational programs command set or the computational programs concept, right? Our point is that this is more of an architectural infrastructure that is, or will be, available in NVMe going forward.
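The command set for this is still being standardized, so nothing below is the real NVMe interface; it is only a runnable C mock of the flow just described, with made-up helper functions standing in for the eventual commands: stage data from a block namespace into a byte-addressable memory namespace, run a program from a computational namespace over that memory range, then read the result back.

```c
#include <stdint.h>
#include <stdio.h>

/* All of these helpers are placeholders invented for illustration; the real
 * commands are being defined by the computational programs and subsystem
 * local memory TPs. */
static int copy_lbas_to_memory_ns(uint32_t src_nsid, uint64_t slba, uint16_t nlb,
                                  uint32_t mem_nsid, uint64_t mem_offset)
{
    printf("stage: ns %u LBA %llu (+%u blocks) -> memory ns %u @ %llu\n",
           src_nsid, (unsigned long long)slba, (unsigned)(nlb + 1),
           mem_nsid, (unsigned long long)mem_offset);
    return 0;
}

static int execute_program(uint32_t compute_nsid, uint16_t program_id,
                           uint32_t mem_nsid, uint64_t mem_offset, uint64_t length)
{
    printf("execute: program %u in compute ns %u over memory ns %u [%llu, +%llu)\n",
           (unsigned)program_id, compute_nsid, mem_nsid,
           (unsigned long long)mem_offset, (unsigned long long)length);
    return 0;
}

static int read_memory_ns(uint32_t mem_nsid, uint64_t mem_offset, uint64_t length)
{
    printf("read back: memory ns %u @ %llu, %llu bytes\n",
           mem_nsid, (unsigned long long)mem_offset, (unsigned long long)length);
    return 0;
}

int main(void)
{
    const uint32_t STORAGE_NSID = 1, MEMORY_NSID = 2, COMPUTE_NSID = 3;

    /* 1. Get the data off the traditional (block storage) namespace into
     *    byte-addressable device-local memory. */
    if (copy_lbas_to_memory_ns(STORAGE_NSID, 0, 255, MEMORY_NSID, 0)) return 1;

    /* 2. Execute a fixed-function (or downloaded) program against that data. */
    if (execute_program(COMPUTE_NSID, 0, MEMORY_NSID, 0, 256 * 4096ULL)) return 1;

    /* 3. Pull the result out of the memory namespace. */
    return read_memory_ns(MEMORY_NSID, 0, 4096);
}
```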
Starting point is 00:31:36 We need to have mechanisms to copy to and from any other type of namespace to this memory namespace. As we go back to the kind of expanded copy command that we just talked about, there was that note about, hey, this does not support changing formats. As we go into what's necessary for computational programs, clearly that won't work. So that's something that we are focused on as we go forward. Flexible data placement. This is something that's been getting a lot of conversation and progress inside of the NVM Standards Committee recently. There's been a lot of conversation on this. I'm not going to go into a lot of detail on it.
Starting point is 00:32:17 It's getting closer to being solidified, as Mike's over there smiling in the background. But we're not quite there yet. One of the key things, though, is that we've taken multiple stabs at what's the right way to kind of abstract host access to particular places for data on storage. There have been multiple historical ways that we've approached this; we've had multiple different attempts at this, let's suffice it to say. So now, because of the fact that we've done this multiple times and we've seen some of the ways that that has worked and not,
I think that we now have a methodology that has gotten quite a bit of forward progress. And we've been able to kind of, some of our partners have proved out that this functionality is working very well. And so, you know, we're really looking at how do we create that abstraction between host access and placement from what a controller needs to do to put that data onto the actual device. This is something that you'll hear more and more about as we move forward. Key Per I/O. So Key Per I/O is, you know, we've had self-encrypting drives for quite a long time on LBA ranges within namespaces.
This is something that we've had for years. However, what we're finding is that we also need the ability to actually go and be able to do that on a transactional level. As we're doing a write operation or a read operation, we need to be able to independently encrypt that particular transaction with a particular key. This enables a lot of different use cases that haven't historically been possible. Key Per I/O provides that dynamic fine-grained encryption control by essentially calling out what encryption key we need to use on a transaction-by-transaction basis: assigning an encryption key for sensitive file and host objects, better support for general data protection regulations, easier support of erasure. When data is spread and mixed, essentially, because of the fact that you would get rid of an encryption
Starting point is 00:34:33 key, now that data is no longer accessible. It allows much easier mechanisms for being able to get rid of spread and mixed data. Mechanisms to download and manage keys are outside the scope of the specification. That's something that we're not managing ourselves in NVMe. Keys are stored in volatile memory so that when a device is powered down, we don't have access to those keys any longer,
Starting point is 00:35:02 and they have to be restored when power comes back. Another key thing to note here is that there's a liaison agreement between TCG and NVM Express around this concept, and we've been working in tandem to get the NVMe portions as well as the TCG portions of what's required to get the specification in place. And we'll continue that partnership until everything has been finalized. Perfect. Thank you, Fred. So in summary, the NVMe technology has succeeded in unifying client, cloud, enterprise storage around a common architecture.
Starting point is 00:35:44 And that adoption continues to grow and is projected to grow even stronger over the next few years. Following the refactoring that we created with that NVMe 2.0 family of specs, the NVMe architecture is focusing on communicating new features. We're communicating what we're working on,
Starting point is 00:36:03 where we're planning to intercept that, but we're focusing on features as opposed to specification releases. We will do specification releases. Generally, our spec releases are going to happen at a time frame where we're adding significant or major new features, and that's where we'll try and kind of align that. But the thought process that you should be thinking about is, as the NVMe standards body finalizes and ratifies technical proposals, they're available independently. You can go get them off of the website, and the definitions are there that you can go and implement or work proof of concepts on right from there.
Starting point is 00:36:43 The NVMe technical community continues to maintain and enhance existing specs while driving new innovations. You know, one of the pieces, you know, clearly you see that there's broad engagement across the industry in NVMe. We have a lot of work that's continuing to go on each week. You talk about the main standards body, the technical work group, but we actually have a handful or more of independent task groups that are kicked off to be able to work on some of these things
like the NVMe boot specification or the fabrics and multi-domain subsystems group, we call it FMDS, things like that. There's a number of different task groups that are associated with some of these kind of more focused areas that then feed back into that technical work group. And we do that so that people can engage in the areas that actually are applicable to them. But we have that one key technical work group that kind of manages all of the different inputs from the various areas. So if you are interested in contributing to NVMe and aren't
Starting point is 00:37:53 currently doing that, we'd encourage you guys to check that out. So that's what I had to present today, but I'm happy to take questions if people have things that they'd like to know. And if I can't answer those questions, I promise between the number of board members and contributors in this room that we can get you an answer. So does anyone have any questions that they'd like to ask? Yeah. Yeah, so when it comes to backward compatibility, our goal is definitely to be backwards compatible. There are a few specific instances where we do things that aren't backward compatible, usually to fix bugs or things that weren't clear in the specification the first time people interpreted them a couple of different ways. We keep a list of what is not backwards compatible on our website. So if you go, there's a change log that we publish with every one of our actual specification releases. And you can look at exactly what is not backwards compatible. From a compliance standpoint, we work with UNH, IOL.
They are our interoperability lab, right? We work with them to make sure that we understand where those compatibility breaks are going to happen. It shouldn't impact compliance until we've actually released a new version of the specification. But for the vast majority of what we do, our intent is to be backwards compatible. I just want to add, when you're looking at going from NVMe 1.4 to 2.0, the website has a list for backwards compatibility.
But anything that was ratified after the last release, ECNs, etc., clearly the header section has the notification of backwards compatibility. So just to repeat for the people that are joining online or watching this later, one of the points that was made in the room is that we're now tracking that backwards incompatibility on a TP-by-TP basis, and as we're releasing technical proposals, that content is tracked inside of the technical proposal that's published directly, and you won't have to wait for a new overall specification to be released to understand what is incompatible with any particular TP release.
Starting point is 00:40:26 Thank you for that addition, Mike. I think it's valuable. Any other questions? Sure. Oh, gotcha. So to repeat the question, and please correct me if I got it wrong, when we were talking about automated discovery in NVMe over Fabrics, is that specifically applicable to NVMe over TCP? And actually the focus of the automated discovery is around NVMe over TCP.
Starting point is 00:41:12 I'm not sure if I understand the question. Sorry. NVMe over Fabrics is a term that we use broadly to encompass all. Yeah, yeah. So again, to clarify, NVMe over fabrics is a term that we're using as kind of that foundational architectural support for something that's over a fabric. When we talk about something that's specific to NVMe over TCP or NVMe over RDMA, we would use that specific terminology there.
Starting point is 00:41:47 Yep. Any other questions? Yeah. I couldn't understand the second. Is it just me? Do you want to answer that, Fred? Yeah. More like,
Starting point is 00:42:17 if you're an NGNS for a call, you can get registered with the NGNS and you can register with the NGNS server. But that's not how the NGNS is directed. So, if you're an actual preacher today, It's using DNS, and there's a particular type that comes back as a discovery service that's correct right Fred? the host software
has to know about that type. Yeah, exactly. This comes... The information is on the network, and then we have the authentication protocol so that you can verify the discovery servers when they are connected to. They will only release their information to those hosts that they know they are allowed to talk to. So all of those security methods exist already or are in different TPs. So it is not just a wild west kind of environment.
Starting point is 00:43:47 There is some methodology for fixing those things off so that people are supposed to see it and be able to use the storage actually. So specifically... The CDC will prevent you... Sorry. It's okay. Go for it. So the CDC won't prevent you from the access that the CDC will provide to the discovery.
Starting point is 00:44:09 The CDCs today are able to filter who they provide information to. That's an existing capability that's been in the beginning of fabric. The CDCs have, in addition to that, the ability to have more boundaries around things so that they only release information to the host that they know can offer us the key. And that's part of zoning, correct? That's part of the new zoning. Exactly. So the key thing is about, again, I'm repeating some of this just so that it gets recorded, right?
Starting point is 00:44:46 So the key thing is that the centralized discovery controller is reporting what is there, what is available, but it doesn't necessarily allow you to connect in any kind of way without... It won't necessarily report to you something is available if it knows you're not allowed to talk to it. If you are... So that's for, again, something that's zoned, though, correct? Exactly. So let's... We need to...
Right. So we have to be careful to separate which TPs are which, right? And so we're talking about like an overall... We're talking about kind of how the overall system works a little bit here. Exactly. They all work together to provide the capabilities
Starting point is 00:45:27 that you're talking about, but individual ones kind of provide infrastructure pieces towards that end. Yes, in the back. Yes. Do you still require... ANA is required for dispersed namespaces. So again, ANA is required for dispersed namespaces for the recording.
There we go. Yes? I just want to make a point that the... Perfect. Okay. Any other questions? Well, thank you very much, and I'm happy to take questions afterwards. And, you know, definitely check out the rest of the NVMe track this afternoon. Be sure and join our developers mailing list by sending an email to developers-subscribe at snia.org. Here you can ask questions and discuss this topic further with your peers in the storage developer community. For additional information about the Storage Developer Conference, visit www.storagedeveloper.org.
