Storage Developer Conference - #27: Standards for Improving SSD Performance and Endurance

Episode Date: November 17, 2016

...

Transcript
Starting point is 00:00:00 Hello, everybody. Mark Carlson here, SNIA Technical Council Chair. Welcome to the SDC Podcast. Every week, the SDC Podcast presents important technical topics to the developer community. Each episode is hand-selected by the SNIA Technical Council from the presentations at our annual Storage Developer Conference. The link to the slides is available in the show notes at snia.org slash podcast. You are listening to SDC Podcast Episode 27. Today we hear from Bill Martin, Principal Engineer Storage Standards with Samsung, as he presents Standards for Improving SSD Performance and Endurance from the 2016 Storage Developer
Starting point is 00:00:47 Conference. My name is Bill Martin. From the slide, you'll see that I represent Samsung Semiconductor. However, this presentation is more from a standards point of view. I am Vice Chair of the SNIA Technical Council. I am vice chair of the INCITS T10 SCSI working group. I am secretary of the INCITS T13 serial ATA working group. So some of the things that I will be presenting today are initiatives that Samsung and I have personally driven.
Starting point is 00:01:28 Other things that I'm presenting today are other initiatives that are being driven within the industry at the moment, and I will try to give an unbiased view of those. I have actually not even pointed out which ones are Samsung initiatives and which ones are not. So I am trying to do this presentation as a very open presentation of what's going on in the industry, some of the things that the industry is trying to do to meet the performance requirements of the hyperscaler
Starting point is 00:01:59 and to improve SSD performance. So with that, OK. And somehow I have just disconnected from the... Thank you. Okay. What is being standardized today? So there are standardization efforts going on to provide an interface for better collaboration
Starting point is 00:02:42 between the SSD and storage systems, whether that is a storage array or whether that is a host. So when I talk about storage system, it's anything that talks to the SSD storage device. There are stream operations, and that particular feature allows you to store data with a similar lifetime in associated physical locations. Kind of a wordy explanation there. I come from a standards background, having been in standards for over 20 years. We try to obfuscate things. Really what this is doing is if you think about NAND flash,
Starting point is 00:03:25 this is an attempt to allow you to store things that you expect to erase at the same time in a single erase block or multiple erase blocks. Background operation control, again, obfuscated to be advanced garbage collection, but it's basically doing any of the background operations that need to be done on the SSD device and controlling when those are done. It looks like I was told that this projector had problems earlier in the day and it looks like I am experiencing such problems. So redeterminism is a late-breaking idea coming into the standards work
Starting point is 00:04:15 and is an attempt to help to get away from the tail latency, and I'll talk more about that later, but basically trying to store things that you expect to access at a certain time in an area where if you expect parallelism, a read won't block a read or a read won't block a write or won't be blocked by a write. There's a couple of other features that are out there to help limit long tail effects. One is a rebuild assist for SSDs. That type of feature has been out there for HDDs for a while now. There were some lacking things in HDDs.
Starting point is 00:04:55 This attempts to alleviate some of those and basically helps you to figure out how to move data around to make your storage device more effective. Depopulation is another late-breaking development in the standards community. Again, it's an attempt to clean up your storage device in such a way that you avoid long-tail latency, the long-tail effect. And then finally, I'll touch on object storage and what's going on in the long tail effect. And then finally I'll touch on object storage and what's going on in the object storage group of the SNEA and where that's currently at and where it's going.
Starting point is 00:05:37 So right streams. The development for right streams has been going on in NVMe, SCSI, and SATA. So it's an attempt to take that across all three of those technologies. It allows the host to associate each write operation with a stream. So when I do a write, there's a tag in that write that indicates which stream that write is associated with. The device then places all the data associated with the stream in one set of physically associated locations, i.e. an erase block, a die, whatever association will help that SSD to provide better performance and better endurance.
Starting point is 00:06:17 The big key here is that all the data associated with the stream is expected to be invalidated at the same time. So what you're doing by that is you are avoiding garbage collection because if you stored the data all into an erase block and you invalidate it all at one time through either a trim or an unmap command, then that data will disappear at the same time. You don't have to garbage collect anything. All you have to do is erase that block. So it improves garbage collection overhead,
Starting point is 00:06:56 and it also improves your write amplification factor because you don't have to rewrite data. You've written it once. When you erase it, you erase that entire block, and you don't have to come back and move it around. It improves system performance because you're not doing garbage collection, and it improves device endurance because you don't have to rewrite the data. How do streams work?
Starting point is 00:07:25 This is a little animated slide that I put together to kind of give you an idea. Assume you have three different places that data is coming from and each of those things are going to erase that data or delete the data at the same time. So you have virtual machine A, you have the database,
Starting point is 00:07:42 you have virtual machine C. In a non-stream, data is written in the order that the rights are received. So you have all of these things coming in from these three different sources of data, and they're going into individual locations in memory unrelated to where it's coming from. On the other hand, if you have streams, your data is grouped. So while the data comes in in the same order, it's placed in the appropriate location based on that ordering, thereby allowing you when a particular source of the data no longer wants that data to erase an entire chunk of data at one time.
Starting point is 00:08:32 Stream improvements, these are some tests. This actually is a Samsung test that was done showing the improvement of it. So here you'll see with the legacy operation we have about 75 gigabytes throughput. With the stream operation we're getting about 750 gigabytes of throughput. It's greater than a 9x performance improvement and that's an actual test that we were able to run using 100%, 128k writes with four different streams of data. In addition, we saw a greater than 2x SSD endurance improvement. So these are real numbers done by real testing showing that this particular feature really does improve
Starting point is 00:09:24 both performance and endurance. So that's the right amplification? Yes, that is right amplification. And it gets down to? This graph is inaccurate. It can't be below one. The graph is inaccurate. Unfortunately, I didn't have the source for the graph. I tried to change it because so this, it is inaccurate. It actually goes from about 2.5 down to just about 1, 1.1. That's why I took this with four streams. Well, I mean, it's three-dimensional, so it looks like it's right there also.
Starting point is 00:09:59 Yeah. It's difficult in a three-dimensional graph. I think this is actually supposed to be right at 1.0, and this is somewhere in excess of 2.5. So I was not the source of these numbers. So the NVMe standardization effort for this, so it came in and NVMe wanted to take what we were bringing in and make a more generalized idea of what we were doing
Starting point is 00:10:37 to allow other features to be able to come in and not keep eating up command codes. So this got turned into a technology or a feature called directives. It's extensible to provide host directives to device. You have a directive type of which streams is type 1. Type 0 is the identify directive type. And you have a directive identifier that may optionally be associated with that stream. So the first two directives that are being defined are the identify directive and the streams directive.
Starting point is 00:11:17 Okay, so there are two new commands associated with directives in NVMe. One of those is the directive send command and it is used to configure specified directive. The other one is a directive receive command and that's used to return parameters for a specified directive. And then
Starting point is 00:11:39 existing IO commands, the write command has been enhanced to allow a field in that write command that had been reserved to indicate the directive type, and another field in that write command to be the directive. It's now called the directive-specific field. Originally it was called the directive identifier, but again in standards we have to make things less obvious for you
Starting point is 00:12:08 to make it extensible so when somebody really doesn't want it to be an identifier but wants something else, they can still use it, and so now it's called the directive-specific field. In the future, other I.O. commands may be added to commands that would have these particular fields. Yes? What is expected of typical number of streams that you see? At the moment it seems that most SSD providers are looking at something on the order of 16 possibly up to 100
Starting point is 00:12:47 but initially we're looking at the fact that this can be useful in the on the order of magnitude of 16 streams for a complete NVMe subsystem. The identifier field is 16 bits long. That's how big it can get. Is this for the computer or for the whole drive? This is for the entire subsystem. The projector has a problem. It will come back in a minute. I've gotten to the point I ignore it.
Starting point is 00:13:22 So the directive-specific field in the streams directive is the streams identifier? Correct. Okay. So the identify directive, that's, yes. If you're grouping up writes for each stream, are you caching them at length, or where is that taking place? That's taking place.
Starting point is 00:13:45 The device basically is saying, okay, for stream N, I know where I want to put these in this particular physical location. So I don't have to do any caching. I say, the stream comes in, I know exactly where it's going. Now, you may come back
Starting point is 00:14:02 to that question in a couple more slides when I talk about how identifiers are assigned and allocated. But first off, identify directive. The directive received for that returns two bitmaps in a single data structure. One of them is a bitmap of the supported directives, and the directive type is a 5-bit value. And so it's a bitmap large enough to support that 5 bits worth of possible directive types. And the enabled comes back telling you which directives have been enabled for this particular NVMe controller.
Starting point is 00:14:49 Directive send is used to enable or disable a specific directive. That is the only function of the directive send within the identify directive. No, that's a single. That is a directive type and an enable or disable. So it's a 5-bit value it's a 5-bit encoded value of the directive type and then a single bit that is either one to enable or one to disable. So stream directive overview
Starting point is 00:15:22 the identifier of the stream within NVMe is scoped at the namespace associated with a specific host. This was probably the most difficult concept to get down into words because we're trying to deal with multi-pathing here. So you have multiple hosts that are connected to a single controller through separate ports, so they are a specific host. If those two connections to the controller have the same host identifier and you define a directive within a specific namespace, that is consistent across both of those connections. That allows failover for multi-pathing and that applies to n number of hosts and each namespace for each host.
Starting point is 00:16:16 Resource allocation. So one of the things that came up is an NVM subsystem has some amount of resources that they can use for streams. And that is however, whatever the maximum number of streams that subsystem is capable of supporting. Okay, those resources can be allocated two different ways. They can be allocated just based on the entire NVM subsystem, in which case every time I want to go down and start using a stream, it comes out of a pool that's there for the entire subsystem. Or I can have a particular namespace associated with a specific host request to lock down a certain amount of those resources. This was actually our primary use case where the storage subsystem that wants to use this
Starting point is 00:17:12 wants to tightly maintain those identifiers so that they know exactly, I know that I have four that I can use and somebody else isn't going to come in and need one, and all of a sudden one of my four is gone. So this allows you to lock it down on a per namespace per host basis, but if you choose not to,
Starting point is 00:17:37 you can do it across the entire subsystem. So the stream directive receive command, there's actually three sub operations of the stream directive receive command. One of them reports the properties. There are two different sets of properties all within the return data structure there. One of those sets is for the NVM subsystem. That tells you what's the maximum streams the subsystem supports. So this is what your total pool is for the subsystem. The next is how many subsystem streams are available. These are resources that have not been allocated to a specific namespace associated with the host identifier. The next one that you can see up there is NBM subsystem streams open.
Starting point is 00:18:37 This is how many streams have been open out of this particular pool. Then you have another set of information that comes back on a per namespace basis. You have a stream write size and a stream granularity size. And what these are is these are the optimal write size that you should be using when you write to a stream within the namespace. These are namespace specific because in NVMe,
Starting point is 00:19:07 you can format each namespace to a different formatting in terms of how many LBAs and so forth, LBA size, all of that. So these are NVMe namespace-specific. So the stream write size is what's the right size that I should be using. The stream granularity size is how big is my, for a NAND flash, how big is my erase block. And stream granularity is actually as a multiple of stream right size units. So if your stream right size was a page size, then your granularity is actually as a multiple of stream rate size units.
Starting point is 00:19:49 So if your stream rate size was a page size, then your granularity is N pages. That's the type of thing that's being passed back there. Helps the storage application to know how best to use this SSD. Allocated stream resources. These are what I talked about before where for this namespace you can request some number of stream resources from the nvm subsystem if this number is zero it doesn't mean you can't use streams it means that you'll be pulling them from the subsystem stream resources streams open is how many streams are open in this particular namespace. The other, the next command is the status of open streams. That tells you how many, that tells
Starting point is 00:20:37 you the specific stream identifiers of every stream that is currently open in this particular namespace. If you send this command with a namespace identifier of all Fs, which is asking for all namespaces, then this will be the status of open streams out of this pool of the NVM subsystem streams. And then the last one, request stream resources for namespace. This is the command that actually allows you to say, this is how many I want reserved and locked down for this particular namespace. Okay.
Starting point is 00:21:19 Then the stream directive send. Again, there's two operations here. One of them is to release a stream identifier. And a little bit later I'll get into implicit and explicit release of this. But what this does is it says if I've been using stream identifier 55 and I know that I'm done writing to that particular stream identifier and I don't ever intend to do it again, I can send down a command to tell the SSD, I'm never going to use 55 again, or if I do use 55 again, it means something different. So please forget what you were previously doing
Starting point is 00:21:58 with 55 and if I write to that again, it's something new and different. The other one is to release stream resources for a namespace. If I decide for this namespace I don't need stream resources anymore, I can release all of them and return them to the pool. One of the facts is that we did not build it so that you could incrementally increase or decrease the number of stream resources that were allocated to a given namespace. You can request an amount, when you're done you can get rid of all of them. If you need to change how many you need to release all of them and go tell it here's a different number that I want. Okay so in the right command for streams, you specify a stream type of one, and you specify an identifier which is in the directive-specific field.
Starting point is 00:23:01 That identifier is associated with a particular namespace associated with a host and that is true whether or not you're taking it from that global subsystem pool or whether you're taking it from your namespace pool. Now if you have a namespace pool you will only be allowed to operate in that. You'll never be able to use all of those and then go grab one out of the subsystem pool. If the identifier specified is zero, then it operates as a normal write operation. In other words, it goes wherever the SSD chooses to put it as shown in that first example of a stream without...
Starting point is 00:23:44 the first example of writes without stream. So if you have things that you say, well this is kind of scratch pad stuff that I really, it's not associated with anything, you can send it to stream zero and it gets stored the same way as it always did. Now, when I send a write with an identifier, if that identifier associated with that namespace has not currently been used and allocated resources, that implicitly opens a new stream. That is the only way to open a stream.
Starting point is 00:24:20 It's an implicit open process where the first time the SSD sees that identifier, that stream is open. If it's necessary, a stream identifier may be implicitly released. So if you're operating in a mode where you don't go and do an explicit release of identifiers, and say you're working in a namespace that has four identifiers allocated, and you've used four identifiers, and they're all still open, and you come down and you give a fifth identifier, the SSD, through some algorithm that is vendor-specific,
Starting point is 00:25:01 will decide which of those four to close in order to open the new one. For hosts or subsystems that want to tightly control identifier use, they use the release identifier. That way they'd know that you're not going to implicitly close something they didn't want closed. So this is for hosts that want some very tight control. For hosts that really are kind of giving a hint of, you know,
Starting point is 00:25:31 okay, I'll tell you what I'm doing, but I really don't want to manage the pool at all, then they can rely on that implicit close where the device will choose something out of some algorithm as to what it's going to close. There are a number of possibilities and they are vendor specific. Yes? So is this at one end transparent to the OS or it could drive up the stack to a database and say I want this identifier and everything handled this way. Where is the brokering the identifiers?
Starting point is 00:26:02 It really depends on what you're looking at. If you're looking at a storage subsystem, these identifiers are probably brokered inside that subsystem. If you go to the very highest level, this could actually be that the identifiers are a Linux file handle. So there's a very great breadth of where those are brokered. And if it's Linux file handles, if you don't have an SSD that can support a very large number of them,
Starting point is 00:26:39 then you really have to do some sort of hashing somewhere in the OS to move that down to a smaller number as to what the SSD is capable of handling. The next feature that we've been working on is called advanced background operation. By the way, towards the end of the presentation, I'll come back and I'll tell you where each of these are in the process. So the next one is advanced background operation. It's being developed, again, in NVMe, SCSI, and serial ATA.
Starting point is 00:27:14 Why do we want this? Okay, IO performance is degraded when advanced background operation, e.g., garbage collection, occurs at the same time as I.O. So you really don't want to do really heavy-duty garbage collection right in the middle of a really heavy-duty I.O. time frame. So this allows you to avoid that overlap of I.O. and garbage collection. What does it provide? The intent is to provide a notification when advanced background operations are imminent. When is the storage device getting to the point that if I don't do something,
Starting point is 00:28:01 it's going to have to do garbage collection in order to do your next write. It provides a mechanism for a host to specify, do your advanced background operations now, and here's how much time you're allowed to spend doing it. And during this period of time, I will try to throttle IOs to you. This is very useful when you get into an array environment where you can say, okay, you know what, this SSD is going to get to do ABO right now, and I'm going to do my writes somewhere else within my array structure. It also provides a mechanism for a host to specify an amount of resources to free up.
Starting point is 00:28:42 So a host may choose to come down and say, you know, I know that I've got a really big chunk of data that I'm getting ready to write to you, and I want to make sure you have this much resources freedom. So it provides predictable and consistent performance. Now, I want to talk just for a minute about these resources that are being dealt with. The resources are not in terms of megabytes, gigabytes, terabytes, or anything like that. It is a percentage of the device's resources, and it may be storage resources, but it may be lookup tables and other things that it has to manage that are the limiting factor.
Starting point is 00:29:27 So it is a pure strict percentage number that may or may not relate to actual storage capacity on the device. And how that relates is, again, vendor specific. But it gives you a feel for how close you're coming to it so again we did some some testing internally using this feature without using advanced background operation where we basically did an FIO process three seconds of FIO followed by three seconds of idle, but not telling the device when that idle time was. You can see, you know, at times we get about 20,000 IOPS, but the general case is about 10,000 IOPS.
Starting point is 00:30:17 When instead, during that idle time, we go back and we say, oh, this is idle time, do advanced background operation. You see, we come all the way up here to about 75,000 IOPS. A very significant increase in the IOPS available if you can make certain, if you know when that idle time is, tell the device, go do your garbage collection. So, again, NVMe standardization of this, this is another directive type. The device returns characteristics for the ABO. It returns a minimum fraction of the resources available. This is the point
Starting point is 00:31:00 at which the device will start doing its own background operation. And that's 100? No, it's a percentage of... Is it zero or 100%? This is the minimum fraction. So if you know that when you're at 75% you'll start ABO, then that would be 75%. Although, actually, we moved away from percentage, and this is a fraction with 64,000 as the numerator.
Starting point is 00:31:35 And this is the denominator. But basically, it's, okay, at what point in time, what percentage of my device's resources are the point when I'll trip this? Okay, a current fraction resource is available. So you can compare that to here and see how close am I getting to doing advanced background operation. A maximum fraction of resource is possible. What this is for, and this got put in when we got asked to put in that request to say,
Starting point is 00:32:03 please free up this much percentage of your resources. Well, this tells you, well, what's the max possible that you could even ask for? Because if you ask for something more than this, I can't give it to you anyway. You know, if you've got my storage device 70% full of data, then you can't ask me for 80% or even 50% of my resources. So the next thing on this slide is the current status of the advanced background operation, which will tell you there's none in process, there's host initiated, or there's controller initiated. So it tells you, okay, this is initiated through requesting that you do it at the time I ask. This is, you didn't tell me to, I hit a point, I have to.
Starting point is 00:32:57 So the host can also request notification of advanced background operations imminent. And that will be a little bit more evident as we move forward. request notification of advanced background operations imminent. And that will be a little bit more evident as we move forward. So in the directive send command, you can set a notification threshold. So again, this is as a percentage or fraction of the resources. That threshold is where if you request a notification, you'll get the notification that it's imminent. You can specify the maximum time to perform advanced background operation and a target fraction of the resources.
Starting point is 00:33:41 So that was the extent of the background operations. Read determinism. This is only happening at the moment in NVMe. This has come out of a... yes? In the idle period, can you read from the drive or does it just do in-write? You may be able to read or write, and there are actually two modes of operation. One mode of operation is that when you do a read or write and there are actually two modes of operation. One mode of operation is that when you do a read or write, the advanced background operations will stop until you're done with that and then resume. The other mode of operation is the advanced background
Starting point is 00:34:16 operations will continue and you will get degraded performance. So you can do reads or writes during that period of time and you have a choice do I just want to do them with degraded performance or if I initiated them do I want to just tell them to stop and wait for my IOs to finish and when they have go back and continue doing it so you can get at that Slender Drive reports that ABO is in progress. Does it also report how much work is left in terms of time or percent? So you can get at that through the current fraction resources available. So you can go back during the time that it's in progress and find out how much is available.
Starting point is 00:35:00 And this is returned in the same response that also tells you the current status of ABO. So redeterminism being developed in NVMe, this is as a result of a request from several of the hyperscalers. And the intent here is to allow data that's expected to be read in parallel to be placed in physical locations that avoid blocking.
Starting point is 00:35:30 And actually it's not only read but written in parallel. So what you want to be able to do is say, okay, I know that this block of data may be read at the same time as this other block of data may be written or read. It's a very complex thing to figure that out, but that's the direction that this is going. So this is intended to avoid read blocking and read-write blocking. So attempting to avoid anything where your read is blocked by another read or a write.
Starting point is 00:36:08 Alternate mechanisms are currently being discussed. There are at least three different mechanisms. The two highest ones, though, there's one that is provide a request mechanism to storage for different read groups. To say, okay, this is in group A, this is in group B, this is in group C. I know that if I'm trying to read something in group A, that if I'm doing something in group B or C, that it's not going to conflict and I'm not going to get blocked. So that's this first one. The second one is a request to provide layout information to the application
Starting point is 00:36:46 and allow the application to manage it. And the current proposal there is around the concept of tell me a range of LBAs that are in one group and a range of LBAs that are in another group and so forth so that I can start partitioning my rights according to that mapping. I'm not in favor of that. Rebuild
Starting point is 00:37:16 assist for SSDs. The development on this has been in the SCSI standards group. It allows the storage system to easily determine LBAs that should be recovered from another source. It is an extension of an existing SCSI command called getLBASTATUS. What it does is it returns information on LBA extents that are anticipated to return unrecovered errors. So if you've read a particular LBA and you had to retry three times before it actually came back,
Starting point is 00:37:51 you'd want to flag that, you know what, this LBA is probably going to continue returning unrecovered errors. That contributes to the long tail effect because that LBA is the one that I'm going to read, read, read, read before I finally going to read, read, read, read before I finally get the data, causing a very long latency on that particular read, even though I got the data. By doing this, a storage subsystem is able to read the LBA extents marked as returning
Starting point is 00:38:19 unrecovered errors from a different location. In other words, hopefully you have some sort of redundancy of your data. You can go pick it up from someplace else to avoid the long tail effect in the first place, but then you can rewrite it onto that SSD so that it no longer, and this actually applies to HDDs as well, so that it no longer is going to have that extended read latency. This was actually completed for SCSI just last week. The other development that's going on in SCSI and Serial ATA is depopulation. Depopulation is a mechanism for the storage subsystem to indicate to the storage device,
Starting point is 00:39:17 I would like you to take offline this particular physical element of your storage device, for an HDD that might be a head, it might be a platter. For an SSD it might be a die. So there is a mechanism within here to gain information from the device about what physical elements may be going bad that may provide you problems. Now, there are two different modes of operation of this particular feature. One of them is that when you depopulate a particular element, only things associated with that element are affected, and the remainder of your data is still good.
Starting point is 00:40:08 The other is a repurposing depopulation. So you know you have a device. You go out, you scan the device, you find out, okay, which elements are kind of flaky. I want to get rid of those and then I want to reformat the device. Okay, this is really saying, you know, I want to take it from here, I want to reformat it,
Starting point is 00:40:34 get rid of garbage, and then use it someplace else. Whether it's someplace else in my data center or in somebody else's data center, I'm going to physically move that device. No, this is just a, I said, is intended. That's an intended use of it. The repurposing depopulation, repurposing the device. It doesn't have to. It could. Yes. Okay.
Starting point is 00:41:11 And I am down to about nine minutes, so I have to pick up my pace a little bit. I guess I've only got six slides left. So data preserving depopulation. You report physical element status. You have the ability to remove a physical element you have the ability to remove an LBA range from you so you may determine that a particular LBA range is associated with a physical element
Starting point is 00:41:37 and say, okay, don't accept that as a place to store things and then up above there you have to remember I'm not going to try to write to those LBAs anymore because they're no longer valid LBAs to use. And it has the ability to truncate the logical capacity and the process here at least my best way of representing it and I may be not totally accurate is that you can go out, figure out which physical elements associated with which LBAs have difficulties or bad,
Starting point is 00:42:15 move your data from your highest LBA range into good locations, then truncate the top end of your LBA range so there's no data that you're losing, and then depopulate the elements that are questionable or bad. Repurposing depopulation, same characteristics of the data preserving, but it does not retain data in elements that are not depopulated.
Starting point is 00:42:47 The double not here is a little bit difficult to understand perhaps, but if I depopulate element A, I may lose the data in element B when I do this type of depopulation, and following depopulation, the storage device may be initialized. It's not required that that is an option within the command that's being proposed for that. So with that list of things, NVMe standardization status, the directives identify and streams, technical proposal is complete. 30-day member review started on the 15th of September, last Thursday.
Starting point is 00:43:30 So you have three weeks left to comment on it if you happen to be a part of the NVME membership. Advanced Background Operation Directive was originally part of that other proposal, was pulled out. The technical proposal as it was originally presented is almost complete. However, some other alternatives are being considered.
Starting point is 00:43:54 Redeterminism is just starting the process. I foresee quite a bit of debate on this over the next several months. My prediction is six months or more to settle on the mechanism. SCSI standardization status, ABO and STREAMS was approved May of 2015, a year and a half ago.
Starting point is 00:44:18 STREAMS does not support the implicit identifier assignment and release. Within the current SCSI model you have to explicitly open a stream then use it and explicitly close it and there is no implicit operation. ABO does not support the target fraction other than that it supports everything that I described in the NVMe model. Both of these will be updated shortly to match the NVMe model. The streams will be updated first since the NVMe model is currently in member review, but it will not be updated until the member review completes. ABO will probably wait until things settle out in NVMe. Rebuild assist for SSD approved on the same
Starting point is 00:45:08 day as the streams was approved in NVMe also on the 15th of September, last Thursday. That was a busy day. Depopulation is under development in parallel in SCSI and serial ATA.
Starting point is 00:45:24 I anticipate it to complete sometime in 2017, and I'm not going to get any more specific than that. Having been in the conversations, it's not a quick process. So SATA, ABO standardization was completed in April of this year. Again, it does not support the target fraction. Stream standardization kind of went on a hiatus waiting for NVME to complete. It's mostly there, ready to go. I expect completion of that later this year.
Starting point is 00:45:58 And again, we'll go back and update the ABO to be consistent with the NVME standard. Depopulation, I think I already basically said that, should complete in 2017. It's joint stuff between SCSI and Serial ATA. So isn't one question on that, isn't there two portions of that depopulated? The date of reserving one is going to be 2017, but we're hoping to get the other version where you lose the data.
Starting point is 00:46:31 That's the repurposing. We're hoping that diminishes this goal. It's getting close. It's close, so yeah, it may. We've got some time to replace. It's already halfway through September, which means we have one more meeting cycle to finish it off. However, we are meeting every other week or so on that via conference calls. Okay, object drive.
Starting point is 00:46:59 This is a very quick slide, kind of an update of where we're at. Not a whole lot different from last year. It is being developed in SNEA. SNEA has started with an IP-based storage management proposal, and currently we're revision zero, version 12, and what is in there is supported by Swordfish, which has just been released. So this piece of stuff is released and is kind of the underlying girders for the IP-based storage management. Future directions, we hope to get into a key value API or something else of that nature,
Starting point is 00:47:42 but currently we don't have proposals that are on the table that we're working on. So if you're interested in object drives, I'd encourage you to get involved in the object drive technical working group. So with that, I have a total of three minutes for questions. So anything, or hopefully you asked questions as things came up. Okay. Yeah, one question in the back. So especially since the stream part of the thing for SCSI got done last year,
Starting point is 00:48:16 where in the software stack status is it supported by any device drivers? Right now where I see that being supported in terms of the SCSI model of that is actually in array controllers that are talking to SSDs. At the moment I'm not aware of it being built into a stack. I'm giving a presentation at five about how we can support the work in line. Okay. so I'm getting a presentation at five about how we okay Okay, thank you. Thanks for listening. If you have questions about the material presented in this podcast,
Starting point is 00:49:17 be sure and join our developers mailing list by sending an email to developers-subscribe at sneha.org. Here you can ask questions and discuss this topic further with your peers in the developer community. For additional information about the Storage Developer Conference, visit storagedeveloper.org.
