Storage Developer Conference - #6: SMR – The Next Generation of Storage Technology

Episode Date: May 9, 2016

...

Transcript
Starting point is 00:00:00 Hello, everybody. Mark Carlson here, SNIA Technical Council Chair. Welcome to the SDC Podcast. Every week, the SDC Podcast presents important technical topics to the developer community. Each episode is hand-selected by the SNIA Technical Council from the presentations at our annual Storage Developer Conference. The link to the slides is available in the show notes at snia.org/podcast. You are listening to SDC Podcast Episode 6. Today, we hear from Jorge Campello, Director of Systems and Solutions with HGST, as he presents SMR, the Next Generation of Storage Technology, from the 2015 Storage Developer Conference.
Starting point is 00:00:50 Shingled magnetic recording is a new mode of performing magnetic recording. So in traditional, what we call perpendicular magnetic recording, the data is written on the media as a set of concentric tracks, and as you write them you write them as independent tracks. What happens as you scale is that these tracks get smaller and closer together, but they still remain as independent tracks. And that's what we've been doing in magnetic recording for a long time. We got to a point where, in order to continue to pack more data within the same area, we shifted to this mode where, instead of writing the tracks independently, we write them as these sets of overlapping tracks. And one of these sets of overlapping tracks, we call it a zone.
Starting point is 00:02:08 And in between zones, you have this little gap. So if you look at the picture of what we call PMR versus SMR, you see that these gaps between tracks only occur every so often, separating these bigger packs that are called zones. This looks great, you know, it looks like I fit more of these patterns on the same area. But it comes with this caveat. The caveat, which is intrinsic to the recording mechanism, is that while the zones themselves are independent, you cannot change, I'm saying here sector because that's how you usually talk to an HDD through the interface, but you can't change sectors within a zone independently. Because when you write it, you overwrite the track next to it. And then the question comes, okay, awesome, so what now?
Starting point is 00:03:00 So we've been at what now for a few years now. And, you know, in these corporate communications trainings that we have, they say, look, you have to say the same message seven times for it to stick. This is the third year I've been coming around talking to people about SMR, so I still have four to go. But there are a set of standards that have been worked on within T10 and T13 called ZBC or ZAC. ZBC would be the T10 version and ZAC the T13 version of it. Now, why SMR?
Starting point is 00:03:39 You know, it's the question. And we say, well, because we can increase areal density. If you embrace this technology, you're on a curve that grows quicker than if you don't embrace it. There are other technologies coming around and you might have heard of HAMR or bit-patterned media or MAMR and all kinds of different technologies. And while they all will increase areal density, there's always going to be this difference
Starting point is 00:04:07 of how quickly they grow if you adopt SMR or not. So the industry made this decision to make this transition into SMR. And we are just about finalizing the standards. They're kind of in a stable mode in T10. And in T13, almost there. And now we're at the point where, OK, let's develop storage solutions for these things.
Starting point is 00:04:41 So what are these solutions that we came up with in the standard? So how do we solve that problem that you can't write things independently? Well, can't you just let the drive do it? Why even bother, right? Why are you talking to us, bothering us, making us change software stack and all that? Okay, that's one model, one approach you can go. This is what we call the drive-managed model.
Starting point is 00:05:00 The other approach is, why do you guys in the drive industry want to do all these things? We on the host side, we know much better, we have more resources and visibility. Why don't you just let the host manage the problem? Okay? We have the host-managed model. And the question, isn't there something in between? Yeah, there is. We call it the host-aware model. And these are basically the three models that we've been talking about in the industry over the last couple of years. So what we're going to do is we're going to spend some time talking about these three different models and see how they differ and what they mean for the system. Okay, again, at a high level, these three different models. Drive managed, in principle, has no software impact on the host.
Starting point is 00:05:51 It's designed as the drive does everything under the hood. So you don't have to change anything in your system, right? As you'll see, there's some caveats in that the performance characteristics and the overall behavior is going to be different than what you were used to. And one of the things that happened in the storage industry over the last couple of decades is that much more than what comes through the interface
Starting point is 00:06:18 standard has been ingrained into host side solutions. There are expectations as to if I'm using lower LBA numbers, that these are located in certain positions in a disk. And if I do IOs of a certain type, I'll expect certain type of latencies and performance characteristics. And those will change quite a bit when you embrace a drive-managed model. But from a software point of view, if you were just talking about run it and not getting any software errors, that would work.
Starting point is 00:06:58 The host-managed is completely different. It's the opposite side of the spectrum, which is it just won't work. It's identified as a different device type. I mean, you're lucky if your hardware lets you talk to it, let alone your software. So it's the opposite side of the spectrum. And then you'll see that the host aware tries
Starting point is 00:07:20 to bridge this gap somewhere in the middle, letting you do more of the management on the host as you feel comfortable with it, but letting the drive pick up the heavy lifting for areas where you feel that either it's too hard for you to do it or that the performance impact is not going to be so great that you would bother changing your host-side software for it, right? So these are the basic characteristics. So now back to how we figured this out
Starting point is 00:07:49 from a standard point of view. So what we did is we created this device model, which we called a zone block device. And the zone block device is characterized by partitioning your LBA space. In this case here, it goes from 0 to m minus 1, right? m blocks. That's your whole LBA range.
Starting point is 00:08:13 And it's partitioned into these n zones. Now, zones can be of three types. At a high level, there's two types. One is conventional, meaning that you would address them as you would any normal block device. Write sectors, read sectors back, address them by LBAs, everything is fine. And then you have this new type of zones, which are called the write-pointer zones.
Starting point is 00:08:42 A write-pointer zone is characterized by having a write pointer. And the idea of the write-pointer zone is that you should be writing at the location pointed to by the write pointer. And here is where we differentiate two types for the write-pointer zones. We have what's called the sequential write required, which means you can only send your write I/Os to the location pointed to by the write pointer. If you don't, you'll generate an I/O error.
Starting point is 00:09:23 So the idea is you write sequentially, and every time you write, the write pointer gets updated to the next free LBA, and you can only write sequentially. The other type is what's called sequential write preferred. As long as you're writing sequentially and always at the write pointer, the write pointer will be there telling you where the next location is. But you're not required to. If you don't, then the drive will take care of it. And the write pointer becomes an undefined entity at that point. So let's look at how this looks in the drive model throughout. So typically, you're going to have,
Starting point is 00:10:14 within a particular drive, a combination of these zones. And if you're using a host-aware or a host-managed type solution, you can have conventional zones in both of them. And what distinguishes the host-aware from the host-managed is in the host-managed, you'll have these sequential write required zones. So you're required to write sequentially on these zones. If you don't, you get an I/O error. So it's up to the host to manage the I/O patterns
Starting point is 00:10:51 so that they satisfy the constraint that's intrinsic to the recording technology of SMR. Hence the name host managed. In the host aware, what happens is the drive presents you these capabilities, the same capabilities that are presented in the host-managed model, for you to manage the drive. But if you choose not to, then the drive sort of takes over. And that's sort of the model for it.
Starting point is 00:11:19 And they do it on a zone-by-zone basis. So when you look at a host-aware model, you'll have the sequential write preferred. So down here, you have this picture where it shows a particular choice of how a device manufacturer might configure their product. You might have a certain amount of conventional zones. They're put in there in purple, and a certain number of sequential
Starting point is 00:11:46 write zones. It could have been sequential write preferred or sequential write required, depending on whether it was a host-aware or a host-managed device. So you get the picture overall. We have these independent zones. Zones can be written independently.
Starting point is 00:12:05 Within a zone, there's a write pointer, and you're supposed to just fill it up from left to right, sequentially. If you're in a host managed, you have to do that, or you'll get an I/O error. And if you're in a host aware, you continue doing that. The moment that you deviate from that, all of a sudden the write pointer becomes undefined, and the drive takes over management of that zone. Now there's one way to recover that zone, which is through the execution of what's called the reset write pointer command. What a reset write pointer command does is it
Starting point is 00:12:39 sets the write pointer back to the first LBA of the zone and invalidates all the data that's beyond it. So essentially, it's like an erase block, if you're familiar with SSDs. Sort of, that data's all invalid, and now you're coming back to the beginning. And this works in either model. So if you had written out of order in a zone in a host-aware scenario, and you say, oh, gee, I didn't mean to do that. I really wanted to manage it. Well, if you reset the write pointer in that zone and start writing again, then you're
Starting point is 00:13:09 back into having the management features in that zone. For a host-managed solution, the only way to overwrite data that you had written previously is to first reset the zone and then write those locations again. So I've been talking about these, and then basically these are the commands from the standard. I'm concentrating on the first two here. One is the one called report zones, and the other is the reset write pointer that I talked about.
Starting point is 00:13:46 The report zones, what it does is it informs the host what is the configuration of zones in a device. It also tells you, for all the sequential write, the write-pointer zones, it tells you the value of the write pointer for all the zones. And remember that in the host-aware model, some of those might be undefined, which the standard says is vendor unique, so each vendor makes a choice of what they do when you wrote randomly in the zone. But typically this is the command you would do if you want to discover where the
Starting point is 00:14:19 write pointers are. There's a couple of others here that have to do with zone conditions. And then you can open zones, close, and finish. I'm not going to go into the specific details of that. And I concentrate more on the effect of the overall model and using those first two. So here's a picture in which I put side by side the three models, and I'm showing the same I/O patterns. So the first column refers to drive managed, the middle one host aware, and the one to the right host managed. And I'm depicting here three zones side by side on the device.
Starting point is 00:15:06 So everything under the lower line is on the device. The middle is the I/O pattern, and on top, the host. So let's look at the left column, which is drive managed. So there's no SMR management on the host, per se. The I/O patterns keep coming. And then left to right depicts location, bottom to top, time or order. So you see these I/Os are coming. And in the middle zone, you see that there
Starting point is 00:15:39 are some I/Os that are out of order, in the sense that there's a little blue space that's a gap, and that means that I/O came that was not right at the write pointer. So when you look at how I'm depicting in the bottom a representation of the models, the drive always presents to the outside world an address space from logical block address 0 to the max. And I'm talking about having this address space either being constrained or unconstrained.
Starting point is 00:16:15 A drive-managed model presents to the world an unconstrained view. And it makes the constrained portion completely opaque to the outside world, meaning you have no idea what's happening in the constrained space. It's doing stuff, moving data around, doing what it needs to do to satisfy the intrinsic physical recording constraints and the I/O pattern you sent. But you have no visibility. If you look at the far right in a host-managed scenario, the host has to have full SMR management.
Starting point is 00:17:01 And the drive does not have an unconstrained layer to present to you. It only has a constrained layer to present to you. And what happens is, as your I/O patterns come down and you get an I/O that would violate the writing rules of a sequential write required zone, meaning
Starting point is 00:17:17 your I/O is not at the location of the write pointer, that I/O generates an interface error. If you look at the middle guy, which is the host aware, a good way of thinking about it is sort of a hybrid between the two. And this hybrid meaning that you can have a drive-managed or host-managed model on a zone
Starting point is 00:17:38 by zone basis. So if you look at the I/O pattern, which is the same as before, for the first zone, which was sequentially written, you have this unconstrained view, it's writing, but you have visibility of what's happening in the constrained world. Because you're satisfying the rules, and you still get the write pointer that's valid and tells you how much data was written. In the middle zone, where you have an I/O that's out of order, well, what happens is that it writes it, so you still have this view of the world on the unconstrained view, so you get your I/O back. However, you lost visibility as to what is happening in that zone. Did it stage it somewhere else? Has it put it back in place? What's happened? Well, you know, you have no idea. The drive internally is doing something to produce that unconstrained view for you,
Starting point is 00:18:31 but you have no visibility of what that is. The moment you sent that out-of-order I/O, then you lost that visibility. But for the last zone, since you write that one sequentially as well, it gives you the sequential on the unconstrained view, but you have the visibility into the constrained view, meaning your write pointer location is valid. And when you do report zones, you get the location of that write pointer. So these are, again, at a high level, the three models.
Starting point is 00:19:01 And what I want to do now is spend a little time talking about what are the consequences of each one of those models. So let's start with the drive managed. That seems like, oh man, if that just solved all the problems, we wouldn't even bother. So when we look at the drive managed, and we look at the typical types of I/O patterns, you'll see that if you're just doing sequential reads, it's going to be very similar to a normal SMR drive, I mean PMR drive. Random reads are going to look similar.
Starting point is 00:19:44 Sequential write is also similar because you're satisfying the constraint. Now, when you're doing random writes, well, that's where your mileage may vary. And of course, when you present this thing, look, it's three quarters the same. It's just, oh, but that one quarter, oh, that was the most important. I mean, you're just killing me.
Starting point is 00:20:12 Okay, so let's take a look at that one quarter down there. That's the most important. That's killing you. And I'm going to further divide it into six parts. So let's look at how I'm partitioning this. So I'm looking at having a low-duty cycle or a high-duty cycle. And I'm looking at having small blocks, large blocks, or what I'm calling huge blocks. Completely undefined, so I can change the numbers to fit what I'm saying.
Starting point is 00:20:42 And the trick is, if you have small blocks, then a drive, by using simple caching techniques, can really take care of the access pattern. And actually, it's going to look much better than what we used to get with PMR. And the reason for that is when you have very small blocks, the majority of the time of your I/O is spent locating the recording head to the position where you want to write. And very little time is spent really actually doing the reads or writes. So if you have a lot of small block random writes, you can use the simple technique of, well, just write them sequentially
Starting point is 00:21:17 in the cache area that I have. And then when I have a lot of them, then I can read them, a bunch of them, and destage them. And what happens is, as you have a large number of them, you get to decide the order in which you're going to write. And then you get to write them in the order of the closest one, and the next closest, and the next closest. So all your seeks that you would do, which is the majority of the time, of a random write of a small block, they all got reduced. So all of a sudden, the overall time that you spend doing those IOs is much smaller.
Starting point is 00:21:53 So you basically had no seek when you write them all, because you're sequentializing them. And you change the one big seek for a bunch of very small ones. So you end up actually being able to support much more random writes than you were with a normal PMR drive. And it's so much so that it doesn't even matter if you have a high-duty cycle, because we really do it faster. You can actually support more IOPS.
Starting point is 00:22:17 It's awesome. So this is that case. Now let's look at the huge blocks. Huge blocks, on the other hand, you spend so much time writing sequentially that the edge effect of when you started it versus the end kind of becomes smaller. It starts looking a lot like just writing sequentially,
Starting point is 00:22:37 which you can do easily in SMR-type solutions. So those are not so bad. There's a bunch of techniques that you could use, that you could envision being used under the hood, that will give you something that looks good. Okay, what about, oh, but I don't care, the important one is the large blocks. You guys are killing me again, right? So in the large blocks, and this is what I'm defining by large.
Starting point is 00:23:02 It's something where the actual time you spend during the I/O writing the data is an appreciable amount of the overall I/O command. So this is a situation in which your I/O time for the write is no longer dominated by the time to mechanically move the head to the position where you're going to start writing. But the amount of time you spend writing becomes an appreciable part of that as well. So when you do that, that little trick I said in the beginning, the math starts not adding up so much. Now if you stage it somewhere and you have to write it somewhere else, it starts getting a little bit more expensive to do that.
Starting point is 00:23:38 The fact that you're doing the I.O. twice starts being a heavy cost on you. And here, again, if your duty cycle is low, we can hide all manners of sin. We've got the time. We're going to back up. Oh, no, no, wait. I got it. I got it. I'll return you something.
Starting point is 00:23:56 Go back. Fix it. And come back. And it all works. But when you have a high duty cycle, and you have these large blocks, then it's not so trivial. Then you can see where these simple caching techniques that I described, they're not going to really get you what you want. Furthermore, since the IOs are bigger, whatever caching space you had, you're going to run out of it much quicker. So that's where it gets a lot more tricky for the drive
Starting point is 00:24:29 managed solution to take care of it. Now, there's a lot more techniques and things that can be done internally than the simple caching that I was describing. But it gives you a high level idea of what are the things that are intrinsically hard to be dealt with in these solutions. So if you live in a world where you either have low duty cycle
Starting point is 00:24:50 or you have only very small blocks or very huge blocks, typically you have an expectation that, you know what? If I just plug in this guy, either a drive managed or a host aware that you're just using in drive-managed mode, I might get away with it. If you are in a position where, no, I have a lot of blocks that are larger, several hundreds of K's or megs, but not huge, where
Starting point is 00:25:22 it looks like you're sitting in the same place, that's where you might want to dig in a little bit deeper and look at your solution and say, you know, is this going to fit my needs or am I going to have to now look at the other alternatives where I do use the command set and try to massage my workload to fit my needs? Okay. So much for the drive-managed model. Let's switch gears and talk about the host-managed model.
Starting point is 00:25:53 So the host-managed model, as I mentioned before, it has these characteristics: it's a new device type. It's not backwards compatible. Out-of-order writes will generate interface errors. There's a lot of layers in the stack, a lot of little pieces of the plumbing between some initiator sending an I/O and the actual device.
Starting point is 00:26:18 They're going to be very unhappy because these things fail. There's a lot of little bits and pieces for you to take care of. Now, how can you manage these, and with what types of solutions? Well, there's some that are hardware-based. You get an appliance from someone and they're taking care of it
Starting point is 00:26:35 for you, and they'll present you either a virtualized block interface, meaning SAS, doing everything on the back end, or they'll present you a NAS interface and you just talk to it as a filer, and someone did it for you and then you're just using it as a storage solution, right? That's one option. If you're trying to build one of those yourself, then
Starting point is 00:26:58 you're probably going to be looking more in the bottom categories, where you're saying, well, how can I get hardware that's not going to bomb on me when it sees a new device type? And then what do they have to change in software to make it work? And in there, I'm basically breaking it down into kernel-level support type activities and then application-level support. And by that, I mean in kernel level you can think of
Starting point is 00:27:27 either doing something where you're dealing with the block layer and you're trying to virtualize the effects of SMR. So you're doing that sort of at the block layer, and then you have file systems
Starting point is 00:27:40 and everything on top of it. So you use something like a device mapper, and then you might implement what you would implement in the HDD there, or you might implement some forms of caching in there. Another approach is for you to go and say, no, what I'm going to do is, most of my applications access storage through a file system, so I'm going to go and design a file system that knows how to deal with these types of devices,
Starting point is 00:28:05 and that's the one that's going to do the heavy lifting for me. And then my application stack on top of that is just going to continue unchanged. I might have to tweak some parameters to make it work. So that's another approach you might take. And then you'll see currently there are efforts in all of these kinds of directions. There is some support for XFS that's planned, with no timeline
Starting point is 00:28:27 announced, so don't hold your breath on it. There are some projects that are branching off of existing file systems and making modifications to them to satisfy what is needed to support SMR-type drives. There are new file system projects that are out there and people coming out with solutions. And then there's all the other ones, the device mapper type implementations.
Starting point is 00:28:50 And you're going to see a lot of these types of solutions in the other talks today and then tomorrow as well. Now the other approach is, well, I'm not going to modify the kernel and I'm not going to rely on the kernel
Starting point is 00:29:07 infrastructure because of this new device type. I'm going to manage everything in application space. And to do that, what you do is you talk directly through an SG_IO interface, and then you talk directly to the drive. There's a set of new commands that you can use, and you build your own solution. The downside of that is you lose a lot of the kernel facilities. So there's no caching. Like, you have to bypass the page cache. You don't have a file system.
Starting point is 00:29:33 You're talking to the raw device yourself. But if you're doing things like large objects in a distributed object storage type solution, you might do a local key-value store that talks directly to the drive and manages everything on its own. That might be an approach that you might take. If you're taking that approach, there is an open source project, a library in user space, that basically wraps the new interface commands into a C interface.
Starting point is 00:30:07 So that you don't have to know how to populate the CDB yourself using SG_IO. You just use these commands that basically just wrap those for you. And if you're interested in this, you can go to GitHub, and then you can download the project, play around, and then there are some example applications. There's a few GUI applications that tell you the states of the zones in your drives. You can use this as a development tool and write your own, et cetera. So if you're interested in learning more, you can talk to me after this, and then I can give you some more information. You can talk to the gentleman over there, Damien Le Moal, who's the main maintainer of that project.
Starting point is 00:30:54 But again, these are the solutions, or let's say the main avenues that you have today, if you're choosing this host-managed approach. Okay? You know, I've seen a lot of heads of people, enthusiastic, they already picked which one of the two. They say, ah, this solves all my problems. And I see some that may be holding out,
Starting point is 00:31:15 okay, but there's that last model. Maybe that's going to be easier. Okay, so let's talk about it. So we have the last model, which is this host-aware model. As I told you before, the host-aware model is sort of this superset of the two in a hybrid mode, where on a zone-by-zone basis, you can decide to be in drive-managed mode or in host-managed mode. It has this advantage that it is backwards compatible,
Starting point is 00:31:44 and it does implement the new standard commands for you to manage these zones. And how would this look in practice? Well, what ends up happening is if you have a host that is not aware of SMR, basically what you're going to get is a device that works in drive-managed mode. Your host doesn't know it's an SMR device.
Starting point is 00:32:19 It's going to write the same I/O pattern that you've been writing before. And some things are going to come out of order, some things are going to come in order. And you have no visibility, you use none of the commands, so essentially you're using a drive-managed mode of that solution. Now, the tricky part, and this is where I... What happens when you do have a host system
Starting point is 00:32:41 that is aware of SMR? Well, what you get really is what I call a configurable host-managed device. In a host-managed device, the location of zones, be it conventional or sequential write required, they're not chosen by the host. They're chosen by the device manufacturer at manufacturing time. The report zones command is used to inform the host of what was a decision that cannot be changed. So, oh, I want more conventional zones. I'm sorry. It's you get what you get and you don't make a fuss.
Starting point is 00:33:23 That's the model for host-managed. When you're talking about host-aware, you have this freedom where you got the information about the zone layout. They all start in this host-managed mode. All the sequential write zones, all the write-pointer zones do. A host-aware device might have conventional zones that don't have write pointers associated with them, but for the write-pointer zones you can treat them all as if they were sequential write required if you choose to. However, if in your implementation, you know, I had this host-managed device before, but I really would hope that, you know, this zone at the end of the device, or if I had a few in the middle because I use these partitioning techniques, and I really wanted a manufacturer to do a host-managed that fits all my needs.
Starting point is 00:34:16 Except that I wanted a little conventional zone here, here, or here. You're not going to get that done in a host-managed device. There's too many needs and too many different people wanting their conventional zones in different places and in different ways, and it's impossible for device manufacturers to have configurations that are going to fit everyone's needs. And that's really where a host-aware type solution comes in. You get to choose which zones are going to be in drive-managed mode and host-managed mode.
Starting point is 00:34:46 And you say, well, but drive-managed mode is not the same as a conventional zone, because, you know, conventional zone, I assume that you guys are doing just PMR. Nothing in the standard says that. Bear in mind, conventional zone just means that there are no write pointers associated with it, and it's going to satisfy the normal interface commands as usual. How the drive implements that is up to the device manufacturer. So, number one, in a host-managed device with a certain amount of conventional zones, you might really be getting some SMR zones in the beginning with drive management
Starting point is 00:35:26 that makes it look good. So there's no guarantee. So that's number one. Number two is you're going to most likely have differences in experience that are going to vary with the amount of zones you choose to manage as a host or use in drive-managed mode, where the drive manages it. If you think of a device that was built to be able to deal with the whole device working in drive-managed mode, whatever caching schemes, whatever over-provisioning, whatever mechanisms were in place, were designed to give this good performance for random writes happening throughout the surface.
Starting point is 00:36:10 If you were to restrict those into a small portion of the device, you can think it's very likely that the drive will have an easy time giving you good performance for those. You can also imagine that the bigger the proportion of the drive that you put in this mode, where the drive has to handle it, the more it's going to look like just the drive-managed device. So one of the other issues that we have encountered with this overall model, or the zone block device model, in feedback from a lot of folks, is, you have this solution which is,
Starting point is 00:36:55 the drive does everything, and I don't know, I have no control, it's hard for me to use. Or you have this one where, I have to do everything. And that's kind of hard. It doesn't really fit a lot of the programming paradigms that I've been using. And then we say, well, but you have host-aware, right? And they say, well, host-aware isn't really much better,
Starting point is 00:37:14 because the tools you give me are still, the host has to manage it. Or I let the drive manage it. I can just choose what fraction of the drive is in one of those extreme modes. But it's not like there's other types of commands in there. So one of the things that would be interesting, this is for all of you guys to think about as you experiment
Starting point is 00:37:36 with these and build solutions and say, oh, if I only had this type of feature or this other type of thing, it would be nice for us to know. So we're wrapping this up, but five years from now, we'll have an update to this. And then you'll get to say something. Okay, we're coming back to the ten minutes of Q&A. Alright.
Starting point is 00:38:03 So, in summary, what do I want to say? It's, SMR looks to everyone like this thing, this hard thing I have to deal with. And when those things come around, it really is an opportunity. There's going to be those who figure out how to deal with it and can take advantage of the
Starting point is 00:38:25 technology. And there are going to be those which, you know, they just wait around until someone else figures it out, to buy the solutions from the ones who figured it out. So it's time to choose in which camp you want to be. And that's all I had to say. Questions? Can you comment on the relative advantage, areal density-wise, of SMR versus non-SMR?
Starting point is 00:38:59 Can I start? Can you give a little more? Yeah, it's a good question. It's a pertinent question. I give you these graphs that look awesome, right? Awesome, there it is. Crystal clear, isn't it? What I can tell you is we have a product that's an 8 terabyte product that's not SMR, and we have a 10 terabyte product that is SMR. So these are announced products from HGST. So that gives you an idea.
Starting point is 00:39:37 The actual number of what is the areal density gain, these are things that we don't quantify, not because, oh, we're trying to, eh. It's because it's really freaking hard. There are going to be estimates, and they vary. And then you talk to folks, and some are very optimistic. So you're going to get 30%, 40% differences ongoing. Others say, well, maybe not so much.
Starting point is 00:40:01 Maybe it's just that these slopes are all getting smaller and the advantage is going to be more on the time axis that you're going to get the same capacity point a year earlier than if you don't use SMR. So those are the type of discussions you're going to end up having.
Starting point is 00:40:20 But I have two particular product numbers that you can draw conclusions from. Any more questions? Yeah? So, I mean, Jim just talked a little about the same topic. At what point will the drive get so big that the rebuild problem just becomes, like... So that's a good question. That's not so related to SMR per se, but it's a problem that we in the HDD industry
Starting point is 00:41:00 have been working very diligently and very hard to make worse and worse year by year. And the reality is that a lot of folks, you know, of our customers in the industry are getting to the conclusion that, you know, RAID-type solutions, after a certain capacity size, start making less and less sense. So the way around it is to use other type of solutions where you basically use declustered approaches, where it could be RAID or erasure coding if you have more than one block of redundancy. But basically, the stripes, if you will,
Starting point is 00:41:42 they are separated into different HDDs in the pool at any given time. If you lose a particular drive, then the rebuild is not rebuilding everything to one single drive, because writing a single one of these drives in full, you know, takes a while. And the way we promote this is saying, look, I guarantee you that you're not going to be able to use it up in less than a month. But what happens is that you just rebuild a little piece in different places of your spare capacity at a time. So these are the type of solutions that you're going to have to be using as these drives get bigger and bigger.
Starting point is 00:42:17 We're already at a point where it's uncomfortably big. Let me take a few more questions. I have a couple questions. First one is, the drive at some point, it's... [question partly inaudible] Yes. Yes. Skipping? Yes. All that stuff? Yes.
Starting point is 00:42:46 Yes. So I mean, these are the type of things. The typical drives today, they have spares. When we find these, we reallocate that one sector to a different place. And then they're usually located throughout the surface, so it's not really far from its original location, but with certain granularity. So that same mechanism is going to be used for SMR. You can imagine
Starting point is 00:43:11 that those are going to be a little bit more tricky, because you're going to have to space that zone out a little bit more, because you're going to have to write those sectors independently. So you'll see that it's going to be tricky for device manufacturers to put that in place. Is one of those three modes advantageous to deal with defects or problems? I don't think that any one is particularly better than the other in respect of dealing with defects. I would say that drive-managed solutions or host-aware
Starting point is 00:43:42 solutions are going to have a lot of the infrastructure built in place for dealing with redirecting I/Os to a different place. So probably you'll feel a lot less of that effect versus a host-managed, where everything is always put in place, and every now and then if there's something that has to go
Starting point is 00:44:00 and read somewhere else, that is only going to happen because of defects, so it might have more of a visible impact to you. So then, why would I, what would be the purpose of using host-managed, since we have to have a lot of resources to develop and to deal with these more involved devices? What advantage do you see, in your experience, from host-managed versus the other stuff? Most of the modern hosts are trying to reduce I/O by bunching them together.
Starting point is 00:44:47 Essentially becoming more like those. Like what this is here. And they're already bunched sequentially into large I/Os, so they already behave very much the same as the host-managed does. But in general, this host-managed is strict, meaning... [the rest of the question is largely inaudible].
Starting point is 00:45:23 We have a use case, let's say cold storage, operated by the data writer. Just one write, sequential. So it's likely to be very similar? It doesn't matter. It doesn't matter. Host-aware, host-managed, they're just the same in all of them. It's likely to be very similar. The advantage of host-managed in general from a system
Starting point is 00:46:15 point of view is since it's easier for a device manufacturer to implement it, there's a good chance that you'll see products that are of that mode for a certain capacity point coming sooner. And also, you are guaranteed that you're not going to have any rewriting in the background. If you have applications that are very time sensitive and you need to have that control,
Starting point is 00:46:41 the host-managed solution is going to be more adequate. So those would be the kind of two advantages I see. You might get a better deal on the host-managed from availability, maybe pricing, and you have more absolute control. So if an I/O is out of order, it's an I/O error, and you fix your application. When you have the host-aware or drive-managed, it just happens in the background and
Starting point is 00:47:08 you'll see the small hiccups here and there. And if your solution is deploying a lot of these in an environment, that might be something that's more of a concern for you. But it's definitely harder for you to incorporate. Is there any follow-up work, performance measurements on the delta? Between? Take a standard workload and look at the IOPS or the latency delta between host-managed and drive-managed. So that's a difficult question, because the I/O patterns that you can have in host-managed are limited versus drive-managed, right?
Starting point is 00:47:42 So if you... It's not good, because you would just use that as your baseline. You don't care about what you can't do. Right. So you take a host-managed and then run a drive-managed. Yep, yep. So what I was going to say, the follow-up is,
Starting point is 00:47:54 so you have to look at, you know, okay, let's look at the set of workloads for which you can do that. So most of these workloads where you have mixed workloads and all that, the writes are all out of order. So you can't do any of those. You can do sequential writes,
Starting point is 00:48:08 and that's back to the drive manage picture I had. Well, if you're writing sequentially, it looks all the same. If you're reading sequentially, it looks all the same. If you're reading randomly, it looks all the same. And if you're writing randomly then... I mean, if you're writing randomly, then it gets tricky. If we don't do random writes,
Starting point is 00:48:31 though, is there a performance delta? In theory, the interface standard and SMR per se probably don't imply a difference. I'll take that with a big asterisk. Fine print. Fine print. Various manufacturers might choose to take advantage of the fact that you can only write
Starting point is 00:48:57 sequentially to implement a lot of other techniques that can squeeze out more capacity, which might have an impact on your performance. Disclaimer. From what I've seen, individual vendors decide. So you're not getting it from the standard; you're getting vendors, which do what they do. Thank you. [The rest of this exchange is largely inaudible.]
Starting point is 00:49:36 So, folks, I got the out-of-time sign. The next session starts in 10 minutes, and, you know, I'll be around. Thanks for listening. If you have questions about the material presented in this podcast, be sure and join our developers mailing list by sending an email to developers-subscribe@snia.org.
Starting point is 00:50:18 Here you can ask questions and discuss this topic further with your peers in the developer community. For additional information about the Storage Developer Conference, visit storagedeveloper.org.
