Storage Developer Conference - #3: Standardizing Storage Intelligence and the Performance and Endurance Enhancements it Provides

Episode Date: April 18, 2016

...

Transcript
Starting point is 00:00:00 Hello, everybody. Mark Carlson here, SNIA Technical Council Chair. Welcome to the SDC Podcast. Every week, the SDC Podcast presents important technical topics to the developer community. Each episode is hand-selected by the SNIA Technical Council from the presentations at our annual Storage Developer Conference. The link to the slides is available in the show notes at snia.org/podcast. You are listening to SDC Podcast Episode 3. Today we hear from Bill Martin and Changho Choi, principal engineers with Samsung, as they present on Standardizing Storage Intelligence
Starting point is 00:00:44 from the 2015 Storage Developers Conference. I am Bill Martin from Samsung. I represent Samsung on SSD standards throughout the industry, and my co-presenter today will be Changho Choi, also from Samsung, who is working on the development of what we're talking about inside Samsung. And we'll talk about some of the questions that Dr. Gibby asked about earlier. So, as I said, I hope that you're having a great time at Storage Developer Conference. In addition to being from Samsung, I am also the chair
Starting point is 00:01:25 of the technical council for SNEA. So I do a lot to get this put on and hope that this is useful to you. To start with, I'd like to talk about what issues we are addressing with the initiative that we have called storage intelligence. Currently, hosts don't have a mechanism to understand how storage devices, the internal features, which leads to inefficient operation of background operations and inefficient placement of data. So when you are talking to an SSD, there are some things
Starting point is 00:02:08 that if the host knows about them, then it can do a better job of telling the device where to or how to place the data on the device. Additionally, the current technology requires multiple translation layers for key value to block storage and storage intelligence the overall initiative that Samsung has been working on for the past couple of years also has a prong that deals with that. In addition, data and computational processing is
Starting point is 00:02:47 not co-located, which leads to increased I.O. traffic, underutilization of the compute power that's in the storage device, and overutilized compute power in host and the storage subsystem.
Starting point is 00:03:05 So, those are the issues. How do we solve these issues? utilize compute power and host in the storage subsystem. So those are the issues. How do we solve these issues? We solve them with what Samsung has called storage intelligence. As we've taken it into the standards community it is getting called by its individual component
Starting point is 00:03:20 names, but as a whole this talk is talking about all of those components. So what is storage intelligence? It is an interface to provide better collaboration between an SSD and a storage subsystem. It deals with background operations, and we'll talk about each of these items a little bit more as we go through the talk. But background operations in particular for SSD are advanced garbage collection. And the word advanced there has some meaning in the fact that SSDs are always doing background operations, and among those are garbage collection operations that may not impact performance.
Starting point is 00:04:03 But this specifically deals with those operations that impact performance. Stream operations: this particular feature stores data with similar lifetimes in associated physical locations. What that means is that this is not a hot-cold storage issue. This is an issue where all of this data is going to expire at the same time, or no longer be needed by the storage subsystem or the server at the same time. So it will all be trimmed, or, pick your particular technology as to what the term for that is, where you are telling the storage device that you no longer want access to that data. It's also a mechanism to offload processing operations to the SSD. The first of those is object storage, which defines a key value storage API. And second is in-storage compute, which is a framework for offloading processing to the storage device, allowing you to use
Starting point is 00:05:15 the compute power of the storage device to do your processing, which could be key value processing, could be any type of processing you choose, using the CPU that's already on the SSD to do whatever computation you need. So, Changho? Good afternoon, I'm Changho Choi from Samsung Semiconductor. Okay, Bill introduced three new features as storage intelligence; we'll cover them one by one. Let's start with background operation control. With background operation control, the host can control background operations such as garbage collection inside the SSD. Right now there is no interface or no
Starting point is 00:06:15 mechanism host or storage system can control the internal garbage collection for SSD. But we want to provide an interface and we want to provide a way host can control background operation. There are four commands for background operation control. The first one is that we provide interface to set background operation mode. We have two background operation mode. One is stop background operation and the other mode is continue background operation mode. In stop background operation mode, even device starteditiated background operation, they suspend background operation when I.O. comes to device. Because we want to give a high priority to I.O. traffic, we suspend background operation in stopgc mode and after we serve the I.O. we resume the background operation.
Starting point is 00:07:26 In continue background operation mode, actually continue means that we don't want to suspend our background operation. We just want to continue background operation even when there is I.O. and second command is start and stop background operation as I said your host can initiate advanced background operation and also it could send stop command. When you send a start
Starting point is 00:08:07 background operation mode you could also set up some a you could set up a period of time that device could perform background operation. In that way host could control even background operation time. And finally, host can retrieve background operation status through the background operation log page. In the log page, we maintain the background operation status. For example, we make some flag if there is internal background operation running. Also we have a one field showing that the advanced background operation meaning that host initiated background operation is running or not. As I explained, host can specify a period of time that device may perform background
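To make the four-command interface concrete, here is a minimal host-side sketch in Python. It is a toy model: the method names (set_bo_mode, start_bo, stop_bo, get_bo_status) and the log page fields are illustrative stand-ins for the SCSI and NVMe commands described in the talk, not a real driver API.

```python
from enum import Enum

class BOMode(Enum):
    STOP = 0      # suspend background ops whenever host I/O arrives, resume after
    CONTINUE = 1  # keep running background ops even while host I/O is in flight

class ToySSD:
    """Toy model of an SSD exposing the four background-operation commands."""

    def __init__(self):
        self.mode = BOMode.STOP
        self.internal_bo_running = False   # device-initiated housekeeping
        self.advanced_bo_running = False   # host-initiated ("advanced") BO
        self.bo_budget_ms = 0

    def set_bo_mode(self, mode: BOMode) -> None:
        # Command 1: select stop (yield to I/O) or continue behavior.
        self.mode = mode

    def start_bo(self, max_time_ms: int) -> None:
        # Command 2: host kicks off advanced BO, bounded by a time budget.
        self.advanced_bo_running = True
        self.bo_budget_ms = max_time_ms

    def stop_bo(self) -> None:
        # Command 3: host cancels the advanced BO it started.
        self.advanced_bo_running = False

    def get_bo_status(self) -> dict:
        # Command 4: read the background-operation log page.
        return {"internal_bo": self.internal_bo_running,
                "advanced_bo": self.advanced_bo_running}

# Typical host flow: detect an idle window, let the drive clean up for 3 s.
ssd = ToySSD()
ssd.set_bo_mode(BOMode.STOP)       # during I/O bursts, BO must yield to I/O
ssd.start_bo(max_time_ms=3000)
print(ssd.get_bo_status())         # {'internal_bo': False, 'advanced_bo': True}
```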
Starting point is 00:09:08 Why do we need background operation control? As most of you understand already, I/O performance drops a lot when I/O comes to the device while the device is running a background operation, for example garbage collection. So we want to avoid overlap of I/O and background operation inside the device. In this way we can achieve predictable, more consistent performance. We measured the benefit of background operation control with the FIO tool. As you see here, we put a three-second idle time before we run the FIO traffic. We ran this on a 480 gigabyte NVMe device. We did 128
Starting point is 00:10:21 kilobytes write with four different lifetime. As you see in the left side, even there is idle time, device didn't run any garbage collection. And also after idle time, when FIOs start write traffic, device runs garbage collection. So I.O. and garbage collection overlap in this case. Because of that you see a lot of performance drops and a lot of performance fluctuation. But if you run the background operations such as garbage collection during idle time, as you see in the right side graph, you could see a lot higher and consistent better performance. As you see in this
Starting point is 00:11:16 traffic model, the host runs the garbage collection during the idle time, and after that the host sends the write traffic to the device. Okay, this is background operation. Do you have any question on background operation control? Is that showing 4x more performance? Much more than 4x in this case. We used a 480GB SATA device. So in this measurement we could see more than 4x performance. Much higher performance. There is something missing from this, and that's what they're stopping. Let me back up a second.
Starting point is 00:12:09 SSDs do back up all the time, as Bill said. They're designed for a certain workload in that balance to work out. And we have products that run from client, which work at a certain load to enterprise which work at much higher loads and they're designed for that sort of back end and built in work to be going on and maintain their speed. If you overload what that particular product was worked for, you end up paying the piper sooner or later. Sure.
Starting point is 00:12:41 That's what's being shown as a difference here. What this feature provides is the ability to stay out of that mode. That's correct, yeah. That's correct, yeah. The advanced warning? So the advanced warning in the SCSI proposal, there is within the information that is provided to the host, there is a log page that lets you know a percentage of available device resources. The host is able to set a threshold on that percentage and then the device will give an asynchronous event notification when you cross that threshold saying you crossed a threshold which isn't all the way at the point that we're degrading performance but if you don't do something soon we will end up with degraded performance indicating to the host that you need to do to allow
Starting point is 00:13:54 us time to do these advanced background operations. Right now it's not a smart but it's a separate log farm. It is a separate log page. It is done in SCSI, done through the same mechanism as the resource provisioning log pages provide you information. To the point of Bill and I'm sorry your name is basically you're being getting advice, your application is getting advice that hey you're about to reach the threshold, it's up to you and I'm going to show my age here like in a mainframe when you get a threshold your application needs to know what to do with it to improve its own performance, right? Correct. That's basically what you're doing. You're giving a mechanism to help the application.
Starting point is 00:14:48 Exactly, yes. Yes. So the level of your utilization, let's say, in the SSD device, is that information available to the host at any time, or is it only when he gets the treasure? No, no, just so you know, we call it like a free resource information like in SSD could be free, the number of free erase blocks, that information will be available anytime. So isn't this pre-notice, pre-advanced warning? Isn't that more of an indication that if you don't kick in garbage collection, it's a low watermark and you're about to run out of available resources or erasing blocks to write to?
Starting point is 00:15:34 Actually, it's a lot of mechanisms. So when you get the advanced warning, you're really not going to stop garbage collection. You better start it. Correct. You don't have a process. Yes, that's a warning that you should allocate time when you're not going to be impacting your IOs to do garbage collection. Yes.
Starting point is 00:15:53 The normal balance has been set. You're going to have to go to the other space that that IO is for. That's what you're getting from the warning. Basically, you pay early, or you pay late, but you pay. But it can work if you. It's a tough question. Is there any way to form up a here? Okay. Male Speaker 2 Okay. Male Speaker 2 Sure. Male Speaker 2
Starting point is 00:16:07 Yeah. Male Speaker 2 I have a question. Is there any way for the to query the criticality of the ground operations or like the size of the backlog? Because there could be a potential different size of the ground operation. It could be different. Maybe some of these do the mirroring.
Starting point is 00:16:15 So I'll assume it could be low priority or critical operations. And if you have a question, you can ask me. Male Speaker 2 Okay. Male Speaker 2 Okay. So I'm going to ask you to do the background operations. Male Speaker 2
Starting point is 00:16:23 Okay. Male Speaker 2 Okay. Male Speaker 2 Okay. It could be different, or maybe some SSDs do the mirroring. So I'll assume it could be low priority, high priority, or great calculations. And how would the application know if the backlog is in terms of megabytes or gigabytes? Like one gigabyte will test you or one terabyte will test you. And speaking about application, I can see that application can manage that, but how do you see that managed in the server environment where on top of the server you have other on the host and who is the manager of your storage subsystem that would have to manage that. In terms of your criticality this really it's not like Dprag, you don't do Dprag on SSDs.
Starting point is 00:17:22 Right now there is no priority mechanism yet, but as you know, I think that could be an interesting thing we could explore in the future. To answer that in a slightly different direction: there is background operation that is going on all the time that you don't see and doesn't affect anything. The warning here is about the background operation that will impact your performance because you are over-utilizing the device. Let's move forward to the next one. Make sure we have time to cover everything. Sure, okay. I think if you have more questions we could discuss after the presentation. Let's move on to stream operation. Okay, stream operation; as somebody asked before, we'll cover endurance also with stream operation. With stream operation, the host can associate each write's data with a stream. In our implementation in the
Starting point is 00:18:22 SCSI standard, we are using stream ID. And then stream ID you could think is a kind of a tag of data. Based on that tag, a device places all data associated with a stream in a physically associated location. You might think erase block as one example of physically associated locations. Also, device is expecting that all associated data with stream is expected to be invalidated at the same time. All of them would be trimmed together or unmapped. You might think a very simple example is that you could think about one big file. When you delete the file, all data will be invalidated at the same time. So you could use A stream for a file. That's one use case but there is much more than that. Why do we need a stream operation?
Starting point is 00:19:28 We'll see an example in the next slide, but as you know, when data with different lifetimes is intermixed in one erase block, there is high garbage collection overhead. When you run garbage collection, you need to read the valid pages and copy them to a new erase block. Because of that operation, if data with different lifetimes is mixed in one erase block, you need to copy more. It also causes high write amplification. The write amplification

Starting point is 00:20:11 factor is the amount of data written to the physical NAND divided by the amount of data written by the host. That's the WAF, the write amplification factor. With stream operation we can achieve high system performance and also better device endurance, meaning that we will have a better SSD lifetime.
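As a quick worked example, with illustrative numbers rather than the presenters' measurements: if garbage collection relocates two still-valid pages for every page of new host data, the device ends up writing 3 GB to NAND for every 1 GB the host sends:

```latex
\mathrm{WAF} = \frac{\text{data written to NAND}}{\text{data written by host}}
             = \frac{3\ \mathrm{GB}}{1\ \mathrm{GB}} = 3
```

A WAF of 3 means the flash wears roughly three times faster than the host's write volume alone would imply, which is why lowering WAF and improving endurance are the same goal here.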
Starting point is 00:20:48 Let's see one example. Let's assume that there are three different kinds of data. As you see here, we could assume that there is virtual machine A data, database data, and another virtual machine's data. Also, let us assume that the blue rectangle is an erase block inside the SSD. Without stream operation, data is written in the order it arrives at the device. So in this case, as you see, in the first erase block there are three different kinds of data: virtual machine A data, database data, and virtual machine C data.
Starting point is 00:21:32 Let us assume that the host deleted virtual machine C, because it was migrated to another server, or the virtual machine is not needed anymore. In this case, when garbage collection occurs in the first erase block, you need to copy the A1, D1, and D2 data. That causes additional data writes to the physical NAND. But with stream operation, data is grouped according to its stream. As I said, let us assume that virtual machine A has stream ID 1, the database has stream ID 2, and virtual machine C's data has stream ID 3. Based on that, inside the device, the device firmware will allocate separate erase blocks and put data with the same stream ID into the same erase block.
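Here is a toy Python model of the two placement policies just described, with made-up page and block counts; the numbers only illustrate the mechanism, not Samsung's firmware behavior. The deletion scenario it computes is the one the talk walks through next.

```python
# Count pages that GC must relocate when stream "C" (VM C) is invalidated.
def gc_copies(blocks, dead):
    # GC reclaims only blocks containing invalidated pages; any still-valid
    # pages in those blocks must first be copied to a fresh erase block.
    return sum(sum(1 for page in b if page != dead) for b in blocks if dead in b)

PAGES_PER_BLOCK = 4
writes = ["A", "D", "C"] * 4   # VM A, database, VM C writes, interleaved

# Without streams: pages fill erase blocks in arrival order, mixing lifetimes.
mixed = [writes[i:i + PAGES_PER_BLOCK]
         for i in range(0, len(writes), PAGES_PER_BLOCK)]

# With streams: each stream ID gets its own erase block(s).
grouped = [[s] * PAGES_PER_BLOCK for s in ("A", "D", "C")]

print(gc_copies(mixed, dead="C"))    # 8: every block holds live A/D pages to move
print(gc_copies(grouped, dead="C"))  # 0: the "C" block is erased in place
```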
Starting point is 00:22:39 Let's go over the same situation: the host deletes the virtual machine C data and garbage collection occurs. In that case, all data in the third erase block will be invalid. So when garbage collection occurs, nothing has to be copied to another erase block. So we reduce the additional data writes to a new erase block. In this way we achieve lower write amplification, meaning that we get better SSD endurance and lifetime. We also measured the benefit of stream operation. We mimicked data with four different lifetimes, such as 1x, 10x, 33x, and 55x lifetimes. We did 100% 128 kilobyte writes with FIO. As you see in the write throughput, the legacy case, meaning without multi-stream, without stream operation, shows some fluctuation, and the performance is very low. By the way, we measured the performance in the steady state; before we start a measurement, we make the device dirty.
Starting point is 00:24:09 And with stream operation, that is, the multi-stream case, you get consistently high performance numbers. More importantly, if you look at the WAF, the write amplification factor: without stream operation, the WAF is almost 3, but with stream operation, we could achieve a WAF of 1. So based on our measurement, we could see 3x SSD endurance and 3x SSD lifetime. This is stream operation. So, do you have any other questions? We could spend several minutes on this one.
Starting point is 00:24:55 If you have any, or if it is clear, then I'll move to the next one. Okay, it looks like it's very intuitive, so let's move on to object storage. Object storage uses a key value store storage model instead of a block storage model. But as you know, most current key value store implementations are done by, mostly maintained on, the host side. The host does the key value mapping to physical locations in host memory. It also maintains the mapping information inside host memory. Because of that, the host needs a large memory footprint as the key value store size increases. Why do we need object storage? As I explained just before, all translation from key value
Starting point is 00:26:11 to the block storage protocol occurs on the host side, meaning that it uses host computing cycles and host memory. There is also a double logging issue: when you write data, the host logs first, and inside the device they do logging also. So double logging is another overhead of the current implementation. Another big issue is initialization time. As I said, the host needs to maintain the key value mapping information, so at initialization time it needs to retrieve all the key value mapping information from the device. This is a
Starting point is 00:27:05 kind of a big overhead for the current implementation. With object storage, we can reduce the host's burden, including host computing power and the host memory footprint for key value mapping. Let's compare object storage and the current implementation, legacy mode. Okay, as you see here, all mapping and all data logging start on the host side. So the host needs to run the mapping and data logging, and it also periodically needs to flush the mapping information from memory to the device, because host memory does not have any persistency. If there is a power failure, the mapping information is lost. Because of that, the host flushes the key value mapping to the device periodically.
Starting point is 00:28:20 And the device just stores the data and also maintains some metadata logging, and it stores a map as well, because the host maps from key value to LBA but the device maps LBA to physical page number, so there is double mapping in this case too. But if we move to the object storage model, all the host needs to do is transfer the data to the device. Then the device will take care of everything, including the mapping, logging, and metadata maintenance.
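A minimal sketch of that contrast, with invented class and method names (the standardized key value API was still being defined at the time of this talk):

```python
# Legacy mode: the host owns the key->LBA map and must persist it itself;
# the device still keeps its own LBA->physical-page map underneath.
class RamBlockDev:
    """Stand-in block device so the example runs."""
    def __init__(self): self.blocks = {}
    def write(self, lba, data): self.blocks[lba] = data
    def read(self, lba): return self.blocks[lba]

class HostSideKV:
    def __init__(self, block_dev):
        self.dev, self.key_map, self.next_lba = block_dev, {}, 0
    def put(self, key, value):
        self.key_map[key] = self.next_lba      # host memory grows with key count
        self.dev.write(self.next_lba, value)   # host-side logging/flush omitted
        self.next_lba += 1
    def get(self, key):
        return self.dev.read(self.key_map[key])  # host translates key -> LBA

# Object storage mode: the host just ships key and value; mapping, logging,
# and metadata maintenance all live behind the device interface.
class KVDevice:
    def __init__(self): self._store = {}
    def put(self, key, value): self._store[key] = value
    def get(self, key): return self._store[key]

legacy, obj = HostSideKV(RamBlockDev()), KVDevice()
legacy.put(b"user:42", b"alice")
obj.put(b"user:42", b"alice")
print(legacy.get(b"user:42"), obj.get(b"user:42"))   # b'alice' b'alice'
```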
Starting point is 00:29:54 So Bill will cover in-storage compute. So in-storage compute is a way to offload data computations from the host: you download an application to the device, and the device then does the processing. This allows the device to do any type of computing that the host wants. As I mentioned earlier, the device has a processor on it that is not fully utilized by the flash translation layer; it has compute cycles available for processing other activities. So this offloads the host from doing that processing. So why do we want in-storage compute? Right now, high I/O traffic is caused by reading all of the data that's necessary to perform a computation, performing the computation, and then writing the results. Unused device compute power and bandwidth can be put to use by moving this compute into the storage device. It also reduces the I/O traffic between the storage and the host, as I said before.
Starting point is 00:30:35 You're not going down and reading a large amount of data, then processing it. You tell the device what you want to compute on. The device then does that computation and returns only the data that the host is interested in. It reduces the host computing burden because you're taking advantage of that compute power on a distributed number of storage devices. It enhances the application system performance and power consumption, again, because you don't have to overpower your host compute.
Starting point is 00:31:07 You're utilizing the compute power that is spread out across all of your storage devices. So today, if the host wants to do a computation, whatever type of database you may be computing on, the host has to retrieve all of the data from the device or devices. This may be data that's stored across multiple devices. It's got to pull all the data in, perform its computation, and generate the results. With in-storage compute, what happens is the host requests a computation from the device. The device or devices, as I said before, distributed across
Starting point is 00:31:49 a number of devices, do the computation. It does the search, whatever computation you need done, to find just the data that the host wants. So instead of having a large volume of data that's returned up in this top picture of the current implementation, you have a much smaller piece of data that's returned up in this top picture of the current implementation, you have a much smaller piece of data that's returned, therefore reducing your burden on the IO bus, which with today's solid state storage is actually becoming more of the bottleneck
Starting point is 00:32:20 Yes? Do you see that working the same way with conditional computation? Yes. The logic in conditional computation can get fairly complex. So all that can be pushed down to the SSD?
Starting point is 00:32:40 Any of that that you can write an application for that can be distributed across a number of SSDs can be forced down to the SSD. If you were treating data from 100 SSDs, right, and doing computations based on 40 of those, like erase-basedbased computation or... Okay, so what you can do with that is you can push the computation down to some extent where you limit the amount of data that you're retrieving. But no, the computation is not going to move the data from one device to another or some subset of it and do that computation down there. But it's trying to limit the amount of data that's returned from the device.
Starting point is 00:33:26 I think it depends on what kind of computation you're considering, on the complexity of the computation. Yes. I think we've considered searches fairly easy, but now when you start adding in higher complexity... Yes. We're proposing that you have the ability, we're working on an API that would allow you to compile your own program, to have a common language between all vendors. So the next part of this presentation moves into the standardization process.
Starting point is 00:34:20 It is our goal to make this something that is available consistently from all vendors. But basically, yes, you would compile your own application and download that application to the storage device. But they're really small applications, right? The whole idea is to make a really tiny thing go into this. Yes. Yeah. So you'd have to bring down your application
Starting point is 00:34:44 with small parts of the disk and seal that up. But I'm more considering the complexity of what you would need to do if you were to use a filter that does not say. Because that means you actually have to either precompile or. This is, we're assuming that you have defined this as an SSD for a specific use case. So you are compiling an application, downloading it, and it's staying there.
Starting point is 00:35:15 It's not that we're suggesting that you create an application, put it on the device, run the application, then put some new application on. This is assuming that you have an application that you are using over and over and over with the device. And so you compile it once, you download it to all of your devices, and that way you can then make calls to that application
Starting point is 00:35:38 and have them perform that compute. Yes? To your point, small things, let's say that your application is constantly doing some pattern search. You know what a pattern is? It's a small pattern and you want to know how many times you should get it on your space and your SSD.
Starting point is 00:35:48 Just launch that over it, right there. You'll look for it. And that's a typical one that I know basically we use in some areas. Now, are we doing it like this? No. We're doing it like this. We're doing it like this. We're doing it like this.
Starting point is 00:35:56 We're doing it like this. We're doing it like this. We're doing it like this. We're doing it like this. We're doing it like this. We're doing it like this. We're doing it like this. We're doing it like this.
Starting point is 00:36:04 We're doing it like this. We're doing it like this. We're doing it like this. We're doing it like this. We're doing it like this. Launch that over right there. You'll look for it. And that's a typical one that I know is already used in some areas. Now, are we doing it like this? No, because the SST3 don't have that capability. This will help us a lot with that. It will save you a lot of cycles on the post. The post just
Starting point is 00:36:20 triggers it and, boom, you're going. Go for it. Yes, Tom. I was curious, is there a kind of library of functions that you can call, standardized for the framework of the compute? Our basic idea is that it is impossible to prepare all the libraries for everyone. So we want to provide a mechanism where any application developer could develop a library that could run inside the SSD. In that case, if you provide the interface or framework, they could download their library into the SSD, and then they could initiate that library running.
Starting point is 00:37:21 So you will create needs to do that, but also in some way, which is done as purely as something can't be. Correct. Yes. . Yes. This is a question more out of curiosity. When I was looking at the presentation, are you guys thinking, and probably I'm going way ahead of this already. Are you guys thinking of doing some in-band type of ability
Starting point is 00:37:46 to send, within the data, instructions to, hey, do this compute stuff while I'm sending this data to you? Or is that in future plans? What we're looking at right now in terms of our plans is that the data is already on the device, and you're sending a request down to the device to do some sort of computation on the data that already exists there.
Starting point is 00:38:12 JOHN LUTHER KING, JR.: Right. I saw that. I could see a little follow-up. . JOHN LUTHER KING, JR.: And there's definitely extensibility. . Yeah, it's an extensible.
Starting point is 00:38:21 OK, I've got ten minutes. I'm going to try to get through the standardization process because I think that's really important. That's something that people want to be aware of. I am, from Samsung's point of view, driving the standardization effort. Changho is driving the development effort. So currently this is standardized for SCSI.
Starting point is 00:38:43 It's documented in SCSI block commands for the background operation control and the stream operation. The last meeting as our developers are developing this we found some places that we had some things that we needed to clean up there. So at our last T10 meeting there were some corrections and enhancements that were approved so it's still we are actively developing products in this and making certain that the standards stay consistent with that product development within the SCSI area. Additionally there's a proposal being considered for SATA and there is an expected completion. My current
Starting point is 00:39:26 expectation is to complete this sometime in the December timeframe of this year. We've been through two iterations of discussion on that proposal and had really great inputs from the group. It will be something that is translatable through SCSI to ATA translation, so the two of those are compatible with each other. And thirdly, it has been approved as a work item by NVMe. It is not being discussed in the NVMe working group at the moment because they are very heavily devoted to NVMe over Fabrics at the moment.
Starting point is 00:40:05 However, there is a subgroup that is discussing this. We are meeting about once every three or four weeks as people have available time and the next meeting of that's actually going to be this next Friday. If you are a member of NVME and would like to get involved in those discussions because it's a separate distribution list that we're using for that, please contact me and let me know and I'll make sure you get on the distribution list. So the hope is to be able to bring this into the NVME technical working group in the November time frame. And my goal is to try to have this fairly clean before we get there in November so that by March we're at a point that we could be ratifying this for NVME. So the process for object storage and in-storage compute.
Starting point is 00:41:01 Object storage is being developed in the SNIA Object Drive TWG. We don't have the proposal in there yet for an API. The technical working group within SNIA currently, actually, let's see, okay, so the group currently is working on the third bullet down here, which is management of IP drives. There is an expectation to bring the in-storage compute as well as object storage into that group in the near future. The group is also monitoring, and has members who are participating in, the Kinetic Open Storage Foundation, trying to make certain that we don't do anything that
Starting point is 00:41:55 is in conflict but rather enhances what's done there so that we're not putting out two separate things for our customers, but at the same time we believe we have some things that will enhance and be better than what Kinetic Open Storage Foundation has in terms of a standardization process, so we will be bringing things in to do that. So for object storage and in storage compute, the requirements document is completed for both of those.
Starting point is 00:42:30 And for both of those, an API document is expected to be started in the near future. For the management of IP drives, the requirement document is well developed. And an outline of the standard was started in July in the object storage technical working group. Yes? Is the object storage interface supposed to be
Starting point is 00:42:55 based upon existing protocols, or will it be yet another object protocol? You're talking wire protocol, or the API at the library level? The library API itself. OSD 1 and OSD 2? It will be something different than OSD.
Starting point is 00:43:23 Yes? It actually has an ethernet interface drive that has a similar concept. Are you compatible with that in terms of capabilities, ether-facing? You're talking about the kinetic inter... No, kinetic does not give you the capability to derive HST tokens. It matches Ethernet drive. That's that. Are you talking about WD? HGST? You're talking HGST drives. So HGST is participating in the working group and will be bringing... will is participating in the working group and will be bringing, will be participating in the development of the standard. So I am not specifically
Starting point is 00:44:12 familiar with their interface, but I know the people involved in the group. I know that they will stand up for making certain that it is compatible. Why not OSD? OSD had some limitations; it has not moved forward in a large number of years, and we feel that there's a need for a new and different direction for object storage. I believe we think OSD is limited. Yeah. Yes.
Starting point is 00:45:05 Yes. To standardize object drives. My thought is that for object storage and in-storage compute, as I explained before, we want to make a standard framework. Yes, we could define the protocol, each storage protocol, but I think that part could be a next step for this one, because if you make a good framework as a standard, then I think we could set up some kind of protocol, a storage protocol, for that. But at least for now, I think we are focusing on the overall framework of in-storage compute. So, two additional comments and then I'll get to you. What's being done in SNIA, with the exception of the management of IP drives work, is more wire protocol agnostic.
Starting point is 00:46:30 So the framework should work over any wire protocol. Yes? to use the in-storage compute thing to implement a protocol? To implement what? To implement a protocol. So like, whatever, whatever. So that you could basically use the in-storage compute to install, say, a OSD interface. Yes.
Starting point is 00:47:02 You absolutely could. As a matter of fact, a picture that has been drawn within the object storage group that I don't have in this slide set actually shows the view of the fact that you have object drives at the same level in a stack, theoretically, as in-storage compute drives.
Starting point is 00:47:25 And a subsequent picture shows object drives could be a subset of in-storage compute where you could download your object drive code, whatever that is, into your in-storage compute engine. So if you were to combine an object and an in-storage compute, I don't think the drives have initiator capability. Correct. So from the standpoint of a drive, you do that in storage subsystems. You really don't do that on your drive per se. Excuse me, but you just push the key down into the drive. Isn't that what's done in storage? systems. You really don't do that on your drive per se. It is. I don't see that as being a natural and immediate outcome, but there's no reason it couldn't be. You know, I'm always saying, under the supervision of the storage system that would be monitoring
Starting point is 00:48:55 traffic, it would throw the code down there, give the drives parameters that would set limits, and then the drives would function as a federation; they would keep track of what objects were getting accessed, and they could move them about, not in a scripted way, but in more of an array or grid of drives. It's theoretically possible, but it does take a change to
Starting point is 00:49:17 the code on the drives to make them initiator capable. So on this slide you've got both my contact and Chang Ho's contact. If you're interested in this technology and you're at a place that you would like to partner and look at samples, et cetera, Chang Ho's the person to be getting in touch with. If you're interested in making certain that the standard goes the way you'd like it To or at least attempting to influence us
Starting point is 00:49:50 to, or at least attempting to influence us, please contact me and I will help you get involved in whichever of the standards you're interested in. So, thank you. Further questions? I'm willing to take them. We are at the end of our time, but since we're the last session, we can go on however long you like. Thanks for listening. If you have questions about the material presented in this podcast, be sure to join our developers mailing list by sending an email to developers-subscribe at snia.org. Here you can ask questions and discuss this topic further with your peers in the developer community. For additional information about the Storage Developer Conference, visit storagedeveloper.org.
