Storage Developer Conference - #188: Open Industry Storage Management with SNIA Swordfish™

Episode Date: April 17, 2023

...

Transcript
Starting point is 00:00:00 Hello, everybody. Mark Carlson here, SNIA Technical Council Co-Chair. Welcome to the SDC Podcast. Every week, the SDC Podcast presents important technical topics to the storage developer community. Each episode is hand-selected by the SNIA Technical Council from the presentations at our annual Storage Developer Conference. The link to the slides is available in the show notes at snia.org/podcasts. You are listening to SDC Podcast, episode number 188. I'd like to welcome you to the session. This is Open Industry Storage Management with SNIA Swordfish. The original presenter for this, Richelle Ahlvers, was unable to make it, so I had to step in at the last minute and go ahead and do the slides.
Starting point is 00:00:55 So remember, if you like this session, my name is Chris Lionetti. If you didn't like this session, Richelle Ahlvers. And with that, let's go ahead and start. This is the abstract directly cut from the agenda. You're all here now, so I assume that you've read this and you understand what we're going to be learning today. We're going to be learning a lot about the Swordfish API fundamentals. And again, you could read this on your own. No need to belabor the point on that one. Let's go ahead and jump into what is Redfish and Swordfish. DMTF Redfish covers a lot of different things.
Starting point is 00:01:32 It covers servers. It covers the data center. It covers basic fabric management. It's a protocol that's highly successful in the market. All the servers in the industry are using it. And it's a REST API method to manage everyone's servers. It uses JSON. There are YAML schemas out there.
Starting point is 00:01:52 It's a very well-defined standard. Swordfish is a small extension to the Redfish protocol that adds the ability to do the more custom things that need to happen for enterprise storage. For instance, all servers have chassis, they have drives, they have power supplies. Why would Swordfish need to reinvent any of that? We're going to use what's in Redfish already. Redfish did a fine job of defining all that, so we're going to hijack and use the exact same terminology, the exact same methods, the exact same YAML to do the exact same things.
Starting point is 00:02:25 The difference is that in Swordfish, we're going to add some concepts to deal with shared storage over Fibre Channel, shared storage in a bunch of different ways. And we're going to enhance it to be able to handle much larger things like storage pools that might have hundreds of volumes or namespaces carved out. So Swordfish is an extension to Redfish that covers a lot more use cases other than single storage or software-defined networking or software-defined storage. The goal here is that Swordfish will cover
Starting point is 00:02:57 object storage, file storage, as well as block storage. And again, this is something that's proven with Redfish in the sense that Redfish has already managed all the servers out there. Now we're adding Swordfish capability to manage more complex environments. And in fact, things like classes of service, lines of service, data protection lines of service,
Starting point is 00:03:21 those things are more easily covered in Swordfish, the extension to Redfish, than they are in the basic Redfish itself. NVMe, as an example, and NVMe-oF have a natural fit in the Swordfish world. So we're doing a lot of work to make sure that NVMe-oF is covered in Swordfish properly. And again, there's a very tight-knit group between DMTF and SNIA. In other words, even though the DMTF owns the drive part, if we find something that's wrong or missing or needs to be updated, we'll actually throw that over the wall,
Starting point is 00:03:53 and we've got people that are actually working in both groups. So there's a lot of commingling between those two groups to make sure these protocols stay in lockstep. So here's the basic Redfish hierarchy for Swordfish advanced storage. So you can see we've got the service root. We've got the collection resources. And in this case, I've got storage as a collection resource. And you'll notice that under storage, we've got a singleton. For instance, we could have three different arrays out there, or we have a single array. That single array may have a number of volumes.
Starting point is 00:04:25 It may have a number of pools. It may have a number of controllers. So you can see that there's a lot of different things you could put under this. You can almost think of this like a file system. You keep going deeper and deeper and deeper into the model. But again, notice that storage also has links down to chassis, because a storage array controller or a storage system is going to be using a chassis. But the chassis information is actually referenced under storage, your singleton under storage,
Starting point is 00:04:50 but it doesn't actually need to live there. It actually lives under chassis. So again, if you want to look at all your chassis, you can go to the service root, go to chassis, and start walking down the tree and find all your chassis. Each of those chassis will probably reference the storage system they're connected to or the server they're connected to. So you can see this model has a lot of dotted lines that actually let you cross from different domains. And again, storage is the one we care most about because we're in the enterprise Swordfish storage business.
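The "walk down the tree" idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the MOCKUP dictionary is a hypothetical, heavily trimmed stand-in for a real service, the resource names Array1 and Shelf1 are made up, and the helper names find_links, walk, and is_subordinate are not part of any Redfish or Swordfish library. It shows the two kinds of relationship the talk describes: a resource that lives under its parent's path, and a resource that is only referenced through a navigation link.

```python
# Hypothetical, heavily trimmed mockup. Real Redfish/Swordfish resources
# carry many more properties; Array1 and Shelf1 are made-up names.
MOCKUP = {
    "/redfish/v1": {
        "Storage": {"@odata.id": "/redfish/v1/Storage"},
        "Chassis": {"@odata.id": "/redfish/v1/Chassis"},
    },
    "/redfish/v1/Storage": {
        "Members": [{"@odata.id": "/redfish/v1/Storage/Array1"}],
    },
    "/redfish/v1/Storage/Array1": {
        "Volumes": {"@odata.id": "/redfish/v1/Storage/Array1/Volumes"},
        # The chassis is referenced here, but it lives under /Chassis.
        "Links": {"Enclosures": [{"@odata.id": "/redfish/v1/Chassis/Shelf1"}]},
    },
    "/redfish/v1/Storage/Array1/Volumes": {"Members": []},
    "/redfish/v1/Chassis": {
        "Members": [{"@odata.id": "/redfish/v1/Chassis/Shelf1"}],
    },
    "/redfish/v1/Chassis/Shelf1": {
        "Links": {"Storage": [{"@odata.id": "/redfish/v1/Storage/Array1"}]},
    },
}


def find_links(node):
    """Yield every @odata.id value found anywhere inside a resource body."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "@odata.id" and isinstance(value, str):
                yield value
            else:
                yield from find_links(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_links(item)


def walk(fetch, root="/redfish/v1"):
    """Breadth-first walk from the service root; each resource is visited once."""
    seen, queue = [], [root]
    while queue:
        path = queue.pop(0)
        if path in seen:
            continue  # cross-references make this a graph, so guard against loops
        seen.append(path)
        queue.extend(find_links(fetch(path)))
    return seen


def is_subordinate(parent, child):
    """True if child lives under parent's path rather than being merely referenced."""
    return child.startswith(parent.rstrip("/") + "/")


if __name__ == "__main__":
    for path in walk(MOCKUP.__getitem__):
        print(path)
```

Against a live service, the fetch argument would be replaced by a function that performs an HTTP GET and parses the JSON body; the walk itself would stay the same.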
Starting point is 00:05:19 But there's also a major collection resource called servers or systems. So systems might handle an HP ProLiant, a Dell PowerEdge, various servers in the market might be under servers, but storage is where we like to think that Swordfish belongs. So Swordfish configurations, here's an example of what a simple external array looks like. Now, I say simple because there's a lot of options. Don't get bogged down in all the options. This is what a standard array might look like.
Starting point is 00:05:55 You've got a subsystem, you've got a service root, you've got a chassis, you've got storage. That storage has snapshots, it has volumes, it has drives, but you can see where everything lives underneath. And you can kind of get the ownership of all these things. Now, again, the mockups that we've created for a simple storage array don't necessarily have all the things on the left-hand side there. The telemetry service, the update service, the event services. But a real implementation likely would. And, in fact, if you want to see how to implement event services, that's already in the default Redfish definitions.
Starting point is 00:06:30 Easy to do. If you want to be able to do session services, you can actually implement tokens so that people can log into your system using a username and password, get a token back, and then all future commands are done with that token. So you have user control. That would be done in session services.
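The login-then-token flow just described can be sketched as follows. In Redfish, a session is created by POSTing a UserName and Password to the Sessions collection under the session service, and the service returns the token in an X-Auth-Token response header. Everything else here is an assumption for illustration: the host name array.example, the credentials, and the helper names build_login_request and auth_headers are hypothetical, and the sketch only builds the request rather than sending it.

```python
import json
import urllib.request


def build_login_request(base_url, username, password):
    """Build (but do not send) the POST that creates a Redfish session."""
    body = json.dumps({"UserName": username, "Password": password}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/redfish/v1/SessionService/Sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def auth_headers(login_response_headers):
    """Turn the login response headers into headers for all later requests."""
    # HTTP header names are case-insensitive, so normalise before looking up.
    lowered = {name.lower(): value for name, value in login_response_headers.items()}
    return {"X-Auth-Token": lowered["x-auth-token"]}


if __name__ == "__main__":
    # array.example and the credentials are placeholders, not a real service.
    request = build_login_request("https://array.example", "admin", "secret")
    print(request.get_method(), request.full_url)
```

Once the service replies, passing its response headers to auth_headers gives the header set to attach to every subsequent GET, PATCH, or POST, which is the "all future commands are done with that token" part of the description.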
Starting point is 00:06:44 But a lot of these mockups don't have that stuff implemented directly because the mockup is really trying to show you what's different about Swordfish, not necessarily all the things that are the same because 90% of the stuff in Swordfish is the same as Redfish. Now, if we want to go to a simple external array, we're going to add a little
Starting point is 00:07:05 mapping. Now, what I mean by that is you'll notice the difference between this one and this one. I've added fabrics. Because now, instead of going with an array that lives inside of a server or a dedicated array, I've now gone to an external array. And the thing about an external array is I need the ability to map a volume to a host. So in that case, I need to be able to implement fabrics so I can have endpoints. An endpoint lives in a fabric, and an endpoint represents an iSCSI initiator, a Fibre Channel initiator, RDMA. It represents some method of getting to that storage. In fact, you'll see there's something called connections where I can actually map a volume to a host. So in the old SCSI world,
Starting point is 00:07:46 you'd think of this as the SCSI initiator target nexus. That's what connections is. But connection lives under fabric because it's a term that doesn't necessarily need to live inside the array itself. It's a theoretical concept that lives in the fabric. Is this making sense? Any questions so far? Sure. So if I've got an external array that needs to have mapping so I can map a specific host
Starting point is 00:08:19 to a specific array volume, I would define fabrics. Now, this implementation, under the service root, would have storage, where it would put all of its inward-facing stuff, but it would also have a fabric folder, and that fabric folder would contain all the endpoints that you want to register with that storage array. And those endpoints could then be mapped. Now, fabric, though, is different, because while the mappings, the connections, are configured on the array, you also could implement a switch using this model, and we're trying to add switching in the future. When you do switching,
Starting point is 00:08:58 zoning would be under the fabrics. So there's different things. And again, the model is designed to be extensible. Is fabrics part of the small core of Redfish? No, fabrics does not exist normally in the Redfish model directly, because Redfish is for servers. It doesn't handle external mounting. If you want to go external mounting, you're going to go with Swordfish because then you're going to add fabrics. And in fact, if you're doing a PCI RAID controller inside of a server and you're not sharing that, you just want to give that right to the host, you would implement it under storage,
Starting point is 00:09:39 but you would never need a fabrics. If on the other hand, you want to be able to map that initiator group to that volume and be able to see it from outside in the real world across a fabric, that's how you do it. What do you mean? I'm sorry, I didn't hear that. Yes. Yes. Model's the same.
Starting point is 00:10:27 Because when you're talking to an array, when you talk to an array and say, I want to talk about this fabric that happens to have this endpoint, and you go to the Ethernet switch, you should be looking at a similar fabric. Yeah, the connection is the sub-add of the fabric. I remember it's an independent object in the progress model. Right.
Starting point is 00:10:48 And a connection on a fabric, if I'm talking to an Ethernet switch, might be the zoning to connect an array to a host. But if I'm talking to the array, the connection may be to connect a host to a volume using LUN masking, as an example. And namespaces is something that throws me for a loop. This is completely different, because in NVMe, in NVMe-oF,
Starting point is 00:11:16 you've got a concept called namespaces, which aren't exactly like volumes. They're a different thing entirely. Because each NVMe is its own controller and can be shaved up in different ways. And it presents its own namespace. And each one of those drives, each one of those NVMe devices, is actually going to be its own storage controller. And every time you add a new host to it and create a new namespace,
Starting point is 00:11:41 you're creating a new controller. Because controllers in the NVMe world are ephemeral. So they can appear, disappear, you can have hundreds of them. So storage controllers for NVMe are used a little differently, although it's the same concept. You can have storage controllers on a simple array, you can have storage controllers in NVMe as well. We also have an opaque array. We have an NVMe front end with a SAS back end. So you can see there's a lot of different options here. We tried to think of all the different options
Starting point is 00:12:17 to make this flexible so that you don't run into a place where you don't have an idea where to go. So in here, we may have a shelf that's Ethernet attached, has an Ethernet to SAS bridge of some sort, that then goes out and talks to SAS devices, and those SAS devices may be NVMe devices, they may be NVMe drives,
Starting point is 00:12:41 they may be regular drives, who knows? It could be an EBOF. Yes? So how do you keep this? We were talking about that before you got here. How do you keep one? You have one. For instance, you have one network on the backside. You have one endpoint where everything's contained in it.
Starting point is 00:12:57 How do you do that? Right. You could do mapping in such a way that only a certain host sees a certain Ethernet port. Or you could, there's a lot of different ways to map it. Because each controller that pops up could decide what you see and what you don't see based on your initiator. Based on who's asking. Because a storage controller... Sorry?
Starting point is 00:13:27 Yes. Yep. The storage controller can let you through or it can block you. And I can take a drive and shave it up three different ways and hand three different storage controllers to three different servers.
Starting point is 00:13:41 Each of them will see its storage controller, but the other storage controllers won't respond to it. So NVMe is the big expansion we've been pushing. We've had a lot of meetings about NVMe to make sure we do this right. And unfortunately, there's a lot of different ways to do NVMe. We've tried to make sure that the mockups show you the most common ways that are out there. So the mockups can give you an idea of what's possible, but you can go away from mockups and go directly to the YAML or the JSON to figure out exactly how you want to implement your devices. There's a lot of different ways to skin this cat.
Starting point is 00:14:32 Solve this problem. I shouldn't use terms like that because they sometimes don't translate properly. There's a lot of different ways to solve the problem of NVMe. Because we have EBOFs, we have controller-based, we have a lot of different ways to do this. And we want to show you that there are multiple different ways to solve that same problem.
Starting point is 00:14:55 But none of it breaks the model. The models continue to operate. So are you running two Swordfish databases, one inside the other? Oh, what do you mean? Let's say you were just talking about it, where you have a Redfish database for CXL. Sure.
Starting point is 00:15:19 And then you have a link to that by the internet or through the app. So that endpoint? And then they can fade that over to your main... You theoretically could. As an example, you could have... Let's say
Starting point is 00:15:37 you have a server, and it's Redfish enabled, and you're going to run some service on top of it. You could have that service actually be the trap door to get to the Redfish on the baseboard, while at the same time adding its own folders to it as well. Okay. So you're talking about only kind of scenes.
Starting point is 00:15:56 Yep. Yep. And the other thing is, you can actually reference things in the Redfish model, because you can say that storage systems, systems, and there's an OData ID where you can reference where it points to. It doesn't have to point to the same system you're on. We do this for replication. So if I have an array and one of the features I can do is I can say replicate this volume to another array. And the OData ID for the target of that replication is a completely different system on the other side of the world.
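The point that an OData ID does not have to point at the system you are on can be illustrated with a small check. This is a sketch under stated assumptions: it treats a reference that carries its own host as "remote" and a bare path as "local", the host names array-a.example and array-b.example are made up, and is_remote_reference is a hypothetical helper, not part of any Redfish or Swordfish library. The exact property that carries a replication target is not named in the talk, so none is assumed here.

```python
from urllib.parse import urlsplit


def is_remote_reference(odata_id, local_host):
    """Return True when an @odata.id names a resource on a different host.

    A bare path such as /redfish/v1/... has no host part and is treated as local.
    """
    parts = urlsplit(odata_id)
    if not parts.netloc:
        return False
    # urlsplit lower-cases the hostname, so compare against a lower-cased local name.
    return parts.hostname != local_host.lower()


if __name__ == "__main__":
    # Both host names are placeholders for illustration only.
    local = "array-a.example"
    for ref in (
        "/redfish/v1/Storage/Array1/Volumes/1",
        "https://array-b.example/redfish/v1/Storage/Array9/Volumes/7",
    ):
        print(ref, "remote" if is_remote_reference(ref, local) else "local")
```

A client that follows links would use a check like this to decide whether it needs a separate connection, and separate credentials, for the system on the other end.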
Starting point is 00:16:28 So you can reference out of the machine you're in as well. So the Redfish specs, you can see that we've been working very hard on them. In fact, we keep updating them. About every six months, you're going to see a new version. We've got 1.2.3, 1.2.4, 1.2.5 is being worked on right now. So this is a very vibrant infrastructure right now. We've got a lot of people working on these calls. So there's a lot of different vendors that are represented on these calls to try and make sure that we can cover
Starting point is 00:17:00 things properly. Now, I should note also that there is a conformance testing program available as well. So if you're a member of the SNIA labs, or I'm sorry, the SMI Innovation Lab, and you want to put your piece of hardware in the SNIA lab or in the Innovation Lab where other people can hit it or not, you can choose not to. But the point is, you get the conformance testing program with that for free. So you can run that thing all day long if you want. And the conformance testing program will help you to avoid straying
Starting point is 00:17:32 from the approved documentation. It will check your syntax, it will check all kinds of things to make sure that you've got a compliant implementation. That's a resource you should definitely be looking at, is hosting equipment in the SNIA lab, in the SNIA Innovation Lab, and running the CTP on a regular basis. And I built a simulator and put it in there,
Starting point is 00:17:55 and it found 300 different things. I spelled OData with a capital O at one point, and it trailed all the way through my code, and I had to fix that in 300 different places. But it caught those kind of things, things I wouldn't have thought of. The conformance testing program is very useful for that. Now, these mockups are all available
Starting point is 00:18:19 via swordfishmockups.com, and you'll see there's a whole bunch of different mockups in there that you can actually open up, and you can walk down the trees and see how everything's laid out and everything's operating. Or you could open up the Redfish schema guide or the Swordfish schema guide. Both are available as well, and you could walk through how those are laid out as well. I recommend looking at the schema guides instead of the JSON files. If you're comfortable looking at JSON, that's great, but there's a lot of options in the JSON files
Starting point is 00:18:45 that you probably won't implement. And the schema guide really puts it into an English language kind of thing. It makes it a lot easier to consume. And again, NVMe functionality. We've got mapped NVMe objects that exist in the Redfish and Swordfish models. We've got subsystems. We've got a controller. And when you build a controller,
Starting point is 00:19:06 you have the discovery controller, the admin controller, the IO controller. So these things are ephemeral, but they pop up, and they need to be able to be created automatically. You have namespaces, endurance groups, NVM sets, so you can actually set a three-pack of drives together
Starting point is 00:19:22 so that they actually roll together with their endurance groups, and NVMe domains. So there's a lot of things we've added in to try and make this a more reasonable implementation. And again, here's the Redfish and Swordfish and what gets added for NVMe. All the different options. Now again, you guys are going to get a copy of these decks, are you not? You should be able to download these videos as well. So I can go into detail in here, but there's a lot of material. And I definitely suggest zooming in and playing around with these models. And common usage. When we say storage, what we really mean is a subsystem.
Starting point is 00:20:10 Whether that's a classic array, it's an NVMe EBOF shelf, an NVMe array, whatever it is, storage means subsystem. But storage in a Redfish model could mean PCI RAID controller inside your server. Or it could mean a SAS controller or a SAS HBA. So storage can mean a lot of things to a lot of different people. And we're using the words volume and namespace interchangeably. I'm old school.
Starting point is 00:20:36 I kind of like using the word LUN, but it's really not as valid anymore. And under chassis, you're going to find all your thermals. You're going to find your power. You're going to find your drives. There's a lot of things under chassis that are really good about exposing the real-world intricacies. You can have an array that has 12 shelves
Starting point is 00:20:57 and 14 power supplies, and all that can be displayed there, including the redundancy levels of the power supplies, everything in there. Now, scalability is a big concern, and that's the idea behind going RESTful. One of the problems with SMI-S was that SMI-S was not necessarily scalable. Because once you got into tens of thousands of objects, it fell down, because the entire database for SMI-S was stateful, which meant you usually had to run a Pegasus server or some other server that actually talked to a captive database. So it was very common for people to implement SMI-S as an off-array
Starting point is 00:21:38 proxy agent, because you'd need a VM that has usually two or four processors with usually eight gigs of RAM just to be able to house the SMI-S proxy agent. But Swordfish and Redfish are designed to be run on very little hardware, and very little state is actually saved. So the whole RESTful concept here is that we don't store as much state. Most of these commands are designed to be implemented directly against the hardware, not where you have a database storing this information. If somebody asks for the status of a drive, you don't pull it from your database. You look at the drive right now and give the answer back to them.
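The "look at the drive right now" behaviour can be contrasted with a cached design in a short sketch. This is an illustration only: PassThroughDriveService and its probe callable are hypothetical names, the probe stands in for whatever the platform really uses to query hardware, and the returned dictionary is a trimmed, Redfish-style body rather than a complete resource.

```python
class PassThroughDriveService:
    """Answers each request by asking the hardware at that moment; nothing is cached."""

    def __init__(self, probe):
        # probe is a hypothetical callable: drive_id -> health string such as "OK".
        self._probe = probe

    def get_drive(self, drive_id):
        """Build a trimmed, Redfish-style drive body from a fresh hardware query."""
        return {"Id": drive_id, "Status": {"Health": self._probe(drive_id)}}


if __name__ == "__main__":
    # A dictionary stands in for the real hardware in this sketch.
    hardware = {"Drive1": "OK"}
    service = PassThroughDriveService(lambda drive_id: hardware[drive_id])
    print(service.get_drive("Drive1"))
    hardware["Drive1"] = "Critical"  # the drive degrades between two requests
    print(service.get_drive("Drive1"))
```

Because the service holds no copy of the state, the second request reflects the change immediately, and if the hardware is unreachable there is nothing stale to serve, which is the trade-off against a stateful agent that the talk goes on to describe.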
Starting point is 00:22:18 So in a case of where you want to store that information, what would you be storing? Oh, configuration data. Yeah, generally that's stored on the array already because the array's got a database that has that information. But the Redfish, Swordfish implementation isn't necessarily the owner of that data. It just queries the data.
Starting point is 00:22:46 It's a pass-through thing. Your array is already storing that data. If your array were to go down, Redfish and Swordfish would give you nothing if it was running off array. SMI-S, on the other hand, was designed where if the array went down, you still could query that SMI-S agent all day long
Starting point is 00:23:04 and get all kinds of data about the array because it stored the entire state of that array in the SMI-S agent. This, on the other hand, is tell me now what's going on. And to back that up, Redfish is commonly deployed on very low-end processors, you know, baseboard management control processors, like iLO, DRAC, BMC. Those kind of things are not heavy-duty machines. And they're able to serve up Swordfish and Redfish, well, they're able to serve up Swordfish without a problem. And
Starting point is 00:23:36 Redfish is not much bigger, because most of it's the same. If you're asking for how many drives do I have, you could be doing that on your BMC using SCSI Enclosure Services. You could be using I2C. You could be doing a lot of different things. Sorry?
Starting point is 00:24:16 Yeah, but you're also doing REST calls, which are a lot more lightweight than CIM-XML. So querying a database as opposed to just making the request, sure, it's a little slower. The request is slower, but the end result is actually faster because the Redfish and Swordfish BMCs respond quickly. It responds quickly.
Starting point is 00:24:42 It depends on how fast you query the device, right? Yep. Because remember, if you've got an array, the array is keeping status of all its stuff. It's already got the database. It's already got the configuration. It's already got all that stuff. Your Redfish is just a hole into it.
Starting point is 00:25:01 It's just a management interface to it. Depends on your BMC. You could, you don't have to. There's no reason to. And the other question is, how often are you running these management commands as opposed to running workload? Yeah, yeah. I mean, are you going to be doing 10,000 changes
Starting point is 00:25:36 to a machine in every hour? Or is this kind of... Yeah, exactly. The idea is that this is for configuration monitoring, that kind of thing. It's not necessarily, you're not running workload through this. You're only running management load. This is your, quote, out-of-band management.
Starting point is 00:26:00 And again, we have our logical NVMe-oF, our exported logical subsystem. So you asked about this a little while ago, about how to export an external logical subsystem where you've got a master system and then multiple subs. And if you look at swordfishmockups.com, there are a lot of sample NVMe instances out there to show you what this thing looks like. I know that looks hairy, but realize that most of that model, chassis, drives,
Starting point is 00:26:44 systems, a lot of that model is already defined in Redfish and well defined and well used. You're adding things like controllers and storage but you can see what the relationships are between things. And again, a solid arrow is subordinate. A dotted arrow is
Starting point is 00:27:00 referenced. So you have a navigation link to jump from one to the other. It looks complex. It's really not that bad. And for external connectivity, you've got your storage collections. And again, under storage, you could have a singleton array there.
Starting point is 00:27:38 You could have duals. You could have five different shelves, each counting as a singleton with their own models under those. But at the same time, they could share fabrics. So you only have to identify an initiator or an endpoint once. And again, there's a lot of models here, and all of this is out in the documentation. Now, you notice also that we have a partnership with the OpenFabrics Alliance,
Starting point is 00:28:23 so there's places where that ties into them as well. And where that framework lives. And these models show exactly how to do fabric connectivity for an EBOF. Ethernet bunch of flash, right? And of course, there's a lot more information out there. We try to make sure that the Swordfish mockups are out there, the Swordfish forum is out there. There's the Redfish and Swordfish standards. There's OFA, the OpenFabrics Alliance, and the NVMe specs as well.
Starting point is 00:29:22 So we try to make sure all of this is available. And that's the deck. I've actually got no more to add to this. Any questions? What process is there to get the new properties or specs from the NVMe? I'm sorry, repeat that? On the Solvig. So how can I add this back? I'm probably going to have to refer that over to Richelle at some point. And I'll try and get her in on some of these questions as well. Another question is,
Starting point is 00:30:17 is the TCG, the Trusted Computing Group, part of the Solvig? I'm sorry? The TCG, the Trusted Computing Group? No. We do have members in common that we relate to, that we try and keep informed of what we're doing, but they're not joined in that way. Thanks for listening.
Starting point is 00:30:37 If you have questions about the material presented in this podcast, be sure to join our developers mailing list by sending an email to developers-subscribe@snia.org. Here you can ask questions and discuss this topic further with your peers in the storage developer community. For additional information about the Storage Developer Conference, visit www.storagedeveloper.org.
