Storage Developer Conference - #23: Overview of Swordfish: Scalable Storage Management

Episode Date: October 20, 2016

...

Transcript
Starting point is 00:00:00 Hello, everybody. Mark Carlson here, SNIA Technical Council Chair. Welcome to the SDC Podcast. Every week, the SDC Podcast presents important technical topics to the developer community. Each episode is hand-selected by the SNIA Technical Council from the presentations at our annual Storage Developer Conference. The link to the slides is available in the show notes at snia.org slash podcast. You are listening to SDC Podcast Episode 23. Today we hear from Richelle Ahlvers, Principal Storage Management Architect with Broadcom, as she presents Overview of Swordfish, Scalable Storage Management, from the 2016 Storage Developer Conference.
Starting point is 00:00:49 What is Swordfish? How many people in here have heard of Swordfish before today or before this week? Wow, that's, okay, Marali, you put your hand down. How many people not involved in developing Swordfish have heard of Swordfish before this week? Okay, that's better. All right, so what we're going to cover here a little bit is talking about where Swordfish came from, what's the genesis for it, and then a little bit overview of kind of what the concepts are, how we've developed it, and who's involved in building it. And then, obviously, questions.
Starting point is 00:01:27 So, obviously, you guys all read this already, right? And as Mark just said, SNIA's Scalable Storage Management Technical Work Group has been developing the Swordfish specification. We actually announced the release on Monday.
Starting point is 00:01:49 Sorry, a tiny little bit of housekeeping here. One of the things that you get an opportunity to do for sitting in and listening to this presentation today is a chance to register to win a Phantom 3 drone. And if I just perked up and got really animated at the thought of a Phantom 3 drone, it's because I have one of these, and they're awesome. So I'm a true nerd. So Don and Mark and Arnold are passing out what you need to do to register for these.
Starting point is 00:02:20 It's going to be an online registration, so no need to be present to win. Your presence here is your present to win part. So if you don't have a ticket, wave your hand and make sure you get your ticket, and then you go online and register to win. But they're really, really fun. You do have to register them with the FAA, though. You don't need to register with the Future Farmers of America, though.
Starting point is 00:02:48 Anyway, so back to our regularly scheduled programming. So what Swordfish is: the technical work group was formed about nine months ago to start and define the Swordfish specification, and we released version 1.0 on Monday. And so I'll repeat that again for you. We formed this TWG nine months ago and released the spec on Monday. So let's kind of go through the history of that. One key thing, disclaimer slide,
Starting point is 00:03:27 I'm sure everyone has seen a variant of this disclaimer slide before. But note on here, snia.org slash swordfish. So if you don't remember anything else, remember snia.org slash swordfish. So that's where you can go to find any and all pertinent information from today and moving forward. Okay. So what are the drivers for Swordfish?
Starting point is 00:03:58 So over the last several years, we've obviously had storage management standards for quite a while. I'm sure a lot of you are familiar with SMI-S. And that's actually been a really good standard for managing enterprise-class storage. It's been extremely widely adopted. But there have obviously been some things we needed to work on for that. We've had a lot of feedback from customers and vendors alike on things that we need to think about in terms of not just SMI-S, but storage management standards in general. So things like, you know, make them easier to implement and consume. So from the implementation side, you know, the technologies we picked require a lot of, you know, very specific knowledge. They're not used broadly.
Starting point is 00:04:54 And that's true for both the implementation side and the consumption side. It's got a high learning curve. So, you know, what can we do to simplify that? Improve access efficiency. So one of the things that we've done in the past is we've basically defined standards largely driven by vendors. And by vendors, I mean the people that build storage hardware and storage infrastructure, more so than clients.
Starting point is 00:05:21 And the clients are, you know, the people that use it, right? Either the storage applications or the direct end users. And so when you build things from the vendor perspective, what you get is look at all my neat knobs and widgets and look at every feature that I built. And so you don't think about, you know, necessarily I want to look at this particular attribute of this system a thousand times. What we've gotten is a lot of feedback over the years to say, I want to look at this particular attribute. We've had to do a lot of refinement. We've done that in the current systems,
Starting point is 00:05:57 but what we need to do is actually when you're doing a new standard, build that in from scratch. That's where improving access efficiency comes in. The third and fifth ones kind of go together. Providing useful access via standard browser and using standard tools, right? It's very hard in not just SMI-S, but in a lot of other standards, legacy standards, particularly even in the server space, to use them directly. So if you look at a lot of the tools that DevOps people use directly in their day-to-day work, you can't actually interact with those standards directly from the tools. And so those are a lot of the feedback we've gotten over the years.
Starting point is 00:06:44 Another thing is just a shift in where storage is deployed. You know, five, ten years ago, you had DAS, you had SAN, but what we've seen is a rapid transition to including converged, hyper-converged, hyperscale environments, and the standards really haven't kept up with that. So those are a lot of the areas of feedback we've gotten to say, what can we do to expand the standards to those? So there's a couple of different ways you could do that.
Starting point is 00:07:11 One is you could evolve the existing standards. The other one is start from scratch, build something to address that. And so we basically took the latter approach. So with Swordfish, we basically said, okay, taking all of those things into account, as well as, you know, what are we, you know, what else can we do? I said start from scratch, but we didn't really start from scratch, because we couldn't start from scratch completely and turn a new standard in nine months. I've already talked about us, you know, basically starting nine months ago and releasing a spec today. What we did instead was steal liberally from a bunch of different sources. And one of those sources was a spec called Redfish. So some of you may have heard of Redfish. How many of you have heard? Okay, so what Redfish is, it's actually a spec that the DMTF organization put out.
Starting point is 00:08:09 A lot of the same vendors involved in developing Swordfish were involved in developing Redfish. Redfish is a standard that DMTF came out with to do very similar objectives in the server management space initially, and then a plan to extend it. It initially came out a year ago, almost a year ago right now, with base functionality in replacing IPMI for BMCs. That's the level of functionality it came out with. But the base protocol infrastructure is RESTful with JSON data transports and including OData metadata as well. So we were able to basically take all of that, and I'll show you a little bit more about all of that.
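To make "RESTful with JSON and OData metadata" a little more concrete, here is a minimal sketch of what a client might do with one resource. The payload is invented for illustration; only the `@odata.` annotation prefix and the `/redfish/v1` path convention come from Redfish itself, and `split_resource` is a hypothetical helper name.

```python
import json

# Hypothetical payload for illustration; the property values are made up.
payload = """
{
  "@odata.id": "/redfish/v1/Systems/1",
  "@odata.type": "#ComputerSystem.v1_0_0.ComputerSystem",
  "Id": "1",
  "Name": "Example system"
}
"""

def split_resource(text):
    """Separate OData annotations from the ordinary JSON properties."""
    resource = json.loads(text)
    metadata = {k: v for k, v in resource.items() if k.startswith("@odata.")}
    properties = {k: v for k, v in resource.items() if not k.startswith("@odata.")}
    return metadata, properties

metadata, properties = split_resource(payload)
print(metadata["@odata.id"])   # where this resource lives
print(properties["Name"])      # a plain data property
```

The point of the sketch is that the metadata travels alongside the data in the same JSON document, so a generic client can follow `@odata.id` links without vendor-specific knowledge.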
Starting point is 00:09:07 But basically we were able to take all of that protocol level work that they've done, as well as all of those schemas they've defined about servers, because as we all know, a lot of storage is actually built using server components. So we basically took all of those, and we just leveraged them directly. And we extend, what we do with Swordfish
Starting point is 00:09:29 is we just extend all of that Redfish and focus exclusively on what we call the storage services. And that's what Swordfish is. Other things we did was we stole or leveraged, pick your favorite term, work that was actually going on in SMI-S. SMI-S was actually working on simplified models that were much more client-oriented. Some of those things that I talked about on the last slide around what can we do to be more client-oriented, what can we do to focus on getting those refactored APIs
Starting point is 00:10:14 so that somebody doesn't have to make the same call 1,000 times, they can make it once, get it a lot more user-centric, and simplify the models. We stole a lot of that. One of the other key features we've added into this, and this is big, is we've actually moved to a class of service-based provisioning and monitoring model. And so what that actually means, and again, this is very, very user-centric rather than equipment and vendor-centric, is we've moved to a model where instead of focusing on, you know, you have to understand every single knob and widget and configuration,
Starting point is 00:10:58 you can actually set it up so that the vendors can do it out of the box, and then the storage administrator can configure it so that when a DevOps guy or your IT admin comes in and wants to configure the system, what he actually gets presented with is, here's the class of service that someone else has configured for him. I just get capacity, and whatever class of service you're permitted to use, you just look at that and configure your capacity off that. You don't have to know anything about the underlying infrastructure.
Starting point is 00:11:31 And that's completely configurable for your environment. And that's done in a completely standard way, obviously. It's a standard API. And so then the vendors are able to differentiate all around that. We cover block, file, and object. And then the other thing that this does is because we're working so tightly with Redfish, it covers very seamlessly the storage, server, and fabric together. So there's existing simple fabric models,
Starting point is 00:12:03 and we'll be working to extend those to cover storage fabrics. Redfish is already working to extend those to cover networking fabrics. It's really going to give us something we've been trying to do for a long time in the standard space, a really true seamless standard API to cover your entire data center. Okay. So who is developing this? I talked a little bit about, you know, all the players really having a lot of overlap. These companies are highlighted in blue down here,
Starting point is 00:12:36 or blue, purple. The companies highlighted in purple here really have been the key players. I want to highlight a couple of things. Microsoft and VMware have really been very instrumental in this whole process from a client perspective. One of the issues we had when we developed SMI-S, and one of the reasons it took a long time to get things really stabilized, was that it was largely driven by the vendors, and it took time to get a lot of client engagement. This time we have a lot of client engagement up front.
Starting point is 00:13:13 We'd really like a lot more input from clients, but this has been very, very client-driven so far. And as you can see, we've had Broadcom, my company, Dell, EMC, HP, HPE, sorry, Intel, and Intel bringing a lot of their input from their Rack Scale Design architecture in as well. So that's got both the client and vendor inputs, which is a really good perspective. It also has a lot of that server storage networking ecosystem input. And then we've had Nimble and then a couple of smaller players like Innova Development in as well,
Starting point is 00:13:57 looking at it from an infrastructure perspective. So it's been a really good breadth. And then also I'd highlight that most of these companies are also key players among the companies that are developing Redfish. We have a lot of other companies here that have also been watching what's been going on. And most of these companies are also companies that have been very active in developing SMI-S. But we'd love to have a lot more people come and engage and work with us on validating what we've done
Starting point is 00:14:31 and expanding functionality, validating, developing reference implementations, developing real implementations, and working with us moving forward. And by the way, we don't actually have to wait till the end for questions. If people do have questions, we will take, I'm happy to take questions anytime as we go through. It will probably also help me not have coughing fits as I go forward here. So I guess I probably could have clarified that up front.
Starting point is 00:15:07 All right, so what functionality did we include in V1? You know, I keep saying, you know, yay, we did this in nine months. You're probably thinking, yeah, probably didn't get much there. Because we leveraged so much stuff, we actually got a ton of functionality in here. We actually have full block functionality,
Starting point is 00:15:26 full provisioning with class of service controls, volume mapping and masking, full replication capabilities, capacity and health metrics, and then we also put file system on top of that. So the file system leverages the entire block infrastructure and then adds file system and file share schemas on top of it. We've also got support for object drive storage. I know earlier I said we were going to do object store.
Starting point is 00:15:52 This is not full object store. This is for object drive, and so for anyone who's not aware, there's another technical work group going in SNIA that is the object drive storage technical work group. They're in the process of releasing, or have already released, a spec for object drives. It's out for public review right now, so if you haven't heard of this, go look at that. And this has support for the object drive storage in it. It's a pretty interesting specification for folks to be aware of.
Starting point is 00:16:34 Okay. So diving down a little bit into what this stuff actually looks like. Before I do that, any questions on what we've covered so far? All right. So what does this stuff actually look like? We've talked a little bit about this being REST-based and then using JSON and then OData metadata extensions as well. And we also talked about it leveraging Redfish.
Starting point is 00:17:04 So the way this is actually structured is that it uses the same Redfish resources, which we've just extended. So this slide and the next slide basically talk a little bit about how we've done that
Starting point is 00:17:19 as well as starting to lead in a little bit to what a Swordfish implementation actually will look like. So this is a slide we've leveraged from Redfish. So if you see any of the Redfish presentations, you'll see this slide there. So Redfish is fundamentally structured in this hierarchy. So there's this notion called collections with entities inside them. So that's the terminology there.
Starting point is 00:17:56 And the primary structure that they use is you have four, and there's a few more things that have started to show up now, but the primary structure is you have systems, which is the logical structure for a computer system, and then the chassis, which is the physical part of the system.
Starting point is 00:18:16 And then the managers over, I'm pointing to my screen. Isn't that really helpful? And then the manager. Is there a pointer? Ah, thank you. And then there's a... Ah, yay, a green pointer.
Starting point is 00:18:40 The managers collection over here is where you will either see functionality for a BMC or, in our case, where we will add things like a software management infrastructure. So that's where we'll extend that. And then there's other services here like session management, account management, schemas, and events. And so these things all hang off of what we call the service root. So the collection is where you'll go and see how many of those
Starting point is 00:19:13 things exist. So this is the logical system. This is the physical system. The distinction is pretty much that way. If you're quibbling why something's in one spot or another, it was probably a multi-hour discussion
Starting point is 00:19:31 as to why it should be one spot or another. And, you know, everything's not quite that clean, but it is where it is. But you'll see over here exactly what you expect to see, model, serial number, inventory information, and then information like power and thermal and rack hierarchies and stuff, for a chassis, okay? And so what we've done with Swordfish
Starting point is 00:20:01 is we basically looked at that entire hierarchy and said, yep, we'll use that. And we added all the stuff that's in purple. And so in purple, we did a couple of things. One is we recognized that in some configurations, our storage systems are built entirely and exactly out of standard servers. And so some of these systems will be using exactly these storage systems, or these systems. But in some cases, we have storage systems that are very similar, but have custom hardware,
Starting point is 00:20:41 and so there's also this logical construct that's called a storage system. And those things are very, very similar as well. But we basically have... The bulk of the focus is actually in what we call the storage service. So the storage service is where you'll see all of what you expect to see,
Starting point is 00:21:01 a logical construct for storage. This is where you see, you know, storage, you know, the volumes, and the storage pools, and the endpoints, and groups, and this is where we've added the class of service constructs, and all of that. So again, you see, you know, the collection that says how many of them there are, and this is where you see the details of each individual entry. So one thing about cardinality of all of this is for a Redfish implementation that's done on a BMC, what you might see is one of each of these. For a storage, like a large-scale storage system, what you might see is a storage service
Starting point is 00:21:56 that has... for the storage services, you might see a large number of storage services because that's the way the system is developed. This thing is just designed to be built on a much larger scale. It's a very highly scalable system. So we have a question up here. Sorry. Hang on. Hold on just a sec for the mic.
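As a rough illustration of the hierarchy just described, here is a minimal sketch of a service root with the Redfish collections plus the storage additions, and a collection that reports how many members it has. The paths and member counts are invented for the example (in the style of the static mockups), not taken from the published spec.

```python
# Static, hypothetical mock of a service: paths and counts are illustrative only.
MOCK = {
    "/redfish/v1": {
        "Systems": {"@odata.id": "/redfish/v1/Systems"},
        "Chassis": {"@odata.id": "/redfish/v1/Chassis"},
        "Managers": {"@odata.id": "/redfish/v1/Managers"},
        # The storage additions hang off the same service root.
        "StorageSystems": {"@odata.id": "/redfish/v1/StorageSystems"},
        "StorageServices": {"@odata.id": "/redfish/v1/StorageServices"},
    },
    "/redfish/v1/StorageServices": {
        "Members@odata.count": 3,
        "Members": [
            {"@odata.id": "/redfish/v1/StorageServices/1"},
            {"@odata.id": "/redfish/v1/StorageServices/2"},
            {"@odata.id": "/redfish/v1/StorageServices/3"},
        ],
    },
}

def collection_members(tree, path):
    """Return the link of every entry listed in a collection."""
    return [member["@odata.id"] for member in tree[path]["Members"]]

root = MOCK["/redfish/v1"]
services_path = root["StorageServices"]["@odata.id"]
members = collection_members(MOCK, services_path)
print(len(members), "storage services:", members)
```

A BMC-style implementation might list a single member in each collection, while a large storage system could list many storage services in the same structure.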
Starting point is 00:22:30 When you say class of service, could you give us some example of what kind of class I was talking about? Is it related to QoS or is it a group or something like that? Okay. So the question about what do we mean by class of service. So a class of service is actually defined to be, I think we will get into that in a little bit and have a few more details. But fundamentally, a class of service is defined to be based on a set of, extending a set of capabilities. And so you get to define these capabilities. They can be based on protection.
Starting point is 00:23:19 They can be on capacity, on a whole set of attributes that we've defined. But basically, it could be performance-related. It could be protection-related. It could be availability-related. So some people call this quality of service-related, quality of service instead of class of service. We've chosen to call it class of service. And so I'll have, when I get a little bit further in, I'll have some mock-ups that show a little bit more detailed examples. And so if I don't have quite all the detail for you, then let me know. All right. Okay. So we also talked about how we extend Redfish.
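One way to picture that answer: a class of service is a named bundle of capabilities that an administrator defines, and a user only sees the classes they are permitted to use. The sketch below is an assumption-laden illustration; the `ClassOfService` type, the capability keys, and the values are all hypothetical and are not the Swordfish schema.

```python
from dataclasses import dataclass, field

@dataclass
class ClassOfService:
    """Hypothetical model: a name plus a set of capabilities."""
    name: str
    capabilities: dict = field(default_factory=dict)

def permitted_classes(catalog, allowed_names):
    """Filter the catalog down to the classes a given user may provision from."""
    return [cos for cos in catalog if cos.name in allowed_names]

# Capabilities could be performance-, protection-, or availability-related.
catalog = [
    ClassOfService("Gold", {"protection": "mirrored", "performance": "high"}),
    ClassOfService("Bronze", {"protection": "none", "performance": "standard"}),
]

visible = permitted_classes(catalog, {"Gold"})
print([cos.name for cos in visible])
```

The design idea is that the consumer picks a class by name and never has to reason about the individual knobs behind it.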
Starting point is 00:24:06 This I want to highlight again. One of the things that we do, Redfish actually covers what we call local storage. When you have storage attached to a local server, Redfish covers that notion. They have this notion of a volume that's got some set of attributes in it. We also have a notion of a volume. And so we didn't want those to diverge at all.
Starting point is 00:24:37 So what we've done is basically said we will not diverge. Instead, we extend that volume. We have all of those same attributes in there, but when you move to Swordfish, you of necessity have a need for some additional attributes. And so the model that we have to work between the organizations is to extend that to include all of those additional attributes. One of the things that we didn't include in our V1 release, but we will be including and developing and including in a later release,
Starting point is 00:25:10 is an implementer's guide that's very specific that includes details for implementers to talk about, you know, which specific attributes to include and specific implementations. So our spec includes all of these things now. We will just be adding additional guidelines for when to use all of the attributes.
Starting point is 00:25:37 Okay, so what does a Swordfish system look like? Or can you see what a Swordfish system looks like today? And the answer is yes. Even though we don't actually have any implementations, you can actually see what this looks like. Now, how do you do that? So one of our development tools for putting the spec together is what we call mockups. And so this has actually helped us develop this a little bit more quickly. And some of this goes back to the fact that with the JSON infrastructure,
Starting point is 00:26:11 we can actually put together static views of what a system would look like in JSON and say, does this make sense? And yes or no, or modify it, instead of having to work entirely in schema. And so what we've developed are actually three different sets of mockups, one that's actually, here's a small-scale system, here's a large-scale system that has everything,
Starting point is 00:26:38 and here's a file system view. And so we've actually used those as both development tools, and we've released all three sets of our mock-ups as part of our, well, they're actually part of all of our work-in-progress releases, but they'll be released as part of the spec bundle as well, so that you can get a good sense of what different configurations would look like. So I'll give you a sense of what Swordfish systems look like by actually using part of that work.
Starting point is 00:27:16 So here we go. So here's a little bit of the Swordfish mock-ups. For those in the back, you should sit in front. And as a note, all of the slides are online, so you can actually... Thanks, Marty. You can actually see the stuff. You can download the slides. You can also...
Starting point is 00:27:39 We actually have two different ways you can actually look at this, one of which is you can download the mock-ups, put them on your own systems, and navigate through them. We're actually also putting all of these online at swordfish.mockable.io, both in static form, as well as adding some simulated interactions in a few areas
Starting point is 00:28:02 so that you can actually look like you're actively interacting with the system. So, what you can actually do is interact with the system a little bit. So, like I said, there's three different systems on here, and this actually
Starting point is 00:28:18 shows you a little bit of the service root. So, you'll see if this were just a small-scale system, you wouldn't necessarily see all of these things, but you can see the storage systems and the storage services here, as well as some of those other elements we talked about. And so if you wanted to just navigate through, you'd basically update the URL you're asking for at the top. And I navigate down into the storage service collection, and now I can see, hey, there's three of these things on here.
Starting point is 00:28:50 So we move forward. We picked one. So what's actually in a storage service? And so here's all of those things we actually already talked about. So there's a class of service. There's volumes. There's pools. There's groups.
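The navigation being demonstrated, starting at the root and following links from resource to resource, can be sketched against a static mock. Everything here is an assumption for illustration: the `TREE` contents, the paths, and the `find_links` and `crawl` helper names are hypothetical, and a real client would issue an HTTP GET per path instead of a dictionary lookup.

```python
# Hypothetical static mock; a real client would fetch each path over HTTP.
TREE = {
    "/redfish/v1": {
        "@odata.id": "/redfish/v1",
        "StorageServices": {"@odata.id": "/redfish/v1/StorageServices"},
    },
    "/redfish/v1/StorageServices": {
        "@odata.id": "/redfish/v1/StorageServices",
        "Members": [{"@odata.id": "/redfish/v1/StorageServices/1"}],
    },
    "/redfish/v1/StorageServices/1": {
        "@odata.id": "/redfish/v1/StorageServices/1",
        "Volumes": {"@odata.id": "/redfish/v1/StorageServices/1/Volumes"},
        "StoragePools": {"@odata.id": "/redfish/v1/StorageServices/1/StoragePools"},
    },
    "/redfish/v1/StorageServices/1/Volumes": {
        "@odata.id": "/redfish/v1/StorageServices/1/Volumes",
        "Members": [],
    },
    "/redfish/v1/StorageServices/1/StoragePools": {
        "@odata.id": "/redfish/v1/StorageServices/1/StoragePools",
        "Members": [],
    },
}

def find_links(node):
    """Yield every "@odata.id" value nested anywhere inside a resource."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "@odata.id" and isinstance(value, str):
                yield value
            else:
                yield from find_links(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_links(item)

def crawl(tree, start):
    """Breadth-first walk that discovers resources purely by following links."""
    seen, queue = [], [start]
    while queue:
        path = queue.pop(0)
        if path in seen or path not in tree:
            continue
        seen.append(path)
        queue.extend(link for link in find_links(tree[path]) if link != path)
    return seen

for path in crawl(TREE, "/redfish/v1"):
    print(path)
```

Nothing in the walk depends on knowing the schema ahead of time, which is the property being shown in the browser.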
Starting point is 00:29:02 And then there's actually also points or pointers to other resources that the storage service references or leverages, so things like the system and the chassis. So there's ways to say, these are the relationships here, and so you can actually navigate around and just migrate your way around the system. So you never have to go on a system. You can actually just go into and navigate your way around
Starting point is 00:29:31 and find all the relationships. I still haven't made you go to a schema and research anything. You can actually just point your browser at a system and navigate your way around and find everything. This is completely different than interacting with an SMI-S-based system where you would have had to go read a manual to find something. What if I want a file system? Identical except that now I've also got a, right in about here I have to look, right
Starting point is 00:30:02 in here I now have a file system link where I can go dig down in and see the details of the file systems. Okay, so I actually want to do something. So let's say I want to discover something about my system, which is kind of what I've been doing. I've just been navigating around and discovering stuff, right? Let's say I want to discover something for a specific reason. So do I have space to,
Starting point is 00:30:27 you know, what do I want to have space to do? Say I want to check the capacity in a storage pool. So again, going back to the class of service: I'm a DevOps guy, and my storage admin has told me that I can use a particular storage pool or a particular class of service, and in this case, I'm in Boston, and I have permission for anything in gold. So I've done a search, because I know how to use the appropriate search parameters.
Starting point is 00:31:01 I know how to search for the storage pools that have the type of class of service for Gold Boston. Does that show up on you? Or I happen to know that I'm looking for, you know, name special pool because the other search got truncated on this screen. So I happen to know that I can search in special pool. So I navigate my way down to special pool,
Starting point is 00:31:28 and I can look at the capacity here, do appropriate calculations on it, and say, hey, look, I do have enough capacity here. And yay, I can go create a volume in this pool. So now I could actually go in and do an appropriate, it's a REST API, I can post to create a volume into this pool. And that's exactly how simple it is.
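The check-then-create flow just walked through might look roughly like the sketch below. It is only an illustration under stated assumptions: the pool payload, the property names (`AllocatedBytes`, `ConsumedBytes`, `CapacityBytes`), the host `storage.example.com`, the pool path, and the choice of posting to a `Volumes` collection under the pool are all hypothetical. The request is built but deliberately not sent.

```python
import json
import urllib.request

# Hypothetical pool resource, as a client might see it after navigating to it.
pool = {
    "Name": "SpecialPool",
    "Capacity": {
        "Data": {
            "AllocatedBytes": 1_000_000_000_000,
            "ConsumedBytes": 400_000_000_000,
        }
    },
}

def free_bytes(pool):
    """Remaining capacity: what the pool was allocated minus what is consumed."""
    data = pool["Capacity"]["Data"]
    return data["AllocatedBytes"] - data["ConsumedBytes"]

def build_create_volume_request(base_url, pool_path, name, size_bytes):
    """Build (but do not send) a POST that asks for a new volume in the pool."""
    body = json.dumps({"Name": name, "CapacityBytes": size_bytes}).encode("utf-8")
    return urllib.request.Request(
        base_url + pool_path + "/Volumes",  # assumed target collection
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

needed = 500_000_000_000
request = None
if free_bytes(pool) >= needed:
    request = build_create_volume_request(
        "https://storage.example.com",                        # hypothetical host
        "/redfish/v1/StorageServices/1/StoragePools/SpecialPool",  # hypothetical path
        "my-volume",
        needed,
    )
    print(request.get_method(), request.full_url)
```

Note that the client never names an array, a RAID level, or a vendor; it only reads capacity and asks for a volume of a given size.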
Starting point is 00:31:58 I don't have to worry about what array this is and what attributes it is or anything like that. It's already been all set up for me by the storage admin. Okay, so I don't actually even have to worry about what vendor's equipment it is underneath either.
Starting point is 00:32:18 Storage admin did all of that for me. All right, so that's a little bit about how everything works. I think we're down to just a few minutes left. So where are we? Oh, I forgot this was a build slide. Sorry.
Starting point is 00:32:34 Yep. Like I said, we just finished the v1.0 spec, released that this week. We've had a bunch of interim releases through the year. snia.org slash swordfish will tell you all of the rest of this if you're interested in participating. We would love to have you, in several different ways,
Starting point is 00:32:52 join the group and work with us on developing the spec. If you're looking at it outside the group, send feedback through the portal. We're also working on setting up a storage management customer panel. Email storagemanagement at snia.org for more information.
Starting point is 00:33:14 Everyone is wrong. We're also going to be at Ignite next week. There will also be a customer event that we're working on. We have registrations for folks to attend on Tuesday night at Ignite next week. Well, I think that's it. I'll leave it here.
Starting point is 00:33:39 Questions? We have a... I have the microphone. Yes. Let me bring the mic to you. You talked about an implementation guide. Do you define there the border between redfish and swordfish? Yes.
Starting point is 00:34:03 So a couple of things. One is with this first release, we've actually put out a specification and the beginning of a user's guide. So we decided to focus on the user's guide first, and we'll be putting out an implementation guide later. So the implementation guide targeted at the vendors, but focusing on the users
Starting point is 00:34:24 and highlighting user interaction for the users first. What we put in the spec is actually focused on just the swordfish part and refer back in the spec and refer everybody else to say, that's the redfish part. What we actually expect to see from a client perspective and a user perspective
Starting point is 00:34:46 is that the difference between swordfish and redfish should be completely transparent. And we actually have a station set up out here in the mezzanine, thank you, where we can actually walk you through a little bit and show you how, from a client perspective, that it should be completely transparent when you're interacting with Swordfish versus Redfish.
Starting point is 00:35:13 And then I can show you the pieces that are Swordfish versus Redfish there, but from a client perspective, you shouldn't be able to tell at all. The other thing that we're doing to make it completely transparent is we're actually posting the schemas. We have JSON and CSDL schemas.
Starting point is 00:35:28 We're actually posting those on the DMTF website. So even when you're interacting there, you don't have to come to SNIA versus DMTF when you're building the system to get them. They'll all be in one spot. Any other questions? Oh, come on. How far does the drone go, at least?
Starting point is 00:35:57 It's a mile, by the way. So I've got a question. Is the standard somehow prepared or will be prepared for somehow managing storage systems like write or distribute file systems like Ceph or something like that? Is this a part of your work or will be, maybe? So I think the question was...
Starting point is 00:36:20 I'm sorry, can... You don't have to repeat it. No, I didn't quite catch all of the question. Can you... Okay, so my question was, is the standard somehow prepared for managing and monitoring solutions such a right solution or distributed file systems?
Starting point is 00:36:38 Stuff like that. He's talking about what? Is there data? Oh, um... I'm sorry. Like objects. You don't support objects. Oh, no. Right.
Starting point is 00:36:53 So in the first version, we only included block and file. But it is completely up to anybody coming in as to what extensions are added past that. We expect to add object, and in the last couple of days, I've talked to folks who also want to look at potential extensions for Ceph. We've talked to folks who have an intent to come in and add some extensions into file share space. But really, if we have two or more vendors who have functionality that they want to add into a particular area, we're open to that.
Starting point is 00:37:34 We already have a roadmap that says we're going to be adding performance metrics. We're going to be adding a bunch of capabilities around events. We have fabric extensions. We're going to be doing some collaboration with DMTF in about three or four areas. But really, any capability is fair game as long as we have two or more vendors, or two or more companies, I shouldn't say.
Starting point is 00:38:04 I keep saying vendors. But two or more companies that are interested in adding to the standard. Okay. Thanks, Richelle. Thanks for listening. If you have questions about the material presented in this podcast, be sure and join our developers mailing list by sending an email to developers-subscribe at snia.org. Here you can ask questions and discuss this topic further with your peers in the developer community. For additional information about the Storage Developer Conference, visit storagedeveloper.org.
