Storage Developer Conference - #188: Open Industry Storage Management with SNIA Swordfish™
Episode Date: April 17, 2023...
Transcript
Hello, everybody. Mark Carlson here, SNIA Technical Council Co-Chair. Welcome to the
SDC Podcast. Every week, the SDC Podcast presents important technical topics to the storage
developer community. Each episode is hand-selected by the SNIA Technical Council from the presentations at our annual Storage
Developer Conference. The link to the slides is available in the show notes at
snia.org/podcasts. You are listening to SDC Podcast, episode number 188.
I'd like to welcome you to the session. This is Open Industry Standard Management with SNIA Swordfish.
The original presenter for this, Richelle Ahlvers, was unable to make it,
so I had to actually step in at the last minute and go ahead and do the slides.
So remember, if you like this session, my name is Chris Lionetti.
If you didn't like this session, Richelle Ahlvers.
And with that, let's go ahead and start.
This is the abstract directly cut from the
agenda. You're all here now, so I assume that you've read this and you understand what we're
going to be learning today. We're going to be learning a lot about the Swordfish API fundamentals.
And again, you could read this on your own. No need to belabor the point on that one. Let's go ahead and jump into what is Redfish and Swordfish.
DMTF Redfish covers a lot of different things.
It covers servers.
It covers data center.
It covers basic fabric management.
It's the protocol that's highly successful in the market.
All the servers in the industry are using it.
And it's a REST API method to manage everyone's server.
It uses JSON.
There's YAML schemas out there.
It's a very well-defined standard.
What Swordfish is, Swordfish is a small extension to the Redfish protocol that adds the ability to do more custom things that need to happen for enterprise storage.
For instance, all servers have chassis, they have drives, they have power supplies.
Why would Swordfish need to reinvent any of that?
We're going to use what's in Redfish already.
Redfish did a fine job of defining all that,
so we're going to hijack and use the exact same terminology,
the exact same methods, the exact same YAML to do the exact same things.
The difference is, Swordfish, we're going to add some concepts to deal with shared storage
over Fibre Channel, shared storage in a bunch of different ways. And we're going to enhance it to
be able to handle much larger things like storage pools that might have hundreds of volumes or
namespaces carved out. So Swordfish is an extension to Redfish
that covers a lot more use cases
other than single storage or software-defined networking
or software-defined storage.
The goal here is that Swordfish will cover
object storage, file storage, as well as block storage.
And again, this is something that's proven
with Redfish in the sense that Redfish
has already managed all the servers out there.
Now we're adding Swordfish capability
to manage more complex environments.
And in fact, things like classes of service,
lines of service, data protection lines of service,
those things are more easily covered in Swordfish,
the extension to Redfish, than they are in the basic Redfish itself.
NVMe, as an example, and NVMe-oF have a natural fit in the Swordfish world.
So we're doing a lot of work to make sure that NVMe-oF is covered in Swordfish properly.
So we've got, and again, there's a very tight-knit group between DMTF and SNIA.
In other words, even though the DMTF owns the drive part,
if we find something that's wrong or missing or needs to be updated,
we'll actually throw that over the wall,
and we've got people that are actually working in both groups.
So there's a lot of commingling between those two groups
to make sure these protocols stay in lockstep.
So here's the basic Redfish hierarchy for Swordfish advanced storage.
So you can see we've got the service root. We've got the collection resources. And in this case,
I've got storage as a collection resource. And you'll notice that under storage, we've got a
singleton. For instance, we could have three different arrays out there, or we have a single
array. That single array may have a number of volumes.
It may have a number of pools.
It may have a number of controllers.
So you can see that there's a lot of different things you could put under this.
You can almost think of this like a file system.
You keep going deeper and deeper and deeper into the model.
But again, if you notice that storage also has links down to chassis
because a storage array controller or a storage system is going to be using chassis.
But the chassis information is actually referenced under storage, your singleton under storage,
but it doesn't actually need to live there.
It actually lives under chassis.
So again, if you want to look at all your chassis, you can go to the service root,
go to chassis, and start walking down the tree and find all your chassis.
Each of those chassis will probably reference the storage system they're connected to
or the server they're connected to.
So you can see this model has a lot of dotted lines that actually let you cross from different domains.
And again, storage is the one we care most about because we're in the enterprise Swordfish storage business.
But there's also a major collection resource called servers or systems.
So systems might handle an HP ProLiant, a Dell PowerEdge,
various servers in the market might be under servers,
but storage is where we like to think that Swordfish belongs.
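
As a rough illustration of that hierarchy, here is a minimal Python sketch that walks an in-memory mockup the way a client would walk a live service. The paths follow the usual /redfish/v1 layout, but the MOCKUP dictionary, the array name Array1, and the chassis name Shelf1 are made up for the example, and the resources are heavily simplified. A real client would fetch each @odata.id over HTTP instead of reading a dictionary.

```python
# Hypothetical, heavily simplified mockup keyed by @odata.id.
# Array1 and Shelf1 are illustrative names, not from any real system.
MOCKUP = {
    "/redfish/v1": {
        "Storage": {"@odata.id": "/redfish/v1/Storage"},
        "Chassis": {"@odata.id": "/redfish/v1/Chassis"},
    },
    "/redfish/v1/Storage": {
        "Members": [{"@odata.id": "/redfish/v1/Storage/Array1"}],
    },
    "/redfish/v1/Storage/Array1": {
        "Name": "Array1",
        # Subordinate collection: lives under the storage singleton.
        "Volumes": {"@odata.id": "/redfish/v1/Storage/Array1/Volumes"},
        # Reference link (a "dotted line"): the chassis lives elsewhere.
        "Links": {"Enclosures": [{"@odata.id": "/redfish/v1/Chassis/Shelf1"}]},
    },
    "/redfish/v1/Storage/Array1/Volumes": {"Members": []},
    "/redfish/v1/Chassis": {
        "Members": [{"@odata.id": "/redfish/v1/Chassis/Shelf1"}],
    },
    "/redfish/v1/Chassis/Shelf1": {"Name": "Shelf1"},
}


def get(path):
    """Stand-in for an HTTP GET against the service."""
    return MOCKUP[path]


def members(collection_path):
    """Return the singleton resources listed in a collection."""
    return [get(m["@odata.id"]) for m in get(collection_path)["Members"]]


def enclosures_for(storage):
    """Follow the reference links from a storage singleton to its chassis."""
    return [get(link["@odata.id"])["Name"] for link in storage["Links"]["Enclosures"]]


root = get("/redfish/v1")
arrays = members(root["Storage"]["@odata.id"])
print([a["Name"] for a in arrays])
print(enclosures_for(arrays[0]))
```

The chassis is reached two ways here: by walking down from the service root through the Chassis collection, or by following the link from the storage singleton. Both land on the same resource, which is the point of the dotted lines.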
So Swordfish configurations, here's an example of what a simple external array looks like.
Now, I say simple because there's a lot of options.
Don't get bogged down in all the options.
This is what a standard array might look like.
You've got a subsystem, you've got a service root, you've got a chassis, you've got storage.
That storage has snapshots, it has volumes, it has drives, but you can see where everything lives underneath.
And you can kind of get the ownership of all these things.
Now, again, the mockups that we've created for simple storage array don't necessarily have all the things on the left-hand side there.
The telemetry service, the update service, the event services.
But a real implementation likely would.
And, in fact, if you want to see how to implement event services,
that's already in the default Redfish definitions.
Easy to do.
If you want to be able to do session services,
you can actually implement tokens
so that people can log into your system
using a username and password, get a token back,
and then all future commands are done with that token.
So you have a user control.
That would be done in session services.
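
The token flow just described can be sketched in Python as two request builders. This assumes the standard Redfish session behavior, where a POST of UserName and Password to the Sessions collection returns a token in the X-Auth-Token response header. The functions only build request descriptions as plain dictionaries and send nothing; the user name, password, and token values are placeholders.

```python
import json

# Standard location of the Redfish session collection.
SESSIONS_PATH = "/redfish/v1/SessionService/Sessions"


def build_login_request(username, password):
    """Describe the POST that creates a session (nothing is sent here)."""
    return {
        "method": "POST",
        "path": SESSIONS_PATH,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"UserName": username, "Password": password}),
    }


def build_authenticated_request(path, token):
    """Describe a later GET that carries the token instead of credentials."""
    return {
        "method": "GET",
        "path": path,
        "headers": {"X-Auth-Token": token},
    }


login = build_login_request("admin", "example-password")
# In a real exchange the token comes from the X-Auth-Token response header.
follow_up = build_authenticated_request("/redfish/v1/Storage", "example-token")
print(login["path"])
print(follow_up["headers"])
```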
But a lot of these mockups
don't have that stuff implemented directly
because the mockup is really trying to show you
what's different about Swordfish,
not necessarily all the things that are the same
because 90% of the stuff in Swordfish
is the same as Redfish.
Now, if we want to go to a simple external array, we're going to add a little
mapping. Now, what I mean by that is you'll notice the difference between this one and this one.
I've added fabrics. Because now, instead of going with an array that lives inside of a server
or a dedicated array, I've now gone to an external array. And the thing about an external array is
I need the ability to map a volume to a host. So in that case, I need to be
able to implement fabrics so I can have endpoints. An endpoint lives in a fabric, and an endpoint
represents an iSCSI initiator, a Fibre Channel initiator, RDMA. It represents some method of
getting to that storage. In fact, you'll see there's something called connections where I can
actually map a volume to a host. So in the old SCSI world,
you'd think of this as the SCSI initiator target nexus. That's what connections is.
But connection lives under fabric because it's a term that doesn't necessarily need to live inside
the array itself. It lives in, it's a theoretical concept that lives in the fabric.
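
To picture a connection, here is a hedged Python sketch of the kind of payload that ties a volume to an initiator endpoint and a target endpoint. The property names are modeled on my reading of the Redfish Connection schema and are trimmed down, so check the schema guide before relying on them. The fabric name FC1, the array name Array1, and the endpoint names are invented for the example.

```python
def make_connection(volume_id, initiator_endpoint_id, target_endpoint_id):
    """Build a simplified connection that maps one volume to one host endpoint."""
    return {
        "ConnectionType": "Storage",
        "VolumeInfo": [
            {
                "AccessCapabilities": ["Read", "Write"],
                "Volume": {"@odata.id": volume_id},
            }
        ],
        "Links": {
            "InitiatorEndpoints": [{"@odata.id": initiator_endpoint_id}],
            "TargetEndpoints": [{"@odata.id": target_endpoint_id}],
        },
    }


def volumes_for_initiator(connections, initiator_endpoint_id):
    """List the volumes a given initiator endpoint has been mapped to."""
    found = []
    for conn in connections:
        initiators = [e["@odata.id"] for e in conn["Links"]["InitiatorEndpoints"]]
        if initiator_endpoint_id in initiators:
            found.extend(v["Volume"]["@odata.id"] for v in conn["VolumeInfo"])
    return found


# Hypothetical resource paths: the endpoints live under the fabric,
# while the volume lives under the storage singleton.
conn = make_connection(
    "/redfish/v1/Storage/Array1/Volumes/1",
    "/redfish/v1/Fabrics/FC1/Endpoints/Host1",
    "/redfish/v1/Fabrics/FC1/Endpoints/ArrayPort1",
)
print(volumes_for_initiator([conn], "/redfish/v1/Fabrics/FC1/Endpoints/Host1"))
```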
Is this making sense?
Any questions so far?
Sure.
So if I've got an external array that needs to have mapping
so I can map a specific host
to a specific array volume,
I would define fabrics.
Now, this implementation under service root, he would have
storage where he'd put all of his inwardly facing stuff, but he'd also have a fabric folder,
and that fabric folder would contain all the endpoints that you want to register with that
storage array. And those endpoints could then be mapped. Now, fabric, though, is different, because fabric, while
the mappings, the connections are configured on the array, you also could implement a switch
using this model, and we're trying to add switching in the future. When you do switching,
zoning would be under the fabrics. So there's different things. If you want to, and again, the model is designed to be extensible.
Is fabrics part of the small course of Redfish?
No, fabrics does not exist normally in the Redfish model directly,
because Redfish is for servers.
It doesn't handle external mounting.
If you want to go external mounting, you're going to go with Swordfish because then you're going to add fabrics.
And in fact, if you're doing a PCI RAID controller inside of a server and you're not sharing that,
you just want to give that right to the host, you would implement it under storage,
but you would never need a fabrics. If on the other hand, you want to be able to map
that initiator group to that volume
and be able to see it from outside in the real world
across a fabric, that's how you do it.
What do you mean? I'm sorry, I didn't hear that.
Yes.
Yes.
Model's the same.
Because when you're talking to an array,
when you talk to an array and say,
I want to talk about this fabric that happens to have this endpoint,
and you go to the Ethernet switch,
you should be looking at a similar fabric.
Yeah, the connection is the sub-add of the fabric.
I remember it's an independent object in the progress model.
Right.
And a connection on a fabric, if I'm talking to an Ethernet switch,
might be the zoning to connect an array to a host.
But if I'm talking to the array,
the connection may be to connect a host to a volume using LUN masking,
as an example.
And namespaces is something that throws me for a loop.
This is completely different,
because in NVMe, in NVMe-oF,
you've got a concept called namespaces,
which aren't exactly like volumes.
They're a different thing entirely.
Because each NVMe is its own controller and can be shaved up in different ways.
And it presents its own namespace.
And each one of those drives, each one of those NVMe devices,
is actually going to be its own storage controller.
And every time you add a new host to it and create a new namespace,
you're creating a new controller.
Because controllers in NVMe world are ephemeral. So they can appear, disappear, you can have hundreds of them. So storage
controllers for NVMe is used a little differently, although it's the same concept. You can have
storage controllers on a simple array, you can have storage controllers in NVMe as well.
We also have an opaque array.
We have an NVMe front end with a SAS back end.
So you can see there's a lot of different options here.
We tried to think of all the different options to make this flexible,
so that you don't run into a place where you don't have an idea where to go.
So in here, we may have a shelf that's Ethernet attached,
has an Ethernet to SAS bridge of some sort,
that then goes out and talks to SAS devices,
and those SAS devices may be NVMe devices, they may be NVMe drives,
they may be regular drives, who knows?
It could be an EBOF.
Yes? So how do you keep this?
We were talking about that before you got here.
How do you keep one?
You have one.
For instance, you have one network on the backside.
You have one endpoint where everything's contained in it.
How do you do that?
Right.
You could do mapping in such a way that only a certain host sees a certain Ethernet port.
Or you could, there's a lot of different ways to map it.
Because each controller that pops up could decide what you see and what you don't see based on your initiator.
Based on who's asking.
Because a storage controller...
Sorry?
Yes.
Yep.
The storage controller can let you through
or it can block you.
And I can take a drive
and shave it up three different ways
and hand three different storage controllers
to three different servers.
Each of them will see its storage controller,
but the other storage controllers won't respond to it.
So NVMe is the big expansion we've been pushing. We've had a lot of meetings
about NVMe to make sure we do this right. And unfortunately, there's a lot of different
ways to do NVMe. We've tried to
make sure that the mock-ups show you the most common ways that are out there. So the mock-ups
can give you an idea of what's possible, but you can go away from mock-ups and go directly
to the YAML or the JSON to figure out exactly how you want to implement your devices.
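
The earlier point about carving one drive three ways and handing a controller to each server can be sketched as a toy model in Python. This is not how any NVMe device or Swordfish service is implemented; the IoController class, the host names, and the namespace names are all hypothetical. It only shows the idea that each controller answers its own host and stays silent for everyone else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IoController:
    """Toy stand-in for an NVMe I/O controller bound to a single host."""
    name: str
    allowed_host: str
    namespace: str


def visible_namespaces(controllers, host):
    """Return only the namespaces whose controller responds to this host."""
    return [c.namespace for c in controllers if c.allowed_host == host]


# One drive carved three ways, one controller per server (names invented).
controllers = [
    IoController("ctrl-1", "server-a", "namespace-1"),
    IoController("ctrl-2", "server-b", "namespace-2"),
    IoController("ctrl-3", "server-c", "namespace-3"),
]

print(visible_namespaces(controllers, "server-b"))
print(visible_namespaces(controllers, "server-z"))
```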
There's a lot of different ways to skin this cat.
Solve this problem.
I shouldn't use terms like that
because they sometimes don't translate properly.
There's a lot of different ways to solve the problem of NVMe.
Because we have EBOFs, we have controller-based, we have a lot of different
ways to do this. And we want
to show you that there's multiple different ways
to solve that same problem.
But none of it breaks the model.
The models continue to operate.
So are you running two
Swordfish databases, one inside the other?
Oh, what do you mean?
Let's say you were just talking about it, where you have
a Redfish database for CXL.
Sure.
And then you have a link to that
by the internet or through the app.
So that endpoint?
And then they can fade that over to your main...
You theoretically could.
As an example,
you could have...
Let's say
you have a server, and it's Redfish
enabled, and you're going to run some
service on top of it. You could have that service
actually
be the trap door to get to the Redfish on the baseboard, while at the same time adding
its own folders to it as well.
Okay.
So you're talking about only kind of scenes.
Yep.
Yep.
And the other thing is, you can actually reference things in the Redfish model, because you can
say that storage systems, systems, and there's an
OData ID where you can reference where it points to. It doesn't have to point to the same system
you're on. We do this for replication. So if I have an array and one of the features I can do is
I can say replicate this volume to another array. And the OData ID for the target of that replication
is a completely different system on the other side of the world.
So you can reference out of the machine you're in as well.
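
One way to picture a reference that leaves the machine is to resolve the @odata.id against the service you are talking to and compare hosts. The Python sketch below does that with the standard library. The host names are placeholders, and treating an absolute URI as the way a remote target is expressed is an assumption for the example.

```python
from urllib.parse import urljoin, urlsplit


def resolve_reference(service_base, odata_id):
    """Turn an @odata.id (relative or absolute) into a full URL."""
    return urljoin(service_base, odata_id)


def is_remote(service_base, odata_id):
    """True when the reference points at a different host than this service."""
    target_host = urlsplit(resolve_reference(service_base, odata_id)).netloc
    return target_host != urlsplit(service_base).netloc


# Placeholder hosts for a local array and a replication target far away.
LOCAL = "https://array-a.example.com/redfish/v1/"
local_ref = "/redfish/v1/Storage/Array1/Volumes/7"
remote_ref = "https://array-b.example.net/redfish/v1/Storage/Array9/Volumes/3"

print(resolve_reference(LOCAL, local_ref), is_remote(LOCAL, local_ref))
print(resolve_reference(LOCAL, remote_ref), is_remote(LOCAL, remote_ref))
```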
So the Redfish specs,
you can see that we've been working very hard on them.
In fact, we keep updating them.
About every six months, you're going to see a new version.
We've got 1.2.3, 1.2.4, 1.2.5 is being worked on right now. So this is a very vibrant
infrastructure right now. We've got a lot of people working on these calls. So there's a lot
of different vendors that are represented on these calls to try and make sure that we can cover
things properly. Now, I should note also that there is a conformance testing program
available as well. So if you're a member of the SNIA labs, or I'm sorry, the SMI Innovation Lab,
and you want to put your piece of hardware in the SNIA lab or in the Innovation Lab where other
people can hit it or not, you can choose not to. But the point is, you get conformance testing
program with that for free.
So you can run that thing all day long if you want.
And the conformance testing program
will help you to avoid straying
from the approved documentation.
It will check your syntax,
it will check all kinds of things
to make sure that you've got a compliant implementation.
That's a resource you should definitely be looking at,
is hosting equipment in the SNIA lab, in the SNIA Innovation Lab,
and running the CTP on a regular basis.
And I built a simulator and put it in there,
and it found 300 different things.
I spelled OData with a capital O at one point,
and it trailed all the way through my code,
and I had to fix that in 300 different places.
But it caught those kind of things,
things I wouldn't have thought of.
The conformance testing program is very useful for that.
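
The capital-O mistake is easy to catch mechanically. The Python sketch below is not the conformance test program and makes no claim about how it works; it is just a small checker, with a made-up payload, that walks a JSON resource and reports any key that looks like an @odata annotation but is not written in lowercase.

```python
def find_miscased_odata_keys(resource, path=""):
    """Return the paths of keys like '@OData.id' that should be '@odata.id'."""
    problems = []
    if isinstance(resource, dict):
        for key, value in resource.items():
            here = f"{path}/{key}"
            # Looks like an OData annotation, but the prefix is not lowercase.
            if key.lower().startswith("@odata.") and not key.startswith("@odata."):
                problems.append(here)
            problems.extend(find_miscased_odata_keys(value, here))
    elif isinstance(resource, list):
        for index, item in enumerate(resource):
            problems.extend(find_miscased_odata_keys(item, f"{path}[{index}]"))
    return problems


# Made-up payload with two casing mistakes and one correct key.
payload = {
    "@OData.id": "/redfish/v1/Storage/Array1",
    "Volumes": {"@odata.id": "/redfish/v1/Storage/Array1/Volumes"},
    "Members": [{"@Odata.id": "/redfish/v1/Storage/Array1/Volumes/1"}],
}

print(find_miscased_odata_keys(payload))
```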
Now, these mockups are all available
via swordfishmockups.com,
and you'll see there's a whole bunch
of different mockups in there
that you can actually open up, and you can walk down the trees and see how everything's laid out and everything's
operating. Or you could open up the Redfish schema guide or the Swordfish schema guide. Both are
available as well, and you could walk through how those are laid out as well. I recommend looking at
the schema guides instead of the JSON files. If you're comfortable looking at JSON, that's great,
but there's a lot of options in the JSON files
that you probably won't implement.
And the schema guide really puts it into an English language kind of thing.
It makes it a lot easier to consume.
And again, NVMe functionality.
We've got mapped NVMe objects
that exist in the Redfish and Swordfish models.
We've got subsystems. We've got a controller.
And when you build a controller,
you have the discovery controller,
the admin controller, the IO controller.
So these things are ephemeral,
but they pop up,
and they need to be able to be created automatically.
You have namespaces, endurance groups,
NVM sets,
so you can actually set a three-pack of drives together
so that they actually roll together
with their endurance groups,
and NVMe domains. So there's a lot of things we've added in to try and make this a more reasonable implementation. And again, here's the Redfish and Swordfish and what gets
added for NVMe. All the different options. Now again, you guys are going to get a copy of these
decks, are you not? You should be able to download these videos as well.
So I can go into detail in here, but there's a lot of material.
And I definitely suggest zooming in and playing around with these models.
And common usage. When we say storage, what we really mean is a subsystem.
Whether that's a classic array, it's an NVMe EBOF shelf, an NVMe array, whatever it is, storage means subsystem.
But storage in a Redfish model could mean PCI RAID controller inside your server.
Or it could mean a SAS controller or a SAS HBA.
So storage can mean a lot of things
to a lot of different people.
And we're using the words volume
and namespace interchangeably.
I'm old school.
I kind of like using the word LUN,
but it's really not as valid anymore.
And under chassis, you're going to find all your thermals.
You're going to find your power.
You're going to find your drives.
There's a lot of things under chassis
that are really good about exposing the real-world intricacies.
You can have an array that has 12 shelves
and 14 power supplies,
and all that can be displayed there,
including the redundancy levels of the power supplies,
everything in there.
Now, scalability is a big concern.
The idea here is that when you do RESTful, one of the problems with SMI-S was that SMI-S was not necessarily scalable.
Because once you got into tens of thousands of objects, it fell down because the entire database for SMI-S was stateful,
which meant you usually had to run a Pegasus server or some other server that actually talked to a captive database. So it was very common for people to implement SMI-S as an off-array
proxy agent, because you'd need a VM that has usually two or four processors with usually eight gigs of RAM just
to be able to house the SMI-S proxy agent. But Swordfish and Redfish are designed to be run on
very little hardware and very little state is actually saved. So the idea is that the whole
RESTful concept here is that we don't store as much state. Most of these commands are designed
to be implemented directly to the hardware, not where you have a database storing this information.
If somebody asks for the status of a drive,
you don't pull it from your database.
You talk to the thing, you look at the drive now, and you give that back to them.
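
That pass-through idea can be sketched in Python as a handler that asks the hardware at request time instead of reading a stored copy. The probe function, the drive path, and the health table below are all hypothetical stand-ins; a real service would talk to the enclosure or the drive itself.

```python
def make_drive_handler(read_drive_health):
    """Build a GET handler that probes the drive every time it is called."""
    def handle_get(drive_id):
        return {
            # Illustrative path; Shelf1 is an invented chassis name.
            "@odata.id": f"/redfish/v1/Chassis/Shelf1/Drives/{drive_id}",
            # Nothing is cached: the probe runs on every request.
            "Status": {"Health": read_drive_health(drive_id)},
        }
    return handle_get


# Stand-in for the hardware: a table the "drive" updates on its own.
live_hardware = {"0": "OK"}
handler = make_drive_handler(lambda drive_id: live_hardware[drive_id])

first = handler("0")["Status"]["Health"]
live_hardware["0"] = "Critical"  # the drive degrades between requests
second = handler("0")["Status"]["Health"]
print(first, second)
```

Because the handler holds no state of its own, the second request reflects the change immediately, and if the hardware is unreachable there is nothing to fall back on.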
So in a case of where you want to store that information,
what would you be storing?
Oh, configuration data.
Yeah, generally that's stored on the array already
because the array's got a database that has that information.
But the Redfish, Swordfish implementation
isn't necessarily the owner of that data.
It just queries the data.
It's a pass-through thing.
Your array is already storing that data.
If your array were to go down,
Redfish and Swordfish would give you nothing
if it was running off array.
SMI-S, on the other hand, was designed
where if the array went down,
you still could query that SMI-S agent all day long
and get all kinds of data about the array
because it stored the entire state of that array in the SMI-S agent.
This, on the other hand, is tell me now what's going on.
And to back that up,
Redfish is commonly deployed on very low-end processors
for, you know, baseboard management control processors,
which are, you know, like iLO, DRAC, BMC. Those kinds of things are not heavy-duty machines. And they're able to serve
up Swordfish and Redfish, well, they're able to serve up Swordfish without a problem. And
Redfish is not much bigger, because most of it's the same.
I think if you're asking for.
If you're asking for how many drives do I have,
you could be doing that on your BMC
using SCSI Enclosure Services.
You could be using I2C.
You could be doing a lot of different things.
Sorry?
Yeah, but you're also doing REST calls,
which are a lot more lightweight than CIM-XML.
So querying a database
as opposed to just making the request,
sure, it's a little slower.
The request is slower, but the end-all result is actually faster
because the Redfish and Swordfish BMCs respond quickly.
It responds quickly.
It depends on how fast you query the device, right?
Yep.
Because remember, if you've got an array,
the array is keeping status of all its stuff.
It's already got the database.
It's already got the configuration.
It's already got all that stuff.
Your Redfish is just a hole into it.
It's just a management interface to it.
Depends on your BMC.
You could, you don't have to.
There's no reason to.
And the other question is,
how often are you running these management commands as opposed to running workload?
Yeah, yeah.
I mean, are you going to be doing 10,000 changes
to a machine in every hour?
Or is this kind of...
Yeah, exactly.
The idea is that this is for configuration monitoring,
that kind of thing.
It's not necessarily, you're not running workload through this.
You're only running management load.
This is your, quote, out-of-band management.
And again, we have our logical NVMe-oF, our exported logical subsystem.
So you asked about this a little while ago, about how to export an external logical subsystem where you've got a master system and then multiple subs.
And if you look at swordfishmockups.com,
there is a lot of sample NVMe instances out there
to show you what this thing looks like.
I know that looks hairy,
but realize that most of that model,
chassis, drives,
systems, a lot of that model
is already defined in Redfish and well defined
and well used.
You're adding things like controllers and storage
but you can see what the relationships
are between things.
And again, a solid arrow
is subordinate. A dotted arrow is
referenced. So you have a navigation
link to jump from one to the other.
It looks complex.
It's really not that bad.
And for external connectivity,
you've got your storage collections.
And again, under storage,
you could have a singleton array there.
You could have duals.
You could have five different shelves,
each counting as a singleton with their own models under those.
But at the same time, they could share fabrics.
So you only have to identify an initiator or an endpoint once. And again, there's a lot of models here,
and all of this is out in the documentation.
Now, you notice also that we have a partnership
with the OpenFabrics Alliance,
so there's places where that ties into them as well.
And where that framework lives.
And these models show exactly how to do fabric connectivity for an EBOF.
Ethernet bunch of flash, right? And of course, there's a lot more information out there.
We try to make sure that the Swordfish mock-ups are out there,
the Swordfish forum out there.
There's the Redfish and Swordfish standards.
There's OFA, the OpenFabrics Alliance, and the NVMe specs as well.
So we try to make sure all of this is available.
And that's the deck.
I've actually got no more to add to this.
Any questions?
What process is there to get the new properties or specs from the NVMe?
I'm sorry, repeat that?
on the Solvig. So how can I add this back?
I'm probably going to have to refer that over to Richelle at some point.
And I'll try and get her in on some of these questions as well.
Another question is,
is the TCG, the Trusted Computing Group,
as part of the Solvig?
I'm sorry?
The TCG, the Trusted Computing Group? No.
We do have members in common that we relate to,
that we try and keep informed of what we're doing,
but they're not joined in that way.
Thanks for listening.
If you have questions about the material presented in this podcast,
be sure and join our developers mailing list by sending an email to
developers-subscribe at snia.org. Here you can ask questions and discuss this topic further with
your peers in the storage developer community. For additional information about the Storage Developer Conference, visit www.storagedeveloper.org.