Storage Developer Conference - #178: Key per IO - Fine Grain Encryption for Storage
Episode Date: December 12, 2022...
Transcript
Hello, everybody. Mark Carlson here, SNIA Technical Council Co-Chair. Welcome to the SDC Podcast. Every week, the SDC Podcast presents important technical topics to the storage developer community. Each episode is hand-selected by the SNIA Technical Council from the presentations at our annual Storage Developer Conference. The link to the slides is available in the show notes at snia.org/podcasts. You are listening to SDC Podcast, episode number 178. Okay, welcome everyone. This is our 3:35 talk. We'll be talking about Key Per IO, fine-grain encryption for storage.
So my name is Frederick Knight. I'm from NetApp, and we have Festus from Solidigm.
We'll be telling you about this topic this afternoon. Festus is going to start. He's going to share some of the key operations and some of the information about the encryption part of it. And I'll be talking about the NVMe interfaces and how this stuff is all going to work. So, Festus will get started here. Thanks, Fred.
So, as Fred touched on, we'll start with just an overview of the evolution of data-at-rest protection schemes.
Then that should get us into the Key Per IO architectural components.
And we'll look at the benefits.
But more importantly, we'll look at the latest updates that the TCG Storage Workgroup has been making towards the standardization of Key Per IO.
Once we get through that, Fred will introduce various Key Per IO use cases and then how the NVMe interface is being modified to support this.
All right, so to start, most of you are already familiar with the general approaches to data-at-rest protection on storage devices. You generally have media encryption keys generated by the device that are then used to encrypt or decrypt user data as it's being written to NAND or being read out of NAND. So that's the general building block. And on top of that, some of the recent technology, like Opal, has been built to add a layer of authentication, to try to tie those media encryption keys, which are generated by the device, to some outside user password. So that's, in general, how data-at-rest protection works in today's devices.
That works well, especially for a use case where you have a contiguous range of LBAs on a storage device and you want to associate that with a particular key and tie that to some user-supplied password. But as you can see on the left, it creates challenges as the number of your ranges increases, because it means you have to manage more keys on the device. As the number of keys increases, you now have a more complicated scheme for how you protect those keys on the storage device. That makes the storage device itself a more appealing target for theft, fundamentally because your keys are there and the data stays there. So one of the things we've been trying to explore in TCG is
figuring out ways we can improve on that architecture,
maybe provide ways, especially for use cases that are comfortable with managing the keys themselves, to provide some sort of an interface and method so they can inject keys into a storage device and leverage those keys to perform on-device user data encryption.
So the idea of externally managing media encryption keys introduces some excellent benefits. Instead of associating a range of LBAs on a device with one key, you can actually associate a higher-level object with keys. For instance, one set of keys can be associated with objects across different devices, instead of the old architecture where only some LBAs on one device can be coupled with one key.
But that also means, you know, if you can manage media encryption keys externally, you can crypto-erase at a higher level, not just at the LBA range level. You can crypto-erase at an object level, even one that may be spanning multiple storage devices or appliances.
With external management of keys, it obviously simplifies your key management implementation on the device, since the device doesn't own the keys. So the audit process that Eric talked about earlier becomes a lot simpler, since you don't have the keys. So these are some of the key benefits that Key Per IO provides at a high level. At least some of this flexibility obviously already exists if you do software encryption, but this tries to extend that capability down into the storage device.
So at a high level, the way Key Per IO operates, you basically have a layer in the middle, so let's call that a Key Per IO host management application, that relays the keys that may be owned by a particular application to the storage device, and then selects those keys to perform user data encryption on the device.
So in that example, you have your Key Per IO application talking to some key manager. In this instance, let's assume a key management service that may be hosted by some server. It has to generate some keys or retrieve some keys, and that key management service, external, obviously, to the SSD, may go ahead and create those keys and send them back to the application. The application will then inject those keys into the SSD, and within the SSD, obviously, the keys could be populated in the SSD controller key cache for subsequent I/O usage. So that's kind of the high-level model of how Key Per IO operates.
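To make that flow concrete, here is a minimal host-side sketch in Python; the class, the kmip_client and nvme_device wrappers, the protocol ID, and the payload format are all assumptions of this example, not the actual KMIP, TCG, or NVMe APIs.

```python
import struct

# Illustrative-only protocol ID and payload format; the real values and
# encodings are defined by the TCG Key Per IO specification.
TCG_SECURITY_PROTOCOL = 0x01

def build_key_injection_payload(key_tag: int, mek: bytes) -> bytes:
    """Toy payload: <key_tag, key_length, key_bytes> packed big-endian."""
    return struct.pack(">HH", key_tag, len(mek)) + mek

class KeyPerIOHostManager:
    """Relays keys from an external key manager into an SSD's key cache."""

    def __init__(self, kmip_client, nvme_device):
        self.kmip = kmip_client    # talks KMIP to the key management service
        self.dev = nvme_device     # assumed wrapper exposing NVMe Security Send/Receive

    def inject_key(self, kmip_key_id: str, key_tag: int) -> None:
        # 1. Retrieve (or have the key management service create) the media encryption key.
        mek = self.kmip.get_key(kmip_key_id)
        # 2. Build the payload that associates the key with a key tag slot.
        payload = build_key_injection_payload(key_tag, mek)
        # 3. Push it to the drive over Security Send; the key is then available
        #    for subsequent I/O that references this key tag.
        self.dev.security_send(security_protocol=TCG_SECURITY_PROTOCOL, data=payload)
```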
In the Storage Workgroup, we've basically been taking this model and trying to find the various architectural and protocol elements that are needed to support something like this. There's also the NVMe component. We'll get into that when we get into the details of the various architectural elements that are being added to the NVMe command set to support the key selection per namespace.
From a TCG perspective, the protocol layering that we envision today is, for key management, right now, for version 1.0, we'll start with KMIP. KMIP is a fairly common key management protocol that allows the exchange of keys between some host application and some key managers. But then, from the host to the storage perspective, we're going to have a couple of different protocols. One is what most of you are already familiar with: using security send and receive to do some basic activation of the Key Per IO SSC, like we have for Opal today. So you have that. That's not going to change in terms of, you know, setting up which authorities are allowed to perform different configuration options.
The new thing we're adding is obviously the interface to the KMIP protocol. So we're looking to have the KMIP request message protocol as a payload for a TCG ComPacket. This at least keeps the existing software stack that manages Opal the same, while allowing quick integration with KMIP without introducing some of the older TCG problems like sessions. So, as you can see, it's a stateless protocol, basically strictly designed to inject keys into the storage device. Then, obviously, on the NVMe side, you have extensions to the NVMe I/O commands to support selecting keys. And then TCG will still use security protocol two to do things like a TCG-specific reset, but also to do things like clearing keys in particular key slots. So that's what the protocol layering will look like once the standard comes out.
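As a rough illustration of that layering, here is a minimal sketch; the header layout, protocol ID, and wrapper method are simplified stand-ins assumed for this example, not the actual TCG ComPacket or KMIP encodings.

```python
# Illustrative-only sketch of the protocol layering: a KMIP request message is
# carried as the payload of a TCG ComPacket, which rides inside an NVMe
# Security Send command.
import struct

def wrap_kmip_in_compacket(com_id: int, kmip_request: bytes) -> bytes:
    # Simplified ComPacket framing: ComID plus payload length (the real header
    # carries more fields than this).
    header = struct.pack(">HI", com_id, len(kmip_request))
    return header + kmip_request

def inject_keys_via_security_send(nvme_dev, com_id: int, kmip_request: bytes) -> None:
    packet = wrap_kmip_in_compacket(com_id, kmip_request)
    # Security Send delivers the packet to the drive; a matching Security
    # Receive would fetch the KMIP response. nvme_dev is an assumed wrapper
    # object exposing those two commands.
    nvme_dev.security_send(security_protocol=0x01, sp_specific=com_id, data=packet)
```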
From a key injection perspective, we actually worked through an interesting set of problems here. I want to walk you through this a little bit. So one thing, when we started, one of the questions we had was: when you try to protect the keys, do you protect them from the key origination point, from the entity that owns the key, all the way to the storage device? Or do you trust the Key Per IO host management application and protect the keys from that application to the storage device? As you can see, one of the decisions we made is basically to consider the interactions between the key manager, the entity that holds the keys, and the host application that's managing the various storage devices to be out of scope for this standard. That's partly because there may be some use cases where that application wants to know that a key with this key ID should go to a particular namespace, so it may want to see some metadata on the keys; enforcing stricter traffic encryption all the way from the key manager to the namespace may create some complications for those use cases. So for version 1, the first thing we did was to focus on the interaction between the Key Per IO host management application and the actual SSD.
So, since there are many technologies that deal with transport security, we built a couple of options. The first was basically just to rely on those technologies, like SPDM secure sessions or PCIe IDE, since they give you full link encryption between your host and SSD traffic. Then the Key Per IO traffic doesn't need to add any more complexity; you can just leverage those existing transports and send down the keys. So from the protocol perspective, at least for the first provisioning, the key encryption keys you're sending are in plain text, but in practice they're not, since they're leveraging the protection provided by things like SPDM or PCIe IDE. For subsequent key updates, though, the standard obviously provides ways to perform authenticated key updates using NIST AES GCM, since there you can add some integrity protection on the ciphertext in addition to confidentiality. So the first set of keys you inject are basically keys that you pre-share, which will help authenticate the next set of keys.
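As a concrete illustration of that authenticated key update step, here is a minimal sketch assuming the Python cryptography package; the helper names and the choice to carry the target key tag as additional authenticated data are assumptions of this example, not the standard's exact construction.

```python
# Hedged sketch of an authenticated key update: the previously provisioned key
# encryption key (KEK) wraps the new key with AES-GCM, giving both
# confidentiality and integrity, so tampering is detected before the key is
# accepted.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def wrap_key_update(kek: bytes, new_key: bytes, context: bytes) -> tuple[bytes, bytes]:
    """Returns (nonce, ciphertext-with-tag). 'context' is additional
    authenticated data, e.g. the target key tag in this sketch."""
    nonce = os.urandom(12)          # 96-bit nonce, never reused with the same KEK
    wrapped = AESGCM(kek).encrypt(nonce, new_key, context)
    return nonce, wrapped

def unwrap_key_update(kek: bytes, nonce: bytes, wrapped: bytes, context: bytes) -> bytes:
    # Raises cryptography.exceptions.InvalidTag if the ciphertext or the
    # context was modified in transit.
    return AESGCM(kek).decrypt(nonce, wrapped, context)
```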
Do you have to have a specific bit strength on the wrapping keys? Don't they have to be higher for the strength of the wrapping?
Well, the... Oh, yeah, yeah. Thank you. Thanks. So the question being asked is: does the security strength of the wrapping keys have to be higher than that of the keys they're wrapping? Very good question. I believe we're considering 256 bits across the board for the wrapping keys and also the media encryption keys. So that should meet the security strength requirement for the wrapping keys, and the data itself is encrypted by the keys that are wrapped by the wrapping keys. So, you know, for the first set of keys, as I mentioned, you set them up so you can use them to authenticate the next set of keys.
The other option we have, and all of these are host configurable, is for use cases where you may not have these link protection technologies available. You know, I'm not sure how many vendors have link encryption shipping today, or support for SPDM secure sessions. So for those use cases, we provide an option to use an asymmetric key transport, at least for provisioning this first set of keys. But then for subsequent key updates, you can use symmetric AES GCM. In this option, the assumption is that the storage device can be pre-provisioned with some sort of public key certificate. You can then use the certificate to set up the key transport algorithm. The way it will work, basically, is the host application pulls the certificate from the device, registers the certificate with the key manager, and can then tell the key manager to use that certificate's public key to encrypt whatever key encryption key it wants to send down, and just relay that to the storage device to set up the first keys.
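Here is a hedged sketch of that certificate-based flow, again assuming the Python cryptography package; RSA-OAEP is used purely as an illustrative key transport algorithm and the helper name is an assumption of this example, since the standard defines the actual algorithms and formats.

```python
# Illustrative sketch: the host reads a pre-provisioned certificate from the
# drive, hands it to the key manager, and the key manager wraps the first key
# encryption key with the device's public key, so the host only relays an
# opaque blob down to the drive.
from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding

def wrap_kek_with_device_cert(device_cert_der: bytes, kek: bytes) -> bytes:
    cert = x509.load_der_x509_certificate(device_cert_der)
    public_key = cert.public_key()   # assumed to be an RSA key in this sketch
    return public_key.encrypt(
        kek,
        padding.OAEP(
            mgf=padding.MGF1(algorithm=hashes.SHA256()),
            algorithm=hashes.SHA256(),
            label=None,
        ),
    )
```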
One of the future work items we're considering, as I briefly touched on, is that our current options target the protection of keys from the host application to the storage device. But there are use cases where you may want to establish end-to-end protection of a key. For example, if the key is owned by a user who sits on top of that application, you want to protect the keys from that user all the way to the namespace. In addition to that, some of the feedback we've gotten is that not everyone wants those key encryption keys to persist on a storage device. Some use cases will want, basically, the device to not have access to any of the keys when it loses power. So that's the concept of ephemeral KEKs. We've been looking to figure out exactly how that will work. The idea is that you still want to be able to establish the protection of the keys, while still allowing the intermediate layer to dictate which namespace will consume which keys. So this is not going to make it into version 1.0, but into subsequent versions. So once you've established your
first keys, the key encryption keys, which we can view as the authentication keys, you then go on to establish the media encryption keys. These are the keys that are going to be used for user data protection. For these keys, the idea is that they can only be provisioned by the entity that provisioned the authentication keys. So they will all be sent down encrypted using the previously provisioned key encryption keys, or authentication keys. Obviously, it's very important that you have integrity guarantees on the ciphertext for the MEKs. Since they are used for encryption of user data, you want to make sure that the keys you're using haven't been tampered with; at least, if they have been tampered with, we should be able to detect that before those keys are accepted. One of the main properties of Key Per IO is that the media encryption keys never persist on a storage device once it loses power. So on every boot, the host will basically have to re-authenticate by supplying the MEKs again, protected by the previously provisioned authentication keys.
The other nice property that we baked into version 1 is the ability to support replay protection. As you can imagine, if an analyzer is sitting on the traffic and records the previously inserted keys, it could just replay them and get access to the data. So having replay protection is just to make sure that every time a key is inserted, we have a quick challenge check to be able to tell that the keys being injected are fresh, not old copies that have been previously injected. So this will come in version 1.0 as well.
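To make the freshness idea concrete, here is a toy sketch under assumed helper names; the real challenge/response format is defined by the TCG specification, and the device model below is only illustrative.

```python
# Toy model of replay protection: the device hands out a fresh challenge, the
# host binds that challenge into the authenticated key-injection payload as
# additional authenticated data, and a recorded (nonce, wrapped-key) pair made
# for an older challenge fails authentication and is rejected.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def host_wrap_for_challenge(kek: bytes, mek: bytes, challenge: bytes) -> tuple[bytes, bytes]:
    nonce = os.urandom(12)
    return nonce, AESGCM(kek).encrypt(nonce, mek, challenge)

class DeviceSideModel:
    """Illustration only; real devices implement this per the TCG spec."""
    def __init__(self, kek: bytes):
        self.kek = kek
        self.pending_challenge = None

    def get_challenge(self) -> bytes:
        self.pending_challenge = os.urandom(16)   # fresh for every injection attempt
        return self.pending_challenge

    def accept_key(self, nonce: bytes, wrapped: bytes) -> bytes:
        # A replayed payload bound to an old challenge raises InvalidTag here.
        mek = AESGCM(self.kek).decrypt(nonce, wrapped, self.pending_challenge)
        self.pending_challenge = None
        return mek
```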
So I think at a high level for key provisioning schemes, this is what you can expect to see,
at least in version 1.0.
Any questions before we jump into the interface interactions?
In your presentation, you mentioned IDE. I'm assuming this will work over PCIe, so I was wondering what that means for fabrics.
I think it's a very good question, and that's what inspired the second option. We need an option that doesn't rely on either SPDM secure sessions or PCIe IDE, something that's native to the protocol that can provide the link protection. And that's what our second option is today. Okay, thank you.
So just to elaborate a little bit on Mike's question there about fabrics,
we do have authentication and TLS types of protocols. We have FC-SP. So there are ways to secure the transports for some of those fabric environments.
So basically, the use case for Key Per IO is to be able to do much more fine-grained control.
The methods we have with self-encrypting drives today are either for the entire drive or for predefined LBA ranges.
That makes it hard for the application to be able to make sure that their data goes in a particular range of LBAs
so that it gets encrypted with the appropriate key. So the idea here is that each IO operation,
each read command, each write command, gets to select its own key that is unique for that data.
So here's what that might look like. We have our green tenant, our yellow tenant, our blue tenant, and our purple tenant. And each of their data is being encrypted with their own key
before it's stored out on the media so that they don't have to worry about where it goes using the
existing SED technologies that TCG has provided. So they can then all be mixed out on that volume,
and you can more easily control the coming and going of that data.
You can erase the data and get rid of it,
because just by getting rid of the key,
it makes it virtually impossible to be able to get that data back
in its original form.
So a couple places where this might be implemented.
If we have some specific machines, the tenants on the left,
our green, yellow, and red tenant,
and they send some operations to an array controller over the fabric,
the array controller could then have its own KMIP database
that keeps track of each of those connections
to ensure that each tenant has its own data encrypted
and stored securely,
so that if somebody wants to securely obliterate
one of those tenants,
they don't have to go find all of the data
that is associated with that tenant
and scrub it somehow. They can just delete
the key out of the KMIP database. They can delete the key out of each of those SSDs. And all of a
sudden, all of the data for that tenant is gone without disturbing any of the other data for any
of the other tenants. But right now, to be able to do that, you have to do it with the whole drive,
or the array has to have previously divided up those LBAs into some fixed LBA ranges.
So you can do the same thing in virtual machines, where each virtual machine gets its own
key associated with it. It goes through the hypervisor, and the data keeps its association with that key,
and now it's stored out on whatever device it's being written to.
And so, again, to get rid of any one of those individual machines, those virtual machines,
you just have to get rid of the key that was being used by that machine.
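A minimal host-side sketch of that crypto-erase flow, assuming hypothetical client and drive helper methods (destroy_key, key_tags_for, clear_key_slot), since the actual command names come from KMIP and the TCG specification:

```python
# Crypto-erasing one tenant means destroying that tenant's key everywhere it
# lives: the authoritative copy in the KMIP database and any cached copies in
# drive key slots. No scrubbing of the tenant's LBAs is needed, because the
# remaining ciphertext is unreadable without the key.
def crypto_erase_tenant(kmip_client, drives, tenant_key_id: str) -> None:
    # 1. Destroy the authoritative copy held by the key manager.
    kmip_client.destroy_key(tenant_key_id)        # assumed KMIP client method

    # 2. Clear any cached copies on each drive, e.g. via the TCG key-clearing
    #    commands, using the host's own record of where the key was loaded.
    for drive in drives:
        for key_tag in drive.key_tags_for(tenant_key_id):
            drive.clear_key_slot(key_tag)
```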
So there are several other ways that this can be put to use. These just happen to be two of the common use cases.
So to operate with this,
the system is going to have to do some discovery
just like it does with any feature of a device.
It's going to have to figure out
how many key tags are available.
So the key tag is the thing that's used with the I/O. You say, I want to use key number one, key number 10, key number 100. That's the key tag. So each I/O doesn't have to pack an extra 512 bytes' worth of key as part of the I/O; it's the key tag that references the key. The key gets stored separately in the device, and the appropriate key, the one associated with that key tag, is what gets used. So the host has to know how many of those a device can support. Sometimes different
encryption algorithms have different requirements on the amount of data that they're encrypting.
And so there are maybe some granularity and alignment requirements
that are going to be associated with different algorithms,
and the host is going to need to know that.
So some of this is obtained through the NVMe identify commands,
and some of this is obtained through the TCG discovery commands,
the security protocol send to request the information,
and the security protocol receive to get the result back to be able to determine what the capabilities are that the device supports.
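Here is a hedged sketch of what that discovery might look like on the host side; the wrapper methods and field names are illustrative placeholders, not the actual NVMe Identify or TCG discovery layouts.

```python
# Discovery sketch: the host gathers Key Per IO capabilities from NVMe
# Identify data plus TCG discovery (Security Send to request, Security Receive
# to read back) before it starts loading keys.
from dataclasses import dataclass, field

@dataclass
class KeyPerIOCapabilities:
    supported: bool
    max_key_tags: int          # how many keys the key cache can hold at once
    granularity_bytes: int     # granularity/alignment imposed by the algorithm
    algorithms: list = field(default_factory=list)

def discover_capabilities(nvme_dev) -> KeyPerIOCapabilities:
    ns_info = nvme_dev.identify_namespace()        # assumed wrapper around Identify
    tcg_info = nvme_dev.tcg_level0_discovery()     # assumed wrapper around Security Receive
    return KeyPerIOCapabilities(
        supported=ns_info.get("key_per_io", False),
        max_key_tags=tcg_info.get("max_key_tags", 0),
        granularity_bytes=tcg_info.get("granularity", 4096),
        algorithms=tcg_info.get("algorithms", []),
    )
```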
So the first step, establishing the key encryption keys.
Festus talked about that.
That's a negotiation that happens between the host and the device to get the key encryption keys there.
And there's a couple of ways that that can happen that he talked about.
And then the media encryption keys have to be inserted.
And the key encryption keys are the things that are used to protect that. So we have
a couple of different protocol-specific ways that the transports protect that information through
either some of the PCI mechanisms, through some of the fabric encryption mechanisms, having a TLS
channel established between the host and the storage device. So the media encryption keys get injected
using the TCG security send command
so that the device can learn what the keys are.
So here we have a very limited device.
This device has, looks like, seven key tag slots in its key cache.
Now, hopefully, nobody's going to build a device
this small. We expect the smallest ones will probably be at least in the hundreds, maybe
thousands of keys that will be supported by the device. But this key tag value is a 16-bit value,
so it's possible that there can be 65,000 keys within the device at any given point in time.
And since that's all backed by a KMIP database, there can be millions of keys, hundreds of millions of keys, however large your KMIP database can grow to support all of those keys, and they can be staged through the cache on an
as-needed basis. So in this case, we've got these seven key tags.
There's their key tag number, and there's a 256-bit key that's been inserted into the key cache that's associated with that key tag.
Now, you can see here, these media encryption keys aren't very creative. They're sort of sequential, just so it's kind of easy to see that they're different and you can tell that they're there. So we've got different key tags, we've got seven different values that the host can reference as it's going to be sending its I/Os. So we've injected these keys and we've started using some of them, and the host determines that it hasn't quite accurately predicted, or maybe the load has changed, and it needs to change what's in the cache. Just like any cache, the host is now managing this on a least-recently-used mechanism, and it's going to kick out the ones that it doesn't need. And in this case, you see it's kept the first two there,
the EF and the E0,
but it's had to change some of the ones after that.
And it's injected some new keys into those key tags.
So we've got a KMIP database
that has all of these extra keys in it
that can't fit in the device at any one point in time.
So we have to stage through them
as we need them to do our different IOs.
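A small host-side sketch of that least-recently-used staging, with assumed class and method names (the device wrapper's load_key, the KMIP client's get_key); it only illustrates the bookkeeping the host has to do, not a spec-defined API.

```python
# The KMIP database can hold far more keys than the drive's key cache, so the
# host stages keys through the cache, evicting the least recently used key tag
# when it needs to load a new key.
from collections import OrderedDict

class KeyTagCache:
    def __init__(self, device, num_slots: int):
        self.device = device
        self.num_slots = num_slots
        self.loaded = OrderedDict()          # kmip_key_id -> key_tag, oldest first

    def key_tag_for(self, kmip_key_id: str, kmip_client) -> int:
        if kmip_key_id in self.loaded:
            self.loaded.move_to_end(kmip_key_id)            # mark as most recently used
            return self.loaded[kmip_key_id]

        if len(self.loaded) >= self.num_slots:
            _, key_tag = self.loaded.popitem(last=False)    # evict the stalest key; reuse its slot
        else:
            key_tag = len(self.loaded)                      # next free slot in this simple model

        mek = kmip_client.get_key(kmip_key_id)              # fetch from the KMIP database
        self.device.load_key(key_tag, mek)                  # inject via Security Send
        self.loaded[kmip_key_id] = key_tag
        return key_tag
```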
So in particular, the commands in NVMe that need to be aware of this are any command that's going to do IO.
So the compare command,
it's got to be able to make sure that the data can get decrypted to be compared.
The copy command, it's going to have to decrypt the data,
copy the data, and then re-encrypt it again.
The verify, read, write, and write zeroes commands. Why do you need an encryption key on write zeroes? Well, if you didn't, then everyone would be able to tell what data was zeroes, and it wouldn't be protected. So all of the write zeroes are going to look like something different depending on the encryption key. And the zone append command for ZNS devices, because that's an additional way to get data written.
So here we have our key tag database,
which has the keys in it,
and we're going to be using some of those key tags to write some data
to our device. So here's an example sequence where we've issued a write command. We're writing out
to LBA 100. We're going to write eight blocks, and we're going to use key tag number one. So what
key tag number one means is that we have a media encryption key in the cache that ends in that little EF value.
Then we go on to write to LBA 200, and we're going to write 16 blocks this time using key tag 100.
So that's the key that ends in the value E6.
Then we go to read the data from LBA 100, and the host now has to know where that key tag is. It notes that it was written with
key tag number one, but it's possible that that key could now be in a different slot. It could
have gotten unloaded by the time that it wanted to do the read, and it might have had to have gotten
loaded back into a different slot because of the least recently used algorithms. In this case, that key happens to still
be in slot number one, so key tag number one is used to do the read. That key is then used to do
the decryption, and you get your data back.
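Here is a hedged host-side sketch of the sequence just described, reusing the KeyTagCache idea from the earlier sketch; the nvme_write/nvme_read wrappers, the 4096-byte block size, and the tenant key names are assumptions of this example, since where the key tag actually sits in the NVMe command is defined by the specifications.

```python
# The host remembers which *key* encrypted each extent (not which slot), since
# a key may later be reloaded into a different key tag. On a read, it resolves
# the key's current tag; supplying the wrong tag returns garbage or an apparent
# media error, as described in the next example.
def example_sequence(nvme_dev, key_tag_cache, kmip_client):
    written_with = {}                                  # starting LBA -> KMIP key id

    tag = key_tag_cache.key_tag_for("tenant-green-key", kmip_client)
    nvme_dev.nvme_write(slba=100, nlb=8, key_tag=tag, data=b"\x00" * 8 * 4096)
    written_with[100] = "tenant-green-key"             # e.g. the key ending in ...EF

    tag = key_tag_cache.key_tag_for("tenant-blue-key", kmip_client)
    nvme_dev.nvme_write(slba=200, nlb=16, key_tag=tag, data=b"\x00" * 16 * 4096)
    written_with[200] = "tenant-blue-key"              # e.g. the key ending in ...E6

    # Read LBA 100 back with whatever tag that key currently occupies.
    tag = key_tag_cache.key_tag_for(written_with[100], kmip_client)
    return nvme_dev.nvme_read(slba=100, nlb=8, key_tag=tag)
```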
So in the next example, the host makes a mistake. I mean, the hosts aren't perfect. They have bugs. Sometimes applications have bugs, or sometimes human error comes in.
In this case, they try to read LBA 200.
They know they want to read 16 blocks, but they specify key tag number one.
So what does the device do?
It looks into slot number one.
It finds an encryption key, and it does a decryption.
And so depending on the implementation and the device, there's a couple of things that can happen. If you decrypt data with the wrong key, you get back garbage; that's just the way it is. You give it the wrong key, that's what you get. And if the device in fact stored its ECC check values prior to doing the encryption, then when you do the decryption, you're going to find that the ECC check value doesn't match what it should, and you're going to think you got a media error because the device is going to see the mismatch on the ECC. So there are a number of bad things
that can happen to a host that specifies the wrong media encryption key when it tries to do the read of the data.
But the point is the data is protected, and if you don't specify the right key,
you don't get the right data. So then the host notices its mistake. It reissues the read,
this time with key tag 100. That happens to match what it was written with,
and so now the host gets their data back.
We had a question, Randy.
Who manages which media keys are currently loaded into the key cache?
So the management of the media encryption keys
in that key cache and that association with the key tag
is managed by the host.
The host indicates which
media keys get loaded into which slots. The host can have those keys removed from a slot. It can
replace the key that's in that slot, so the host is in total control of managing that.
Are the slots associated with the host or with the namespace?
So, are the slots associated with the host or with the namespace? It is per namespace. Each namespace has its own key cache that can be loaded by the host. The host is in control of which keys go into which slots for each namespace. So if you have multiple hosts that are both going to access the same namespace,
then the assumption is that those hosts are going to know about each other,
they're going to be coordinating their LBA accesses,
and therefore they're also going to be coordinating the key management.
Sort of like a cluster. When a cluster accesses the same namespace, the individual nodes have to
have some level of coordination between them. So the same thing happens with this key slot management. Yes, if you want to restrict access to one at a time,
then those reservations would apply here as well.
You can say, you know, I want to...
Reservation covers the whole namespace.
Right.
So you can use reservations to lock out other hosts from being able to do things to that namespace.
But that's not much
of a shared environment
when you do that.
So we had another question.
Are the commands that manage the cache made in the NVMe layer or in the TCG layer?
Are those commands at the TCG layer or at the NVMe layer? That's sort of both, because they are an NVMe security send command and an NVMe security receive command, which then contain TCG content within the data buffers that flow back and forth. So it's sort of both.
Yeah, yeah.
It works the same way that all the other TCG commands work.
Yes.
Question there.
Yes, the key cache is limited.
There's only so many slots.
That's what the discovery process is for.
If the hosts want to, they could agree to divide up the key cache.
You get slot 1 to 100, and I get slot 101 to 200.
So there's negotiations that have to happen in any shared environment.
Yes.
Yeah.
I'm sorry, this is the first time I've heard of this concept, so I have a very basic question: what does this add compared with just encrypting the data on the host before writing it?
So the question is about the value of the feature. So yes, what you described works perfectly fine. You can have a host software implementation that encrypts the data,
burning host CPU cycles to do
that encryption work and send the
data out in an encrypted form to the media
and can keep track of all of
that and manage all of that so that when it
reads the data back in again,
it then additionally burns more
host CPU cycles to do the decryption
of that data. But if we can take
all of that encryption and decryption work
and we can shuffle it out to the device, then
we've saved all of that work. Yes, it could be thought of as a form of computational storage that's using unique TCG-style APIs.
Yeah, I mean, there's many other use cases for
it.
One of the ones that was mentioned at the beginning was the European GDPR requirements.
I mean, you can imagine this going to a worst-case environment.
Well, actually, for a storage vendor, it might be a really good solution,
is that every person in the world gets their own key. Yeah, that tends to,
that starts to make my mind explode
with scaling issues,
especially if each person wants
to have their own unique key
for each of their own types of data.
I want to have a key so that when I want to be forgotten on Facebook,
I can say, just forget my Facebook key.
That's a lot of keys to manage.
Two people with one picture.
Yes, question?
And that's what you can do with Opal.
How does this compare to today's methods? So today's methods, you can either self-encrypt the whole drive, or you can self-encrypt predefined fixed LBA ranges.
And that number of predefined fixed LBA ranges is not terribly large, but now the host is stuck managing
those pre-allocated spaces, and it's much less dynamic. If the first one that you set up,
if they only store a little bit of data, you've already pre-allocated it. If the second range,
if they start filling that up with data and you realize, oh my goodness, I don't have enough space, what do you do?
There's not a whole lot you can do.
It's all preallocated.
You're going to have to take a second range, which means you're now going to be using a separate key.
You're going to have two keys for the same application.
Just because it stored too much data and overflowed its first range. So that's the difference between the static nature of predefined ranges and Key Per IO, where each I/O gets to define its own key. It's fully dynamic.
Is this one-sided?
Does the initiator encrypt and the target just get the encrypted data?
So do both ends have to implement it? So the question is, do both ends have to implement it?
And the answer is no.
The data is going over the wire just like it does with the self-encrypting device today.
It goes in the clear.
And then the device is doing the encryption.
This is an extension of the SED, the self-encrypting drive,
so that the data gets encrypted in the device when it receives it and the data gets decrypted before it is sent back to the host.
So what the host is doing is the host is managing which key is being used for any given I.O.
And the device is then doing all the work to encrypt and store the data.
This is still about protecting data at rest.
Yeah, if your drive falls off the back of the truck, and somebody walks away with that drive, what do they get?
Well, in the case of an SED today with Opal, they get the drive and they get the key. The device is the only one that knows
that key in today's self-encrypting drives. In this case, they get the drive with the data,
but they don't get any of the keys. The keys are all back in the host,
because when the power is lost, the keys are lost too.
Question? Yes.
So the question is about managing the depth of the NVMe queue. 65,000 queues with 65,000 IOs on each queue. How do you
coordinate that with the management plane access for managing that cache? Right. And making sure
that that happens correctly is the responsibility of the host. If you want to use a key, you have
to send the admin command, the security send command, to get the key out there. And then you have to wait for that command to complete
before you can use the key because that's the only way you know that the key is there to be used.
There have been some proofs of concept done with this, and for the scale that was used in those proofs of concept, there were no issues.
I don't know if the scale,
that obviously didn't scale all the way up to the 64K
by 64K configurations,
but at the scale that it was tested, it worked well.
Today's self-encrypting drives, they run these encryption
algorithms at line speed. They can still go full bore in both directions, reading and writing.
And so the expectation, yeah. Well, the flash, self-encrypting device, they changed it from
self-encrypting disk to self-encrypting device.
So the flash devices, the CPUs that are doing all of that stuff are still doing it at line speed.
Correct.
The key management was much less complex in the self-encrypting drives of today versus what we have here with the key per IO.
Do you have a comment, Festus?
So, yeah, I don't know if I can repeat all that, but some of the performance is dependent
on those access patterns.
The other thing is getting a device that matches the application. If you buy a rotating device and expect to get SSD performance out of it,
you bought the wrong device. If you buy a device that has a key cache of 100 elements and you
expect to run applications that need 1,000 elements in their cache, then you bought the
wrong device. You should have bought one that had room for 1,000 if that's what you really expect to run. So there will be some variability in the market in terms of the number of slots that are in them.
And so hopefully there won't be too much variability.
I'm sure manufacturers don't want to have all the different SKUs to manage and stuff like that.
So there will likely be a couple of points
where they'll have fewer or larger number of slots.
So we've talked about some of these things already.
The security send receive commands.
We have the new protocol ID.
We still have the TCG authentication discovery process.
So we have new commands for injecting the keys,
clearing the keys, replacing the keys in the cache,
purging the cache.
You can pull the plug, but you can also send a command to just tell the device to get rid of all the keys.
And we've got a number of different encryption algorithms that have been included in the standard that can be used.
You specify which one when you load the key.
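As a rough sketch of those host-side operations, with illustrative method names only (each call would ultimately be carried as a TCG payload inside an NVMe Security Send, not through a spec-defined Python API):

```python
# Hedged sketch of the key-management operations listed above: inject a key
# into a slot (choosing its algorithm at load time), clear one slot, or purge
# the whole key cache, which has the same effect on the device as losing power.
class KeyPerIOAdmin:
    def __init__(self, nvme_dev):
        self.dev = nvme_dev          # assumed wrapper exposing Security Send/Receive

    def inject_key(self, key_tag: int, wrapped_mek: bytes, algorithm: str = "AES-256-XTS") -> None:
        payload = self._build_payload("inject", key_tag=key_tag,
                                      key=wrapped_mek, algorithm=algorithm)
        self.dev.security_send(data=payload)

    def clear_key(self, key_tag: int) -> None:
        self.dev.security_send(data=self._build_payload("clear", key_tag=key_tag))

    def purge_all_keys(self) -> None:
        # Leaves no media encryption keys on the device, like pulling the plug.
        self.dev.security_send(data=self._build_payload("purge"))

    def _build_payload(self, op: str, **fields) -> bytes:
        # Placeholder encoding; the real payload format is defined by TCG.
        parts = [op] + [f"{k}={v.hex() if isinstance(v, bytes) else v}"
                        for k, v in fields.items()]
        return ";".join(parts).encode()
```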
So the host is responsible for the key management, everything to do with the key: creating the key, loading it into the device. And the host will keep a copy of that around in its KMIP database.
So just because you pull the plug on a device
doesn't mean the key went away.
The key went away in the device, but the key still exists in the database.
And if you really want to get rid of it, you have to get rid of it in both places.
That's the host's responsibility.
The host has to make sure the right keys are there at the right time,
that all that gets coordinated,
and the host, of course, has to deal with any mistakes it makes.
So right now, there's a little bit of work left.
We're very close.
Most of these documents are in their final review at this point,
and so people are making some comments.
We may have one more spin of them, but if you're involved in the committees, take a look at them.
Because at this point, they're just a few weeks to months away from public release.
So when you load it in the drive, it's loaded into a slot?
Well, that's up to the host.
The host could overwrite a key in a previous slot that hadn't been used in a long time.
I'm not sure what you mean.
Will the drive reject a duplicate key tag?
I don't know what a duplicate key tag is.
Mm-hmm.
Yep.
The slot 10 in the cache.
Mm-hmm.
Yep.
And just changes the value of the key in that slot.
So there is a slot number 10?
Yes.
The slots are identified,
and you load the contents of that slot.
As far as denial of service attacks, you have to have an application that's gone through the authentication process and knows how to access the key encryption keys to be able to get this stuff out over the wire. There are an awful lot of other security layers that are going to make that pretty hard.
So we're working on the NVMe protocol
in the NVMexpress group and the TCG protocol
in the TCG group.
Yes?
It's associated with a namespace.
And so if you have a PCI card with one namespace,
it has one cache.
If you have an array that is presenting 10 namespaces, it's going to have 10 caches, one per namespace. So the last thing is that we are using
the TCG architecture with the security protocol in and out commands in SCSI, the security send
and receive, and the SATA devices. So this would all be possible to port to those protocols as well.
Right now, there's not a lot of interest.
There's been a couple of people in the SAS area that have started to ask the question
about whether it would be appropriate.
But there's, so far, NVMe only.
So we'll have to see how that develops over time.
Yes, Randy.
Did you speak at all about the thinking on why the key cache is associated with the namespace as opposed to with the controller?
The expectation, there were a couple of expectations
as we went through the use cases.
Yeah, repeat the question.
Why it was per namespace, why the key cache is per namespace instead of per controller.
The assumption was that applications would be more per namespace than per controller.
And there were arguments on both sides.
And we just ended up deciding, yeah, it could have been a toss-up.
Maybe, you know, I don't think it was that we just sort of let the loudest person win.
But, yeah.
And we are officially out of time.
So hopefully everybody's gotten their questions answered.
And here's sort of the statements from TCG and NVMe about who they are.
And we're out of time.
Thanks for listening.
If you have questions about the material presented in this podcast, be sure and join our developers mailing list by sending an email to developers-subscribe at snia.org.
Here you can ask questions and discuss this topic further with your peers in the Storage Developer community. For additional information about the Storage Developer Conference, visit www.storagedeveloper.org.