In The Arena by TechArena - TechArena Great Debate: Google Talks OCP LOCK and Open Standards

Episode Date: June 23, 2025

Amber Huffman and Jeff Andersen of Google join Allyson Klein to discuss the roadmap for OCP LOCK, post-quantum security, and how open ecosystems accelerate hardware trust and vendor adoption....

Transcript
Starting point is 00:00:00 Welcome to the TechArena Great Debates. My name is Allyson Klein, and today the Googlers are in the house with me. I'm so excited. I've got Amber Huffman and Jeff Andersen from Google. Amber and Jeff, why don't you go ahead and introduce yourselves? Thanks so much for having us, Allyson. I'm really excited to talk with you today. So I'm Amber Huffman. I'm at Google Cloud, where I lead our ecosystem and standards.
Starting point is 00:00:30 I'm the Open Compute Project board member for Google, and I also lead NVM Express. Awesome. Hi, I'm Jeff Andersen. I've been working at Google for about nine years now, working on platform attestation and low-level hardware root of trust. I am involved in the OCP Security project, where I'm a co-lead, and the data center work group in TCG as well.
Starting point is 00:00:53 I've worked in DMTF on some of their standards as well. So I'm excited to talk about some of the new stuff we're cooking up. And I can't wait to get into it. Today's topic is a new initiative coming from OCP, OCP LOCK, and we will get to that in a second, but let's just talk context. One of the biggest challenges for data center infrastructure management
Starting point is 00:01:16 is reuse of devices when they're decommissioned. Why is this such a challenge, especially when you start considering drives? Yeah, I could kick it off. Great question. As an industry, our top priority is always protecting users' data, and that's really at the core of everything we do. It's how we build and maintain that trust. One thing that Jeff's an expert in is that we encrypt
Starting point is 00:01:45 drives based on the best security practices and algorithms. Now, one of the challenges is that we would love to erase all the data on those encrypted drives, share them in a secondary market, and get a second use out of each drive. But there are a couple of problems. The first challenge is that even if you could share a fully encrypted drive, nation-state actors could have the resources to potentially break that encryption, which is a big problem.
Starting point is 00:02:13 The second issue is that technology evolves over time, and older encryption algorithms could be broken. So one of the things we've asked is how we can attestably prove that something is completely erased. That's where we have not had a mechanism to date in the industry, especially with SSDs. And that's where LOCK is trying to come in. That's awesome. Now, if we were able to solve this, let's just talk about this from a value chain perspective. If we're able to solve the challenges that you've laid out,
Starting point is 00:02:46 what is the opportunity for the value chain in that second use? So the big opportunity in the value chain is that we could get much better use out of SSDs that we no longer have a use for, whether at a hyperscaler or at some small business. Rather than crushing those devices, you could actually put them out for sale for a secondary use and be completely confident that you're not leaking anyone's data. And so we have a great opportunity to have the best of both worlds, where you
Starting point is 00:03:19 are both secure and you also could be sustainable. So that's really where LOCK is coming in. I love that. I mean, drives are advancing all the time. We're seeing higher capacities, higher performance. But just because a drive is maybe not perfect for a data center or an AI application doesn't mean that it couldn't be used somewhere else at the edge.
Starting point is 00:03:42 I love it. And OCP is tackling it with LOCK. What is LOCK and why did you decide to focus here? Yeah, Jeff, do you want to take that one? Yeah, sure. So this builds on something called Caliptra, which many might be familiar with. It's an open source hardware root of trust.
Starting point is 00:04:03 This is more than a specification. It's an actual implementation. It's RTL, it's ROM, it's firmware, it's a whole tool chain for getting this foundational security into silicon. And what we found is that, yes, there are plenty of specifications out there, but implementation quality varies widely.
Starting point is 00:04:24 And to do our due diligence for our customers requires us to look carefully at the implementations that we take in. It's more scalable for us to be able to write down and actually implement what we want to see, and then have our vendors integrate it into their products. So that's what Caliptra is, and it's essentially a root of trust for everyone.
Starting point is 00:04:44 It handles identity and attestation, which is a core component of many pieces of secure hardware. But by its nature, it is very generic. OCP LOCK is building on that. We are essentially adding new features to Caliptra, specifically intended for storage devices. So if you take Caliptra, flip on the LOCK switch and put it in a storage device, that's what we're going for.
Starting point is 00:05:09 It also involves other efforts, because some of the new features we're talking about, like attested sanitization and multi-party authorization, require new APIs for the host. So we're also involved in the standards bodies that govern those, like TCG. So what is LOCK?
Starting point is 00:05:33 It is a bit of an umbrella effort to deliver hardware for our vendors as well as defining new APIs for the host and infrastructure to integrate with these devices when they come in. Now obviously we've talked about this as being run out of a project within OCP. Let's unpack that a little bit. Why is this a great thing for OCP to take on and how does it fit into the broader objectives of the organization? And can you talk a little bit about the project itself and who's contributing?
Starting point is 00:06:08 Sure. I mean, the spirit of OCP is that we have large data center customers and we have many vendors, and it is much more scalable for everyone to agree on what their requirements are, so we can just build one thing and integrate it well. So that's very much in line with the spirit of LOCK. Google is not special in what we want out of our storage devices.
Starting point is 00:06:30 Everyone wants more secure encryption in their drives. So that creates a really nice environment where we can partner with some of our other peers in the industry, like Microsoft, and drive this new standard. We also have partners in Samsung, KIOXIA, and Solidigm, who are the other core contributors to this. And we're seeing other uptake,
Starting point is 00:06:53 much more interest across a wide variety of vendors as well. So we're looking forward to engaging with the industry on this. And we're also taking advantage of the wonderfulness of Caliptra. Caliptra followed that path before. A few years ago, Caliptra started at OCP, did its implementation at CHIPS Alliance, and established that as a foundation.
Starting point is 00:07:14 And LOCK is taking advantage of that same proven trajectory. And we're really excited to be on that journey. Now, obviously, this improves security or you wouldn't be spending time on it, but can you walk us through how LOCK enhances data security compared to traditional methods and why this is better? Sure. So how does LOCK improve security?
Starting point is 00:07:39 Well, one, it's a known good implementation of the lowest layers. So in a drive that takes Caliptra and takes LOCK, only the components that we've actually authored ever get to see the media key. We get to implement the cryptography that mixes that key with, or binds that key to, external access keys. And that also binds it to secrets
Starting point is 00:08:06 which we can securely erase and attest to. So if you take LOCK and you integrate it correctly, then as a cloud service provider, I can be fairly confident that when I provision this drive with an access key, the drive data cannot be accessed without that access key. And that when I tell that drive to go erase itself and it reports back clean, it in fact has been cleansed.
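[Editor's sketch, not from the episode: a rough illustration of the key binding Jeff describes, assuming a simple HMAC chain. The function name, the party names, and the derivation itself are hypothetical and are not the construction in the OCP LOCK specification. The idea shown is only that the media key depends on an erasable on-device secret plus every access key, so withholding one access key, or erasing the device secret, makes the media key underivable.]

```python
import hashlib
import hmac
import secrets
from typing import Sequence


def derive_media_key(device_secret: bytes, access_keys: Sequence[bytes]) -> bytes:
    """Mix an erasable on-device secret with every access key, in order.

    Hypothetical derivation for illustration only.
    """
    key = device_secret
    for access_key in access_keys:
        # Each step keys HMAC with the running value, so every input is required.
        key = hmac.new(key, access_key, hashlib.sha256).digest()
    return key


# Hypothetical parties: the drive, the cloud operator, and the end customer.
device_secret = secrets.token_bytes(32)  # lives only inside the drive; erasable
operator_key = secrets.token_bytes(32)   # held by the cloud service provider
customer_key = secrets.token_bytes(32)   # held by the owner of the VM

media_key = derive_media_key(device_secret, [operator_key, customer_key])

# Substituting any other value for a withheld access key gives a different key.
wrong_key = derive_media_key(device_secret, [operator_key, secrets.token_bytes(32)])

# Cryptographic erase: the drive replaces its secret. The old media key can no
# longer be derived, even by someone who holds both access keys.
erased_secret = secrets.token_bytes(32)
after_erase = derive_media_key(erased_secret, [operator_key, customer_key])
```

[In this toy model, erasing the drive means destroying one 32-byte secret rather than overwriting every block, which is why an attested report that the secret is gone can stand in for a full overwrite.]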
Starting point is 00:08:31 And if I double-click into a couple of things that Jeff just said to expand a little more, one thing that LOCK is doing is multi-party authorization. And just to explain that a little more: traditionally, a drive has a single key. What LOCK is doing is giving it multiple keys. And so if somebody steals a drive, it's not a single password that unlocks that drive. There's an additional password, potentially from the cloud service provider
Starting point is 00:09:01 and/or whoever is the owner. And then even if you're in a cloud service provider context, there's yet another password that could be from the end customer that owns the VM. And that provides you an ability, even as the end customer, to say, hey, I don't want anybody to have access to this anymore. And that just gives you an additional level of protection
Starting point is 00:09:22 against it being stolen. And then, of course, with the implementation approach, you're looking at the tried and true, what we've seen over the last 30 years with open source. With open source, the more eyes there are, the more all of us can do to make sure we harden these implementations. And so we're really excited about the way that LOCK takes advantage of that. And it just layers on the additional security we've been doing at OCP overall. Like if we think about OCP S.A.F.E.,
Starting point is 00:09:48 that's a centralized audit capability and program. And so we're just trying to raise the level of the game across the industry on security, especially in the storage space. Well, and it makes sense, because this is not an area where any particular cloud service provider is going to differentiate. It's just a core capability that everybody needs and wants
Starting point is 00:10:08 to be as good as possible. One thing that I wanted to ask you about is how LOCK improves upon existing practices like drive destruction or multi-pass overwrites, particularly when you're considering something like operational efficiency and e-waste within an organization. Have you talked about that within the project? We have discussed it, yes. So traditional overwrite requires you to write over every bit of the drive. We do this in cases where we don't trust the drive's onboard cryptography to erase a key correctly.
Starting point is 00:10:46 It takes a long time and it's failure prone, so we'd rather move towards crypto erase. But we can only do that [...] and eliminating the requirement for something like multi-pass overwrite, right? Exactly. Yes. And another improvement we're making here is, you know, post-quantum is all the rage these days. Now, when you're talking about symmetric encryption inside of a drive, generally the algorithms
Starting point is 00:11:27 in use today are stable, are okay. A quantum computer is not going to be able to crack AES-256. But one thing that we are adding in LOCK, when we talk about multi-party authorization and these external access keys, is lightweight transport encryption, so that we can ensure that you can hold an access key in a remote key management service
Starting point is 00:11:52 or even an HSM and provision it to the drive directly when it needs it, without actually trusting the host to see that key and to not disclose it. This gives us even more confidence that no matter how the host is behaving, if you withhold that access key, the media on the drive cannot be decrypted. Now, this relies on asymmetric cryptography, which is susceptible to quantum attacks. And so that's why we're targeting post-quantum security from day one.
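[Editor's sketch, not from the episode: post-quantum protection for a key exchange like this is commonly obtained by hybridizing, that is, by running a classical and a post-quantum key agreement side by side and hashing both shared secrets into one wrapping key, so an attacker has to break both. The combiner below is a generic illustration under that assumption. The function name, context label, and choice of SHA-384 are hypothetical, the two input secrets are random stand-ins rather than real key-agreement outputs, and none of this is taken from the OCP LOCK specification.]

```python
import hashlib
import secrets


def combine_shared_secrets(classical: bytes, post_quantum: bytes, context: bytes) -> bytes:
    """Hash both secrets together; length prefixes keep the encoding unambiguous."""
    h = hashlib.sha384()
    for part in (classical, post_quantum, context):
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.digest()


# Stand-ins for real key-agreement outputs (assumption: random bytes here;
# no actual elliptic-curve or post-quantum exchange is performed).
classical_secret = secrets.token_bytes(48)
post_quantum_secret = secrets.token_bytes(32)

# Key that a remote key manager could use to wrap an access key for the drive,
# so the host only ever relays ciphertext.
wrapping_key = combine_shared_secrets(
    classical_secret, post_quantum_secret, b"access-key-transport"
)
```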
Starting point is 00:12:25 There are two different flavors of provisioning these keys, one based on elliptic curve and another that's hybridized between elliptic curve and the new ML-KEM. That's very cool. Now, you've talked about Caliptra. You also have something in LOCK called the Key Management Block, or KMB. Can you explain the relative roles of these technologies in LOCK? Sure. So Caliptra provides this baseline identity and attestation primitive support. There is going to be more software on a drive controller than what runs inside Caliptra, and Caliptra's job is to go collect measurements
Starting point is 00:13:05 of that code and then attest to them. So whenever you're talking to a drive, you can know that you're talking to the right drive with the right firmware. KMB is our name for the new functionality inside Caliptra, which will actually handle these media keys and the access keys and the erasable keys and fuses. So that's just the logical partition. KMB will only be there if you enable LOCK features
Starting point is 00:13:33 at integration time. So if you integrate Caliptra in, say, an accelerator or a CPU, you don't have to add LOCK; it won't be there. Okay, got it. Now, Amber, you started talking about open source and the benefit of open source. But when you think about LOCK's open source approach, is there anything unique here that you've seen, based on all of your broad work in industry efforts, in what the team is doing here with LOCK that you'd want to call out?
Starting point is 00:14:02 Yeah, what I'd like to call out is the way that open source software is shifting towards open source hardware. If we look at Caliptra, it's really the first broad open source hardware approach taken by major vendors, with competitors collaborating together. What you see with OCP LOCK building on that is
Starting point is 00:14:21 that it's expanding into the storage space. And I think what we increasingly will find, with these functions that are not differentiators, is the question of how we can harden those, reduce that toil, and all be much more confident. So as we keep moving forward, I'm hoping to see more IP blocks that are not differentiating, and to ask how we can all leverage them. Because if we think about what open source software has done, a lot of the types of attacks that you see with buffer overflows and that type of stuff
Starting point is 00:14:51 have been solved in lots of cases by protective languages like Rust and other things. So how do we bring these techniques and open source capabilities to hardware more and more? And we're excited to see the vendors be receptive to how we move forward. And there's been some talk of NVMe in this, but actually what Jeff and other architects at Microsoft
Starting point is 00:15:16 and Samsung and KIOXIA and Solidigm have been doing is making sure that everything we do actually applies to any storage device. And so there's nothing that we're doing in here that couldn't be applied to hard drives or to who knows what next-generation storage device we'll come up with in the future. You know, I was at OCP Dublin, and it was interesting to see how many conversations were focused on memory and storage innovation and what was needed for that next wave of technology. Is LOCK a reflection of a deeper engagement with the storage and memory industry at OCP, and what's driving that?
Starting point is 00:15:55 That's a great question. So I think that's one thing that LOCK is showing. Today you have NVM Express as a standards body, and you have many different standards bodies. What I'm seeing as a great opportunity at OCP, and in that deeper engagement with the storage industry, is that things are moving much faster, especially with AI on a yearly beat rate, and we need to innovate faster. Typically, a standard will take 18 months, 24 months, or so in a traditional standards body.
Starting point is 00:16:26 What OCP is allowing us to do is come together with people that are ready to move quickly and solve a problem quickly. It gets you to, say, something that a traditional standards body might consider only 90% good enough in terms of every i being dotted and every t crossed. But you move much more quickly. And then the implementation approach
Starting point is 00:16:47 that Caliptra pioneered is really helpful because, as Jeff was pointing out earlier, a lot of times these issues come up where people interpret the spec incorrectly. And so having that full implementation is where the rubber meets the road: people can see exactly how it works. So I do believe that you'll see more and more of how do we just move quickly together and
Starting point is 00:17:10 have proven implementations, and that will move us all forward much more quickly. Now let's just imagine a world where LOCK is fully implemented. How do you imagine utilization of SSDs changing? What's the impact on vendors and operators? And where I go is sustainability objectives in terms of reuse. I'm hoping that as LOCK gets fully implemented
Starting point is 00:17:35 it becomes really boring to vendors, because it is going to be like an IP block that you just pick up from an IP vendor, but it's just going to happen to be this community implementation. This becomes an area where, if there's an innovation, it's an innovation as a community to lift all boats. And so I think that's really exciting.
Starting point is 00:17:58 And it will be very exciting to have more SSDs and other storage devices available in the secondary market instead of any type of e-waste and crushing of those devices. I mean, what you see today that is a big problem, especially with NAND flash, is that the algorithms used to manage bad blocks and other things make it really hard to do what you were mentioning earlier, Allyson, the overwrite approach. You just don't have access to all of the NAND on the device. So these types of mechanisms are just going to be a game changer;
Starting point is 00:18:31 a lot of times today, as with OCP S.A.F.E., you're saying, hey, can I audit your code and really understand why it's secure? Why do I have to trust it? All of that overhead is stuff that we won't really have to do for certain blocks that are proven. So I think it's just a much simpler, more trustworthy world as we move forward.
Starting point is 00:18:51 Jeff, any thoughts that you have? One thing that I did want to comment on, on the benefit of open source here, is that one thing we're targeting is FIPS compliance. This is just an example of why open source is great. Because we're targeting FIPS compliance, we can check against FIPS and identify areas where, oh, okay, it looks like we were about to do something that is reasonable cryptographically, but won't line up with FIPS, so let's course correct. We can course correct early and make sure that there will be no surprises later on, and that will further reduce the risk for vendors that want to take this. So when we look at that end goal, and it is so clear what the value is, what's the roadmap for LOCK's future development and adoption to get to that vision? The roadmap to adoption of LOCK is, well, we have to make it real first.
Starting point is 00:20:14 It's hard to tell vendors to take this when it's still being assembled in flight. So we are developing it on a schedule, because we do have intended silicon intercepts. So there's the baseline Caliptra platform, where 2.0 has recently been tagged for release. And that includes some post-quantum algorithms like ML-DSA. Then there'll be a 2.1, which will address the LOCK features, so it will include ML-KEM support and a few other enhancements
Starting point is 00:20:56 to the key management hardware. Then we'll implement the firmware for LOCK. And then we'll also provide artifacts, again, for FIPS compliance. We need extensive documentation so that vendors who take LOCK have a very easy path towards FIPS certifying or FIPS validating their parts. That's awesome. Yeah, and people can get the OCP LOCK 0.8 specification today and download it. The 1.0 is going to come out later. Our target is later this summer to fall.
Starting point is 00:21:33 So definitely ahead of the OCP Global Summit in October. And Caliptra 2.1 is the intercept for the OCP LOCK functionality. And from an RTL perspective, that should be in place. The schedule is this summer as well. So with that RTL done, obviously we can't speak to vendor schedules. But hopefully, knock on wood, they'll be picking this up in their in-flight designs.
Starting point is 00:21:57 That's awesome. And one of the things that I was thinking about when you're going through that, Amber, is that I'm sure there are people online that want to get involved. Because you've got a who's who of folks working on this together. But one of the nice things about OCP is that anybody that wants to can join the work group
Starting point is 00:22:15 and get engaged in the process. So how do folks get involved in the effort? And what would you suggest in terms of finding out more and learning more about where the specification is and where it's going? Yeah, I think the beauty of OCP, as you point out, Allyson, is that things are open. So I would say people should go ahead and check out that 0.8 spec that's out there and provide feedback.
Starting point is 00:22:45 We'll include in the notes the email address for getting feedback to people like Jeff, who actually collate it. Jeff's been amazing at collating all the different feedback and diagnosing it, because with some of these acronyms from the security point of view, my eyes just start to glaze over. But there's also CHIPS Alliance. CHIPS Alliance is also an open ecosystem where people
Starting point is 00:23:12 can go check out the implementation as well. And so we'll make sure that the podcast notes have links to both of those. That's awesome. Well, thank you both for being here. I know that you have a lot of important work on your plates on a daily basis, and it was great to learn a little bit more about OCP LOCK and how it's going to be influencing the storage arena. Of course, for those who are listening online, check out OCP at their website. You can find all the information about this and other work groups. And then obviously techarena.ai for more about what's going on in the data center and AI arena. Thank you so much for being here, guys. Thank you. Thanks for having us.
