Podcast Archive - StorageReview.com - Podcast #147: Inside HPE’s New Model for High-Performance Data Protection

Starting point is 00:00:05 Chad, thanks for doing the podcast today. This is quite the environment you have here in the Raleigh office at HPE, and this is the first podcast I think I've done on a sofa. So thanks for the cozy environment. That should be conducive to our conversation. We rolled it out just for you. I don't believe that, because actually this place is pretty sweet that you've got set up. What, well, before we get into what our core topic is,

Starting point is 00:00:29 tell us a little more about what you do at HPE and what brings you here to Raleigh to hang with us for the day. Sure. My name is Chad Brunager. I'm a principal product manager for solutions with the X10,000. Okay. And we're here in the Raleigh Triangle Office. We're going to talk about the Data Protection Accelerator with the X10,000.

Starting point is 00:00:53 Yeah, so X10,000, we did a deep piece on B10,000. So this is similar but different. Talk about that because HPE made the decision, gosh, I guess it's been what, two, two and a half years now, to fundamentally change the way you were looking at storage from a hardware perspective, and the X10,000 is another bit of that story? It is. The X10,000 is the next chapter. So where the B10,000 is sort of our heritage in block, structured storage, the other part

Starting point is 00:01:25 of that story is unstructured. Right. So X10,000 from the ground up is HPE IP. So not through acquisition, just real innovation. And it is the unstructured platform that mirrors block. So it's built on the same MP platform. However, the software stack above it is all object and soon to be file. So this is the thing though.

Starting point is 00:01:54 We hear that and we talk about B10K and X10K, but I think of those as primary storage arrays. Now we're going to sit here and talk about X10K as a backup target. Yeah. That's wild. How did we get here? How did you get here in terms of an all-flash, fully HA architecture for backup? Yeah.

Starting point is 00:02:20 It just made sense to us when we started talking about it. To be completely honest with you, we saw the high I/O, we saw the performance, and we were looking at the X-10,000 as more than just a storage array. It's a platform. And our approach to this from the beginning has been about solutions. What kind of solutions can we meet our customers with? And how can the X-10,000 be the platform that underpins that solution? So we knew what we built. We were very excited about it. But we wanted to meet our customers where they were. And data protection is a huge part of every customer's story. And it's only becoming more and more important. I mean, we don't have to rehash the ransomware and all the attacks and all that kind of thing.

Starting point is 00:03:06 The other thing, though, that this helps solve for is the data explosion, which we've been talking about now every year. It gets tiring, saying it, but it never goes away. Well, AI makes it worse, right? AI makes it even worse. So AI makes it worse for data availability, also data creation. And now no one wants to throw away anything because when you're training those models, the more data, the better in many cases, right? Yeah.

Starting point is 00:03:33 And cleaning that data takes space. So you need workspace, right? And so all these companies are trying to figure out how do they get to AI. Well, setting up a technology stack that can do things like AI pipelines and training models and all of that, it's not easy. But what companies are finding to be an even bigger challenge is getting their data ready for that process. Sure.

Starting point is 00:04:02 So, garbage in, garbage out. So getting your data ready takes workspace, and it takes a platform for you to be able to do that on. The X-10,000 can be both. So you've got this platform, it's got all flash, it's got high-speed networking. What's in the X10K now for networking? It currently goes up to 100 gigabit.

Starting point is 00:04:27 Right. And I believe they are about to certify 200 gigabit. So 100 gig is fast, right, for most data. And then you guys decided, we're gonna sell this as a backup target, but then we're gonna make it go even faster by putting this data protection accelerator node in front of it. So you guys were already thinking about performance,

Starting point is 00:04:55 and then you're like sitting around the table, and someone's like, you know what, we can make it go faster if we offload some stuff to this accelerator node? We got there. Okay. So I think it really started with how do we add more value, right?

Starting point is 00:05:09 So we can point one of our partners', you know, backup software stacks at the X-10,000. And like you said, it can go fast. Right. And that's exciting. And we thought that was very cool. And the next question wasn't,

Starting point is 00:05:24 how do we make it go faster? The question was, how do we get more value out of this process? And for us, the answer was make that process more efficient. So in the storage world, in this role, I uniquely straddle storage and data protection, and they really are two different worlds. Absolutely.

Starting point is 00:05:43 And so if you talk to the storage people, it's about I/O. And it's about usable capacity. If you talk to somebody in the data protection world, it's about effective capacity. And so we wanted to be able to bring that to the X-10,000. So the X-10,000 does have some compression with it, but a data protection customer is not going to be as interested.

Starting point is 00:06:05 And so we at HPE have our own Catalyst technology that currently lives inside of the StoreOnce product family. And we wanted to be able to bring that Catalyst technology into the X10,000 fold. And so the Catalyst technology has industry-leading dedupe, on average 20 to 1. We've seen it peak up over 60 to 1, which is pretty important. Depending on the data, depending on the day of the week. So what we did was we basically took this StoreOnce software and we put it onto a different chassis. We added memory, we added compute, we added NVMe drives. And then instead of storing the data on the unit itself, like a StoreOnce would work.

Starting point is 00:06:55 Right. We just direct pass it, stream it straight through to the X10,000. So what's happening is the Data Protection Accelerator is doing all the heavy lifting. It's grinding on the dedupe and the encryption, all of the work-intensive processes for data management. And then it's just streaming that data directly to and storing it securely on the X10,000. So it lets the X10,000 continue to do its job and offer the data services it has within, and then it just sits in that data path, accelerating.
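To make the idea of deduplicating on ingest and reassembling on restore concrete, here is a toy sketch in Python. It is not HPE's Catalyst algorithm, which the conversation does not describe; it assumes the simplest possible scheme (fixed-size chunks keyed by SHA-256), and the names `dedupe_store`, `rehydrate`, `dedupe_ratio`, and `CHUNK_SIZE` are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny so the toy data is easy to follow (assumption)


def dedupe_store(data, store):
    """Split data into fixed-size chunks and keep each unique chunk once.

    `store` maps a chunk's SHA-256 hex digest to the chunk bytes.
    Returns the ordered list of digests (the "recipe") needed to rebuild data.
    """
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # only previously unseen chunks use space
        recipe.append(digest)
    return recipe


def rehydrate(recipe, store):
    """Put the pieces back together in their original order (the restore path)."""
    return b"".join(store[digest] for digest in recipe)


def dedupe_ratio(logical_bytes, store):
    """Logical bytes protected divided by bytes actually stored."""
    stored_bytes = sum(len(chunk) for chunk in store.values())
    return logical_bytes / stored_bytes


if __name__ == "__main__":
    chunk_store = {}
    backup = b"ABCD" * 20 + b"WXYZ"  # highly repetitive toy "backup"
    recipe = dedupe_store(backup, chunk_store)
    assert rehydrate(recipe, chunk_store) == backup
    print(f"logical bytes: {len(backup)}, unique chunks: {len(chunk_store)}, "
          f"ratio: {dedupe_ratio(len(backup), chunk_store)}:1")
```

In this toy case 84 logical bytes reduce to two unique 4-byte chunks, a 10.5:1 ratio. That is the same kind of number as the 20 to 1 average quoted above: effective capacity is usable capacity multiplied by the ratio, and the ratio depends entirely on how repetitive the data is.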

Starting point is 00:07:28 And I'm talking a lot about backing up, but also on the recovery side too. On the recovery as well. So talk about the importance of the accelerator node on recovery, because it's sort of trafficking this data back to the host that needs the restore. What's going on the other way out? So most customers don't use restore in a

Starting point is 00:07:55 catastrophic event, but that's becoming more and more common these days. So the question that we get asked is, if, and more recently when, it happens, I need to be able to get it back fast, and I need to be confident that when I tell my management and my CEO that we have a solution in place, that we can recover, that I mean it and that we can. And so this does that. So what's happening in the data path is as you move that deduplicated data off of the X10,000, it needs to put all the bits and pieces back together.

Starting point is 00:08:37 And that's where having a flash system to do it really changes the game. So all that data is coming back through the DPA. It's getting put back together. And then it's getting deposited on whatever the target is. And you talked before about working with partners. The work that we've been doing with your team on this solution has been with Commvault, but you work with several other software vendors. So you can kind of meet your customer where they are.

Starting point is 00:09:03 That's the idea. So every customer has an environment in place. So there are going to be customers. You don't have any customers that are not backing up data? You know what? That probably exists somewhere. Well, you know, there are going to be customers that are going to say, hey, look, this is a great solution.

Starting point is 00:09:18 And I want to buy it with some Commvault and some Veeam. And we can help them with that. Right. But in my prior life, I spent 21 years on the buy side of things, and we had a very robust environment already. And when we bought more storage, we brought that storage and those solutions into our space. And what you don't want to do is necessarily force these teams

Starting point is 00:09:41 that are managing all of this data protection for these very large companies to switch what they're using to do it. So yeah, we wanted to make sure that we were a flexible solution and we weren't forcing somebody down a particular path. So what's the early read? I know you announced this last year, right? We did. We did.

Starting point is 00:10:02 We just started shipping in the last several weeks. What's the feedback you're hearing from customers that are trying this out? Because this is kind of a net new idea. We've seen flash targets. We've seen high performance flash targets. And even personally, as I go to these conferences like Veeam's conference or whatever, I saw the progression of their partners showing up at the show. You know, it used to be big bulky hard drive arrays, and then they would do a hybrid array,

Starting point is 00:10:31 and then an all-flash array, and then it was primary storage there, and I was like, whoa, that's interesting. But the accelerator node is not a common thing, and in fact, I can't think of anyone else that's doing it. Is anyone doing this yet? I'm not aware of anybody. So we did, we did some research to see what's going on out there.

Starting point is 00:10:52 I suppose you probably did. And no, I think that we're the only one that's doing it right now. And we really like the idea of this, you know, adding a node for this purpose. So a little bit of spec, and we won't get too deep on it. But each of these Data Protection Accelerators can manage up to two petabytes of data on the X10,000. Right. So that's a lot more than maybe a PBBA could do. But then when you have more than that, I mean, the X10,000 is a scaled system.

Starting point is 00:11:25 And we're talking about 5, 10, 20 petabytes. So, and it's the disaggregated shared everything architecture. So you can add more compute when you need more compute. You add more drives when you need more storage. Customers are going to grow. And so we had to think about it from that perspective is what happens when their data grows and their data protection needs grow with it. And so one of the, I think,

Starting point is 00:11:50 I think really key capabilities of the Data Protection Accelerator is that it scales linearly. So if I have more data than two petabytes, I just stack another one. Right. And four, five, six, seven, and... And again, that's interfacing with the backup software. So that's still where the customer is managing their experience. Now, the accelerator nodes, as we've seen, have a little GUI, and they've got some... It does, for the management of the accelerator.

Starting point is 00:12:19 Right. But for the infrastructure, the backup software sees it as the target. Yeah, yeah, that's pretty neat. And so you point it at the DPA. But just real quick, it's not just the capacity, though, right? So, and this is, for me,

Starting point is 00:12:32 this is really, really exciting, is SLAs, right? So, I mean, if you have a 100 terabyte database, right, obviously one of these accelerators is more than enough to manage that. And, you know, it's got a certain restore capability. Well, if I need my 100 terabyte database back in half of that time, I can stack a second DPA and use parallel streaming to get my stuff back. So it's also as important for SLAs as it is for capacity management.
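The SLA point above can be sketched as simple arithmetic. The snippet assumes the linear scaling described in the conversation (restore streams split evenly across DPAs). The per-DPA restore rate is a hypothetical placeholder, since no restore figure is quoted here, and `restore_hours` is an illustrative name rather than any HPE tool.

```python
def restore_hours(dataset_tb, per_dpa_restore_tb_per_hour, dpa_count):
    """Estimated restore time if throughput scales linearly with DPA count.

    per_dpa_restore_tb_per_hour is an assumption supplied by the caller;
    real rates depend on data, network, and the restore target.
    """
    if dpa_count < 1 or per_dpa_restore_tb_per_hour <= 0:
        raise ValueError("need at least one DPA and a positive restore rate")
    return dataset_tb / (per_dpa_restore_tb_per_hour * dpa_count)


if __name__ == "__main__":
    HYPOTHETICAL_RATE = 50.0  # TB/hour per DPA, placeholder for illustration only
    for count in (1, 2):
        hours = restore_hours(100.0, HYPOTHETICAL_RATE, count)
        print(f"{count} DPA(s): 100 TB restored in about {hours} hours")
```

Whatever the real per-unit rate turns out to be, the shape of the result is the same: under linear scaling, a second DPA halves the estimated restore time for the same 100 terabyte database.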

Starting point is 00:13:01 Well, I was asking you about the customers, and then we, I took you down a different trail. But so what are the customers saying? Because what you're saying about impact on SLAs, this is kind of new, right? Because they were solving, how would they solve the SLA problem before? More hardware? They would throw more network at it. They would throw more hardware at it. Right.

Starting point is 00:13:24 They would ask for more time. Extend the window, right? Extend the window. Yeah, so four hours is too long. Yeah. I need six or something like that. You know, and there's two aspects of it. There's the, I can't get it all backed up in enough time.

Starting point is 00:13:38 But then there's also, I can only get it back in 12 hours, and I need to get it back in four. Right. And so this helps with that. And as they're playing with this in doing POCs with you, it's got to be... What's the feedback? It's got to be kind of whiplash. If it's really a backup admin that's been suffering from this and you show up with a solution, like, that's got to be pretty wild for that guy. You know, it's been interesting. I guess it should have been expected. It was a little bit, but I think once you experience it, the disbelief. Like, there's no way you backed up all my VMs? Yeah, come on. You're kidding me.

Starting point is 00:14:32 And I'll believe it when I see it. And then it wasn't until customers started getting it in their environments and kicking the tires, that the feedback started trickling in that they're very happy. And then we started getting comparisons to other competitors or even to some of our own stuff that we had in their environment. And I started hearing words like, it smokes and they're really happy. So, you know, it's rolling out now. Customers seem to be happy.

Starting point is 00:15:04 We're here today just to sort of, I wanted to blow the top off of this thing. And just I want to see what its limit is. Well, you guys quote 1.2 petabytes an hour of ingest. Is that right? That's right. So that's with four of the accelerators in front of a three-node X10,000. Yeah, I mean, that's quick. It is.

Starting point is 00:15:29 So 300 terabytes an hour per... Per DPA. Per unit. And so the sizing, I guess, for you guys, could be related to capacity, could be related to SLAs or performance. It just kind of depends. Some customers may use more or less of these accelerator nodes just based on whatever problem they're solving for.
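The sizing logic described here, capacity on one side and throughput on the other, can be sketched as a small calculation. It uses only the figures quoted in the conversation (up to 2 PB managed per DPA, and 1.2 PB/hour of ingest across four DPAs, so 300 TB/hour each) and treats them as best-case ceilings. The function name `dpas_needed` and its parameters are hypothetical, not an HPE sizing tool.

```python
import math

CAPACITY_PER_DPA_PB = 2.0            # quoted: each DPA manages up to 2 PB
INGEST_PER_DPA_TB_PER_HOUR = 300.0   # quoted: 1.2 PB/hour across four DPAs


def dpas_needed(protected_pb, backup_tb, window_hours):
    """Return the DPA count implied by whichever constraint is tighter.

    protected_pb: data to be managed on the X10,000, in PB
    backup_tb:    data that must be ingested within one backup window, in TB
    window_hours: length of that backup window, in hours
    Assumes linear scaling and the quoted peak figures (both are assumptions).
    """
    if window_hours <= 0:
        raise ValueError("backup window must be positive")
    by_capacity = math.ceil(protected_pb / CAPACITY_PER_DPA_PB)
    required_rate = backup_tb / window_hours
    by_throughput = math.ceil(required_rate / INGEST_PER_DPA_TB_PER_HOUR)
    return max(1, by_capacity, by_throughput)


if __name__ == "__main__":
    # Capacity-bound example: 5 PB protected, modest 600 TB in a 4-hour window.
    print(dpas_needed(protected_pb=5, backup_tb=600, window_hours=4))
    # Throughput-bound example: 1 PB protected, 2,400 TB in a 2-hour window.
    print(dpas_needed(protected_pb=1, backup_tb=2400, window_hours=2))
```

The first case is driven by capacity (three units), the second by the ingest rate (four units), which mirrors the point that some customers simply have a lot of data while others have tight windows.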

Starting point is 00:15:50 That's exactly right. And that's how we should be talking to our customers is what's important to you. And some customers just have a lot of data. And some customers just have really important data. And so I think that we're going to find a real mix of it. So you talked about the StoreOnce Catalyst code being sort of fundamental to making the accelerator node work. How does this, as we look at the HPE data protection portfolio,

Starting point is 00:16:19 how are you thinking about where the existing stuff with StoreOnce fits, where this fits? Like, how can customers start to think about X10,000 with data protection accelerator in conjunction with the other products you offer? First and foremost, scale. You know, the larger customers, the large banks, the government accounts.

Starting point is 00:16:42 They have a lot of data. And so when you start talking about multi-petabyte, five, 10, 20, that's a normal conversation for them. And that sounds like, it used to sound like an impossible number, but you must be running into many multi-petabyte customers at this point. Many, yeah. On the flip side of that, you've got a lot of smaller customers

Starting point is 00:17:05 that are outgrowing themselves. And so these smaller backup solutions that they had in place, they're outgrowing them. They're trying to modernize. They're trying to future proof and they're thinking two, three, four years down the road, where am I going to be?

Starting point is 00:17:21 And instead of forklifting, instead of rebuilding clusters and rebalancing and doing all the things that they're used to doing, now they could potentially have this disaggregated backend where they are,

Starting point is 00:17:33 it's just one optimized storage pool and they're just adding drives as they need them. And if they need more performance, they add more DPAs. Yeah. Where is this going to go?

Starting point is 00:17:48 Because it strikes me that as we talked before about needing to have more data available, backing up more data, that backup hasn't historically innovated at the same pace as the rest of the data center. Because it worked, it just had to be there, had to be available. Now we're talking performance in ways that have never been discussed before for backup and recovery. I mean, you're going to be beholden to all sorts of new hardware realities, right? And maybe that's the opportunity.

Starting point is 00:18:16 And with all the engineering you have, it's probably no big deal. But fabric's going to have to get faster. The drives are going to get bigger, probably more than faster. QLC has tremendous potential here too, right? It does. So faster, more resilient, and more optimized. So I think right now what we're looking at is higher density networking and bringing this closer to the cluster.

Starting point is 00:18:45 Okay. Those are probably our two biggest initiatives right now, and then, you know, obviously resiliency. HA is important to a lot of customers. We were really adamant about making sure that we had immutability right out of the gate. So really making sure that we've laid down a solid foundation has been table stakes,

Starting point is 00:19:06 and then we can grow on that. And the area that we are looking at is making, or really just making, this backup data more valuable for our customers. So you know, you've got all this data just sitting out there collecting dust, waiting to be used. Usually it never is. Then it gets deleted. It gets backed up over again.

Starting point is 00:19:30 That's a lot of data that's just sitting there doing nothing. So how do we get that data and make that available to customers? And when you have a platform like the X10,000, now all of a sudden you've got a multi-purpose platform that can do more than one thing at a time. Because this solution, while it is a backup or data protection solution, the X-10,000 is not a data protection product. Right.

Starting point is 00:19:59 Right? So it's also doing AI pipelines. It's doing data lake. It's doing data fabric. It's doing all of these really scaled workload intensive. Yeah. Big enterprise jobs, yeah. And so it can do more than one thing at a time.

Starting point is 00:20:13 We've really had a hard time throwing enough at the X-10,000. To overwhelm that. We cannot overwhelm it. I'm sure we could. But it takes an awful lot. I mean, you saw what we had in the server room, and we're barely touching. Yeah, the performance. The X-10,000.

Starting point is 00:20:30 Absolutely. So in there, let's talk about that, because you've stood up within your lab here about three and a half, four racks of gear, of compute to back up, of a couple different X10K clusters, of I think you have four or five of the accelerator nodes in there. You've made a commitment with hardware in your lab to make sure that you're testing this and see what it does for real. Yeah.

Starting point is 00:21:00 And like I said earlier, as the product manager, I want to be able to talk to my customers confidently, having pushed the system to its limits, so that I can tell them what their operating range is. I can help them understand how to maximize the configuration, the optimization, the capability of the product, so they're able to see in their own environments what we know the system will do. Yeah, well. And there's only one way to do that. Is to do it.

Starting point is 00:21:32 Is to actually do it. Right. So we've been working with your team on that. There's 30 compute hosts in a VMware cluster that we're backing up and we've got 40, 50 VMs on each one, right? That's right. 200 gig VMs, and right now the guys are in the next room over, actually doing the full backup on that. It looks like it should complete in 40, 45 minutes. It's pretty quick to back up that scale of infrastructure. It is.
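For a sense of scale, the lab numbers mentioned here can be turned into an implied throughput. This is back-of-the-envelope arithmetic only: it assumes "200 gig" means 200 GB of data per VM, uses decimal units (1 TB = 1,000 GB), and ignores dedupe, compression, and thin provisioning. The function name is hypothetical.

```python
def full_backup_rate_tb_per_hour(hosts, vms_per_host, vm_size_gb, minutes):
    """Implied logical throughput for a full backup of the whole cluster."""
    if minutes <= 0:
        raise ValueError("duration must be positive")
    total_tb = hosts * vms_per_host * vm_size_gb / 1000  # decimal TB
    return total_tb * 60 / minutes


if __name__ == "__main__":
    # Low end: 40 VMs per host, finishing in 45 minutes.
    low = full_backup_rate_tb_per_hour(30, 40, 200, 45)
    # High end: 50 VMs per host, finishing in 40 minutes.
    high = full_backup_rate_tb_per_hour(30, 50, 200, 40)
    print(f"implied rate: {low} to {high} TB/hour")
```

Under those assumptions the cluster holds roughly 240 to 300 TB, and finishing in 40 to 45 minutes implies somewhere between 320 and 450 TB per hour of logical data across the accelerators in the lab.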

Starting point is 00:22:02 And I think, you know, when you, so when you have 200 gigabit networking and you've got all-flash drives from start to end, you're going to get a fast result. Right. And we put that together because we wanted to have enough to push the system. We understand most customers don't have that in their environment. Right. And that's okay. We really just wanted to kind of demonstrate things. You bring up an important point, though, about where customers are.

Starting point is 00:22:29 It kind of depends on where you catch them or who you're talking to within their infrastructure, because all the AI guys want 200, 400, 800 gig networking for all their high-end stuff. Yeah. But backup is so often relegated to whatever's left. Yeah. So like you're talking about putting high-speed networking in that part of the infrastructure.

Starting point is 00:22:52 Customers aren't always ready for it, but they see that you guys are getting out there ahead of it, and that's critical too. Yeah, we see the opportunity. It's going to catch up. I think that, you know, like we talked about earlier, data is scaling at a tremendous rate, and the networking is going to have

Starting point is 00:23:09 to keep up. And you know, you could put the 200 gig networking in and the system has to be able to take it or you're going to flood it. So we know that the DPA can take it. We know that the X10 can take it. I think our challenge here in this test was actually being able to push it. It's a lot to push. Yeah. Yeah. And then it's like what's the reality after that? We've got change rate. We've got a node that can fail and how do we get that back? Because as much as you're excited about, and rightfully so, the rates of the stuff going in, because that's what most people do. The rates going out, really are fundamental, too, to when you need this thing back. Well, I mean, it's a rapid restore or a fast restore solution, right?

Starting point is 00:23:52 I mean, that's what we're talking to customers about, which is how do I make sure I am resilient? If and when it happens, how do I get it back? And we're talking to multiple customers right now that are telling us, hey, in our current infrastructure, we've got, you know, this database or we've got these workloads that we need to be able to get back in, you know, four hours and it's currently taking us, you know, a day and a half. Which is not good. I mean, it's not good for anyone. It's not good, but it's the common reality.

Starting point is 00:24:23 And it's the unspoken reality, right? I mean, I think, look, I'm not going to say it's the hard, fast rule. However, I think there's probably more scenarios out there where customers need their data back faster than they can get it back. Oh, there's no doubt that whatever their SLA is, that's not best case scenario. Everyone wants it back now, like immediately. And we've seen how many outages over the years

Starting point is 00:24:52 in downtime that just destroys business brands because they can't get it back. They can't get back. And that's real money. You see all of the Gartner and the Forrester and all the big analyst houses. They do these things. you can go Google them and they say that, you know, when average businesses lose, I don't know, X number of dollars per hour or minute or however they do the math.

Starting point is 00:25:17 But, you know, all you have to do is, you know, go look up the revenue for the customer that you're talking to and just do a little bit of quick math. And you can have a very real conversation with a customer and help them realize that, guess what? They're in that boat too. And when that system goes down, they're losing real dollars. All right, so we've got all these innovations. The DPA is fundamentally changing the game for the way, I think how customers are going to think about what's possible with backup. And I guess that's really in recovery.

Starting point is 00:25:50 And I guess that's really the exciting thing about this. You've got people testing it. You've got, you're getting exciting feedback. What have we missed? Is there something else? Or what's the key takeaway that, when you're talking to customers, you're leaving them with, or what's resonating the most with them? You know, I think one of the things that gets, I think one of the things that is underappreciated

Starting point is 00:26:21 or doesn't get talked about as much is the parallel scaling. Okay. So, or the linear scaling, I should say, because with the DPA, you know, you can stack one, two, three, four, and split those streams. And I think every time I have a conversation with customers or somebody in the industry like you, it's kind of like, all right, cool, you know, it can do that. And I think if you really sit back and think about what that means, that I can just add another unit to my rack. And I can expand by another two petabytes of data protection. And I can take my ingest from 300 terabytes an hour to 600 terabytes an hour.

Starting point is 00:27:08 It makes it a lot easier to forecast your needs. It makes it easier to forecast your potential spend. It makes it easier to future-proof and know where you're going and what it's going to take to get there. I think it makes it easier for the organization to say yes, doesn't it? I hope so. Yeah, I mean that's fundamentally, can we do this? Yes. Yeah.

Starting point is 00:27:34 And to be able to say that with confidence from that organization, which is normally under attack, like figuratively and literally at all times, to be able to say yes more often, that's got a, that's probably pretty sweet. Yeah, feels good. All right, so we'll link to the page, you've got a blog, you've got other stuff, we've got content we've created around the accelerator node. We'll link to all of that in the description.

Starting point is 00:28:01 So check that out for more information on the Data Protection Accelerator node. Chad, thanks for doing this. Great talking to you.
