Grey Beards on Systems - 54: GreyBeards talk scale-out secondary storage with Jonathan Howard, Dir. Tech. Alliances at Commvault

Episode Date: November 30, 2017

This month we talk scale-out secondary storage with Jonathan Howard, Director of Technical Alliances at Commvault. Both Howard and I attended Commvault GO 2017 for Tech Field Day this past month in Washington DC. We had an interesting overview of their Hyperscale secondary storage solution, and Jonathan was the one answering most of our questions, so …

Transcript
Starting point is 00:00:00 Hey everybody, Ray Lucchesi here with Howard Marks here. Welcome to the next episode of the Greybeards on Storage monthly podcast show, where we get greybeards storage system bloggers to talk with storage system vendors to discuss upcoming products, technologies and trends affecting the data center today. This is our 54th episode of Greybeards on Storage, which was recorded on November 21st, 2017. We have with us here today Jonathan Howard, Director of Technical Alliances at Commvault. Howard and I both met Jonathan last week at Commvault GO 2017.
Starting point is 00:00:46 Anyways, so Jonathan, why don't you tell us a little bit about yourself and your company? Thanks, Ray. So as you stated, it's Jonathan Howard. I've been at Commvault for eight years. During that time, I've had a multitude of positions, anything from being in the field into product management. And now I'm working with our alliances team, working on technology and engineering alliances. For those who don't know, Commvault is a provider of data management
Starting point is 00:01:10 solutions. Historically, we're known for data protection, but we've evolved past that. So we have solutions that include data protection, recovery, cloud, virtualization, archiving, anything up to and including file sync and share. So realistically, anything you need to do in and out of the cloud from a data protection or management perspective, sooner or later we can pretty much help you soup to nuts. Oh, God, that's good. It seemed like in the last year or so, you came out with a new solution, more of a software-defined solution. Yes.
Starting point is 00:01:44 Yeah. You'd be referring to our hyperscale technology. So yes, yes. Yeah. So I mean, historically, Commvault has been rather, I wouldn't say agnostic, but really our approach was to more or less support the widest range of software or hardware components that we could fit behind us. But the marketplace has been evolving with new technologies and new software-defined strategies, so we have our views on what we should be providing to our customers and, frankly, on what our customers have been asking us for. So hyperscale software is our answer to the software-defined angle
Starting point is 00:02:23 of what our customers are asking for today. I'm going to push back a little on using the term software-defined here, because you always were software, and now you've added hardware. And using software-defined as the term for that just offends me to no end. Howard, I agree with that, actually. If you may, or sorry, if I may, I could basically say hyperscale actually comes in two flavors. There's hyperscale software, and there's hyperscale as an appliance that includes the hardware. You do not have to buy the hardware from us.
Starting point is 00:03:00 It comes in different variants. So we have expanded from a software-only offering, but to stay close to our roots, to say, hey, Mr. Customer, if you want to use vendor A or vendor B or vendor C for the server infrastructure, we didn't want to become solely an appliance model that takes, what are you calling it now? The Commvault suite? Commvault data platform. Commvault data platform. I mean, I wasn't very fond of Simpana as a name, but at least it was a name. I kind of liked Galaxy, but that was a long time ago. If you really want to go back in the day, we could always say Kinetics. Oh, God. Yeah. No, and one of the things on Simpana was it was actually something that our marketing department had studied. It came down to basically almost like a car manufacturer, if you will, where a lot of people understood that if you're driving a specific car and the car has a name, like the model, say,
Starting point is 00:04:08 a Volkswagen Jetta, people say they drive a Jetta. But when they were basically saying, I drive a Beamer, I drive a BMW, there are designations of the model. So a lot of people knew Simpana, but they didn't know the company that was behind it. So there was a choice made to say, since we really have a platform that is a single code base that, depending on the license you apply, takes on different personalities, why don't we just simplify and standardize and say it's Commvault software and it's the Commvault data platform. And this is why I'm not in marketing. Yeah, apparently. We're all kind of in marketing, Howard.
Starting point is 00:04:51 Although I have to say, just like back in the day, more people called the spreadsheet from Lotus, Lotus than 123. Yeah. The vernacular I heard when the product was called Simpana was, we use Commvault for backup. So, yeah. I'll push back a little on whether your marketing guys were actually right. That's a whole other story. Give it a couple years.
Starting point is 00:05:18 We'll see. But the big change here is that where you used to be the brains and the data mover, you're now actually holding the secondary copy of the data as well, right? Well, how would you look at it? We always kind of, I guess we'd say we always managed the secondary copy no matter what storage infrastructure you were using. Right. But I had to have storage infrastructure. I mean, I have done it as cheaply as, oh, look, a couple of Promise arrays, but you still needed something there.
Starting point is 00:05:56 Right. Correct. And now that function's built in, right? And that was one of the things that we, from the perspective of the answer to it is, yes, you could stick it on a promise array. You know, we have customers using, you know, the multitude of NAS or iSCSI or Fiber Channel or DAS or cloud copies or pretty much any storage protocol under existence can sit behind Commvault. The difference was, you know, depending on the configuration, you know, there could be multiple people involved. And then in the enterprise, there was all the ITIL and the change control management and the, you know, all the expertise across the various domains. So we've been seeing a trend in the data center in general.
Starting point is 00:06:38 I would agree. Towards buying more integrated systems. And if we go back all the way to the beginning, you bought a mainframe from IBM or a minicomputer from DEC or Data General, and everything else came with that logo on it. Yeah, yeah. And then as we... Those were good old days, Howard.
Starting point is 00:07:00 Well, in many ways, life was simpler back then. Oh, yeah. Absolutely. Yeah. You never had to worry about whether a DECwriter would work with a VAX. It was like, well, yeah, of course it does. Yeah, yeah. But as we went into x86 and frankly took a lot of margin out of the IBMs and DECs of the world by going to x86.
Starting point is 00:07:22 Yeah. Witnessed the demise of DEC or, you know, whatever. We started building things ourselves, and so it was, you know, what parts kit do you have? And it seems that trend's going back. If you look at, you know, converged infrastructure, it's like a Vblock is a parts list, and you could order that parts list, and it's actually like two percent cheaper if you order the parts list. But people order Vblocks because of the labor involved in figuring out what you needed and plugging it all together. And we've all had the, yes, I have every part I need to start this project,
Starting point is 00:07:55 but the one fiber optic cable that's backordered for three weeks. Yeah, that and the software, that kind of glues it all together to some extent. So, I mean, the move towards a packaged solution as opposed to a parts kit makes a lot of sense. Separate parts, yeah. So what's the back-end storage actually made up of, Jonathan? We based our entire solution from this perspective on Red Hat with Gluster.
Starting point is 00:08:20 There's a number of configurations that Red Hat can perform, and we looked at should we, for lack of a better term, try creating one from the ground up? Or should we partner with somebody to basically have an enterprise offering out of the gate? We've been working with Red Hat for a number of years. So it made sense when we started working with Red Hat to basically go down the path of investigation with Gluster. And so far, it's turning out wonderful. And so it's effectively a Gluster file system behind the hyperscaler? Yes, there's more to it. It's the basis, you know, the basis, the foundational component is a Gluster component. But of course, there's some components that we've added in to
Starting point is 00:08:57 make sure that we work better together in that respect, if you will. Yeah, well, certainly, I wouldn't recommend anybody create their own clustered file system. That's a major piece of work. Well, there's a significant amount of effort to that. Then there's all the updates. And then, of course, on top of that, there's security components and other things that come with the enterprise. And we've seen this from, I guess you could say, some of the other hyper-converged vendors from a virtualization standpoint, where some of them underneath the covers are, you know, an expansion and characteristics on top of like a KVM instance, right? It doesn't make their worth or value any less in the market because of the overall
Starting point is 00:09:36 solution that they provide. But I don't, I would argue that, you know, they're not necessarily judged that they're using a KVM foundation. Our approach was rather similar with Gluster to say, it's an enterprise-ready scalable architecture. Why don't we couple with that to design a scale-out architecture that can scale Commvault with file system? Okay. And so are you locked into their release levels and stuff like that as they put out fixes and new functionality? Then you would do the same.
Starting point is 00:10:06 And that's something that was very critical to the overall component is patch management and all these other components come part and parcel with that solution. Commvault has a streamlined methodology to stream patches down from our central repositories around the world. So basically embed those components in that same area so that, you know, as Red Hat progresses, we're going to be progressing with that. The other thing that I also like about the Gluster component is the fact that it runs in the user space more than the kernel space. So it allows the patching and the updating to be a little bit easier than some of the kernel-based file systems. So it actually makes some of the updating and patching capable in user space
Starting point is 00:10:52 without having to say, I need a new kernel every single time. If there's no problem with the kernel, it doesn't necessarily have to be moved forward. Yeah, I could see that. Although I look at a secondary storage system going, yeah, and you need to reboot it as a much smaller problem than a primary storage system.
Starting point is 00:11:09 Wholeheartedly agreed. Wholeheartedly agreed. It's a different set of characteristics, just like it's a different set of performance characteristics and how we approach the entire build. Okay. Does Gluster support, I'll call it cloud tiering kinds of things? So if you need copies primarily stored for quick recovery, we always advocate that if you're going to recover something or access something and there's a high likelihood of it, the data should be local for data locality. But once the data gets older, we can automatically tier that out to like an AWS, Azure, I think we're up to 47 different sort of object storage or S3 architectures that we support.
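To make the tiering idea concrete, here is a minimal sketch of an age-based tier-out job: keep recent data on local disk for fast recovery, push anything older than a cutoff to an S3 bucket. This is not Commvault's actual implementation; the local path, bucket name and 30-day cutoff are all assumptions for illustration.

```python
# Hypothetical sketch only -- not Commvault code. Assumes a local disk
# library of backup chunks and an S3 bucket; anything older than the
# cutoff is copied to object storage and released from local disk.
import os
import time
import boto3

LOCAL_POOL = "/backups/chunks"        # assumed local disk-library path
BUCKET = "example-secondary-tier"     # assumed S3 bucket name
CUTOFF_SECONDS = 30 * 24 * 3600       # assume "older than 30 days" means cold

s3 = boto3.client("s3")

def tier_out_cold_chunks():
    now = time.time()
    for name in sorted(os.listdir(LOCAL_POOL)):
        path = os.path.join(LOCAL_POOL, name)
        if now - os.path.getmtime(path) < CUTOFF_SECONDS:
            continue                          # still hot: keep it local for quick recovery
        s3.upload_file(path, BUCKET, name)    # copy the cold chunk to the object tier
        os.remove(path)                       # then reclaim the local capacity

if __name__ == "__main__":
    tier_out_cold_chunks()
```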
Starting point is 00:12:04 47. And from an alliances perspective. Tomorrow there'll be 48. Exactly, Howard. I was going to say, from an alliances perspective, that list is growing. Yeah, I'm sure. This is our 54th episode, so let us know when you pass us. We try to shoot for one a month. Right, you know. All right, so the Gluster file system,
Starting point is 00:12:30 if I may, is just yet another tier in your overall worldview? Is that how it works? There are other things that Gluster allows us to do. So basically, you know, historically, from a local resiliency and availability perspective, Commvault could already have some availability functions, but it depended obviously on how we were configured. It's sort of back to that convergence of solution, of, you know, customers wanting to take the MacGyver out of it, of, I want to take absolutely everything and, you know, with some baling wire and duct tape, put it together myself, and say, look, how do I get local resiliency, redundancy, a scale-out architecture and all these components,
Starting point is 00:13:16 you know, stitched together that I can simply just add nodes to. And that's where, you know, the functional components that are built into some of the areas in Gluster made it a really good choice from basically saying it can do dynamic storage tiering. It can use standardized server architectures that allowed us to make sure that from a reference design program, we can make sure we support, you know, a large variety of vendors. It has its own unified namespace, which we can adapt and use for our purposes. And of course, it's got that local resiliency and data availability component, which again, it can go higher functionality than we needed. We were gearing
Starting point is 00:13:57 it more to, as Howard pointed out earlier, the secondary storage workload, not a primary storage workload. So there are certain components where we looked at how to configure it. You can configure it, obviously, with a multitude of different options, some of which we're not using because, quite frankly, we want to make sure
Starting point is 00:14:18 that it's cost-competitive in the industry and that we have all of our other benefits and software components that allow us to change speeds and feeds, if you will, like deduplication, compression, and those components. I was going to ask about deduplication. So that would be something you would turn on at the Commvault level and that would be applicable throughout all the tiers, I guess. Well, you can selectively turn it off on a tier if you wanted to. Most customers opt not to, but we have had some use cases, for instance,
Starting point is 00:14:53 and it's typically more around long-term cloud copies, if you will. Yeah, I was thinking cloud might be a good place for that, yeah. If you send your data deduplicated to the cloud, then your S3 bill is smaller, and that's a good thing. But it also means that if you want to consume any of that data in EC2, you need to spin up an EC2 instance of whatever engine is going to rehydrate it. Right, right, right. We'd dedupe the data out to, let's just stay with Amazon AWS, put it in S3. And then if we need to recover it, we can spin up an instance in EC2 just long enough to access the data. And then if it's like VMware, we could auto-convert the workload as a recovery to AWS and then spin ourselves down. Okay. Can we talk a little bit about recovery?
Starting point is 00:16:03 If you'd like, sure. Because I don't know anybody who has backup as their actual goal. You mean to tell me all that stuff I've been doing all these years is not my goal? It's really recovering the data? Ray, you were probably backing up to DevNull, right? Well,
Starting point is 00:16:19 I started... You don't want to hear my backup stories. There was that one 1986 version of ARCserve. Yeah, yeah, yeah. Yeah. So one of the major changes we've seen over the backup process in the past decade or so is in the recovery process, where, especially for VMs, many vendors have gotten to the point where we can recover directly from their appliance and run the VM without waiting for it to restore. As somebody who spent a lot of his career in the SMB space, where the SQL Server for the accounting
Starting point is 00:17:00 system has every transaction for the last 15 years because we don't have a DBA. Never seen that before. But, you know, one of the problems with that is when you go to restore that huge database, it takes a long time. Correct. Before it's operational. What's Commvault's current condition, you know, current state for providing that kind of functionality? Does hyperscale enhance that? Because now you know how the back end is working. Because frankly, some instant recoveries from deduplicating purpose-built backup appliances can be painful. Agreed. Well, and so, you know, if you will forgive me, I will use the two words that every product manager uses,
Starting point is 00:17:50 but then I will back them up: the answer is, it depends. And the reason why I say that... I was a consultant for 30 years. It depends are my favorite words. So from our perspective, there is no one-size-fits-all from a recovery perspective, right? It really comes down to, what is the SLA you're trying to achieve? And this is where we have multiple sorts of ways, if you will, that we can enhance the recovery metric.
Starting point is 00:18:18 So if I start near the top, we actually can use LiveSync, which will synchronize, you know, VMs or applications to basically a warm standby, right? And you can actually, if you will, think of it like a fan out where in a way where it's like, I can basically have it synchronizing and I can have a protection copy. That, you know, it's like, again, most customers are gonna look at that saying, hey, you know, if I go to an enterprise customer who's got, you know, 20,000 VMs, he's not going to synchronize 20,000 VMs to another location, right? It's just... So it's like a replication? Yeah, there is a, LiveSync is a replication component, but from our perspective, we can also cross hypervisors during the sync. So it allows for different conversations. Now the
Starting point is 00:19:00 customers are saying, you know what, maybe I want to layer my recovery. And I want to say that my, you know, in my primary site, I'm using VMware, but I want to have, you know, AWS as my disaster recovery site. And I want to have these VMs waiting that are automatically being able to, you know, turn on and go whenever I need to. But other VMs, maybe I don't want that, right? I mean, I don't want to do it across my entire enterprise. You know, going back to Howard's point on SMB, you know, it does offer some options down in SMB, but obviously there's some costs associated with it of having that infrastructure on standby. Yeah, yeah. Right. So then it's to sit there and say, okay, that's one level. Then depending on the, you know, then depending on the storage array. But before you move on to the next level. Okay. Does that give me a CDP journal too?
Starting point is 00:19:49 The newer ones do. The other is more of a transactional component before, if you're talking like an incremental forever type thing, or you're talking continuous. I think he's talking continuous data protection in this case. Okay. Sorry. Then I misspoke. It's not quite continuous right now.
Starting point is 00:20:04 It's still transactional. It's just, it, think of it like a really short incremental process. Yeah, I was thinking more about the ability to go back in time. Oh, yes, sorry, that is there. My apologies. We do have multiple points that are going to be positioned. It just depends on, like, if you're talking a continuous stream or a transactional component. Yeah. It's there's two issues. There's, you know, what's my RTO and what's my RPO and can I have multiple places to go back
Starting point is 00:20:36 to? So, yeah. So now it's near-CDP. Right. Yeah. Okay.
Starting point is 00:20:42 So, and that's a great point. So then, you know, if we're, you know, basically on premises, I have to mention, of course, IntelliSnap, from the perspective of using, you know, hardware array snapshots. Right. You know, so it's like a lot of people are still... Oh, let us talk about hardware array snapshots. Okay. Did I open Pandora's box? No, not necessarily.
Starting point is 00:21:08 No, no, no. You opened the goodie box. Okay. Okay. No, no, no. Pandora, we've locked away for the day. But one of the changes I'm seeing is less traditional data mover-based backup. And more snapshot-based?
Starting point is 00:21:29 And more image and snapshot-based. Partially because of the speed of recovery. Agreed. I would wholeheartedly agree with that statement. Isn't there a risk that possibly the storage device that you've taken this wonderful snapshot to may not be available when you want to recover? Well, to us, it comes with layers, right? So, I mean, a storage snapshot is the first line of defense, if you will, because, and obviously, this depends on, you know, the hardware manufacturer's array itself,
Starting point is 00:22:04 on, you know, how good their engine is and what you can do with it. You know, are the snapshots immutable or non-immutable? You know, I love some of the newer arrays where you can actually revert forward in time. So I can, you know, revert back to a snapshot. Say I was taking a snapshot every hour on the hour and I have, say, 24 of them. I can go back 17 hours and then go, oh no, that's not the one I'm looking for. And I can revert forward to 16 and 15 and 14. So, you know, there are some very interesting hardware array benefits depending on the array you're using.
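A toy model of that revert-back/revert-forward behavior, purely for illustration; this is not any array vendor's snapshot API, just a cursor walking an hourly snapshot timeline.

```python
# Illustrative only: step backward through hourly snapshots, then forward
# again, the way the newer arrays described above allow.

class SnapshotTimeline:
    def __init__(self, points):
        self.points = points               # oldest..newest, e.g. 24 hourly snapshots
        self.cursor = len(points) - 1      # start at the newest point

    def revert_back(self, steps):
        self.cursor = max(0, self.cursor - steps)
        return self.points[self.cursor]

    def revert_forward(self, steps=1):
        self.cursor = min(len(self.points) - 1, self.cursor + steps)
        return self.points[self.cursor]

hourly = SnapshotTimeline([f"{h:02d}:00" for h in range(24)])
print(hourly.revert_back(17))    # jump back 17 hours; not the data we wanted
print(hourly.revert_forward())   # step forward to 16 hours ago
print(hourly.revert_forward())   # ...then 15 hours ago, until we find the right copy
```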
Starting point is 00:22:39 But we look at that as the first line of defense. And some of our, we call it our live functionality, immediately starts working there. So you can boot up, like, a VM; you know, we'll automatically mount the snapshot and boot it up. And, you know, you can run it from that snapshot and migrate it back while it's running. So, you know, if it's a flash array, right, comparing that to our backend infrastructure, or even like an S3 bucket, obviously the flash array will be lightning fast, and it almost gives you, like, production-level quality, if you will, of running that recovery or access operation right away. But to your point,
Starting point is 00:23:18 if the array is lost or the location is lost or the rack is lost or whatever the scenario, there's still a chance that the array itself is no longer there for recovery. So we never advocate running on snapshots by themselves, because the most telling human error story that comes to mind is, we had a customer that had 37, I believe it was 37, mining operations, each one of which had an array that was replicating into the central site. And they had advocated that they were just going to use our snapshot management,
Starting point is 00:23:53 replication management, because they had replicated snapshots for the entire solution. No additional copies were required. Now, luckily they had additional copies because the Commvault administrator was like, I'm just going to take some additional copies. Just have them. And of course, the- Never let them throw that last tape library away. There's always some-
Starting point is 00:24:17 Exactly. So what was happening is the storage administrator, he was doing maintenance, and they were shutting down one of their mining operations. So he was on the storage command line and he removed a relationship. And it, of course, deleted the primary copy and the replica copy. Unfortunately, it was the wrong mine. He removed an active one. So that's why we advocate where it's like, you know, using snapshots. Great.
Starting point is 00:24:44 But having replication is better, but we still advocate swapping, you know, to a different type of technology. Yeah, I would say WannaCry is the other scenario there that's a killer. There you go. Yeah. Yeah. Well, yeah, but, you know, if I'm taking snapshots and then, you know, uploading those snapshots to S3. That's good. That'll work. Then I'm in a much better position.
Starting point is 00:25:11 One of the things I've liked about you guys for a while is the support for snapshots. Back in the days when I taught backup school for TechTarget, there was always the, well, we just take snapshots and we're completely covered. And then you'd get an email from that guy about three months later that says, yes, somebody in accounting lost the spreadsheet they actually used to close the quarter. I thought they did it in SAP. And it took me a month of loading every snapshot and looking for that. So the fact that you guys will catalog the snapshot has always been big on my list.
Starting point is 00:25:47 Yeah. No, and to your point, we wanted to have a standardized approach so that, you know, we could completely utilize that front-end technology but give it to people as, you know, sort of another tier of recovery, right? So whether you want to recover from a snapshot, whether you want to recover from, you know, a deduplicated disk copy, whether you want to recover from a deduplicated cloud copy or tape or, you know, whatever you wanted to use, we simply have that in sort of like an N-plus-one storage policy configuration, right? How many tiers do you need? What are the retentions that you need? We'll simply set this up. You know, so that's where we started with the IntelliSnap functionality.
Starting point is 00:26:35 And then the other thing that IntelliSnap gives us, the majority of the time, is the ability to, you know, take the client as much as we can out of the operation, right? So when we want an application-consistent snapshot, we'll quiesce the database, the VM or the application, and then we'll take that snapshot, and then we can mount that externally and do further data protection operations. So that kind of takes the client CPU, memory and its network out of play. Of course, it doesn't take the storage array out of play, but that's where we then layer in incremental forever technologies, where we only want to read the blocks that have either changed or that are new
Starting point is 00:27:14 to minimize any impact on the storage array. And that's where we get back to hyperscale is, we can mount those snapshots to hyperscale and basically do data protection there for secondary copies that are offloading from the array. Okay, talk to me about that process. Yeah, so that would work by taking a snapshot on a storage device and then copying it over to the hyperscale as a snapshot?
Starting point is 00:27:40 No, so okay. If I use, let's take VMware as a construct. Take my snapshot. I've done application quiescing. I've quiesced the VMs. I've made sure all the storage is aligned. And then I take my snapshot. And then basically we've unquiesced those VMs.
Starting point is 00:27:59 So from a data protection standpoint, we really minimize or mitigate that redo log situation for high-I/O VMs. And as far as we're concerned, they're running. Now that snapshot can be mounted, and then we're using, you know, change block tracking and VADP tools to keep it in the construct of, this is still a VMware virtual machine, you know, VMDK, all the components. But we want to make sure that all of the great things of, you know, regular VADP streaming, where it's like, I can process change block tracking for incremental forever, I could recover the VM or the file or the VMDKs, you know, depending on, one operation for protection, multiple operations for recovery or access, are all intact.
Starting point is 00:28:46 So we basically... So the change block tracking that's associated with the original data is also provided at the snapshot level? It's intact for us, yes. Ah, that's great. Because otherwise, you're back to kind of that whole full-on-full scenario, which is really, really good for your deduplication ratios, but not really good for your infrastructure. Yeah, I understand.
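Since the last few exchanges lean on change block tracking, here is a minimal capture-side sketch of the idea: read only the blocks a change tracker reports as modified since the previous backup. The tracker here is just a set of block numbers; this is not the VMware VADP/CBT API, and the disk path in the usage comment is hypothetical.

```python
# Capture-side sketch of incremental forever: ship only changed blocks.
# Illustrative only; not a VMware or Commvault API.

def incremental_backup(disk_path, changed_blocks, block_size=4096):
    """Return {block_number: bytes} for just the regions that changed."""
    captured = {}
    with open(disk_path, "rb") as disk:
        for block_no in sorted(changed_blocks):
            disk.seek(block_no * block_size)
            captured[block_no] = disk.read(block_size)
    return captured

# Example (hypothetical device path): only blocks 7, 42 and 1000 changed
# since the last run, so only those three blocks get read and stored.
# changes = incremental_backup("/dev/vm-disk-clone", {7, 42, 1000})
```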
Starting point is 00:29:10 So do you have any idea or handle on how many customers have moved to the incremental forever model? Some of us left Commvault GO early to go off to Storage Field Day, where some other vendor, who shall remain nameless, well, just because I feel like it, kept telling us about how their technology optimized the weekly full, daily incremental process. My response was, I'm uninterested in optimizing stupidity,
Starting point is 00:29:40 but I am interested in how much we still back things up based on decisions we made when how many tapes you had to recover from mattered. Yes. So we've been advocating it for a while now. We're in version 11, and we've been advocating that since, at least from what I can recall, late version 9. So if you stay in the construct of VMware and change block tracking, we've been advocating that for a number of years. We have a number of customers, and I would say the adoption is far and wide. I don't have an actual statistical number, but it's been a Commvault best practice for a number of years to do that because of, to your point, the benefits of, you know, the incremental forever approach of, I'm just going to read the change blocks and basically allow for full recoveries. Now the interesting component, something that we don't typically talk about, is the fact that even on an incremental forever, a lot of people look at us and say, yes, but you still have to run a
Starting point is 00:30:55 synthetic full to get a full recovery. And it's like, well, the synthetic full operation is more of, you know, sort of an indexing component, not a recovery component. The interesting thing is, let's say I ran a full job on Monday for the baseline, and then I run six incrementals, and I haven't run any synthetic full or any secondary operations, but now I want to recover that VM. I don't have to recover the full VM and then recover six incrementals. I click off a recovery, and it says, oh, you want to recover from, you know, the last incremental; it will automatically stitch together that full VM for you. Right. A hundred percent agreed. It's just sometimes it's a backward merge.
Starting point is 00:31:39 Yeah, it happens automatically, to basically say, I understand all the blocks that have changed in between all these days, so I will automatically go back and stitch together that full VM for you on recovery. So do I understand this right, that you're saying you have customers who are complaining that the current process requires the computer to work too hard, and they would rather have the process that requires them to work harder? I don't know if they're necessarily complaining. I would say sometimes they believe that without running a synthetic full operation,
Starting point is 00:32:16 you would have to do it that way. I see. The synthetic full operation is more, like I said, for our internal metadata and indices than anything else. That's a documentation problem, as in what is the documentation? Yes. Since it's a family show, I'll make sure I'll say RTM.
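To make the no-synthetic-full point concrete, here is a minimal sketch of stitching a point-in-time full image out of a baseline plus a chain of incrementals at recovery time. The dicts are stand-ins for block maps, not Commvault's on-disk format.

```python
# Minimal sketch: recover the latest full image from a baseline plus
# incrementals without ever building a synthetic full on disk.

def stitch_full(baseline, incrementals):
    """Newest version of every block wins."""
    image = dict(baseline)          # Monday's full baseline
    for incr in incrementals:       # the following days' changed blocks, in order
        image.update(incr)          # later writes overwrite earlier ones
    return image

baseline = {0: "A0", 1: "B0", 2: "C0"}             # full backup
incrementals = [{1: "B1"}, {2: "C1"}, {1: "B2"}]   # only changed blocks were captured
print(stitch_full(baseline, incrementals))         # {0: 'A0', 1: 'B2', 2: 'C1'}
```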
Starting point is 00:32:34 Well, F stands for fine, doesn't it? Yeah, yeah, yeah. Right. So you're going down through the sequence of things that you can recover from. LiveSync, on-premises IntelliSnap. And what's next on the list there? So now we get into, okay, let's look at, we've got our hyperscale infrastructure.
Starting point is 00:32:53 Now we're mounting that snapshot or we're doing a streaming VADP depending on your source, right? Depending on what's supported and what the customer has configured. And then from that perspective, I'm streaming the data. I've got incremental forever protection, but then I want to talk about recovery. Well, now I have live mounts from our backend, which will automatically boot that VM up from the backend and let you run it while it's, you know, while it's moving back into production. Or if I just maybe want to, to your earlier point on the, hey, I'm looking for that Excel spreadsheet. Maybe I just want to boot up the VM because I didn't realize that that piece of data was there so that guy can copy a piece of data
Starting point is 00:33:33 out. I don't know why I wouldn't just recover the individual file, but that was the quickest use case I could come up with. We see that actually for some people when they're talking about potentially databases or something like that, is I want to compare a database from 15 months, 15 days ago to today. I don't want to recover the database. I want to boot it up. So you can, you know, basically we see that as like live mount and then there's like the live mount and automatic recovery. So that's where hyperscale comes into play from, you know, say, I can automatically do that. And if our default is I think we say you should do probably about max of five per node.
Starting point is 00:34:28 But depending on what you're actually trying to achieve, you can actually do more. Okay. Is there Flash in the hyperscale reference design slash appliance depending on who I'm buying it from? Okay. Metadata Flash. It's Metadata Flash, but also there's a secondary use case,
Starting point is 00:34:42 which I would hazard a guess where Howard's going for is when I'm doing these live mount operations, am I stuck on using like the nearline SAS or SATA layer, or will it be able to be accelerated by some flash components? And the answer is yes, there are some flash components that will accelerate that. So basically, it's like the read ahead methodology is if I know I'm going to be reading that entire VM, I will start loading those components into our metadata layers so that I can accelerate some of those components. Okay, so a roadmap is assembling in front of me, which leads you down to the full copy data protection model and not just doing this for recovery, but for test and dev, where you know, snap the main database, let the developers. We actually, we, we kind of have that already. We have a clone DB function that basically, so I can take a snapshot, you know,
Starting point is 00:35:34 off of a storage array at the beginning. And then I can say, either clone from, you know, that snapshot, which, again, we always advocate, just, you know, beware of the workload. Where Flash makes it a lot easier to say, hey, if I want to run my production database and maybe two, three, or four additional sort of clone databases off the snapshots, realistically, if you plan for it, you'll have the headroom.
Starting point is 00:36:07 and just let the developers have it and still have plenty of IOPs. Exactly. But somebody is going to go, what are you doing? You could possibly slow down production. Go spend several million dollars and buy another one.
Starting point is 00:36:23 There you go. Yeah, I slow down production, go spend several million dollars and buy another one. There you go. Our kind is the kind, actually. Yeah, I rarely object when people say go spend several million dollars. Yeah, me neither. Especially when I'm involved. That's a different story. So this live mount, you mentioned that it works with hyperscaler. Does it work with other back-end storage or secondary storage solutions as well?
Starting point is 00:36:46 We can, yes. We can work with that. It's in the Commvault platform. Okay. Right. Yeah, my problem is just I've seen way too many backup back-ends designed for backup and not even for recovery, let alone this. And so it's like, yes, and we have 24 10-terabyte drives to give you 200 terabytes of usable capacity in a 14-plus-2 RAID
Starting point is 00:37:14 6 set. You know, run Oracle on that. Well, and that's, you bring up, you know, the marvelous point. When we're talking about our clone database functionality, it's kind of like, you know, I know historically DBAs, you know, they don't like two things: one, getting phone calls, realistically, and two, having slow databases. So advocating, you know, here, Mr. Customer, you can do this from a flash array. As you mentioned, something like Pure, since we support replication, you know, the customer has already gone out and bought a second, you know, Pure array. Great, we support that with IntelliSnap with replication, so we could actually do clone databasing for dev and test off of the secondary array, which realistically probably isn't used. And this would be good for dev and test, right. You can
Starting point is 00:38:00 boot it up from our back end. However, there's a difference between an all-flash array and our backend, even with some flash cache, to say, you can do this. Sure. You know, and if you want to run, maybe, you know, because we can automatically schedule operations and do all these other components, you know, if you want to run maybe an automated process that does some, you know, like financial reporting or something off of that, great. It's running in the background. You're fine. You probably don't want to spin up, you know, 10 copies off of a, you know, storage backend that is, again, focused on recovery, not necessarily access. Right. And that's where we've actually started with some of the components, as we've been looking at saying, you know, read speeds over write speeds are more important, because we want to make sure that, look, we want incremental forever everywhere.
Starting point is 00:38:50 After the baseline with deduplication, you know, for the big majority of our clients, realistically, there's almost no point not to use deduplication these days. There are a couple of fringe use cases where it's still like, yeah, realistically, you can use it, but it's not really going to help you. Things like medical imaging data don't dedupe, right? But for the lion's share of, hey, I'm running VMware or I'm running a hypervisor or I'm running databases and all these other components, deduplication is definitely your friend. No reason not to. Right. Well, you know, I spent a lot of time studying deduplication, and this may sound surprising, but over time I'm becoming less enamored of it. Um, Howard... Well, because the nature of the backup data has changed. Yeah. Well, the whole streaming model has kind of gone kaboom. When Frank and Hugo and those guys came up with Data Domain,
Starting point is 00:39:54 Yeah, we were doing weekly full, daily incremental. Yeah. And storing 90 days. Yeah. You know, that means there's 27 full backups in there. Right. That's a very duplicate-rich environment. Yeah, yeah. And when we switch to incremental forever and change block tracking, the amount of data there that duplicates has gone down dramatically. Um, and while it makes sense to me on something like hyperscale, I'm generally saying, you know, don't go buy that purpose-built backup appliance that's just a deduplicating target. Send your data to an S3 object store. It costs a tenth as much per gigabyte to buy the object store, and you're not going to get 10 to 1 anymore. Yeah. Nope. Yeah. Um,
Starting point is 00:40:47 but the only reason to deduplicate is to save money. And so it all comes down to, you know, is the storage on your backend priced so that it's worthwhile to deduplicate too. Right. Well, and that's a flash thing,
Starting point is 00:41:04 right? It's the other thing, but right. Which is part of the pricing. Right. Well, and that's a Flash thing, right? It's the other thing. Right. Which is part of the pricing. Yeah, well, yeah. And when we start talking about Flash, then we eliminate most of the performance penalty around deduplication.
Starting point is 00:41:15 Right, right. But again, it comes down to price. If you give me 30 terabytes of my data, I don't care if you have 10 terabytes on the back end and dedupe it, or if you have 30 terabytes on the back end. I care about what the check says. Right. Yeah. And the performance. And, you know, from your perspective, which I agree with, it's like, if all I care about is the deduplication ratio, then I should be advocating, you know, full-on-full backups to our customers every day, to be like, guys, look how great my ratio is. Um, yeah. Right. Yeah, yeah, yeah. I got a hundred to one. We actually had a customer,
Starting point is 00:41:52 He asked me one day, he's like, what's your dedupe ratio? And I just flat out said, it's a million to one. And he was like, what do you mean? I'm like, I'm going to take a JPEG image, I'm going to copy it a million times in the directory, and I'm going to back it up. But if you want a million-to-one deduplication, I'll provide that to you. Exactly. But first you have to have a million duplications. So we still do see some value, even in incremental forever with deduplication.
Starting point is 00:42:17 To your point, it's not from a 20-to-1 perspective anymore. No, but there is a two-to-one, three-to-one. All the Windowses and all the patches and all that stuff, there's that, for sure. And then we're seeing an uptick, obviously, also in secondary environments, where basically, if customers are running, like, maybe a coding environment where they have six or seven instances of a database for user acceptance testing or something like that, then even with the functionality that we provide, some people still have their structures to, you know, create those duplicates, in which
Starting point is 00:42:55 case, hey, of course that makes sense. But to your point, if I'm talking about the same VM, right, I would not advocate doing full-on-full backups. I'd advocate an incremental forever, in which case, this is how we report it on the back end: we report a space savings comparison, to basically say, here's the size of all of your applications on the front end, if everything was uncompressed and undeduplicated, and here's how much space savings we are providing, not a deduplication ratio. Right.
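To put numbers on that distinction, here is a tiny worked example with made-up figures: a deduplication ratio can be gamed by manufacturing duplicates, while space savings compares the front-end data set to the back-end capacity actually consumed.

```python
# Made-up numbers, purely to illustrate ratio versus space savings.

front_end_tb = 30.0    # size of all protected applications, undeduped and uncompressed
back_end_tb = 10.0     # secondary-storage capacity actually consumed

dedupe_ratio = front_end_tb / back_end_tb        # 3.0 : 1
space_savings = 1 - back_end_tb / front_end_tb   # 0.667 -> 66.7%

print(f"dedupe ratio : {dedupe_ratio:.1f}:1")
print(f"space savings: {space_savings:.1%}")

# The "million to one" joke above: a million copies of one JPEG give a huge
# ratio, but the savings only exist because the duplicates were created first.
```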
Starting point is 00:43:36 Because one of the other enhancements we've done recently is, not every application or architecture supports incremental forever. So what we've done is we've created a block-level driver, mostly for applications, where we can basically look at the block level of the underlying file system, take a database, and track the components to say, okay, I'll have an Oracle database, and instead of doing a full RMAN dump, I'll integrate my block-level driver with RMAN, to basically say, RMAN's like, I need you to do this operation, but I'm only going to move the changed blocks in the background. So it's still RMAN recovery. It's still all the great things.
Starting point is 00:44:14 But instead of saying, I'm going to do full-on-full backups there all the time, why would I? Let's look and see where we can intelligently move less data off the infrastructure. Have you talked to a DBA about this? Yeah, we've actually showed it to them. I've never been able to convince a DBA any of these things were a good idea. Well, one of the things that we actually have, and it's funny from, like, an Oracle perspective, is when you run the Oracle instances, we actually have a window that comes up and says, show me the RMAN script. Because we build it on the fly, and we're like, here are the exact commands that we're using. Or you can save that and run it as an external third party,
Starting point is 00:44:51 like an RMAN job, back into our backend completely. Or, if you really wanted to... My guys would go for running the RMAN job into your infrastructure. Yeah, we actually see that. We do. That's as far as I could ever convince them. Right. All right, gents, I'm going to have to call it here. It's getting... we're getting 40, uh, past the time frame here. Um, Howard, are there any more final questions
Starting point is 00:45:16 for Jonathan? Well, the only last thing that Jonathan didn't mention, that I think we would be remiss not to get in, is to talk about how Commvault, both the platform and the company, are different from the other players. Although, now that Veeam has made its way into at least AAA, if not the majors, they qualify for this too. But everybody else built their system by acquisition. And you can see the red thread where the hands were sewn on. I'm not implying they're Frankenstein's monster. Okay, I am implying they're Frankenstein's monster. Ours. Oh, my gosh.
Starting point is 00:46:08 But if you look at Veritas and IBM and EMC, or I should say Dell EMC nowadays, because I did say Veritas, they have an archive solution, and they have a backup solution, and they are completely independent. And if I want to back up and archive my NetApp filer every night, they will scan it twice and crush the metadata on that file system. Yeah. So, Jonathan, tell us how you don't do that. Well, I mean, this is sort of, I guess you could say, one of the fundamental underpinnings of how Commvault approaches everything: we've got that code base that we've grown from the ground up that basically allows us to say, what type of data management operation would you like to do? Would you like to do a streaming backup operation?
Starting point is 00:47:02 Would you like to do an archiving operation? And archiving, just so we preface it, is archiving as in actually removing the data, or moving the data from the source into a separate tier, not archiving as in, I'm just going to store that backup for a very long time. Well, is this also archiving up to where I can have a content index? Correct. Yes.
Starting point is 00:47:27 Because for me, the difference between a backup and an archive is the index. Backups are indexed by where they came from, because the purpose of a backup is putting things back where they came from. Archives are indexed by their content, because they exist for me to find things in.
Starting point is 00:47:53 Well, and then, so for us, there's two types, I guess you could say, from an indexing perspective. There's the metadata index of, I know the name of the file I'm looking for, I'll search for that. Then there's the actual content index of, I have no idea about the name of the file, but it says Howard in it, so let me search all those files. And it says Johnson, because Johnson's the one suing us for what Howard did. Misbehave. Exactly. Right.
Starting point is 00:48:19 all of these components, they create yet another copy of data. We're looking at it as, well, why don't I repurpose a, you know, a copy of data to do multiple different things? Now, you know, the operations change depending on obviously, you know, Howard, you made a really good point on the relevance of the operation to say, am I, you know, am I taking a copy of the data because one day I might have to put it back, or am I kind of moving the data into the repository, which will then be accessed or searched directly, right? But Jonathan, isn't the key here that you've,
Starting point is 00:48:55 A, you've grown up from a version of the world where you controlled the metadata, and what you've done is you've been able to extend and enhance the metadata over time to incorporate new functionality? Exactly. That's exactly where I was headed with it. So I appreciate that. It's, you know, from our perspective, a copy of the data with no understanding or metadata or characteristics or anything about source destination, content, longevity, even tagging, it's just another copy of the data. What's the relevance of that?
Starting point is 00:49:32 We really wanted to foundationally move past the fact of, yes, we house a protection copy or an archive copy, but can we give you distinct and other benefits for that data? Can we be the source to feed other workloads? Can we allow you to run searches directly off of that? And, you know, to us, that's much more important than a conversation on, you know, dedupe ratios or, you know, basically whose compression algorithm is better. It's that metadata key component that allows us to, you know, realistically understand what data we have in there.
Starting point is 00:50:06 And especially with the, you know, the upcoming changing landscapes around the world, it's being able to understand what data that you have, should you have it, or do I need to remove it, that, you know, are becoming critical common conversations with our customers. Okay. Hey, Jonathan, is there anything else you'd like to say to our listening audience? I'm looking forward to episode 55. Good for you. Very good. Better than most, I would say.
Starting point is 00:50:36 Well, this has been great, Jonathan. Thank you very much for being on our show today. Happy to be here, guys. Thanks very much. Next month, we'll talk to another system storage technology person. Any questions you want us to ask, please let us know.
Starting point is 00:50:49 That's it for now. Bye, Howard. Bye, Ray. Until next time. Thank you.
