Grey Beards on Systems - 135: Greybeard(s) talk file and object challenges with Theresa Miller & David Jayanathan, Cohesity

Episode Date: August 1, 2022

Sponsored By: I’ve known Theresa Miller, Director of Technology Advocacy Group at Cohesity, for many years now and just met David Jayanathan (DJ), Cohesity Solutions Architect during the podcast. Theresa could easily qualify as an old timer, if she wished and DJ was very knowledgeable about traditional file and object storage. We had a wide …

Transcript
Starting point is 00:00:00 Hey everybody, Ray Lucchesi here. Welcome to another sponsored episode of the GreyBeards on Storage podcast, a show where we get GreyBeards bloggers together with storage and system vendors to discuss upcoming products, technologies, and trends affecting the data center today. This GreyBeards on Storage episode is brought to you today by Cohesity. And now it is my great pleasure to introduce Theresa Miller, Director of Technology Advocacy Group, and David Jayanathan, or DJ, a Solutions Architect at Cohesity. So Theresa and David, why don't you tell us a little bit about yourselves?
Starting point is 00:00:40 Sure. So as Ray mentioned, my name is Theresa Miller. I'm a longtime community advocate as a Microsoft MVP. I'm a Citrix CTP and VMware vExpert, but I also run technology advocacy over at Cohesity, and I've been doing a lot with SmartFiles, and files and objects specifically, for those of you that don't know what SmartFiles is. So I'm excited to be here today. Thanks for having me, Ray. Great. David? My name is David Jayanathan. I'm a solutions architect here at Cohesity, specializing in our file and object platform, referred to as SmartFiles. I have a fairly
Starting point is 00:01:21 in-depth enterprise storage background comprising about 15 years on the customer and sales side. Nice to speak with you, Ray. That's great. That's great. All right. So, David, maybe you can answer the first question up here. What challenges are customers facing when it comes to files and objects running on traditional storage? What we've seen in the storage industry is that there's a lot of very mature storage products out there that have grown over time. And they haven't necessarily met the demands of what consumers or customers are looking for. They're able to manage things within a silo, but typically NAS might be managed separately from object storage.
Starting point is 00:02:01 And as object is becoming more prevalent, you have different silos that you've created. And a lot of these products, frankly, don't have modern security built into them when it comes to ransomware mitigation. That's a lot of what we've seen today. Theresa? Yeah, I would agree with all of that. You know, security, to me, is one of the big ones. I think the other thing I'd like to call out is that traditional storage doesn't always integrate well with cloud. And so organizations with their hybrid cloud and even multi-cloud strategies could run into some roadblocks. So when you talk about security, Theresa, are you talking about ransomware kinds of things, in terms of who can get in and how, in multiple layers? RBAC is more around the security front in terms of who has access to the data, but it can go so much deeper. It can be the ransomware detection. It can be where you store your data. It can be leveraging WORM. It could also be
Starting point is 00:03:40 vaulting data to another location with an appropriate air gap. So the conversation can go pretty wide. Yeah, yeah. Security is a wide area to discuss, but let's move on. Theresa, is having traditional storage enough? Should a solution offer more? Yeah, that's a really great question. So when I think of enterprise and storage, I just think of having a SAN and having the ability to possibly have another copy of that data replicated with the right level of network failover in place to get over to that data, at least by way of the on-prem environment. But what about backing up that
Starting point is 00:04:27 data or archiving that data, migrating it from another solution, or even tiering when data gets cold? You might not want it on expensive storage. So I think one of the things that organizations should be thinking about is, well, can a single solution do all of those things instead of buying multiple separate disparate products? So, you know, David, DJ, I'm going to go ahead and turn this over to you for maybe some of your thoughts. Yeah, in traditional storage environments, the challenge is that storage administrators are typically data custodians. They're not the data owner. They're not responsible for how data is created. But they're responsible for the integrity and protection of that data.
Starting point is 00:05:16 So when it comes to helping manage the lifecycle of that data, how it's adequately secured, where it's placed, meeting legal requirements. Those are a lot of the challenges. And so having a solution that holistically takes into account retention of the data, resiliency of the data, security and governance needs for that data, that's what a modern storage solution needs to have that most traditional storage solutions do not have. That's a fairly sizable bill of materials for functionality for a storage system, DJ. Backup, you're talking about replication, you're talking about security, you're talking about compliance. This is way beyond a storage environment of traditional worldview that I've seen in the past.
Starting point is 00:06:06 I think that you're right, but that's where things are heading nowadays, right? Especially with ransomware threats or insider threats. You have to take into account other laws that are being passed like CCPA in California for integrity of people's personally identifiable information as well. And so all of that is encapsulated in what you actually need to do as a modern storage administrator to take all of that into account to facilitate what the business needs. Interesting. Interesting. So where does the cloud fit in all this stuff? How does the customer and the enterprise manage their data that spans cloud and non-cloud
Starting point is 00:06:48 environments and things of that nature? Yeah, no worries. Hey, so cloud, yeah, the world has changed a lot in the last couple of years. I think that cloud used to be one of those things that we were thinking about. And in the recent past, everyone's leveraging cloud as much as possible. And so when you consider files and objects, when organizations have not only on-prem data, data at the edge, and now they're putting it in the cloud, storage can become pretty distributed. And so I think it's really important for organizations to really put pen to paper and think about how they want to manage and span that. Now, I will say that with a modern solution, you should be able to manage your files and objects regardless of where they live.
Starting point is 00:07:48 It should be able to take on that centralized management approach that you need. But beyond that, I think it leaves us with a lot of data silos when you think about data being placed all over. Yeah. The management of data that spans cloud, on-prem, colos, and various data centers throughout America or the world or whatever is a sizable challenge. I mean, something like that would have to be running somewhere as a management framework, would have to be running in the cloud and talking to all these storage elements sitting wherever they are. Is that how you feel this? Is that how you think this would go on, Theresa?
Starting point is 00:08:36 Yeah, I think it's possible, right? I think organizations are still going to want some data on-prem. They're going to want some in the cloud. It's going to depend on the business use case. And so when you think about it being so distributed, I think having a modern solution to manage all of that is going to be critical. DJ? Yeah, I think data portability is key.
Starting point is 00:08:58 You want to be able to co-locate your data with wherever the compute that needs to consume it is, and not have different tools to sort of reformat that data depending on its location, whether it's on-prem or cloud or between clouds. The data needs to be consistent between them, and having something to sit as an abstraction layer on top of that is going to be critical as hybrid cloud environments continue to grow. And you're talking edge too. I mean, data at the edge is raw data kind of stuff. It's a different world than IT, enterprise, cloud, stuff like that. Don't you think? Yeah. I mean, just using IoT as an example, right? There's a lot of endpoints that are generating data and that has to get scraped together
Starting point is 00:09:47 and moved to another location, but you don't necessarily want to compute at the edge sometimes. You want to have the flexibility to do that depending on the workload, absolutely. Yeah, yeah, and very interesting. All right, DJ, this one's for you. How important is it to have built-in security
Starting point is 00:10:03 within your file and object solution? Well, this one is critically important. We have had a lot of conversations with customers specifically on security and what are you doing to help us combat ransomware? I think what's important to know is that whenever you're looking at the security posture of a storage product, it's not the only security product in your arsenal. Frequently, security is the last line of defense within the storage realm to make sure that when all your other systems have failed, you still have a way to protect that data and recover in the event that your environment is compromised. So when you're taking into consideration what your existing storage architecture does, really think of it as everything else has gone awry. This is my last bastion to be able to save the business. Like, what am I doing here to be able to recover in the event of a compromised environment? In the old days, it would have been backups or, you know, archives or, you
Starting point is 00:11:04 know, Iron Mountain kinds of things where you would go and extract data. It might take you a week to get it back. But even those sorts of things aren't viable anymore, given the frequency of ransomware and such. Theresa? Yeah. Well, I think DJ hit it on the head with recoverability. I think it's amazing that solutions can detect anomalies and be forward thinking on that and take that step. But ultimately, the recoverability and the recovery options are critical. So I actually want to take this a step further because security and cyber threats are just so impacting to businesses these days. Having an option to vault your data to another location, such as the cloud, and have that data air gapped because ransomware can affect
Starting point is 00:12:06 backups. And if there's not an air gap to prevent that from happening during a ransomware attack, how are you ever going to recover? We've seen ransomware literally take out businesses and not be able to come back because the process behind encryption and getting keys can be pretty complicated. So having a vaulted copy that's protected and isolated to be able to recover from is pretty important. And the vaulted copy would be encrypted with your own keys or something like that? Or would it be encrypted based on wherever the cloud environment you select to vault your data for? And a vault of a copy, so this is really sort of a backup of the data, or this would be a replication of the data?
Starting point is 00:13:01 I'm just trying to understand what you're saying when you mean vault. Yeah. No, it's a really great question. So when it comes to vaulting, you're taking basically a copy of that data and isolating it in the cloud. It would definitely function off of a protection job. So you could consider it a backup, like potentially a third copy of that data. We oftentimes, organizationally, want multiple copies of our data, but it's stored in a fashion that keeps it safe. And then when you want it back, you can recover it. And even more importantly, it could be gated, meaning that there's a quorum at play. And in terms of quorum, that means that only a certain user, or it should be multiple users, you'd probably minimally have two,
Starting point is 00:13:57 that have to approve that recovery. Because compromise can happen at many different levels. Mm-hmm. And so when you mean quorum, you're talking about multiple people having the ability to force the recovery of that vault onto your data wherever it resides. When I think of quorum, I'm thinking about multiple systems and stuff like that. But so the vault would be something you would periodically do at a protection point. So it wouldn't necessarily be every backup or would it? Or is that something you would be able to parameterize or something like that? You can parameterize it based on policy. So it can be to whatever level an enterprise needs. From a recovery point and recovery time objective, you have that control, and even how long you want to retain it. We also talked a little bit about encryption of the storage,
Starting point is 00:14:59 with the right solution, you're going to be able to make sure that that data is locked down, like even in a WORM state. And then, like you said, the encryption keys would likely, if it's cloud, it's going to happen between your environment, your backup environment, and the cloud. So there's definitely a set of encryption keys at play. I'd like to talk about security further, DJ. Can we dive into this more since it's so important these days? Yeah, it's extremely important. You want to be able to make sure that you can represent to the business like, hey, we have this hardened platform that can serve all of your needs while maintaining a higher security posture than normal. Just to elaborate on Theresa's point for quorum, for example, you know, you talked about quorum as a systems concept where you have multiple systems that function together. But when you
Starting point is 00:15:54 extend quorum to a human concept where it's like, hey, you know, I really don't want to do this type of destructive recovery operation unless we have, you know, two, three, four people that have approved a certain workflow. Like, having that built in, in addition to systems that can automatically detect, you know, abnormal human behavior, disabling user accounts, like that type of activity where it just programmatically happens based off of key indicators that you've told it to do. Like, that's what a system needs to have. Right, right. I think of the, you know, missile launches requiring two keys or something like that
Starting point is 00:16:33 to actually happen. So in this case, it would require multiple people to say this is okay to go and restore this vault. Is that correct? Yeah, and that vault, you know, depending on how far a person wants to take it or an enterprise wants to take it, it doesn't even have to be a vault that they can control. That could be like a vault as a service, if you will,
Starting point is 00:16:55 where you have outsourced like the actual infrastructure management of that to another entity. And then when you layer on something like quorum, it makes it a very hardened platform so that you know your data is secure. Theresa? Yeah, well, I think DJ said it in a very great way. I actually have nothing additional to add to that. It's exactly how it functions. You guys have mentioned WORM a couple of times. In this environment, you're talking about the cloud being a WORM-like solution because it's immutable.
Starting point is 00:17:32 Is that how I would read that? You can extend WORM to include even the original data set. Storage protocols as a whole, by definition, can't be completely WORM. Like there can be policies to make the data in a WORM state or a read-only state after it's written. But the storage itself needs to be able to accept new writes. So how you deal with it after the fact, it could be you have local data protection instances in the form of snapshots. That's a common technology where those become WORM the second that they're written. Or it could be the downstream data-protection-centric replications, archives, vault copies that have WORM. Any combination of all of those can have WORM enabled, specifically curated to a customer's environment. Right, right. There's been some mention of SmartFiles
Starting point is 00:18:21 in this discussion. I'm not sure I understand what SmartFiles is. Maybe DJ, maybe you can explain what SmartFiles is in Cohesity. SmartFiles is Cohesity's file and object services platform. We have a unified storage platform to serve up all protocols concurrently from the same cluster, if you will.
Starting point is 00:18:44 And then from there, we have a data management plane called Helios that sits over top of that, that allows you to do multi-cluster management between on-premise hybrid clouds. It allows you to enact a lot of the security features that we've talked about today for your file and object workloads, but then orchestrate the downstream data management capabilities in the form of replication, archiving, vaulting, all within a single umbrella. Okay, so SmartFiles is a file and object storage solution that runs all over. And Helios is a data management, data protection solution on top of that. And then there's another solution, which is data management on top of that. Is that how I understand this? Or did I get this wrong?
Starting point is 00:19:30 Theresa, you want to explain Helios? Yeah. Yeah. So Helios is ultimately the name of our UI that allows for centralized management of any of your Cohesity workloads. They could be our as-a-service offering. It could be our backup and recovery on-prem solution. It could be a CE that you have deployed, meaning like a cloud edition of our solution. So Helios really is just kind of like that centralized management UI that our customers use. SmartFiles actually just runs on top of our traditional DataProtect solution.
Starting point is 00:20:24 We refer to file shares as views, ultimately, in our solution. And you can store files and objects on there and take advantage of everything that DJ described. You can do backup. You can do archiving, tiering, migration. It can be in the cloud. Your data can be in the cloud. You get the security. So it's a pretty comprehensive offering. All right. Well, Theresa and David, is there anything you'd like to say to our listening
Starting point is 00:20:59 audience before we close? Yeah. So I guess my key takeaway for the audience today is that I would recommend taking a look at what you're using and what you have today. It possibly is traditional storage, and think about what is missing to meet your enterprise needs, because you might be not only making it harder for yourself by not having a more comprehensive solution, you're probably not as secure as you need to be based on all the security conversations we had today. DJ? Yeah, it's very hard to predict what business requirements you're going to have going down the road. And so even if you think you're in a good spot today, it doesn't hurt to periodically reevaluate what technologies you have in your stable, so to speak, and make sure that you perform exercises for data recoveries at scale. Being able to explain to someone, this is what a large scale recovery would look like and set proper expectations will go a long way. And having the proof that you've done regular testing is critically important.
Starting point is 00:22:17 Right, right. Well, this has been great. Theresa and David, thank you very much for being on our show today. Thank you, Ray. It's been a pleasure. And thanks again to Cohesity for sponsoring this podcast. That's it for now. Bye, Theresa, and bye, David. Tell your friends about it. Please review us on Apple Podcasts, Google Play, and Spotify, as this will help get the word out. Thank you.
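
Editorial sketch: the human quorum described in the episode, where a vault recovery proceeds only once at least two people have approved it, could be modeled roughly as below. This is a minimal illustration in Python and not Cohesity's implementation. The class `RecoveryRequest`, its fields, the user names, and the rule that a requester cannot approve their own request are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveryRequest:
    """Hypothetical request to restore a vaulted copy (illustrative only)."""
    vault_id: str
    requested_by: str
    required_approvals: int = 2  # assumed minimum, echoing "minimally two"
    approvers: set = field(default_factory=set)

    def approve(self, user: str, authorized: set) -> None:
        """Record one approval from an authorized user who is not the requester."""
        if user not in authorized:
            raise PermissionError(f"{user} is not an authorized approver")
        if user == self.requested_by:
            # Assumption: a single account must not satisfy the quorum alone.
            raise PermissionError("requester cannot approve their own request")
        self.approvers.add(user)  # a set, so repeat approvals count once

    def is_approved(self) -> bool:
        """True once enough distinct approvers have signed off."""
        return len(self.approvers) >= self.required_approvals


authorized_users = {"alice", "bob", "carol"}  # hypothetical approver list
request = RecoveryRequest(vault_id="vault-01", requested_by="alice")

request.approve("bob", authorized_users)
print(request.is_approved())  # prints False: one approval is not a quorum

request.approve("carol", authorized_users)
print(request.is_approved())  # prints True: two distinct approvers
```

Keeping approvers in a set means one account approving repeatedly still counts once, which is the property that lets a two-person rule withstand a single compromised credential.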
