Grey Beards on Systems - 135: Greybeard(s) talk file and object challenges with Theresa Miller & David Jayanathan, Cohesity
Episode Date: August 1, 2022
I’ve known Theresa Miller, Director of Technology Advocacy Group at Cohesity, for many years now and just met David Jayanathan (DJ), Cohesity Solutions Architect, during the podcast. Theresa could easily qualify as an old timer, if she wished, and DJ was very knowledgeable about traditional file and object storage. We had a wide …
Transcript
Hey everybody, Ray Lucchesi here.
Welcome to another sponsored episode of the GreyBeards on Storage podcast,
a show where we get GreyBeards bloggers together with storage and system vendors
to discuss upcoming products, technologies, and trends affecting the data center today.
This GreyBeards on Storage episode is brought to you today by Cohesity.
And now it is my great pleasure to introduce Theresa Miller, Director of Technology Advocacy Group,
and David Jayanathan, or DJ, a Solutions Architect at Cohesity.
So Theresa and David, why don't you tell us a little bit about yourselves?
Sure. So as Ray mentioned, my name is Theresa Miller.
I'm a longtime community advocate as a Microsoft MVP. I'm a Citrix CTP and VMware vExpert, but I also run technology advocacy
over at Cohesity, and I've been doing a lot with SmartFiles, which is files and objects specifically,
for those of you that don't know what SmartFiles is.
So I'm excited to be here today. Thanks for having me, Ray.
Great. David?
My name is David Jayanathan. I'm a solutions architect here at Cohesity,
specializing in our file and object platform, referred to as SmartFiles. I have a fairly
in-depth enterprise storage background, comprising about 15 years on the customer and sales side.
Nice to speak with you, Ray.
That's great. That's great.
All right. So, David, maybe you can answer the first question up here.
What challenges are customers facing when it comes to files and objects running on traditional storage?
What we've seen in the storage industry is that there's a lot of very mature storage products out there that have grown over time.
And they haven't necessarily met the demands of what consumers or customers are looking for.
They're able to manage things within a silo, but typically NAS might be managed separately from object storage.
And as object is becoming more prevalent, you have different silos that you've created. And a lot of these products, frankly, don't have modern security built into
them when it comes to ransomware mitigation. That's a lot of what we've seen today.
Theresa?
Yeah, I would agree with all of that. You know, security, to me, is one of the big ones. I think the other thing I'd like to call out is that traditional
storage doesn't always integrate well with cloud. And so organizations with their hybrid cloud and
even multi-cloud strategies could run into some roadblocks.
So when you talk about security, Theresa, are you talking about ransomware kinds of things, in terms of who can get in and how, in multiple layers?
RBAC is more around the security front in terms of who has access to the data, but it can go so much deeper.
It can be the ransomware detection.
It can be where you store your data. It can be leveraging WORM. It could also be
vaulting data to another location with an appropriate air gap. So the conversation can go pretty wide.
Yeah, yeah. Security is a wide area to discuss, but let's move on.
Theresa, is having traditional storage enough?
Should a solution offer more?
Yeah, that's a really great question.
So when I think of enterprise and storage,
I just think of having a SAN and having the ability to possibly have
another copy of that data replicated with the right level of network failover in place to get
over to that data, at least by way of the on-prem environment. But what about backing up that
data or archiving that data, migrating it from another solution, or even tiering when data gets
cold? You might not want it on expensive storage. So I think one of the things that organizations
should be thinking about is, well, can a single solution do all of those things instead of buying multiple separate disparate products?
So, you know, David, DJ, I'm going to go ahead and turn this over to you for maybe some of your thoughts.
Yeah, in traditional storage environments, the challenge is that storage administrators are typically data custodians.
They're not the data owner.
They're not responsible for how data is created.
But they're responsible for the integrity and protection of that data.
So when it comes to helping manage the lifecycle of that data, how it's adequately secured, where it's placed, and meeting legal requirements,
those are a lot of the challenges.
And so having a solution that holistically takes into account retention of the data, resiliency of the data, security and governance needs for that data,
that's what a modern storage solution needs to have that most traditional storage solutions do not have.
That's a fairly sizable bill of materials for functionality for a storage system, DJ.
You're talking about backup, you're talking about replication, you're talking about security, you're talking about
compliance. This is way beyond the traditional worldview of a storage environment that
I've seen in the past.
I think that you're right, but that's where things are heading nowadays, right? Especially with
ransomware threats or insider threats. You have to take into account other laws that are being
passed, like CCPA in California, for the integrity of people's personally identifiable information as well.
And so all of that is encapsulated in what you actually need to take into account as a modern storage administrator to facilitate what the business needs.
Interesting.
Interesting.
So where does the cloud fit in all this stuff?
How do the customer and the enterprise manage their data that spans cloud and non-cloud
environments and things of that nature?
Yeah, no worries. Hey, so cloud, yeah, the world has
changed a lot in the last couple of years. I think that cloud used to be one of those things that
we were thinking about. And in the recent past, everyone's leveraging
cloud as much as possible. And so when you consider files and objects, when organizations
have not only on-prem data and data at the edge, and now they're putting it in the cloud, storage can become pretty distributed. And so
I think it's really important for organizations to really put pen to paper and think about
how they want to manage and span that. Now, I will say that with a modern solution,
you should be able to manage your files and objects regardless of where they live.
It should be able to take on that centralized management approach that you need.
But beyond that, I think it leaves us with a lot of data silos when you think about data being placed all over.
Yeah.
The management of data that spans cloud, on-prem, colos, and various data centers throughout
America or the world or whatever is a sizable challenge.
I mean, something like that would have to be running somewhere as a management framework, would have to be running in the cloud and talking to all these storage elements sitting wherever they are.
Is that how you think this would go, Theresa?
Yeah, I think it's possible, right?
I think organizations are still going to want some data on-prem.
They're going to want some in the cloud.
It's going to depend on the business use case.
And so when you think about it being so distributed, I think having a modern solution to manage
all of that is going to be critical.
DJ?
Yeah, I think data portability is key.
You want to be able to co-locate your data with the compute that needs to consume it, wherever that is, and not have different
tools to reformat that data depending on its location. Whether it's on-prem, in the cloud,
or between clouds, the data needs to be consistent between them, and having something to
sit as an abstraction layer on top of that is going to be critical as hybrid cloud environments continue to grow.
And you're talking edge too. I mean,
data at the edge is raw data kind of stuff. It's a different world than IT, enterprise,
cloud, stuff like that. Don't you think?
Yeah. I mean, just using IoT as an example, right? There's
a lot of endpoints that are generating data,
and that has to get scraped together
and moved to another location,
but you don't necessarily want to compute
at the edge sometimes.
You want to have the flexibility to do that
depending on the workload, absolutely.
Yeah, yeah, and very interesting.
All right, DJ, this one's for you.
How important is it to have built-in security
within your files and objects solution?
Well, this one is critically important.
We have had a lot of conversations with customers specifically on security, asking, "What are you doing to help us combat ransomware?"
I think what's important to know is that whenever you're looking at the security posture of a storage product, it's not the only security product in your arsenal.
Frequently, security within the storage realm is the last line of defense, making sure that when all your other systems have failed, you still have a way to protect that data and recover in the event that your environment is compromised.
So when you're taking into consideration what your existing storage
architecture does, really think of it as: everything else has gone awry, and this is my last bastion to be
able to save the business. What am I doing here to be able to recover in the event of a
compromised environment?
In the old days, it would have been backups or, you know, archives or, you
know, Iron Mountain kinds of things where you would go and extract data.
It might take you a week to get it back.
But even those sorts of things aren't viable anymore, given the frequency of ransomware and such.
Theresa?
Yeah.
Well, I think DJ hit it on the head with recoverability. I think it's amazing that solutions can detect anomalies and be forward thinking on that and take that step. But ultimately, the recoverability and the recovery options are critical.
So I actually want to take this a step further, because security and cyber threats have such an impact on businesses these days.
Having an option to vault your data to another location, such as the cloud, and have that data air gapped is important, because ransomware can affect
backups. And if there's not an air gap to prevent that from happening during a ransomware attack,
how are you ever going to recover? We've seen ransomware literally take out businesses that were not
able to come back, because the process behind encryption and getting keys can be pretty complicated.
So having a vaulted copy that's protected and isolated to be able to recover from is
pretty important.
And the vaulted copy would be encrypted with your own keys or something like that?
Or would it be encrypted based on whichever cloud environment you select to vault your data to?
And a vaulted copy, is this really sort of a backup of the data, or would this be a replication of the data?
I'm just trying to understand what you're saying when you mean vault.
Yeah.
No, it's a really great question. So when it comes to vaulting, you're taking basically a copy of that data and isolating it in the cloud. It would definitely function off of
a protection job. So you could consider it a backup, like potentially a third copy of that
data. We oftentimes, organizationally, want multiple copies of our data, but it's stored in a
fashion that keeps it safe. And then when you want it back, you can recover it. And even
more importantly, it could be gated,
meaning that there's a quorum at play. And in terms of quorum, that means that not just a single user but multiple users, you'd probably minimally have two,
have to approve that recovery. Because compromise can happen at many different levels.
Mm-hmm. And so when you mean quorum, you're talking about multiple people having the ability to force the recovery of that vault onto your data wherever it resides.
When I think of quorum, I'm thinking about multiple systems and stuff like that. But so the vault would be something you would periodically do
at a protection point. So it wouldn't necessarily be every backup, or would it? Or is that something
you would be able to parameterize or something like that?
You can parameterize it based on policy.
So it can be set to whatever level an enterprise
needs from a recovery point and recovery time objective. You have that control, and even
how long you want to retain it. We also talked a little bit about encryption of the storage.
With the right solution, you're going to be able to make sure that that data is locked down, like even in a WORM state.
And then, like you said, the encryption, if it's cloud, is going to happen between your environment, your backup environment, and the cloud.
So there's definitely a set of encryption keys at play.
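The policy-driven vaulting described in this answer can be illustrated with a small sketch. This is only an assumption-laden illustration, not Cohesity's API: the `VaultPolicy` class, its field names, and the helper functions are hypothetical. It shows how a policy might decide which protection runs get vaulted (which drives the recovery point objective) and how long each vaulted copy is retained.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class VaultPolicy:
    """Hypothetical policy; field names are illustrative, not a vendor API."""
    vault_interval: timedelta  # minimum spacing between vaulted copies (drives RPO)
    retention: timedelta       # how long each vaulted copy is kept
    worm_lock: bool = True     # whether vaulted copies stay immutable until expiry


def select_runs_to_vault(run_times, policy):
    """Pick protection runs to vault so copies are at least one interval apart."""
    selected = []
    last = None
    for run in sorted(run_times):
        if last is None or run - last >= policy.vault_interval:
            selected.append(run)
            last = run
    return selected


def expiry(run_time, policy):
    """Return the time after which a vaulted copy may be deleted."""
    return run_time + policy.retention


# Example: backups run every 6 hours, one copy is vaulted per day, kept 30 days.
policy = VaultPolicy(vault_interval=timedelta(days=1), retention=timedelta(days=30))
start = datetime(2022, 8, 1)
runs = [start + timedelta(hours=6 * i) for i in range(8)]  # two days of backups
vaulted = select_runs_to_vault(runs, policy)
```

With these example values, only the midnight runs on each of the two days are vaulted, so not every backup becomes a vault copy.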
I'd like to talk about security further, DJ. Can we dive into this more since it's so important
these days?
Yeah, it's extremely important. You want to be able to make sure that you can
represent to the business that we have this hardened platform that can serve all of your needs
while maintaining a higher security posture than normal. Just to elaborate on Theresa's point on quorum, for example, you talked about
quorum as a systems concept, where you have multiple systems that function together. But you can
extend quorum to a human concept, where it's like, hey, I really don't want to do this type
of destructive recovery operation unless we have two, three, or four
people that have approved a certain workflow. Having that built in, in addition to systems that
can automatically detect abnormal human behavior and disable user accounts, that type
of activity where it just programmatically happens based off of key indicators that you've told it to watch for,
that's what a system needs to have.
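The human quorum idea, where a destructive recovery only proceeds once several people have approved it, can be sketched as follows. This is a minimal sketch under stated assumptions: `RecoveryRequest`, `run_recovery`, and the user and vault names are hypothetical and do not represent any product's actual workflow.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveryRequest:
    """Hypothetical gated recovery request; not a real product API."""
    vault_id: str
    requested_by: str
    required_approvals: int = 2
    approvers: set = field(default_factory=set)

    def approve(self, user: str) -> None:
        # The requester cannot approve their own request, so a single
        # compromised account can never satisfy the quorum alone.
        if user == self.requested_by:
            raise PermissionError("requester cannot approve their own request")
        self.approvers.add(user)  # a set ignores repeat approvals by one user

    def is_authorized(self) -> bool:
        return len(self.approvers) >= self.required_approvals


def run_recovery(request: RecoveryRequest) -> str:
    """Refuse to start the recovery until the quorum has been met."""
    if not request.is_authorized():
        raise PermissionError("quorum not met")
    return f"recovering from {request.vault_id}"


# Example: alice requests a recovery; bob and carol must both approve.
request = RecoveryRequest(vault_id="vault-01", requested_by="alice")
request.approve("bob")
request.approve("bob")                  # repeat approval does not count twice
authorized_after_one = request.is_authorized()
request.approve("carol")
result = run_recovery(request)
```

Storing approvers in a set is a deliberate choice here: the quorum counts distinct people, which matches the "two keys" analogy rather than counting clicks.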
Right, right.
I think of the, you know, missile launches requiring two keys or something like that
to actually happen.
So in this case, it would require multiple people to say this is okay to go and restore
this vault.
Is that correct?
Yeah, and that vault, you know, depending on how far a person wants to take it
or an enterprise wants to take it,
it doesn't even have to be a vault that they can control.
That could be like a vault as a service, if you will,
where you have outsourced
like the actual infrastructure management of that
to another entity.
And then when you layer on something like quorum,
it makes it a very hardened platform so
that you know your data is secure.
Theresa?
Yeah, well, I think DJ said it very well.
I actually have nothing additional to add to that. It's exactly how it functions.
You guys have mentioned WORM a couple of times. In this environment, you're talking about the cloud being a WORM-like solution because it's immutable.
Is that how I would read that?
You can extend WORM to include even the original data set.
Storage protocols as a whole, by definition, can't be completely WORM. There can be policies to put the data in a WORM state or a read-only state after it's written,
but the storage itself needs to be able to accept new writes.
So as for how you deal with it after the fact, it could be that you have local data protection instances in the form of snapshots.
That's a common technology, where those become WORM the second that they're written. Or it could be the downstream data-protection-centric replications, archives,
and vault copies that have WORM. Any combination of all of those can have WORM enabled, specifically
curated to a customer's environment.
Right, right. There's been some mention of SmartFiles
in this discussion. I'm not sure I understand what SmartFiles is.
Maybe DJ, maybe you can explain
what SmartFiles is in Cohesity.
SmartFiles is Cohesity's file
and object services platform.
We have a unified storage platform
to serve up all protocols concurrently
from the same cluster, if you will.
And then from there, we have a data management plane called Helios that sits on top of
that and allows you to do multi-cluster management across on-premises and hybrid cloud environments.
It allows you to enact a lot of the security features that we've talked about today for
your file and object workloads, but then orchestrate the downstream data management
capabilities in the form of replication, archiving, vaulting, all within a single umbrella.
Okay, so SmartFiles is a file and object storage solution that runs all over. And Helios is a
data management, data protection solution on top of that. And then there's another solution, which is data management on top of that.
Is that how I understand this? Or did I get this wrong?
Theresa, you want to explain Helios?
Yeah. So Helios is
ultimately the name of our
UI that allows for centralized management of
any of your Cohesity workloads. It could be our as-a-service offering. It could be our backup and recovery on-prem solution.
It could be a CE that you have deployed, meaning a cloud edition of our solution.
So Helios really is just that centralized management UI that our customers use.
SmartFiles actually just runs on top of our traditional DataProtect solution.
We refer to file shares as views, ultimately, in our solution.
And you can store files and objects on there
and take advantage of everything that DJ described.
You can do backup.
You can do archiving, tiering, migration.
It can be in the cloud. Your data
can be in the cloud. You get the security. So it's a pretty comprehensive offering.
All right. Well, Theresa and David, is there anything you'd like to say to our listening
audience before we close?
Yeah. So I guess my key takeaway for the audience today is that I would
recommend taking a look at what you're using and what you have today. It possibly is traditional
storage. Think about what is missing to meet your enterprise needs, because not only might you be making it harder for yourself by not
having a more comprehensive solution, you're probably also not as secure as you need to be, based
on all the security conversations we had today.
DJ?
Yeah, it's very hard to predict what business
requirements you're going to have going down the road. And so even if you think you're in a good spot today, it doesn't hurt to periodically reevaluate what technologies you have in your stable, so to speak, and make sure that you perform exercises for data recoveries at scale. Being able to explain to someone, this is what a large
scale recovery would look like and set proper expectations will go a long way. And having
the proof that you've done regular testing is critically important.
Right, right. Well, this has been great. Theresa and David, thank you very much for being on our
show today.
Thank you, Ray. It's been a pleasure.
And thanks again to Cohesity for sponsoring this podcast.
That's it for now.
Bye, Theresa, and bye, David.
Tell your friends about it.
Please review us on Apple Podcasts, Google Play, and Spotify,
as this will help get the word out. Thank you.