Grey Beards on Systems - 150: GreyBeard talks Zero Trust with Jonathan Halstuch, Co-founder & CTO, RackTop Systems
Episode Date: June 22, 2023
Sponsored By: This is another in our series of sponsored podcasts with Jonathan Halstuch (@JAHGT), Co-Founder and CTO of RackTop Systems. You can hear more in Episode #147 on ransomware protection and Episode #145 on proactive NAS security. Zero Trust Architecture (ZTA) has been touted as the next level of security for a while now. …
Transcript
Hey everybody, Ray Lucchesi here.
Welcome to another sponsored episode of the Greybeards on Storage podcast,
a show where we get Greybeards bloggers together with storage and system vendors
to discuss upcoming products, technologies, and trends affecting the data center today. This Greybeards on Storage episode is again brought to you by RackTop Systems.
And now it is my great pleasure to once again introduce Jonathan Halstuch, co-founder and CTO of RackTop Systems.
So, Jonathan, why don't you tell us a little bit about yourself and what Zero Trust Security means for storage?
Hey, Ray, thanks for having me on the show.
Excited to talk about that today.
So, yeah, my background's in defense and intelligence and protecting data and also understanding how adversaries and nation states go after data for their own purposes.
And really, it's all about collecting data, right? When we think
about cybersecurity, sometimes people think about the network first and endpoints, but really,
the bad guys are after your data. So what we're thinking about is putting the protections
as close to the data as possible where the data lives. And if you think about your largest asset,
it's that unstructured data that typically makes up 80 to 90% of an enterprise environment.
And that's the stuff that lives on a NAS or a file share.
And so we wanted to take a zero trust approach to that security, right?
Everybody's talking about zero trust and what that means.
And really, if you break down zero trust, it moves away from that kind of implicit trust
evaluation, where you say one time, "Yeah, I'm going to give you permission to read and
write to this data," and then you always have permission, to a more dynamic evaluation of
trust where you're
basically evaluating trust for each transaction to an enterprise resource. And in this case,
we're talking about that biggest enterprise resource, your unstructured data. So in real
time, after a user is given read write permissions to a folder or read permissions to a folder,
we're then evaluating, hey, do we want that file operation to happen in real time?
And if it seems nefarious or suspicious or malicious, we can alert on it or block it.
And we're doing it where the data lives so that you don't have to require endpoints to
be deployed to monitor this or agents.
You can actually do this right from the NAS or the file share itself.
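As a rough illustration of the per-operation evaluation described above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the names `FileOp`, `Verdict`, and `evaluate`, the `risk` score, and the thresholds are hypothetical and are not taken from RackTop's product.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    ALERT = "alert"
    BLOCK = "block"


@dataclass(frozen=True)
class FileOp:
    user: str
    op: str      # "read", "write", "delete", ...
    path: str
    risk: float  # hypothetical 0.0-1.0 suspicion score from an analytics layer


def has_permission(user: str, op: str, path: str, acl: dict) -> bool:
    """Static check: the one-time grant of permissions on the parent folder."""
    folder = path.rsplit("/", 1)[0]
    return op in acl.get((user, folder), set())


def evaluate(file_op: FileOp, acl: dict,
             alert_at: float = 0.5, block_at: float = 0.8) -> Verdict:
    """Evaluate each operation, even when the user already holds permissions."""
    if not has_permission(file_op.user, file_op.op, file_op.path, acl):
        return Verdict.BLOCK
    if file_op.risk >= block_at:
        return Verdict.BLOCK
    if file_op.risk >= alert_at:
        return Verdict.ALERT
    return Verdict.ALLOW
```

The point of the sketch is only the ordering: the permission check still happens, but a second, dynamic decision is made for every read, write, or delete.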
So my understanding of zero trust architecture is it's mutual authentication happening almost all the time during transactions that go on and stuff like that.
The challenge with storage is, you know, there's lots of transactions that go between a client and the storage server.
So how does one actually implement something like zero trust architecture in a
storage environment?
Yeah. So from that aspect, a lot of that capability is built into the protocol
and we're not changing how that works, right? So you have different protocols from different eras.
So NFS v3 is a stateless protocol. And so every transaction is going through that log on type of event.
You have newer protocols like NFS 4.0 and later, as well as all the SMB protocols popular around
Windows that are stateful. And with those, you get the concept of a log on and authentication,
and then a session that you're logged into where you have that session for a period of time. And
different policies will dictate how long that session can go on for before you need to
reauthenticate to create a new session. And both the client could log off and end the session or
the server could say, Hey, it's been too long. We need you to reauthenticate and start a new
session. And so that's kind of the base of the protocol. So where the real difference is,
and the real zero trust approach that we bring to the table, though, is what happens after you
authenticate, right? So somehow either that credential was gotten, you know, it's the right,
it's the actual user and they're using their credentials, or maybe the credential has been
compromised and it's a bad actor that's using that credential. And then it's looking at what
are they doing to access the data on the storage and then making that zero trust evaluation for
those operations, right? The read of the file, the modify, the write, the delete, that type of stuff.
So that logon stuff's kind of handled at the protocol layer. We don't really have to do
anything to be compliant with zero trust from that aspect, but then it's that user entity behavior analytics on top of that afterwards.
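The stateful session behavior mentioned above, where either side can end a session and policy limits how long it may last, can be sketched roughly as follows. This is only an illustration of the idea, not how SMB or NFSv4 implement sessions; the `Session` class and its `max_age_s` field are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Session:
    user: str
    started_at: float   # seconds, e.g. a value from time.time()
    max_age_s: float    # policy-defined session lifetime (assumed value)
    active: bool = True

    def log_off(self) -> None:
        """The client (or the server) ends the session explicitly."""
        self.active = False

    def needs_reauth(self, now: float) -> bool:
        """True once the session was ended or has outlived the policy limit."""
        return (not self.active) or (now - self.started_at) >= self.max_age_s
```

A session that passes this check says nothing about what the user does next, which is where the per-operation evaluation comes in.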
So this is based on trying to understand how bad actors would differ from a normal actor
looking at a file or something like that. Is that how this plays out?
Yeah, essentially. Because if you think about it, a bad actor, if they steal a login credential and have access to get to that file share somehow, or they co-opt even a multi-factor authentication login, once they're in and they have the permissions, right, you've logged in now, even things like encryption tend to be tied to credentials and things.
So they're going to be able to decrypt data and things like that. So then we have to monitor the behaviors of those users
to see if they're doing something bad.
And there was another thing that was brought up, something like,
I think it was policy-based access controls. Is that something you guys have support for? I mean,
different sorts of identification or authentication capabilities based on
types of data, I guess. I'm not sure quite how it all plays out.
Yeah, so policy-based access controls
and attribute-based access controls kind of tie together.
And that is something we are supporting
among different organizations.
One of the things that was started really by the government,
specifically the DOD,
but has become more kind of relevant in the mainstream
is basically having policies about who can access a
particular type of data with an attribute. And maybe that attribute is PII data or HIPAA data.
So it would be the ability to access that type of data as well as from what either system or
machine or location and tying all that together to determine kind of a policy approach
beyond just our traditional kind of role-based access controls. And so that gets more involved
and can be valuable even in the private sector. So in the government, they do it and they call
it multi-level security where you can have unclassified data, secret data, top secret data.
When you're on a top secret machine and top secret network, you can browse down and read secret data and unclassified data
and write to the top secret level. If you're on a secret level machine and network, you can only see
secret data and below. You wouldn't be able to see the top secret data. So all that policy engine
kind of gets tied in and rolled in. And so it's typically happening
outside the storage, but the storage is questioning the policy engine. Do I allow this
access to this particular file or not? And to go back to kind of, is it chatty or does it require
more I/O? It does, but it's definitely doable and can be architected with caching and sessions to still be performant and even be used in high-performance type storage applications.
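A minimal sketch of the multi-level "read down, write at your own level" rule described above, with a cache standing in for the caching that keeps policy lookups from becoming too chatty. The level names, the `policy_allows` function, and the use of `functools.lru_cache` are assumptions for illustration; a real deployment would query an external policy engine rather than a local table.

```python
from functools import lru_cache

# Hypothetical ordering of classification levels, lowest to highest.
LEVELS = {"unclassified": 0, "secret": 1, "top_secret": 2}


@lru_cache(maxsize=4096)
def policy_allows(session_level: str, file_level: str, op: str) -> bool:
    """Decide one (session level, file level, operation) combination.

    Results are cached so repeated operations in the same context do not
    pay for a fresh policy lookup every time.
    """
    session_rank = LEVELS[session_level]
    file_rank = LEVELS[file_level]
    if op == "read":
        return file_rank <= session_rank   # read at or below your level
    if op == "write":
        return file_rank == session_rank   # write only at your own level
    return False                           # anything else is denied here
```

For example, `policy_allows("top_secret", "secret", "read")` is allowed, while `policy_allows("secret", "top_secret", "read")` is not.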
Yeah.
So, I mean, there's a, you know, I do SSH kind of calls quite often to servers and stuff like that.
And every once in a while it comes back with, you know, a hash has changed or something like that.
And I kind of typically ignore those sorts of things,
but you can't ignore these sorts of things in a real secure environment.
Right. You're saying like the fingerprints change
that somebody changed the servers or something like that.
Yeah.
And I think we all know kind of security is all about layers and defense in depth.
And so, you know, as you get into more secure environments
or places with more hygiene and rigor, you know,
you get into the fact to your point of like, it's not just self-generated certificates or
self-signed certificates. You're using certificates from a certificate authority.
You have potentially, you know, centralized key management and other things to ensure all of the
keys and the certificates are being managed. They're being rotated. They have, you know,
expirations after a certain amount of time, you're revoking certificates. So I think
it becomes a whole ecosystem, right? As you start to get into these more advanced security models,
you can't just do one thing. You need to do a complement of things and do it in a way that
it can be managed and monitored.
So from a zero trust perspective, the types of things that look different are
the sort of policy-based, attribute-based access controls, and that sort of thing,
a little bit more stringent credential authentication kinds of things. Is that
what it means from a storage perspective?
From a storage perspective, yeah, it could mean leveraging attributes on the file. So not only are we being rigorous about what device and network
you're coming from, we're restricting where that can happen. Then we're looking up the policy about
you as an entity and your access to these specific files that might contain different categories or
classifications of information like PII sensitive information or financial information, and then using those
attributes and rules to determine if you're going to allow that access to happen.
And then that zero trust approach, it kind of fits into that, right?
Because with zero trust, it's really evaluating trust for each transaction.
So in the case of just the true ABAC, it's just doing it
to say, hey, do you have access and are you authorized to do this? I think the zero trust
approach takes it up another level and says, do you have access to do this? Can you see this file?
But then does this seem normal or does this seem suspicious that constant evaluation, remediation,
or not remediation, but constant
mediation to say, yeah, this person has access. This seems like it really is the legitimate person
and they do have access to this file. So we're going to allow the read to happen.
But in the zero trust approach, in the context of everything else, you might say, well,
this seems suspicious because it's not a normal time of day for this person to be accessing this
data or they're
accessing an excessive amount of data. So maybe we want to alert the security operations center
to investigate this and observe what's going on. Or even further, if we're more concerned,
maybe we want to block that activity until it can be investigated.
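The two signals mentioned above, an unusual time of day and an excessive amount of data, could be combined along these lines. This is a hedged sketch: the `Baseline` fields, the `block_factor` multiplier, and the escalation logic are all assumptions chosen to make the example concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    usual_hours: range            # hours of day (0-23) the user normally works
    typical_files_per_hour: int   # normal access volume for this user


def assess(hour: int, files_this_hour: int, baseline: Baseline,
           block_factor: int = 10) -> str:
    """Return "allow", "alert", or "block" for the current activity."""
    off_hours = hour not in baseline.usual_hours
    excessive = files_this_hour > baseline.typical_files_per_hour
    extreme = files_this_hour > block_factor * baseline.typical_files_per_hour
    if extreme or (off_hours and excessive):
        return "block"   # more concerning: stop it until investigated
    if off_hours or excessive:
        return "alert"   # suspicious: notify the security operations center
    return "allow"
```

With a baseline of 8:00 to 18:00 and 50 files per hour, 20 files at 2:00 would raise an alert, and 200 files at 2:00 would be blocked.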
So from a zero trust perspective, it's more than just logins and credentials and things of that nature.
You're now starting to get into modus operandi.
How are people or how are applications actually referencing the data?
Is this something that's normal or abnormal?
And then what should we do about it?
Exactly right.
Yeah.
It's that kind of what's normal, what's not normal and observing that and learning that and then responding to that. And, and I think that's, that's the big, that's the real big difference.
And that's the zero trust approach to things. And that's not traditionally what we've seen in
network attached storage. Traditionally, what you're seeing is somebody gets read/write
privileges to a folder. They have privileges to do anything they want to read and write files
in that folder until somebody goes in and removes their credentials at some point.
Right, right, right. So how does a storage system learn what's normal
and what's abnormal in a storage environment? Is this something that's done offline or is this
something that's done in real time at the system, at the customer's environment or both?
So it's a little bit of both. So some types of behaviors are kind of,
can be determined right away and can be done without really, it can be, the models can be trained and learned ahead of time and then deployed to the system. So we can also look for things like the use of admin credentials. We know right away if, you know, an admin credential's being used, which is a potential threat, that, hey, it's an admin credential. Just because of the way we interact with AD or LDAP, we know that's a privileged user. So we can alert on that and take actions based on that.
We also know when you're starting to overwrite files,
that's a behavior that's different
than what traditional applications and users do.
When you get into kind of long-term analysis
and trends for user types and groups,
that's where we kind of have to learn per customer.
So you'd learn what
those trends are in the customer environment in a learning mode, and then go into an enforcement
mode or past observation mode where you essentially say, okay, when we start to see these anomalies,
we're going to take actions. And as a customer organization, you can define what you want those
actions to be. It could just be provide an alert or it could be you want to block
the access of the user account
or client IP to further data
until it can be investigated.
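A minimal sketch of the learning-then-enforcement flow described above. The `ActivityMonitor` class, the peak-based baseline, and the `tolerance` multiplier are hypothetical simplifications; the episode does not describe how the actual models are built.

```python
from collections import defaultdict


class ActivityMonitor:
    """Learn per-user activity first, then act on anomalies."""

    def __init__(self, tolerance: float = 3.0):
        self.mode = "learning"
        self.tolerance = tolerance
        # Highest hourly file count seen per user while learning.
        self._peak = defaultdict(int)

    def observe(self, user: str, files_per_hour: int,
                action: str = "alert") -> str:
        if self.mode == "learning":
            # Learning mode only records behavior and never interferes.
            self._peak[user] = max(self._peak[user], files_per_hour)
            return "allow"
        # Enforcement mode: users never seen while learning have a limit of 0.
        limit = self._peak.get(user, 0) * self.tolerance
        return action if files_per_hour > limit else "allow"

    def enforce(self) -> None:
        """Switch from learning (observation) to enforcement."""
        self.mode = "enforcement"
```

The `action` argument reflects the point that the customer chooses the response: an alert only, or a block until the activity can be investigated.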
It's almost like learning,
training an AI system
and then using it to perform inferencing
and stuff like that.
Exactly.
And so there's actually a learning mode
as well as an enforcement mode
in the storage system.
Exactly. Yep.
And everyone's starting to get used to the concept of using AI and machine learning in order to aid or assist us.
I definitely don't think it's replacing us, but obviously you see that it's been able to be a big tool that we can use.
And we see AIOps growing within the IT sector. And so we're leveraging that the same way in our product to help, you know, storage admins and
security admins alike with the storage.
From a zero trust perspective,
does it introduce any, um, I was going to say, I'm not sure if the RackTop system has a separate
storage JBOD or if it's integrated into the appliance or I think in some cases it's software only, so it must be different to some extent.
Are there some sort of protocols between the storage server and the JBOD in this case?
Yeah, so it depends on the architecture.
So the software we deploy, whether you deploy it as a virtual machine or as a physical instance with bare metal, is the same.
But the physical capacity that we use to store the data can vary depending on the deployment, right?
So in a virtual machine, it could be an iSCSI LUN being presented to it.
We could be attaching Elastic Block Store volumes in the case of
an AWS instance, or on an on-prem deployment, we might have a bare metal appliance with direct
attached disk where we're talking over SAS, or what we see to be even more common these days
is where we're working with the leading OEM SAN providers and taking iSCSI or Fibre Channel LUNs
from them, presenting those to our controllers
or our software, and then laying a file system down on those LUNs, and then storing the data
on those LUNs provided by the SAN.
And we can leverage, you know, one SAN or many SANs to the same set of controllers.
Multiple controllers gives you high availability and can also scale out performance that way
too, as you start to scale up in the
need for, you know, either bandwidth or IOPS, et cetera.
In those cases, the iSCSI protocols would
be, you know, the dominant solution to provide logins and stuff like that. Is that what you're
saying?
iSCSI between us and the capacity, but the users and the data is going to be served over SMB or NFS, right?
Yeah.
Exactly.
I got you.
It can be flash or hybrid, right?
Whatever they want, right?
We have the ability to use hybrid architectures
with a little bit of flash
and spinning hard drives
that give you the economics
of the hard drives and capacity,
as well as the performance of the flash. Or for those really demanding workloads, we can
do an all flash tier. So we give you that flexibility. And then all of our solutions
leverage RAM as a primary caching tier to deliver that low latency and high IOPS. But it also helps
with being able to perform all these security operations as users are reading and writing data to the system and providing that zero trust evaluation.
And this sort of this learning and enforcement modes, those really depend on not just, you know, it could be an application, it could be the user, it could be, you know, it's very specific to a specific IO activity, I guess.
Is that how I read this?
I mean, because I mean, different things are going to be bad for others.
Right.
Like applications are going to have certain ways of doing things.
You know, they might work on specific types of files and file extensions.
The way they handle those files is going to be one way.
The way a typical user opens files, reads files, modifies files is going to be typically different.
And then what we're going to be looking for is those long-term trends as well as those users are members of groups. We kind of know that through group membership and Active Directory,
where they're coming from. We know the machines and all that stuff. So we're able to tie those pieces together, which is why things like machine learning and trend analysis are very
helpful because there's multiple factors we can leverage to make those decisions. And that's kind
of the value of tying all those pieces together and understanding that. And also understanding
not everything's the same, right, to your point. So it's not like you treat the whole system
identically, right?
It could be this data set and this type of data has this type of trend and pattern.
This user and these groups have these trends and patterns and they work on these data sets.
So there's lots of things you can do to segment this and isolate it as well and to simplify
the problem.
And some of the things that we do as part of our real-time analysis, you know, doesn't really require training and learning
out of the box. The customer gets that benefit, you know, to detect things like ransomware,
privileged access abuse, and excessive file activity. And those would be standard regardless
of the user environment and that sort of stuff. I mean, those are things that you've learned
offline that you would deploy automatically as part of the system.
Exactly. Yep.
And we have assessors, for instance, for specific types of ransomware,
as well as generic assessors looking for a zero-day type ransomware attack.
The difference being that on the specific assessors,
we can build up confidence much more quickly to then say,
hey, we think it's this specific variant of ransomware.
And these are the affected, you know, user account and machine, versus generic.
It might take us a little bit longer to build up confidence that it is a ransomware attack,
but we will still stop that attack as well and make it easy for you to remediate and recover from that as well.
And you mentioned that the actions that are actually taken are something that is almost configurable by the customer. Is that true as well?
That is true. So out of the box,
there's a default configuration that we provide that we recommend. But you can think of it like
a data firewall where there's a rules engine built into the user interface that it has the
default settings, but then an organization can go in and change those
settings as well as add new rules. And also as they find things in their environment, like they
might be doing something that requires the use of a privileged account, and they're going to be
restricting access to a particular data set. And maybe it's a long-time rule, or maybe it's a
temporary rule that they want to allow where they're doing something like a data migration.
And the system allows you to put in a permanent rule or a temporary rule and then choose the actions you want to happen with that rule.
Ignore the behavior, alert on it, or block it, or do all three.
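The "data firewall" idea above, with permanent or temporary rules that each carry an action, might be sketched like this. The `Rule` fields, the first-match-wins ordering, and the `default` action are assumptions for illustration, not a description of the product's rules engine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    account: str
    dataset: str
    action: str                         # "ignore", "alert", or "block"
    expires_at: Optional[float] = None  # None means a permanent rule

    def applies(self, account: str, dataset: str, now: float) -> bool:
        live = self.expires_at is None or now < self.expires_at
        return live and account == self.account and dataset == self.dataset


def decide(rules: list, account: str, dataset: str, now: float,
           default: str = "alert") -> str:
    """Return the action of the first live matching rule, else the default."""
    for rule in rules:
        if rule.applies(account, dataset, now):
            return rule.action
    return default
```

A temporary "ignore" rule for a migration account would suppress incidents until its expiry time, after which the default behavior applies again.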
It seems fairly sophisticated and challenging to correctly set up.
Are there some tools that you have in place to make this, I mean, obviously the defaults and things of that nature, but to make this easier for the customers to deploy it in a
secure fashion?
Yeah.
So I think having observed some of the sales calls, I think people get a little concerned,
especially when it's been, you know, because a lot of times it's the storage admin or the infrastructure operations team that's going to
be purchasing and managing the BrickStor. Sometimes they're a little concerned, hey,
is this going to be a big time requirement for me? Is this going to be very challenging or difficult?
But we've been successfully able to show them that it actually is not that difficult to configure.
It's kind of easy. And
then once you configure it, it's not like some of the other file management tools that have lots of
false positives because of the way we do it. So you're not going to be barraged with that. But
we also make it very easy in that observation mode to see what's going on. So in the observation or
learning mode, you're seeing, hey, here's a case where a privileged account is being used. It's accessing data here and it's doing this.
Normally, that would fire off an incident and put an enforcement action in potentially.
With this, you can right from there create something like an ignore rule where you choose what you want to happen right from there.
Yeah, that would be great from a perspective. So it's like the customer actually gets to determine whether it's an observation mode or enforcement mode for how long.
And then once it's in enforcement mode, all those things start to really trigger actions and things of that nature.
That's great.
And what we've seen, too, with a lot of customers is they get visibility they haven't had before.
So they can see, hey, I kind of remember we did that
and we meant to go back and clean that up and we haven't.
So sometimes they'll go and make those changes
and improve the hygiene right then.
Other times, that's just the way it's going to be
or it's going to be for a while.
So they'll just allow that behavior to happen.
This is great.
All right.
Well, Jonathan, is there anything you'd like to say
to our listening audience before we close?
I appreciate it.
It was a great opportunity to talk about Zero Trust and Storage.
We'll be at HPE Discover later in June.
And if you're there, please stop by our booth.
Great.
Great.
Well, this has been great, Jonathan.
Thanks again for being on our show today.
And thanks again to RackTop Systems for sponsoring this podcast.
Thank you.
That's it for now.
Bye, Jonathan.
Bye, Ray.
Until next time.
Next time, we will talk to
another system storage technology person.
Any questions you want us to ask,
please let us know.
And if you enjoy our podcast,
tell your friends about it.
Please review us on Apple Podcasts,
Google Play, and Spotify,
as this will help get the word out.