Podcast Archive - StorageReview.com - Podcast #113: Dell PowerScale and the Impact of QLC SSDs
Episode Date: October 18, 2022. Brian connects with Dell's Product Management VP, David Noy, for this week's podcast. David… The post Podcast #113: Dell PowerScale and the Impact of QLC SSDs appeared first on StorageReview.com.
Transcript
Hey everyone, Brian Beeler here with the Storage Review Podcast and today we're talking all
things PowerScale and what's going on with that line of products from Dell.
There were some big announcements at DTW back in May and now some of these things are coming
to market which is really exciting to see.
I've got David Noy with us today.
David, thanks for jumping in.
Hey, thanks for having me. So yeah, I'm glad to do it. Glad to talk with you. I mean, you guys
had all sorts of great announcements at DTW. Sometimes the hardware was a little bit behind
the software offerings. But before we get into all of that, just what do you do at Dell? Give us a
little background on yourself. No problem. So I lead product management for our unstructured data solutions team,
which is our scale-out NAS offerings, so PowerScale, as well as object storage, ECS,
and then the software-defined version, ObjectScale.
I also lead product management for our data protection suite.
So that's the PowerProtect DD product, all
of our software assets around backup, including our new PowerProtect Data Manager and things
related to cyber recovery and CyberSense.
Okay. Yeah, I didn't know actually that you did backup too. One of our favorite things
that you guys ever did was that little DD Virtual Edition that started to slide into VxRail and some other spots. It was just a neat,
like a little data protection widget to just kind of clip into your
environment. That was pretty slick.
I don't know if you should take any credit for that,
but that's one of our favorite things.
I don't know if I can take credit for it, but
I inherited it. We have 13 exabytes,
I believe now of data protected in public cloud on that little DDVE product.
Really? Wow. That's extensive.
That's a lot.
So take us back a little bit to Dell Tech World.
I mean, obviously APEX was a big messaging driver there, and Alpine.
I guess you guys like code names that begin with A, which is cool.
APEX, of course, is the as-a-service offering,
Alpine being the cloudification of your operating systems and storage software.
Let's start with those.
Where does PowerScale fit into those models?
How are you guys thinking about the
as-a-service model and then also this cloud delivery model?
So we have PowerScale as a service in APEX today, file on demand, a very popular offering.
For customers who are looking for a fully managed service, who want to buy consumption-based,
and that's the model for them,
that capability is out there right now. What we showed at Dell Tech World was
a software-defined version of the operating system that runs on PowerScale, called OneFS.
And by moving to software-defined, that gives us the ability to adapt beyond just the appliance model which
we ship today (even underneath APEX there are physical appliances), and, as we go forward,
to actually deploy OneFS in the cloud as a cloud-native service for scale-out NAS.
Yeah, that's ambitious, right? I mean, it makes sense to sort of separate the two components,
but the Alpine project's a big one for you guys.
I mean, that's obviously a major emphasis in the messaging,
but also, as you're talking about it,
is something that you're working on delivering too.
So if you think about it, this isn't just a PowerScale or UDS thing.
It's across all of the Dell Technologies storage assets.
We're in the process of making them all available as a software-defined consumption model, as an appliance consumption model, and in APEX as a managed service offering.
So customers have their choice.
If they want something in the data center, they can get it that way.
Or if they want it as a fully managed offering and they're willing to
outsource that operation, they can get it that way, either managed in their data center or
hosted. And then as a cloud offering, we're doing that for file, object, block,
and our data protection products. So we're at various stages of that process across the
portfolio. PowerScale has come pretty far along.
And so we're looking forward to what that means.
But we demoed the beta so you can guess as to how far along we are.
Well, typically when Dell shows something, that means you're pretty far along in the
process because you guys don't tend to lead with half-baked kind of solutions.
As you're talking about that, I do wonder,
and I don't even know how much visibility you have into this, into what happens from a development
standpoint for the code. Because taking the code off of hardware, I mean, it still obviously runs on
something, but it's a different development modality than traditional storage array development.
And I think there was even a big
shift when Dell and EMC came together to take some of those really traditional EMC storage
appliance hardware models and transition them to a little more PowerEdge-y kind of server model.
You had to make a shift in development mentality there. How different is this going to a cloud delivered model?
Well, you hit the nail on the head. I mean, the first move was to adopt PowerEdge servers
as a platform for us to extend onto. And so we've had a lot of success with our all-flash servers,
which are the PowerEdge based servers. But in doing that, we had to take a different approach
to how we build out OneFS, to assume that ultimately we could run on any platform.
And so if it's a PowerEdge platform or it's a compute instance running virtualized in cloud, call it EC2 in AWS or Azure compute, what have you, we treat those as yet another platform.
And so we abstract ourselves further from the actual hardware so
that we can actually become software-defined. What we don't want to do, and you called
this out earlier with the development practices, what we don't want to do is fork the code
and have two completely different code bases, one that's for cloud and software-defined and one
that's for the appliance. And so we've kept that as a common code base. The importance there is that I get asked all the time
by my customers, when you launch this thing in cloud,
is it gonna be different from the version
that you have on-prem?
The answer is, it's the same code.
Like largely speaking, it'll be functionally equivalent
to what you have on-prem.
There'll be differences in performance, probably.
There may be differences in the scale,
but largely speaking, it is the same code.
So if you're used to doing things one way on-prem,
you're going to be very used to doing it the same way in the cloud.
And that commonality is actually something
that's very important to our customer base.
I think it's probably beyond very important.
It's like mission critical, right? Because you can't have a storage admin on-prem saying, oh, well, this is how we do it in the data center and spin up whatever. We use some deep data feature that's within PowerScale and then go to the cloud for some remote work, and be like, oh, that's not there yet. And we've seen that before where there's not feature parity
between the cloud version
of whatever is being deployed
and the on-prem,
and it causes some conflict.
That's right.
Yep.
We get the question all the time,
can I replicate back to the on-prem?
Yeah, replication.
If I run cyber capabilities,
we talked about the ransomware defender capability
in the on-prem,
and I want to do cyber protection of my in-cloud assets.
Can I do that?
Sure.
So let's stick now with the hardware a little bit.
So in May, again, you launched a couple things around OneFS and PowerScale.
You had some security stuff, which we'll definitely talk about some more,
but also the support for QLC, which is really interesting to me. We've done so much work around
QLC. To see you guys from a mainstream array perspective adopt that, I think is pretty cool
and really validates that the QLC NAND is there and ready for heavy use in mainstream enterprises.
Talk about that a little bit, the decision path there, kind of what you're seeing from an enablement standpoint for your customers.
So QLC is just a different architecture of flash from TLC in that it adds more bits per cell, if you will,
to the way the data is stored.
But the trade-off is typically endurance and performance.
The way that we use those drives
is that we're fairly sparing on the endurance.
So we're pretty gentle in the way that we use the drives.
We do large IOs. We try
to be very careful about how we place data. Data that comes into the cluster is not immediately
just dumped on the drives. We process it first. And so we've done some calculations using in-field
metrics. And what we've found is that we get about 14 years of endurance out of these drives
even hammering them in the worst possible ways
that we see happen in the field, which is pretty good.
I mean, for the longevity of a drive that's supposed to be low endurance,
14 years is a pretty long time in the worst possible cases.
At the same time, we have checks and balances,
so we do wear-level checking to make sure that the drives haven't, for whatever reason,
been inadvertently worn down. We will alert our support teams if drives have failed for whatever
reason, but largely speaking, we feel fairly comfortable that even with the lower endurance
of these drives, because we're using larger capacity drives like 15 and 30 terabyte,
the lifespan of these drives would be quite good.
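To make the kind of endurance calculation being described here concrete, here is a minimal sketch in Python; the DWPD rating, warranty period, and daily write volume are illustrative assumptions, not published specs for these drives or Dell's actual in-field figures.

    def endurance_years(capacity_tb, rated_dwpd, warranty_years, observed_tb_written_per_day):
        """Years until the rated write budget is used up at the observed write rate."""
        # Total data the drive is rated to absorb over its warranty period.
        rated_writes_tb = capacity_tb * rated_dwpd * 365 * warranty_years
        return rated_writes_tb / (observed_tb_written_per_day * 365)

    # Hypothetical example: a 30.72 TB QLC drive rated at 0.3 DWPD over 5 years,
    # absorbing roughly 3.3 TB of coalesced writes per day.
    print(f"{endurance_years(30.72, 0.3, 5, 3.3):.1f} years")  # ~14 years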
And so I've got a couple that just happen to be sitting on the desk here.
We've done a ton of work with these, the 30 terabyte part, the Solidigm drives, and they're really fantastic from the capacity bump that you get.
You get up to 30.72 terabytes in a drive, and you guys are supporting these, I think in the F200 and F900 nodes, or the F600 and F900,
anyway, the one that will fit 24 drives in it. So 24 30.72 terabyte drives is roughly 750
terabytes in this thing, almost a petabyte in one of these 2U chassis. And you hit on the really
important thing, and I was hoping you would do this, because we collectively know from telemetry data what the drives are seeing at the drive level and then what
you're seeing at a system level. And with that information you can confidently go
to customers and say, look guys, it's QLC, there are going to be some trade-offs,
and we can talk about that in a minute, but on the upside your workloads aren't
punishing these drives as much as you think or might be afraid that they will. So as long as you're under whatever drive
writes per day ceiling that you guys set, then so be it. And if a drive fails, it's under
warranty. It's not like the customers have much risk here outside of workload performance.
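For reference, the chassis math quoted above works out like this (raw capacity only, before any protection or file system overhead):

    # 24 NVMe bays populated with 30.72 TB QLC drives in a single 2U node.
    drives_per_node = 24
    drive_capacity_tb = 30.72

    raw_tb = drives_per_node * drive_capacity_tb
    print(f"Raw capacity per 2U node: {raw_tb:.1f} TB (~{raw_tb / 1000:.2f} PB)")
    # -> 737.3 TB, i.e. the "roughly 750 terabytes, almost a petabyte" figure above.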
Yeah, no, we've been very successful with the launch, actually.
So far, no drive failures, so cross your fingers.
But look, you're right.
So it's not even just that.
Even under the most punishing environments that we see in the field,
the real-life environments, these things will endure.
And the net of it is, even with the performance profile
being somewhat different from TLC drives,
the way that we do a lot of prefetch
and a lot of performance optimizations,
we see the exact same performance with QLC drives
as we do with TLC drives.
So from a customer perspective, it's great.
I get a lower price point and I get more density and equal
performance. Like what's not to love, right? Yeah. I mean, we've seen some of that
performance data. In fact, we're working with your team on it. I was in Hopkinton last week,
lovely fall day, by the way. I don't know if you had the team order that up for me, but
doing some hands-on work with PowerScale up there.
And yeah, I mean, we all know the read performance is good. What surprised me was what we saw on the
write performance, which is at times where QLC can be troubled because, well, I don't want to
get too far down in the weeds, but the QLC drives have an indirection unit. Right now, it's 64K.
That's how they want to be written to.
They want nice, sequential, friendly writes to the drive to maintain performance.
And I guess in the software, somewhere along the way, you guys have figured that out to make sure that you're writing to these drives the way they want to be written to.
And if you do that, if you can coalesce those writes before they go to the drive, then you're pretty much golden, right? That's exactly what we do. And we are.
So, I mean, I think we're not done squeezing write performance out of these things yet. So
our engineering team is working on it right now, but we might be able to squeeze more. What's nice is
we are able to squeeze more read performance. So I think we announced it at Dell Tech World,
so I'm not out of line saying it, but in our upcoming OneFS release, we'll actually give you 25 to 30% more streaming read
performance with a software upgrade. So we're not done squeezing more juice out of this orange.
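A minimal sketch of the write-coalescing idea being discussed, assuming a 64 KiB indirection unit. This is purely an illustration of the general technique, not OneFS code; the buffering and flush policy here are made up.

    IU_BYTES = 64 * 1024  # the drive's indirection unit

    class CoalescingWriter:
        """Accumulates small writes and flushes them in IU-aligned chunks."""

        def __init__(self, flush_fn):
            self._buf = bytearray()
            self._flush_fn = flush_fn  # callable that performs the actual device write

        def write(self, data: bytes) -> None:
            # Buffer small incoming writes instead of sending them straight to the drive.
            self._buf.extend(data)
            # Flush only whole indirection-unit multiples, so the drive sees
            # large, aligned, sequential-friendly writes.
            full = (len(self._buf) // IU_BYTES) * IU_BYTES
            if full:
                self._flush_fn(bytes(self._buf[:full]))
                del self._buf[:full]

        def close(self) -> None:
            # Pad the tail out to an IU boundary before the final flush.
            if self._buf:
                pad = (-len(self._buf)) % IU_BYTES
                self._flush_fn(bytes(self._buf) + b"\0" * pad)
                self._buf.clear()

    # Forty 4 KiB application writes become three 64 KiB device writes.
    device_writes = []
    w = CoalescingWriter(device_writes.append)
    for _ in range(40):
        w.write(b"x" * 4096)
    w.close()
    print([len(chunk) for chunk in device_writes])  # [65536, 65536, 65536]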
So talk about then how your customers are using these drives. I mean, I'm sort of conflicted on
this because I feel like when you go with a density message of, hey, customer, with TLC, we're capping out at eight terabyte drives or
maybe up to 15.36 in a few cases, but those are really quite costly when you start to get to the
large capacity TLC drives. Be that as it may, we can now get maybe four to eight times more
capacity depending on what you're coming from
in these systems. Do you worry that customers will buy fewer nodes because they don't need
them for capacity? I mean, I would be worried if they weren't doubling or tripling their
footprints every year. Some of our customers who are consuming this class of node for performance
and density,
you know, we'll go through some of the workloads in a moment,
but the amount of data that they're gathering is just, it's nonstop.
And what we're seeing is that, you know, whenever you put out a more dense node,
customers will gravitate towards it.
So you put out a 15 terabyte drive, that's one thing.
If you put out a 30, you guarantee that the 30 is going to outsell the 15.
Really? Wow. Is that specific to unstructured that you see that happening more? I mean,
I guess because the media explosion, everything's just, as you said, nonstop.
At the end of the day, I talk to customers who, tomorrow, might have to pick up a six petabyte additional data pool.
And where am I going to put that in my data center?
How am I going to catch all of that?
And how am I going to do it in a cost-effective way with rack power sometimes being constrained and rack space constrained?
It's a tough problem.
So density rules.
Now, don't get me wrong.
Performance is important,
too. And so we're continuously working on both. But we'll keep striving towards density.
So what do you have to do? I mean, obviously, adopting the 30 terabyte drive helps on the
density side. As you said, the data sets are expanding, but these guys are many times constrained on rack space or power or cooling or whatever.
Do you need to do a more dense two-and-a-half-inch drive chassis?
Like what else can you do to continue to help on the density messaging?
I think we have to look at some, you know, new design paradigms.
Certainly we're exploring a few, you know, without going too much into it.
There's several new different drive form factors,
but that said, I don't think that we're capped out
on the maximum right now with 24 drives in 2U.
So I think there's more that we can do
to jam more drives in there.
You know, one thing that you have to think about
as being a differentiator for us
is we're not doing some shelf model
where you potentially don't have compute along with storage, where you might get,
you know, misinformation about how much real density you have. We scale compute and storage
together. And if you need more performance for your data, if you really need a performance-dense
set of data, then you might buy smaller drives.
So you might use four terabyte or eight terabyte drives
when you really have to just do a ton of performance
out of a small data set.
But for customers who've got lots and lots of data
and that data is continuing to grow,
and it's not all active at any given time,
what percent of your data actually needs
that performance at any given time?
Is it 20%?
Is it only 10%?
I'm sure it's not 80%.
Then basically we'll tune the density
to be as rich as possible when that percentage
grows smaller so that you're maintaining the performance
but getting the cost efficiencies
and the power efficiencies that density drives.
And so that's really the question we're thinking about, like,
how much more density can we actually physically fit into a certain amount of space while preserving the performance or even getting incrementally better in terms of performance?
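A rough sizing sketch of the hot-versus-cold point above. The performance-node and dense-node capacities are hypothetical (the dense figure just reuses the 24 x 30.72 TB math from earlier), and the active fractions are the ones tossed out in the conversation.

    import math

    def split_tiers(total_pb, active_fraction, perf_node_tb=61.0, dense_node_tb=737.0):
        """Naive split of a footprint into performance nodes for hot data and dense nodes for the rest."""
        hot_tb = total_pb * 1000 * active_fraction
        cold_tb = total_pb * 1000 - hot_tb
        return math.ceil(hot_tb / perf_node_tb), math.ceil(cold_tb / dense_node_tb)

    # A hypothetical 10 PB footprint at the 20% and 10% active fractions mentioned above.
    for frac in (0.20, 0.10):
        perf_nodes, dense_nodes = split_tiers(10, frac)
        print(f"{frac:.0%} active: {perf_nodes} performance nodes + {dense_nodes} dense nodes")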
Yeah, you talked about form factors, obviously, like E1.L would be a great one, or E3, or some of these other ones where you can get the nice long NAND packs with plenty of cooling around them with the heat sinks.
It has a lot of potential in terms of the capacity that you can jam in there for sure.
Very few mainstream storage arrays have moved off of U.2.
Customers know it and like it. I mean, so we were talking about
your adoption of QLC and it being very good. And I'd be curious kind of to dive into that more, but
normally we hear a lot of FUD marketing around new technologies or new shapes or whatever,
because your competitors may not have that. So they go straight to, well, you know, endurance, whatever,
and try to cloud the waters a little bit.
But it sounds like your customers are still adopting regardless and,
and really like what they're seeing from these high density QLC drives.
I'm just curious how much friction there's been in that sales process.
There's been a few calls I've been on with customers to walk them through why we're comfortable with the QLC drives.
Largely, after seeing the data, which we just publish to them as plain as can be,
those concerns largely fade away.
In fact, I haven't seen a situation yet where they haven't just been,
okay, well, that looks good.
That said, you know, in terms of, you know,
how we're going to pack more in,
there's all kinds of different ideas right now.
So I know there's a few that we're pursuing.
It's a little bit too early for me to call it yet,
but I understand that, you know,
I fully appreciate the benefits
of getting as much density and compute together at the same time as possible.
So when you guys made the announcement back in May at Dell Tech World, you announced the support for the 15.36 and the 30.72 terabyte QLC drives, which was pretty neat, but also a little bit surprising.
Because historically,
Dell has been very much in the storage world about multi-source everything.
But when it gets to QLC, that level of competition isn't there.
I mean, Solidigm has the drives I just mentioned.
They're the only ones really at scale with those drives. So it was pretty obvious who you were working with there. But that's a fundamental shift, at least from what I'm aware
of, in your world. Does that matter? Does the multi-source issue, like, how do you address
that? Or do you care anymore? Well, you always want to be multi-sourced, right? As much
as you can, as much as it makes sense. And so we encourage all the drive
vendors to look at 30 terabyte QLC as being a good opportunity. I mean, that said, the opportunity is
just too rich to pass up for lack of another vendor there. So, you know, we have
a good relationship with Solidigm, and they came through, they really did.
So what can I say other than just, you know, it's solid.
Right.
Look, the moment that something else comes out, we'll take a look at it too.
But right now that's what we have. Our customers are demanding that, and that's
what we're giving them.
Well, sure. So I guess if your customers are saying we want greater density and
you're looking around saying, is there a good solution for this? Yes, there is. Then go for it, right?
That's right. So with some of that work, I fully expected, and I'm still surprised, we haven't published it yet, we'll get
there very soon, but the performance on the write side, I know we already talked about it, but I'm
still stuck on it, because that was the one thing where I'm like, well, how's Dell going to go to
market with this, talking to customers about understanding how much write impact you have
and at what block sizes, because that's going to be really confusing and a little bit sloppy of a sales motion to say,
you know, it only works on real large blocks.
There's not much else to say,
but you guys have done a good job to figure that out.
In some ways, Brian, I would almost say it's just luck.
Like the way that OneFS writes data to these drives
happens to work really well, right?
We do coalesce writes, we do a lot of write pre-processing,
the data comes in and it's handled in a way
that makes it possible for us to actually go
and ensure that we don't cause damage going in,
and we can ingest at a very high speed and make sure
that we distribute that load across all the nodes in our clusters. So the advantage of that is that,
well, it turns out that we're pretty easy on the drives underneath, pretty gentle on them.
It's not like one, you know, data is coming in and it's hitting a very small set of drives. It's
actually going across every node
in a cluster, up to a reasonably good-sized cluster, and every drive inside. So we're pretty
well distributed. And that means that every drive is just seeing a fraction of that IO.
And that takes a lot of the pressure off those drives.
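To put rough numbers on the "every drive sees a fraction of the IO" point, here is a toy calculation with a hypothetical cluster size and ingest rate:

    # Hypothetical cluster: 20 nodes x 24 drives, absorbing 40 GB/s of aggregate ingest.
    nodes = 20
    drives_per_node = 24
    cluster_ingest_gb_per_s = 40.0

    per_drive_mb_per_s = cluster_ingest_gb_per_s * 1000 / (nodes * drives_per_node)
    print(f"Each drive absorbs roughly {per_drive_mb_per_s:.0f} MB/s of that ingest")
    # -> ~83 MB/s per drive, a small slice of what a single NVMe drive can sustain.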
Do the large capacities cause any new challenges? I'm just
sort of thinking through what happens in OneFS if the node that was 60, 80 terabytes before is now
almost three quarters of a petabyte. Does just that massive amount of data cause any new table indexing challenges? I don't know. Any
other challenges for you that you had to work around?
Not yet. We calculated a number called mean time to data loss, which is a calculation of,
given the size of the node, the number of drives inside of it, what would happen
if we had multiple failures at the same time, and the probability of failure occurring with the
drive or the chassis itself, how long it would take before enough failures occur that
you actually experience data loss.
And we try to keep that number in the thousands of years.
Yes.
You know, with a thousand clusters, you know, 5,000 clusters, every year you might actually have that happen.
But very few of our customers have 5,000 clusters.
So generally speaking, we're fairly OK.
Now, of course, that's the lower limit.
So we always try to go beyond that.
If we ever ran into a case where the size started to impact that, we would probably start to ratchet up our protection levels.
And so you'd start to trade a little bit of efficiency.
And this is what everyone who kind of uses this technique of erasure coding across drives to get durability will do: they'll start to trade off efficiency to get more durability out of the system.
And so we have that lever at our disposal.
Fortunately, we didn't have to use it this go-around. Oh, that's interesting.
So you would over-program your resiliency
if you felt like you had to deal with a less reliable part,
whatever the part is.
Correct.
Huh. Yeah.
Well, so that's good.
That's, you get all the benefits now and you haven't had to work real
hard to get there outside of your traditional qual process and the fulfillment support arm to supply
these things out to the field to deal with potential replacements or whatever.
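A toy version of the mean-time-to-data-loss and protection-level reasoning above. The per-cluster MTTDL and the stripe layouts are assumptions for illustration, not OneFS's actual protection schemes or published figures.

    # If the per-cluster MTTDL is in the thousands of years, the expected number of
    # data-loss events per year across a fleet is roughly fleet_size / MTTDL.
    mttdl_years = 5000.0     # assumed per-cluster mean time to data loss
    fleet_clusters = 5000    # the fleet size used as an example in the conversation

    print(f"Expected events across the fleet: ~{fleet_clusters / mttdl_years:.1f} per year")

    # Erasure-coding style trade-off: N data + M parity elements tolerate M failures
    # at a space efficiency of N / (N + M). Raising M buys durability, costs capacity.
    for data, parity in ((16, 2), (16, 3), (16, 4)):
        print(f"{data}+{parity}: tolerates {parity} failures, "
              f"{data / (data + parity):.0%} space efficiency")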
I would say the most important thing is that we did the calculations, that we did all the quality testing to make sure that this was a high-quality
product that met those endurance levels that I described earlier,
so that our customers who are running what I'd call mission-critical applications,
they're not Oracle databases,
but they certainly are business-critical for those customers,
and in some cases, if it's life sciences or healthcare, then very mission-critical,
can feel comfortable
with the data. So I agree. It was a fairly engineering-light lift,
but we had to make sure that we ran it through the paces.
So for those customers that are worried about, that still remain worried about their write impact on the drive in terms of endurance,
we talked about the drives have telemetry, your system has telemetry.
If they're in a current PowerScale system, is that something that you can, I don't even know how that looks in OneFS. Is that
something that they can see, or something that you can tell them, hey customer, I know you're
worried about your writes, but you're at, you know, 0.2, half a drive write per day,
which is well below the spec. Can you see that in your systems and share that with
the customers to help them feel more comfortable? Both. You can report on it through command line interface or APIs and you can see it in alerts.
If something like, if we were to run into a situation where a drive was experiencing errors,
we would know about it, and you would as well. Right. But you could also tell them, like,
hey, looking at your current workloads, you're well within the window for what we can handle
with this particular drive spec.
Yeah, there's ways of reporting on that.
That's right.
Okay.
So the other big, yeah.
It's like belt and suspenders, right?
So the belt is like, we did the testing.
We feel good about it.
The suspenders is that there's enough alerting and reporting capabilities in the product
that if you were to get near the rails,
you know, we would know and you would know if you're getting near the rail.
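A small sketch of what that belt-and-suspenders endurance check could look like. The rated DWPD, the warning threshold, and the function itself are assumptions for illustration, not the actual OneFS reporting interface.

    def endurance_check(tb_written_per_day, capacity_tb=30.72,
                        rated_dwpd=0.3, warn_fraction=0.8):
        """Compare observed drive writes per day against a rated ceiling."""
        observed_dwpd = tb_written_per_day / capacity_tb
        headroom = observed_dwpd / rated_dwpd
        if headroom >= warn_fraction:
            return f"WARN: {observed_dwpd:.2f} DWPD is {headroom:.0%} of the rated ceiling"
        return f"OK: {observed_dwpd:.2f} DWPD, well under the {rated_dwpd} DWPD ceiling"

    # A drive seeing about 3 TB of coalesced writes a day (hypothetical telemetry).
    print(endurance_check(3.0))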
So your next challenge then will be to push these guys to crank out a 64 terabyte part
so you can continue to pitch your density story in PowerScale with new bigger drives,
huh?
Oh, well, that'll be the next challenge for sure.
We'll see whether we have to make any adjustments to accommodate those.
But I would imagine that we're all looking for that.
I think that drive vendors as well as storage vendors appreciate the value of density.
And so I would imagine all of us have got that goal in mind.
Yes, I suppose so.
So you talked about security.
That was the other thing that you guys were excited about from a OneFS perspective back at DTW.
Everyone's worried about cyber resiliency and attacks and everything else.
And file data, unstructured data is kind of one of the more dangerous spots, I would think, with so many points of exposure,
all the file data that could be corrupted or attacked at any time. How much responsibility
falls on the array for that versus backup software or other protection applications?
How do you view that? That's a great question.
So I happen to own cyber protection products on both of those portfolios.
And so it goes back to belt and suspenders. I kind of feel that it's a,
it's a nuanced discussion, but it's an important one. If you're backing up data,
you have a responsibility to make sure that that backup is secure.
Because that's your last line of defense for your business.
Hopefully you're backing up data that requires backup because backup infrastructure is not free.
And so if somebody attacks a backup and that backup is destroyed, you've just lost your very reason for having backup to begin with. Which means if somebody were to accidentally delete something or something bad happened, you have no way to recover.
So let's just put that on the table that your last line of defense is your backup.
And if you're backing up data, you have to find ways to make sure
that that data is impregnable, that you've got the right checks and balances in place
to make sure that any
latent threats have been discovered and that you have a fast way to remediate against them.
If something's been corrupted in some way, shape, or form, and you find out about it,
that you have locked down your system and you can go back to a last known good copy and figure out where that was.
That said, when you talk about unstructured data and file data, oftentimes it will reach a scale
that backup products simply don't extend to. So if I'm looking at a customer who's got,
let's use this 30 terabyte all flash unit, and they went and built a 30 petabyte cluster out of it.
The likelihood is that they're not backing that up using traditional backup infrastructure.
And so it's just too big, right? Too cumbersome.
So quite possibly they're just replicating snapshots and locking down snapshots at the secondary location.
Great. Now, if an attacker gets access to the primary and starts to encrypt
data, that'll get replicated to the secondary. And so... Yeah, it just passes through. And now
you're in trouble on both accounts again, right? Exactly. So just like I put a lock on my front
door, if an intruder gets through the lock, I have to have a secondary security system. And so I have
cameras around my home.
I thought you were going to say a shotgun, but okay, camera's fine.
Yeah, a little more innocuous, but you know.
Fine.
In the Midwest, we do things a little differently out here.
I'll make sure not to break into your home.
Thank you. The point being, what we do on the primary side
will expand to include scanning, but right now it's heavily focused on the actual user behavior.
Because in the case of primary, there's a lot of data to go analyze and figure out if something
bad happened over the last course of time. So instead, I'm looking for what's happening right now.
Are people accessing data they're not supposed to?
Are they behaving in ways that they're not supposed to?
And I want to be able to lock that user out of the system,
revoke their privileges.
I still want an air gap solution
so that I actually have a completely remote version of the data
that cannot be accessed by intruders who are
vectoring in from the primary. So it's basically a vault type of solution
that opens the gate periodically and says, hey, is everything okay? Send me what you got, right?
And then locks up. So an intruder can't get in, but it keeps multiple point-in-time copies that are
immutable. So we have that type of a solution as well.
If your system is small enough that you could back it up,
you have the option to do both.
You could use the backup with a cyber protection
to make sure your backups are not compromised,
because typically a backup system is not just for one NAS.
It might be for multiple NASs,
it might be for a NAS and a couple of databases
and a few VMs that are running over
in a pocket somewhere else.
You still want to lock it down unless you've dedicated that backup system to that NAS.
And then on the NAS, you probably want to lock it down as well, because detecting right away that someone's breached your environment or somebody's behaving in a way that's aberrant is important, because otherwise it'll just get propagated to the backup.
You'll still have to restore, but you're just going to propagate the problems.
The sooner you shut it down, the better.
And so there's both of those solutions in place,
but it's really kind of, the user has to decide: if I'm backing up,
I have an obligation to maintain the integrity of my backups.
If I'm running primary,
there's oftentimes cases where I just simply cannot back it up.
Therefore,
I will want another way to secure it. What I like about what we've done in PowerScale
is that if I look across the solutions in NAS from some of our competitors, nobody has a
nice vaulted solution the way that we've built it, with a full air gap. It's
an inside-the-vault solution. An attacker cannot vector in.
It's a different set of administrators, potentially.
And it periodically checks
so that the gate is not constantly open
to figure out what to do next.
And that really sets us apart.
Well, talk about that a little bit more
from a process standpoint
because you're talking about
storage admin stuff. You're also talking about ITSEC stuff at the same time.
In the orgs you work with, who's really taking ownership over data security for what you do,
for unstructured? And how much of that is a security issue, how much of it's a storage issue,
how are orgs adjusting to deal with these risks more or less in real time as you describe it,
rather than retroactively, let's go to the backups to find how far back we've got to go to
get one that's not infected or whatever, where this thing wasn't lying dormant.
But in the real time model, how are you seeing that work with your customers?
So, you know, obviously a lot of this is being driven by the CISO.
That's their charter.
That's what they're looking at.
And then they're driving it top down.
That said, it doesn't take long to look around on the news and see how often these breaches are occurring.
If I'm an infrastructure owner, I'm going to be paranoid about my data.
And it depends on the organization.
But a lot of these, you know, we were talking about these high performance drives, a lot of them go into regulated industries who have penalties that they have to basically cough up or have to deal with the repercussions of their data being offline.
And so it's kind of coming from both sides.
I would say that in the space of backup,
the folks who are focused on backup
and have that as a job title or as a responsibility
tend to be more attuned to it
in terms of the urgency around implementing these solutions.
But I will say that at the same time in the file space, I think we have over a thousand deployments
of the ransomware defender products. So it's not like it's not there. It's pretty well understood.
Will we do more? I think that will increasingly become
something that is just going to be a, you know, check-the-box item that has to be part of the order coming from our customers,
not just us. But I do think the backup administrators are ahead.
I do think that the conversation starts with the CISO and you meet in between.
And, you know, so.
Yeah. So I'm thinking through from a portfolio perspective, and I know the rest of these things aren't under your umbrella exactly, but can Dell take what you've learned here on PowerScale and, for lack of a better word, port those types of security solutions over to PowerMax and PowerFlex and all the other stuff that you offer? Is that part of it? We haven't ever talked about this before. I'm just wondering how much of that universalizing of some of these chunks of
functionality can happen and then spread across the portfolio.
So, interestingly, you asked me the question:
PowerMax recently implemented some metrics that
can indicate whether or not they believe that something bad is happening in
your environment as well. That gets sent up to a security index that's reported on in CloudIQ.
And so all of the products will start to report on some of these metrics within CloudIQ to show,
at a base level, you know, is there a score that you can look at that tells you, hey, I think there's something going on, or it looks pretty clean. That's at least kind of your first line of defense,
not your last line of defense. Obviously, you may want to get more sophisticated than that. And
that's where some of these additional product offerings come in. And we do take a good,
better, best approach, right? So all of our products have immutability in some way, shape, or
form, so you can always turn on immutability with retention lock and prevent someone from being
able to delete something, and look at, you know, some reporting metrics to figure out whether or not
something's going on. The next level is the, you know, user behavior and actual content analytics,
and being able to create a vault is another step.
So it's almost in the other order.
It's actually, immutability is kind of, how do you protect your data?
Then there's isolation, which is, how do you put it in a location where
an attacker can't easily vector in? And there's intelligence, which is, how do you
look at either attacker behavior or scan the data itself to look for
something that I can notice that, to the ordinary eye, might not actually look like an attack?
Because some of these attacks are getting so sophisticated, they might flip only a few
bits and that's enough to really mess up your data and extract a ransom.
But it wasn't enough to really trigger anyone that something really bad happened.
So it might stay dormant in your environment for months before you actually detect it.
Unless you had some AI ML that was learning from other people who got attacked and go,
oh, wait a minute, whenever I see a pattern that looks like this, that's an attack.
And so we're building all of those capabilities into all of our products.
And each one of them will essentially feed into different
levels of scoring, whether it's just a security metric that's reported on in CloudIQ or a more
sophisticated product offer or maybe the best product offer, which includes an air gap vault.
That's interesting. Yeah, I mean, we know that the backups are now the primary target for
ne'er-do-wells that are looking to cause problems.
And as you said, the data suggests pretty heavily that they will sit as quietly as possible for some period of time,
because then you end up backing up, backing up, backing up.
And now you could have months or days.
I mean, who knows how long that piece of nefarious code or whatever is there just kind of waiting to be activated.
But I do like the notion of taking everything you've learned and not just leaving that in one customer's library.
But let's say, you know, let's be collaborative here and take these signatures and these behaviors and figure out a likelihood that this has shown up in your environment.
And if so, here's what we need to go do to remediate it before it's a serious problem.
I mean, that's got to be a big part of the future for getting out ahead
or as ahead as you can be on these attacks, I would think.
That's right. I mean, it's really become like an antivirus-type world out there.
We have to get smart about how malware is making its way in.
This is a form of malware, right?
It works its way into your system.
So try to do that as much as possible.
You made a comment before, I want to revisit for a second,
about the unstructured data at scale being really cumbersome to back up.
And it's not something that I've thought a lot about.
Is there a common wall that customers hit?
Is it 10 petabytes?
Is there some number where they're like, okay, once we get this far, it's too hard or too expensive or too slow or whatever?
I mean, I don't want to throw numbers out there and get people wrapped around them,
but there is probably a scale where you're either saying, okay, you know what?
I'm not going to back up everything.
I'm going to be more selective about which data sets within my environment I back up. Okay, what I'll tell you is that, you know, it's not uncommon for us to see
10 to 20 petabyte plus environments, and even 200 petabyte environments are not completely uncommon.
And so at that scale, the backup infrastructure to support it would be enormous,
and so, you know, for a number of different reasons, including that the change rates would be just
off the charts, there's probably a point in that petabytes, multi-petabytes range
where customers flip over to either doing selective backup of just a portion of their environment
or just using replication and snapshotting as a way to protect their data.
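Putting illustrative numbers on why change rates push these environments away from traditional backup; the capacity, change rate, window, and ingest throughput below are all assumed.

    capacity_pb = 20.0
    daily_change_rate = 0.02     # 2% of the data changes per day (assumed)
    backup_window_hours = 8.0
    ingest_gb_per_s = 10.0       # sustained backup ingest (assumed)

    changed_tb_per_day = capacity_pb * 1000 * daily_change_rate
    window_tb = ingest_gb_per_s * 3600 * backup_window_hours / 1000
    print(f"Changed data per day: {changed_tb_per_day:.0f} TB")
    print(f"An {backup_window_hours:.0f}-hour window at {ingest_gb_per_s} GB/s ingests: {window_tb:.0f} TB")
    # -> 400 TB of daily change versus 288 TB of window, before full backups even enter the picture.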
Does this then, I mean, we started out talking about Apex and Alpine.
Alpine specifically has got to have your customers pretty excited, because now if I can ad hoc spin up a
couple petabytes of OneFS in a cloud and use that as my replication target, rather than, you know,
whatever they do now, copying the infrastructure or taking old nodes and making those the replication
targets, or doing it on disk because it's more cost-effective
or whatever. Like there's a lot of other expensive and/or suboptimal ways to do this. But Alpine
seems potentially like a really great fit for these guys that are too big, or for anyone that
wants to replicate this stuff off and still have the accessibility to it, you know, fail over to it if they have to. Oh yeah, I mean, all kinds of
interesting topologies pop up, from just a one-time burst of cloud, because I need to go
stand up a pretty large cluster, I've got the data, but I don't want to go buy 10,000 compute nodes
for a day, so I'd rather just rent them from the cloud, to what you said, like, oh, let's do periodic updates, and maybe we'll spin down the data during times when it's quiet and spin it back up and catch up to where it was before, to tiering off to low-cost, really low-cost object storage.
It's the ultra-cold archive.
So all of these things come into play.
And then there's just a pure agility play.
Like, you know, we happen to be,
I happen to be in a fortunate spot in that
Dell is a pretty large vendor of both software and hardware.
And as a large vendor of hardware, we enjoy really good relationships with our suppliers, which you
talked about earlier. And so typically I can supply my customers with gear when they need it, but some
customers, or some of, you know, some of our competitors, may not have that luxury.
And so if you've got an environment where you've got rapid change and it's unpredictable,
and so I'm looking at something that might grow by two petabytes tomorrow and seven petabytes the day after,
who knows what the day after that, one of the advantages that cloud gives them is that almost instantaneously accessible supply chain.
I don't worry about, you know, oh, it's going to be half a year to go get my gear.
Okay, well, let me just hang out and wait. If a customer doesn't have that luxury, that agility is pretty critical.
Well, sure.
It lets you spin up almost instantly, right?
As fast as you can swipe your credit card
and be on your way. Yeah. I mean, this is a good conversation. We've covered a bunch of stuff I
didn't expect to. So I appreciate that. It's always fun to go in and learn more about what's
going on. It was great to be back in the lab and in Hopkinton again. It's been a
little while since we've done that and seeing some of this technology come to fruition, the
hardware bits. I mean, I know you guys are really into the service and cloudification of everything,
but I like to see and touch stuff still. So it's good to be hands-on with these things. And it's real.
I mean, I don't know exactly how long you've been shipping QLC, but I saw it in a number of the nodes on site as we were messing around there. And yeah, it's pretty neat to see.
Yeah. I mean, look, hardware is not going away, right? I mean, in fact, if anything,
for these really large environments
where something like 30 terabyte makes sense,
we see that the cost of running in cloud,
unless you really need that agility,
you know, it's not cheap.
And so our customers are going away
from the mentality of cloud first
to like cloud when it's right,
you know?
Right.
So they have to do that little back of the envelope calculation to say, hey, wait a minute.
Is this a good environment for cloud, or is this one I might keep on-prem?
And at very large capacities, you know, again, I mentioned 200 petabytes, but actually it's
a thing.
That calculus works in favor of,
wow, that's a big chunk of change I saved by staying on-prem.
And so I don't think that it's going to dry up in terms of the demands around density
that we talked about earlier,
performance and large cluster sizes.
All that stuff is all going to be there
for the foreseeable future.
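The kind of back-of-the-envelope comparison being described might look like this; every price here is a made-up placeholder, and real-world factors like power, cooling, operations staff, and cloud egress are left out entirely.

    capacity_pb = 200.0
    cloud_usd_per_tb_month = 20.0    # hypothetical cloud file/object storage rate
    onprem_usd_per_tb = 100.0        # hypothetical all-in acquisition cost
    amortization_months = 60         # five-year depreciation

    capacity_tb = capacity_pb * 1000
    cloud_monthly = capacity_tb * cloud_usd_per_tb_month
    onprem_monthly = capacity_tb * onprem_usd_per_tb / amortization_months
    print(f"Cloud:   ~${cloud_monthly / 1e6:.1f}M per month")
    print(f"On-prem: ~${onprem_monthly / 1e6:.2f}M per month (amortized)")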
Well, one of our most popular social media videos of the last couple of weeks was the Buffalo Bills coach, I think is the offensive coordinator that was freaking out, smashing
everything in his coaching booth after they couldn't get the snap off.
And we clipped that, and the tagline was something like,
your CFO after they get their first cloud bill,
which is kind of what you're talking about, right?
Is that it's,
maybe that was a dramatization via meme,
but it can be hard sometimes
to really estimate and get a full handle
on what that is.
And again, what you're talking about at scale
with unstructured can be insanely expensive
if that's not value optimized for your workloads
and what you're trying to do as an organization.
Absolutely.
Yep.
I mean, it's the difference between rent versus own, right?
I mean, you know when you're going to go on vacation,
you don't just buy a home in the place you're going to
unless you can absolutely afford that.
Congratulations to you if you can.
But, you know, if it's a temporary thing that you need,
then you go rent it.
Or if it's something that you need sporadically,
that might make sense too.
But Airbnb every day doesn't seem like a good way
to manage your capital.
No, especially since they outlawed the parties.
I'm not an Airbnb guy anymore.
So ruin my whole vibe.
Better cut back on that.
Well, this is great.
We've got a report coming out in a couple of weeks
that's going to dive way into all of this stuff and cover it.
And normally I would have you on after we publish the report.
But, you know, so it worked out to get you on before. So this will be a great teaser for anyone that wants to learn more about
what's going on with PowerScale, what you guys are doing with QLC, some of the performance data
you talked about. We'll have some numbers to look at too. And I think it's great,
just selfishly as a storage guy, to see a vendor out there pushing the limits of what can be done and addressing customer needs with great products and being unafraid to go do that.
And so I give you guys a lot of credit for that.
Well, thank you.
Appreciate it.
Thanks for joining us today, David.
Appreciate it.
No problem.
Anytime.