Podcast Archive - StorageReview.com - Podcast #120: Seagate Exos CORVAULT
Episode Date: May 30, 2023
Recently, Brian took a trip to Longmont, Colorado, to look at the Seagate Exos…
Transcript
Everyone, Brian Beeler here with the StorageReview Podcast, and we're lucky enough to be
on site out here in sunny Longmont, Colorado. It is still sunny before the snow comes in.
And we've been working all day on the Seagate Corvault products. We're going to talk a little
bit about that and other things that are going on. We've got no script, so let's just jump into it.
Eric, thanks for doing the podcast.
Thanks very much. It's great to be here.
Okay, so Corvault, you said you've been working
on it for six years. What's the deal?
Yeah, you know, this is an idea that we came up with. We
were in a meeting, you know, like I said, five or six years ago, and, you know, we were talking,
this was a big storage provider, and they were, you know, really sophisticated. They were doing
all these great, great things, and I was thinking to myself, you know what? Like we should think about storage differently, right?
You know, people have been thinking about storage
in these kind of small three and a half inch sized
increments, but the-
Wait, they're still three and a half inch increments?
Right, they still are in size,
but that's how we think of them, right?
But we really ought to think of them in hundreds
because people are deploying thousands and tens of thousands.
You think about it like a box of Tic Tacs, right?
You don't buy Tic Tacs one at a time, right?
You buy a whole box, right?
And I thought, geez, for these clouds,
and you're increasingly seeing bigger and bigger storage deployments,
why don't we think about the storage a hundred times bigger than we think about it today?
And then the things we can do to create value are just amazing, right?
It just opens up a whole new world
of things we can do for value.
Okay, well, let's think about this.
The storage at scale was what, JBODs, traditionally?
Right.
Or what, or in-server compute nodes?
Both, right?
At the time you were talking about this,
what sort of architectures were predominant at the time?
You know, there's always been a combination
of kind of disks with compute in the same box, and then, you know,
compute in one box and JBODs at the start.
Mostly for the big ones, it's, you know,
it's JBODs with servers.
And you think about it, when you've got a JBOD with
100 and something disks in it, that computer,
the compute node, either, you know,
if it's either connected directly or, you know,
just on top of it, it's got to think about what's happening
with all those hundred disks, right?
It's got to put every block of data on every disk,
it's got to keep track of where it is,
it's got to worry about all those disks
being the same size, et cetera, et cetera.
Like, why does it need to do that?
You think about a disk itself, right?
A disk itself has got 10 platters now, 20 surfaces,
and we manage, when you buy a three and a half inch disk,
you're getting 20 surfaces,
they're all a little bit different in size, right? And we do error correction and all
that. I'm like, well, why don't we just, why don't we think about this in a higher abstraction?
Let's think about a storage element as a hundred-ish disks. And then all of a sudden we can start
bringing in some real value to our customers, and we can kind of free them from having to do
the management at this low level. There's no reason to do management at that level, right?
Their job isn't to manage every disk,
their job is to do AI or to store your data
or something like that.
Let's let these companies focus on their core competence
and manage the lower level infrastructure,
which is what we're good at.
Well why, actually, let's take one half step back
because maybe the audience doesn't know Corvault.
So how does Corvault differentiate itself
from the traditional SAN or NAS storage array world,
and then the JBOD world?
You're sort of in a tweener spot with this.
Right, that's exactly right.
We actually use some of the technology from our SAN business,
which is really popular.
We've sold more than a million copies of this data path.
And what we do is we take that and we put it in,
essentially a JBOD infrastructure and we give you,
the way to look at it is a two or three petabyte hard disk.
So we do erasure coding inside of that box.
And then we have all these neat things that we can do
with failure management.
So for example, if just like one surface of a disk fails,
today what do you do if any part of the disk fails?
You throw the whole disk away, right?
And then you know, disks are getting pretty big.
They're gonna be 30 terabytes before you know it.
You have to move 30 terabytes of data across your network
to make up for that.
Well, there's really no reason to do that, right?
We can just take one surface away if just one surface fails,
leave the rest of the disk in service,
and you don't have to pay all that tax, right?
So you can do all these neat things with maintenance
and reliability and things like that.
Once you think about that unit of storage
as a hundred disks that are all being managed together
and giving you one very reliable big disk,
and that's what Corvault is.
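To put rough numbers on the surface-versus-whole-drive point, here is a minimal sketch using only the figures mentioned in the conversation (a 30 terabyte drive with 20 surfaces). It assumes equal-sized surfaces and a completely full drive, which is a simplification; the function names are made up for illustration and are not part of any Seagate tool.

```python
# Illustrative arithmetic only. The figures come from the conversation;
# equal-sized surfaces and a completely full drive are simplifying assumptions.

DRIVE_TB = 30.0          # "They're gonna be 30 terabytes before you know it"
SURFACES_PER_DRIVE = 20  # "10 platters now, 20 surfaces"


def whole_drive_rebuild_tb() -> float:
    """Data to re-protect when the entire drive is thrown away."""
    return DRIVE_TB


def surface_rebuild_tb(failed_surfaces: int) -> float:
    """Data to re-protect when only the failed surfaces are retired."""
    return DRIVE_TB * failed_surfaces / SURFACES_PER_DRIVE


def remaining_fraction(failed_surfaces: int) -> float:
    """Share of the drive's capacity that stays in service."""
    return (SURFACES_PER_DRIVE - failed_surfaces) / SURFACES_PER_DRIVE


if __name__ == "__main__":
    print(f"whole drive replaced: {whole_drive_rebuild_tb():.1f} TB to rebuild")
    print(f"one surface retired:  {surface_rebuild_tb(1):.1f} TB to rebuild")
    print(f"capacity kept online: {remaining_fraction(1):.0%}")
```

Under these assumptions, retiring one surface means re-protecting about a twentieth of the data that a full drive replacement would require.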
So we saw some of this years ago with Xiotech
where they had these data packs,
which were a bunch of two-and-a-half-inch hard drives,
if I remember, it's been a little while.
And they had some of these self-healing attributes.
So it's not entirely a foreign concept, but it's still something that's not very
out there in large amounts of storage. I mean this is still relatively unique.
Oh yeah, you know, it's such a great analogy, right, because it was the same idea.
But you know, what's happened is the scale has changed dramatically.
So you think about it, Xiotech was probably, I don't even remember, it's
probably a couple of hundred gigabyte drives,
and maybe there were 12 of them, something like that.
And when I first started working at Seagate
about 10 years ago, you know,
kind of a state of the art object store was
a 2U12, right, so 12 disks,
and maybe they were one terabyte,
half a terabyte, something like that.
And then there was a server managing that.
So you think about, you have a server for say,
six to 12 terabytes of information, that's a fair amount of overhead. Today, we've got 2,000 terabytes,
3,000 terabytes for that one server. So what's changed over that, gosh, 15 years is that the
percentage of overhead that you have to carry to do this kind of work has gotten less and less,
to the point now where we sell Corvaults for the same price as a JBOD, right?
So you can choose, you can have a JBOD
where you have to manage all the disks,
or you can buy a Corvault for the same price
where we do all the management for you.
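As a back-of-the-envelope check on the overhead argument, the sketch below compares how much capacity sits behind one server in the two eras described here (6 to 12 terabytes then, 2,000 to 3,000 terabytes now). Only those four figures come from the conversation; the variable names are illustrative.

```python
# Capacity behind a single server, using the ranges quoted in the conversation.
THEN_TB = (6.0, 12.0)       # 2U12 object store of roughly ten years earlier
NOW_TB = (2000.0, 3000.0)   # capacity quoted per server today

# Most conservative and most generous growth factors across the two ranges.
min_growth = NOW_TB[0] / THEN_TB[1]
max_growth = NOW_TB[1] / THEN_TB[0]

if __name__ == "__main__":
    print(f"capacity per server grew roughly {min_growth:.0f}x to {max_growth:.0f}x")
```

If the cost of the managing compute stayed roughly flat (an assumption, not something stated in the episode), its share of the total would shrink by about the same factor.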
And you, actually, you guys are pretty public
about pricing, which is different.
So a lot of this goes to the channel,
or most of it, or all of it, or something, right?
And there's hundreds of these guys out there
selling solutions, enterprise IT, SMB, I mean, all sorts of things.
And I think the general consensus or the popular opinion is that JBOD is the most cost-effective way to get a bunch of storage attached to some compute.
But for you guys to go ahead and publish numbers in enterprise IT around cost is somewhat unusual.
Right. Well, you know, I'm trained not to say cost, but one of the things that is really
important is that people think that that kind of commercial off the shelf, commodity hardware
is the cheapest thing, but what they don't think about, and this is one of the key ideas
in Corvault, is that to make everybody's product look the same, you're introducing a pretty
significant inefficiency.
You're paying for that another way.
You think, hey, I'm getting this really low cost.
There was a really big move from this bespoke hardware,
from the big OEMs toward more software-defined storage
that really took a lot of money out of the system.
But when we look at it now, if you pay the tax to make every drive the same,
then all of a sudden you're losing a tremendous amount of value.
And that's one of the tricks that Corvault does
is that it allows us to use our biggest drives
in the best possible way.
And that's something that is more difficult
to get in a JBOD infrastructure.
More difficult, I mean, in some ways impossible.
When you start to think about,
if you load up today, 106 drive bays,
20 terabyte disks,
whatever, you could get to around two petabytes
usable out of the system.
When you go through the process of ADR
to recover some component, or fail some component
of the drive, so fail a platter,
now you're at 90 something percent
of whatever its original capacity was.
So in effect, you're managing now 105 drives
of one capacity, another drive of a slightly
lower capacity. But the way you're set up is built for that resiliency and to be able to keep those
drives online longer rather than fail them immediately and have someone go service it,
pop the drive, pop one in, and then let it do its rebuild. Yeah, you know, that's a big cost. You
think about it, to have somebody on call 24-7.
That was my first job in college.
I was the guy, I had like the midnight to 8 a.m. shift.
Your mother must have been so proud.
Well, yeah, I'd either do homework or sleep, right?
And literally I had a cart with drives, and whenever a RAID set would go down,
I would go and, you know, wake up, that alarm would go off,
and I'd rouse myself from slumber
and go replace the drive, right?
What was your accuracy? I'm curious.
Well, I got fired, actually, because one day I replaced the wrong drive. You know, it was three o'clock in the morning
and I just, you know, was waking up.
That was before Red Bull.
Oh yeah, there was nothing for it.
And I remember I was in a deep sleep, and, you know, the alarm went off, and
I replaced the wrong drive and crashed the whole thing and lost all the data.
So that's the resiliency problem though.
No system should fail based on that.
Right, I'm with you, right?
And that's actually the great thing
about products like Corvault is that
we're giving you, these Corvaults have five nines
of reliability, right, 99.999%.
And they're extremely resilient to failures
and things like that.
And they do it in a way that you have to invest a lot less money in maintenance to get to that reliability.
And that's one of those things you pay for when you want that everything standard kind of hardware.
You have to pay that maintenance cost.
You don't have to do that in a more managed system like this.
So go back to the, let's talk more about the drives because I don't want to go past that so quickly.
It's really neat in the way that you're able to look at, so tell me, it's surfaces, heads, what else can we detect and work around in the hard drive itself?
Well, I tell you, these modern drives are very smart about their own health today.
And so the drive is clever enough to know if a failure is just a single
surface and if you happen to remove that surface from the pool, the drive will be fine. It
knows that. Or it knows if there's some more system problem where you should replace the
whole drive, right? And then the system is smart enough to know, all right, I've been
set up so that I only want to replace drives every year, for example. Instead of having
that guy with a cart that I did 24-7, instead I'm just going to pay a contractor to come in once a year and just replace all
the failed drives. Well, it knows that, right? And it knows how to say, okay, I'm 2.12 petabytes
now, but I'm only going to use a certain amount of that. I'm only going to use 98% of that
storage. I'm going to hold a couple of drives of capacity in reserve. It actually uses all
the drives. But then if a drive happens to fail along the time, it says, no problem,
I'll just take that drive out of the pool or I'll just take that one surface out of the pool,
and I'm just going to keep going.
I don't have to, you know, I'll say what I'm doing, but the user doesn't have to do anything, right?
And that's a huge deal and a huge maintenance savings at scale.
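The reserve arithmetic described here can be sketched in a few lines. The drive count, drive size, and 98% figure are the ones mentioned in the conversation; the calculation uses raw capacity and ignores erasure coding overhead, and the names are illustrative rather than anything the product exposes.

```python
# Raw-capacity arithmetic only; erasure coding overhead is deliberately ignored.
DRIVES = 106            # drive bays in the enclosure
DRIVE_TB = 20.0         # terabytes per drive
USABLE_FRACTION = 0.98  # "I'm only going to use 98% of that storage"

raw_tb = DRIVES * DRIVE_TB                 # total raw capacity in TB
raw_pb = raw_tb / 1000.0                   # decimal petabytes
exposed_tb = raw_tb * USABLE_FRACTION      # capacity actually handed out
reserve_tb = raw_tb - exposed_tb           # spare capacity spread across all drives
reserve_in_drives = reserve_tb / DRIVE_TB  # the same reserve, counted in whole drives

if __name__ == "__main__":
    print(f"raw capacity: {raw_pb:.2f} PB")
    print(f"held back:    {reserve_tb:.1f} TB (about {reserve_in_drives:.1f} drives' worth)")
```

With these inputs the 2% held back comes to roughly two drives of capacity, which lines up with "a couple of drives of capacity in reserve."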
Well, yeah, if you could have one person come in once a year and drop out drives and be
gone, that's what, a couple hours a year rather than several small increments that add up
by the time you show up, you badge in, you get the thing, you swap it.
Yeah, or you pay somebody to sit there 24-7 with a bunch of extra drives.
That's really expensive, right?
So these new systems become self-resilient,
and then all of a sudden, you can count on this,
whatever number of nines the box has itself,
you're erasure coding above that,
you have an unbelievable resiliency,
like 12 nines or something like that.
Well sure, and then when you layer software,
where you're going is where you layer software
on top of the box itself,
now you've got sort of an exponential gain, I suppose.
Oh exactly, exactly, and the resilience of the system is just amazing, right?
So that's where you get that cloud level reliability.
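One way to see how a five-nines box plus erasure coding above it could land near "12 nines" is a simple binomial estimate. Everything about the layout below is an assumption made for illustration: ten boxes, a code that tolerates any two of them being lost, fully independent failures, and "five nines" read as a 0.00001 chance that a given box is unavailable. None of these parameters come from the episode.

```python
import math

P_BOX_DOWN = 1e-5  # "five nines" read as a 0.00001 chance a box is unavailable
N_BOXES = 10       # hypothetical number of boxes in one erasure-coded group
TOLERATED = 2      # hypothetical: the code survives any two boxes failing


def loss_probability(n_boxes: int, tolerated: int, p_down: float) -> float:
    """Probability that more boxes are down than the code can tolerate,
    assuming independent failures (binomial tail)."""
    return sum(
        math.comb(n_boxes, k) * p_down**k * (1.0 - p_down) ** (n_boxes - k)
        for k in range(tolerated + 1, n_boxes + 1)
    )


def nines(p_loss: float) -> float:
    """Express a loss probability as a count of nines."""
    return -math.log10(p_loss)


if __name__ == "__main__":
    p = loss_probability(N_BOXES, TOLERATED, P_BOX_DOWN)
    print(f"single box:      {nines(P_BOX_DOWN):.1f} nines")
    print(f"group of boxes:  {nines(p):.1f} nines")
```

Real deployments see correlated failures (power, network, software), so this is an upper bound on what the layering buys, not a prediction.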
So, who's your customer for this?
Is it really hyperscalers, or is it smaller guys
that are CSPs, MSPs?
Is it enterprise?
And I'm sure you'll probably say, yes, all of them.
But where are you seeing some good traction points?
Well, there's really two populations, right?
So one is everything we sell is enterprise, right?
So it's all enterprise.
And we really see two populations.
So one population are people building things that look like clouds, right?
Private clouds, a lot of the distributed storage, Web3, people like that, where they're building
a lot of storage and they really want that redundancy built in, right?
So that's one population of customer.
The other population of customer is the one that has
one, two, or three of these things.
And they just, you know, they don't want to have to deal
with the overhead of the data protection, right?
It's complicated, you have to have a storage expert,
it's really hard to do, especially if you have
these software defined systems, you know,
some of them are quite fiddly, right?
You can do quite a, you can do a lot with them,
but you have to have somebody who really understands it
and gets in there.
So we kind of see both ends of
the spectrum: the biggish clouds, they really like the simplicity, and the
smallest ones really like not having to deal with the software-defined object
stores.
Well, so it's funny, but the smaller ones tend to be
cost-optimized, right? And what they do, because they just don't have the
scale, is often go with kludgy stuff, software,
and it's not quite stable,
so they tend to spend more time servicing
and figuring out why things are down,
why things are broken.
Something like this helps alleviate
some of those concerns then.
Yeah, the way I look at storage is that
we all buy storage on two factors,
cost and risk, right?
I buy on bezel.
I like bezel shape.
The bezel, yeah. Absolutely, the bezel.
The bezel sells it.
You should say, our bezel is awesome.
I know. I saw it.
You walk in there, it looks great.
I would buy it just for the bezel.
No, no.
So you have a certain amount of risk you're willing to take, right?
You want a certain reliability of your data.
And then after that, after you've achieved that, and there's also a performance element
in there, then it's the lowest cost.
So it's like, what offering can give me the best value at the level of risk and performance that I care about?
That's really the point.
And so when I look at those kind of two populations,
even though they seem kind of far apart,
Corvault actually does a great job servicing both of them
because they ultimately address those two things
for a very low price.
So how do you, with all these varied scenarios
where they're going to go to work,
maybe it's Ceph, maybe it's a Filecoin thing,
maybe it's M&E storage, maybe it's surveillance, maybe it's just traditional
enterprise storage in VMware or something.
Those are five things I just mentioned
that are all tremendously different
in how they're set up and configured.
And I guess with a company like Seagate
with as much scale as you have in terms of people
to help figure these things out,
but how do you stay focused on making sure
that you can support, maybe not directly,
but support your partners in all of these
varied types of deployments?
Well, that's a big deal, right?
Because people, you know, nobody wakes up in the morning
and says, hey, I want to buy a big box of hard disks, right?
Unfortunately, they should, right?
It would be easier for you.
We put a lot of energy into making these things great, right?
But they don't do that.
They say, hey, I want to do AI on this
data set that I own, or I want to back up my data, or I want to do something else, right? They think
of that solution. And then they think about the software that they want to service that solution,
right? And then they start thinking about the hardware. So, you know, in a lot of ways, we're
kind of three steps down the decision tree. And so what we've done, and we put a lot of
resource into this, is that we've got these great labs, our reference architecture labs, where we've identified, you know, kind of
the main, say, eight or ten software infrastructures that we think people will use our products in.
And what we do is we actually build them. So we get a lab, and you probably have seen today,
you know, we go and we actually build these infrastructures out. We write papers. We show
the architectures and things like that. You know, I think ultimately,
you should just be good at what you're good at.
So Seagate is good at-
Try not to overextend?
Well, like I said, just focus on your core competence.
We want to enable our customers to focus on
their core competence and not worry about
the nuts and bolts of the storage.
And for us, we want to protect and store your data.
And you think about it,
Corvault is really just a very reliable, big hard drive.
It's a block device, right?
So you give us a block of data and say keep it safe, and we'll keep track of it for you,
right? And that's really the way to look at it. And all the software on top is not trivial,
right? It's a lot of work. But ultimately, all software uses storage in a similar way where it
says, hey, here's a block of data. Keep it safe for me. Keep track of it, right? And that's what
we focus on. Yeah, and that highlights, I mean, that highlights the big difference
between what you've done with Corvault
and what you would do with a JBOD,
because in the JBOD scenario,
you're relying on something else to do that work, right?
To maintain the data integrity.
It's like buying a bookcase from Ikea, right?
You get the bookcase, but it's in 400 pieces,
and you've got to spend all afternoon
putting it together, right?
And we just say, look, here's an already built bookcase. It works just great.
How do you, okay, so then how do you prevent yourself from trying to do too much? Because
it would be super easy to say, oh, well now we've done this erasure coding thing. We've got this
ADR thing. We can rebuild and keep drives online. We've got a couple other data services going on
within this chassis. Why don't we just add a bunch of other stuff and make it look like a SAN?
I mean, how do you just temper that excitement of putting too much into these systems?
That's a great question.
I'll tell you, we have been tempted, right?
Well, of course you have, yeah.
As soon as you start, right, there's all these things.
Let's put in an S3 connector.
Let's do all these things.
Yeah, let's put some flash in there.
Let's put some SSDs.
We've got great tiering software.
We can make it work, right?
But now you already have a product for that.
But those meetings of how much to put into this
versus what do we leave to the traditional
SAN controller businesses, gotta be challenging.
Well, back to be good at what you're good at, right?
Our job is to take a block of data and store it safely.
And so we really think about that
and we want to give the best value.
So we do a lot of things.
Like I've really encouraged innovation in areas
that reduce the total cost of ownership.
We're doing all kinds of interesting things there
and with these new drives where we use a laser to heat up
a millionth of a square meter of media
for a millionth of a second so we can write the data.
Like, we're doing all kinds of neat innovation there,
but we're keeping the features very simple, right?
We want this to be a very simple device that just works.
It's easy to, you know, one button provisioning.
You open the box, it works.
Like, we've been really clear with that.
And, you know, I agree there's a lot of temptation
to make it more complex, but that's not the problem
we're trying to solve, right?
We're trying to make the story simple.
Yeah, but engineers, you know what happens with engineers.
They get a little crazy and always want to do more.
Yeah, no, absolutely.
But I think we've done a good job of staying simple with this product.
And we do have the SAN product line, and we actually make quite a few.
So we kind of do see that part of the business,
but that's not what we're going for with Corvault, right?
We want Corvault to be very focused.
Distinctly different, but again, a number of great use cases for that product specifically, which is pretty great.
And then, the delineation between what you have
in the rest of the portfolio.
I think too, a lot of people don't realize how robust
the Seagate storage portfolio is, past just the drives,
but you've got all the Lyve Drive stuff,
you've got all the other stuff for the creative program.
We've got our mobile stuff,
we've got SSDs all over the place, we've got a lot going on.
But ultimately, it's the same kind of process
you were just talking about, how do you keep
the portfolio simple, and we want to protect
and store your data, so if you think about
all of our products, that's what they do.
We protect and store your data in a reliable,
cost-effective way, and all our products
are focused on doing that.
Corvault is a good example where we've really
just focused that in on a very clean value proposition.
So Corvault comes in a big chunk, right?
Yep.
106 drives, 20 terabyte today.
When you buy it, you get this thing.
Which solves a lot of problems at scale.
Do you worry, or are you guys concerned at all
about doing a smaller version for people
that want the same erasure coding
and all the recovery stuff available in Corvault
but just don't need that scale?
Well, we're actively looking at a one meter box.
So right now our 106 drive box is 1.2 meters
and that's too big for some data centers.
So we're looking at a one meter,
so we might get there in the next couple of quarters,
right, that's something we're looking at.
I don't see doing this at like a 2U12,
and it's the Xiotech problem, it's like you really have
to have a fair amount of storage so that the overhead cost
of doing the erasure coding and all that other stuff
is tiny, right, you want that overhead to be small.
If you did that with 12 drives, you have to pay
for all the same controllers and everything else,
and then it's not as good of a value.
So it'll always be some relatively large amount of storage,
although we are open to, like I said, smaller form factor, like one meter, things like that.
So what are some of the fun use cases where this is going?
We've talked sort of around the edges of what you can do with it,
but give me a couple you guys are excited about.
Well, I'll tell you, the thing I'm most excited about is,
you know, my whole career.
So I'm 30 something years in the business now.
And you know, forever, drives have come out
in relatively small increments of capacity.
So you know, we were doing, you know,
100 gigabytes, now we're doing two terabytes.
Well yeah, out in front of the lobby,
you've got the 125, like, Mongo drive.
Yeah, MFM interface, yeah.
Anything with 18 cables coming off of it
probably shouldn't, by today's standards,
it's kind of funny.
I remember the first time I saw one of those,
I was like, I could put all of my 320K floppies
on that thing, it's going to be great.
Yeah, I saw there was a social post this week
of people comparing the number of floppy drives or floppy disks required for DOS, Windows 3.11.
And then I think they went all the way up to 95 or something on 1.44.
It was just pretty funny to watch the progression.
Bigger and bigger and bigger, yeah.
You know, the change has just been incredible in the market.
The thing I'm most excited about is that we're about
to start releasing drives with much larger capacities.
So we're working on a three petabyte product,
Corvault, right now.
And that is just mind-blowing.
I can't even imagine how much data that is,
you think about it, it's just like...
Even 18 months ago, the notion of a petabyte
under management was hard for most people to wrap their heads around.
And now, you're talking about 2 point whatever 1 petabytes.
Right.
We're very simple.
We're 2, 2 and a half, 3 petabytes.
We release Corvaults in very simple round-number
capacities.
I'll get it wrong, but here's an interesting, you know, stat.
Like from the beginning of time itself
until like sometime in the mid 70s,
somebody estimated that the sum of humanity's data
is like three terabytes or something.
Like, you know, it's like a small fraction
of one drive today.
And you know, like the amount of data we're making
is just getting, it's just continuing to increase.
It's doubling every three years.
The thing behind that, we were talking about
artificial intelligence earlier, it's not interesting
per se that the world's making more data.
What's interesting is that we're figuring out
how to get value from these huge data sets.
Well that's the question, right, is how much of it
is useful?
Is my Facebook post from 19, well I guess we didn't go back.
Don't go back.
Don't go there, right?
Eight years ago, my Facebook post was that, is that useful to still have anywhere today?
Not to anyone else, maybe to me, but not to the rest of the world.
But where do you start to draw the line, too, between what's actually useful and where you can draw some intelligent insight from and what's trash?
Well, that's the thing that is really, I think,
right now in this really big acceleration phase where,
you know, like I'll use an analogy for video surveillance.
So we've been doing video surveillance for about 50 years,
I mean decades, something decades, but for a very long time
the only way to get any value from that surveillance footage
was to have a person watch it
and think about what was happening, right?
Well, the initial surveillance was people watching. If you go back to any of the old casinos, they had the
floor, yeah, and then the guys...
I actually remember those when I was a kid.
Yeah, they're staring down through, like, the bubble, dude, you know, seeing who's cheating or whatever. It
was loss prevention then, but now it's much more, right?
Oh, they're super sophisticated now. But you
look at it, like, now the computers watch the video
and they can actually learn things that we find interesting.
And that's not just happening in surveillance,
that's happening in literally every profession and field
that you can think of.
Or retail, we're seeing a ton of that.
So I'm from Cincinnati, we got Kroger, we got Macy's,
and they're all looking at these things,
and especially with fewer staff.
So if you have a 100,000 square foot Kroger
and you have to stock whatever number of SKUs they have,
when you run out of peanut butter
or you're low on Honey Nut Cheerios,
if there's intelligent camera systems set up
with algorithms that know how long it takes
and where they are, you can get super efficient.
Oh yeah, and they know that there was a football game
that Sunday and so there's extra chicken wings on order, and they know that the temperature went below this
particular threshold so people buy more. I mean, like, the level of sophistication, and again,
it's every conceivable industry that's going on.
A little salty over football. I'm sad you
brought that up.
Yeah, you know, I was just in Philadelphia. There was a lot of salt, a lot of
salt going on.
Yes, the same salt. The Cincinnati Bengals and the Philadelphia Eagles both suffered from the NFL's plan to have Mahomes be the poster
child of the NFL for the next several years. Be that as it may, we undersold chicken wings in
Cincinnati, obviously, because no return trip to the Super Bowl.
Right, right. I'm not sure how we got
sidelined.
I know, it's just my...
Well, but it's like a golden age right now. You think about what's happening.
Like, I don't think we know where this is going.
Right.
I mean, I think that, you know, you have these huge data sets and the computers are getting ever more clever about pulling valuable things out of them.
And, you know, again, I think that I think that we're just in the beginning of the book, not in the middle or the end.
Right.
It's a really interesting world.
Do you have, what are you guys doing
to help customers figure that out?
Because you must be, you talked about the labs
and setting this stuff up.
Do you guys have to play more in that sandbox earlier
to be prepared for when those questions come up of,
how many cameras can I attach to this thing?
How much intelligence can I derive from this?
If I add GPUs, what does that do?
There's so many, you choose your own adventure pathways
to go with these things.
Well, yeah, there's a couple of vectors there.
So the number one thing we do is we try to make our part
of the equation simple so that companies can focus
on the part that they're good at.
That's number one.
Also, though, we're a really high volume manufacturer.
We make something number
of drives a second. And we employ artificial intelligence and smart cameras and things like
that in our factory. And that's informed a lot of our early work in particular on kind of where to
go with the architectures and things like that. So we actually either eat our own dog food or
drink our own champagne depending on what day it feels like. But we're an adopter of all this stuff,
because it matters, right?
And it's very efficient, right?
It's really helpful.
Well, speaking of that, I mean,
how many organizations suffered supply chain challenges
over the last several years?
It's still an ongoing problem for some.
How's that been for you?
And especially as you look at these package solutions,
it's not just the hard drives, it's 106,
it's the chassis, it's the blinky lights,
it's little transistors,
it's all of these things.
You know, obviously being a large international organization
should help with that, but that didn't save everybody
from being beholden to some 12 cent component
that you still need.
Yeah, they call that the golden screw problem, right?
So you get this big infrastructure
and you're just missing the golden screw.
I heard that there was a car manufacturer in Germany who was literally buying
washing machines, taking a little chip out of the washing machine,
putting into the car and like throwing away the washing machine or putting it
in the warehouse.
That's exactly what we're looking for out of our ESG initiatives, right?
Is excess washing machines.
That's our efficiency. I tell you,
we have a lot of great people who focus on this and it required a lot of work for
us, right?
A lot of focus. It took a lot of energy over about the last year and a half to work with our customers to get through this.
It's not totally over, by the way.
No, clearly not.
I mean, it's still a problem.
Yeah, high-power chips for power supplies and things like that are still hard to get.
But it's something that we definitely had to take a lot of energy
to navigate, right? And of course, we're coming on the back end of it now. I do think we're on
the back end. We're thinking a lot about how do we harden our supply chain further so that it's
not as painful if it were to happen again, right? That's something that's been a big deal for us.
Yeah. Well, I mean, nobody was ready for that sort of, that much of an abrupt hit. And I mean, we all remember back to
when we started to hear the rumblings. A lot of people were at CES at the beginning of 2020,
and then as things unraveled, I mean, it was pretty quick and painful for many that were
running just in time or other.
Yeah, I think just in time is not as popular today as it was.
I would say it's substantially less.
Well, you know what happened? So, like, when COVID hit, you know, companies run on cash, and so a lot of companies
said, you know what, stop spending cash. We don't know what's going to happen. Stop ordering stuff,
right? Stop ordering chips. And, you know, a computer chip, whether you like it or not, takes a relatively
long time to make: six months, nine months, a year in some cases.
And so everybody stopped ordering them.
The chip companies stopped starting new wafers, et cetera.
And we went on and luckily the healthcare industry
came up with a vaccine relatively quickly.
And here we are back and everybody's like,
okay, I want my chips now.
Well, somewhere in between though,
the storage market got hit really hard
because it was that, let's stop ordering
because we're not sure what's going to happen.
But then you had things like Chia come along,
Filecoin, other distributed storage things
that in, we were right in the middle of it,
in about two or three weeks,
just sucked everything out of retail supply.
And so there wasn't necessarily,
I think the problem looked bigger than it was
because Best Buy only has so many drives in stock
or Newegg or whoever.
But all of these things,
when people were at home looking for things to do
or some forward-looking organizations were investing
and taking that sort of downtime to improve their offerings,
I mean, this thing was crazy.
So the storage world kind of got doubly blindsided, right?
Yeah, I am really proud of how Seagate's
kind of navigated those waters, right?
We, you know, if I look at the magnitude
of kind of the ups and downs, man,
have we done a great job of keeping up with it
for our customers.
And that's one of the things that we're really good at
is we have long term, and we take the long view
for both our customers and our suppliers.
So when these bumps happen, we're not immune to them,
but boy, I think we weather them really well
because we're not thinking about our relationship
for only one quarter.
We're thinking about our relationship
with these partners for years and years.
So go back to Core Vault a little bit more.
We've talked a lot about the resiliency,
the erasure coding, all these great things
that are fundamental to it.
As you look at customers that are evaluating architectures,
what's the process to get them involved
in checking Core Vault out?
Are you doing POCs here on site?
Are you doing demos?
How do you get them to try this out?
And it's not that it's a fundamental shift
from what they're doing already,
but sometimes they want to see touch feel.
What does that look like?
Oh, that's a good question.
So really there's a three-step process.
So number one is that if you're using,
or thinking about using a Core Vault,
there's a really good chance we've built a system
that looks a lot like the one you might want to use it in.
So that's number one, is that we've got these
terrific reference architectures.
We're doing one for Filecoin right now,
where I think we have eight petabytes
under a single server under ZFS.
It's amazing, right?
The cost advantage is incredible.
We have another partner using Ceph,
and they're using 30% fewer servers
and significantly less memory.
So we've done, let's say,
the theoretical architectural work really well for all the
main bits.
And then so we can share that.
And then we also have a technical marketing engineer, sales engineer who can talk to your
team about that.
So that's kind of the first step is just like, will this work in theory?
And the second one is let's get you one.
Let's get a demo unit or an NFR unit or something like that so you can try it out and test it, right?
Because ultimately, you know, people first want to know
will it work in theory, and then they want to know
will it work in practice.
And so we're focused on being able to enable
those two things.
Well, you talk a lot about resiliency
and storing your block of data, making sure it's there
for however long you tell it to be there.
What are the other factors?
I assume we haven't talked about performance at all.
I assume that's part of the equation
for some of these use cases.
What do you guys, do you quote?
How do you think about performance for these?
Yeah, we do.
We actually, one of the things that is really good
about our Core Vault controller is it does just about
as good of a job as you can do
keeping all the actuators busy, right?
So with a multi hard drive system, you want more drives available.
Right, you've got a hundred hard drives. And, you know, people think about SSDs,
and they think, well, hey, they have good performance. But really what they have is
good IOPS performance, because you can get to one bit
just as easily as you can get to any other bit. But you take a
hundred hard drives or ten thousand hard drives and you put them all working together,
boy, they can stream some data, right?
So, you know, performance is a big deal for us.
Spent a lot of time tuning it and making sure it works well
in various applications, and
you can expect really good real-world performance
out of a Core Vault compared to almost
any other infrastructure.
It's really great performing.
Well do you think about the workloads
as mostly large block sequential ins and outs
or do you have to be ready for some bunch of random small?
What do you tune for?
What are you most prepared for, I suppose?
You know, it's a definite balance.
If I look at object sizes,
I'll tell you they really span a wide range.
Some infrastructures are just a bunch of tiny objects
and they chunk them together.
Some people use really big objects.
You really have to, I don't see a world where
like a Core Vault type product is oriented toward
a monolithic workload, because we just don't see them,
right, especially if you have enough clients, right?
It looks like a random workload.
I mean, if you're talking about going to Filecoin
and also to VMware, I mean,
those are two very different worlds, right?
Well, they are, but you think about it in,
if you've got, say, 10 or 20 petabytes of storage,
they're not as different as you might think.
So at a certain point,
they all sort of mishmash together to be...
Yeah, you have enough users,
everything looks like a random workload.
Absolutely, sure.
Huh, that's interesting.
You talked a little bit about Flash,
about restraining yourself from putting Flash
into these boxes.
Is there a world where Flash becomes a play,
either in a tier or in an all-Flash configuration
for Core Vault, or is that an expense bridge too far?
Well, it's not an expense.
Again, it's a simplicity play.
So, you know, for example, in the SAN world,
one of our most popular configs is we have a 5U84 product that's been super popular. And a lot of people use 80
disks and four SSDs, or 76 disks and eight SSDs is one of our more popular combinations.
We've really resisted doing that in Core Vault because we think that people who are going
to buy Core Vault, really what they're going to do is they're going to buy five Core Vaults
and one all-flash array. And they'll put that on top and use it that way.
And again, what we want is we want that kind of big hard drive to just be very consistent and simple and easy to use.
And so if you start breaking that up, all of a sudden you're adding complexity into the system.
So you've got to work with all the compute platforms out there, all the server guys.
I mean, I've seen some of your reference architectures with Supermicro.
But, I mean, at the end of the day,
it's just JBOD-ish storage, right?
So it'll work with anything you want to plug it into.
Do you worry at all about the competitive balance
of the world with you guys being a component supplier,
plus you OEM a bunch of stuff?
I mean, how do you manage those relationships
to make sure that everyone's comfortable,
I guess, with having the wide variety of products
you offer in the market?
Well, you know, it's a good question.
Ultimately, what we want to do is offer good value
to our customers, and there's a lot of ways to market.
Right, there's a lot of ways to market.
And so, what we found, for example,
we have a distribution channel for our systems,
and we always use two-tier distribution,
so that minimizes the conflict with our OEMs.
It's always something that's there, but ultimately,
you know, customers are going to buy
the product they want, right?
And so it's not a good idea to try to tell your customers
what they want to buy, right?
You need to like, you need to-
Generally doesn't work, or not the second time,
if it worked the first time.
Right, right, right.
And I will say, it's just not the way to do it.
We want to protect and store your data.
We're delighted to sell you a hard drive
or a million hard drives.
We're delighted to sell you a big box of storage.
You would take that order for a million hard drives?
In fact, right now.
Like if you want to, let me get my order book and I'll.
You're content to take that order?
All right, so the message is,
really if we boil it all the way down,
any infrastructure that's using a JBOD today
should be looking at Core Vault as an alternative.
Absolutely, absolutely.
That's that easy.
All right, we'll link to Core Vault notes
in the show notes, we'll link to that.
We'll put any other resources in there
so people can learn more, find your sales teams
and get that POC checked out
and check out Core Vault instead of a JBOD.
Well thank you, it's been great talking to you.
Thanks, sir.