Storage Unpacked Podcast - Storage Unpacked 259 – Sustainable Storage in the World of AI with Shawn Rosemarin (Sponsored)
Episode Date: May 31, 2024
In this episode, Chris discusses the topic of building sustainable storage solutions with Shawn Rosemarin, Global VP of Customer Engineering at Pure Storage....
Transcript
This is Storage Unpacked. Subscribe at StorageUnpacked.com.
This is Chris Evans and I'm here with Shawn Rosemarin from Pure Storage.
Shawn, how are you doing?
I'm great, thank you. The sun is shining and it's a beautiful May, what can I say?
Lovely. Canada, yeah, you've got the weather.
You're lucky.
For this moment, although in the Pacific Northwest,
if you don't like the weather or you like the weather,
just wait 10 minutes.
It's likely to change.
And something else will come along.
Yeah, perfect.
Okay, and before we go any further,
obviously you've been on the podcast
and we've chatted in the past before,
but it would be good if you could just take 10 seconds
and tell people what your current job title within Pure Storage is.
Yeah, so I have the illustrious privilege of working for our founder, John Colgrove,
within the office of the CTO. I look after a function of our business called customer
engineering. So I'm the global vice president of customer engineering. And what that means is
I spend the majority of my time talking to our largest customers and largest prospects, understanding how they're using data,
what their challenges are in data, how they're leveraging our platforms, how they're leveraging
the ecosystem, and really kind of act as that bridge back to our engineering and research
development organization to ensure that what we're focused on is completely calibrated to
what our customers are most focused
on. So incredible position and, you know, lots of learnings on a day-to-day basis.
Excellent. Okay, now let's just get into the meat of what we're going to talk about today.
And before we start, I just really want to raise the fact that I haven't really, as an analyst or
as a podcaster or anything like that, really talked about AI that much.
And partly one of the reasons for that is because I am absolutely no expert on the area of AI.
It's one of those things where, with the knowledge I have, I'd probably be dangerous rather than useful. So I've tended to shy away from it until there was an
opportunity to really talk about AI in a practical sense relating to the sort of things that I talk
to day to day. Now, thankfully, between you and I, we have an opportunity to do that. And we're going to talk
today a little bit about sustainability, storage, AI, what the future will be, what we need,
all the issues that come around that. So ultimately, our topic today will be AI, but it will be
with that little storage sort of focus on it. And I thought it might be a good starting point, Shawn,
if you could just explain for the audience
this whole AI boom and where we're at,
because I think it's a good idea
to get a sort of standing point
so everybody sort of gets a baseline
as to where we're at in the industry at the moment.
Yeah, so let me start off by saying
this thing didn't come out of nowhere, right?
I mean, case in point,
my uncle wrote his PhD thesis
on neural networks and expert systems
and artificial intelligence back in 1979.
And so, you know, this thing has been brewing
for quite a while.
I think you could go back in the archives
even way before that in terms of thinking
about what the potential of this kind of technology would be.
But if you think of where we've gone, essentially the consumerization, or the cost, of this level of computing has reached
a point now where we can do a lot of these things that we talked about doing but were never really
affordable. And I'll put that affordability in quotes, Chris, because as we'll talk about later
in the podcast, this is still very, very expensive, but it's now reached
a level where as a service, there are companies who can now embrace this and monetize it.
And we've got the ability from a digital perspective, because we've moved everything
from paper to digital, that we can now actually look at all this digital data we've collected
through modern analytics and data pipelines, and we can start to actually do a lot of these things. But, you know, it would be reckless to start this discussion without saying
that, you know, this is likely a 10 to 15 year journey, and there will be money made along the
way, and there will be billion dollar corporations and trillion dollar corporations that will be
built on the back of it, some of which don't exist today.
But we have to remind ourselves that we are extremely aggressive in terms of what we think
this can do. But actually bringing that to life and working past some of these challenges
are problems and opportunities that will take us 5, 10, even 15 years to bring to market.
Yes, it's an interesting scenario, isn't it? We always see this, and the Gartner hype cycle
is probably the best description of this sort of thing.
We do see a real buzz of technology
when something new comes out,
and AI has probably had that
over about the last 12 months.
And we've seen a massive uptake of everybody
saying that every product they're ever going to ship
has got AI in it now,
specifically generative AI,
so that we can interact with it.
And you do see that hype cycle
sort of peak quite significantly and then fade away. And usually, as you quite rightly
highlighted, that curve does have a lifetime, which could take 10 or 15 years. And the interesting
thing, I think, about that is it's going to be a case that we need to sit down and think about
what the requirements are to deliver the technology over that time. You know, you've highlighted the fact that it's practical to do
it now, but it's A, expensive, and B, very highly demanding in terms of resources. So these
sort of technologies aren't just going to be sitting on your desktop in terms of building
models. You know, we've got data centers that are absolutely enormous, with tens of thousands,
hundreds of thousands of GPUs and petabytes of data being required.
Yeah, I would totally agree.
I mean, if we look at it, there's one slight sort of deviation that I would highlight.
So if we look at other trends, if we look at things like, you know, how did e-commerce come to life?
That took about 15 years. If I had told you, you know, back in the late 90s, that you'd be putting your credit card online, and you'd get a delivery date, you could track your package. And, you know, you'd be able
to order from these entities that had virtually anything, and you could price shop, you would
have said all this is great. But to actually bring the mechanics of that to life took time.
But there are two specific areas that are different about this particular trend.
The first one is that money can't solve everything.
There are actually real hard issues here,
like access to electricity that we'll talk about,
that fundamentally, no matter how much money you have,
even if you wanted to go build nuclear reactors,
that's going to take several years to stand up. And so this energy piece is one.
Regulation, we'll talk about that in a little bit of detail.
That's going to be another one.
And I think the third one is, when we think about what we're building today, we have to
be really careful because we're sitting on the cusp of data growth that's estimated to
be roughly a 30% compound annual growth rate.
That means next year we'll have 30% more data than we had in aggregate this year.
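As a back-of-the-envelope on that compounding, a sketch in Python with an assumed starting size (the 30% rate is from the conversation; the 100 PB figure is not):

    # Compound data growth: each year adds 30% on top of the prior year's total.
    estate_pb = 100.0        # assumed starting estate, in petabytes
    cagr = 0.30              # 30% compound annual growth rate
    for year in range(1, 11):
        estate_pb *= 1 + cagr
        print(f"year {year}: {estate_pb:,.0f} PB")
    # After 10 years the estate is ~13.8x the original (1.3**10 is ~13.79),
    # which is why a replatforming decision gets harder the longer you wait.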
And if we think about what that means, if we don't build
the right foundation and we have to replatform or we have to change routes two, three, four, five,
10 years from now, we're going to be talking about a mountain of data, a mountain of data that would
need to be migrated, potentially integrated into a new platform. And so this one's a little
different, but I do agree with your earlier proposition. Those that launch first with their
services have an opportunity to become the de facto standard, the Xerox per se of the photocopy
era. But if you think about where we are today, those who move first may also have the
largest challenge in migrating to whatever becomes the industry standard foundation of the future for
AI. And we've seen a lot of companies like that. So I mean, the obvious ones, everybody will
probably be thinking of are people at OpenAI. Meta has been right at the front there. Anthropic,
who are in partnership with AWS.
You know, these are the companies that are really sort of building these models and really
starting this transition towards generative AI, at least in the first instance.
And they must already have petabytes worth of data that they're processing on a regular
basis.
Oh, they absolutely do. In fact, I would tell you that in 2023 alone, there were 149 foundation models released. So if you think about what's happening, there is this race to build a model to effectively learn and understand the corpus of the internet, right? And that's one we could definitely unpack at some point, which is, you know, there is so much data out there. And, you know, you've got
all these different models in terms of looking at how to break all this down, and essentially move
from a search model to a query model, where I no longer search the internet,
I actually asked a question and I get an answer. And the promise of that is incredible.
But the reality is, in the enterprise, the corpus of the internet doesn't solve my problem. I actually need to go into my own data. I need to go into volumes that are specific to my discipline. I need to start to look into very specific, I'm going to call it multimodal, content that might be mine, and some that I buy, and I need to start to look at how I bring this context of, you know,
large language models and this query engine to my specific business and my specific industry. And I think that's where we are right now is asking these questions of, you know, ChatGPT
has been very, very cute and capable of, you know, helping me to frame my thinking in a
general sense.
How do I use that and use that modeling to now look
at the data corpus that I have internally, that is not externally
accessible, and how do I allow my employees and my customers and my
suppliers to start to use query engines that are fed from that data source?
Yeah, that's it. That's where we're going to go to next. And I think you're right, because
one of the things that I found with enjoying the fun of using OpenAI and
Anthropic, and actually I use Claude more than I've used some of the others, is it's great fun getting
it to create a poem or tell you a joke or do something like that. But it's not entirely
accurate. When it goes off and searches other external data sources, it tends to have a reasonable
degree of accuracy, but it seems to me that the only real value here will come when I can actually,
if I'm a business, combine, as you said, my data with the technical ability of
the platform to query that data and present it back to me in a usable format. So
there are two things I think I can see. There's the ability to get into my data,
which obviously is RAG, and I guess we can explain that in a second. But secondly, it's that ability
for that model to be accurate enough that when it does get to my data, it gives me something useful.
So would it be right to say that the value of the model is in the accuracy and the ability to query
other data that's going to be in my enterprise?
Yes, I think that's correct. And I would give you another kind of continuum to think about here.
So if we think about AI, I like to say today, we're artificial intelligence, right? So we're
sort of building out this artificial view of the world and this artificial view of, you know,
if we had someone who had learned all the learnings of the world, what would they,
how would they summarize it? I think the big push now is to move from artificial
intelligence to augmented intelligence, which actually says now as a professional, I can look
to this engine as a way to augment my capabilities, augment my productivity, augment my understanding,
and even augment my quality of life by doing things that, frankly, as a human,
I'm not really good at or I'm not really capable of. And then I think from augmented intelligence,
we then move to autonomous intelligence. And this is much further out. But I think that's
when we start to trust this machine and this brain and this co-pilot enough to say, you know,
there are certain tasks you've proven to me that you can do effectively on a repeatable basis. Go ahead and do those. I don't want to
be the bottleneck. And I think that will sort of unlock that last layer of benefit here. But it's
going to take us a while to fully trust that this can be done. No different, by the way, than me
trusting that my iPhone can do a security update while I'm sleeping. And when I wake up, that phone
will be exactly the way it was when I put it on the charger the night before. I sort of, yeah,
I can trust the idea that my iPhone will look great. And when I pick it off the stand, it'll
still be there the next day and it'll still be working. Not sure I'm yet in a position to say
I trust some sort of AI to, for example, drive my car for me and for me to just sit in the
back and go to sleep or something like that. I'm not quite ready for that one yet. And when we were
chatting before we did this recording, I think I might have mentioned that I went to the BMW Museum
in like 1989. And one of the things they highlighted there,
I think probably because the computing power
wasn't good enough at the time,
was that their intelligence was to augment the user
and not to actually take over from the user.
So they showed a heads-up display
in complete darkness in a forest
where infrared was displayed onto the windscreen,
on the internal side of the windscreen,
and the driver could drive as if they could see in the dark.
Now, that was their preferred method rather than have the car take over.
And it sort of strikes me that that would be currently probably the better route.
But as this model evolves and we get more data and we get more advanced, we'll see that change as you've highlighted.
Yeah, I would think there'll be some use cases where we'll see it emerge first.
So security is one, right, where milliseconds and nanoseconds matter.
So if my AI system in security starts to sense
that something is wrong
and something's happening that shouldn't be,
I don't want to wait for a human to get that alert
and say, yes, please lock that user out.
I actually, you know, at some capacity,
I want to start to allow the system to say,
I'm actually going to shut that user out.
There's a risk that that user's doing something legitimate and I'm potentially slowing something down. But based
on the way that AI has interpreted the activity, the safe route is to lock that user out.
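A minimal sketch of that kind of thresholded auto-response; the names and scores here are hypothetical, purely to illustrate the shape of the policy:

    # Hypothetical security auto-response: act without a human when confidence is high.
    LOCKOUT_THRESHOLD = 0.9  # assumed risk score above which we lock first, review later
    ALERT_THRESHOLD = 0.6    # suspicious but not certain: page a human instead

    def handle_event(user: str, risk_score: float) -> str:
        if risk_score >= LOCKOUT_THRESHOLD:
            # Accept the false-positive risk of blocking legitimate work;
            # the safe route is to lock the account immediately.
            return f"locked {user}"
        if risk_score >= ALERT_THRESHOLD:
            return f"alerted analyst about {user}"
        return "no action"

    print(handle_event("cevans", 0.95))  # -> locked cevans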
I think you'll start to see things like that occur a little like to your car analogy,
right? If I'm going head on into a barrier, the car is going to swerve out of the way.
It's not going to ask me, it's not going to say, hey, I think you're doing something wrong. A lot of the cars today will do auto evasive
maneuvers. And that's the first step. I think getting to full autonomous driving, driving
you between cities with you, you know, taking a nap in the back, I'm not saying we won't get there.
I think we'll get there maybe a decade or so down the road, but we're going to take some baby steps and we're going to earn that trust just like we would in any other
aspect of our life before we completely take our hands off the steering wheel and decide that we
are safe to go do something else. It sort of has me in mind of the idea of Clippy popping up and
saying, you appear to be driving into a barrier. Would you like to turn left, turn right? Yeah,
we probably don't want the Clippy approach when it comes to something time-critical.
I can entirely see that, okay? Yeah.
Okay, all right. So let's talk about business then, and enterprise customers, because,
you know, that's what we talk about. And one of the things I'm really interested to tackle
is the whole idea of where these models will be developed going forward, and the idea of security and data sovereignty, not an easy word to say, data
sovereignty around that. I think, you know, initially, of course, the AI companies have all
got access to their own data sources, whether that's legitimate or not, you know, and that's
going through the courts in lots of different respects.
But I just wonder whether enterprises will be really happy about using Gen AI and exposing it to their internal data sources,
especially their crown jewels, if you like, without having some control over how those AI solutions actually access and use their data.
Yeah, it's a great point, Chris.
And that's why I think, you know, for a lot of these organizations, where they've had board-level edicts that say, you know, come back to us and tell us what we're going to do with
AI.
What are we going to tell the market that we're going to do with AI?
How are we going to embrace this new concept?
I think the key thing is really, you know, if you look at it architecturally, yes, the
LLMs are interesting.
You hinted earlier with RAG, the retrieval augmented generation, where I can take internal IP, internal corpus, and I can supplement that.
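As a rough sketch of the RAG pattern being described, with stand-in helpers (a real build would use an actual embedding model, vector index, and LLM client):

    # Retrieval augmented generation, in outline:
    # 1) embed the question, 2) retrieve relevant internal documents,
    # 3) hand them to the model as context alongside the question.

    def embed(text: str) -> list[float]:
        # Placeholder: a real system calls an embedding model here.
        return [float(ord(c)) for c in text.lower()[:8]]

    def retrieve(query_vec: list[float], corpus: dict[str, list[float]], k: int = 1) -> list[str]:
        # Placeholder similarity: real systems use cosine similarity over an index.
        def dist(doc_vec: list[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(query_vec, doc_vec))
        return sorted(corpus, key=lambda name: dist(corpus[name]))[:k]

    corpus = {"hr_policy.txt": embed("holiday entitlement"),
              "design_doc.txt": embed("array firmware design")}
    question = "How much holiday do I get?"
    sources = retrieve(embed(question), corpus)
    prompt = f"Answer using only these sources: {sources}\n\nQuestion: {question}"
    print(prompt)  # this prompt would then go to the LLM of your choice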
But I think there's another element too, which is what external volumes do I need access to?
What external data am I going to purchase? And then how am I going to deliver this
in a way that is going to allow me to not just protect my internal assets, but ensure that I'm
compliant with the upcoming regulation? Because ultimately, at the end of the day, it's not so
much whether or not my data gets exposed. That's a big concern. But it's whether the data that I've
used to train my model is mine to use. And if at any point in the future, should that ownership of that data get questioned,
I could be in a position where I would have to retrain my entire model to deal with things
like California privacy, GDPR, HIPAA, you know, even external companies saying, my volume
of data is no longer in the public domain.
Therefore, it must be removed.
So when you think about training these models, the amount of cycles and amount of money that's
spent training them, and the amount of data that's generated, it's not just one time. There will be
an iterative retraining process throughout, to deal with insertion or appending as well as deletion of key data sources.
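One way that bookkeeping might look, as a sketch; the fields, sources, and licence labels are hypothetical:

    # Track where every training document came from, so a revocation
    # (GDPR erasure, licence withdrawal, a court ruling) maps to a
    # concrete exclusion list for the next training run.
    corpus = [
        {"doc_id": "d1", "source": "internal_wiki", "licence": "owned"},
        {"doc_id": "d2", "source": "vendor_feed",   "licence": "licensed"},
        {"doc_id": "d3", "source": "web_scrape",    "licence": "unverified"},
    ]

    revoked_sources = {"web_scrape"}  # e.g. ownership is now in question

    next_training_set = [d for d in corpus if d["source"] not in revoked_sources]
    print([d["doc_id"] for d in next_training_set])  # -> ['d1', 'd2']
    # A model trained on the old corpus still "remembers" d3,
    # which is exactly why a retrain, not just a filter, is required.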
And that's only one angle of this sort of data and security and governance piece. What we've not factored in is,
you know, how do we ensure this isn't used for evil? How do we take
away the sinister prompting that could come from these tools? How do we eliminate things like deep fakes being used on our tools,
which could potentially bring issues?
How do we ensure that folks can't mastermind more sophisticated security attacks
by looking at what's available on these LLMs
and actually being able to more socially profile employees
and do more sophisticated attacks to gain credentials?
These are all things that we will work through
over the eras of development of AI.
And there's a big cost involved in that
because at this point,
we haven't really talked about the cost too much.
We've sort of highlighted the fact
that it's relatively expensive.
But when you are training, retraining a model,
the more you have to go through that process,
the more cycles you burn and the more time it takes. You want that process to be fairly quick and fairly accurate, using the minimal
resources possible. And I guess one of the examples I would use there is, I remember,
at a Pure Storage event a few years ago, talking about F1 and the number of cycles that are
available to do wind tunnel training and simulations on the car. And they like your technology because you optimize the use of
IOPS, or whatever the technical threshold limit was that you were allowed to use. So
there's got to be a cost and utilization calculation that's done by people here.
Yeah. So let's break that down into two specific areas.
The first is you talked about the cost to train these models. So I find it fascinating that GPT-1,
which most of us didn't interact with, cost roughly $10,000 to train. Not that bad, $10,000.
We could probably find that in discretionary budget. GPT-3 ran into the millions. GPT-4 was 550 million. And GPT-5, or whatever it happens
to emerge with when it emerges, is estimated to be well over a billion. But within GPT-4,
there was 78 million worth of compute. So if you want to go train a general corpus AI today,
you're looking at a minimum 100 million plus investment, right? In fact, Google's most recent
Gemini used $191 million alone in compute just to learn the corpus of what was there.
So these are major, major, major investments. But to your second point about architecture,
yeah, I would tell you that, you know, what's really interesting about the GPU is it's given us the
ability to compute at levels that we never thought possible, probably second only to what will come
with quantum computing at some point in the future. But it's also, you know,
kind of showcased enormous holes in our data platform. Traditional storage solutions just don't get the data to the GPUs
fast enough. And if you think about it, I'll give your audience an analogy. If you think of your
GPUs as PhDs that you've hired, and they're quite expensive, and they sit in the back room,
but they're unbelievably smart at getting through information and finding conclusions.
But if you can't bring them the books fast enough,
or the material fast enough, and you can't allow them to share what they've learned with each other
fast enough, then you're not getting the value of the talent that you're employing. And that's
the challenge today is how do I eke every possible benefit out of my storage platform so that as it grows and as I scale my farm, I can still feed
and actually gain insights across that data set at the same level of efficiency.
That's really interesting, isn't it? Because we spend an awful lot on technology. And I'm going
to use my old person analogy here again that I use all the time and look back at the mainframe world when computers were really expensive.
And when I started work, the environments I worked in,
we ran our mainframe environment at more than 100% utilization permanently,
which everybody might say, well, how could you run that more than 100%?
Well, there was a calculation that said, basically, if everything was running all the time,
every 1% over 100 represented tasks
that were active and ready to be dispatched and processed,
but couldn't because there was a slight lag.
So it was a measure of the sort of latent delay,
if you like, on the system.
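A toy reconstruction of that over-100% arithmetic (my reading of the idea, not the mainframe's actual accounting):

    # Busy time plus work that was ready to dispatch but had to queue,
    # expressed against raw capacity. Anything over 100% is latent demand.
    capacity_secs = 3600       # one hour of CPU capacity
    busy_secs = 3600           # the CPU was busy for the whole hour
    ready_waiting_secs = 180   # tasks ready to run that sat in the queue

    utilization = (busy_secs + ready_waiting_secs) / capacity_secs * 100
    print(f"{utilization:.0f}%")  # -> 105%: each point over 100 is queued demand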
But the issue was more the fact
that the mainframe was running 100% all the time
because it was expensive, and therefore you absolutely maximized it. You pushed things through
the evening, you did them overnight, you made use of all the bandwidth that was
available over the course of the day. And you're implying the same thing here with GPUs: GPUs
are super expensive, and the more you put in, the more you want to make sure they're running at 100%.
But running them at 100% isn't just about putting the GPU and turning it on.
It's about giving the data to run at 100%.
And that's a throughput issue on your storage.
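For a rough feel of that throughput problem, a sketch in which every rate is assumed for illustration (real per-GPU ingest varies enormously by model and pipeline):

    # If each GPU wants data at some sustained rate, storage must supply
    # the aggregate, or the farm idles waiting for reads.
    gpus = 1024
    per_gpu_gbps = 2.0       # assumed sustained read per GPU, GB/s
    storage_gbps = 1500.0    # assumed deliverable storage/network throughput, GB/s

    demand_gbps = gpus * per_gpu_gbps              # 2048 GB/s needed
    fed_fraction = min(1.0, storage_gbps / demand_gbps)
    print(f"GPU farm fed at {fed_fraction:.0%}")   # -> 73%
    # Every point below 100% is an expensive PhD waiting for books.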
It is, right?
I mean, because ultimately, your GPUs are looking for data sets.
They're looking for work packages, right?
And if you look at NVIDIA and CUDA and the compiler in CUDA
that allows you to efficiently distribute the workload into for work packages, right? And if you look at NVIDIA and CUDA and the compiler in CUDA that allows you to efficiently distribute the workload into these work packages, that's great.
But once again, I'm going to need fast storage and I'm going to need fast connectivity in order to deliver that.
Because I can assure you, Chris, if I'm a CFO and someone's come to me and said, I'm going to buy this GPU farm for 100 million, 200 million dollars, I want you to
come back to me every month and show me that this, you know, these crown jewels, so to speak,
that you've acquired are now being used at 100, even greater than 100%. Obviously, I want to look
at the output. But the last thing I want to do is to have these, you know, incredibly expensive assets being underutilized.
These assets, at this point in time, also take a long time to get hold of. So once you've actually got them, you know, you really
do need to exploit them, because they're in short supply, or at least they're on long-term
delivery. So you need to be able to do something in such a way, I think, that can
use whatever you can get hold of as well, to a certain degree.
Well, I think that's the interesting innovative piece, right?
As we think about what we're doing today to process all of this data, we are using GPUs.
And it is largely being centralized either on-prem or in the cloud.
But we're starting to see some really interesting things emerge at the edge, right? I mean, most recently, Tesla said it might use its cars' latent capacity while they're parked to start doing
some of this. Apple's talked about its PCs and its iPhones being able to do some of this.
So I think we will get a lot more distributed in terms of what level of compute we're
doing at the edge, and how we're making best use of latent compute capacity.
Reminds me of my days back in the late 90s,
when we had the Search for Extraterrestrial Intelligence
and we all had that running on our PCs, looking for aliens.
Screen saver.
I think we'll bring that back at some point
and start to say, okay, how can we distribute this?
Not just to manage cost,
but to ultimately use the power that we have in the most effective way possible, since that power in many
cases isn't stored. We have batteries, but we want to use all the power that we have access to.
And I think this concept of how we're going to power this whole thing becomes more and more
important when we start to think about how this fits together.
Yeah, okay.
Well, let's talk about power in a second, but let's just quickly talk about how data centers will have to change in order to make this work.
So, you know, it's certainly not going to be deployed on hard disks.
There's no doubt about that.
But, you know, there's networking challenges, there's storage challenges, there's power
and cooling challenges.
You know, there's a lot of modification that's going to have to come into data centers to make
this stuff really be delivered efficiently.
Yeah, there is a lot.
So let's break that down.
So first of all, the computational level, let's just talk about, you know, the density
is going to be the biggest issue here.
So if you look at the latest Blackwell offerings that NVIDIA has brought to market, we're talking
about power density we haven't seen before, right? So ultimately, you know, if I'm looking at 5,000 watts for a
single chassis, and I'm looking at traditional rack density of 14 to 16 kilowatts, you know, I'm sitting in
a specific area where I'm going to basically hit the wall before my rack is full. You know, I just can't be sitting with a 14 to 16
kilowatt rack and be putting three systems in and I'm full.
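The arithmetic behind "three systems and I'm full", using the figures from the conversation:

    # Power, not floor space, caps the rack.
    rack_budget_w = 16_000   # upper end of a traditional 14-16 kW rack
    chassis_w = 5_000        # one dense GPU chassis, per the figure above

    max_chassis = rack_budget_w // chassis_w
    print(max_chassis)       # -> 3: the rack is 'full' with most slots empty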
14 to 16, though.
I mean, I look at that number, by the way, and think that to me,
from looking back from years ago, seems like a big number anyway,
but it's not anymore.
No, it's not.
It's not anymore.
And when you think about it, in many cases, when we talk about density, the issue is not how much can I fit in my data center, it's how much infrastructure can I put in my rack before my rack is actually full from a density perspective. But then you look at energy consumption, right? There have been estimates suggesting that NVIDIA H100 deployments are going to consume as much power as all of Phoenix
by the end of the year.
So you start to think about, okay, existing power grids,
existing electrical infrastructure.
You look at advanced cooling technologies like liquid cooling.
It's a little scary for those of us that grew up in the data center
and worried about water or the like spreading around the data center.
We'll probably get there. But let's think
about infrastructure scale and capacity, right? 2.1 gigawatts of data center capacity were just
sold in the last 90 days to support these projects. That's enough power for a million and a half
homes. And so there's a bit of a rush to go and acquire this grandfathered space that has dense
access to power.
The alternative, of course, being what Microsoft and others have done,
which is to buy nuclear power plants for the sole purpose of powering their infrastructure.
You know, then we've got geographical distribution and latency.
You know, if you think about financials or automotive, to your point about self-driving cars, right?
The latency is going to be critical here. We can't have milliseconds of delay between cars talking to each other about where they are
on the road. Conversely, we can't have trading systems that are managing
pension plans miss out on an opportunity, or take a loss unintentionally, because of millisecond delays. Then we're going to have
to talk about sustainability, regulatory compliance. There will be a lot in this space around environmental and data sovereignty. And the piece I don't want
to forget, Chris, is jobs. Our jobs as IT administrators, as infrastructure experts,
will still be here. But the skills, the people, the process is going to dramatically change as we embrace this concept of co-pilots,
we embrace this concept of new architectures for driving AI, and we start to think about what are
our jobs going to look like when it's man, woman plus machine, as opposed to the world we work in
today, which is largely full-time equivalent based. Do you think, though, that that changes anything
that we haven't seen previously?
I mean, if you look at the days before the spreadsheet,
which, from my recollection, would have been the early 80s.
I think I seem to remember some of the early spreadsheet technology
coming in around 1981, maybe, around the IBM PC.
And I certainly remember when I was at university,
somebody was actually working
on a spreadsheet tool,
a story for another day.
But obviously there was a time
before spreadsheets.
So when we had that situation,
there was manual calculation
and there was all the rest of it.
There was time before calculators.
There was time before
all of that sort of stuff
when people used, gosh,
books with logarithms,
log tables, which I remember.
That's right.
Having actually had one and actually had to use them.
So that augmentation happens all the time as we see new technologies come along.
I just wonder if this augmentation is going to be any different to any of the other things that we've seen in history.
Yeah, it'll be different because we have the ability to go deeper and wider in terms of what we have access to.
And I also think we have the tools at our disposal,
like these phones that we all carry around
and potentially goggles or contact lenses
or smart screens or whatever it is
that's sitting in the windscreen of our cars
that will allow us to make use of this information
in a much more real-time way.
And so you see, it's kind of all boiled through to where we are today.
We tend to build on what has come before us.
Without the internet, we couldn't have phones.
Without phones, we wouldn't have effective e-commerce the way we see it today.
Without effective e-commerce, we wouldn't have this thriving ecosystem and platform effect.
Now we're going to use the same platform effect of everything we've built to actually drive this augmented capability, this co-pilot, that will help us both
in our personal and professional lives.
So much of this has been driven by the demand for storage
and the throughput and the rest of it. There's obviously the data center changes we
just mentioned. But where are the threats here? I mean, where are going to be the
problems that we're going to encounter, other than the obvious ones about powering racks and, you
know, getting power into the data center? There's, I think, a fundamental level here of technological
challenge.
There is. And, you know, it reminds me a lot, Chris, of what we saw with HPC, or high
performance computing maybe 15 years ago. And we saw a lot of customers going out there and doing science projects,
buying pieces and components of particular hardware,
a little like you and I built our first PCs back in the day.
And then we realized that, you know,
the components that we put in those PCs maybe didn't have as long a life.
They were difficult to replace.
They caused us to have to rebuild every year.
We saw the same challenges in HPC, right?
I mean, a lot of it was software defined.
A lot of it was open source.
Some of it was vendors who had kind of come up
with a particular proprietary technology
or science project and they brought it in.
And while it delivered during that POC
or minimally viable product,
as soon as it got to scale, it broke.
And it broke because of the operational complexity.
It broke because of the overhead and the energy cost. And the reality was it just couldn't scale.
And so while some of those HPC projects still exist today, many of them evolved back to
infrastructure models that could deliver long life and performance and simplicity at scale.
I think it's a big piece of it.
Because if you think about it today, if I'm building a new arm of my business and I'm
using AI to power it, as soon as I prove my thesis, the business is going to want to rush
into that as quickly as possible.
So we're going to go from a factor of 1 to 10 to 100 to 1,000 to 10,000. And if I haven't thought through
what is this thing going to look like at scale? How am I going to manage it at scale? Is it going
to be a reliable platform for the next decade, which is the timeframe I think we should be
thinking about this, then I could end up actually crippling my organization two, three, five years
down the road, when I realize the initial platform
I chose to build this on is no longer viable and now I need to re-platform. And so this is a big,
big piece of the conversations that we're having with customers today. If we look at what Pure
has done with flash, I look at the discussions I'm in, right? We're now sitting at a particular
point in time. Is the future going to be flash?
Yes.
Am I going to need something that's energy efficient?
Yes.
Am I going to need something that spans performance all the way down to archive capacity?
Yes.
Am I going to need to consume it as a service?
Yes.
Am I going to need to have a vendor that I can rely on that will be in business in the
next decade?
Yes. And am I going to have the manpower and the skills to support it? And so those are the things that
unfortunately don't always get looked at in the same light. Today, most of what we hear about
with AI is bigger, brawnier, stronger, more powerful. But what about when your project
actually works? What about when this
becomes the next big thing? Will you be in a position to actually support it and deliver it
out to your customers?
I was thinking about how that translates when you use the PC analogy. I quite like that idea. So I wouldn't like to think how many PCs I built over the years.
And you're right, you'd buy something, you'd think, oh, well, I'm going to put that graphics card in
because that one's the current state-of-the-art graphics card,
or almost, yeah, I'll go for almost the top one.
I'm going to go for whatever particular motherboard
I can afford, and memory, and all the rest of it.
And then you put a bit of software
and you find that the driver didn't work.
And then you're trying to find,
especially when you were looking at the early days of Linux,
you're trying to find a driver you can just sort of shoehorn in
and then, you know, with certain versions of Windows, you would try and use a driver you thought might
work, and inevitably that might work for you, but that was absolutely not scalable. And then, before
you know it, manufacturers have brought out machines that are completely sealed, like the
Apple ecosystem, where you don't get to touch anything inside, where the software is built to work with the hardware, where the ecosystem is
built around that combination of the two working together. And then you see where scalability can
come from. So, you know, it's pretty easy to see, if you imagine today's AI models as sort of self-built PCs, where we need to get to.
Yeah. And we have to resist the temptation to grab these shiny red toys that, you know,
appear to have come off the line with a custom built solution to solve this problem and really
challenge their architectures and challenge their scale. Because what I'd tell you, Chris,
is if you start to look at a lot of this open source stuff you just described, it's exciting.
But with the current security posture we have,
I'm not sure that putting Linux boxes on my floor
in my storage architecture
with a custom distribution of Linux
is going to satisfy my security people
who are going to want to look into every iteration
of that
core operating system, as well as what components are in it to ensure that it satisfies the
gold image standards of that organization.
I think I'm not too keen on the idea of putting something on the floor that might break when
I've spent 50 or 100 million dollars, and suddenly it's the thing delivering the data into an
infrastructure which is now sitting idle. You know, that hierarchy of dependency becomes
even more critical now, because 50 or 100 million dollars worth of GPU is not working. If
you're, what's the American expression, nickel-and-diming on the actual storage piece just because
you think that's the better way of delivering something cheaply at scale,
then really you've done yourself a disservice.
Well, remember what we talked about earlier,
right? It's the density of energy. So when you and I were growing up with, you know, one gig drives, and now we're thinking today, from Pure's point of view of 75 and 150 and 300
terabyte drives, we're looking at the energy profile and we're saying, wow, I could actually scale my data 10X
without having to retrofit or expand my energy footprint
or my data center footprint.
And we're super excited about that
because frankly, we believe that we can unlock
this level of density five, seven, 10 years
before traditional SSDs can.
And it's easy to think about storage as just a,
like you said, a commodity cost.
But the reality is you will run out of space
and you will run out of power
before you have satisfied the data needs
of your organization.
We've been focused on this problem for 15 years.
And now that we're at the scale,
we're talking about 300 terabytes
in a single DirectFlash Module.
This absolutely changes the game. And when we look at AI and the density that will be required and essentially freeing you up the energy so you can put these GPUs into your environment without running out of power, that's incredibly exciting for us.
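A rough sense of that density argument in numbers; the per-drive wattage and shelf power budget are assumed, and only the capacities echo the conversation:

    # Same power envelope, two media generations.
    shelf_power_w = 1_000    # assumed power budget for a storage shelf
    drive_w = 20             # assumed watts per drive, held equal for both

    tb_old, tb_new = 30, 300           # commodity SSD vs a 300 TB module
    drives = shelf_power_w // drive_w  # 50 drives either way

    print(drives * tb_old, "TB")   # -> 1500 TB
    print(drives * tb_new, "TB")   # -> 15000 TB: 10x the data, same power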
I think that's a good point to sort of move into your technology and what you're doing
for this, because understanding, I think, some of these challenges, just as a side note, by the way,
I saw something today that I was reading from another analyst that was talking about how
there was suddenly this demand for mega-capacity SSDs. And it made me think, well, yeah,
that demand's actually been around for quite a while. Possibly the industry just hasn't picked up on it, or has chosen to ignore it, because
technically they found it difficult to deliver those products.
So it's...
Well, but Chris, let's also keep in mind commodity SSD manufacturers, the majority of their volume
is in the one to two terabyte range.
That's for desktops and notebooks.
And there's a massive amount of volume there. And it's quite good business for them. So there aren't a lot of consumers that
are going to move from a two terabyte drive to a 50 terabyte drive for their home computer.
And so when you think about the economic models on which most of the enterprise storage providers
are relying, they're playing in the long tail, right? So you really have to ask yourself,
if you're running the businesses of producing SSD drives, how much engineering are you going
to invest in a product that makes up three, four, five, 10% of your overall volume versus focusing
on really making sure you don't erode any of your market share in the one to two terabyte space?
Okay. All right. Fair enough. That's a fair comment.
Okay, then.
Let's just go back over that whole capacity side of things then
and exactly how that's going to be driven
because there's got to be a number of components in here.
Okay, scaling to 300 terabytes is one,
but efficiently operating at that level
isn't just going to be done by hardware alone.
There's software involved, and essentially that's part of the scale story as far as I can
see, in that unless this software can manage the infrastructure efficiently, it doesn't matter
whether you can put 300 terabyte drives in. You have to be able to manage them.
Yeah. So I would frame it like this, Chris, as I go into a little bit of detail here: fundamentally, Pure's strategy is to drive
efficiency across all flash, everywhere. If you look at what Purity is, our software, it is
essentially the operating system for flash. It is the operating system for flash that drives maximum
efficiency. Today, that's delivered via DFMs. Today, those DFMs go inside FlashArray and FlashBlade
and deliver the proposition that
we deliver to the market. But ultimately, if you look at what we have done and what we are doing,
Pure continues to engineer Flash software as high technology across block, file, and object.
This is not something we buy. It's not something we borrow. It's not something we rent. We spend the majority of our engineering on ensuring
that our flash software and our flash operating system is the best in the world in terms of
driving efficiency. We're the only provider able to look at that efficiency across the
full life of the IO. So all the way from the controller to the enclosure to the individual DFM to the individual cell of NAND.
Nobody else can do that.
We also have no flash translation layer.
Who cares?
Well, we care because the flash translation layer doesn't just impact performance.
The flash translation layer consumes a ton of DRAM, which consumes a ton of power,
which consumes a ton of energy. And when my competitors
are looking at one gigabyte of memory for every terabyte of storage, and I don't have that
requirement, I can drive density into the same footprint significantly faster. By the way,
less DRAM also gives me much higher reliability. And so when you think of what we did with Purity
in terms of really building the most
efficient path to flash, right, really bringing the highest level of reliability to QLC,
that has now allowed us to go and purchase raw NAND and engineer
a density roadmap for media, for DFMs, that is exponentially faster than anything in the industry. We think this is super, super important.
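To size that FTL overhead: the one-gigabyte-of-DRAM-per-terabyte ratio comes from the conversation; the drive count is assumed:

    # Conventional SSD flash translation layer: ~1 GB of DRAM per TB of flash.
    dram_gb_per_tb = 1
    drive_tb = 300       # one 300 TB module
    shelf_drives = 20    # assumed drives in a shelf

    dram_per_drive_gb = drive_tb * dram_gb_per_tb
    print(dram_per_drive_gb, "GB DRAM per drive")          # -> 300 GB
    print(shelf_drives * dram_per_drive_gb / 1000, "TB")   # -> 6.0 TB of DRAM per shelf
    # Removing the FTL removes that DRAM, its power draw, and its failure modes.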
Now, it does beg the question of, does the DFM then become the
building block for other storage solutions, or even hyperscalers, or even beyond? And I think,
ultimately, if you think of where this is going, if the industry will struggle to produce SSDs beyond 30 or 60 terabytes, and Pure's got a solution at 300, I do think that becomes a real possibility.
One of the things that I've found really interesting, probably over a little more than the last two years, is how much engineering
is going back into hardware
to deliver what is required
for a whole range of different technologies.
So obviously we've seen ARM come back up again
and be treated as an efficient solution
for certain types of workloads
compared to using, say, Intel and AMD x86 processors.
We've seen rapid development of the GPU
and the way that the GPU is being used,
but we've also seen the hybrid super,
what do they call them, super chips that NVIDIA have built,
where they've taken ARM and they've merged it together
with the GPUs to build more complex systems on chip
that are allowing them to not
necessarily just bolt together components, like we would have done in the old days when we
built our PCs by hand, but actually say, well, how do we bring all of this stuff together
in a more cohesive and more holistic way that actually delivers what the requirement is? And
it sounds, just as you're saying, like by having Purity managing
a physical layer of, let's just call it flash, full stop,
you've got the ability to start building in that more holistic approach
to actually delivering your storage infrastructure.
Absolutely.
So first of all, if you go down,
and I hope some of your listeners have visited our corporate campus,
Santa Clara, we're a stone's throw from NVIDIA.
They're obviously a very close partner of ours,
and we are in constant discussions with them of how do we feed storage faster to NVIDIA.
You can talk about OVX.
You can talk about SuperPod.
You can talk about BasePod.
There are all sorts of discussions around how to ensure that the goodness that NVIDIA has built in CUDA, to
effectively drive efficiency of GPU utilization, is mirrored with Purity, to drive the efficiency
of the utilization of flash. Because ultimately, if I build the most efficient solution, then it will
essentially become more affordable and more viable for the market, which will essentially accelerate the roadmap and the time to market for these solutions.
So the data issue, we're very focused on solving that.
We're very focused on making sure that we deliver the most efficient path to storage, to flash, period.
While NVIDIA is focused on making sure that they can crunch through that data
as efficiently as possible. And I think the collection of both of those will bring a
collective goodness to the market in terms of making all of this possible years earlier than
it otherwise would be. But more importantly, making it viable to operate in the long term.
So one of the things that, I guess, this leads us on to, Shawn, is this discussion,
really, about how the hyperscalers are going to deal with this. Because, you know, at the very top,
we sort of talked about where we thought a lot of this AI stuff would be delivered. Is it going to
be on-prem? Is it going to be in the cloud? Cloud's obviously going to have a massive impact on this.
And over the last, I would say, five or ten years, let's call it five years,
we've seen an evolution of some of the background technology
the hyperscalers have got for storage.
They've added in NVMe drives and some other things,
but I wouldn't say they're necessarily the fastest to get to the market
with new technology there.
It tends to sort of be a particular product.
So where do you see those
companies going and how do you think they're going to deal with this, especially with reference to
efficiency when we know efficiency is a big thing on their minds? Well, Chris, if you look at where
these guys originated, they all started with white boxes and they all started with as cheap
technology as they could possibly get, thinking that that was the path to the lowest price to the customer. And then you've seen all the hyperscalers, specifically Amazon,
actually acquire technology to make their infrastructure smarter. And it's largely
turned proprietary. If you look at what's happened with Annapurna Labs, if you look at a lot of
other acquisitions, it's all about, how do I drive efficiency for VM workloads in that particular case? The next path to efficiency for
these clouds is Flash. And they know it, right? They know that ultimately, if they are going to
continue to offer more dense storage, which will allow them to offer multiples of data services
in the same footprint, they are going to need to move to flash.
And, you know, if you look at where the hard drive business is today, the nearline hard drive,
50, 60, 70% of it is all being sold to the hyperscalers. You can see that in the HDD manufacturers' latest earnings. 50% of their nearline hard drives went to hyperscalers.
Over the next couple of years, you're going to see that dramatically shift to Flash.
And once again, you know, their choice will be: do I buy the cheapest commoditized flash,
or do I actually look to bring my software layer, which is already the maximum
essence of efficiency, and sort of marry that with the right flash model? And I think there's
tremendous synergy between what Pure has done to date with Purity and our ability to really drive that energy efficiency, that operational efficiency, that reliability, that scale, that longevity into this space. For the hyperscalers, it represents their next multiple of efficiency and a huge opportunity to not just
monetize AI, but actually drive an overall more efficient operation across their data center.
I'm interested to see how that one's going to play out, because looking at the rest of
the industry, you see a very definite sort of evolution where flash has bit by bit carved off chunks of the hard drive
market. And at the bottom end, you could also look at it and say it would be
very easy to put a lot of your archive data onto things like tape media, and you can front-end
that with technology that allows you to get to it relatively easily. So there's almost been
a carving out of the top
end and the bottom end for hard drives, and what's left seems to be in the middle.
And again, you know, looking at the hyperscalers, it would seem that they would have the same,
or not the same problem, but probably the same journey, where they'll see more and more stuff at
the higher end being translated over, or moved over, to flash as the cost economics become more practical, and what
they can't, they'll push to the bottom end.
100%. So I think tape still has quite a long life ahead of
it. I think disk moves to flash, and I think it moves on the basis of the cost gap today between
HDD and SSD. If you start to look at it, the majority of the cost of flash is in the controller.
As the drives move from 52 terabytes to 75 to 150 to 300 terabytes and even beyond,
you're now putting that controller cost over a much larger set of data, which in essence will
push the cost per terabyte down. It's only a matter of time. I mean, we've openly said that
we believe no net new hard drive will be sold
to the enterprise beyond 2028. And we're sticking to our guns on that. I mean, everything seems to
be lining up, and the density roadmap is definitely a tailwind for us. And we're excited to share more
about that, as well as everything else we're doing. I'm hoping that your listeners will attend
our Accelerate conference next month in Las Vegas.
I'll be there. Our CEO will be there. Our founder will be there.
And we'll be talking a lot more about what we see in terms of the future of data and the future of data to support AI at that time.
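To make the controller-amortization arithmetic concrete, a sketch; the capacities echo the conversation, while the dollar figures are assumed for illustration:

    # A drive's fixed cost (controller, packaging) is divided over more
    # terabytes as capacities grow, pulling the $/TB figure down.
    fixed_cost = 400.0        # assumed per-drive controller/overhead cost, $
    nand_per_tb = 15.0        # assumed raw NAND cost, $/TB

    for capacity_tb in (52, 75, 150, 300):
        total = fixed_cost + nand_per_tb * capacity_tb
        print(f"{capacity_tb:>3} TB: ${total / capacity_tb:,.2f}/TB")
    # The fixed slice falls from ~$7.69/TB at 52 TB to ~$1.33/TB at 300 TB,
    # which is how bigger drives push flash $/TB toward nearline HDD.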
I was going to ask you a little bit about the future, but of course, you know, asking about futures is always a tricky one, because there's a balance between wanting to tease us with a little bit of something and actually not being able to talk about it at all as a public company. So ultimately, I would recommend that
people, if they can, get to the event. I'm guessing some of it, like the keynote, will probably
be streamed online, maybe. So, you know, if you can't make it in person,
at least you'll have the ability to at least see the keynote,
which is definitely going to be worth following,
I think, if you're in this sort of market.
Yeah, we're extremely excited about all of what we have announced
over the last year.
If you think about where Pure started as a product
and then evolved into a portfolio,
you'll see us now talking much more about a data
platform. And ultimately what that means is that Purity, the operating system that drives efficiency
for all flash, really becomes the platform on which many of these workloads are built, right?
Whether they sit at the core, they sit at the cloud, they sit at the edge. And when you take
that platform and now you're able to deliver it truly as a service
through what we call Evergreen One, but you deliver it as a service via SLAs, I think it
lines up perfectly to what the market's going to be needing, not just for AI, but for all the
workloads that come up over the next few years.
Great, Shawn, it's been really interesting to get that discussion on AI going, at least as an initial discussion, and to start getting people thinking
about what the challenges might be
for their storage infrastructure.
We'll make sure we put some links into Accelerate
and all the other stuff we've talked about
in our show notes.
But for now, it's been great to catch up with you
and look forward to learning a little bit more
once we've got another opportunity to chat.
Thank you, Chris.
You've been listening to Storage Unpacked.
For show notes and more, subscribe at storageunpacked.com.
Follow us on Twitter at Storage Unpacked
or join our LinkedIn group by searching for Storage Unpacked Podcast.
You can find us on all good podcatchers,
including Apple Podcasts, Google Podcasts, and Spotify.
Thanks for listening.