Storage Unpacked Podcast - Storage Unpacked 259 – Sustainable Storage in the World of AI with Shawn Rosemarin (Sponsored)
Episode Date: May 31, 2024
In this episode, Chris discusses the topic of building sustainable storage solutions with Shawn Rosemarin, Global VP of Customer Engineering at Pure Storage....
Transcript
This is Storage Unpacked. Subscribe at StorageUnpacked.com.
This is Chris Evans and I'm here with Shawn Rosemarin from Pure Storage.
Shawn, how are you doing?
I'm great, thank you. The sun is shining and it's a beautiful May, what can I say?
Lovely. Canada, yeah, you've got the weather.
You're lucky.
For this moment, although in the Pacific Northwest,
if you don't like the weather or you like the weather,
just wait 10 minutes.
It's likely to change.
And something else will come along.
Yeah, perfect.
Okay, and before we go any further,
obviously you've been on the podcast
and we've chatted in the past before,
but it would be good if you could just take 10 seconds
and tell people what your current job title within Pure Storage is.
Yeah, so I have the illustrious privilege of working for our founder, John Colgrove,
within the office of the CTO. I look after a function of our business called customer
engineering. So I'm the global vice president of customer engineering. And what that means is
I spend the majority of my time talking to our largest customers and largest prospects, understanding how they're using data,
what their challenges are in data, how they're leveraging our platforms, how they're leveraging
the ecosystem, and really kind of act as that bridge back to our engineering and research
development organization to ensure that what we're focused on is completely calibrated to
what our customers are most focused
on. So incredible position and, you know, lots of learnings on a day-to-day basis.
Excellent. Okay, now let's just get into the meat of what we're going to talk about today.
And before we start, I just really want to raise the fact that I haven't really, as an analyst or
as a podcaster or anything like that, really talked about AI that much.
And partly one of the reasons for that is because I am absolutely no expert on the area of AI.
It's one of those things where, with the knowledge I have, I'd probably be dangerous rather than useful. So I've tended to shy away from it until there was an
opportunity to really talk about AI in a practical sense relating to the sort of things that I talk
to day to day. Now, thankfully, between you and I, we have an opportunity to do that. And we're going to talk
today a little bit about sustainability, storage, AI, what the future will be, what we need,
all the issues that come around that. So ultimately, our topic today will be AI, but it will be
with that little storage sort of focus on it. And I thought it might be a good starting point, Shawn,
if you could just explain for the audience
this whole AI boom and where we're at,
because I think it's a good idea
to get a sort of standing point
so everybody sort of gets a baseline
as to where we're at in the industry at the moment.
Yeah, so let me start off by saying
this thing didn't come out of nowhere, right?
I mean, case in point,
my uncle wrote his PhD thesis
on neural networks and expert systems
and artificial intelligence back in 1979.
And so, you know, this thing has been brewing
for quite a while.
I think you could go back in the archives
even way before that in terms of thinking
about what the potential of this kind of technology would be.
But if you think of where we've gone, essentially the consumerization, or the cost, of this level of computing has reached
a point now where we can do a lot of these things that we talked about doing but were never really
affordable. And I'll put that affordability in quotes, Chris, because as we'll talk about later
in the podcast, this is still very, very expensive, but it's now reached
a level where as a service, there are companies who can now embrace this and monetize it.
And we've got the ability from a digital perspective, because we've moved everything
from paper to digital, that we can now actually look at all this digital data we've collected
through modern analytics and data pipelines, and we can start to actually do a lot of these things. But, you know, it would be reckless to start this discussion without saying
that, you know, this is likely a 10 to 15 year journey, and there will be money made along the
way, and there will be billion dollar corporations and trillion dollar corporations that will be
built on the back of it, some of which don't exist today.
But we have to remind ourselves that we are extremely aggressive in terms of what we think
this can do. But actually bringing that to life and working past some of these challenges
are problems and opportunities that will take us 5, 10, even 15 years to bring to market.
Yes, it's an interesting scenario, isn't it? We always see this, and the Gartner hype cycle
is probably the best description of this sort of thing.
We do see a real buzz of technology
when something new comes out,
and AI has probably had that
over about the last 12 months.
And we've seen a massive uptake of everybody
saying that every product they're ever going to ship
has got AI in it now,
specifically generative AI,
so that we can interact with it.
And you do see that hype cycle
sort of peak quite significantly and then fade away. And usually, as you quite rightly
highlighted, that curve does have a lifetime, which could take 10 or 15 years. And the interesting
thing, I think, about that is it's going to be a case that we need to sit down and think about
what the requirements are to deliver the technology over that time. You know, you've highlighted the fact that it's practical to do
it now, but it's A, expensive, and B, very highly demanding in terms of resources. So these
sort of technologies aren't just going to be sitting on your desktop in terms of building
models. You know, we've got data centers that are absolutely enormous, with tens of thousands,
hundreds of thousands of GPUs and petabytes of data being required.
Yeah, I would totally agree.
I mean, if we look at it, there's one slight sort of deviation that I would highlight.
So if we look at other trends, if we look at things like, you know, how did e-commerce come to life?
That took about 15 years. If I had told you, you know, back in the late 90s, that you'd be putting your credit card online, and you'd get a delivery date, you could track your package. And, you know, you'd be able
to order from these entities that had virtually anything, and you could price shop, you would
have said all this is great. But to actually bring the mechanics of that to life took time.
But there are two specific areas that are different about this particular trend.
The first one is that money can't solve everything.
There are actually real hard issues here,
like access to electricity that we'll talk about,
that fundamentally, no matter how much money you have,
even if you wanted to go build nuclear reactors,
that's going to take several years to stand up. And so this energy piece is one.
Regulation, we'll talk about that in a little bit of detail.
That's going to be another one.
And I think the third one is, when we think about what we're building today, we have to
be really careful because we're sitting on the cusp of data growth that's estimated to
be roughly a 30% compound annual growth rate.
That means next year we'll have 30% more data than we had in aggregate this year.
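As a back-of-the-envelope on that compounding, a sketch in Python with an assumed starting size (the 30% rate is from the conversation; the 100 PB figure is not):

    # Compound data growth: each year adds 30% on top of the prior year's total.
    estate_pb = 100.0        # assumed starting estate, in petabytes
    cagr = 0.30              # 30% compound annual growth rate
    for year in range(1, 11):
        estate_pb *= 1 + cagr
        print(f"year {year}: {estate_pb:,.0f} PB")
    # After 10 years the estate is ~13.8x the original (1.3**10 is ~13.79),
    # which is why a replatforming decision gets harder the longer you wait.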
And if we think about what that means, if we don't build
the right foundation and we have to replatform or we have to change routes two, three, four, five,
10 years from now, we're going to be talking about a mountain of data, a mountain of data that would
need to be migrated, potentially integrated into a new platform. And so this one's a little
different, but I do agree with your earlier proposition. Those that launch first with their
services have an opportunity to become the de facto standard, the Xerox per se of the photocopy
era. But if you think about where we are today, those who move first may also have the
largest challenge in migrating to whatever becomes the industry standard foundation of the future for
AI. And we've seen a lot of companies like that. So I mean, the obvious ones, everybody will
probably be thinking of are people at OpenAI. Meta has been right at the front there. Anthropic,
who are in partnership with AWS.
You know, these are the companies that are really sort of building these models and really
starting this transition towards generative AI, at least in the first instance.
And they must already have petabytes worth of data that they're processing on a regular
basis.
Oh, they absolutely do. In fact, I would tell you that in 2023 alone, there were 149 foundation models released. So if you think about what's happening, there is this race to build a model to effectively learn and understand the corpus of the internet, right? And that's one we could definitely unpack at some point, which is, you know, there is so much data out there. And, you know, you've got
all these different models in terms of looking at how to break all this down, and essentially move
from a search model to a query model, where I no longer search the internet,
I actually asked a question and I get an answer. And the promise of that is incredible.
But the reality is, in the enterprise, the corpus of the internet doesn't solve my problem. I actually need to go into my own data. I need to go into volumes that are specific to my discipline. I need to start to look into very specific, I'm going to call it multimodal, content that might be mine, and some that I buy, and I need to start to look at how I bring this context of, you know,
large language models and this query engine to my specific business and my specific industry. And I think that's where we are right now is asking these questions of, you know, ChatGPT
has been very, very cute and capable of, you know, helping me to frame my thinking in a
general sense.
How do I use that and use that modeling to now look
at the data corpus that I have internally, that is not externally
accessible, and how do I allow my employees and my customers and my
suppliers to start to use query engines that are fed from that data source?
Yeah, that's it. That's where we're going to go to next. And I think you're right, because
one of the things that I found with enjoying the fun of using OpenAI and
Anthropic, and actually I use Claude more than I've used some of the others, is it's great fun getting
it to create a poem or tell you a joke or do something like that. But it's not entirely
accurate. When it goes off and searches other external data sources, it tends to have a reasonable
degree of accuracy, but it seems to me that the only real value here will come when I can actually,
if I'm a business, combine, as you said, my data with the technical ability of
the platform to query that data and present it back to me in a usable format. So
there are two things I think I can see. There's the ability to get into my data,
which obviously is RAG, and I guess we can explain that in a second. But secondly, it's that ability
for that model to be accurate enough that when it does get to my data, it gives me something useful.
So would it be right to say that the value of the model is in the accuracy and the ability to query
other data that's going to be in my enterprise?
Yes, I think that's correct. And I would give you another kind of continuum to think about here.
So if we think about AI, I like to say today, we're artificial intelligence, right? So we're
sort of building out this artificial view of the world and this artificial view of, you know,
if we had someone who had learned all the learnings of the world, what would they,
how would they summarize it? I think the big push now is to move from artificial
intelligence to augmented intelligence, which actually says now as a professional, I can look
to this engine as a way to augment my capabilities, augment my productivity, augment my understanding,
and even augment my quality of life by doing things that, frankly, as a human,
I'm not really good at or I'm not really capable of. And then I think from augmented intelligence,
we then move to autonomous intelligence. And this is much further out. But I think that's
when we start to trust this machine and this brain and this co-pilot enough to say, you know,
there are certain tasks you've proven to me that you can do effectively on a repeatable basis. Go ahead and do those. I don't want to
be the bottleneck. And I think that will sort of unlock that last layer of benefit here. But it's
going to take us a while to fully trust that this can be done. No different, by the way, than me
trusting that my iPhone can do a security update while I'm sleeping. And when I wake up, that phone
will be exactly the way it was when I put it on the charger the night before. I sort of, yeah,
I can trust the idea that my iPhone will look great. And when I pick it off the stand, it'll
still be there the next day and it'll still be working. Not sure I'm yet in a position to say
I trust some sort of AI to, for example, drive my car for me and for me to just sit in the
back and go to sleep or something like that. I'm not quite ready for that one yet. And when we were
chatting before we did this recording, I think I might have mentioned that I went to the BMW Museum
in like 1989. And one of the things they highlighted there,
I think probably because the computing power
wasn't good enough at the time,
was that their intelligence was to augment the user
and not to actually take over from the user.
So they showed a heads-up display
in complete darkness in a forest
where infrared was displayed onto the windscreen,
on the internal side of the windscreen,
and the driver could drive as if they could see in the dark.
Now, that was their preferred method rather than have the car take over.
And it sort of strikes me that that would be currently probably the better route.
But as this model evolves and we get more data and we get more advanced, we'll see that change as you've highlighted.
Yeah, I would think there'll be some use cases where we'll see it emerge first.
So security is one, right, where milliseconds and nanoseconds matter.
So if my AI system in security starts to sense
that something is wrong
and something's happening that shouldn't be,
I don't want to wait for a human to get that alert
and say, yes, please lock that user out.
I actually, you know, at some capacity,
I want to start to allow the system to say,
I'm actually going to shut that user out.
There's a risk that that user's doing something legitimate and I'm potentially slowing something down. But based
on the way that AI has interpreted the activity, the safe route is to lock that user out.
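A minimal sketch of that kind of thresholded auto-response; the names and scores here are hypothetical, purely to illustrate the shape of the policy:

    # Hypothetical security auto-response: act without a human when confidence is high.
    LOCKOUT_THRESHOLD = 0.9  # assumed risk score above which we lock first, review later
    ALERT_THRESHOLD = 0.6    # suspicious but not certain: page a human instead

    def handle_event(user: str, risk_score: float) -> str:
        if risk_score >= LOCKOUT_THRESHOLD:
            # Accept the false-positive risk of blocking legitimate work;
            # the safe route is to lock the account immediately.
            return f"locked {user}"
        if risk_score >= ALERT_THRESHOLD:
            return f"alerted analyst about {user}"
        return "no action"

    print(handle_event("cevans", 0.95))  # -> locked cevans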
I think you'll start to see things like that occur a little like to your car analogy,
right? If I'm going head on into a barrier, the car is going to swerve out of the way.
It's not going to ask me, it's not going to say, hey, I think you're doing something wrong. A lot of the cars today will do auto evasive
maneuvers. And that's the first step. I think getting to full autonomous driving, driving
you between cities with you, you know, taking a nap in the back, I'm not saying we won't get there.
I think we'll get there maybe a decade or so down the road, but we're going to take some baby steps and we're going to earn that trust just like we would in any other
aspect of our life before we completely take our hands off the steering wheel and decide that we
are safe to go do something else. It sort of has me in mind of the idea of Clippy popping up and
saying, you appear to be driving into a barrier. Would you like to turn left, turn right? Yeah,
we probably don't want the Clippy approach when it comes to something time-critical.
I can entirely see that, okay? Yeah.
Okay, all right. So let's talk about business then, and enterprise customers, because,
you know, that's what we talk about. And one of the things I'm really interested to tackle
is the whole idea of where these models will be developed going forward, and the idea of security and data sovereignty, not an easy word to say, data
sovereignty around that. I think, you know, initially, of course, the AI companies have all
got access to their own data sources, whether that's legitimate or not, you know, and that's
going through the courts in lots of different respects.
But I just wonder whether enterprises will be really happy about using Gen AI and exposing it to their internal data sources,
especially their crown jewels, if you like, without having some control over how those AI solutions actually access and use their data.
Yeah, it's a great point, Chris.
And that's why I think, you know, for a lot of these organizations, where they've had board-level edicts that say, you know, come back to us and tell us what we're going to do with
AI.
What are we going to tell the market that we're going to do with AI?
How are we going to embrace this new concept?
I think the key thing is really, you know, if you look at it architecturally, yes, the
LLMs are interesting.
You hinted earlier with RAG, the retrieval augmented generation, where I can take internal IP, internal corpus, and I can supplement that.
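As a rough sketch of the RAG pattern being described, with stand-in helpers (a real build would use an actual embedding model, vector index, and LLM client):

    # Retrieval augmented generation, in outline:
    # 1) embed the question, 2) retrieve relevant internal documents,
    # 3) hand them to the model as context alongside the question.

    def embed(text: str) -> list[float]:
        # Placeholder: a real system calls an embedding model here.
        return [float(ord(c)) for c in text.lower()[:8]]

    def retrieve(query_vec: list[float], corpus: dict[str, list[float]], k: int = 1) -> list[str]:
        # Placeholder similarity: real systems use cosine similarity over an index.
        def dist(doc_vec: list[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(query_vec, doc_vec))
        return sorted(corpus, key=lambda name: dist(corpus[name]))[:k]

    corpus = {"hr_policy.txt": embed("holiday entitlement"),
              "design_doc.txt": embed("array firmware design")}
    question = "How much holiday do I get?"
    sources = retrieve(embed(question), corpus)
    prompt = f"Answer using only these sources: {sources}\n\nQuestion: {question}"
    print(prompt)  # this prompt would then go to the LLM of your choice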
But I think there's another element too, which is what external volumes do I need access to?
What external data am I going to purchase? And then how am I going to deliver this
in a way that is going to allow me to not just protect my internal assets, but ensure that I'm
compliant with the upcoming regulation? Because ultimately, at the end of the day, it's not so
much whether or not my data gets exposed. That's a big concern. But it's whether the data that I've
used to train my model is mine to use. And if at any point in the future, should that ownership of that data get questioned,
I could be in a position where I would have to retrain my entire model to deal with things
like California privacy, GDPR, HIPAA, you know, even external companies saying, my volume
of data is no longer in the public domain.
Therefore, it must be removed.
So when you think about training these models, the amount of cycles and amount of money that's
spent training them, and the amount of data that's generated, it's not just one time. There will be
an iterative retraining process throughout, to deal with insertion or appending as well as deletion of key data sources.
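One way that bookkeeping might look, as a sketch; the fields, sources, and licence labels are hypothetical:

    # Track where every training document came from, so a revocation
    # (GDPR erasure, licence withdrawal, a court ruling) maps to a
    # concrete exclusion list for the next training run.
    corpus = [
        {"doc_id": "d1", "source": "internal_wiki", "licence": "owned"},
        {"doc_id": "d2", "source": "vendor_feed",   "licence": "licensed"},
        {"doc_id": "d3", "source": "web_scrape",    "licence": "unverified"},
    ]

    revoked_sources = {"web_scrape"}  # e.g. ownership is now in question

    next_training_set = [d for d in corpus if d["source"] not in revoked_sources]
    print([d["doc_id"] for d in next_training_set])  # -> ['d1', 'd2']
    # A model trained on the old corpus still "remembers" d3,
    # which is exactly why a retrain, not just a filter, is required.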
And that's only one angle of this sort of data and security and governance piece. What we've not factored in is,
you know, how do we ensure this isn't used for evil? How do we take
away the sinister prompting that could come from these tools? How do we eliminate things like deep fakes being used on our tools,
which could potentially bring issues?
How do we ensure that folks can't mastermind more sophisticated security attacks
by looking at what's available on these LLMs
and actually being able to more socially profile employees
and do more sophisticated attacks to gain credentials?
These are all things that we will work through
over the eras of development of AI.
And there's a big cost involved in that
because at this point,
we haven't really talked about the cost too much.
We've sort of highlighted the fact
that it's relatively expensive.
But when you are training, retraining a model,
the more you have to go through that process,
the more cycles you burn and the more time it takes. You want that process to be fairly quick and fairly accurate, using the minimal
resources possible. And I guess one of the examples I would use there is, I remember,
at a Pure Storage event a few years ago, talking about F1 and the number of cycles that are
available to do wind tunnel training and simulations on the car. And they like your technology because you optimize the use of
IOPS, or whatever the technical threshold limit was that you were allowed to use. So
there's got to be a cost and utilization calculation that's done by people here.
Yeah. So let's break that down into two specific areas.
The first is you talked about the cost to train these models. So I find it fascinating that GPT-1,
which most of us didn't interact with, cost roughly $10,000 to train. Not that bad, $10,000.
We could probably find that in discretionary budget. GPT-3 ran into the millions. GPT-4 was 550 million. And GPT-5, or whatever it happens
to emerge with when it emerges, is estimated to be well over a billion. But within GPT-4,
there was 78 million worth of compute. So if you want to go train a general corpus AI today,
you're looking at a minimum 100 million plus investment, right? In fact, Google's most recent
Gemini used $191 million alone in compute just to learn the corpus of what was there.
So these are major, major, major investments. But to your second point about architecture,
yeah, I would tell you that, you know, what's really interesting about the GPU is it's given us the
ability to compute at levels that we never thought possible, probably second only to what will come
with quantum computing at some point in the future. But it's also, you know,
kind of showcased enormous holes in our data platform. Traditional storage solutions just don't get the data to the GPUs
fast enough. And if you think about it, I'll give your audience an analogy. If you think of your
GPUs as PhDs that you've hired, and they're quite expensive, and they sit in the back room,
but they're unbelievably smart at getting through information and finding conclusions.
But if you can't bring them the books fast enough,
or the material fast enough, and you can't allow them to share what they've learned with each other
fast enough, then you're not getting the value of the talent that you're employing. And that's
the challenge today is how do I eke every possible benefit out of my storage platform so that as it grows and as I scale my farm, I can still feed
and actually gain insights across that data set at the same level of efficiency.
That's really interesting, isn't it? Because we spend an awful lot on technology. And I'm going
to use my old person analogy here again that I use all the time and look back at the mainframe world when computers were really expensive.
And when I started work, the environments I worked in,
we ran our mainframe environment at more than 100% utilization permanently,
which everybody might say, well, how could you run that more than 100%?
Well, there was a calculation that said, basically, if everything was running all the time,
every 1% over 100 represented tasks
that were active and ready to be dispatched and processed,
but couldn't because there was a slight lag.
So it was a measure of the sort of latent delay,
if you like, on the system.
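A toy reconstruction of that over-100% arithmetic (my reading of the idea, not the mainframe's actual accounting):

    # Busy time plus work that was ready to dispatch but had to queue,
    # expressed against raw capacity. Anything over 100% is latent demand.
    capacity_secs = 3600       # one hour of CPU capacity
    busy_secs = 3600           # the CPU was busy for the whole hour
    ready_waiting_secs = 180   # tasks ready to run that sat in the queue

    utilization = (busy_secs + ready_waiting_secs) / capacity_secs * 100
    print(f"{utilization:.0f}%")  # -> 105%: each point over 100 is queued demand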
But the issue was more the fact
that the mainframe was running 100% all the time
because it was expensive, and therefore you absolutely maximized it. You pushed things through
the evening, you did them overnight, you made use of all the bandwidth that was
available over the course of the day. And you're implying the same thing here with GPUs: GPUs
are super expensive, and the more you put in, the more you want to make sure they're running at 100%.
But running them at 100% isn't just about putting the GPU and turning it on.
It's about giving the data to run at 100%.
And that's a throughput issue on your storage.
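For a rough feel of that throughput problem, a sketch in which every rate is assumed for illustration (real per-GPU ingest varies enormously by model and pipeline):

    # If each GPU wants data at some sustained rate, storage must supply
    # the aggregate, or the farm idles waiting for reads.
    gpus = 1024
    per_gpu_gbps = 2.0       # assumed sustained read per GPU, GB/s
    storage_gbps = 1500.0    # assumed deliverable storage/network throughput, GB/s

    demand_gbps = gpus * per_gpu_gbps              # 2048 GB/s needed
    fed_fraction = min(1.0, storage_gbps / demand_gbps)
    print(f"GPU farm fed at {fed_fraction:.0%}")   # -> 73%
    # Every point below 100% is an expensive PhD waiting for books.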
It is, right?
I mean, because ultimately, your GPUs are looking for data sets.
They're looking for work packages, right?
And if you look at NVIDIA and CUDA and the compiler in CUDA
that allows you to efficiently distribute the workload into for work packages, right? And if you look at NVIDIA and CUDA and the compiler in CUDA that allows you to efficiently distribute the workload into these work packages, that's great.
But once again, I'm going to need fast storage and I'm going to need fast connectivity in order to deliver that.
Because I can assure you, Chris, if I'm a CFO and someone's come to me and said, I'm going to buy this GPU farm for 100 million, 200 million dollars, I want you to
come back to me every month and show me that this, you know, these crown jewels, so to speak,
that you've acquired are now being used at 100, even greater than 100%. Obviously, I want to look
at the output. But the last thing I want to do is to have these, you know, incredibly expensive assets being underutilized.
These assets, at this point in time, also take a long time to get hold of. So once you've actually got them, you know, you really
do need to exploit them, because they're in short supply, or at least they're on long-term
delivery. So you need to be able to do something in such a way, I think, that can
use whatever you can get hold of as well, to a certain degree.
Well, I think that's the interesting innovative piece, right?
As we think about what we're doing today to process all of this data, we are using GPUs.
And it is largely being centralized either on-prem or in the cloud.
But we're starting to see some really interesting things emerge at the edge, right? I mean, most recently, Tesla said it might use its cars' latent capacity while they're parked to start doing
some of this. Apple's talked about its PCs and its iPhones being able to do some of this.
So I think we will get a lot more distributed in terms of what level of compute we're
doing at the edge, and how we're making best use of latent compute capacity.
Reminds me of my days back in the late 90s,
when we had the Search for Extraterrestrial Intelligence
and we all had that running on our PCs, looking for aliens.
Screen saver.
I think we'll bring that back at some point
and start to say, okay, how can we distribute this?
Not just to manage cost,
but to ultimately use the power that we have in the most effective way possible, since that power in many
cases isn't stored. We have batteries, but we want to use all the power that we have access to.
And I think this concept of how we're going to power this whole thing becomes more and more
important when we start to think about how this fits together.
Yeah, okay.
Well, let's talk about power in a second, but let's just quickly talk about how data centers will have to change in order to make this work.
So, you know, it's certainly not going to be deployed on hard disks.
There's no doubt about that.
But, you know, there's networking challenges, there's storage challenges, there's power
and cooling challenges.
You know, there's a lot of modification that's going to have to come into data centers to make
this stuff really be delivered efficiently.
Yeah, there is a lot.
So let's break that down.
So first of all, the computational level, let's just talk about, you know, the density
is going to be the biggest issue here.
So if you look at the latest Blackwell offerings that NVIDIA has brought to market, we're talking
about power density we haven't seen before, right? So ultimately, you know, if I'm looking at 5,000 watts for a
single chassis, and I'm looking at traditional rack density of 14 to 16 kilowatts, you know, I'm sitting in
a specific area where I'm going to basically hit the wall before my rack is full. You know, I just can't be sitting with a 14 to 16
kilowatt rack and be putting three systems in and I'm full.
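The arithmetic behind "three systems and I'm full", using the figures from the conversation:

    # Power, not floor space, caps the rack.
    rack_budget_w = 16_000   # upper end of a traditional 14-16 kW rack
    chassis_w = 5_000        # one dense GPU chassis, per the figure above

    max_chassis = rack_budget_w // chassis_w
    print(max_chassis)       # -> 3: the rack is 'full' with most slots empty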
14 to 16, though.
I mean, I look at that number, by the way, and think that to me,
from looking back from years ago, seems like a big number anyway,
but it's not anymore.
No, it's not.
It's not anymore.
And when you think about it, in many cases, when we talk about density, the issue is not how much can I fit in my data center, it's how much infrastructure can I put in my rack before my rack is actually full from a density perspective. But then you look at energy consumption, right? There have been estimates suggesting that NVIDIA H100 deployments are going to consume as much power as all of Phoenix
by the end of the year.
So you start to think about, okay, existing power grids,
existing electrical infrastructure.
You look at advanced cooling technologies like liquid cooling.
It's a little scary for those of us that grew up in the data center
and worried about water or the like spreading around the data center.
We'll probably get there. But let's think
about infrastructure scale and capacity, right? 2.1 gigawatts of data center capacity were just
sold in the last 90 days to support these projects. That's enough power for a million and a half
homes. And so there's a bit of a rush to go and acquire this grandfathered space that has dense
access to power.
The alternative, of course, being what Microsoft and others have done,
which is to buy nuclear power plants for the sole purpose of powering their infrastructure.
You know, then we've got geographical distribution and latency.
You know, if you think about financials or automotive, to your point about self-driving cars, right?
The latency is going to be critical here. We can't have milliseconds of delay between cars talking to each other about where they are
on the road. Conversely, we can't have trading systems that are managing
pension plans miss out on an opportunity, or take a loss unintentionally, because of millisecond delays. Then we're going to have
to talk about sustainability, regulatory compliance. There will be a lot in this space around environmental and data sovereignty. And the piece I don't want
to forget, Chris, is jobs. Our jobs as IT administrators, as infrastructure experts,
will still be here. But the skills, the people, the process is going to dramatically change as we embrace this concept of co-pilots,
we embrace this concept of new architectures for driving AI, and we start to think about what are
our jobs going to look like when it's man, woman plus machine, as opposed to the world we work in
today, which is largely full-time equivalent based. Do you think, though, that that changes anything
that we haven't seen previously?
I mean, if you look at the days before the spreadsheet,
which, from my recollection, would have been the early 80s.
I think I seem to remember some of the early spreadsheet technology
coming in around 1981, maybe, around the IBM PC.
And I certainly remember when I was at university,
somebody was actually working
on a spreadsheet tool,
a story for another day.
But obviously there was a time
before spreadsheets.
So when we had that situation,
there was manual calculation
and there was all the rest of it.
There was time before calculators.
There was time before
all of that sort of stuff
when people used, gosh,
books with logarithms,
log tables, which I remember.
That's right.
Having actually had one and actually had to use them.
So that augmentation happens all the time as we see new technologies come along.
I just wonder if this augmentation is going to be any different to any of the other things that we've seen in history.
Yeah, it'll be different because we have the ability to go deeper and wider in terms of what we have access to.
And I also think we have the tools at our disposal,
like these phones that we all carry around
and potentially goggles or contact lenses
or smart screens or whatever it is
that's sitting in the windscreen of our cars
that will allow us to make use of this information
in a much more real-time way.
And so you see, it's kind of all boiled through to where we are today.
We tend to build on what has come before us.
Without the internet, we couldn't have phones.
Without phones, we wouldn't have effective e-commerce the way we see it today.
Without effective e-commerce, we wouldn't have this thriving ecosystem and platform effect.
Now we're going to use the same platform effect of everything we've built to actually drive this augmented capability, this co-pilot, that will help us both
in our personal and professional lives.
So much of this has been driven by the demand for storage
and the throughput and the rest of it. There's obviously the data center changes we
just mentioned. But where are the threats here? I mean, where are going to be the
problems that we're going to encounter, other than the obvious ones about powering racks and, you
know, getting power into the data center? There's, I think, a fundamental level here of technological
challenge.
There is. And, you know, it reminds me a lot, Chris, of what we saw with HPC, or high
performance computing maybe 15 years ago. And we saw a lot of customers going out there and doing science projects,
buying pieces and components of particular hardware,
a little like you and I built our first PCs back in the day.
And then we realized that, you know,
the components that we put in those PCs maybe didn't have as long a life.
They were difficult to replace.
They caused us to have to rebuild every year.
We saw the same challenges in HPC, right?
I mean, a lot of it was software defined.
A lot of it was open source.
Some of it was vendors who had kind of come up
with a particular proprietary technology
or science project and they brought it in.
And while it delivered during that POC
or minimally viable product,
as soon as it got to scale, it broke.
And it broke because of the operational complexity.
It broke because of the overhead and the energy cost. And the reality was it just couldn't scale.
And so while some of those HPC projects still exist today, many of them evolved back to
infrastructure models that could deliver long life and performance and simplicity at scale.
I think it's a big piece of it.
Because if you think about it today, if I'm building a new arm of my business and I'm
using AI to power it, as soon as I prove my thesis, the business is going to want to rush
into that as quickly as possible.
So we're going to go from a factor of 1 to 10 to 100 to 1,000 to 10,000. And if I haven't thought through
what is this thing going to look like at scale? How am I going to manage it at scale? Is it going
to be a reliable platform for the next decade, which is the timeframe I think we should be
thinking about this, then I could end up actually crippling my organization two, three, five years
down the road, when I realize the initial platform
I chose to build this on is no longer viable and now I need to re-platform. And so this is a big,
big piece of the conversations that we're having with customers today. If we look at what Pure
has done with flash, I look at the discussions I'm in, right? We're now sitting at a particular
point in time. Is the future going to be flash?
Yes.
Am I going to need something that's energy efficient?
Yes.
Am I going to need something that spans performance all the way down to archive capacity?
Yes.
Am I going to need to consume it as a service?
Yes.
Am I going to need to have a vendor that I can rely on that will be in business in the
next decade?
Yes. And am I going to have the manpower and the skills to support it? And so those are the things that
unfortunately don't always get looked at in the same light. Today, most of what we hear about
with AI is bigger, brawnier, stronger, more powerful. But what about when your project
actually works? What about when this
becomes the next big thing? Will you be in a position to actually support it and deliver it
out to your customers?
I was thinking about how that translates when you use the PC analogy. I quite like that idea. So I wouldn't like to think how many PCs I built over the years.
And you're right, you'd buy something, you'd think, oh, well, I'm going to put that graphics card in
because that one's the current state-of-the-art graphics card,
or almost, yeah, I'll go for almost the top one.
I'm going to go for whatever particular motherboard
I can afford, and memory, and all the rest of it.
And then you put a bit of software
and you find that the driver didn't work.
And then you're trying to find,
especially when you were looking at the early days of Linux,
you're trying to find a driver you can just sort of shoehorn in
and then, you know, with certain versions of Windows, you would try and use a driver you thought might
work, and inevitably that might work for you, but that was absolutely not scalable. And then, before
you know it, manufacturers have brought out machines that are completely sealed, like the
Apple ecosystem, where you don't get to touch anything inside, where the software is built to work with the hardware, where the ecosystem is
built around that combination of the two working together. And then you see where scalability can
come from. So, you know, it's pretty easy to see, if you imagine today's AI models as sort of self-built PCs, where we need to get to.
Yeah. And we have to resist the temptation to grab these shiny red toys that, you know,
appear to have come off the line with a custom built solution to solve this problem and really
challenge their architectures and challenge their scale. Because what I'd tell you, Chris,
is if you start to look at a lot of this open source stuff you just described, it's exciting.
But with the current security posture we have,
I'm not sure that putting Linux boxes on my floor
in my storage architecture
with a custom distribution of Linux
is going to satisfy my security people
who are going to want to look into every iteration
of that
core operating system, as well as what components are in it to ensure that it satisfies the
gold image standards of that organization.
I think I'm not too keen on the idea of putting something on the floor that might break when
I've spent 50 or 100 million dollars, and suddenly it's the thing delivering the data into an
infrastructure which is now sitting idle. You know, that hierarchy of dependency becomes
even more critical now, because 50 or 100 million dollars worth of GPU is not working. If
you're, what's the American expression, nickel-and-diming on the actual storage piece just because
you think that's the better way of delivering something cheaply at scale,
then really you've done yourself a disservice.
Well, remember what we talked about earlier,
right? It's the density of energy. So when you and I were growing up with, you know, one gig drives, and now we're thinking today, from Pure's point of view of 75 and 150 and 300
terabyte drives, we're looking at the energy profile and we're saying, wow, I could actually scale my data 10X
without having to retrofit or expand my energy footprint
or my data center footprint.
And we're super excited about that
because frankly, we believe that we can unlock
this level of density five, seven, 10 years
before traditional SSDs can.
And it's easy to think about storage as just a,
like you said, a commodity cost.
But the reality is you will run out of space
and you will run out of power
before you have satisfied the data needs
of your organization.
We've been focused on this problem for 15 years.
And now that we're at the scale,
we're talking about 300 terabytes
in a single DirectFlash Module.
This absolutely changes the game. And when we look at AI and the density that will be required and essentially freeing you up the energy so you can put these GPUs into your environment without running out of power, that's incredibly exciting for us.
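A rough sense of that density argument in numbers; the per-drive wattage and shelf power budget are assumed, and only the capacities echo the conversation:

    # Same power envelope, two media generations.
    shelf_power_w = 1_000    # assumed power budget for a storage shelf
    drive_w = 20             # assumed watts per drive, held equal for both

    tb_old, tb_new = 30, 300           # commodity SSD vs a 300 TB module
    drives = shelf_power_w // drive_w  # 50 drives either way

    print(drives * tb_old, "TB")   # -> 1500 TB
    print(drives * tb_new, "TB")   # -> 15000 TB: 10x the data, same power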
I think that's a good point to sort of move into your technology and what you're doing
for this, because understanding, I think, some of these challenges, just as a side note, by the way,
I saw something today that I was reading from another analyst that was talking about how
there was suddenly this demand for mega-capacity SSDs. And it made me think, well, yeah,
that demand's actually been around for quite a while. Possibly the industry just hasn't picked up on it, or has chosen to ignore it, because
technically they found it difficult to deliver those products.
So it's...
Well, but Chris, let's also keep in mind commodity SSD manufacturers, the majority of their volume
is in the one to two terabyte range.
That's for desktops and notebooks.
And there's a massive amount of volume there. And it's quite good business for them. So there aren't a lot of consumers that
are going to move from a two terabyte drive to a 50 terabyte drive for their home computer.
And so when you think about the economic models on which most of the enterprise storage providers
are relying, they're playing in the long tail, right? So you really have to ask yourself,
if you're running the businesses of producing SSD drives, how much engineering are you going
to invest in a product that makes up three, four, five, 10% of your overall volume versus focusing
on really making sure you don't erode any of your market share in the one to two terabyte space?
Okay. All right. Fair enough. That's a fair comment.
Okay, then.
Let's just go back over that whole capacity side of things then
and exactly how that's going to be driven
because there's got to be a number of components in here.
Okay, scaling to 300 terabytes is one,
but efficiently operating at that level
isn't just going to be done by hardware alone.
There's software involved, and essentially that's part of the scale story as far as I can
see, in that unless this software can manage the infrastructure efficiently, it doesn't matter
whether you can put 300 terabyte drives in. You have to be able to manage them.
Yeah. So I would frame it like this, Chris, as I go into a little bit of detail here: fundamentally, Pure's strategy is to drive
efficiency across all flash, everywhere. If you look at what Purity is, our software, it is
essentially the operating system for flash. It is the operating system for flash that drives maximum
efficiency. Today, that's delivered via DFMs. Today, those DFMs go inside FlashArray and FlashBlade
and deliver the proposition that
we deliver to the market. But ultimately, if you look at what we have done and what we are doing,
Pure continues to engineer Flash software as high technology across block, file, and object.
This is not something we buy. It's not something we borrow. It's not something we rent. We spend the majority of our engineering on ensuring
that our flash software and our flash operating system is the best in the world in terms of
driving efficiency. We're the only provider able to look at that efficiency across the
full life of the IO. So all the way from the controller to the enclosure to the individual DFM to the individual cell of NAND.
Nobody else can do that.
We also have no flash translation layer.
Who cares?
Well, we care because the flash translation layer doesn't just impact performance.
The flash translation layer consumes a ton of DRAM, which consumes a ton of power,
which consumes a ton of energy. And when my competitors
are looking at one gigabyte of memory for every terabyte of storage, and I don't have that
requirement, I can drive density into the same footprint significantly faster. By the way,
less DRAM also gives me much higher reliability. And so when you think of what we did with Purity
in terms of really building the most
efficient path to flash, right, really bringing the highest level of reliability to QLC,
that has now allowed us to go and purchase raw NAND and engineer
a density roadmap for media, for DFMs, that is exponentially faster than anything in the industry. We think this is super, super important.
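To size that FTL overhead: the one-gigabyte-of-DRAM-per-terabyte ratio comes from the conversation; the drive count is assumed:

    # Conventional SSD flash translation layer: ~1 GB of DRAM per TB of flash.
    dram_gb_per_tb = 1
    drive_tb = 300       # one 300 TB module
    shelf_drives = 20    # assumed drives in a shelf

    dram_per_drive_gb = drive_tb * dram_gb_per_tb
    print(dram_per_drive_gb, "GB DRAM per drive")          # -> 300 GB
    print(shelf_drives * dram_per_drive_gb / 1000, "TB")   # -> 6.0 TB of DRAM per shelf
    # Removing the FTL removes that DRAM, its power draw, and its failure modes.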
Now, it does beg the question of, does the DFM then become the
building block for other storage solutions, or even hyperscalers, or even beyond? And I think,
ultimately, if you think of where this is going, if the industry will struggle to produce SSDs beyond 30 or 60 terabytes, and Pure's got a solution at 300, I do think that becomes a real possibility.
One of the things that I've found really interesting, probably over a little more than the last two years, is how much engineering
is going back into hardware
to deliver what is required
for a whole range of different technologies.
So obviously we've seen ARM come back up again
and be treated as an efficient solution
for certain types of workloads
compared to using, say, Intel and AMD x86 processors.
We've seen rapid development of the GPU
and the way that the GPU is being used,
but we've also seen the hybrid super,
what do they call them, super chips that NVIDIA have built,
where they've taken ARM and they've merged it together
with the GPUs to build more complex systems on chip
that are allowing them to not
necessarily just bolt together components, like we would have done in the old days when we
built our PCs by hand, but actually say, well, how do we bring all of this stuff together
in a more cohesive and more holistic way that actually delivers what the requirement is? And
it sounds, just as you're saying, like by having Purity managing
a physical layer of, let's just call it flash, full stop,
you've got the ability to start building in that more holistic approach
to actually delivering your storage infrastructure.
Absolutely.
So first of all, if you go down,
and I hope some of your listeners have visited our corporate campus,
Santa Clara, we're a stone's throw from NVIDIA.
They're obviously a very close partner of ours,
and we are in constant discussions with them of how do we feed storage faster to NVIDIA.
You can talk about OVX.
You can talk about SuperPod.
You can talk about BasePod.
There are all sorts of discussions around how to ensure that the goodness that NVIDIA has built in CUDA, to
effectively drive efficiency of GPU utilization, is mirrored with Purity, to drive the efficiency
of the utilization of flash. Because ultimately, if I build the most efficient solution, then it will
essentially become more affordable and more viable for the market, which will essentially accelerate the roadmap and the time to market for these solutions.
So the data issue, we're very focused on solving that.
We're very focused on making sure that we deliver the most efficient path to storage, to flash, period.
While NVIDIA is focused on making sure that they can crunch through that data
as efficiently as possible. And I think the collection of both of those will bring a
collective goodness to the market in terms of making all of this possible years earlier than
it otherwise would be. But more importantly, making it viable to operate in the long term.
So one of the things that, I guess, this leads us on to, Shawn, is this discussion,
really, about how the hyperscalers are going to deal with this. Because, you know, at the very top,
we sort of talked about where we thought a lot of this AI stuff would be delivered. Is it going to
be on-prem? Is it going to be in the cloud? Cloud's obviously going to have a massive impact on this.
And over the last, I would say, five or ten years, let's call it five years,
we've seen an evolution of some of the background technology
the hyperscalers have got for storage.
They've added in NVMe drives and some other things,
but I wouldn't say they're necessarily the fastest to get to the market
with new technology there.
It tends to sort of be a particular product.
So where do you see those
companies going and how do you think they're going to deal with this, especially with reference to
efficiency when we know efficiency is a big thing on their minds? Well, Chris, if you look at where
these guys originated, they all started with white boxes and they all started with as cheap
technology as they could possibly get, thinking that that was the path to the lowest price to the customer. And then you've seen all the hyperscalers, specifically Amazon,
actually acquire technology to make their infrastructure smarter. And it's largely
turned proprietary. If you look at what's happened with Annapurna Labs, if you look at a lot of
other acquisitions, it's all about, how do I drive efficiency for VM workloads in that particular case? The next path to efficiency for
these clouds is Flash. And they know it, right? They know that ultimately, if they are going to
continue to offer more dense storage, which will allow them to offer multiples of data services
in the same footprint, they are going to need to move to flash.
And, you know, if you look at where the hard drive business is today, the nearline hard drive,
50, 60, 70% of it is all being sold to the hyperscalers. You can see that in the HDD manufacturers' latest earnings. 50% of their nearline hard drives went to hyperscalers.
Over the next couple of years, you're going to see that dramatically shift to Flash.
And once again, you know, their choice will be: do I buy the cheapest commoditized flash,
or do I actually look to bring my software layer, which is already the maximum
essence of efficiency, and sort of marry that with the right flash model? And I think there's
tremendous synergy between what Pure has done to date with Purity and our ability to really drive that energy efficiency, that operational efficiency, that reliability, that scale, that longevity into this space. For the hyperscalers, it represents their next multiple of efficiency and a huge opportunity to not just
monetize AI, but actually drive an overall more efficient operation across their data center.
I'm interested to see how that one's going to play out, because looking at the rest of
the industry, you see a very definite sort of evolution where flash has bit by bit carved off chunks of the hard drive
market. And at the bottom end, you could also look at it and say it would be
very easy to put a lot of your archive data onto things like tape media, and you can front-end
that with technology that allows you to get to it relatively easily. So there's almost been
a carving out of the top
end and the bottom end for hard drives, and what's left seems to be in the middle.
And again, you know, looking at the hyperscalers, it would seem that they would have the same,
or not the same problem, but probably the same journey, where they'll see more and more stuff at
the higher end being translated over, or moved over, to flash as the cost economics become more practical, and what
they can't, they'll push to the bottom end.
100%. So I think tape still has quite a long life ahead of
it. I think disk moves to flash, and I think it moves on the basis of the cost gap today between
HDD and SSD. If you start to look at it, the majority of the cost of flash is in the controller.
As the drives move from 52 terabytes to 75 to 150 to 300 terabytes and even beyond,
you're now putting that controller cost over a much larger set of data, which in essence will
push the cost per terabyte down. It's only a matter of time. I mean, we've openly said that
we believe no net new hard drive will be sold
to the enterprise beyond 2028. And we're sticking to our guns on that. I mean, everything seems to
be lining up, and the density roadmap is definitely a tailwind for us. And we're excited to share more
about that, as well as everything else we're doing. I'm hoping that your listeners will attend
our Accelerate conference next month in Las Vegas.
I'll be there. Our CEO will be there. Our founder will be there.
And we'll be talking a lot more about what we see in terms of the future of data and the future of data to support AI at that time.
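To make the controller-amortization arithmetic concrete, a sketch; the capacities echo the conversation, while the dollar figures are assumed for illustration:

    # A drive's fixed cost (controller, packaging) is divided over more
    # terabytes as capacities grow, pulling the $/TB figure down.
    fixed_cost = 400.0        # assumed per-drive controller/overhead cost, $
    nand_per_tb = 15.0        # assumed raw NAND cost, $/TB

    for capacity_tb in (52, 75, 150, 300):
        total = fixed_cost + nand_per_tb * capacity_tb
        print(f"{capacity_tb:>3} TB: ${total / capacity_tb:,.2f}/TB")
    # The fixed slice falls from ~$7.69/TB at 52 TB to ~$1.33/TB at 300 TB,
    # which is how bigger drives push flash $/TB toward nearline HDD.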
I was going to ask you a little bit about the future, but of course, you know, asking about futures is always a tricky one, because there's a balance between wanting to tease us with a little bit of something and actually not being able to talk about it at all as a public company. So ultimately, I would recommend that
people, if they can, get to the event. I'm guessing some of it, like the keynote, will probably
be streamed online, maybe. So, you know, if you can't make it in person,
at least you'll have the ability to at least see the keynote,
which is definitely going to be worth following,
I think, if you're in this sort of market.
Yeah, we're extremely excited about all of what we have announced
over the last year.
If you think about where Pure started as a product
and then evolved into a portfolio,
you'll see us now talking much more about a data
platform. And ultimately what that means is that Purity, the operating system that drives efficiency
for all flash, really becomes the platform on which many of these workloads are built, right?
Whether they sit at the core, they sit at the cloud, they sit at the edge. And when you take
that platform and now you're able to deliver it truly as a service
through what we call Evergreen One, but you deliver it as a service via SLAs, I think it
lines up perfectly to what the market's going to be needing, not just for AI, but for all the
workloads that come up over the next few years.
Great, Shawn, it's been really interesting to get that discussion on AI going, at least as an initial discussion, and to start getting people thinking
about what the challenges might be
for their storage infrastructure.
We'll make sure we put some links into Accelerate
and all the other stuff we've talked about
in our show notes.
But for now, it's been great to catch up with you
and look forward to learning a little bit more
once we've got another opportunity to chat.
Thank you, Chris.
You've been listening to Storage Unpacked.
For show notes and more, subscribe at storageunpacked.com.
Follow us on Twitter at Storage Unpacked
or join our LinkedIn group by searching for Storage Unpacked Podcast.
You can find us on all good podcatchers,
including Apple Podcasts, Google Podcasts, and Spotify.
Thanks for listening.