Utilizing Tech - 06x13: Focusing on AI Data Infrastructure Next Season on Utilizing Tech with Solidigm (Season 7: AI Data Infrastructure, Presented by Solidigm)

Episode Date: May 13, 2024

Great AI needs excellent data infrastructure, in terms of capacity, performance, and efficiency. This episode of Utilizing Tech serves as a preview of season 7, brought to you by Solidigm, and features co-hosts Jeniece Wnorowski and Ace Stryker along with Stephen Foskett. Solidigm's partners are discovering just how important it is to optimize every element of the AI infrastructure stack. With ever-larger AI datacenters being built, efficient storage can make a big difference, from power and cooling to physical density to performance. As we will hear throughout season 7, different AI environments will need specialized data infrastructure, from the edge to the cloud. And with retrieval-augmented generation (RAG) emerging as a new trend in AI, high-performance storage becomes even more important at run time.

Hosts:
Stephen Foskett, Organizer of Tech Field Day: https://www.linkedin.com/in/sfoskett/
Jeniece Wnorowski, Datacenter Product Marketing Manager at Solidigm: https://www.linkedin.com/in/jeniecewnorowski/
Ace Stryker, Director of Product Marketing at Solidigm: https://www.linkedin.com/in/acestryker/

Follow Utilizing Tech
Website: https://www.UtilizingTech.com/
X/Twitter: https://www.twitter.com/UtilizingTech

Tech Field Day
Website: https://www.TechFieldDay.com
LinkedIn: https://www.LinkedIn.com/company/Tech-Field-Day
X/Twitter: https://www.Twitter.com/TechFieldDay

Tags: #UtilizingTech, #AIDataInfrastructure, #AI, @SFoskett, @TechFieldDay, @UtilizingTech, @Solidigm

Transcript
Starting point is 00:00:00 As we've talked about all season on Utilizing AI, great AI needs excellent data infrastructure from capacity to performance and efficiency. This episode of Utilizing Tech serves as sort of a capstone for Season 6 and a preview of Season 7, which is brought to you by Solidigm and features co-hosts Jeniece Wnorowski and Ace Stryker, along with myself, Stephen Foskett.
Starting point is 00:00:23 So listen in as we give you a preview of the next season of Utilizing Tech, which kicks off on June 3rd, and maybe a few bits of information to sum up what we talked about all during season six. Welcome to Utilizing Tech, the podcast about emerging technology from Tech Field Day, part of the Futurum Group. This bonus episode focuses on our next season of Utilizing Tech, presented to you by Solidigm. We are going to focus on AI data infrastructure in enterprise IT, which is a key topic that has emerged here on season six of Utilizing, focusing on AI. I'm your host, Stephen Foskett, organizer of the Tech Field Day event series. Joining me as my co-host today and all season long here in season seven, we have Jeniece Wnorowski and Ace Stryker from Solidigm.
Starting point is 00:01:17 Welcome to the show. Why don't you tell me a little bit about yourselves, starting with Janice. Thank you, Stephen. It's a pleasure to be here. We're really excited about this series. I'm Janice Garowski. I've been with Solidigm since its inception. I've been in the storage business for over 15 years, and I'm just really excited to work with you all to dive into some really meaty topics around storage and how we are fueling the AI era.
Starting point is 00:01:44 I'll turn it over to Ace, our co-host here, for an intro as well. Thanks, Janice. Thanks, Stephen. I'm really psyched to be here with you guys as well. I am Director of Market Development at Solidigm and have been in the solid-state memory business since about 2017, starting at Intel and moving over to Solidyne when it was born at the end of 2021. And in my job, I spend a lot of time focused on emerging opportunities for our storage
Starting point is 00:02:16 products. And to do that, a big part of the job is understanding sort of the infrastructure story from start to finish. And very much focused these days on AI and likely will be for some time. Things, as you know, Stephen, are just blowing up there. Yeah, that's one of the things listeners of Utilizing Tech might remember you. Back in March, we had you on and we were talking about why great AI needs great storage. And that was really the seed, the acorn that this whole season grew out of. The idea that in order to have great AI, you need great storage. And not just because storage is so important to AI. Now, obviously,
Starting point is 00:02:59 Janice and I have been in storage industry for a long time and think that storage is the most important thing in the universe. But beyond that, I think that what we're seeing is that AI does place incredible demands on storage, both in terms of capacity, in terms of efficiency, in terms of data transfer, as well as things that you might not think about, sustainability, energy efficiency, things like that. So I guess, Ace, do you want to sort of recap what we talked about back in March? What are the various things that makes storage so indispensable to AI infrastructure? Sure, you bet. You know, when you step back and you look at sort of the AI data pipeline as a whole, and what are your hardware and your software requirements
Starting point is 00:03:46 to do AI work, a lot of the focus is on compute, and rightly so. You need a lot of compute horsepower to clean data, to train a model, to validate a model, and then to deploy it in the real world and make it useful for folks, right? And so the focus from our company's point of view has been, how do we help folks who are spending all that money and energy and effort on really high-powered data centers? How do we support that and make sure that the right storage is there to feed those GPUs, maximize utilization, and ultimately improve two things, performance, right? You want to get this done faster, and you want to have
Starting point is 00:04:37 higher quality models at the end of the day, and then total cost of ownership or TCO. And so it turns out that storage plays a pretty big role in both of those vectors when you're talking about the suitability of a given data center infrastructure for AI work. So for example, if you're using hard drives to power a lot of your work, and I think we'll get into this in the course of the season, right?
Starting point is 00:05:04 And we'll talk about sort of pros and cons there. But you're probably consuming a lot of space and power to do that. Your random read performance is very low, which has implications on your training speed and so forth. And so you've got to think about more than just compute, right? You've got to think about an optimized infrastructure as a whole and make sure that the model, which is only as good as the data that goes into it, which is only as good as the storage that data lives on, right, you've got to make sure that that model is getting not just a lot of compute power thrown at it, but very capable and cost optimized storage as well to sort of design
Starting point is 00:05:53 and deploy the ideal infrastructure. I agree with Ace. It is, you know, it is a time where a lot of our organizations that we're working with are, you know, in a variety of environments. They're in a read-intensive environment or in a, you know, a mixed workload environment. And so the beauty of what SolidIne has to bring here is just kind of a full suite, a full stack of products that really address the various stages that folks are dealing with when it comes to the AI data pipeline.
Starting point is 00:06:29 So, Janice, you've been participating in a lot of our field day events in the past, and we have seen Solidigm present. But one of the things that I love is that Solidigm comes in often with partners. And these partners are able to tell an integrated story that goes way beyond the SSDs and things like that and really goes to the entire system stack. And so it was important when we were talking about this season that we would have a very partner-focused story. I wonder, can you talk a little bit about the sort of partners that we're going to be bringing into Season 7 as a way to give our listeners something to look forward to? Because we've got some pretty good names here. Thank you, Stephen. Such a great question. We are passionate, just as passionate about our partners as the products that we build to support them. We are big on our short-term and long-term roadmap. These guys, we look forward to helping them solve their individual problems.
Starting point is 00:07:29 And all of their problems are a little bit unique. Some of them not so unique and be learned from across the industry. So we're super excited to bring our partners onto this series, partners like Data Direct Networks, you know, go by DDN, right, where they're doing a lot with their all-flash systems, populating it with all things high-density storage to solve some of those deeper AI challenges. We will also have partners in the software space, right, partners who are working with us to fine to fine tune the products that we develop to, you know, further allow AI algorithms and AI workloads to take advantage of our drive.
Starting point is 00:08:12 So we'll be excited to hear from an organization like Xenor or even Grade, right? And then also the work that we do, and this is like the heartbeat of who we are, is working with our own partners, such as Dell and HPE, even NetApp to some degree, right? So having a mixture of software, hardware, and kind of the, you know, I would say innovative folks will be just exciting, I think, to learn from. The one partner I did not mention yet, speaking of innovation, is CoreWeave, right? CoreWeave is one of the most interesting CSPs on the market today.
Starting point is 00:08:52 They're an all-GPU cloud-based organization. I think they claim that they have more GPUs running in their data center than most companies. And so how are they fueling those data centers with those GPUs, massive amounts of power, and how are they utilizing high density storage like QLC to kind of bring down the overall total cost of ownership, the power savings, right? And still getting that super solid performance they need to run that cloud effectively. One of the most interesting things I think about the CoreWeave story is that they are really telling the story that we're trying to get at with this season. Because if AI data infrastructure is the critical topic for this whole season, CoreWeave is one of the biggest, most innovative, most leading edge developers of AI data infrastructure. I mean, they're literally building the same kind of things that the leading hyperscalers are building, except they're building it for others to use. I'm really looking forward to that because CoreWeave also, because they're so big and because they're so focused on this entire stack, they have a lot of insight into what we're getting at here in terms of the importance of storage.
Starting point is 00:10:09 So as we've said, it's important to have performance. Obviously, it's hard to get a lot of IOPS out of a hard drive, you know, obviously. But it's also important for other things. You know, one of the things that Ace was talking about previously, you know, the density of servers, the density of storage within those servers. Also, as you mentioned, the power efficiency. As we're recording this, the news just came out that there's a project to build a five gigawatt data center just for AI use. Now, just for comparison there, five gigawatts, that is about, you know, one twentieth of all the solar energy generated in the United States. That is 500 million LED light bulbs. You know, that is an incredible amount of power. Or four time traveling DeLoreans, I should ask.
Starting point is 00:11:07 That's a lot of gigawatts of power. And when you're dealing with that, when you have that kind of power capacity, but also when you're talking about the cost of kitting out all those servers, storage becomes one area that you can really try to optimize. So let's talk there a little bit about energy efficiency too, because one of the aspects of higher density and higher performance storage is that that means that you're using less power and less cost to deliver the same or better performance, right, Ace? Yeah, well, first of all, Stephen, I'm in favor of using the DeLorean as a unit of measurement
Starting point is 00:11:49 for all our future power analogies on this program. But yeah, you're absolutely right. When you think about the power demands in the newest data centers, specifically those geared toward AI processing, it's almost difficult to fathom what we're talking about. You're starting to see more and more laws and regulations evolve in different parts of the world that specifically address this problem, because it is becoming a problem on a national scale for a number of countries. Like, hey, we cannot afford to just keep putting in as many, you know, AI data centers as we
Starting point is 00:12:30 might want to. The grid cannot support it, right? And so efficiency is a huge part of the story there. And, you know, we've seen cases where 35% of power consumption in an AI data center is consumed by storage. I mean, it's not an insignificant portion. You think about the GPUs and they're churning all the time and they're big and they're hungry and they're hot and all that's true. But I mean, more than a third of that going to power your storage devices is a huge deal. And so there's a lot of room for optimization there. And when you talk about the energy issue and you talk about sustainability long-term, right? It's really a
Starting point is 00:13:17 function. I mean, there's a lot of things that you can factor in there. But what I've observed is it's really a function of two things going on that move the meter the most, if you will, in that conversation. One is density. How much storage per device are you getting? The denser it is, the higher capacity of the drives, right? The more power efficient you're going to be. And the other is utilization. And that has to do with, are you replicating data across several drives? You know, if you have hard drives, are you short stroking them to meet some minimum IOPS requirement, right? So those are the two things that really create a bunch of, frankly, inefficiency and opportunity for optimization. Moving to higher density drives in smaller form factors, it reduces the number of individual drives you need to power, right? But it also saves on cooling costs and space costs and weight if that's a concern. And so
Starting point is 00:14:18 all these things have huge implications that you don't necessarily think about, you know, at the CapEx stage when you're buying a bunch of gear to put in your, your, your, you know, big server farm, but, but in terms of OpEx and five to seven years after the purchase, how much is it costing you to keep this thing running? Density is a huge deal. And then utilization is the other one. You know, we've seen a lot of setups where, for example, hard drives are rated and the data is triplicated. So you're talking about three times as many drives purely for replication before you even get into the short stroking conversation. And so very quickly, you can run up an astronomical power bill, right, depending on some of the storage decisions you make. And so certainly optimizing your compute as a part of that infrastructure is important, but there's a lot we can talk about on the sustainability and the power consumption side as well.
Starting point is 00:15:13 Yeah, it really is transformative when you talk about optimized storage performance and storage capacity. I mean, most of us experience this on a daily basis with our new laptops. It's not like people are going out there buying laptops with hard drives in them anymore. And the reason isn't because they're incredibly cheap. I mean, yeah, it still costs a little bit more to use Flash instead of spinning media, but it's totally worth it with all the other benefits that you're getting in terms of, you know, the whole form factor of the machine, the power budget of the machine. And frankly, it's not that much more
Starting point is 00:15:45 expensive now, especially if you can use it optimally, if you can get more out of it. And I think that's what we're going to see with AI data infrastructure going forward. The other interesting angle that we've talked about as well is that with high performance flash devices, you can also do all sorts of advanced storage features that you might not be able to do with other non-flash devices, certainly, but even lower performance flash in terms of cloning and snapshotting and things like that. And all of that is useful for the AI data setup as well, because in many cases, you're going to be needing to make copies of data sets. You're going to be able, you're going to need to move data around. All of that is going to take up a lot of backend storage performance, just getting ready to do the AI processing. And if it's not a, you know, high-performance storage system, and when I say high-performance, I mean really high-performance storage system. You won't be able to do that online.
Starting point is 00:16:47 And of course, that's just going to mean that you're going to have to have a whole other infrastructure built in order to support all this. So it really does seem that storage really is fundamental. But then you mentioned the form factor angle. And I think that that's critical too. One of the things that I've been focused on
Starting point is 00:17:04 when talking about Solidigm with people who aren't from the storage industry is the importance of, or the transformative effect of different storage form factors. Now that these servers are coming out, they're able to pack a lot more storage density into the same, you know, 1U or 2U rack unit device, because we're not talking about the old fashioned, you know, disk drive looking devices anymore. We're talking about new things. And, you know, Janice, are we going to be able to talk about that? You presented that at field day. Are we going to be able to talk about form factors too? Yeah, I think we'll be able to talk about form factors more. Not only the ones we've been designing over the years, all the different varieties,
Starting point is 00:17:46 as you mentioned, can be packaged into these smaller devices now, but also working with new types of form factors where you might see some compute being done within the device as well as the storage. So hopefully we'll be able to bring that up. Computational storage, I believe, is what, you know, folks want to hear more about. And that's something we, you know, we can definitely speak to. I will say too, Stephen, you know, a lot of our partners too, you asked the question before about the partners and what we're doing with them. A lot of the partners that we work with have different needs, right? It's no longer just about, you know, stacking a bunch of U.2s in a data center, right?
Starting point is 00:18:28 There's, you know, now edge compute, right? And being able to get away from using hard disk drives at the edge and being able to deploy, you know, a comparable SSD that's just as dense as a hard disk drive and then walking away from, you know, having it overheat or having it to be serviced. So we're seeing a lot of momentum
Starting point is 00:18:49 with some of our form factors at the edge as well. Yeah, that's actually a really good point too, Janice. We were talking about that during season six when we were talking about AI. I think that people are so focused on these big high dollar, yeah, four DeLorean data centers full of full of AI, GPUs and everything that they. But the next wave of AI is probably going to be outside the data center or outside the cloud, that sort of cloud data center. It's going to be more deploying AI inferencing and retraining and AI in different locations. And as you said, AI at the edge.
Starting point is 00:19:38 And there again, you know, we come back to this question of power and cooling and performance and all of the different things that AI is going to need. And there again, I think that we're gonna hear about that during season seven, or at least I hope. Do you have some people lined up to talk about Edge and outside the data center? We do, we have a couple of experts, customers rather, that we work with.
Starting point is 00:20:05 Cheetah Raid is one of them. It's just doing some really cool ruggedized edge work with a lot of military or U.S. government types of agencies. But then I think we also have some really amazing experts internally who are working on some of these new SSDs that will really complement the edge. And I'm hoping to bring some of those experts on as well from Solidigm. Yeah, absolutely. And it's going to be really interesting to hear where they're going as well. And then, of course, we've talked as well with some of the data companies, with the data platform companies.
Starting point is 00:20:41 One of the topics they mentioned that brought up during season six here as well was retrieval augmented generation or RAG. Essentially having AI inferencing or generative AI be able to go out in real time and reference data and bring in referencing data from structured data sets. That I think is going to be another transformative trend here in 2024, as instead of just asking generative AI to know everything, or to make up everything, we're going to say, you know, hey, generative AI, why don't you go ask the question that I don't even know how to ask, and then come back with that data. There again, you know, if you're using retrieval log metageneration, it is really critical that that stuff have ultra, ultra high performance and good capacity, because that is going to need a nice data set behind it. But you can't have that data
Starting point is 00:21:37 set slowing down the interactive aspect of AI that affects the end user. So I think that that as well, we'll see some relevance here too. Hopefully we'll be able to bring in some of those data platform folks as well into this conversation. Yeah, there's a lot of really cool work being done in that space right now. Innovation, not just on the hardware level, but the software, cool caching approaches. RAG, as you mentioned, opens up all sorts of additional solution possibilities from the customer and user perspective. And so these are things that we're tracking. These are things that I think are definitely ripe for exploration in season seven. You know, there's really interesting work being done on how models are quantized, you know, shrunk
Starting point is 00:22:31 and made more space efficient on validation techniques. I mean, the possibilities are endless here. We'll try to focus on the things that we think have the biggest implications for the future of infrastructure. And we'll bring in, you know, the one of the one of the things that we think have the biggest implications for the future of AI infrastructure. And we'll bring in, you know, one of the things about Solidigm is, you know, without tooting our horn too much, we believe we have some of the best partners in the business. And so we're excited to bring them in and hear from them. Yeah, and we're very excited with that, too. Thank you so much.
Starting point is 00:23:31 Those of you who are listening, yes, yes, Solidigm is going to be co-hosting this whole season and Solid talking about and won't really have to worry about this. It's the topic that's important. It's the data infrastructure and the importance of data to AI that's important. And so our commitment to the listener is that, frankly, this is going to be another season of utilizing just like season six, just like season five, just like season four, in that we're focusing on a topic, in this case, the infrastructure, the data infrastructure under AI, and that we're going to bring in a variety of different voices around that, and that we're going to have, you know, interesting co-hosts like Ace and Janice here joining me for these conversations, but that the whole conversation, the whole thing is more about making practical use of this technology than it is, hey, rah, rah, let's talk about a specific product. And there again, I just want to say, Janice and Ace, I really appreciated your attitude when I came to you and talked to you a little bit
Starting point is 00:24:16 about this, because you guys were on board with that, right, Janice? I mean, I remember having that conversation with you and you were like, yes, I want to do this. A hundred percent. Yeah. Thank you so much for the opportunity, Stephen, to learn from our partners and share that with the rest of the world. Absolutely. And, you know, this is what we do here. This is what Tech Field Day is all about. It's what utilizing is all about. It's what the Tech Field Day podcast is, too. So just to give everybody a sense of what's going on, today is May 13th. This is the last episode of season six where we focused on AI here on Utilizing Tech. But the next season is coming real quickly.
Starting point is 00:24:54 The AI Data Infrastructure season begins on June 3rd. This is our Monday podcast. So every Monday you'll have a new episode. It'll feature myself and Ace or Janice as a co-host and a different company or a different person, a different customer in this space talking about the importance of data infrastructure and the nuances of data infrastructure to building a modern AI data center and modern AI processing. So check that out, again, starting on June 3rd. Before we go, Ace, Janice, I guess Ace, learnings along the way. It's a very fast moving space and it's a lot of fun to be here now and try to help sort of position our partners and accelerate progress
Starting point is 00:26:00 in the industry toward really optimizing performance and costs. In terms of where you can hear more from us, Janice and I are on LinkedIn. Welcome any conversations folks want to reach out there. That's great. We have an AI page on our Soladyne website with a lot of this information where we'll continue to post new stuff as it comes along. That's Soladyne.com slash AI.
Starting point is 00:26:26 So please check that out as well. Well said, Ace. Yeah, can easily connect with us on LinkedIn. And a lot of the customer and partners that we were speaking of earlier also have some sunlight, if you will, on our AI landing page. So again, yeah, head over to solidime.com forward slash AI
Starting point is 00:26:48 to keep ahead of those things. Well, thank you very much for joining us here as our sort of wrap-up episode for season six and our premiere episode or preview episode for season seven of Utilizing AI. If you're listening to this, thank you for joining us. We are so happy to have you. We're so happy to have the audience
Starting point is 00:27:08 for Utilizing Tech just building and building. This is the Utilizing Tech podcast series. You can find this podcast in your favorite podcast application, as well as on YouTube. Just search in your favorite search engine for utilizing and whatever topic you're looking for, and you'll probably find us.
Starting point is 00:27:26 If you enjoyed this discussion, please do leave us a rating or a review. As I said, you'll find us in pretty much every podcast app as Utilizing Tech. This podcast is brought to you by Tech Field Day, home for IT experts across the enterprise, which is now part of Futurum Group. Season seven, as I mentioned,
Starting point is 00:27:43 is brought to you by Solidigm as well. For show notes and more episodes, head over to our dedicated website, which is utilizingtech.com, or you can find us on X Twitter and Mastodon at Utilizing Tech. Thanks for listening, and we will see you on June 3rd as we kick off next season of Utilizing Tech.
