Utilizing Tech - Season 7: AI Data Infrastructure Presented by Solidigm - 06x13: Focusing on AI Data Infrastructure Next Season on Utilizing Tech with Solidigm
Episode Date: May 13, 2024

Great AI needs excellent data infrastructure in terms of capacity, performance, and efficiency. This episode of Utilizing Tech serves as a preview of Season 7, brought to you by Solidigm, and features co-hosts Jeniece Wnorowski and Ace Stryker along with Stephen Foskett. Solidigm's partners are discovering just how important it is to optimize every element of the AI infrastructure stack. With ever-larger AI data centers being built, efficient storage can make a big difference, from power and cooling to physical density to performance. As we will hear throughout Season 7, different AI environments will need specialized data infrastructure, from the edge to the cloud. And with retrieval-augmented generation (RAG) emerging as a new trend in AI, high performance is even more important at run time.

Hosts:
Stephen Foskett, Organizer of Tech Field Day: https://www.linkedin.com/in/sfoskett/
Jeniece Wnorowski, Datacenter Product Marketing Manager at Solidigm: https://www.linkedin.com/in/jeniecewnorowski/
Ace Stryker, Director of Product Marketing at Solidigm: https://www.linkedin.com/in/acestryker/

Follow Utilizing Tech
Website: https://www.UtilizingTech.com/
X/Twitter: https://www.twitter.com/UtilizingTech

Tech Field Day
Website: https://www.TechFieldDay.com
LinkedIn: https://www.LinkedIn.com/company/Tech-Field-Day
X/Twitter: https://www.Twitter.com/TechFieldDay

Tags: #UtilizingTech, #AIDataInfrastructure, #AI, @SFoskett, @TechFieldDay, @UtilizingTech, @Solidigm
Transcript
As we've talked about all season on Utilizing AI,
great AI needs excellent data infrastructure
from capacity to performance and efficiency.
This episode of Utilizing Tech serves as sort of a capstone
for Season 6 and a preview of Season 7,
which is brought to you by Solidigm
and features co-hosts Jeniece Wnorowski and Ace Stryker,
along with myself, Stephen Foskett.
So listen in as we give you a preview
of the next season of Utilizing Tech, which kicks off on June 3rd, and maybe a few bits
of information to sum up what we talked about all during season six.
Welcome to Utilizing Tech, the podcast about emerging technology from Tech Field Day, part of the Futurum Group.
This bonus episode focuses on our next season of Utilizing Tech, presented to you by Solidigm.
We are going to focus on AI data infrastructure in enterprise IT, which is a key topic that has emerged here on season six of Utilizing, focusing on AI. I'm your host, Stephen Foskett, organizer of the Tech Field Day event series.
Joining me as my co-host today and all season long here in season seven,
we have Jeniece Wnorowski and Ace Stryker from Solidigm.
Welcome to the show.
Why don't you tell me a little bit about yourselves, starting with Jeniece.
Thank you, Stephen. It's a pleasure to be here.
We're really excited about this series.
I'm Jeniece Wnorowski. I've been with Solidigm since its inception.
I've been in the storage business for over 15 years,
and I'm just really excited to work with you all to dive into some
really meaty topics around storage and how we are fueling the AI era.
I'll turn it over to Ace, our co-host here, for an intro as well.
Thanks, Jeniece. Thanks, Stephen.
I'm really psyched to be here with you guys as well.
I am Director of Market Development at Solidigm
and have been in the solid-state memory business since about 2017,
starting at Intel and moving over to
Solidigm when it was born at the end of 2021.
And in my job, I spend a lot of time focused on emerging opportunities for our storage
products.
And to do that, a big part of the job is understanding sort of the infrastructure story from start
to finish.
And very much focused these days on AI and likely will be for some time.
Things, as you know, Stephen, are just blowing up there.
Yeah, listeners of Utilizing Tech might remember you: back in March, we had you on, and we were talking about why great AI needs great storage. And that was
really the seed, the acorn that this whole season grew out of. The idea that in order to have great
AI, you need great storage. And not just because we storage folks say so. Now, obviously,
Jeniece and I have been in the storage industry for a long time and think that storage is the most
important thing in the universe. But beyond that, I think what we're seeing is that AI does place incredible
demands on storage, both in terms of capacity, in terms of efficiency, in terms of data transfer,
as well as things that you might not think about, sustainability, energy efficiency,
things like that. So I guess, Ace, do you want to sort of
recap what we talked about back in March? What are the various things that makes storage so
indispensable to AI infrastructure? Sure, you bet. You know, when you step back and you look at
sort of the AI data pipeline as a whole, and what are your hardware and your software requirements
to do AI work, a lot of the focus is on compute, and rightly so. You need a lot of compute horsepower
to clean data, to train a model, to validate a model, and then to deploy it in the real world and make it useful
for folks, right?
And so the focus from our company's point of view has been, how do we help folks who
are spending all that money and energy and effort on really high-powered data centers?
How do we support that and make sure that
the right storage is there to feed those GPUs, maximize utilization, and ultimately
improve two things, performance, right? You want to get this done faster, and you want to have
higher quality models at the end of the day, and then total cost of ownership or TCO. And so it turns out that storage plays a pretty big role
in both of those vectors
when you're talking about the suitability
of a given data center infrastructure for AI work.
So for example, if you're using hard drives
to power a lot of your work,
and I think we'll get into this
in the course of the season, right?
And we'll talk about sort of pros and cons there.
But you're probably consuming a lot of space
and power to do that.
Your random read performance is very low,
which has implications on your training speed and so forth.
And so you've got to think about more than just compute, right?
You've got to think about an optimized infrastructure as a whole. The model is only as good as the data that goes into it, which is only as good as the storage that data lives on, right? So you've got to make sure that that model is getting not just a lot of compute
power thrown at it, but very capable and cost optimized storage as well to sort of design
and deploy the ideal infrastructure.
I agree with Ace. It is, you know, a time where a lot of the organizations that we're working with are in a variety of environments. They're in a read-intensive environment or, you know, a mixed-workload environment. And so the beauty of what Solidigm has to bring here is just kind of a full suite, a full stack of products that really address the various stages that folks are dealing with when it comes to the AI data pipeline.
So, Jeniece, you've been participating in a lot of our Field Day events in the past, and we have seen Solidigm present. But one of the things that I love is that Solidigm comes
in often with partners. And these partners are able to tell an integrated story that goes way beyond the SSDs and things like that and really
goes to the entire system stack. And so it was important when we were talking about this season
that we would have a very partner-focused story. I wonder, can you talk a little bit about the
sort of partners that we're going to be bringing into Season 7 as a way to give our listeners
something to look forward to? Because we've got some pretty good names here.
Thank you, Stephen. Such a great question. We are just as passionate about our partners
as the products that we build to support them. They're a big part of our short-term and long-term roadmap, and we look forward to helping them solve their individual problems.
And all of their problems are a little bit unique.
Some of them are not so unique, and lessons can be learned from across the industry.
So we're super excited to bring our partners onto this series, partners like DataDirect Networks, who go by DDN, right,
where they're doing a lot with their all-flash systems, populating it with all things high-density
storage to solve some of those deeper AI challenges. We will also have partners in the software space,
right, partners who are working with us to fine-tune the products that we develop to, you know,
further allow AI algorithms and AI workloads
to take advantage of our drives.
So we'll be excited to hear from an organization
like Xinnor or even Graid, right?
And then also the work that we do,
and this is like the heartbeat of who we are,
is working with our own partners, such as Dell and HPE, even NetApp to some degree, right?
So having a mixture of software, hardware, and kind of the, you know, I would say innovative folks will be just exciting, I think, to learn from.
The one partner I did not mention yet, speaking of
innovation, is CoreWeave, right? CoreWeave is one of the most interesting CSPs on the market today.
They're an all-GPU cloud-based organization. I think they claim that they have more GPUs running
in their data center than most companies. And so how are they fueling those data centers with those GPUs, massive amounts of power, and how are they utilizing high density storage like QLC to kind of bring down the overall total cost of ownership, the power savings, right?
And still getting that super solid performance they need to run that cloud effectively. One of the most interesting things I think about the CoreWeave story is that they are really telling the story that we're trying to get at with this
season. Because if AI data infrastructure is the critical topic for this whole season,
CoreWeave is one of the biggest, most innovative, most leading edge developers of AI data
infrastructure. I mean, they're literally building the same kind of things that the leading hyperscalers are building, except they're building it for others to use.
I'm really looking forward to that because CoreWeave also, because they're so big and because they're so focused on this entire stack,
they have a lot of insight into what we're getting at here in terms of the importance of storage.
So as we've said, it's important to have performance.
Obviously, it's hard to get a lot of IOPS out of a hard drive.
But it's also important for other things. You know, one of the things that Ace was talking about previously,
you know, the density of servers, the density of storage within those servers. Also, as you
mentioned, the power efficiency. As we're recording this, the news just came out that
there's a project to build a five gigawatt data center just for AI use. Now, just for comparison there, five gigawatts is about
one twentieth of all the solar energy generated in the United States. That is 500 million LED
light bulbs. You know, that is an incredible amount of power. Or four time-traveling DeLoreans, I should add.
That's a lot of gigawatts of power. And when you're dealing with that, when you have that
kind of power capacity, but also when you're talking about the cost of kitting out all those
servers, storage becomes one area that you can really try to optimize.
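As a quick sanity check on those comparisons, here is a hypothetical back-of-envelope sketch in Python. The ~100 GW figure for US solar capacity, the 10 W LED bulb draw, and the DeLorean's famous 1.21 gigawatts are assumed inputs for illustration, not sourced measurements:

```python
# Back-of-envelope check of the power comparisons from the episode,
# using rough, assumed figures (not measured data).
GW = 1e9  # watts per gigawatt

datacenter_w = 5 * GW  # the proposed 5 GW AI data center

# Assumption: roughly 100 GW of installed US solar capacity (circa 2024).
us_solar_w = 100 * GW
print(datacenter_w / us_solar_w)  # 0.05, i.e. one twentieth

# Assumption: a typical LED bulb draws about 10 W.
led_bulb_w = 10
print(datacenter_w / led_bulb_w)  # 500 million bulbs

# The DeLorean's flux capacitor famously needs 1.21 gigawatts.
delorean_w = 1.21 * GW
print(datacenter_w / delorean_w)  # just over 4 time machines
```

So the "four DeLoreans" quip checks out, give or take a flux capacitor.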
So let's talk there a little bit about energy efficiency too, because one of the aspects of
higher density and higher performance storage is that that means that you're using less power and
less cost to deliver the same or better performance, right, Ace? Yeah, well, first of all, Stephen,
I'm in favor of using the DeLorean
as a unit of measurement
for all our future power analogies on this program.
But yeah, you're absolutely right.
When you think about the power demands
in the newest data centers,
specifically those geared toward AI processing, it's almost
difficult to fathom what we're talking about. You're starting to see more and more
laws and regulations evolve in different parts of the world that specifically address this problem,
because it is becoming a problem on a national scale for a number of countries. Like, hey, we cannot afford to just keep putting in as many, you know, AI data centers as we
might want to.
The grid cannot support it, right?
And so efficiency is a huge part of the story there.
And, you know, we've seen cases where 35% of power consumption in an AI data center is consumed by storage. I mean,
it's not an insignificant portion. You think about the GPUs and they're churning all the time and
they're big and they're hungry and they're hot and all that's true. But I mean, more than a third
of that going to power your storage devices is a huge deal. And so there's a lot of room for optimization there. And when you
talk about the energy issue and you talk about sustainability long-term, right? It's really a
function. I mean, there's a lot of things that you can factor in there. But what I've observed is it's really a function of two things going on
that move the meter the most, if you will, in that conversation. One is density. How much storage
per device are you getting? The denser it is, the higher capacity of the drives, right? The more
power efficient you're going to be. And the other is utilization. And that has to do with, are you replicating data across several drives? You know, if you have hard drives,
are you short stroking them to meet some minimum IOPS requirement, right? So those are the two
things that really create a bunch of, frankly, inefficiency and opportunity for optimization. Moving to higher density drives
in smaller form factors, it reduces the number of individual drives you need to power, right?
But it also saves on cooling costs and space costs and weight if that's a concern. And so
all these things have huge implications that you don't necessarily think about, you know, at the CapEx stage when you're buying a bunch of gear to put in your big server farm. But in terms of
OpEx and five to seven years after the purchase, how much is it costing you to keep this thing
running? Density is a huge deal. And then utilization is the other one. You know, we've
seen a lot of setups where, for example, hard drives are RAIDed and the data is triplicated.
So you're talking about three times as many drives purely for replication before you even get into the short stroking conversation.
And so very quickly, you can run up an astronomical power bill, right, depending on some of the storage decisions you make.
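To illustrate how density and utilization compound, here is a hypothetical sketch of the drive counts and storage power draw for a fixed usable capacity. Every capacity, wattage, and overhead figure below is an illustrative assumption, not a vendor spec or a measured result:

```python
import math

# Hypothetical comparison for a 10 PB usable storage pool:
# triplicated HDDs vs. high-density QLC SSDs with erasure coding.
USABLE_TB = 10_000  # 10 PB usable

def drives_and_power(capacity_tb, raw_per_usable, watts_per_drive):
    """Raw capacity = usable * protection overhead; drives round up."""
    raw_tb = USABLE_TB * raw_per_usable
    drives = math.ceil(raw_tb / capacity_tb)
    return drives, drives * watts_per_drive

# Assumption: 20 TB HDDs, 3x replication, ~8 W each under load.
hdd_drives, hdd_watts = drives_and_power(20, 3.0, 8)

# Assumption: 61 TB QLC SSDs, ~1.3x erasure-coding overhead, ~20 W each.
ssd_drives, ssd_watts = drives_and_power(61, 1.3, 20)

print(hdd_drives, hdd_watts)  # 1500 drives, 12000 W
print(ssd_drives, ssd_watts)  # 214 drives, 4280 W
```

Under these assumed numbers, the denser drives with leaner protection need roughly a seventh as many devices and about a third of the power, before even counting cooling, space, and weight. The real figures vary by deployment, but the direction of the effect is the point.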
And so certainly optimizing your compute as a part of that infrastructure is important, but there's a
lot we can talk about on the sustainability and the power consumption side as well.
Yeah, it really is transformative when you talk about optimized storage performance and storage
capacity. I mean, most of us experience this on a daily basis with our new laptops. It's not like
people are going out there buying laptops with hard drives in them anymore.
And the reason isn't because they're incredibly cheap.
I mean, yeah, it still costs a little bit more to use Flash instead of spinning media,
but it's totally worth it with all the other benefits that you're getting in terms of,
you know, the whole form factor of the machine, the power budget of the machine.
And frankly, it's not that much more
expensive now, especially if you can use it optimally, if you can get more out of it. And I
think that's what we're going to see with AI data infrastructure going forward. The other interesting
angle that we've talked about as well is that with high-performance flash devices, you can also do all sorts of advanced storage features that you might not be able to do with non-flash devices, certainly, or even with lower-performance flash: things like cloning and snapshotting. And all of that is useful for the AI data setup as well,
because in many cases, you're going to be needing to make copies of data sets. You're going to be
able, you're going to need to move data around. All of that is going to take up a lot of backend
storage performance, just getting ready to do the AI processing. And if it's not a, you know,
high-performance storage system, and when I say high-performance, I mean really high-performance storage system.
You won't be able to do that online.
And of course, that's just going to mean
that you're going to have to have
a whole other infrastructure built
in order to support all this.
So it really does seem that storage really is fundamental.
But then you mentioned the form factor angle.
And I think that that's critical too.
One of the things that I've been focused on
when talking about Solidigm with people who aren't from the storage industry is the importance of,
or the transformative effect of different storage form factors. Now that these servers are coming
out, they're able to pack a lot more storage density into the same, you know, 1U or 2U
rack unit device, because we're not talking about the old fashioned,
you know, disk drive looking devices anymore. We're talking about new things. And, you know,
Jeniece, are we going to be able to talk about that? You presented that at Field Day. Are we
going to be able to talk about form factors too? Yeah, I think we'll be able to talk about form
factors more. Not only the ones we've been designing over the years, all the different varieties,
as you mentioned, can be packaged into these smaller devices now, but also working with new
types of form factors where you might see some compute being done within the device as well as
the storage. So hopefully we'll be able to bring that up. Computational storage, I believe, is what,
you know, folks want to hear more about. And that's something we, you know, we can definitely
speak to. I will say too, Stephen, you know, a lot of our partners too, you asked the question
before about the partners and what we're doing with them. A lot of the partners that we work with
have different needs, right? It's no longer just about, you know,
stacking a bunch of U.2s in a data center, right?
There's, you know, now edge compute, right?
And being able to get away from using hard disk drives
at the edge and being able to deploy, you know,
a comparable SSD that's just as dense as a hard disk drive
and then walking away from, you know,
having it overheat
or having to service it.
So we're seeing a lot of momentum
with some of our form factors at the edge as well.
Yeah, that's actually a really good point too, Jeniece.
We were talking about that during season six
when we were talking about AI.
I think that people are so focused
on these big, high-dollar (yeah, four-DeLorean) data centers full of AI GPUs and everything. But the next wave of AI is probably going to be outside the data center, outside that sort of cloud data center.
It's going to be more deploying AI inferencing and retraining and AI in different locations.
And as you said, AI at the edge.
And there again, you know, we come back to this question of power and cooling and performance
and all of the different things that AI is going to need.
And there again, I think that we're gonna hear about that
during season seven, or at least I hope.
Do you have some people lined up to talk about Edge
and outside the data center?
We do, we have a couple of experts,
customers rather, that we work with.
Cheetah RAID is one of them.
They're doing some really cool ruggedized edge work with a lot of military and U.S. government agencies.
But then I think we also have some really amazing experts internally who are working on some of these new SSDs that will really complement the edge.
And I'm hoping to bring some of those experts on as well from Solidigm.
Yeah, absolutely.
And it's going to be really interesting to hear where they're going as well.
And then, of course, we've talked as well with some of the data companies,
with the data platform companies.
One of the topics that came up during season six here as well was retrieval-augmented generation, or RAG: essentially having AI inferencing or
generative AI be able to go out in real time and reference data and bring in referencing data from
structured data sets. That I think is going to be another transformative trend here in 2024, as instead of
just asking generative AI to know everything, or to make up everything, we're going to say,
you know, hey, generative AI, why don't you go ask the question that I don't even know how to ask,
and then come back with that data. There again, you know, if you're using retrieval-augmented
generation, it is really critical that that stuff have ultra, ultra high performance and good
capacity, because that is going to need a nice data set behind it. But you can't have that data
set slowing down the interactive aspect of AI that affects the end user. So I think that that as well, we'll see
some relevance here too. Hopefully we'll be able to bring in some of those data platform folks as
well into this conversation. Yeah, there's a lot of really cool work being done in that space right
now. Innovation, not just on the hardware level, but the software, cool caching approaches.
RAG, as you mentioned, opens up all sorts of additional solution possibilities from the
customer and user perspective. And so these are things that we're tracking. These are things that
I think are definitely ripe for exploration in season seven. You know, there's really interesting work being done
on how models are quantized, you know, shrunk
and made more space efficient on validation techniques.
I mean, the possibilities are endless here.
We'll try to focus on the things that we think have the biggest implications
for the future of AI infrastructure.
And, you know, one of the things about Solidigm is, you know,
without tooting our horn too much, we believe we have some of the best partners in the business.
And so we're excited to bring them in and hear from them.
Yeah, and we're very excited with that, too. Thank you so much.
Those of you who are listening: yes, Solidigm is going to be co-hosting this whole season, and Solidigm products may come up, but you won't really have to worry about this being a sales pitch. It's the topic that's important. It's the data infrastructure and the importance of data to AI that's important. And so our commitment to the listener is that, frankly, this is going to be
another season of utilizing just like season six, just like season five, just like season four,
in that we're focusing on a topic,
in this case, the infrastructure, the data infrastructure under AI, and that we're going to bring in a variety of different voices around that, and that we're going to have, you know,
interesting co-hosts like Ace and Janice here joining me for these conversations, but that the
whole conversation, the whole thing is more about making practical use of this technology than it is,
hey, rah, rah, let's talk about a specific product. And there again, I just want to say,
Janice and Ace, I really appreciated your attitude when I came to you and talked to you a little bit
about this, because you guys were on board with that, right, Jeniece? I mean, I remember having
that conversation with you and you were like, yes, I want to do this. A hundred percent. Yeah. Thank you so much for the opportunity, Stephen, to learn from our partners and share that with the rest of the world.
Absolutely. And, you know, this is what we do here. This is what Tech Field Day is all about.
It's what utilizing is all about. It's what the Tech Field Day podcast is, too.
So just to give everybody a sense of what's going on, today is May 13th.
This is the last episode of season six
where we focused on AI here on Utilizing Tech.
But the next season is coming real quickly.
The AI Data Infrastructure season begins on June 3rd.
This is our Monday podcast.
So every Monday you'll have a new episode.
It'll feature myself and Ace or Jeniece as a co-host and a different
company or a different person, a different customer in this space talking about the importance
of data infrastructure and the nuances of data infrastructure to building a modern AI data
center and modern AI processing. So check that out, again, starting on June 3rd. Before we go, Ace, Jeniece, any parting thoughts? I guess Ace first. There have been a lot of learnings along the way. It's a very fast-moving space and it's a lot
of fun to be here now and try to help sort of position our partners and accelerate progress
in the industry toward really optimizing performance and costs. In terms of where you can hear more from us,
Jeniece and I are on LinkedIn.
We welcome any conversations; feel free to reach out there.
That's great.
We have an AI page on our Solidigm website
with a lot of this information,
where we'll continue to post new stuff as it comes along.
That's solidigm.com/ai.
So please check that out as well.
Well said, Ace.
Yeah, you can easily connect with us on LinkedIn.
And a lot of the customers and partners
that we were speaking of earlier
also have some sunlight, if you will,
on our AI landing page.
So again, yeah, head over to solidigm.com/ai
to keep ahead of those things.
Well, thank you very much for joining us here
as our sort of wrap-up episode for season six
and our premiere episode or preview episode
for season seven of Utilizing Tech.
If you're listening to this, thank you for joining us.
We are so happy to have you.
We're so happy to have the audience
for Utilizing Tech just building and building.
This is the Utilizing Tech podcast series.
You can find this podcast
in your favorite podcast application,
as well as on YouTube.
Just search in your favorite search engine
for utilizing and whatever topic you're looking for,
and you'll probably find us.
If you enjoyed this discussion,
please do leave us a rating or a review.
As I said, you'll find us in pretty much every podcast app
as Utilizing Tech.
This podcast is brought to you by Tech Field Day,
home for IT experts across the enterprise,
which is now part of Futurum Group.
Season seven, as I mentioned,
is brought to you by Solidigm as well.
For show notes and more episodes, head over to our dedicated website, which is utilizingtech.com,
or you can find us on X/Twitter and Mastodon as @UtilizingTech. Thanks for listening,
and we will see you on June 3rd as we kick off next season of Utilizing Tech.