In The Arena by TechArena - Solidigm on Building Future-Ready AI Storage
Episode Date: November 5, 2025
Solidigm's Ace Stryker joins Allyson Klein and Jeniece Wnorowski on Data Insights to explore how partnerships and innovation are reshaping storage for the AI era.
Transcript
Welcome to TechArena, featuring authentic discussions between tech's leading innovators and our host, Allyson Klein.
Now, let's step into the arena.
Welcome to the arena. My name's Allyson Klein, and today is another Data Insights episode, which means Jeniece Wnorowski from Solidigm is back with me.
Jeniece, how are you doing today?
Listen, it's great to be back. I'm doing great. Thank you.
you. So, Jeniece, I know that we've got a really exciting guest, and we are actually turning
the tables a little bit with Data Insights and going deeper into storage media. Who do we have
with us today? Yes, we do have a very exciting guest. We actually have Ace Stryker with us,
and Ace is the product marketing director of AI infrastructure for Solidigm. Welcome to the podcast,
Ace. Thank you so much, Jeniece and Allyson. It's great to be with you. So Ace, just to kick things off,
can you tell us about your role at Solidigm and what your focus is on AI and strategy?
Sure thing. Yeah. I have been at Solidigm since day one. The company is approaching four years old this December.
And for the last couple of years, my job has been to eat, sleep, and breathe AI data: what's going on in the quickly moving world of AI, how the use cases and the opportunities are evolving, and what the role of storage is in that.
As you can imagine, it's a big deal for a company that makes SSDs.
AI is the data driver of the 2020s, and probably will end up being the data driver of the first half of the
21st century.
Between the giant datasets that are needed to train foundation models these days and build in increasingly complex capabilities, and the exploding amount of data on the other end of the AI pipeline in inference,
where you have things that we can get into,
like the key-value cache (KV cache) or retrieval-augmented generation,
there's lots of emerging technologies on the inference side
that also generate and consume a ton of data
in the course of providing more value to AI users.
And so my job is to pay close attention to that,
to understand new models,
the sort of ecosystem landscape,
potential partners for Solidigm,
and to understand our customers and our use cases better,
so that we can make sure we are investing our time and energy into making products
that are going to help solve problems today and tomorrow.
Ace, I was so excited to talk to you today because the Data Insights program is often focused on other
companies talking about how they're utilizing data platforms so they can deliver unique capabilities.
But we're going to turn that around and talk about what all these AI workloads are doing
that makes them so different and uniquely demanding for storage across training and inference. And how does Solidigm optimize for all of the core
capabilities that they're looking for, like high throughput, concurrency, and consistent quality
of service? Yeah, AI is such a diverse field, right? Almost anything, any task that you can
imagine assigning to a human to get done, there's somebody working on trying to accomplish that
with AI. And of course, we're here in the early days, in the first, call it, three to four years now,
of AI really being a predominant thing in our lives.
And we're dealing with a lot of the use cases that are low-hanging fruit at this point.
A lot of fairly simple and straightforward tasks that we are finding ways to leverage AI to
solve with less cost, less time, less human input.
But now with the rise of agentic AI, we're able to move into a world of increasingly
complex problems and assign those to AI, and the quality of the solutions we're getting
is better and better. We're not quite there yet to where you can turn over any problem in your
life and AI will solve it, but month over month, even week over week, there's new models coming
out, there's new tools, there's new solution stacks with pieces plugged in a new way that
solves a problem that for the duration of human history up to this year had to be done manually
by humans. And so as you can imagine with a diversity of applications there comes a whole bunch
of different requirements on the hardware side.
And from a storage perspective,
we're really concerned about how the storage in an AI cluster
interacts with the memory to deliver optimized outcomes.
Storage does not do the job on its own.
To really understand how data flows through the AI pipeline,
you've got to understand the frequent interactions between the storage layer
and things like host DRAM and high-bandwidth memory on the GPU.
Trying to approach this from a storage-only perspective is going to sort of limit your ability
to understand what's going on.
And so we look at how we work with memory to do things like making sure the GPUs are fed
all the time and keep churning at high utilization, and making sure that things like the history
of your interaction with an AI model, in the form of what's called the KV cache, the model's
short-term memory, are efficiently stored and recalled in the course of a conversation
without having to expend a lot of time and energy and money on unnecessary tokens.
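To make that KV-cache offload idea concrete, here is a minimal sketch, assuming a hypothetical SSD mount point and numpy-format tensors rather than any particular inference engine's internals: the key/value tensors built up during a conversation are spilled to an SSD between turns and reloaded later, instead of being recomputed from the full prompt history.

    import os
    import numpy as np

    # Hypothetical sketch of KV-cache offload: the attention keys/values for a
    # conversation are persisted to an SSD between turns so the context does not
    # have to be recomputed. Paths and shapes are illustrative only.
    CACHE_DIR = "/mnt/nvme/kv_cache"  # assumed SSD-backed directory

    def save_kv_cache(session_id: str, keys: np.ndarray, values: np.ndarray) -> None:
        """Spill one session's key/value tensors out of DRAM/HBM onto SSD."""
        os.makedirs(CACHE_DIR, exist_ok=True)
        np.savez(os.path.join(CACHE_DIR, f"{session_id}.npz"), keys=keys, values=values)

    def load_kv_cache(session_id: str):
        """Reload a session's cached keys/values instead of recomputing them."""
        path = os.path.join(CACHE_DIR, f"{session_id}.npz")
        if not os.path.exists(path):
            return None  # cold start: no prior context stored for this session
        data = np.load(path)
        return data["keys"], data["values"]

The numbers add up quickly: for a large model, a few thousand tokens of fp16 keys and values across all layers can easily run to a gigabyte or more per session, which is what makes keeping it in DRAM or HBM for every open conversation so expensive.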
And so there's a density piece of this that we worry about a lot,
and there's that performance piece of this that we worry about a lot.
But at the end of the day, all of that stuff,
if I had to encapsulate it into one word,
the word that we hear most often from our customers,
from our partners is efficiency, right?
That is the name of the game.
How do we do more with less?
We know how much power these AI clusters consume.
It's astronomical, right? We know how much data they need. We know how big some of these models are.
So it's all about how can we get more efficient and storage plays an important part in that story and
probably an underappreciated part in a world where a lot of attention is focused on the GPU for
a lot of good reasons, right? It's super expensive. It's very power hungry, right? We're all very worried
about GPU specs, but a GPU that's not fed efficiently by the storage pipeline is wasted money, space, and energy.
So with all that, Ace, can you share specifics about customers or benchmarks
you've seen in the process that really helped to kind of accelerate machine learning
and/or deep learning workflows?
Sure.
Yeah, we have a lot of great customer stories.
I'll take the opportunity here to plug our website.
So solidigm.com.
If you go to the insights page, there's a bunch of articles about what we're doing with
customers to solve real world problems.
There's cool stuff on there about our work with
DUG on a system called Nomad. It's a mobile data center that you can deploy out in the
Sahara Desert if you want to solve problems at the edge. We have a story with InnoNet
that's about collecting automotive data inside of moving vehicles. Maybe a good example here,
if folks aren't familiar with some of our work, would be PKKO. We've done some work with them
in the healthcare space on medical imaging and using AI to understand things like CAT scans
and X-rays and to deliver diagnoses more quickly, more efficiently. So it goes back to the idea that you can build an AI solution stack with
hardware and software and aim it at almost any problem in the world, right? And the diversity of
our kind of customers and use cases covers a lot of ground for that reason. In terms of benchmarks,
it's an interesting question, because benchmarks are what we use to really get as close as we can
to apples-to-apples comparisons and to say, hey, if you just turn this one dial, for example, if you swap one
SSD for another, this is the difference you will see in some kind of measurable outcome.
And so the leading benchmark today in the AI storage space is called MLPerf.
It's published by a group called MLCommons, and they have a storage-specific test that lets you
plug in either a single drive or an array of drives inside of a server and run it through
some AI workloads to understand how many GPUs can you support at high utilization with
this storage subsystem.
I encourage folks to check that out if they're interested.
MLPerf just published results from version 2.0 of their test,
and Solidigm drives were well represented in there, in systems
from a few different kinds of ecosystem members.
But yeah, that's the one that we're certainly watching,
and we've even got folks on the working group
who are helping develop the next versions of those tests
to make sure that they align with real-world use cases and stuff.
But that's a very useful tool to understand
just how important storage performance is
to making sure that those expensive GPUs you bought are being properly used and as heavily
used as you want them to be.
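For intuition, the question MLPerf Storage is getting at can be framed as simple arithmetic. The sketch below is not the benchmark itself, which replays real training I/O patterns; the throughput and per-GPU figures are purely illustrative.

    # Back-of-envelope framing of the MLPerf Storage question: how many accelerators
    # can a storage subsystem keep above a utilization floor? Numbers are made up.
    def max_supported_gpus(storage_read_gbps: float,
                           per_gpu_demand_gbps: float,
                           min_utilization: float = 0.90) -> int:
        """GPUs that can be fed while each stays above the utilization floor."""
        return int(storage_read_gbps // (per_gpu_demand_gbps * min_utilization))

    # Hypothetical example: a storage node sustaining ~50 GB/s of reads against a
    # workload that wants ~2 GB/s per accelerator at full utilization.
    print(max_supported_gpus(50.0, 2.0))  # -> 27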
I've been having a lot of conversations with practitioners of late about what infrastructure
requirements are shaping up to be moving forward.
And one of the topics that keeps coming up is data management at scale.
How does Solidigm tackle managing and moving data with speed and efficiency?
Well, when you're talking about these large-scale datasets, it's another answer that necessarily includes other parts of the system, right?
We can't just say storage does that on its own and solves those problems.
When we're talking about moving and managing large amounts of data, the software matters very much,
and the networking matters very much as well, right?
So when you look at the way these AI clusters are architected, typically you have a bunch of GPU servers
with a certain amount of storage inside those boxes, although it's not a ton.
A GPU server from NVIDIA might have eight or 10 slots for SSDs in there.
You can fill that up and you can use up that space pretty quickly.
And so in most cases, for these larger deployments,
you're also communicating across a network to dedicated storage servers
that might be full of 24 or 32 SSDs.
And you may have a whole bunch of those stacked on top of each other, right?
So, of course, it's important to put the right SSDs in the box
and make sure those are performant enough
and make sure they're high density so you can get some of those efficiencies of scale in terms of
storing more data in fewer boxes. But if you're choking that with a slow network connection
or poor orchestration of the software, you're going to get burned as well. That's why we're
very excited looking at emerging networking technologies. There's a lot of reason for optimism,
whether it's Ethernet or some of these other kind of proprietary approaches. We are approaching
a world in the next few years where the bandwidth over the network can and will in some cases
exceed the throughput of storage devices. In other words, the network ceases to be the bottleneck
and then it's back on the storage devices, right? And then you can really unleash great performance
by putting faster SSDs in those boxes than we typically do today. It comes down to partnerships.
This is something that Solidigm spends a lot of time and energy on, and kind of a point of pride for us
is identifying those movers and shakers in the ecosystem,
whether it's folks making GPUs or folks building industry-leading software like
WEKA, or working with the CSPs or neoclouds that kind of put all this stuff together.
It's all done in partnership with them to really dial in the optimal combination of hardware and software
to make sure that as that amount of data continues to grow and grow,
and the demands from the workload in terms of the speed of moving the data only get higher,
we're able to keep up and deliver results that'll make AI users happy
and ultimately create value for businesses at the bottom line. As emerging paradigms like in-memory
processing and computational storage evolve, what's your vision for where intelligent storage is
headed next? That's a good question. Because there's so much investment and attention on doing
AI bigger and better and more efficiently, there's a lot of mad scientists in labs around the globe that are working on some pretty cool stuff, right?
And so if we take that view of the AI clusters, having storage in a couple of places,
let's look at the GPU servers first.
That's where we talk about direct-attached storage, those SSDs that are plugged directly
into the GPU servers.
So there's a lot of chatter now about high-bandwidth flash, which is kind of an emerging
application for NAND that hasn't existed, or has only existed in a very limited form, in the past.
But you can expect to hear more about that in the future: how can we really unleash the performance of the NAND media and get it out from behind the PCIe interface?
And then even within conventional SSDs, we'll continue to see that PCIe interface evolve, right?
Most of the high-performance stuff today is all PCIe Gen 5, but we've got Gen 6 and Gen 7 coming behind it quickly.
That's going to increase bandwidth by quite a bit.
It doubles every generation.
And so the GPUs will be much happier in terms of their ability to pull data when they need it.
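As a rough worked example of that doubling, here are approximate raw link rates for a typical x4 SSD connection across recent and upcoming PCIe generations; these are round numbers for illustration, not any specific product's spec.

    # Approximate raw bandwidth for an x4 SSD link, doubling each PCIe generation.
    PCIE_X4_GBPS = {"Gen4": 8, "Gen5": 16, "Gen6": 32, "Gen7": 64}  # GB/s, approx.

    for gen, bw in PCIE_X4_GBPS.items():
        print(f"PCIe {gen} x4: ~{bw} GB/s per drive")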
But that's also going to generate a tremendous amount of heat
when you talk about these next-gen SSDs.
And so thermal management for storage becomes a bigger deal.
If you look at AI servers today,
we're pretty worried about thermal management of the GPU
and the CPU, and a lot of those use cold plate cooling
because fans are no longer sufficient to keep them cool enough
to prevent them from throttling.
We have not had to worry about that for storage in the past.
We absolutely do have to worry about it going forward in Gen 5 and beyond.
And so things like cooling storage with efficient cold plate mechanisms or even immersion
cooling where you dump the whole thing in a tank of mineral oil or whatever the liquid is,
that's going to be a bigger and bigger challenge and opportunity for storage vendors to innovate
and solve some of those problems.
When you look at the other side of the AI cluster, which is the network-attached stuff,
those dedicated storage servers that sit in racks maybe next to the GPU servers,
there's really cool stuff coming along, obviously in terms of higher and higher densities per drive,
right? So Solidigm has led for the last year with 122 terabytes in a single SSD about the size
of a deck of cards. That will continue to grow and grow. We've announced plans for a 256-terabyte
drive. And you can imagine it's not too long in the future before you're going
to see Solidigm and others aiming at a petabyte in a single device, which was unfathomable even five years ago.
That's just a wild amount of data. But in the world of AI, we're seeing the amount of data only go in one direction and go very, very quickly higher and higher.
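A quick bit of arithmetic shows why those per-drive capacities matter at the server level. The 32-drive count echoes the dedicated storage servers mentioned earlier, and the petabyte entry is the speculative future case, not an announced product.

    # Rough capacity per dedicated storage server at different per-drive capacities.
    drives_per_server = 32  # e.g., a dense 24- or 32-bay storage server
    for drive_tb in (122, 256, 1024):  # shipping today, announced, future petabyte-class
        print(f"{drive_tb} TB drives -> ~{drives_per_server * drive_tb / 1000:.1f} PB per server")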
So that's a couple of areas I'd keep an eye on. There are wild cards out there as well.
You mentioned computational storage. What can folks do with that? How much work can you take off of the CPU or GPU by leveraging some of the compute
inside the SSD? There are some interesting experiments, and in limited forms there are
some products available on the market to do things like compression and decompression of
data using SSD compute. Storage-class memory was a big deal a few years ago and has been less
so since, but a lot of people are now saying, hey, this might really be a good fit for AI workloads.
And so we may see that start to come back. CXL is something that feels like it's been talked
about forever and has never really landed in a way that has shaken up the market in a big way,
but continues to be talked about. And perhaps there are new applications for that in the AI world
as well. So those are a few of the things that I would say keep an eye on, and we might see some
exciting innovations coming down the pike in the next few years. What you've been talking about
is just an acceleration of innovation on a number of fronts, from interfaces to media to
new classes of devices. If you were going to be having a one-on-one with an
IT architect looking at the next-generation storage layer for high-performance AI,
what would you tell them in terms of guidance on how to plan for that?
I would say, particularly as we're dealing with more data as time goes on,
it pays to really get familiar with how data moves between different memory tiers and storage
within an AI system.
And there may be significant kind of untapped opportunities for efficiencies there.
So as an example, we recently published a white paper.
You can check it out on our website.
We have an explainer video with it.
We worked with this great company called Metrum AI to try to answer the question: what happens if you move significant amounts of AI data out of memory and offload it onto SSDs in ways that people don't typically do?
How does that affect performance?
How does that affect memory utilization?
And we have some pretty interesting results.
So we use this example use case where you have a video of a traffic intersection.
There's cars, there's pedestrians, there's cyclists, and we feed it into this analysis pipeline
that generates embeddings and creates a RAG database, and then ultimately outputs like a safety
report that says, hey, this is what happened in the video, these are the changes that could be
made to this intersection to keep people safe. And then we ran that keeping all the data in memory
and we ran it moving as much as we could using kind of industry standard approaches
outside of memory and onto SSD. And that ended up being a couple of things,
mostly in terms of the data that we moved.
We moved a lot of that RAG data that we generate in the course of analyzing the video.
Typically that's in memory and it can get quite big.
We move that to SSD and then also some of the model weights themselves.
You don't need to keep the whole model in memory all the time, right?
You're typically only accessing certain layers at any one time.
And so the ones that aren't active, can you put them on disk or SSD rather than keep them in memory?
And you can.
We demonstrated that and we wrote about how we accomplished that.
And what we saw was, yeah, you can absolutely use less memory.
That makes sense intuitively, right?
If you're moving stuff out of memory onto SSD, you don't need as much DRAM.
And we saw like a 57% reduction in the amount of DRAM that was used on a 100-million-vector dataset that we benchmarked.
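A minimal sketch of the general mechanism, not the Metrum AI implementation: keep the embedding vectors in an SSD-backed, memory-mapped file so DRAM only holds the pages a query actually touches. Production systems would use an on-disk ANN index, so each query reads only a small subset of vectors, rather than the brute-force scan shown here; dimensions and paths are hypothetical.

    import numpy as np

    DIM, N_VECTORS = 768, 100_000
    PATH = "/mnt/nvme/embeddings.f32"  # assumed SSD-backed file

    # Build step: write embeddings to disk once (random here, for illustration).
    store = np.memmap(PATH, dtype=np.float32, mode="w+", shape=(N_VECTORS, DIM))
    store[:] = np.random.rand(N_VECTORS, DIM).astype(np.float32)
    store.flush()

    # Query step: reopen read-only; the OS pages vectors in from SSD on demand
    # instead of the whole index living in DRAM.
    index = np.memmap(PATH, dtype=np.float32, mode="r", shape=(N_VECTORS, DIM))
    query = np.random.rand(DIM).astype(np.float32)
    scores = index @ query                  # brute-force similarity, for clarity
    print("nearest vector ids:", np.argsort(scores)[-5:][::-1])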
And then the other part was the model weights and moving those from memory to SSD.
And that was really exciting, because what that allowed us to do was to run more complex models on GPU
hardware that you typically just cannot run them on, period.
Our demo included running the Llama 3.3 70-billion-parameter model, I believe, on an
NVIDIA L40S.
And that's a lot of alphabet soup there.
But long story short, that is a combination of GPU and model that you cannot use in the
real world typically.
There is not enough memory on that GPU to fit that model.
But we showed that by moving some of the weights into storage, you could.
And so you can imagine the implications of that in terms of maybe edge environments where you have less power and you need to use GPUs that might have
more severe memory constraints. You can imagine how that might apply to legacy hardware that an
enterprise might already have, that they want to repurpose for new AI use cases. What we're showing
is new possibilities unlocked where, yes, you can run these modern complex models on hardware
that you didn't think you could before. And that involves this kind of SSD offload approach.
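Here's a minimal sketch of the layer-offload idea under simplified assumptions: per-layer weights stored as .npy files on an SSD, and a plain matrix multiply standing in for the real layer math. Actual inference stacks do this with careful prefetching and pinned buffers, which is where the engineering in work like the white paper's lives.

    import os
    import numpy as np

    WEIGHT_DIR = "/mnt/nvme/model_layers"  # assumed directory of per-layer weight files

    def run_layer(activations: np.ndarray, layer_idx: int) -> np.ndarray:
        # Pull only this layer's weights off SSD on demand...
        w = np.load(os.path.join(WEIGHT_DIR, f"layer_{layer_idx:03d}.npy"))
        out = activations @ w  # stand-in for the real transformer layer computation
        del w                  # ...and release them before touching the next layer
        return out

    def forward(x: np.ndarray, num_layers: int) -> np.ndarray:
        # Only one layer's weights are ever resident at a time, so a model larger
        # than GPU/host memory can still be executed end to end.
        for i in range(num_layers):
            x = run_layer(x, i)
        return x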
And so there's a lot more work to be done there. Our white paper was just the
tip of the iceberg, and we're getting ready to publish some more. So stay tuned on that closer
to the end of the year. In terms of like the advice I would give someone about architecting the next
generation storage layer for scalable AI, I would say pay close attention to where data resides,
and ask whether that's the optimal place, whether the way that we've always done it is the way we should
continue to do it, or whether leveraging high-performance SSDs can gain you some efficiencies.
And, oh, I didn't mention this,
but we didn't lose performance at all
when we did the SSD offload.
That was the other kind of key finding.
In fact, we gained performance
because the indexing algorithm
that was used in the SSD offload approach
was so efficient,
our queries per second actually went up
by like 50% versus running those queries
with the data in memory.
And so that's something I think a lot of folks
don't realize,
and that would be a piece of advice
I'd give to anyone kind of looking at
how do we optimize and do this even better in the future?
So, Ace, that was a lot of good information.
Definitely some great work there with Metrum AI.
So thank you for that deep dive.
And I agree.
I encourage anybody who hasn't seen that paper and some of that data on our website to check it out.
Any other AI communities, collaborations, or research that you've been working on
to help shape future-ready storage solutions that you want to talk about?
Yeah.
I mean, almost nothing that Solidigm ever does is done alone in a vacuum.
If you look at our key values on our website, which I know nobody ever really looks at that stuff.
But if you go to About Us and you look at our key values, you'll see our corporate logo,
which is sort of this interlocking S thing.
And it explains that what that actually means is partnership.
It has to do with two partners coming together and fitting into each other.
And that wasn't an accident.
That was chosen as the company logo because that is so core to who we are and the way we approach these problems.
So anyone you can imagine across the ecosystem, whether it's NVIDIA and their GPUs inside the servers,
how do we work together on thermal solutions for compute and storage?
We've written about that recently, whether it's the CSPs or the neoclouds that are making the hardware available for enterprises to turn on their AI and try to go solve those problems and extract the value to improve their bottom line,
whether it's the software geniuses who are working on orchestration and putting all the
hardware pieces together and getting the most out of them, we're talking to those folks
every day. And we are constantly growing in our understanding of what problems do they face,
what opportunities do they see, how can we contribute from a storage perspective? And so it's a great
point of pride for us that almost any time you see Solidigm on stage somewhere, you'll see us
with a partner. And that's because it's really embedded in our DNA. That's how we approach
these things and solve these problems together. When I think about what you've shared in this
episode, I know that our listeners are going to want to engage further. So first of all,
thank you for spending time with us. I always learn something from you, and I'm so glad.
My pleasure. Thank you. But where can folks engage with you to continue the dialogue? And then
where can they go to find out about the solutions that you talked about, whether it's those
122-terabyte drives or other technology that Solidigm is delivering to the market?
I'd say start with the website. We update that all the time. We just recently launched a whole new
section of the website dedicated to edge AI problems and solutions and use cases. We're constantly
posting new articles, the kind that we talked about earlier, talking about customer engagements
and how we're solving problems together. You can check out our LinkedIn, our YouTube. We've got
new content on there all the time as well. And if you're going to be at any of the big
conferences coming up in the fourth quarter of 2025, things like OCP or Supercomputing,
we will absolutely be there as well with bells on. We'll be sponsoring and have booths.
And please come by and say hi. We'd love to talk to you more.
Awesome. Thank you so much for being on the show. And Jeniece, yet another fantastic
Data Insights episode. Thanks so much for your collaboration.
My pleasure. Thank you, Allyson. Thanks for having us here.
Thanks for joining TechArena.
Subscribe and engage at our website, techarena.ai.
All content is copyright by TechArena.
