Utilizing Tech - Season 8: AI Data Infrastructure Presented by Solidigm - 08x12: Revolutionizing Data Infrastructure for AI with WEKA
Episode Date: June 16, 2025
For more episodes: https://www.utilizingtech.com/
Storage software running on modern hardware can deliver incredible performance and capability to support AI applications. This episode of Utilizing Tech wraps up our season with a discussion of WEKA's data platform for AI with Alan McSeveney of WEKA, Scott Shadley of Solidigm, and host Stephen Foskett. Modern hardware is capable of incredible performance, but bottlenecks remain. The limiting factor for AI processors is memory capacity: GPUs are hungry for data and must be refreshed from storage quickly enough to keep them running at scale. Storage can also be used to share data between GPUs across the data center and to cache working data to accelerate calculation. The secret to scalability, from storage to applications to AI, is distribution and parallel processing. Modern software runs at incredible scale, and all elements of the stack must match. Technologies like Kubernetes allow applications to use huge clusters of workers, all contributing to scale and performance. WEKA runs this way, matching the GPU clusters and web applications we rely on today.
Guest: Alan McSeveney, Field CTO of Media and Entertainment, WEKA
Hosts: Stephen Foskett, President of the Tech Field Day Business Unit and Organizer of the Tech Field Day Event Series; Jeniece Wnorowski, Head of Influencer Marketing at Solidigm; Scott Shadley, Leadership Narrative Director and Evangelist at Solidigm
Follow Tech Field Day on LinkedIn, on X/Twitter, on Bluesky, and on Mastodon. Visit the Tech Field Day website for more information on upcoming events. For more episodes of Utilizing Tech, head to the dedicated website and follow the show on X/Twitter, on Bluesky, and on Mastodon.
Transcript
Storage software running on modern hardware can deliver incredible performance and capability to support AI applications.
This episode of Utilizing Tech wraps up our season with a discussion of Weka's data platform for AI,
with Alan McSeveney of Weka, as well as Scott Shadley of Solidigm, and myself.
Learn how modern hardware is
transforming storage for AI in this episode.
Welcome to Utilizing Tech, the podcast about emerging technology from Tech Field Day, part of the Futurum group.
This season is presented by Solidigm and focuses on AI at the edge and related technologies.
I'm your host, Stephen Foskett, organizer of the Tech Field Day event series, and joining me from Solidigm for this final episode of our season, once again, is my co-host and old friend, Scott Shadley.
Welcome to the show, Scott.
Hey, Stephen.
It's great to have fun doing this season with you, and it's sad and also great to know that
we've made it through another season of these wonderful episodes and some amazing conversations
that we've had over the last few episodes.
And working with you guys has always been so much fun.
Yeah, it's been really great.
It's been great welcoming you as a co-host.
I knew you could do it.
Glad to have you.
Yeah, I like to talk about things, that kind of stuff.
So when it comes to talking tech, it's a lot of fun.
And the guests and yourself make it very entertaining.
Well, that's what I was just going to say. I mean, the guests are incredible. You know,
we get so much great insight from them and just so much perspective on how this AI thing
is being implemented around the world. You know, I think that people have this feeling
that AI is somehow kind of a big iron thing, that
it's some supercomputer in a big data center that's sucking down gigawatts of power.
And it is, it is, but it's more than that.
AI is being implemented outside the data center in smaller environments, at the edge, maybe
let's say interesting venues, entertainment venues, all sorts of
things.
Exactly.
It's not just the home of big iron, right?
Yeah, that's the wonderful thing about this: it's kind of the convergence of a whole
bunch of different technologies at once, and the ability to generate data in the way that
we can generate data and then actually do something with it in a more meaningful way, as we talked about in a couple of
previous episodes of what people are doing to go back in time and bring that
forward with the technologies that we have available today. So today's a fun
one too because we have a literal convert joining us today from being a
customer to an employee and so that's kind of fun as we bring Alan from Weka along.
Hi, I'm Alan McSeveney.
I'm the field CTO for media and entertainment and related AI at Weka.
I recently joined only in December, but I was a customer of Weka
for four years prior to that in a cloud-based visual effects studio.
Um, most of my career has been related to creative industries.
Um, actually I was a professional musician for the first 10 years of my career, and really
enjoyed marrying, um, creativity and technology, and really pushing the boundaries of what could be done
with technology, back when audio and music was actually quite a challenging thing you could
do on a computer, as opposed to now, where you could run an entire studio on a laptop.
But I transitioned out of there into the visual effects world and large-scale playback systems.
And that's been an immensely rewarding career.
Just using technology and being able to push boundaries has really been a place where I'm
kind of happy.
Well, it's interesting that you mention audio and audiovisual. Yeah, old Atari ST guy here, so I know a little
something about using personal computers for music.
What happened there was basically specialized hardware gave way to software running on commoditized
hardware.
And Scott and I, in our careers in enterprise storage, have seen the same thing
happen, where what was once the domain of literally special
boards, special processors, special everything has now become the domain of software.
And that's really the story of Weka too.
My understanding is that essentially the founding team and the origin of the product was what if this was done in software and what if we took advantage
of the latest, you know, incredible advances in more commodity hardware, especially NVMe,
but also all the things that you can do now on the processor.
And it worked, you know, I mean, this has been something where the software-based storage solution from Weka has literally become the
bedrock of AI, right?
Yeah, I mean the marriage of three things really allowed Weka to become not just a product
but a vision in the first place, which was someone soldering an SSD onto a PCIe card, someone coming up with the concept of containerization, and then the network teams out there far surpassing everyone's expectations and blowing away
what anybody considered could be achieved with network performance. So you put those three things together and now you can access
NVMe
storage across an array of servers, all over the network, while orchestrating
all of this through a container.
That is essentially what Weka is, um, it gives you a huge expandable data
platform, uh, that is all NVMe based, that is addressable over the network,
and that will at times not only outperform local NVMe within a client, but sometimes even DRAM.
It's a creative innovation where you guys are literally transforming that ecosystem
to turn the underlying hardware that comes from someone like ourselves into something
that people just see as right next door. It's kind of unique that the software
platforms and the data platform you guys have has that ability to transition data
from point A to point B as if it were just sitting there and not have to worry
about all that transition time because to your point about networking, we keep seeing networks go up and down and faster
and slower.
And as we start getting further away from the main processing, that ability to see that
localized data is something unique that you guys are working on.
Yeah, I mean, there's a bottleneck always somewhere, right?
You know, every so often it gets moved to somewhere else.
But network performance today is astonishing.
You know, we're seeing Ethernet networks up to 400 gigabit.
We're able to push data into a single host at hundreds of gigabytes a second. It's not something I thought we would see so quickly,
but this is where we are today.
I think you're right.
The idea of having data local to whatever your processes are,
whether it's on a laptop or on whatever compute you're using and then
expecting to have to transfer that someplace and there'll be some penalty or
some time spent or some transfer process that would just make you sigh. That has been most
of our collective experience and history.
Today, that is just far from the case.
Actually having all data centralized
and accessible over network can be more performant
than having it local.
And that's one of the challenges, I think,
when it comes to AI.
Because this hardware is incredibly capable.
But I don't know that every system can take advantage of those capabilities
in order to kind of move the bottlenecks out of the way.
I mean, the entire history of technology is all about moving bottlenecks.
We eliminated this one, and then it pops up over here.
We eliminated that one, it pops up over there.
And if you look at this kind of classic computer system hierarchy with processors and memory
and storage, storage for a long time was just the ultimate bottleneck.
With SSD that has been dramatically reduced, as you say, with things like NVMe and now
with Ethernet networking, not to mention proprietary
networks, it's been reduced further.
But the demand for data from these AI processors is just absolutely off the charts.
It's insatiable.
One of the things we've heard about repeatedly this season and over the last year on Utilizing
Tech is basically the need to feed the beast. If you are not keeping your expensive
GPUs fed, then you're essentially wasting money every minute, every hour that they're not working
at maximum capacity. That's pretty much what companies are looking at Weka to solve with
software, right? Yeah, I mean the limiting factor with a large data center enterprise-scale GPU today is memory
capacity.
The amount of parallel compute available in a cutting-edge GPU is mind-blowing, but the memory footprint on each card
is just not where the processes that are running today
need it to be.
Our aim is to be able to augment that memory with a place
where essentially you can tier the data that
should be in memory off to Weka at such a data rate
that it can also be retrieved so fast that it becomes practical to now scale your memory footprint
into petabytes of space. There's obviously, you know, we're talking tiers of performance here, but when we're able to outperform
in some cases what DRAM could provide to GPU memory at that type of scale, hundreds of
petabytes if you like, it really starts to be a paradigm change. For example, in
LLM processes during prefill, a lot of time is spent calculating keys and values and creating KV cache data.
A lot of the time this will either just be cached to DRAM or to local NVMe as KV cache,
but this can only be used by other GPUs inside the same server, or within the same NVLink domain.
And now we can drop all of that KV cache out to Weka and make it available not just to other GPUs in the same server,
but to every single GPU in the entire data center.
And at rates where, for example, to calculate around 105,000 tokens of prefill takes
around 25 seconds on a GPU, we've been able to take that same KV cache
and place it back into GPU memory in around half a second.
So we're talking a more than 40x, almost 50x speedup
in some cases.
And every time there's any query that's performed on an LLM,
and this KV cache is generated,
we can just keep that for as long as that model is around.
So it never has to be recalculated again.
So the more and more that queries are common
across multiple processes or customers,
we just don't ever have to calculate it again.
And then the GPU can get on with the meaty part, which is decoding.
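To sketch how this cross-server KV cache reuse could look in practice, here is a minimal illustration in Python with PyTorch. The mount point, the cache layout, and the model.prefill() call are hypothetical assumptions for illustration only, not WEKA's actual API.

import hashlib
import os

import torch  # assumes the KV cache is held as PyTorch tensors

# Hypothetical mount point of a shared filesystem visible to every GPU server.
KV_CACHE_DIR = "/mnt/weka/kv-cache"

def cache_path(prompt_tokens: list[int]) -> str:
    """Map a prompt prefix to a stable path, so identical prefixes resolve
    to the same cached entry from any process on any server."""
    digest = hashlib.sha256(repr(prompt_tokens).encode()).hexdigest()
    return os.path.join(KV_CACHE_DIR, f"{digest}.pt")

def load_or_prefill(prompt_tokens: list[int], model):
    """Reuse the KV cache if any GPU in the cluster already computed it;
    otherwise run prefill once and publish the result cluster-wide."""
    path = cache_path(prompt_tokens)
    if os.path.exists(path):
        # Skip prefill entirely: pull the precomputed KV tensors back into
        # GPU memory (the episode cites ~0.5 s to restore vs ~25 s to recompute).
        return torch.load(path, map_location="cuda")
    kv_cache = model.prefill(prompt_tokens)  # hypothetical prefill API
    torch.save(kv_cache, path)  # now visible to every GPU in the data center
    return kv_cache

The design point is that the shared filesystem replaces per-server DRAM or local NVMe as the cache tier, so a prefix computed once never has to be recalculated anywhere in the cluster for as long as the model is in service.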
Yeah, you bring up an interesting point
about the idea of the tiering, right?
Because we all looked at it as kind of,
if you think of the hype cycle and all this kind of stuff,
I brought this up with some of the other examples
that we've gone through in this season.
But we're at the point now where people realize they need more of something and that something is really
being able to offload and shift the performance tier into an aspect of a larger footprint.
Like for example, the massive drives that we can provide give you those petabytes of
storage that can look like that fast memory.
Because if you overload the memory, again, you're just moving from bottleneck to bottleneck to bottleneck.
Eliminating those bottlenecks is really kind of the key here, and, you know, fast delivery, to your point.
It's really cool that you guys have had the ability to highlight the
performance characteristics in real time of what you're up to, with some of the work
you've done in some of the recent venues that have come to light.
So it's really interesting to see how you guys have been able
to transform the idea that it's really more about the data
and not where the data is sitting, and to let the user maximize their hardware configuration by way
of what you can do with your software.
Yeah, I mean, for example,
probably the most prominent place
that Weka could be seen in action would be the Las Vegas
Sphere.
So it feeds data to that screen.
It's involved in rendering.
It's involved in carving.
At this point, it's touching pretty much every aspect of content creation and delivery to the screen.
And we've seen enough interest of this where other large scale venues are looking to do the same.
Just being able to deliver this type of performance over a network is...
I mean, I don't want to say that we don't have any competition, but there's nothing else right now that is able to hit these numbers and is also built to scale up to the
correct size, not just in performance, but in capacity.
This is always the trade-off, right?
You look at systems that historically have been
extremely performant.
They're usually direct attached.
If you want them to be a little bigger,
you would switch to something like SAN.
That was never quite as performant
as direct attached storage, but it could be much bigger.
And then you could go bigger still and reduce some of the
complexity by deploying NAS, which would be slower still,
but could go larger.
And then if you wanted stupid scale, you could go to object
and also your performance goes through the floor.
So the place where Weka sits really
is beating the direct-attached storage performance
and also scaling all the way up to object.
And customers that have these massive high performance
requirements and large scale are coming and trying our product
and just can't really find anything else
that can hit those metrics.
So I'm sure people will catch up,
but today it's a good place to be.
If I can provide a little background there,
from a long-time storage nerd,
you know, it's funny that people talk about,
you know, what you just talked about,
SAN and NAS and object.
The scalability and performance of those is really a function not of the intrinsic element
or nature of the storage or the storage protocol.
It's about basically the modernization of the delivery mechanism and the software that's
constructed to deliver those. I think that that's sort of the insight
that some of these companies recently have had is that,
you know, the reason that direct attach storage
was high performance was because it was dedicated.
And the reason that object storage scaled so well
was because it was distributed.
And the idea that you could build a massive scale solution
that would kind of combine the best of all possible worlds
with software is really the reason that so many
of these modern systems are able to scale.
Frankly, it reminds me a lot of Kubernetes and the cloud,
and frankly, AI itself.
I mean, the reason that AI processing
is so incredibly power consuming and high performing
is because of this whole idea of distributing it,
breaking it up into small tasks
and distributing it massively among multiple nodes
in parallel.
That's exactly what's been going on
in the leading software for storage.
And that's the reason I think that Weka is able to scale the way it does.
It's not because of some specialized little trick in there.
It's simply because the system scales to just incredible levels,
just like the cloud does, just like AI does.
And I think that that makes it uniquely suited for this AI application because it's such a scalable platform,
because everything is just completely distributed.
There isn't some monolith somewhere that says,
this is only how fast it can run.
Everything is distributed.
Everything is running software.
I think that's how people think that things work,
but not everything works that way and yours certainly does.
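To make that distribution principle concrete, here is a toy sketch in Python: one logical read fans out across many storage targets in parallel, the way a scale-out system stripes data across nodes. The node list, stripe size, and fetch_stripe transport are illustrative stand-ins, not any real WEKA interface.

from concurrent.futures import ThreadPoolExecutor

# Illustrative storage nodes; a real client would discover these from the cluster.
NODES = [f"storage-{i:02d}" for i in range(16)]
STRIPE_SIZE = 1 << 20  # 1 MiB per stripe

def fetch_stripe(node: str, offset: int, length: int) -> bytes:
    """Stand-in for a network read of one stripe from one node;
    a real client would issue a TCP or RDMA read here."""
    return bytes(length)  # simulated payload

def parallel_read(size: int) -> bytes:
    """Read one logical object by pulling its stripes from many nodes at once.
    Because no single server owns the data, aggregate throughput grows with
    the node count -- the 'no monolith' property described above."""
    offsets = range(0, size, STRIPE_SIZE)
    with ThreadPoolExecutor(max_workers=len(NODES)) as pool:
        futures = [
            pool.submit(fetch_stripe, NODES[i % len(NODES)], off,
                        min(STRIPE_SIZE, size - off))
            for i, off in enumerate(offsets)
        ]
        return b"".join(f.result() for f in futures)

# Example: a 64 MiB read becomes 64 one-megabyte fetches spread over 16 nodes.
assert len(parallel_read(64 * (1 << 20))) == 64 * (1 << 20)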
Yeah, I mean, you touched on one technology that has been instrumental in achieving planetary
scale in anything, and that's Kubernetes. The ability to orchestrate containerization in a way where, as long as you can provide the resources behind
it, you can scale horizontally in an extremely resilient and redundant fashion is phenomenal.
So Weka recently released its own Weka operator, where you can actually provision an entire
Weka cluster or multiple Weka clusters deployed fully in Kubernetes. So if a Kubernetes-based shop
has compute that already has NVMe available in it, and that is all siloed per server,
Installing Weka via Kubernetes now
allows you to bundle all of this NVMe into one giant file
system that is available to every single server.
And if you're in a multi-tenancy environment,
you could actually compose more than one cluster
in this environment, shared across that infrastructure
with each customer having its own entire dedicated cluster
with cluster admin privileges per customer.
I don't know of another product that's doing that today,
and it's pretty wild.
I mean, normally you would treat storage like pets, and everything else could be treated
like cattle.
But today, actually, we've been able to run storage in a cattle ranch.
It's pretty interesting.
It's pretty wild.
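As a rough sketch of what operator-driven provisioning might look like, here is an example using the official Kubernetes Python client. The custom resource group, version, kind, and spec fields below are hypothetical placeholders, since the episode does not describe the operator's real schema; WEKA's operator documentation has the actual resource definitions.

from kubernetes import client, config

# Hypothetical WekaCluster custom resource; the real group/version/kind and
# spec fields belong to the actual operator documentation.
weka_cluster = {
    "apiVersion": "example.weka.io/v1alpha1",  # placeholder group/version
    "kind": "WekaCluster",                     # placeholder kind
    "metadata": {"name": "tenant-a-cluster", "namespace": "tenant-a"},
    "spec": {
        "nodes": 8,                # servers contributing their local NVMe
        "drivesPerNode": 4,        # NVMe devices pooled from each server
        "filesystem": {"name": "shared", "capacity": "500Ti"},
    },
}

config.load_kube_config()  # or load_incluster_config() when run inside a pod
api = client.CustomObjectsApi()

# The operator watches for objects of this kind and composes the siloed
# per-server NVMe into one filesystem reachable from every node.
api.create_namespaced_custom_object(
    group="example.weka.io",
    version="v1alpha1",
    namespace="tenant-a",
    plural="wekaclusters",
    body=weka_cluster,
)

In the multi-tenancy scenario described above, you would create one such object per tenant, giving each customer a dedicated cluster, with its own cluster admin privileges, on shared infrastructure.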
I really do appreciate that you're putting some focus on storage.
I mean, we've been the overlooked pet for quite some time.
I'm not sure if I like being a pet or on the cattle ranch,
but maybe I'm the dog managing the cattle. I like that idea.
But it's interesting, because you talk about this ability to shift
access to information across multiple
points of physical location, which plays well into our conversations this season
around Edge.
How far away some of those platforms can be from the user or the operator, right?
Because we all have this different definition of the word Edge and we've talked about those
definitions all season.
But realistically, I mean, how far out are you guys seeing the future of what
would be classified as your ability to reach closer and closer to where the data generation
point is?
Are there certain platforms or solutions that you're kind of investigating or already working
on?
Yeah, it's funny.
I think the definition of edge moves probably more often than the bottleneck moves, right?
And also one person's edge infrastructure could be larger than another's core infrastructure.
And certainly for some organizations, the amount of edge infrastructure they may have can vastly
outweigh what they have as core.
But I think as we see more robotics appearing in the world to feed data to all of these compute
processes that are literally out there in the world, not sitting in a data
center, it's going to become more important.
That will require different stages, different tiers, fantastic data
movement between all these tiers.
So yeah, I mean, edge network is going to be really important.
Edge data centers, edge storage within them, data tiering from there back to much larger data centers.
All the orchestration of this is, you know, it's an intense focus for Weka.
And, you know, we want to make sure that as that whole world gets more complex,
that we stay at the forefront of it.
And I think the nature of the solution kind of matches the needs there too,
because it's built up of sort of this parallel architecture;
you can scale up and scale down very effectively.
So you can use it at smaller scale,
well, comparatively smaller scale,
for AI processing outside the data center,
and then you can ramp it right up to massive scale,
and then you can use your tools to enable data to make that
leap from location to location and from size to size.
And I think that that's, again, that matches the way that people wish that software worked,
but it doesn't always work that way.
Yeah, I mean, so we have customers right now, for example, running autonomous vehicles all around the
world who are generating massive amounts of metrics and trying to send all that home is
not very efficient.
So deploying Weka in many, many different data centers all around the world that can
be as close to these vehicles as possible to collect all their metrics, clean them and reduce their size, and then send all of that back to
a core for additional research, for additional training and modeling, is a
place where Weka has been really successful. As that type of autonomy moves
into additional places within the edge with more robotics.
I think we're gonna see a lot more of that.
Another place that we've been successful
is within media and entertainment as well,
where the edge can serve to provide tool sets to talent
that can be all around the world because talent is something you can't really scale.
You have to find talent where it resides.
Many companies have to set up infrastructure where you might make a toolset available to people
for either a permanent or a temporary amount of time.
But they need huge performance within the compute,
within the storage, within rendering.
And all of this has to be able to seamlessly communicate
with all of the other parties that
are participating in the same project, for example. Um, so we've, we've had great success there.
Um, particularly in cloud, we've watched customers being able to deploy
temporary setups in different countries where you wouldn't normally even
have any footprint, um, employ talent in that area, and tear it all down when the
project's finished,
while you're bringing up more someplace else for another project.
That's been a bit of a game changer, actually.
When people think of AI, especially nowadays, I think a lot of them are just focused on chatbots,
and chatbots, and more chatbots. But of course there's a lot more
being done with this technology, whether it is using AI and ML in different ways or using
HPC for other related applications. I know that you all are involved in some of that.
Can you tell us a little bit about some other applications for this technology?
Yeah, I mean, for example, we have a few companies in
health and life sciences who are trying to solve some of the
hardest problems in the world here that really matter to
people. Memorial Sloan Kettering, for example, deployed
Weka to help speed up their modeling in the pursuit to solve many cancers.
And they have managed to massively contract the time it takes for a model to reach convergence,
and massively reduce their energy footprint at the same time, just by being able to achieve more miles per gallon on
the exact same hardware in a shorter time.
So this is going to be a game changer if Memorial Sloan Kettering can actually achieve what they think they can in the next few years,
which really excites me. I mean, it's fun to work on media and entertainment. It's fun to work on cars, you know, many things.
But when you actually see life changing work being done,
it's quite humbling.
Absolutely.
And it's always fun to hear about technology.
As I said, that's not just the same old thing
that people are thinking of and using AI in new and exciting ways.
Thanks so much for this incredible conversation.
Scott, this is our last episode of the season.
We have been thrilled to have Solidigm co-hosting two seasons of Utilizing Tech now.
And if our listeners go to utilizingtech.com,
they'll find both of those seasons
along with six other seasons previously.
I guess before we go, Scott, sum up a little bit
about season eight: AI data infrastructure,
AI at the edge.
How exactly should people be thinking
about data infrastructure and storage for AI?
Yeah, I appreciate that, and it has been a lot of fun this season. And I know Jeniece, my co-worker, has had fun over a couple of seasons as well.
From our perspective, and from my personal perspective: AI is a shiny object, right? But it is also something that is very real, very true.
And the fact that we're combining AI with the places where we're
generating the data, and we're generating so much data nowadays, it's striking that
people don't tend to realize that you have to put that data somewhere.
A lot of this season, whether intentional or not, has been focused on the advent and
benefit of storage.
And so it's kind of cool, as a long-time storage guy,
to see the value and the benefits of what we see
in our daily lives coming through to everyone else
as something of value.
Because you spend so much time working on data
and that data always seems to be the star of the show
in certain different processing and things like that.
But as you saw through the season, if you go back and look at it,
we've talked to a whole bunch of different ways of looking at managing data,
focusing on data, and all of that revolves around where the data sits.
And the data doesn't always just sit in the CPU or DRAM,
which are wonderful toys and tools, but it does have to have a longer-term placement than that.
So I see that as one of the bigger nuts
of this whole season is just, it's cool to know
that storage is really getting a play in the space
and all these companies are doing so many cool innovations
to again, shift those bottlenecks and talk about it,
whether it's the industry standards bodies,
the software platforms, the hardware platforms,
a combination of all of that.
So it's been a great season.
I've had a lot of fun and I've learned a lot myself. Well, thanks a lot. Yeah, it has been a great
season for me as well. Obviously, an old-time storage nerd here. It's fun to see where this
industry is headed and it is fun to see just how all of those things that we wished we could do have
in many cases come true with modern software. So just incredible overall. Alan, again, thank you for joining us
and representing Weka here on Utilizing Tech.
As we wrap up this episode, where
can people connect with you and continue this conversation?
Actually, this week, 18th and 19th,
Weka will be presenting at the three big AI
conferences, in San Francisco, London, and Singapore.
So if you can make it down to those, please come and hear what we're all about.
Great.
Scott, I guess going forward, where can people continue speaking with you and your colleagues
and learning more about Solidigm?
Yeah, for Solidigm, it's pretty straightforward:
solidigm.com/ai. And also, you can find me on LinkedIn, Bluesky, and X, formerly
known as Twitter, at SMShadley.
I tend to spend a lot of time having fun sharing insights and just being a little bit social.
Yep.
And you will find me as SFoskett on most of the socials, including Bluesky and Mastodon
as well,
and of course on LinkedIn. Thanks for listening to this episode of Utilizing Tech. You can find
this podcast in your favorite podcast application as well as on YouTube. As mentioned, this is the
last episode of season eight. Yes, that's right. There are eight seasons of this and you can go
back and listen to those all the way back to the pre-ChatGPT
era. If you enjoyed this discussion, please do leave us a rating and review. It's really nice to
see those. This podcast was brought to you by Solidigm this season, as well as Tech Field Day,
which is now part of the Futurum group. For show notes and more episodes, go to UtilizingTech.com
or find us on X/Twitter, Bluesky, or Mastodon at
Utilizing Tech. Thanks for listening and we will catch you next season on
Utilizing Tech.