In The Arena by TechArena - OCP Insights with Scott Shadley: AI, Collaboration & Storage
Episode Date: October 24, 2024. Join Allyson Klein and Jeniece Wnorowski in this episode of Data Insights as they discuss key takeaways from the 2024 OCP Summit with Scott Shadley, focusing on AI advancements and storage innovations....
Transcript
Welcome to the Tech Arena, featuring authentic discussions between tech's leading innovators
and our host, Allyson Klein. Now, let's step into the arena.
Welcome to the arena. My name is Allyson Klein, and today is another Data Insights
podcast, and that means Jeniece Wnorowski is back with me. Hey, Jeniece. Hi, Allyson. It's great to
be back. So, Jeniece, we have survived an OCP Summit for the ages. How are you doing after that
incredible event? Oh, wow. I don't think I expected the energy
of that place, right? I think by Monday I'll probably make a comeback, but all good. Very
exciting time, and it was probably one of the best OCPs I've been to in a while. Yeah, me too.
It was 7,000 people, which set a record by a long shot; I think last year was quite a bit smaller. New configurations, new technology, and the
hyperscalers showing up with incredible focus and determination to build out the next generation of
AI. Yeah, I couldn't agree more. And I was just thinking, you know, as a marketer, it's like
the Open Compute Project, right? It's like the amount of people that showed up this year, it is such a
community. I think they're going to have to rebrand themselves at some point. What I love about OCP is there's a
lot of communities out there, right? Trying to do good, trying to standardize, trying to bring
all this stuff together and have it make sense collectively. But the beauty of OCP is that it
takes those communities and brings communities within
communities together.
And walking away from the show, it's just the collaboration amongst, like you said,
some of the hyperscalers, some of the newer innovators on the scene.
The show floor was just buzzing with energy.
So I am really excited for what's going to happen post-OCP.
And there were some really big announcements too, right?
Yeah. I mean, the announcement with AMD and Intel on stage comes to mind, with two leaders from the semiconductor world stating that they're going to work together with the community on the evolution
of x86, really making that architecture something that evolves with customer requirements in mind
and embraces both of those companies' chips as something that can be accessed for the next
wave of technology. I thought that was amazing. As somebody who worked at Intel for a long time,
I didn't see that one coming. I don't think I did either. And I never thought I'd read that headline.
It was something like, hell freezes over, and AMD and Intel come together. And as a former Intel person myself,
I couldn't have been more proud because I agree this not only sets the trajectory for what Intel
and AMD can accomplish, but it really sets the tone for the community at large. Yeah, I had a
great conversation with Amber Huffman, who I know is near and dear to the storage industry's heart for many technologies, including NVMe.
But she's at Google now.
And one of the things that she said is with AI advancing in a trajectory of innovation every year, standards and Moore's Law and all the things that we've depended upon cannot move at the same pace.
And so the concept of OCP, where there's 80% alignment around a particular trajectory
with room for innovation for different vendors works for the AI era. And I love that model.
You know, I think that gives room for a lot of innovation and really frees up a lot of
engineering efforts to focus on how to move technology forward as quickly as possible.
Yeah, this shotgun approach, right?
Everything's moving so quickly with AI.
It's unlike anything we've ever seen, right?
But everyone at the same time is trying to figure it out, but also learn from others. So I think the only way to do this effectively is to really come together and
learn from your peers and your partners and see how things are being done and test it out and see
what works. So I couldn't agree more. I think, Allyson, and I don't know what you thought about this,
but last year when I traveled the show floor, I saw a lot of liquid cooling solutions, right? And this year,
I feel like I saw some of that. There was certainly a lot of liquid cooling, but I saw a lot more
applications, right? I saw a lot more workloads and AI being run in conjunction with that cooling.
I don't know. What did you think about the emphasis on cooling and power?
I was thinking about the people that we talked to
in episodes from this week. We certainly saw cooling vendors showcasing what they're delivering
from a standpoint of direct and immersion cooling. We had CoolIT on the show, for example, talking
to us about what they're doing with direct cooling solutions. We certainly had a really innovative
example in our panel with
iZotope where they're integrating that into a rack level configuration that's pretty sweet.
And then, of course, the immersion folks were there. I think that people are talking about
power. I don't know if you headed over into Flex's booth, but they were showing off some
pretty cool power technologies. And they were not the only one. There were a number of folks.
But I think you're right.
Compute kind of won the day for me.
And I think that compute infrastructure, meaning compute, storage, and networking, was much more present and much more vibrant than what it felt like last year.
I thought that was really exciting to see.
I know that we have a guest today.
And I would love to hear about the guest
and what we're going to be talking about. Yeah, I am excited to introduce a colleague
within Solidigm. I'd like to introduce Mr. Scott Shadley, whom a lot of folks in the industry
know. Welcome to the show, Scott. Hey, how's it going? Glad to be here. It's always fun to
have a conversation about what's going on in the world.
And OCP is one of those very unique events.
So I appreciate it, Allyson and Jeniece, for having me on today.
So, Scott, you were just listening to Jeniece and me talk about the summit.
What was your big takeaway?
And I know that you were part of the content delivery at the show.
Yeah, I've participated in roughly seven different versions of OCP, from some of the early beginnings through the pandemic and then this post-pandemic rebound, which is great to see, to the point about the level of attendance, and even the show floor to that point being much more vibrant.
It was a lot larger and kind of back to previous years this time around.
So I really appreciated that. Overall, I think one of the unique things about OCP is that it sets companies up to co-develop, co-brand, and deploy things in that way. And I think that
was perfectly set in motion by one of the keynotes given by a complete consumer of all of this and
not even any of the vendors, in the way of the GEICO keynote, where they talked about going from
a cloud environment back to on-premise by way of the OCP architectures that were discussed. I thought that was very interesting.
Yeah. And we're seeing a lot of buzz with that too, right, Scott, in terms of just migrating
from off-prem to on-prem, which five years ago, right, that wasn't the case. Everyone was
migrating the other direction. So I agree that was really interesting. Scott, you just said
something that jogged my memory, something about taking a little spin.
And I'd love for you to just tell us, like, what in the world were you doing on stage?
What was that?
At events like these, you've got to think of interesting new ways to gain attention and visibility and drive new ideas.
And so we got this crazy idea within Solidigm that for one of the presentations, we'd stick a bike on a trainer on stage. And I did a great job of, fortunately, not passing out while I pedaled for 15 minutes while pitching a deck highlighting the ability to save and/or generate power. So the idea of riding the
bike was trying to keep me carbon neutral. And then it parlayed very well into our booth presence
there, where we actually had a bike that anybody could hop on and ride. They only had to go for 30 seconds, though.
So I still feel quite empowered
by the fact that I managed to convey
quite a compelling efficiency message
while pedaling a bike for 15 minutes.
Why is storage, Scott,
so much of a focus for efficiency
in the data center today?
I thought that the demonstration
that you guys had in your booth
was fantastic to just shine a light on it.
But why is that so important as folks grapple with the amount of power that they're spending in those data center
racks?
Yeah, it's interesting, because a lot of the initial focus, as this AI evolution kicked off, has been compute and GPUs, right? We see the NVIDIA roadmap, all that kind of stuff. And having been
in this industry for as long as I have, and if I say the number, it makes me feel really old, but storage has never really been a key component
or focus. A lot of people know it's there and they expect it, but they don't tend to actually
pay that close attention to it. And what we're starting to see now is that people have gotten
past what I would call the pre-phase-one of AI, because we're nowhere near the maturity of this technology
and the direction we're going, and they're realizing the storage impact, right? So if you need space
for GPUs, what can you take out of a data center? There are very few things you can actually remove
to free up that space, with the exception of compacting the data storage devices. So the idea of this
continued transition from the hard drive industry to these now amazingly high-capacity 61-terabyte and soon-to-be 100-plus-terabyte drives, available now from Solidigm and eventually from others down the road as they follow behind, allows us to rethink how those architectures are put together. And so it brings that storage picture back into
play. And there are great examples of that, where we've seen that storage accounts for almost 40% of power consumption in the data center. Yet today people are thinking, I've just got to find ways of powering a GPU. What if I reduce the demand of that 40% by putting in more efficient storage? And so that's why we're starting to see, as we do this next-level look at the architectures, that the storage products are actually starting to have more value.
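A rough back-of-the-envelope sketch of the argument above, using the roughly 40% storage share of data center power cited in the conversation; the facility size and the assumed efficiency gain are hypothetical, not figures from the episode:

```python
# Minimal sketch of the "reduce the demand of that 40%" reasoning.
# Only the ~40% storage share comes from the conversation; the facility
# power and the assumed efficiency gain are illustrative placeholders.

facility_kw = 1_000.0      # hypothetical total facility power draw
storage_share = 0.40       # storage share of power cited in the episode
storage_power_cut = 0.50   # assumed reduction from denser, more efficient drives

storage_kw = facility_kw * storage_share
freed_kw = storage_kw * storage_power_cut

print(f"Storage draw today: {storage_kw:.0f} kW")
print(f"Freed for GPUs:     {freed_kw:.0f} kW "
      f"({freed_kw / facility_kw:.0%} of the facility)")
```

Under those assumptions, halving the power of the storage tier frees roughly a fifth of the facility's power budget for compute, which is the lever Scott is pointing at.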
Yeah, thank you for that, Scott.
You just jogged my memory on a couple of things, right?
How does one not just look at the GPU or look at the compute, but look at the overall picture? And, you know, sometimes storage, at least in the past, hasn't been the thing that folks really look at.
So, Allyson, I think your question was spot on. But can you tell us, Scott, how the core compute customers differ from the edge customers, and how are they benefiting from denser, more power-efficient storage?
Yeah, absolutely. That's a very good look at how you take this next step
in that evolution. We all know the CSPs, the hyperscale data centers, even these Fortune 1500, Fortune 50
companies have these very large centralized data storage architectures. It's about plowing a
cornfield and putting up a big, huge building and all that supports it. But when we start pushing
towards the edge of these infrastructures, we start to see a lot more opportunity for storage
to play an even bigger role, mainly because if you think about where they're putting them,
they're going in less desirable locations in some cases.
A prime example I brought up in my presentation was
a friend of mine is running a data center
that's actually stored inside an old church in Europe.
So they don't get the full height racks.
They don't get the full cooling, the raised floors, all that kind of stuff.
So the amount of what they're putting in there needs to be relevant, but it's not as effective a use of space. So if I have to have
a hundred drives to support something, I don't have that room for the GPUs I need. But if I can
get that hundred drives down to 10, 20, 30, by way of increasing the density of those drives in a
smaller rack footprint, even more better. And not that more better is a word, but it came out.
So another aspect of that is these edge platforms, as you get even further out to the endpoints,
you're not getting a full-size server.
You're getting a one-U pizza box or one-U half rack, and they don't fit the 24 drives
that you had before.
You only get four, six, or eight.
But I still need the density because we're generating most of the data where that smallest
box is. Four 64-terabyte drives are a heck of a lot better storage than, say, four or ten 3.5-inch hard drives, right? Because they can't even get close to that
capacity point. So those are some of the key changes that we're seeing that are helping drive
this focus as you push out towards that edge environment.
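As a rough illustration of the consolidation math described above, the sketch below checks how many drives a fixed edge capacity target takes and whether they fit a small chassis; the 240 TB target, the four-bay box, and the 24 TB hard drive are illustrative assumptions, with only the 61.44 TB SSD capacity reflecting a shipping high-density drive:

```python
# Sketch of the edge consolidation argument: drives needed for a fixed
# capacity target, and whether they fit a small edge chassis.
# Target, bay count, and HDD capacity are illustrative assumptions.
import math

target_tb = 240        # hypothetical capacity needed at the edge site
bays_available = 4     # hypothetical 1U "pizza box" drive bays

options = {
    "24 TB 3.5-inch HDD": 24.0,
    "61.44 TB QLC SSD": 61.44,
}

for name, tb in options.items():
    drives = math.ceil(target_tb / tb)
    verdict = "fits" if drives <= bays_available else "does not fit"
    print(f"{name}: {drives} drives needed, {verdict} in {bays_available} bays")
```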
When you think about storage, and one of the
things that I think about is capacity of drives and throughput of drives, what kind of performance
characteristics are right for AI, and how does that equate to storage efficiency?
I love that question. Thank you. It wasn't even prompted. But when we first started this journey
in the SSD world in the enterprise space, it
was all about, ooh, look how fast it can go.
And a single drive that can drive millions of IOPS is amazing.
But if you think about that at scale, there's no system on the planet, even with the GPUs and CPUs, that can actually consume those millions of IOPS across hundreds of drives.
And so the performance characteristics of most of the
architectures that we're looking at are gated by the slowest or the smallest pipe. And that
smallest pipe today has been driven by the hard drive. So if we're 1x, 2x, 3x, 4x a hard drive,
we're actually satisfying a majority of these workloads without even having to try. So the availability of these high-density drives alongside these performance-optimized solutions is a much better fit for this type of workload. Now, there are occasions where
the million plus IOPS drives are great for certain pieces of that pie. And we have those products
too. There's a reason why the portfolio is branded that way, because throughput is based on the implementation, and you always have to look at the weakest link. And at this point in time, there's really never an instance where an SSD product in a modern architecture is ever that weakest link.
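A minimal sketch of the "gated by the smallest pipe" point Scott makes: end-to-end throughput is set by the slowest stage, so once the drives are no longer that stage, adding drive speed stops helping. The stage names and bandwidth numbers are illustrative assumptions, not measurements from the episode:

```python
# Sketch: end-to-end throughput is limited by the weakest link in the chain.
# Stage bandwidths (GB/s) are illustrative assumptions, not measurements.

def bottleneck(stages: dict) -> tuple:
    """Return the limiting stage and the throughput it imposes."""
    name = min(stages, key=stages.get)
    return name, stages[name]

stages = {
    "24 x HDD shelf": 24 * 0.28,           # ~280 MB/s sequential per HDD
    "24 x NVMe SSD": 24 * 7.0,             # ~7 GB/s per Gen4 SSD
    "storage network (2 x 100GbE)": 25.0,  # ~25 GB/s usable
    "GPU server ingest": 50.0,             # assumed host-side ceiling
}

name, gbps = bottleneck(stages)
print(f"With HDDs present, bottleneck: {name} at ~{gbps:.1f} GB/s")

# Swap the HDD shelf out and the limit moves to the network, so even a
# million-IOPS drive would not change the delivered throughput.
del stages["24 x HDD shelf"]
name, gbps = bottleneck(stages)
print(f"All-flash, bottleneck: {name} at ~{gbps:.1f} GB/s")
```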
Well said, Scott. You know, the question always comes up, even at the show, right? How many customers are using hard disk drives versus an all-flash array? And it's fascinating to me to ask, how do you even run an AI workload on something that's not all flash, right? I appreciate that insight. We've asked a ton of questions, though.
One of the things I would love to know from you is, I know after you got off the bike on the stage,
there were a lot of people asking you questions. What was the audience really curious about? What
were they asking you? A lot of it was around the partnerships and how
we're taking a different look at it. So if you go back through the marketing roles of different events and things like that, it's, look at this cool new thing I've got, look at this cool new thing I can do, but it's never, this is how I took my cool new thing and made it work. And so we showed
several examples with some of our partners up on stage of how they helped us shrinkify or simplify
or improve upon a given infrastructure that they needed to do something with that our products
helped them with. And people are like, I didn't hear anything about how fast your drive was. I
didn't hear about this, but wow, that partnership story, the way you were able to re-architect because you talked to your customers and actually put the customer first, is amazing.
And that was a fun little sidestep of that as I got off stage.
So, Scott, what do you think we should be expecting from Solidigm in the year ahead?
I mean, you guys have done some amazing things in 2024.
Your PCIe drive gained a lot of headlines.
You're delivering performance.
You're delivering capacity.
What's next?
There's a wonderful roadmap, if you will, for us to trudge along with the standard products,
which is great.
We can think of great new innovation ideas.
So we have a whole strategy office that's looking at the next types of opportunities, what we can do to parlay AI and the drives in some new and innovative ways that may or may not be in the marketplace today. And you saw some of that in the OCP future technologies innovation area. And I participated in some of that activity as well.
But realistically, I think people are starting to realize as we get these bigger and bigger drives,
there's always been that challenge when you have somebody who's always had a problem or done it one way to rethink it the new way.
And one of those things that hasn't really come up in a lot of this is this concept of
the reliability aspects of a product.
So hard drives fail.
We all know they do.
They're very mechanical.
And there's the definition of how to track failures by mean time between failures, MTBF.
And with SSDs and the technology and especially the reliability factors
that Solidigm has put into their drives with their enterprise focus, the concept is really
mean time to failure, MTTF. And while the numbers sound similar and they look the same on a roadmap, the idea is I'm going to give you a prediction of when it will fail, if and when it will fail, not a guarantee of how often it's going to fail.
And it's commonly looked at in this marketplace as the blast radius. If I lose X, how bad an impact does it have? And that's why hard drive architectures have always been built the way they
are. It's about redundancy and, oh my gosh, what happens if, what happens when, what happened,
that kind of thing. With the SSD infrastructures that we're building today and the reliability levels that we can bring to these very dense products, we're
actually removing that as a concern for a lot of these people, because if a drive
fails, the failure is much more manageable from a perspective of how to keep that moving
forward.
And so we're starting to see people rethink how they use things like RAID and the mirroring
and the striping and things like that in this architecture, all because of the reliability that we can bring with these products.
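To make the blast-radius trade-off above concrete, here is a small sketch comparing two hypothetical fleets of equal capacity; the annualized failure rates and the 24 TB hard drive capacity are assumptions for illustration, not vendor figures, and only the 61.44 TB SSD capacity reflects a shipping drive:

```python
# Sketch: expected failures per year vs. "blast radius" per failure.
# AFR (annualized failure rate) values are illustrative assumptions.
import math

target_pb = 1.0  # hypothetical fleet capacity target, in petabytes

fleets = {
    "24 TB HDD fleet":    {"tb": 24.0,  "afr": 0.015},  # assumed ~1.5% AFR
    "61.44 TB SSD fleet": {"tb": 61.44, "afr": 0.005},  # assumed ~0.5% AFR
}

for name, f in fleets.items():
    drives = math.ceil(target_pb * 1000 / f["tb"])
    failures_per_year = drives * f["afr"]   # expected drive failures per year
    print(f"{name}: {drives} drives, ~{failures_per_year:.2f} failures/year, "
          f"{f['tb']} TB to rebuild per failure")
```

Each failure of a high-capacity drive touches more data, but with far fewer drives and a lower failure rate the fleet sees fewer rebuild events overall, which is why Scott says people are rethinking RAID, mirroring, and striping choices.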
Scott, we are so lucky to have you, because not only are you a great contributor to Solidigm, you yourself are connected to so many other communities. Would you mind sharing with the team here and our listeners where folks can get more information about you, where to find you for follow-up questions, but also where they go for Solidigm?
Yeah, no problem.
I'm a terrible self-promoter, so I don't have any fancy personal website or blog or anything like
that. LinkedIn, look me up, Scott Shadley. It's pretty easy to find me. I do participate. As you
mentioned, I sit on the board of directors of SNIA. SNIA is one of the other standards bodies that participates in all of this. And I work with NVM Express, and we're looking at all the other organizations of that type as well. For Solidigm, if you're very interested
on where we're going with this, and especially on the track of the AI space, it's solidigm.com/AI.
Thanks so much, Scott. It's been so interesting to listen to you and hear your perspectives.
Thanks for spending some time with Jeniece and me.
And Jeniece, with this, we are going to wrap another episode and a series of episodes from
Open Compute.
What a fantastic week.
And thank you so much for the collaboration.
Thank you, Allyson.
It's my pleasure.
This has been amazing.
Yeah, it's been great.
I appreciate it.
Looking forward to coming back.
Thanks for joining the Tech Arena. Subscribe and engage at our website, thetecharena.net.
All content is copyright by the Tech Arena.