In The Arena by TechArena - OCP Insights with Scott Shadley: AI, Collaboration & Storage
Episode Date: October 24, 2024. Join Allyson Klein and Jeniece Wnorowski in this episode of Data Insights as they discuss key takeaways from the 2024 OCP Summit with Scott Shadley, focusing on AI advancements and storage innovations....
Transcript
Welcome to the Tech Arena, featuring authentic discussions between tech's leading innovators
and our host, Allyson Klein. Now, let's step into the arena.
Welcome to the arena. My name is Allyson Klein, and today is another Data Insights
podcast, and that means Jeniece Wnorowski is back with me. Hey, Jeniece. Hi, Allyson. It's great to
be back. So, Jeniece, we have survived an OCP Summit for the ages. How are you doing after that
incredible event? Oh, wow. I don't think I expected the energy
of that place, right? I think by Monday I'll probably make a comeback, but all good. Very
exciting time, and it was probably one of the best OCPs I've been to in a while. Yeah, me too.
It was 7,000 people, which set a record by a long shot; I think last year was quite a bit smaller. New configurations, new technology, and the
hyperscalers showing up with incredible focus and determination to build out the next generation of
AI. Yeah, I couldn't agree more. And I was just thinking, you know, as a marketer, it's like
the Open Compute Project, right? It's like the amount of people that showed up this year, it is such a
community. I think they're going to have to rebrand themselves at some point. What I love about OCP is there's a
lot of communities out there, right? Trying to do good, trying to standardize, trying to bring
all this stuff together and have it make sense collectively. But the beauty of OCP is that it
takes those communities and brings communities within
communities together.
And walking away from the show, it's just the collaboration amongst, like you said,
some of the hyperscalers, some of the newer innovators on the scene.
The show floor was just buzzing with energy.
So I am really excited for what's going to happen post-OCP.
And there were some really big announcements too, right?
Yeah. I mean, the announcement with AMD and Intel on stage comes to mind, with two leaders from the semiconductor world stating that they're going to work together with the community on the evolution
of x86, really making that architecture something that evolves with customer requirements in mind
and embraces both of those companies' chips as something that can be accessed for the next
wave of technology. I thought that was amazing. As somebody who worked at Intel for a long time,
I didn't see that one coming. I don't think I did either. And I never thought I'd read that headline.
It was something like, hell freezes over, and AMD and Intel come together. And as a former Intel person myself,
I couldn't have been more proud because I agree this not only sets the trajectory for what Intel
and AMD can accomplish, but it really sets the tone for the community at large. Yeah, I had a
great conversation with Amber Huffman, who I know is near and dear to the storage industry's heart for many technologies, including NVMe.
But she's at Google now.
And one of the things that she said is with AI advancing in a trajectory of innovation every year, standards and Moore's Law and all the things that we've depended upon cannot move at the same pace.
And so the concept of OCP, where there's 80% alignment around a particular trajectory
with room for innovation for different vendors works for the AI era. And I love that model.
You know, I think that gives room for a lot of innovation and really frees up a lot of
engineering efforts to focus on how to move technology forward as quickly as possible.
Yeah, this shotgun approach, right?
Everything's moving so quickly with AI.
It's unlike anything we've ever seen, right?
But everyone at the same time is trying to figure it out, but also learn from others. So I think the only way to do this effectively is to really come together and
learn from your peers and your partners and see how things are being done and test it out and see
what works. So I couldn't agree more. I think, Allyson, and I don't know what you thought about this,
but last year when I traveled the show floor, I saw a lot of liquid cooling solutions, right? And this year,
I feel like I saw some of that. There was certainly a lot of liquid cooling, but I saw a lot more
applications, right? I saw a lot more workloads and AI being run in conjunction with that cooling.
I don't know. What did you think about the emphasis on cooling and power?
I was thinking about the people that we talked to
in episodes from this week. We certainly saw cooling vendors showcasing what they're delivering
from a standpoint of direct and immersion cooling. We had CoolIT on the show, for example, talking
to us about what they're doing with direct cooling solutions. We certainly had a really innovative
example in our panel with
iZotope where they're integrating that into a rack level configuration that's pretty sweet.
And then, of course, the immersion folks were there. I think that people are talking about
power. I don't know if you headed over into Flex's booth, but they were showing off some
pretty cool power technologies. And they were not the only one. There were a number of folks.
But I think you're right.
Compute kind of won the day for me.
And I think that compute infrastructure, meaning compute, storage, and networking, was much more present and much more vibrant than what it felt like last year.
I thought that was really exciting to see.
I know that we have a guest today.
And I would love to hear about the guest
and what we're going to be talking about. Yeah, I am excited to introduce a colleague
within Solidigm. I'd like to introduce Mr. Scott Shadley, whom a lot of folks in the industry
know. Welcome to the show, Scott. Hey, how's it going? Glad to be here. It's always fun to
have a conversation about what's going on in the world.
And OCP is one of those very unique events.
So I appreciate it, Allyson and Jeniece, for having me on today.
So, Scott, you were just listening to Jeniece and me talk about the summit.
What was your big takeaway?
And I know that you were part of the content delivery at the show.
Yeah, I've participated in roughly seven different versions of OCP, from some of the early beginnings through the pandemic and then this post-pandemic rebound, which is great to see, to the point about the level of attendance, and even the show floor to that point being much more vibrant.
It was a lot larger and kind of back to previous years this time around.
So I really appreciated that. Overall, I think one of the unique things about OCP is that it sets companies up to co-develop, co-brand, and deploy things in that way. And I think that
was perfectly set in motion by one of the keynotes given by a complete consumer of all of this and
not even any of the vendors, in the way of the GEICO keynote, where they talked about going from
a cloud environment back to on-premise by way of the OCP architectures that were discussed. I thought that was very interesting.
Yeah. And we're seeing a lot of buzz with that too, right, Scott, in terms of just migrating
from off-prem to on-prem, which five years ago, right, that wasn't the case. Everyone was
migrating the other direction. So I agree that was really interesting. Scott, you just said
something that jogged my memory, something about taking a little spin.
And I'd love for you to just tell us, like, what in the world were you doing on stage?
What was that?
At events like these, you've got to think of interesting new ways to gain attention and visibility and drive new ideas.
And so we got this crazy idea within Solidigm that for one of the presentations, we'd stick a bike on a trainer on stage. And I did a great job of, fortunately, not passing out while I pedaled for 15 minutes while pitching a deck highlighting the ability to save and/or generate power. So the idea of riding the
bike was trying to keep me carbon neutral. And then it parlayed very well into our booth presence
there, where we actually had a bike that anybody could hop on and ride. They only had to go for 30 seconds, though.
So I still feel quite empowered
by the fact that I managed to convey
quite a compelling efficiency message
while pedaling a bike for 15 minutes.
Why is storage, Scott,
so much of a focus for efficiency
in the data center today?
I thought that the demonstration
that you guys had in your booth
was fantastic to just shine a light on it.
But why is that so important as folks grapple with the amount of power that they're spending in those data center
racks?
Yeah, it's interesting, because a lot of the initial focus, as this AI evolution kicked off, has been compute and GPUs, right? We see the NVIDIA roadmap, all that kind of stuff. And having been
in this industry for as long as I have, and if I say the number, it makes me feel really old, but storage has never really been a key component
or focus. A lot of people know it's there and they expect it, but they don't tend to actually
pay that close attention to it. And what we're starting to see now is that people have gotten
past what I would call the pre-phase-one of AI, because we're nowhere near the maturity of this technology
and the direction we're going, and they're realizing the storage impact, right? So if you need space
for GPUs, what can you take out of a data center? There are very few things you can actually remove
to free up that space, with the exception of compacting the data storage devices. So the idea of this
continued transition from the hard drive industry to these now amazingly high-capacity 61-terabyte and soon-to-be 100-plus-terabyte drives, available now from Solidigm and eventually from others down the road as they follow behind, allows us to rethink how those architectures are put together. And so it brings that storage picture back into
play. And there are great examples of that, where we've seen that storage accounts for almost 40% of power consumption in the data center. Yet today people are thinking, I've just got to find ways of powering a GPU. What if I reduce the demand of that 40% by putting in more efficient storage? And so that's why we're starting to see, as we do this next-level look at the architectures, that the storage products are actually starting to have more value.
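A rough back-of-the-envelope sketch of the argument above, using the roughly 40% storage share of data center power cited in the conversation; the facility size and the assumed efficiency gain are hypothetical, not figures from the episode:

```python
# Minimal sketch of the "reduce the demand of that 40%" reasoning.
# Only the ~40% storage share comes from the conversation; the facility
# power and the assumed efficiency gain are illustrative placeholders.

facility_kw = 1_000.0      # hypothetical total facility power draw
storage_share = 0.40       # storage share of power cited in the episode
storage_power_cut = 0.50   # assumed reduction from denser, more efficient drives

storage_kw = facility_kw * storage_share
freed_kw = storage_kw * storage_power_cut

print(f"Storage draw today: {storage_kw:.0f} kW")
print(f"Freed for GPUs:     {freed_kw:.0f} kW "
      f"({freed_kw / facility_kw:.0%} of the facility)")
```

Under those assumptions, halving the power of the storage tier frees roughly a fifth of the facility's power budget for compute, which is the lever Scott is pointing at.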
Yeah, thank you for that, Scott.
You just jogged my memory on a couple of things, right?
How does one not just look at the GPU or look at the compute, but look at the overall picture? And, you know, sometimes storage, at least in the past, hasn't been the thing that folks really look at.
So, Allyson, I think your question was spot on. But can you tell us, Scott, how the core compute customers differ from the edge customers, and how are they benefiting from denser, more power-efficient storage?
Yeah, absolutely. That's a very good look at how you take this next step
in that evolution. We all know the CSPs, the hyperscale data centers, even these Fortune 1500, Fortune 50
companies have these very large centralized data storage architectures. It's about plowing a
cornfield and putting up a big, huge building and all that supports it. But when we start pushing
towards the edge of these infrastructures, we start to see a lot more opportunity for storage
to play an even bigger role, mainly because if you think about where they're putting them,
they're going in less desirable locations in some cases.
A prime example I brought up in my presentation was
a friend of mine is running a data center
that's actually stored inside an old church in Europe.
So they don't get the full height racks.
They don't get the full cooling, the raised floors, all that kind of stuff.
So the amount of what they're putting in there needs to be relevant, but it's not as effective a use of space. So if I have to have
a hundred drives to support something, I don't have that room for the GPUs I need. But if I can
get that hundred drives down to 10, 20, 30, by way of increasing the density of those drives in a
smaller rack footprint, even more better. And not that more better is a word, but it came out.
So another aspect of that is these edge platforms, as you get even further out to the endpoints,
you're not getting a full-size server.
You're getting a one-U pizza box or one-U half rack, and they don't fit the 24 drives
that you had before.
You only get four, six, or eight.
But I still need the density because we're generating most of the data where that smallest
box is. Four 64-terabyte drives are a heck of a lot better storage than, say, four or ten 3.5-inch hard drives, right? Because they can't even get close to that
capacity point. So those are some of the key changes that we're seeing that are helping drive
this focus as you push out towards that edge environment.
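As a rough illustration of the consolidation math described above, the sketch below checks how many drives a fixed edge capacity target takes and whether they fit a small chassis; the 240 TB target, the four-bay box, and the 24 TB hard drive are illustrative assumptions, with only the 61.44 TB SSD capacity reflecting a shipping high-density drive:

```python
# Sketch of the edge consolidation argument: drives needed for a fixed
# capacity target, and whether they fit a small edge chassis.
# Target, bay count, and HDD capacity are illustrative assumptions.
import math

target_tb = 240        # hypothetical capacity needed at the edge site
bays_available = 4     # hypothetical 1U "pizza box" drive bays

options = {
    "24 TB 3.5-inch HDD": 24.0,
    "61.44 TB QLC SSD": 61.44,
}

for name, tb in options.items():
    drives = math.ceil(target_tb / tb)
    verdict = "fits" if drives <= bays_available else "does not fit"
    print(f"{name}: {drives} drives needed, {verdict} in {bays_available} bays")
```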
When you think about storage, and one of the
things that I think about is capacity of drives and throughput of drives, what kind of performance
characteristics are right for AI, and how does that equate to storage efficiency?
I love that question. Thank you. It wasn't even prompted. But when we first started this journey
in the SSD world in the enterprise space, it
was all about, ooh, look how fast it can go.
And a single drive that can drive millions of IOPS is amazing.
But if you think about that at scale, there's no system on the planet, even with the GPUs and CPUs, that can actually consume those millions of IOPS across hundreds of drives.
And so the performance characteristics of most of the
architectures that we're looking at are gated by the slowest or the smallest pipe. And that
smallest pipe today has been driven by the hard drive. So if we're 1x, 2x, 3x, 4x a hard drive,
we're actually satisfying a majority of these workloads without even having to try. So the availability of these high-density drives alongside these performance-optimized solutions is a much better fit for this type of workload. Now, there are occasions where
the million plus IOPS drives are great for certain pieces of that pie. And we have those products
too. There's a reason why the portfolio is branded that way, because throughput is based on the implementation, and you always have to look at the weakest link. And at this point in time, there's really never an instance where an SSD product in a modern architecture is ever that weakest link.
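A minimal sketch of the "gated by the smallest pipe" point Scott makes: end-to-end throughput is set by the slowest stage, so once the drives are no longer that stage, adding drive speed stops helping. The stage names and bandwidth numbers are illustrative assumptions, not measurements from the episode:

```python
# Sketch: end-to-end throughput is limited by the weakest link in the chain.
# Stage bandwidths (GB/s) are illustrative assumptions, not measurements.

def bottleneck(stages: dict) -> tuple:
    """Return the limiting stage and the throughput it imposes."""
    name = min(stages, key=stages.get)
    return name, stages[name]

stages = {
    "24 x HDD shelf": 24 * 0.28,           # ~280 MB/s sequential per HDD
    "24 x NVMe SSD": 24 * 7.0,             # ~7 GB/s per Gen4 SSD
    "storage network (2 x 100GbE)": 25.0,  # ~25 GB/s usable
    "GPU server ingest": 50.0,             # assumed host-side ceiling
}

name, gbps = bottleneck(stages)
print(f"With HDDs present, bottleneck: {name} at ~{gbps:.1f} GB/s")

# Swap the HDD shelf out and the limit moves to the network, so even a
# million-IOPS drive would not change the delivered throughput.
del stages["24 x HDD shelf"]
name, gbps = bottleneck(stages)
print(f"All-flash, bottleneck: {name} at ~{gbps:.1f} GB/s")
```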
Well said, Scott. You know, the question always comes up, even at the show, right? How many customers are using hard disk drives versus an all-flash array? And it's fascinating to me to ask, how do you even run an AI workload on something that's not all flash, right? I appreciate that insight. We've asked a ton of questions, though.
One of the things I would love to know from you is, I know after you got off the bike on the stage,
there were a lot of people asking you questions. What was the audience really curious about? What
were they asking you? A lot of it was around the partnerships and how
we're taking a different look at it. So if you go back through the marketing roles of different events and things like that, it's, look at this cool new thing I've got, look at this cool new thing I can do, but it's never, this is how I took my cool new thing and made it work. And so we showed
several examples with some of our partners up on stage of how they helped us shrinkify or simplify
or improve upon a given infrastructure that they needed to do something with that our products
helped them with. And people are like, I didn't hear anything about how fast your drive was. I
didn't hear about this, but wow, that partnership story, the way you were able to re-architect because you talked to your customers and actually put the customer first, is amazing.
And that was a fun little sidestep of that as I got off stage.
So, Scott, what do you think we should be expecting from Solidigm in the year ahead?
I mean, you guys have done some amazing things in 2024.
Your PCIe drive gained a lot of headlines.
You're delivering performance.
You're delivering capacity.
What's next?
There's a wonderful roadmap, if you will, for us to trudge along with the standard products,
which is great.
We can think of great new innovation ideas.
So we have a whole strategy office that's looking at the next types of opportunities, what we can do to parlay AI and the drives in some new and innovative ways that may or may not be in the marketplace today. And you saw some of that in the OCP future technologies innovation area. And I participated in some of that activity as well.
But realistically, I think people are starting to realize as we get these bigger and bigger drives,
there's always been that challenge when you have somebody who's always had a problem or done it one way to rethink it the new way.
And one of those things that hasn't really come up in a lot of this is this concept of
the reliability aspects of a product.
So hard drives fail.
We all know they do.
They're very mechanical.
And there's the definition of how to track failures by mean time between failures, MTBF.
And with SSDs and the technology and especially the reliability factors
that Solidigm has put into their drives with their enterprise focus, the concept is really
mean time to failure, MTTF. And while the numbers sound similar and they look the same on a roadmap, the idea is I'm going to give you a prediction of when it will fail, if and when it will fail, not a guarantee of how often it's going to fail.
And it's commonly looked at in this marketplace as the blast radius. If I lose X, how bad an impact does it have? And that's why hard drive architectures have always been built the way they
are. It's about redundancy and, oh my gosh, what happens if, what happens when, what happened,
that kind of thing. With the SSD infrastructures that we're building today and the reliability levels that we can bring to these very dense products, we're
actually removing that as a concern for a lot of these people, because if a drive
fails, the failure is much more manageable from a perspective of how to keep that moving
forward.
And so we're starting to see people rethink how they use things like RAID and the mirroring
and the striping and things like that in this architecture, all because of the reliability that we can bring with these products.
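To make the blast-radius trade-off above concrete, here is a small sketch comparing two hypothetical fleets of equal capacity; the annualized failure rates and the 24 TB hard drive capacity are assumptions for illustration, not vendor figures, and only the 61.44 TB SSD capacity reflects a shipping drive:

```python
# Sketch: expected failures per year vs. "blast radius" per failure.
# AFR (annualized failure rate) values are illustrative assumptions.
import math

target_pb = 1.0  # hypothetical fleet capacity target, in petabytes

fleets = {
    "24 TB HDD fleet":    {"tb": 24.0,  "afr": 0.015},  # assumed ~1.5% AFR
    "61.44 TB SSD fleet": {"tb": 61.44, "afr": 0.005},  # assumed ~0.5% AFR
}

for name, f in fleets.items():
    drives = math.ceil(target_pb * 1000 / f["tb"])
    failures_per_year = drives * f["afr"]   # expected drive failures per year
    print(f"{name}: {drives} drives, ~{failures_per_year:.2f} failures/year, "
          f"{f['tb']} TB to rebuild per failure")
```

Each failure of a high-capacity drive touches more data, but with far fewer drives and a lower failure rate the fleet sees fewer rebuild events overall, which is why Scott says people are rethinking RAID, mirroring, and striping choices.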
Scott, we are so lucky to have you, because not only are you a great contributor to Solidigm, you yourself are connected to so many other communities. Would you mind sharing with the team here and our listeners where folks can get more information about you, where to find you for follow-up questions, but also where they go for Solidigm?
Yeah, no problem.
I'm a terrible self-promoter, so I don't have any fancy personal website or blog or anything like
that. LinkedIn, look me up, Scott Shadley. It's pretty easy to find me. I do participate. As you
mentioned, I sit on the board of directors of SNIA. SNIA is one of the other standards bodies that participates in all of this. And I work with NVM Express, and we're looking at all the other organizations of that type as well. For Solidigm, if you're very interested
on where we're going with this, and especially on the track of the AI space, it's solidigm.com/AI.
Thanks so much, Scott. It's been so interesting to listen to you and hear your perspectives.
Thanks for spending some time with Jeniece and me.
And Jeniece, with this, we are going to wrap another episode and a series of episodes from
Open Compute.
What a fantastic week.
And thank you so much for the collaboration.
Thank you, Allyson.
It's my pleasure.
This has been amazing.
Yeah, it's been great.
I appreciate it.
Looking forward to coming back.
Thanks for joining the Tech Arena. Subscribe and engage at our website, thetecharena.net.
All content is copyright by the Tech Arena.