In The Arena by TechArena - Infrastructure Innovation to Fuel the Distributed Computing Landscape with Iddo Kadim

Episode Date: August 1, 2023

TechArena host Allyson Klein chats with TechArena's own technology strategist, Iddo Kadim, about the growing distribution of compute environments and the infrastructure requirements to fuel continued innovation. This episode covers processor acceleration across CPU and GPU, different approaches to orchestration, and a discussion of where workloads will land across a cloud-to-edge landscape. The experts also discuss lead usage models that will drive the upcoming TechArena report on distributed computing.

Transcript
Starting point is 00:00:00 Welcome to the Tech Arena, featuring authentic discussions between tech's leading innovators and our host, Allyson Klein. Now, let's step into the arena. Welcome to the Tech Arena. My name is Allyson Klein, and today I'm delighted to have a friend and partner in crime on the podcast with me for what promises to be the first of many episodes,
Starting point is 00:00:33 Iddo Kadim. Welcome to the program, Iddo. Good morning, Allyson. Great to be here. You recently joined the Tech Arena as a tech strategist, but why don't you go ahead and introduce yourself more fully to the audience and talk a little bit about your background? Sure. So I spent the majority of my career, roughly 18 years, increasingly focusing on data center infrastructure in a variety of roles: strategic marketing, technical sales, and building technology solutions for the Olympics and Paralympics. That's amazing. And I have a whole set of questions for you someday about your Olympic experiences, but that's not today's interview. Today's interview is another important topic, and one that we've been bouncing around on the Tech Arena: the future of infrastructure. And why don't we just start with a question of why does infrastructure need to change? And what is it in your mind that is the driving force behind this change?
Starting point is 00:02:00 I think, as you said, we're in my mind, an unprecedented time of change, but infrastructure has always evolved. Every time it's felt like it's reached a certain steady state only to find out that it needs to evolve again. And the driving force, the Uber driving force, I think, is that the demand for compute is insatiable. And it's really limited by how easy it is to consume and therefore to provide access to such infrastructure. And there are a number of things happening in terms of use cases that drive need for continuous change. And they come from a number of directions at the moment that are causing that unprecedented amount of change. That's funny. I was thinking about
Starting point is 00:02:53 this statement that you made, and I was thinking, when did I really realize that there's always going to be a need for change and innovation? And I think that we're both rooted in our histories at Intel, and of course, that comes with a belief in Moore and innovation. And I think that we're both rooted in our histories at Intel. And of course, that comes with a belief in Moore's law and for many years, understanding that additional compute capacity would be consumed somehow. I remember coming to this learning again with virtualization when everyone thought that consolidation would mean that nobody would buy as many servers anymore. And of course, we saw the birth of the cloud come out of that. But this time feels a little bit different. What do you think makes
Starting point is 00:03:30 it different? And we all know that there are massive changes in computing, starting with generative AI going on. But what do you think are the primary drivers of this era of change in infrastructure? And why does it seem so like unique, I guess is the right word, in terms of a moment for the infrastructure community? Yeah, I think in terms of drivers, sometimes my brain gets mixed between drivers and results. But I think if I really focus on the drivers, I think one is there's been a lot of investment and a lot of development in centralized infrastructure in the past static and a day and a half. So data centers going into clouds and clouds taking advantage
Starting point is 00:04:22 of their very large scale to drive or to leverage a lot of the uniformity of infrastructure. But it turns out that the cloud isn't the answer to everything because a lot of data is generated far away from the cloud and becomes very expensive to move to the cloud and also needs processing for other reasons, maybe for responsiveness or because of security reasons, sovereignty reasons, need to be processed close to where the data is created. So that drives a need in the edge, but you can't drive the same scale, localized scale and uniformity of solution in the edge. And the edge requires more optimization and more specification. As a result, many of the methods that were used to make cloud work in terms of orchestration and management
Starting point is 00:05:27 and instance offerings, et cetera, they don't work the same way in the edge because the edge is constrained. There are many different configurations of systems, physical systems. It's not just a question of slicing different instances on top of similar systems. It's not just a question of splicing different instances on top of similar systems. And so that
Starting point is 00:05:48 drives a huge change in the requirements in terms of orchestration, which cloud-style orchestration doesn't provide for. And also, I would say in terms of the orchestration itself, the monitoring, it's like, how do you know
Starting point is 00:06:04 things are going right? How do you react to issues? You can't just rely on, I'll just switch to another set because maybe in your edge location, there isn't another set. So I think that's one mega change is that diversification and distribution of compute needs. And when you think about that, I think that what comes to mind to me is that starts at the very foundation of those platforms. And it's opened up a lot of opportunity for different types of architectures to raise their heads. What do you see in the landscape with logic and microprocessor architectures? And why is this a moment where diversification makes sense? So first of all, I would say, you know, as I wrote in my inaugural blog for the Tech Arena, the fact is we are at a moment of huge diversification in terms of compute architectures across the board.
Starting point is 00:07:09 There are a number of drivers. And I think one is there was a sort of a state where there was one dominant architecture and one dominant or predominant supplier. And the market doesn't like these situations. The market seeks alternatives in order to drive commercial agenda, if not an architectural agenda. But I think at the same time, architectural agenda evolved.
Starting point is 00:07:37 The clouds needed architectures that would enable them to really maximize the value of CPU cores that they sell. So they needed alternatives of where to run infrastructure-type workloads, security, networking, network filtering, et cetera. So that was the one driver. And we saw Amazon and Microsoft and Google, et cetera, developing solutions, which leads to derivatives of Martinique architectures, but those become another
Starting point is 00:08:12 pole of compute and start driving new needs. I think with that, AI becoming a unique workload that drives a tremendous amount of compute and justifies an alternative architectural approach is another really important driver. And I think the third is this evolution, really stunning change in open source everything. So how do you enable any deployer to get exactly what they need? Not the superset of what they need, but exactly what they need. And especially going back to this diversification in the edge. And so I think this opportunity for really being driven by the RISC-V ecosystem to create even more diversification, to provide to those needs of getting what I need and not more. When you look at all of the offerings, you talked about cloud service
Starting point is 00:09:11 providers creating their own offerings, different architectures, x86, ARM, RISC-V. What are the ones that you think, when we're looking out ahead five years from now, what do you think the landscape is going to be like in terms of deployments within data centers, within the cloud, deployments at the edge, and how much are disruptive accelerators and other technologies that may not even be on the landscape be playing within a world that is enabled with chiplet architectures? I'm always hesitant when I get questions about five years old. Who the heck really knows? Where's the potential?
Starting point is 00:09:55 So I think one trend that I don't think I've mentioned yet in the blogs or in this conversation you just hit on is the proliferation of accelerators and especially dedicated accelerators. Intel has had the approach for a long time of accelerating functionality in instruction sets or integrating fixed function accelerators into the general purpose CPU to cater for these workloads that weren't readily performed just by instruction set. And NVIDIA elevated the GPU to an alternative of another really general purpose accelerator because it's fit for many functions.
Starting point is 00:10:40 It's not really the best for anything, probably graphics graphics probably, but if you designed an accelerator for AI, it can perform better. But there are advantages to this general purpose accelerator approach. If you have a cloud and you can share accelerators, you can create pools of accelerators, it actually makes sense to build accelerators for specific functions. So obviously for AI training, which Google kicked off that era with their TPU, it turns out that similar approach to, let's say, video transcode, now deployed by different architectures, deployed by YouTube and by Amazon in their video infrastructure, etc., or for crypto, or for various storage architectures. there's a lot that could be said for these. Again, if you want a specific solution that is optimized for your use case, then accelerators do make a lot of sense versus general purpose accelerators or general purpose solutions. And so I think there's going to be a huge emergence
Starting point is 00:12:02 of the RISC-V architecture as a core in many of those things. That's a chosen compute architecture that underlies many of those accelerators in the future. I also think there will be lots of ARM-based CPUs. There will continue to be large footprints of x86. There will continue to be GPUs from NVIDIA, from AMD, from others.
Starting point is 00:12:28 I think it's really going to be a much more diverse space. And I think that open source movement, which I think OCP specifically had really created something super interesting, creates the landscape in which this diversity can live by creating ad hoc industry-led points of compatibility and integration that allow different architectures to live side by side and still provide solutions in conjunction of each other. So basically a cornucopia of silicon delights is on its way to fuel all of this distributed
Starting point is 00:13:12 computing needs. When you think about that, are there any disruptive technologies in the cloud to make that easier for deployment? and anything that you're especially excited about in terms of infrastructure changes at the box level? I would say yes and, but I will start with an and. I think we've been seeing over the last, I don't know, 15 years or whatever, that changes that oftentimes we feel are over are really only at the beginning. And transitions ultimately take a very long time to materialize and to really reveal their full value. Having said that, CXL is an obvious big change. And I think it will imply changes in the box and also in some cases on what we even think of as the box.
Starting point is 00:14:18 I think it has different use cases. For example, in a cloud environment, it could make it easier and more economical to pool resources and allocate them ad hoc. I think in an edge scenario, it creates the opportunity to build a more modular architecture of a box that enables you to do late-stage integration of exactly the right mix of technology that you need for a use case in a location. So I think CXL in that respect is transformative. I think chiplet architectures will be transformative as well
Starting point is 00:15:03 because they allow for, again, integration of just the right size in a chip form factor to build optimal boxes, but at the same time also allow unprecedented scale of integration of capabilities in a package. So I think that's two. And those are very interconnected in my mind because CXL is the logical connection. And then CIE is the physical underlying connection that allows the mix and match inside the package. Let's see. Yeah, I think these are two that are,
Starting point is 00:15:46 let's leave it at that. And of course, I've already mentioned, I think RISC-V will have a big impact. So we talked about the cornucopia of silicon. Now we're talking about unboxing the box. Yes. When it comes to this distributed landscape and you think about edge to cloud or multi-cloud to multi-edge, where do you think we are today?
Starting point is 00:16:12 And where do you think the future is going with that? And then the second part of that question is, what does that require from infrastructure changes and infrastructure capability in all of those locations? I think we're only at the beginning of this transition, honestly. require from infrastructure changes and infrastructure capability in all of those locations? I think we're only at the beginning of this transition, honestly. And I think the future is incredibly exciting. One huge area, of course, is closed loop orchestration. And when I said closed loop, I meant, therefore, you need to monitor and you need to react in the automation of that. And that's happening, by the way, also on the network side, just as a nostalgic moment.
Starting point is 00:16:54 I don't know. We started talking about software-defined infrastructure, what, 15 years ago, Allison? Oh, yeah. I remember. Something like that. Maybe 10. And even then, it was clear that orchestration, the ability to match workload to infrastructure capability, and then to monitor it and machine correction of things
Starting point is 00:17:17 that humans need to move into more of an external programming role in the infrastructure than actual being in the loop. That was all known then, but we're not there yet to a large extent. Clouds are there. Many data centers are there by adopting cloud architectures, but the edge is far from that reality. So I think that is an area that I expect to see huge change for sure to enable that, especially given, again, the amount of diversity that needs to be managed. When I think about the challenge in orchestrating multi-cloud, and we've pursued that on the
Starting point is 00:17:57 program quite a bit, we are still in an environment where many enterprises are managing siloed clouds without the promised portability, efficiency, and ease of single pane of glass management that was promised years ago. And I understand why there are barriers because the large cloud providers don't necessarily want to make it easy to port off of their infrastructure. Plus there's other challenges. I don't want to just say it's competitive forces, but how do you look at the complexity of introducing multi-edge points, especially when you consider the potential size of all of those edge points, where does orchestration come in?
Starting point is 00:18:48 Where does management come in? And what do you see from the industry thus far? You said that we're in the beginning stages of this. What do you see that provides hope that we'll get to a point that will not make this the world's largest computing rat's nest that we could ever envision? I think there are spaces in which things are taking form in industrial settings, for example, or in retail settings, actually, where centralized management with distributed compute into the, be it franchise stores or outlet stores or manufacturing sites has happened. And I think the experience that is being gained there by practitioners and by solution providers will provide the foundation. There will need to be some amount of standardization.
Starting point is 00:19:52 It can't just be a mess of do whatever you like and somehow miraculously it's going to work. Because observability is critical. You can't just do these things in a black box and hope. So observability is critical. You can't just do these things in a black box and hope. So observability is crucial. And there will need to be some amount of standardization that I think will evolve ultimately through practice and through communities rather than necessarily through hyper-standardized documents. But I think that a layering of this will have to happen. They will have to develop some form of canonical architecture, architectures that enable interoperability. What's the role of open source in that?
Starting point is 00:20:39 Oh, crucial. When I said communities, I think that's... That's exactly what you meant. Okay. Yes. And I think to a large extent, it could be argued that open source communities have really replaced the traditional standard bodies over the past decade. Which, by the way, means that there are many efforts that start and don't reach an end state and either fizzle or merge with others. But at the end of the day, the big changes that stick and that people align with tend seem to have come more from communities in the last decade than anywhere else.
Starting point is 00:21:18 Now, I'm going to shift to the million-dollar question. AI. That was the last question. Oh, no, there's one more. AI, in your dreams of your tech career, did you foresee the advancement of AI at this stage and the rapid pace of its development? And where do you think it's going over the next few years? And what should we expect of this? And then finally, how does that influence your views
Starting point is 00:21:50 of this diverse computing landscape? Because I know you've been thinking about the diverse computing landscape for a while. How does it change in the era of generative AI? Oh, man. No easy questions on the tech arena. So while I'll be honest. No, I could not foresee this.
Starting point is 00:22:10 And especially not the pace at which it has evolved. And could we, did we see AI really gain momentum over the past few years? For sure. Could we see that progress continue? For sure. But the overtaking of the zeitgeist really is incredible. I think there's a little bit of a hype. I think there's going to be a little bit of a trough of disillusionment and discovery of reality. But at the same time, there's no doubt that AI will be a dominant, I don't want to say workload
Starting point is 00:22:50 because there are different types of AI workloads, but at the end of the day, let's call it computing style in future years. And I think what's really common, or maybe some of the challenges, let's start there, clearly observability, understandability, predictability of outcomes is a huge issue or family of issues that will have to be addressed to really allow for this growth. I think all of those are big challenges
Starting point is 00:23:26 that need to be addressed in the future. But if we go back to infrastructure, AI really has driven concepts of supercomputing into the box. And I think that is the big change. How do architectures evolve to provide sufficient amount of scale with low enough latencies, great enough access to memory, close enough to the compute element in a clustered fashion? And obviously the fencer and denser, the better. So I think that's where a lot of the energy will happen. And I think this affects memory fabrics and it affects networking, it affects compute, it affects storage. It really affects everything. And it's applicable
Starting point is 00:24:21 both in data centers and at the edge. Yeah. one of the things that I think about is we've always talked about data gravity when it comes to the edge. With AI and the demand for AI across this distributed computing landscape, I think about compute gravity. Where will the pockets of intense compute sit with this environment to drive the types of capabilities that we need. And I love the analogy that you said about driving a supercomputer inside a box, because that really speaks to what the demand is and why there are discussions about supply shortages of semiconductors. And semiconductors are strategic nation state considerations in terms of what needs to be done to secure borders and ensure that a country is in good position. These are things that were not talked about 10, 15 years ago. I'm really excited for this. And I think
Starting point is 00:25:20 that the next step is we're going on a journey together of self-discovery around this distributed computing landscape. And tell me a little bit about what you're attempting to do in conversations with the industry and what's next. Right now, I'm really trying to learn. There's so much happening in all of these different spaces between diversification of compute, CXL, and UCI. We didn't really talk about that much, but when you start moving memory outside of the box, technically there are emerging solutions for that, but the paradigms of managing it can be revolutionizing. That will be interesting to see how quickly that moves. Realities of edge is a big area to discover in my mind. And then the impacts of AI that we discussed.
Starting point is 00:26:15 And of course, digging out what is really the, where are the real forces in open source for these different layers? Where do they happen? I'm excited to see how this story evolves in the industry and on the tech arena. While we're talking, though, I'm sure that the audience is interested in what you have to say. Where can they connect with you and continue the dialogue with you? To connect with me, the best place is on LinkedIn. And then of course, you can find my writings on the tech arena. Fantastic. Thank you so much for being on the program today. Next time, I want to talk about
Starting point is 00:26:56 use cases across this distributed computing landscape and what industries are poised to benefit first, but we're out of time. So thanks so much for the time today. It was a real pleasure. I had a blast. Thanks, Allison. Thanks for joining the Tech Arena. Subscribe and engage at our website, thetecharena.net.
Starting point is 00:27:20 All content is copyright by The Tech Arena.
