Orchestrate all the Things - Running AI workloads is coming to a virtual machine near you, powered by GPUs and Kubernetes. Featuring Run:AI CEO / Co-Founder Omri Geller

Episode Date: March 15, 2022

Run:AI offers a virtualization layer for AI, aiming to facilitate AI infrastructure. It's seeing good traction, and just raised a $75M Series C funding round. Here's how the evolution of the AI landscape has shaped its growth. Article published on ZDNet.

Transcript
Starting point is 00:00:00 Welcome to the Orchestrate All the Things podcast. I'm George Anadiotis and we'll be connecting the dots together. Run:ai offers a virtualization layer for AI, aiming to facilitate AI infrastructure. It's seeing rapid growth and just raised a $75 million Series C funding round. Here's how the evolution of the AI landscape has shaped its growth. I hope you will enjoy the podcast. If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook. Thank you. This is really exciting to be here on the podcast. So I am Omri Geller,
Starting point is 00:00:38 the co-founder and the CEO of Run.ai. Run.ai is a company that was founded in 2018. I founded it with Dr. Onand Dar. He is the CTO. I am the CEO. We both met at Tel Aviv University. Onand did his PhD and I did my master's. And we worked together for multiple years under the same supervisor, Professor Neil Fedder.
Starting point is 00:01:08 And when we started or thought about starting RUN.AI, we actually thought that we want to be part of the AI revolution, of the artificial intelligence revolution that we've seen starting to take place in the industry. We wanted to be part of that and we defined a mission for an AI to help humanity solve the unsolved using artificial intelligence. Both of us are from a technical background, we learned electrical engineering.
Starting point is 00:01:49 For us, the nature is close to the compute part. One of the most visible things when it comes to AI applications that are growing in adoption is that in order to solve more and more complex problems using artificial intelligence algorithms, you'll need more and more computing power. And that trend is obvious. So you can look on the artificial intelligence models today versus those that have been with us for the last, let's say, even five years ago. And the size of those models has grown significantly from a few millions of parameters that you need to tune in order to produce a model that gives predictions
Starting point is 00:02:34 to tens of billions of parameters today that you need to tune in order to give predictions. And this is going to scale pretty fast in the coming years as well. So we've seen this trend of the computing power that becomes more and more important when it comes to AI applications. And we felt that there is a need when you have new AI applications that are running on new hardware that is needed in order to support those compute-intensive tasks, and you can see that in the market, there are more and more AI processors to support the growing demand of AI applications. So we felt that there is a place for a new software layer to bridge this gap between the new form of AI applications and the new hardware that is being produced and
Starting point is 00:03:34 manufactured in the market. And we started Run.ai to bridge this gap, to build the de facto layer for running AI applications in the most efficient way, basically, on any type of AI processor. You can think about run AI as the kind of a new operating system for the new computers for AI applications. And we started run AI about four years ago, and our vision has been and still is to be the layer that runs AI applications for any use case, whether it's like building and training of AI models or inferencing in production on any type of AI accelerator, whether it's on premises, in the public clouds or at edge locations.
Starting point is 00:04:30 So on every location, we want to be that software that does that. Cool. Thank you. And I have to say, I had the good fortune of covering, well, you exiting Stealth, basically. That was already almost three years ago in 2019. And at that time, I think you also announced, I'm not sure, it's been a long time, so I'm not sure whether it was your seed round or your round A founding, but either way. And so now it's almost three years later and it seems like things have been going well for you. The idea that you just described, I have to say that to me initially it looked a bit counterintuitive for the reasons that we also discussed like three years ago that the notion is that you want to be as close to the hardware as possible to get as much performance as possible. And what you do seems to kind of
Starting point is 00:05:34 introduce a layer in between. However, there are also arguments in favor of that approach. And well, it seems to have been validated by the market, because the occasion for connecting today is that you're announcing another funding round. And so I thought, well, there's a lot of ground to cover, but maybe let's start with that. So yeah, if you want to talk a little bit about the funding and whom are you getting it from, and then we'll also cover the ground that you've covered from 2019 to today. Yes, so that's correct. You were speaking in 2019, it was post or around A.
Starting point is 00:06:19 Back then we raised 13 million dollars. And we've been growing significantly since then. We did our B round, another $30 million led by Insight Partners approximately a year ago. And right now we are announcing a new round of investment, our C round, $75 million investment led by Tiger Global and Insight
Starting point is 00:06:49 Partners, which also led the previous round. And we've been scaling significantly. We've been scaling quickly in the last few years since we spoke. Of course, we had to do some adjustments in the product, like every startup, to understand how to make it work and how to find a product market fit. But once we understood all of them and we were able to connect all the missing, connect all the dots, since then we've seen significant scaling in the company, in the customers connect all the dots. Since then, we've seen significant scaling in the company, in the customers, in the headcount, in the market traction, and as well as the funding.
Starting point is 00:07:34 Okay, so we'll get, we'll return to the product line a little bit later. But for now, let's check up on those fronts that you mentioned. So I tried to do a little bit of looking around to see what I could find in terms of clients and use cases and growth. Well, clients are not exactly easy to find on your website, I have to say. So I did manage to find a few like Wayne and the London Medical Imaging and AI Center. But that was referenced in an external article. So if you can share any information you have, any publicly shareable information on your clients and the verticals that you cover, what kind
Starting point is 00:08:18 of use cases you have deployed your platform to, and then perhaps also some growth metrics? Sure. So we are working with customers across many verticals. And this is one of the things that I love the most about RAN.ai is being an infrastructure play where basically can provide the same solutions or the same product for any vertical. And every company that is building and deploying AI models can be a customer of Run.ai. And you could see that in our customers, we have customers from Fortune 500 companies to
Starting point is 00:09:03 very innovative startups like Wave in the UK that are building technology for autonomous cars. And also research institutions that are leading their fields with AI, whether it's like King's College London or EPFL in Switzerland. And we're able to provide the same software to each of those organizations. And then when it comes to the verticals, we have customers today across many verticals, whether it's the financials, automotive, defense, education, gaming, and healthcare, and more.
Starting point is 00:09:49 So we're definitely agnostic in the verticals, and we're working with leading organizations in each of those verticals. And for that, we had to grow the company significantly, both on the product end, as we'll talk later on, but also when it comes to the headcount. So I think when we spoke about in 2019, we probably were approximately 10 people. We're more than 70 right now. We had probably a handful of users that are using RUN.AI. Today, we're talking about thousands of users that are using RUN.ai. Today, we're talking about thousands of users that are using RUN.ai,
Starting point is 00:10:27 from tens of GPUs to tens of thousands of GPUs of NVIDIA processors that are under management by RUN.ai. So we really have seen a lot of growth in every metric. And I think also what is interesting is that we also saw growth in the use cases. When we spoke back then, most of our customers were in a phase of building and training AI models. So they were looking to understand what are they going to do with AI in order to solve their business problems. And fast forward three years from then, we are now supporting our customers also with inferencing, meaning they found their applications for AI. And now they have challenges on how to deploy them and manage inferencing at scale, and actually how
Starting point is 00:11:22 to manage applications in production at scale, but it helps them there as well. That is something that I'm really happy about because as we talk about our vision, we're able to run any AI application for any AI use case. There is a natural development from training to inferencing, and we're happy to see that in the market as well. Yeah. And just to add to what you said, all the reports, all the metrics that I've seen point to the same direction that, well, once you get past the initial investment, let's say,
Starting point is 00:11:58 to train, to convert a model, then most of the operational cost actually comes from inferencing, from having it deployed and running in production. So it makes sense that early clients, let's say early adopters, have now switched gears and are more oriented towards running their models in production. Absolutely. Yes, we say that. Okay. So the other thing that I saw development, let's say, for an AI over the years is you seem to have a sort of partner ecosystem created right now. And interestingly, well, you did also have some partnerships when you first started out with VMware.
Starting point is 00:12:48 And I think, well, I'm not sure if it was technically a partnership, but I recall that you mentioned you used to work closely with AWS. So interestingly, I don't see either of those in the list of partners that you have today, even though I did see other, well other big names such as NVIDIA. So I was wondering if you could elaborate a little bit on your partnership ecosystem and how you work with them and how these partnerships help you in your mission. At RAN.AI, we are big believers in partners. And you're right, we have a very wide partner ecosystem
Starting point is 00:13:27 and the goal is mutual go to market and mutual product development in general because we believe that there is a lot of information that we can share with our partners and learn a lot from from our partners on the market that is evolving right now. And you're absolutely right that we have many partners at this point. And this is because we see that as the right way for us to work with our customers. And I'll explain. Many organizations today are in the phase of building foundations for their AI stack, AI software stack within the organization.
Starting point is 00:14:09 And run AI is a fundamental piece in that AI stack. We have been building the solution in a way that is basically playing with everyone and we'll talk about it maybe later on. But we've built it in a way that we want to be able to work in any type of environment with any type of software stack that already and AI practitioners can work with Run.ai, where Run.ai will plug into their stack. And therefore, we invest a lot in finding the right partners that are helping their customers to build an AI stack. And we work with them to actually build the best AI stack for their customers. And those partners can be cloud providers. So that's correct that now with AWS, they are not an official partner for us. We work very closely with Microsoft Azure. We work very closely with
Starting point is 00:15:26 OpenShift and with HP on their container platforms. And we work very closely with NVIDIA and hardware sellers that are selling NVIDIA GPUs. So definitely a big bet for us on the partner ecosystem. And we love that. Thank you. The other thing that seemed noteworthy to me was that the product itself seems to have expanded in scope. So now you cover more machine learning frameworks. I think it used to be just a couple, two or three, maybe. And now there's more. You also seem to have expanded the scope to ML Ops. And something that also stood out for me is the emphasis on Kubernetes. So, you can definitely provide more technical context around that. But it seems like,
Starting point is 00:16:24 I don't know if that was the case from the start, but it seems like right now the whole approach of deploying Atlas, your software product, is based on Kubernetes. So it's developed and deployed as a Kubernetes plugin. And also the fact that, well, now you support more cloud platforms, basically. So that's a lot of expansion. And I'll let you cover that from any point you choose. Yeah. Right. So I'll start from the fact, and I'll go back maybe to my previous answer,
Starting point is 00:16:59 that we believe that run AI should play with everyone. And we've built our product as a plug-in to existing AI stack. And Kubernetes is one of the most important pieces in building your AI stack. Because containers is heavily used in data science and also outside of data science, but data science is extremely using containers. And therefore, Kubernetes as the de facto orchestration software for containers is very important
Starting point is 00:17:39 in order to efficiently do data science. However, Kubernetes was not built in order to run high-performance workloads on AI processors. It was built to run services on classic CPUs. And therefore, there are many things that are missing in Kubernetes in order to efficiently run AI applications using containers. When we identified that at Run.ai, and you're absolutely right, it took time also to understand that, but when we identified that, we decided to build our software as a plugin to Kubernetes in order to, in a way, make Kubernetes for AI.
Starting point is 00:18:27 But we didn't want to build the Kubernetes because organizations already have their Kubernetes roadmap, whether it's with OpenShift, VMware, Tanzu, Vanilla Kubernetes or others. And we built RunAI such that every organization can choose their own flavor of Kubernetes and then plug in Run.ai in order to get the extra benefits of orchestrating and managing AI applications in Kubernetes. So that is the reason that we've built Run.ai as a plug-in to Kubernetes. And we are partnering with all of the Kubernetes vendors out there because in any deployment of Kubernetes on AI accelerator, such as the GPUs,
Starting point is 00:19:17 run AI can plug in without doing any change and just make your AI cluster much more efficient. And that's the reason that we've built it as a plugin. More than that, we also believe that in the application layer, meaning everything that is running on top of Kubernetes and run AI, we do not want to force the users to use a specific form of machine learning platform or data science interface, because there are many of those in the market. There are many different tools out there that data scientists can use. And we want to enable each of the users to bring their own tool of choice
Starting point is 00:20:02 and run it in the most efficient way on the underlying hardware over Kubernetes. And therefore, we built integrations for existing tools that data scientists and machine learning users are using, whether it's open source, sorry, whether it's open source like Kubeflow and MLflow, or other commercial platforms that our customers are asking.
Starting point is 00:20:32 At RUN.AI, our goal in the end is to provide the resource management layer, and therefore we need to enable running any type of AI application, the most efficient way on any type of processor, and that's what we do. Okay. There's something I wanted to ask to follow up on something you just said.
Starting point is 00:20:57 Beyond those integrations with machine learning frameworks or other products that you already have in place, is there a way for users to use your platform if they're using something else? So for example, if my favorite machine learning framework is not pre-integrated, can I use it in some way? Yes, absolutely. So anything that runs in containers can run on top of Kubernetes and then it can run on top of run.ai. So basically any application even if we don't have a pre-built integration, any application that is containerized can run on top of run.ai and that's out of the box and we are doing that all the time for our customers. Okay. So I think that maybe because the platform has also expanded, I think you have expanded both in width, let's say, as well as in
Starting point is 00:21:55 depth. We're going to go to the in-depth bits, I think, a little bit later. But for now, because we haven't still referenced that, I would like to ask you to give like an end-to-end example. So how does it work? Suppose I'm a new user and I have my workloads, be it training or inference or whatever. How does RunAI Atlas integrate in my day-to-day operation? Okay. Yes. So basically, from the user perspective, from the data science perspective, one of the nice things is that they don't need to change anything in order to use RUN.AI. It is working under the hood within the Kubernetes layer and helping them get more available compute. And I'll explain. So let's think, for example, on a cluster of compute resources, of GPUs, that is being used by multiple data scientists in order to run applications. And in data science, there is a very variable demand for compute
Starting point is 00:23:08 because there are some workloads that can take a lot of GPUs and some workloads that can take a very small amount of GPU power, such as inferencing. So training workloads can run on many GPUs because there is a lot of data to process and there is a lot of parameters to tune. Whereas inferencing, the model is already being trained and it is just waiting for an input in order to create predictions. Instead of choosing how much GPU power every user will get and giving them a fixed quota or a fixed amount of compute power, we basically allow for flexibility so that the applications can run on a very small amount of GPU from fractions of GPUs so that you can take for
Starting point is 00:24:04 a specific application only a fraction of GPU, something that is impossible within Kubernetes without run AI. And you can scale up to as many nodes or many GPUs and many processors that you need for one workload. And this is dynamic. It means that
Starting point is 00:24:19 according to what the workload needs at runtime, we allocate the right amount of compute and then release it. So from that perspective, the user, even if they got a static amount of GPUs, for example, I'm as a user, I got two GPUs in the cluster, then our software will make sure that the workloads of that user can take four GPUs if that's what they need and there is available compute power. And we also know how to shrink it back and run on less GPUs if needed. So from the user perspective,
Starting point is 00:24:58 they just ask for how many GPU power they need and run AI to make sure that at any point in time they get as much as they need and even more if there is more available power in the cluster. I think that by describing that you already superficially at least touched upon some of the technical advancements that you have accomplished in this time. So fractional GPU sharing and thin GPU provisioning, at least. These are the technical underpinnings of being able to do what you just described, I think. Yeah, correct. We had a very beautiful deep tech things that we developed in recent years. One of them is the fractional GPU that actually enables you to run multiple containers on a single GPU where each of the
Starting point is 00:25:54 containers is isolated. So basically doing for the GPUs what VMware did for the CPUs, but we do it in container ecosystem under Kubernetes without hypervisors. So exactly like you can run multiple containers on a single CPU, run AI enables multiple containers to share a single GPU without changing anything in the code and without hurting the performance. More than that, we also developed what we call job swapping and thin allocation of GPUs, which is an amazing thing that the team at Run.ai did. And that actually allows us to figure out that some applications, even though they allocated GPU power, they are not actually using that. For example, I asked for a GPU and I went for a vacation. So my container is allocating the GPU, but nothing is running there
Starting point is 00:27:00 because nobody is using this container right now. So without run AI, this GPU power that is extremely expensive and important is being wasted. With run AI new technology, we know how to swap out such an application that is allocating the GPU but is not actually using that and instead run an application that now needs a GPU and actually is going to utilize the GPU but is not actually using that, and instead run an application that now needs a GPU and actually is going to utilize the GPU. And we know how to do those preemptions, continue, pause and continue. We know how to do that automatically without hurting the performance of the user and without even the users feel that they have the GPU
Starting point is 00:27:45 at any point in time. But when their application is not using the GPU, other applications can run in the meantime. And that concept allows us to even more increase the utilization of the clusters of AI processors and improve the productivity of the users so that they can get better access to more compute power. Yeah, I would say that's probably one of the most well-known hard problems to solve in IT in general, like resource optimization and synchronizing contention, whether it's multi-threaded applications or, in your case, allocating GPU resources. But the principle remains the same, I would say. So probably you
Starting point is 00:28:30 must have reused, well, at least existing work and algorithms and so far, and obviously adapted them to what you need to accomplish specifically. Exactly. So we are not inventing the wheel here in those concepts, but we're bringing the concepts that, as you said, are known for many decades in classic IT, whether it's for compute, storage, or networking. We're bringing it to the new form of compute, which is the AI accelerators. And the concepts are the same. The technology is completely different, of course,
Starting point is 00:29:03 in order to support those new forms of applications and processors. And one of the benefits, let's say, of applying this kind of approach that you have, so this sort of virtual layer that sits above the hardware through which all the requests for resources go, is the fact that you can leverage that to provide analytics of sorts. So in terms of utilization, in terms of demand, and so on and so forth. So I was wondering if you could share a few words on what types of analytics are you able to provide and also whether you see that there is a sort of overlap there with platforms that are more geared towards MLOps. I think at least some of them are also in position to offer that type of analytics. So, do you see like overlap leading to perhaps
Starting point is 00:30:02 some sort of, I don't know, conflict of interest, let's say there? So you're absolutely right that we provide analytics from our software. And this is one of the things that we love the most when we work with our customers, because we gather information that by working with our customers, we then can work together and understand how to optimize all their workloads that are running. So we can see the utilization of their hardware, how many jobs they ran, historically, which workloads were more utilizing the GPUs while others aren't. So why and how to improve that?
Starting point is 00:30:47 There is a lot of data that we can gather from monitoring the processors that are running the applications. And we have analytics that are being saved from the day of installing run AI till any point in time in the future. So we do not delete this data and we sit periodically with our customers to analyze it and help them get more insight into their resources and AI work. So these capabilities are extremely important for the IT teams to have capacity planning, to understand spending, to understand
Starting point is 00:31:28 what applications are using the compute resources and make sure that they are aligned with business goals. On the other hand, we also help the data science teams that are actually running the applications to optimize their applications because we can see in real time how much really optimized are those applications. And if there are bottlenecks that we can solve, then it's usually visible using run AI. So we can identify where the bottlenecks are and if there need to be some tweaks in the code, for example, in inferencing, you want to reduce the latency, we can help and see where there is a bottleneck and how to solve that. So those are things that we provide as part of the platform. And there are other
Starting point is 00:32:16 MLOps platforms that provide visibility. Some of them provide it more in a higher level in the stack. For example, looking at the accuracy of the models and doing some monitoring on applications in production to see the accuracy of those applications. Those are places where run AI are not providing any visibility. And some of the MLOps tools also try to provide insights into the resource management layer, but Trani IDC are bread and butter. This is where we focus, and we provide very deep insights on
Starting point is 00:32:56 everything that's related to resource consumption and management of the compute resources. And because we're integrable with other MLOps platforms, so it's additive. We're not changing that. Our platform is focused on other things. And if the organization would like to have two dashboards for insights on the compute resources, this is great. This is something that we're happy to provide. Okay, Great. Thanks. Another thing I wanted to discuss with you is sort of the bigger picture, let's say.
Starting point is 00:33:30 And I think it probably ties to something you said in the beginning. So you said that you went through a realignment phase where you evaluated the direction that you needed to take and so on. So looking back at our conversation from two years ago, I had the impression based on that, that you would be pursuing adding more hardware integrations to the platform. So things like Google's TPUs or, for example, other AI chips.
Starting point is 00:34:02 And it seems like you haven't really done that. And actually, listening to you speak, you seem to keep repeating, you seem to keep referencing GPUs almost exclusively, I would say. So I have to assume that this is what you have focused on rather than expanding the scope of your coverage. And I was wondering if you could, well, first of all, verify if that's the case. And if so, explain the thinking behind that.
Starting point is 00:34:34 Yeah, so this is the case. We are focusing on NVIDIA GPUs. And this is because that's where our customers are. So we are working with our customers and definitely, as I mentioned, as part of the vision, we want to be able to support any type of AI processor that will be there. But at this point in time, from what we encounter in the field, organizations are using GPUs for training and inferencing. And we want to provide the best solution for our customers and therefore we invest in providing
Starting point is 00:35:13 the solution over an NVIDIA GPUs at this point. Okay, I see. I'm pretty sure, however, that you must keep an eye on everything that's going on in the AI processor domain. So I would like to sort of pick your brain, let's say. And I also try to keep an eye on that. So it will be interesting to hear what you think. So I've seen lots of new developments. So to begin with the most obvious and the most probably commercially interesting ones,
Starting point is 00:35:43 I've seen AWS make available new instances and I'm pretty sure you must have somehow adopted the platform to be able to use them. I've also seen FPGAs, well, I'm not sure whether they're actually used more on the ground, let's say, but I do keep seeing more references to them. There's also, thanks to miniaturization, I also see CPUs even being used more often than they used to be. And of course, there's this whole array of new AI chip processors that are being developed by new vendors. So what do you think,
Starting point is 00:36:28 you know, among all of those is more interesting and where do you see more customer driven demand basically? So what do your clients ask you for? So this is a great question. I mean, basically the users and our customers, they see the compute resources in the end, their enabler to build and deploy AI solution. And the reason that from our experience right now, Nvidia is actually dominating the market. It's obvious. Though you see other cloud providers bring their own chips, whether it's AWS with Tranium or Google with their TPU, or other hardware vendors or processor vendors such as you know an Intel that actually acquired Havana and are trying also they have the Havana instances on on AWS so we've been monitoring this this market in order to see where it
Starting point is 00:37:38 goes and I think that what we have seen is that the organizations are looking for a cost-efficient way to train and deploy AI models. But more than that, they look for a simple way to interact with the accelerators of the processors. And one of the things that NVIDIA has much better than all the rest, and it is a big, big advantage for them, is the software ecosystem that they have on top of their processor. So, it's not only the CUDA, which is the layer that is optimized for parallel computing on the GPUs, they are going up the stack by providing applications for the data scientists themselves, mostly open source, by providing ecosystem for containers that are pre-built for data science. And they're making everything very easy to consume on their GPUs. And this is something that is going to take time
Starting point is 00:38:47 for other chip vendors to actually be able to compete with. And until then, the users, and even if they are looking for a more cost-efficient or cost-effective way, and even if some of the processors can be more powerful, until there will be an easy way to interact with those processors, it will be hard to see a movement for those enterprises from working with NVIDIA towards the processors that are not NVIDIA-based. Having said that, we are seeing that there is more demand for GPUs that are not NVIDIA, whether it's AMD or even the new Intel Ponte Vecchio GPU, as something that organizations are looking into in the near future to deploy. From our experience, the GPUs are more in favor than the ASICs that the cloud providers
Starting point is 00:39:55 offer. It may change in the future, but we do hear from customers their thoughts about building an heterogeneous GPU cluster, not only from NVIDIA, but we less hear about their desire to work with the ASICs for AI that cloud providers and others offer. Okay, so it seems like at the time being, while these specialized ESICs may serve a need, they're mostly aimed at organizations specifically looking for workloads with a very performance-oriented profile for their specific application and not so much for a general audience.
Starting point is 00:40:42 Correct. That's what we're seeing from our experience in the markets. OK, another kind of broader picture question I wanted to ask you, which is also very relevant to you, is the infrastructure landscape, which is a bit different than when we just talk about the hardware level. And Forrester just published a report in which you were also included actually in this AI infrastructure analysis. So just very quickly, if you could summarize, how do you define this market
Starting point is 00:41:20 yourself and how do you see RAN AI in this market? Yeah, so I think, first of all, this is a market that is evolving and a lot of work is being done in order to build and define this market or sub-markets for AI infrastructure. I think one of the interesting things that Forrester did in their reports that they published in Q4 2021. They actually built the first reports that we have seen that distinguish between machine learning platforms or platforms that are the interface for the data scientists and the platforms that are there in order to run AI applications. So more in the infrastructure below the application. And this is the first time that we saw this analysis.
Starting point is 00:42:16 And we were very happy at RUN.AI because this is how we believe the market should behave. We believe that there is a layer of platforms that are the interface for the users, for the data scientists or the machine learning engineers. Then there is a layer down below that that is basically this operating system that knows how to run those applications in the most efficient way. In the Forrester report, what you'd see is that you'll see the cloud providers there. You'll see hardware or OEMs for GPUs, whether it's Dell or HPE.
Starting point is 00:43:02 And then you'll see run AI there. And the way that they positioned run AI in that report is basically being the layer that knows how to orchestrate and run those applications of AI across any underlying infrastructure. And for us, that was really nice because when we look at this Forrester report, we see all the other 12 vendors, it's overall 13 vendors, whether, again, it's the cloud providers or the OEM of GPUs and NVIDIA, of course, they're all partners of Run.ai
Starting point is 00:43:39 because we're able to run applications on top of them. And this is how we believe that the market is going to behave. We believe that there are going to be the hardware vendors or those that provide hardware, whether it's the cloud providers or the OEMs. And on top of that, you'll need to have the right software, such as Run.ai, to know how to run the work with efficiency on those providers.
Starting point is 00:44:10 And then on top of that, you're going to have many applications that we already see a lot in the market. I think another interesting thing from there about the AI infrastructure is the fact that the world is hybrid. And I'll explain. So in this report, there were revenue bans, significant revenue bans. So in order to be part of this report, those organizations have significant revenues. And you can see that there are also the public cloud providers, such as Google, AWS, Microsoft Azure, and those OEMs, like the, as I said, HP, Dell, and others. And this is something that we see from our customers,
Starting point is 00:45:03 that they are going in an approach of hybrid cloud AI infrastructure, where some of their AI applications are running on public clouds, and some are running on-premise OEMs and the cloud providers, and we expect to see this trend growing over time as well. Thanks for the analysis. I think that it's pretty obvious that compared to the other vendors included in that report, you have a unique placement, let's say. Yeah. Yeah. Okay, so I think we're almost out of time, maybe even a bit over that, actually.
Starting point is 00:45:51 So let's wrap up with basically your roadmap. So what's your plans after having secured this significant new round of funding? Yeah, so we are a very deep tech company. So a lot of our roadmap is building more and more technology in order to improve the utilization of those AI processors. And as we think about this software layer that we built, we're going to expand in multiple directions. So one of them is, as I said, going, I would say, down the stack to even closer to the hardware and build more and more technologies for improving the utilization and supporting from NVIDIA GPUs,
Starting point is 00:46:38 supporting other hardware vendors as well over time, according to need. Then we're also going to expand with our use cases, meaning we started from training, we're now moved for inferencing, and we're going to have very unique capabilities of inferencing that we're going to announce in the coming few months. We're going to expand there, so support more and more applications across use cases as well. And then we're also going to expand our software to be able to run and manage the applications across any location. So we're running on data centers and public labs, but also unique edge locations and supporting virtual machines as well as not only containers. As we become the de facto standard to run A applications,
Starting point is 00:47:29 we want really to enable our users, our customers to run their application, whether it's a container or a VM across any cloud and across any accelerator. And for that, we're going to continue and develop the product. Great, thanks. And I assume much is typical in both for companies at your stage and also companies who just got funding then you're also going to be aiming to increase your head count to be able to do all of those things. Absolutely, significantly, yes. Okay, thanks for the conversation and and good luck with everything going forward. Thank you so much for having me here. It was great to catch up.
Starting point is 00:48:12 I hope you enjoyed the podcast. If you like my work, you can follow Link Data Orchestration on Twitter, LinkedIn, and Facebook.
