Utilizing Tech - Season 7: AI Data Infrastructure Presented by Solidigm - 12: Why Do Most Enterprise AI Projects Fail? @BeyondMindsAI

Episode Date: November 10, 2020

Most enterprise IT projects fail, and it has been this way for decades. This discussion with Roey Mechrez of BeyondMinds considers why that is. One of the primary reasons is the trade-off between building custom solutions and buying off-the-shelf products. The challenge is compounded with AI, since the success of a model depends on the data and training, not to mention the maintenance and updates needed as issues arise. Data science teams need to invest significant time and money in infrastructure rather than just jumping in to train the model. This is a similar challenge to DevOps, but the added dimension of models and data makes MLOps even more challenging.

Episode Hosts and Guests

Stephen Foskett, publisher of Gestalt IT and organizer of Tech Field Day. Find Stephen's writing at GestaltIT.com and on Twitter at @SFoskett

Andy Thurai, technology influencer and thought leader. Find Andy's content at theFieldCTO.com and on Twitter at @AndyThurai

Roey Mechrez, CTO at BeyondMinds. Find Roey on Twitter at @BeyondMindsAI

Date: 11/10/2020

Tags: @SFoskett, @AndyThurai, @BeyondMindsAI

Transcript
Starting point is 00:00:00 Welcome to Utilizing AI, the podcast about enterprise applications for machine learning. Each episode brings together experts in enterprise infrastructure to discuss applications of AI in today's data center. Today we're discussing enterprise AI projects and, frankly, why they fail. First, let's meet our guest. Hello, and thank you for hosting me today. It's a pleasure to be here. I'm Roey, BeyondMinds' CTO and one of the co-founders. For the last seven years or so I've been doing research in AI, and for the last two years I've been focusing on the intersection of business and technology.
Starting point is 00:00:47 I'm Stephen Foskett, organizer of Tech Field Day and publisher of Gestalt IT. You can find me on Twitter at @SFoskett. And I am Andy Thurai, co-host, founder and principal of theFieldCTO.com, home of unbiased emerging tech advisory services. You can find me on Twitter at @AndyThurai and at theFieldCTO.com. So I've been in the enterprise tech industry for a long time, probably longer than I want to admit. And it's true that most enterprise projects fail.
Starting point is 00:01:18 We've seen this countless times through countless different technology trends and technology transformations. In a way, it almost doesn't matter what the technology is; you can guarantee that the majority of these projects just aren't going to work. Why is that? What is it that causes problems in the delivery of enterprise solutions? Roey, what do you think? Well, I can give you mainly the AI angle of things. I think
Starting point is 00:01:47 that it's important to observe that what we see in AI is significantly different from what's happening in the software world, which is much more mature. There you can find off-the-shelf solutions that are great and can solve many different problems. But the situation in AI is completely different. And I think one of the main observations I can offer around this failure, this systematic failure, is the buy-versus-build dynamic around AI solutions. Unlike software, there aren't many off-the-shelf solutions that you can buy. And that forces companies, and
Starting point is 00:02:31 enterprises in particular, to build their own data science teams: hiring people, starting to develop from scratch. And that's very complex. So I think that's an important observation, and probably one of the most significant reasons why these projects fail. One of the most significant challenges in bringing AI to production is the need for customization. Problems in enterprise AI are inherently specific. If you have a problem that is unique to you, that means you need to develop something dedicated to you, with significant customization, and a lot of effort has to take place to meet that need. Well, so I hear what you're saying, but you know,
Starting point is 00:03:38 it is different, in a sense, from enterprise software in that AI actually comes with two components. One is model training, which requires a lot of horsepower. And then there's model inference, which requires a little less horsepower. So you need to cater to both, because with model training the bulk of the issue, especially when you do a deep neural network, is how much power you would need and how much data you would need.
Starting point is 00:04:06 And then when it comes to inference: how fast can you do it, how close to the data can you be, and how often can you update the models? Is that in itself an inherent problem, in the sense that there are two different areas you're trying to attack? Yeah, I definitely agree. I would even say that the situation is even worse,
Starting point is 00:04:26 because what we see in enterprises, and in young data science teams, is that they look at AI solutions as a single model. Let's collect data, clean the data, go to the lab, train a neural network, pick our PyTorch or TensorFlow framework, throw the data in, maybe tweak some architecture design, bring in state-of-the-art open-source research. Boom, we have a model, 95% accuracy, we are happy. But that's probably 5% of the work, because then you need to take that model and put it into production.
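That "lab workflow" is easy to picture in code. Below is a minimal sketch of the single-model mindset Roey describes, using PyTorch on synthetic data; the toy dataset, architecture, and hyperparameters are illustrative assumptions, not anything from BeyondMinds.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in for "collect data, clean the data": random features/labels.
X = torch.randn(1000, 32)
y = torch.randint(0, 2, (1000,))
loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

# "Pick our framework, maybe tweak some architecture design."
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()

# "Boom, we have a model, we are happy" -- but everything that makes it a
# production system (monitoring, retraining, serving, feedback) is missing.
with torch.no_grad():
    accuracy = (model(X).argmax(dim=1) == y).float().mean().item()
print(f"lab accuracy: {accuracy:.2%}")
```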
Starting point is 00:05:07 And there is a huge gap between this academic research way of thinking in the lab and the real-life environment, with messy data, with real-life noise, with production challenges, with cost-effective cloud, with the need for maintenance, the ability to make updates to your models and your system, the way that you collect feedback, how you stop garbage going into your models, how you build a trustful AI relationship, how you can bound your results with confidence. So, taking what you just asked, I think the right way to bring AI to production is to focus on AI systems, with many components that need to be developed. Maybe the most important tip I can give a data science team is: you're going to have to invest a significant amount of time, effort, and money into infrastructure and systematic development,
Starting point is 00:06:21 rather than jumping into the lab and training a model. That's the easy part, but there is an entire system that needs to be developed in order to really see value from these solutions. If you don't mind, I want to double-click on that a little bit more. You talked about building an infrastructure. Yes, the infrastructure part of model training is a lot easier now, especially with GPUs, right? Nvidia is doing a ton of stuff on that, and the GPU horsepower is available.
Starting point is 00:07:00 Plus, more compute power is available in the cloud than you could ever have on hand at any given time. And data lakes have grown into humongous stores that you can feed and keep updated on a constant basis. So the model creation problem, the lab experiment as you call it, seems somewhat solved. Do you think the actual problem is more toward model inference, or more toward actually making it production grade, or all of the above? Yeah, well, there are many, many challenges around bringing AI to production, but if you want me to pick one, it's around production grade: the scalability, the stability of the system. Let me give you a very tangible example
Starting point is 00:07:52 from a real-life scenario. Recently I've been doing a lot around defect detection in manufacturing environments. And it's quite new that you can really trust a machine to do defect detection instead of a human on the production line. But what data science teams see again and again on these production lines is significant data drift.
Starting point is 00:08:19 So again, you train the model in the lab. Let's say it's an object detector that can detect defects in your products: could be cosmetic defects or electronic defects, or, I don't know, defects in lemons on a production line for agriculture. But then you see drift in the data, and it comes again and again and again: a new kind of defect, and suddenly the color of the product is changing, and the lighting in the room, the illumination, is changing. Or, if you want to sort lemons into good lemons and defective lemons, suddenly the season changes and the lemons are not
Starting point is 00:08:57 yellow, they're suddenly green. How can you collect that feedback from the user and retrain your models without forgetting what you have learned before, improving over time? I would call it keeping the train on the rails, right? You build a model, but it now needs to be deployed and have impact over time. And I hear from data science teams in this domain about lots of heavy nights when suddenly the AI stops working well and they need to fix it.
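The monitoring Roey goes on to describe can be as simple as comparing live feature distributions against a training-time baseline. Here is a minimal sketch of that idea; the two-sample Kolmogorov-Smirnov test and the 0.05 significance level are illustrative choices, not a method anyone on the show prescribes.

```python
import numpy as np
from scipy.stats import ks_2samp

def drifted_features(train: np.ndarray, live: np.ndarray,
                     alpha: float = 0.05) -> list[int]:
    """Return indices of features whose live distribution differs from training."""
    flagged = []
    for i in range(train.shape[1]):
        _, p_value = ks_2samp(train[:, i], live[:, i])
        if p_value < alpha:  # distributions differ -> alert the team
            flagged.append(i)
    return flagged

# Simulate "the lemons are suddenly green": one input channel shifts.
rng = np.random.default_rng(0)
train_features = rng.normal(0.0, 1.0, size=(5000, 8))
live_features = rng.normal(0.0, 1.0, size=(500, 8))
live_features[:, 3] += 2.0  # seasonal shift at serve time

if drift := drifted_features(train_features, live_features):
    print(f"drift alert: features {drift} no longer match the training data")
```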
Starting point is 00:09:39 There is nothing that an AI team hates more than going back into old code. So that should be part of the system, right? How to monitor the data that comes in, in order to be able to alert the data science team: something weird is going on, the data suddenly shifted, and other factors, right? So that actually brings to mind a lot of the discussion that we've heard around DevOps, because it strikes me again and again in these conversations that MLOps and DevOps are really facing the same sort of structural and procedural challenges,
Starting point is 00:10:18 even if the technology and the technological approach are very, very different. And one of the challenges of DevOps for a long time has basically been to get to the point where the people who are developing the software can properly understand and interact with infrastructure. And I see such a parallel with AI. As you're saying, if you just dive in, if a data science team says, let's jump in, let's train our model, let's have our model do great things, it is exactly the same as a development team writing a new application without taking the infrastructure into account. So how do we break that wall down? Because this is a wall that exists outside of AI. This is a wall that happens in enterprise IT development
Starting point is 00:11:06 all the time. So we had DevOps, and MLOps is part of the game. And I will say that in order to bring AI to production, you don't need only a data science team. You need a whole set of different roles within this team: the engineering, the back end, the front end.
Starting point is 00:11:30 You need the data science team. You need researchers, maybe, involved. You need DevOps. So it's more of a squad way of thinking, if you really want to solve all the problems. And I think, you know, at BeyondMinds we've built our own ecosystem and understanding of how to do that. And I think a significant part of it is investing time and effort in solving
Starting point is 00:11:55 some fundamental research problems along the way, starting from the different building blocks that you need for training. For example, with our core platform, our core technology, we built the ability to train in a scalable way, to hyper-tune the parameters, and to handle different aspects that we need along this training. But besides that, I think there are many fundamental problems that must be solved and become part of the overall solution, part of the AI system. A very classic example that I always give
Starting point is 00:12:38 is what we call the input gate. That's our garbage collector at the entrance to the model, to the system. Let's assume we want to build a classification network to distinguish between a cat and a dog. What will happen if I get an image of a giraffe? The AI model will say this is a cat, or this is a dog, whichever is more similar.
Starting point is 00:13:05 And the problem becomes even worse, because the softmax weights, the probabilities, will even be high in many of the cases. So we have developed a lot of IP and solutions around what the research community calls out-of-distribution detection: basically, the ability to say, wait, wait, wait, this is not a dog, not a cat, this is something else, please call a human to intervene here. And that's another part: how do you work in a human-in-the-loop environment? If you really want to bring AI into production for mission-critical problems, you must understand how to work with humans in the loop, how to build trustful AI. And I think that's another important factor to take into consideration.
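A crude version of that input gate can be built from the model's own softmax confidence. The sketch below is only illustrative: the threshold is arbitrary, and, as Roey notes, softmax probabilities are often high even on out-of-distribution inputs, which is exactly why BeyondMinds treats OOD detection as a research problem rather than a one-liner.

```python
import torch
import torch.nn.functional as F

def input_gate(model: torch.nn.Module, image: torch.Tensor,
               threshold: float = 0.9) -> str:
    """Route low-confidence inputs (the 'giraffe') to a human instead of predicting."""
    with torch.no_grad():
        probs = F.softmax(model(image), dim=1)
    confidence, label = probs.max(dim=1)
    if confidence.item() < threshold:
        return "possible out-of-distribution input: escalate to a human"
    return f"class {label.item()} (confidence {confidence.item():.2f})"

# Usage sketch with a toy two-class model on a flattened image.
toy_model = torch.nn.Linear(3 * 64 * 64, 2)
print(input_gate(toy_model, torch.randn(1, 3 * 64 * 64)))
```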
Starting point is 00:13:49 I want to double-click on a previous statement you made. That in itself is a classic difference between the way we build and deploy enterprise software and AI systems, and a lot of people still don't understand it. The major difference is that with most other systems,
Starting point is 00:14:23 whether it's DevOps or enterprise software or whatnot, you get the software, you build a system, you deploy the system, and then the system stays in place unless you do some bug fixes or updates. But with AI, because it's not system-based, it's data-based, it needs to update itself based on new data, as you are suggesting. Which means technically any AI system should be capable of, going back to your dogs-and-cats reference,
Starting point is 00:14:57 the system should be able to reason and infer and learn: this is neither a cat nor a dog. But if you give it feedback, as in supervised learning, it should also say: aha, okay, that's a giraffe, I'll keep that in mind next time. So there's a possibility for an AI system to be doing both supervised learning and unsupervised learning. That's what makes it a little bit trickier for AI in enterprise systems versus other enterprise software. Yeah, I agree. So we are clearly not at the stage where general AI is anything close to existing. So with the current state of AI, it must be
Starting point is 00:15:41 wrapped with a lot of engineering around it in order to bring it into production. How to collect feedback, validate in real time, and retrain your models with that feedback from the user, without forgetting what you already learned: that's a significant part of any AI system that I see today. And I think the main trend we see today is moving toward mission-critical areas for AI. What we saw in, let's say, the last two or three years was mainly use cases of AI where the accuracy is not too significant. Let me give you an example. Let's assume you need to do speech-to-text for conversations and then draw some insights from those conversations. And then let's say you have a corpus of insights and you do some statistics.
Starting point is 00:16:45 It's not really important whether you're at 85% accuracy or 82% accuracy. It's a statistical way of thinking. But let's now move to a classic enterprise use case in the insurance domain: claim assessment. You want to decide whether to accept or reject an insurance claim. That's not statistical at all. If you are wrong in one clear reject case and you approve it, someone will shout at you, will fire the data scientist, or will complain about the capability. So the confidence level that we need to work at is much more significant.
Starting point is 00:17:29 And if you want to bring claim assessment into production, even if it works only 50% of the time, in that 50% you want 99% accuracy. And that requires a system. You want to be able to have a confidence score and make some basic decisions based on it. If the confidence is low, pass it to a human. If the confidence is high, pass it to the system. If it needs to be rejected, pass it to a human for a second judgment. If it's a kind of claim that you haven't seen in your training set, pass it to a human. But it will still be able to improve 50% of the business. And in insurance operations, that means a lot of money. Even if it's 20%, that's still millions for a classic insurer in the US.
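Roey's routing rules translate almost directly into code. This is a hypothetical sketch of such a decision gate; the 0.99 threshold, the field names, and the Prediction type are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    decision: str          # "accept" or "reject"
    confidence: float      # model confidence in [0, 1]
    in_distribution: bool  # did the input gate recognize this claim type?

def route_claim(pred: Prediction) -> str:
    """Automate only the high-confidence slice; send everything else to a human."""
    if not pred.in_distribution:
        return "human review (claim type not seen in training)"
    if pred.confidence < 0.99:
        return "human review (low confidence)"
    if pred.decision == "reject":
        return "human second judgment before rejecting"
    return "auto-accept"

print(route_claim(Prediction("reject", 0.995, True)))  # human second judgment
```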
Starting point is 00:18:00 On that note, too, I think one of the things that's come up a few times in this podcast is this whole question of understanding the limits of AI. It's seductive; we actually talked about that earlier. You can assume, wow, this thing is so powerful, it's going to give me such great results.
Starting point is 00:18:40 But it's very important that data scientists, and business people in general, understand that this stuff is not magic. And even though we call it artificial intelligence, in many cases it's not intelligent in the sense that we want it to be. And so, with the examples that you're providing, based on the output of an ML model, how do we know that this thing is responding within the boundaries that we expected? I think we can all guarantee that it's responding in a predictable manner based on the data that the model was fed. But is that going to give us good results, or is that just going to give us results? So I think you're touching on one of the most important factors and topics in AI.
Starting point is 00:19:33 And I will call it trustful AI. I think it combines three main factors. The first one is what the community terms explainable AI, or XAI for short. And I would be happy to join a podcast just about that, because I can talk about it for an hour, for sure. But in a nutshell, explainable AI means the ability to explain why something is happening, why this is the prediction. Going back to the claim assessment:
Starting point is 00:20:03 why should this claim be rejected, or why should it be accepted, right? And then you have bias mitigation, which is clearly a super important factor in the world that we live in. I can talk about bias mitigation as well, but I think it's an important factor even for the data science
Starting point is 00:20:25 team, and for the business team, to look in a closed loop at what's going on around bias, definitely if you're talking about sensitive data and sensitive environments around people. And then the third part, which is a critical factor in any AI system, is confidence estimation, or what the research community usually calls uncertainty estimation. I want to be able to bound my predictions and understand when the machine is able to predict with high confidence and when it's not. That's a very basic question to ask about a machine learning model. So confidence is one part of it, with different types of confidence estimation. And maybe the second part is what I mentioned before, the input gate, which is some kind of confidence
Starting point is 00:21:26 over the input data: basically saying, this new data that just came in, the giraffe image that I mentioned in the previous question, is it within the distribution that I saw in the training set? Because if not, I don't want to predict what's going on with this giraffe. Same with claims, right? If I trained on claims in, let's say, medical insurance, and all the claims were around orthopedics, and then suddenly a claim from a different domain of medicine comes in, that won't work. So that's something that the business side, or the product managers who build the AI product,
Starting point is 00:22:15 must understand. Well, we're actually almost at the 30-minute end of the podcast and we're just warming up with a hot and heavy topic. I actually wrote a piece on this about a year ago. This is one of the reasons why AI will not be mainstream for a while: it's not exactly artificial intelligence we are looking for, it's augmented intelligence. In other words, humans are allowed to make subjective decisions.
Starting point is 00:22:47 Machines are not, and that's not going to change for a while. I mean, they can, but they should only help a human make the decision by providing the data. That's why the major issue most enterprises face is that when a machine makes a decision based on data, it should feed that decision, with a confidence score, into an escalation to a human, and then assist the human in making the decision. Explainable AI, trust in AI decisions, security, constantly updating all of this: each one of them could be its own topic. So at the end of the day, that in itself is the major problem in enterprise production, right? It's about making AI-based decisions that will assist humans, not the AI making the decision by itself. And you need to build systems for that.
Starting point is 00:23:38 You're definitely correct. And I read your piece; it's a great reference. I think that holds in most cases, but not in all of them. You can find different use cases in AI where the interaction with humans is less sensitive. For example, predictive maintenance. Again, in manufacturing, you want to be able to predict if something is going to fail.
Starting point is 00:24:07 It's very clear how to move that needle of maintenance and how to prevent a sudden failure in production. If you have a good way to predict anomalies in the way a machine works, then when there is an anomaly, go do maintenance. And then it's easier to move the needle there, to bring ROI to this company, in an enterprise environment. And it's not necessarily, let's say, heavy decision support around sensitive decisions, right?
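For a flavor of what that looks like, here is a toy sketch of the anomaly-triggered maintenance pattern; the rolling z-score method, the 3-sigma threshold, and the vibration data are all illustrative assumptions.

```python
import numpy as np

def anomaly_alerts(readings: np.ndarray, window: int = 100,
                   sigma: float = 3.0) -> list[int]:
    """Return indices where a reading deviates sharply from its recent history."""
    alerts = []
    for i in range(window, len(readings)):
        recent = readings[i - window:i]
        mu, sd = recent.mean(), recent.std()
        if sd > 0 and abs(readings[i] - mu) > sigma * sd:
            alerts.append(i)  # anomaly: schedule an inspection
    return alerts

# A false alarm just means extra maintenance -- cheap compared with
# unplanned downtime on a high-value machine.
rng = np.random.default_rng(1)
vibration = rng.normal(1.0, 0.05, size=1000)
vibration[900:] += 0.5  # simulate a bearing starting to degrade
print(anomaly_alerts(vibration)[:5])
```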
Starting point is 00:24:58 In the worst case, you did extra maintenance, okay? So I think, you know, there are lots of heavily problematic areas where it's clearly decision support. Healthcare is super complex, right? Insurance is starting to touch these areas. And you see plenty of fraud detection in financial services. There again, in the worst case, you block a transaction and the user will call you: why did you block my transaction? Or, on the other side, they will say: oh, thank you for blocking that transaction, that was fraud, right? So no harm done in the worst case. I agree. There are a few
Starting point is 00:25:41 factors in play. One is the sensitivity of the transaction. The other one is the value of the transaction. And ultimately, what the decision is. You can't let a robot decide and do surgery on me automatically, even if I give consent. So it depends, it's situational. You can't generalize the decisions, right?
Starting point is 00:26:01 And also, I would argue, going back to your original statement, there are certain things humans can never decide. For example, the predictive maintenance that you're talking about. There's no way in hell I'd be able to analyze all the data that's available; even if I map a digital twin model, I won't be able to figure out when a specific component is going to fail. And when I say I, I mean not just me but any human, because our brain power is somewhat limited, right? So you need some help. So there, you could say, okay, you can have a machine look at this; this is a high-value
Starting point is 00:26:38 machine. I mean, I've worked on some of those use cases in the industry, in that particular area. It would say: if this machine goes down and is not working tomorrow, it's going to cost me millions of dollars per day, a high-value machine; this component has a high probability of failing within the next two weeks; let's order a replacement part. That can be automated. But just dumping that machine and buying a new machine, that decision may not be automated. So it depends on the use case. Yeah, definitely, I agree. And maybe I can add another comment, something that I see as very important in today's
Starting point is 00:27:15 AI landscape: what are the challenges of this moment, right? If you go to Gartner reports and different surveys and others reporting on the main challenges of today, I think they are doing post-hoc analysis of what was true in 2018. I see completely different challenges today, in 2020, and I think it will take a year until you find a report that reflects that. I would focus on the ability to customize your solution, the ability to handle real-world data, which means production-grade AI, building the trust that we touched on, and creating impact over time, right? How you build systematic solutions over time.
Starting point is 00:28:18 And that's the area that I see all the time in companies. That's what we are focusing on at BeyondMinds, and I think that's the trend you are going to see in the coming year. So, talking about the challenges between 2018 and 2020 and going beyond, one of the things that most enterprises don't do properly is, you know,
Starting point is 00:28:43 how to allow customization of your solutions in production, particularly for AI, particularly for the models. How do you allow them to be customized? Do you have any thoughts on that? Yeah, I will say two things. One is vendor selection strategy: the basic understanding that for the majority
Starting point is 00:29:04 of what you need to solve, you can buy, and reach ROI faster and cheaper. You should build a strategy for how to select vendors, what to push out to external vendors, and what to develop in-house. Because clearly the domain knowledge is within the company, but there is a lot of expertise around technology out there that you must use. So that's the first thing. And the second thing, which I think we touched on quite heavily in this conversation, is the basic understanding that AI is a system. You need to invest a lot of effort and time in solving fundamental problems that will later allow you to bring AI to production. The fact that you have 10, 20, 50 PhD researchers who just finished at MIT waiting to solve your problems does not mean it will be easy to bring it to production.
Starting point is 00:30:03 I can give you a quote that a client of ours told me a few weeks ago: to hire a data scientist today is quite easy, but to find a data scientist who will be able to bring something to production, that's one in a thousand. And you want to minimize this dependency on the ability of the data scientist to bring something to production. And the way to do that is to invest in the system, right?
Starting point is 00:30:34 It's not only the cloud infrastructure; it's the code infrastructure, the technology stack that you're going to use. And that's a different mindset, because think about a VP of R&D or head of AI in a large organization saying: wait, wait, wait, I need ten people for a year, a year and a half, two years, just to build infrastructure, code, solutions, before you even start to see ROI. That's a challenge. And this is why I think
Starting point is 00:31:06 many of these solutions should come from outside the organization. I think that's a great summary and a great way to end the discussion. I am going to take you up on the offer to have a follow-on podcast where we discuss some of the details of making this happen. But for now, folks, this has been just a marvelous discussion. Roey, thank you so much for joining us. Andy, thank you for joining us. Before we go, please tell us, both of you, and I guess we'll start with Andy and then go to Roey: where can we connect with you to follow your thoughts on enterprise AI topics? Absolutely. You can follow me on Twitter at @AndyThurai or at my website, theFieldCTO.com. Well, I'm quite easy to find online. My name is Roey Mechrez. You can find me on LinkedIn quite easily, and you can find beyondminds.ai online.
Starting point is 00:32:12 And we'll be happy to meet and discuss how to solve your problems. Great, thank you. And I'm Stephen Foskett. You can find me on Twitter at @SFoskett, and you can find my writing at gestaltit.com. Thank you for listening to the Utilizing AI podcast. If you enjoyed this discussion, please remember to rate, subscribe, and review the show on iTunes, since that does help visibility. And please do share this show with your friends.
Starting point is 00:32:33 This podcast is brought to you by gestaltit.com, your home for IT coverage from across the enterprise, and from thefieldcto.com. For show notes and more episodes, go to utilizing-ai.com or find us on Twitter at utilizing underscore AI. Thanks, and we'll see you next time.
