The AI Daily Brief: Artificial Intelligence News and Analysis - No, 95% of AI Pilots Aren't Failing

Episode Date: August 22, 2025

Today, we're breaking down the MIT study claiming 95% of generative AI pilots at companies are failing - and why this headline is misleading the entire market. The report, based on just 52 interviews and 150 survey responses, has been cited as a reason for AI stock crashes, but the methodology is deeply flawed and the findings are being wildly misinterpreted. What the study actually reveals is that while individual employees are getting massive value from AI tools (90% use LLMs regularly vs only 40% of companies buying subscriptions), organizations are struggling with implementation - not because the technology doesn't work, but because of leadership buy-in, poor change management, and organizational dysfunction.

Brought to you by:
KPMG - Discover how AI is transforming possibility into reality. Tune into the new KPMG 'You Can with AI' podcast and unlock insights that will inform smarter decisions inside your enterprise. Listen now and start shaping your future with every episode. https://www.kpmg.us/AIpodcasts
Blitzy.com - Go to https://blitzy.com/ to build enterprise software in days, not months
Vanta - Simplify compliance - https://vanta.com/nlw
Plumb - The automation platform for AI experts and consultants https://useplumb.com/
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.

The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Subscribe to the newsletter: https://aidailybrief.beehiiv.com/
Interested in sponsoring the show? nlw@breakdown.network

Transcript
Starting point is 00:00:00 Today on the AI Daily Brief, no, 95% of AI pilots are not failing, and the ones that are aren't failing for the reasons that all the reporting about that study suggested. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right, friends, quick announcements before we dive in. First of all, thank you to the sponsors of today's show, KPMG, Blitzy, and Superintelligent. To get an ad-free version of the show, go to patreon.com slash AI Daily Brief. And if you are interested in sponsoring the show, shoot us a note at sponsors at aidailybrief.ai.
Starting point is 00:00:40 We're now basically full up for 2025 and selling into 2026. And with that, let's get to this one that has been brewing all week long. Welcome back to the AI Daily Brief. Boy, has this one been brewing all week, man. Today we are finally digging into this report out of MIT, claiming that 95% of generative AI pilots at companies are failing. Now, there is really important context and background on why this report is worth the time that we're about to give it. And simply put, that's because it has not stayed as just a point of discussion in the AI space or even in the corporate and enterprise space,
Starting point is 00:01:19 but has actually been cited as one of the reasons that AI stocks are flailing right now. Analysts are talking about this. They're saying their clients want to talk about it. And it is being relentlessly cited as an example of why the AI bubble, in big old air quotes, is about to pop or in the process of popping. Now, this is not a market show, but I do think it's worth, before we dive into the report, having a little bit of a discussion about what I actually think is going on in the markets. As I'm recording this, Fed Chair Jerome Powell is about to speak in Jackson Hole. This is an annual central bankers
Starting point is 00:01:54 conference where each year the Fed lays out their vision of what's to come. Now, this being Jerome Powell's last year, people aren't really looking at it as some big inflection point on what the future of the Fed might hold, but they are very closely looking at it when it comes to whether we're going to get a rate cut in September. Markets really, really, really want rate cuts in September, and they are very, very nervous that Jerome Powell is going to throw cold water on that in this speech. In that general nervous climate, which is also, by the way, being aided by the fact that we are in the late summer period where liquidity is low, and just in general there's less actual stuff happening, there's definitely an increasing recognition
Starting point is 00:02:33 of the fragility of the stock market and the extent to which it's been propped up by enthusiasm around AI. And this is absolutely undeniable. Ever since the rate-hiking cycle started back in 2022, it has been AI enthusiasm versus the world. At every point when there's been some big crazy macro event, be it tariffs, be it war, AI enthusiasm has been there to hold it down and continue to give stock market investors something to be excited about and basically pin all their hopes onto. There is a growing fear now of what happens if that all goes away, one that's being actively abetted by short sellers who are specifically going after stocks like Palantir, which they view as wildly overpriced. Add to that the reporting that Sam Altman agreed
Starting point is 00:03:19 that we're in a bubble, which by the way his CFO was then deployed to walk back, plus wildly overstated reports around Meta not spending on AI, which they as well are denying. New chief AI officer at Meta Alexandr Wang tweeted, "We are truly only investing more and more into Meta Superintelligence Labs as a company. Any reporting to the contrary of that is clearly mistaken." Ultimately, it doesn't matter. The point is that the market is jittery right now, and that is the context into which this MIT study has come. Now, I will say, as we get into this report and into my very deep critique of both the report and the interpretation of the report, I will note that the folks at MIT who did this study are not working for Citron
Starting point is 00:04:02 trying to tear down Palantir and other AI stocks. They didn't know the context that their study was going to be released into. They didn't know when they were typing this up, that there was going to be a whole narrative moment around GPT-5. They didn't know that Sam Altman was going to say that we were in a bubble. The point is that while you can critique lots of things about this report, there was certainly no ill intent or an attempt to drive a particular narrative. It just happened to be the perfect headline bomb in that environment.
Starting point is 00:04:28 Now, with those caveats out of the way, this is what I will say. Anyone, and I mean anyone who is letting their opinion be overly shaped by this study, and especially anyone who is making financial decisions based on it, should be embarrassed and needs to rethink their general susceptibility to headlines. For two reasons. The first, the study just is not well constructed, period. I'm sorry, that's the truth, we're going to get into it. Second, it's not even really saying what the reporting has suggested.
Starting point is 00:04:59 So let's get into this thing. The first weird thing about this is that it's been extremely hard to get your hands on the actual report. Everything about this, the entire conversation, came out of this one piece in Fortune called "MIT report: 95% of generative AI pilots at companies are failing." Now, we don't have nearly enough time to critique the general media discourse where people are comfortable just reading the headline and going with it as though it's fact. But the group at MIT who actually released this report did not make it easy to get your hands onto it. As recently as yesterday, on Thursday, Professor Ethan Mollick wrote, Has anyone seen a copy of the Project NANDA report on 95% of AI pilots failing? It has been reported
Starting point is 00:05:41 about widely, but I haven't been able to actually read it, though I filled out a form to get access. I want to understand what they found and how they measured it. Any links? I had the same experience. As soon as I saw this news, I clicked the little link in the Fortune piece to the report and got this Google form to request access to the research. I did so numerous times, never heard from them, and eventually just found a version that someone had uploaded separately. Now, to the extent that it was an intentional decision not to make this report widely available, I kind of understand that after I got access to it. Before we get into the methodology, let's talk about the group behind this. It's called Project NANDA,
Starting point is 00:06:18 and they describe themselves as, quote, building the foundational infrastructure for an internet of AI agents, i.e. the system where trillions of AI agents can collaborate, communicate, and transact across organizational boundaries. Created at MIT, NANDA provides the index, protocols, and tools needed to enable this decentralized, protocol-neutral ecosystem. Now, where I think that the issues begin with this report is in the methodology. In short, this thing is kind of like vibes masquerading as a credible academic study. Turns out that this report, which has been so widely cited as causing billions of dollars to come off of the market cap of AI stocks, is based around 52 interviews, 150 survey responses, and a review of public announcements,
Starting point is 00:07:01 to see if people had talked about how AI is making them money yet. For some context, Superintelligent interviews significantly more than 52 executives every day. The idea that you could get a broad-based understanding from that in a way that should move markets is extremely, extremely dubious. By the way, we don't have any information about who those executives were, the size of the organizations, the role the executives had within the organization, or any of that information which would help us contextualize and understand how much we're talking about big firms, small firms, mid-market firms. It's also worth questioning this so-called
Starting point is 00:07:36 systematic review of over 300 publicly disclosed AI initiatives. Here's how the researchers say that they define success. "We define successfully implemented for task-specific gen AI tools as one users or executives have remarked as causing a marked and sustained productivity and/or P&L impact." As Yoink AI's Brian put it on Twitter, the MIT report's main sources: 300-plus companies' public announcements. The companies need to have announced that AI caused a marked and sustained productivity and/or P&L impact to be successful. No announcement apparently equals zero return. It really looks like they basically read press releases and SEC filings to determine if any of these things were successful based on how much productivity or P&L enhancement was being reported by the
Starting point is 00:08:19 companies. Now, other problems start to reveal themselves when you dig in. When you go to section 3.4 of the report, they share that they found that 50% of Gen AI budgets are going to sales and marketing. Now for them, this is part of why there's a Gen AI divide, because sales and marketing is a worse place to spend AI money than back office automation. But the idea, for anyone who has spent any amount of time digging around AI, that 50% of Gen AI budgets are going to sales and marketing is just absurd. Every credible study looking at where Gen AI is being applied sees it basically equal across the organization. It's everything from coding assistance to content generation, to document and knowledge retrieval, to product and design, sales productivity,
Starting point is 00:09:01 customer engagement and service, QA and testing. In other words, there is no universe in which 50% of Gen AI spending is going to sales and marketing. The only implication of that can be that the people who they interviewed from these 52 organizations are hyper-concentrated in those domains, which is fine if we understand that, but when it's not revealed explicitly, it calls into question all of the implications of the broader study. Reinforcing the idea that these were probably sales and marketing executives, when they asked their interviewees to allocate a hypothetical $100 to different functions, sales and marketing functions captured approximately 70% of the budget allocation. More broadly than that, the terminology is very unclear in this. What they're referring to as a pilot is sort of ill-defined,
Starting point is 00:09:43 what sort of implementations are happening that they're talking about is sort of ill-defined. There doesn't really seem to be any mention of AI or agentic coding despite the fact that that is very clearly the breakout use case inside the enterprise. And it kind of feels like all of this is about deployments of copilot tools and ChatGPT wrappers, rather than agentic AI, which is obviously where all of the emphasis is right now in the enterprise. And agents are really important to this story. Because, and here's where we move out of the critique of the report itself into the critique of the way that it's being interpreted, when you actually take this thing at face value, the report is not actually about whether AI delivers value or not,
Starting point is 00:10:22 but how much value accrues to individuals inside the corporation versus to the organization itself. The most underreported part of the study is that they found that basically all the value is accruing to the individual. They call this the shadow AI economy, and the starkest example they give is the percentage of companies who have purchased an LLM subscription versus employees who use LLMs regularly. Although they found that 40% of companies had purchased LLM subscriptions, they found 90% of employees were using LLMs regularly. "In many cases," they write, "shadow AI users reported using LLMs multiple times a day, every day of their weekly workload through personal tools, while their company's official AI initiatives remain stalled in pilot phase." Now, it's important to pause here, because the implication of all of these articles and all of the analysis on Wall
Starting point is 00:11:13 Street about it was that what the study said was that 95% of AI is not useful and not valuable. In other words, it was not that 95% of pilots were structured in ways where the organization didn't get as much value as they wanted to. It was straight up that 95% of the time AI isn't useful. And that's just absolutely not what they're saying. And here's where some of our discourse about productivity is problematic. Let's imagine that everyone inside an organization starts using a full slate of AI tools and is instantly 40% more efficient. Basically, they do their work in three days instead of five days. Even if that happened and everyone across the organization had that same experience, that would not
Starting point is 00:11:53 show up in productivity metrics or on a P&L. To show up in P&L or in some sort of productivity metric, the organization would either have to reduce the workforce by the equivalent 40%, so that the overall cost to produce the same amount of outputs went down, or they'd have to ensure that each of those people who were now more productive went back and used that time won back to produce more widgets for the same cost. Part of why organizations have been so aggressively interested in agents is that they have the potential to replace wholesale functions where that time can be better reorganized and
Starting point is 00:12:29 reconstituted in a much more direct way than in the context of the very fuzzy "person got their work done faster or better with ChatGPT" type of scenario. It's not to say that organizations aren't enthusiastic about their people using productivity-enhancing copilots. It's that when they think about the real big productivity gains of AI, they're looking beyond just the individual deployment of tools to the redesign of systems, and especially the redesign of systems where agents are taking on big chunks of the work. Now, here's the other big problem with the interpretation of this report. Even if the numbers were correct and 95% of Gen AI pilots were failing,
Starting point is 00:13:05 the way that the market and the naysayers have been interpreting this is that that means that 95% of AI technology itself is crap, rather than the much more obvious if you've ever worked inside an organization or big enterprise even for a little bit at any point in your life interpretation, which is that there are major problems not with the technology, but with the implementation of the technology. Problems, in other words, on the organization side. Well, it turns out that even right there in the report,
Starting point is 00:13:34 it is very clear that much of the problem is on the organizational side. When users were asked to rate each issue on a scale of 1 to 10 on how big a barrier it was in scaling AI in the enterprise, there were technical problems: model output quality concerns got a seven and a half, and poor user experience got around a seven. But the other three top answers all had to do with the organization itself. Challenging change management got a six and a half, lack of executive sponsorship got a six and a half, and an unwillingness to adopt new tools got a full nine. Now, given that we, at Superintelligent, do this day in and day out,
Starting point is 00:14:10 and actually interview thousands of people about these questions, I've put together this list of 15 reasons why AI pilots actually fail. And let's be clear, some of them are, yes, technology problems. There can be platform mismatches, where a startup is trying to implement a new solution that just doesn't play well with whatever legacy systems an enterprise has. This is a really common refrain. Now, usually you can avoid that before you get into the pilot if you've gone through any sort of quality assessment and design process to actually lay out the approach, but still it's there. There is also, yes, underperformance. This is a new technology.
Starting point is 00:14:47 Startups are going to overpromise and underdeliver sometimes. It would be insane to claim that they didn't. We also are dealing with an environment where there can very easily be surprise costs, where a thing that you thought was going to take X tokens actually takes X times 10 tokens and there's overages and all of a sudden people are all chagrined. And the thing that seemed like it was going to add value actually isn't all that much cheaper in the first place. There are other technology problems as well. And the point is that I'm not trying to obfuscate those things or say that they don't exist. My point is that whether you consider this one or three or five of the problems,
Starting point is 00:15:20 I'm about to tell you about 15 others that are on the organizational side. What if AI wasn't just a buzzword, but a business imperative? On You Can with AI, we take you inside the boardrooms and strategy sessions of the world's most forward-thinking enterprises. Hosted by me, Nathaniel Whittemore, and powered by KPMG, the seven-part series delivers real-world insights from leaders who are scaling AI with purpose, from aligning culture and leadership to building trust, data readiness, and deploying AI agents. Whether you're a C-suite executive, strategist, or innovator, this podcast is your front row seat
Starting point is 00:15:53 to the future of enterprise AI. So go check it out at www.kpmg.us/AIpodcasts, or search You Can with AI on Spotify, Apple Podcasts, or wherever you get your podcasts. This episode is brought to you by Blitzy, the enterprise autonomous software development platform with infinite code context. Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise-scale codebases with millions of lines of code. Enterprise engineering leaders start every development sprint with the Blitzy platform,
Starting point is 00:16:25 bringing in their development requirements. The Blitzy platform provides a plan, then generates and pre-compiles code for each task. Blitzy delivers 80% plus of the development work autonomously while providing a guide for the final 20% of human development work required to complete the sprint. Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-IDE development tool, pairing it with their coding copilot of choice to bring an AI-native SDLC into their org. Blitzy is providing a limited-time, 30-day free proof of concept for qualifying enterprises.
Starting point is 00:16:55 The team will provide a 5x velocity increase on a real development project in your org. Visit blitzy.com and press book demo to learn how Blitzy transforms your SDLC from AI-assisted to AI-native. That's BLITZY.com. If you are a regular listener, you will have heard about Superintelligent's agent readiness audits at this point. But I wanted to tell you today about the full suite of agent readiness products that go beyond just the initial readiness report. Over the last six months, Superintelligent has built out an entire agent planning suite. We help you move from discovery to planning to implementation. After you've completed your agent readiness audits,
Starting point is 00:17:34 we help you double-click on your most important use cases with what we call our use case planning reports. These reports are going to help you understand what sort of technical preparation you need to do to be ready for a use case, what challenges you might face in implementation, and whether you should be thinking about building, buying, partnering, or some combination. After that, you can even get a spec document in what we call our technical blueprint that gives either your developers or the developers of the partner you work with what they need to build exactly the agent that you're looking for. If you want to learn more about Superintelligent's agent planning suite, we've built a custom GPT to answer your questions. Just go to bit.ly slash super super super agent. That's bit.ly slash super super agent, all one word. And if you have any questions, the agent can even help you book an appointment with our team.
Starting point is 00:18:22 Let's start with the most obvious one, which is leadership buy-in. If executives are not bought in to change, it will not happen. Anytime we see an organization where there's some innovation group that has a theoretical mandate to do AI pilots, but not actually the budget to do so or the executive sponsorship to do so, we know beyond a shadow of a doubt that it's got an extremely limited shelf life. It's just not going to go that far. Leadership buy-in is the fundamental prerequisite to all change, period, not just AI change. We see constantly that the organizations that have the best agent readiness scores when we do our assessments
Starting point is 00:18:55 are those that have not only executive buy-in but CEO-level buy-in. Without it, you're just living on borrowed time. Interestingly, though, there is an inverse of this, which is equally important, frankly, which is team buy-in. We often see the opposite situation, which is where the executives are thrilled, engaged, highly excited and motivated, but haven't taken the time to get their teams and employees bought in. Now, those teams and employees may be going through a set of considerations and concerns
Starting point is 00:19:23 around what these technologies are going to mean for them. Are they simply training their replacements? What's the company's vision of how digital employees and human employees are going to interact in the future? Is this an organization that's trying to use agents to do new things, build new products that weren't possible before? Or is it an organization that's just trying to cut costs at any cost? Lots and lots of studies have shown this potential for misalignment between employees and executives. A number of times I've shared this study from Writer last December that interviewed 800 employees and 800 executives, and found just wildly divergent perspectives on AI inside the organization. For example, this 31% gap between executives and employees when it comes to belief that their
Starting point is 00:20:03 company has a high level of AI literacy, a 30% gap between the percentage of employees who think the company had been successful in adopting AI versus the number of executives who did, and so on and so forth. So yes, you need leadership buy-in, but you need team buy-in just as much. Next up, we have problem-value fit. This is basically where you see a real cool demo and you decide to try it out, but no one can really name the metric that's going to move. These tend to come out of briefs that talk about innovation or productivity in the abstract, but where the people who are actually implementing this can't point to a specific error or cost or problem that's going to actually be impacted positively. Success then tends to feature
Starting point is 00:20:41 anecdotes and screenshots, not a target KPI or baseline. And speaking of baselines, this is another area where pilots often fail. Teams will say it feels faster or it feels better, but no baseline exists and there's no control. The dashboards that we create don't have a before number. Wins in these situations are measured in vibes. And when challenged, teams can't show lift or even really a simple pre and post. Another challenge is that a lot of times, the tools and systems that are being implemented lack the relevant enterprise context to actually be useful. Sometimes, yes, just a general purpose tool can provide some lift. But a lot of times, what's going to make AI work or not is the information and data it has access to regarding the enterprise that it's trying to work within.
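To make the baseline point above concrete, here is a minimal Python sketch of a simple pre and post comparison with a control group. Everything in it is an illustrative assumption rather than something from the episode: the function name relative_change, the minutes-per-ticket metric, and the sample numbers are all hypothetical.

```python
from statistics import mean


def relative_change(before, after):
    """Relative change of the 'after' mean versus the 'before' mean."""
    base = mean(before)
    return (mean(after) - base) / base


# Hypothetical minutes-per-ticket samples, captured BEFORE the pilot starts.
baseline = [30.0, 28.0, 32.0, 30.0]

# Hypothetical samples during the pilot: one team with the AI tool,
# one comparable team without it (the control).
pilot_team = [24.0, 22.0, 26.0, 24.0]
control_team = [29.0, 31.0, 30.0, 30.0]

pilot_change = relative_change(baseline, pilot_team)
control_change = relative_change(baseline, control_team)

# Subtracting the control's change removes drift that would have
# happened anyway (seasonality, easier tickets, and so on).
net_change = pilot_change - control_change

print(f"pilot: {pilot_change:+.1%}, control: {control_change:+.1%}, net: {net_change:+.1%}")
```

The design choice is the one the transcript argues for: the baseline list has to exist before the pilot begins, otherwise there is nothing to divide by and the result is measured in vibes.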
Starting point is 00:21:27 Enterprise AI that lacks enterprise context is ultimately extremely limited. Which gets to another problem, of course, which is data readiness. Lots of times that context does exist, but not in ways that AI can take advantage of. There's a reason that you see so much effort and work around things like getting enterprise data into MCP servers, or you see tons of money being spent on the AI-ification of data lakehouses. It's because of that context question and the need that AI has to have all that enterprise context that's obviously expressed in the form of data, ready and able to be used by these AI systems. Unfortunately, though, when it comes to enterprise data, it's not just a readiness issue, it's also an access issue. I was talking with a hedge fund earlier this year,
Starting point is 00:22:09 and they were in the midst of getting all of their data and signal into formats that were accessible by AI systems. The problem was that across the organization, there was a huge variety of different levels and types of data permissions. One person might have access to data sets X, Y, and Z, but not A, B, and C, while another person might have access to A, Y, and D, but not B, Q, and F, and so on and so forth. So even once you've got your data in a format that is usable by AI, you then have to design systems for permissions and provisioning that reflect the real world of who can interact with what information.
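The permissions point above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Record type, the GRANTS table, the user names, and the visible_records helper are hypothetical, and a real system would pull grants from an identity or entitlement service rather than a hard-coded dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    dataset: str  # which dataset the record belongs to
    text: str     # content an AI system might retrieve as context


# Hypothetical per-person grants, mirroring the "X, Y, and Z but not A, B, and C" example.
GRANTS = {
    "analyst_1": frozenset({"X", "Y", "Z"}),
    "analyst_2": frozenset({"A", "Y", "D"}),
}


def visible_records(user, records, grants=GRANTS):
    """Return only the records whose dataset this user is allowed to read."""
    allowed = grants.get(user, frozenset())  # unknown users see nothing
    return [r for r in records if r.dataset in allowed]


records = [
    Record("X", "position report"),
    Record("A", "counterparty notes"),
    Record("Y", "market signal"),
    Record("B", "restricted memo"),
]

# The filter runs BEFORE anything is handed to a model, so the AI system
# only ever sees what the person asking is already permitted to see.
for user in ("analyst_1", "analyst_2", "intern"):
    print(user, [r.dataset for r in visible_records(user, records)])
```

Filtering at retrieval time, per user, is one common way to make AI access reflect who can interact with what information; it is shown here as a sketch, not as the approach the hedge fund in the story used.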
Starting point is 00:22:45 At this point, if your head is spinning around how much has to go into making these systems work, You're not alone, man. Another big challenge is poorly documented workflows. Now, as I talk about all the time on this show, AI is not RPA 2.0. Organizations that think about it simply as a way to one-to-one automate the existing work that people do are wildly under maximizing the potential of what AI can do for their organizations. At the same time, a lot of the natural starting points are some amount of automation of existing monotonous workflows, which can only happen if those workflows are actually documented in a way that the AI can
Starting point is 00:23:19 understand. There's a reason that you see a million startups right now that are basically recording the screens of people who are doing work to understand exactly what they do so that they can then go imitate and hopefully improve upon those workflows. But right now, on average, the quote-unquote documentation of workflows exists solely in the heads of the employees who are actually doing that work. And then, of course, there's skills, enablement, and support. Wait, so you're giving all of your employees access to this incredibly powerful and complex new technology? And you think that just because the technology itself is expensive, you shouldn't also have to pay for skills enablement and upskilling? Sorry, man,
Starting point is 00:23:54 it's not how it works. Even people who are quote-unquote AI experts are only AI experts because of the sheer amount of time they've spent figuring out how to actually interact with these systems. In many cases, the patterns that we have from using and interacting with previous software do not apply to Gen AI. And guess what? The state-of-the-art opportunities that AI really represents are going to take way more than a Coursera prompt engineering course. Now, this is also a market provisioning problem, given that there are so few good resources for things like how to interact with and manage agents. But part of the reason that it's a market problem is that entrepreneurs know that enterprises are trying to get out of this without having to pay for that, and so they don't want to be
Starting point is 00:24:31 the one who's desperately clinging to the coattails asking for some scraps from the table. If organizations and enterprises are serious about AI transformation up and down the organization, both in terms of agents doing big buckets of new work, but also their existing employees being more productive, they're going to have to pony up for skills training, enablement, and broader change management. Then, of course, you have some very obvious things like overzealous risk departments that don't allow people to actually use these tools in the ways that could create the most opportunity. For example, we have a partner right now that is reselling our voice agent interview assessments to their clients, but whose risk department will not let their teams be interviewed by voice agents.
Starting point is 00:25:10 If you want to try to take the time to go make sense of that, by all means, I'm just going to keep cashing the checks. There are broader management issues like organizational fragmentation, where different people in different parts of the organization may be piloting different systems that don't necessarily work with one another or are even in competition, or the reverse, which is existing vendor lock-in. And I think this one is worth taking a moment for in the context of this specific study. It's pretty clear, if you read between the lines of this thing, that the employees at these organizations that this MIT group looked at are fundamentally uninterested in using the crappy versions of tools that their organizations are giving them access to,
Starting point is 00:25:47 and instead just want to use the general consumer tools that are way more advanced. Call this the Copilot-ChatGPT problem. Anyone who has touched AI at all in the enterprise has seen examples of this, where when you're logging in with your Gmail at home and using these most advanced reasoning models, to then have to come back and use the neutered versions that your enterprise is giving you access to is just completely unbearable. Especially because in a lot of cases,
Starting point is 00:26:14 every new update of the state-of-the-art unlocks meaningful amounts of new use cases. It's not like we're so far into the future right now, where even older, crappier models are super useful. For some use cases, they are, but for many use cases, you really need something that's close to the state of the art, and if you don't have access to it, you're simply not going to be able to do that work. The way that the MIT group puts it is this: "Employees know what good AI feels like, making them less tolerant of static enterprise tools." The last couple of reasons I'll mention that pilots fail have to do with leadership again, but leadership in the context of the pilots.
Starting point is 00:26:48 It can so often happen that pilot ownership or leadership is like a hot potato. Some executive sponsor says that they want it. They hand it off to someone who has never exactly bought in. And then they're just going through the motions of aiding the pilot when they're not even particularly convinced that it's actually going to be all that useful. This happens all the time. And it's why I separated leadership buy-in and team buy-in and said that they're both incredibly important in context with one another.
Starting point is 00:27:12 And then, of course, there's the situation where, even if ownership of the pilot is clear, it's a one-off with no strategic plan or next steps articulated and no ultimate direction. At this stage, this is the default and the norm. Let's try a pilot to see what we can do without any larger consideration of the big goals that you're trying to achieve as an organization. Pilots that are conducted like this in a strategic vacuum are simply much less likely to succeed and be a part of actual organizational change. I've been discussing this study throughout the week with our head of research. And when I asked her to estimate the actual distribution of failure rates between organizational issues and technology, her thesis was that it was about 80% organizational, 20% technology. So four to one organizational versus tech-related issues.
Starting point is 00:28:01 Now, there's one more funny thing underneath all of this, which is the idea of using pilot failure as a bully cudgel in the first place. Like the idea that pilots failing isn't simply a part of the expected distribution of pilot results. If you are running an organization and trying a novel technology like AI, especially one that's as fast-moving and dynamic as AI, and all of your pilots are working, it is almost assuredly the case that you're not being experimental enough. You're not trying enough things. You're not thinking far enough about what AI could be doing for you. Some percentage, in other words, of your pilots should be failing, certainly not the 95% that MIT claims, but
Starting point is 00:28:42 some meaningful amount. AI, again, is not a technology that's exclusively meant to be a one-to-one replacement for existing workflows. It represents an opportunity to do things that were not possible before, and you're not going to discover those things if you have no tolerance for pilot failure. Now, luckily, with a week under our belt, we're finally having a narrative shift again around the report. VentureBeat writes, "MIT report misunderstood: Shadow AI economy booms while headlines cry failure." Fortune's AI editor felt the need to go write a follow-up: "An MIT report that 95% of AI pilots fail spooked investors. But it's the reason why those pilots fail that should make the C-suite anxious." Like I said at the beginning, I think a lot of the resonance of this report has to do with larger
Starting point is 00:29:26 market forces right now, and in a different context, we wouldn't be giving it all this attention that we've been giving it. However, to the extent that it becomes used as an excuse for why your organization can slow-walk this change, I think that you're doing yourself a disservice. Hopefully you have a better sense now of not only why you should perhaps take this particular set of results with a grain of salt, but also a better roadmap of the type of reasons that pilots actually fail in practice. In any case, this has gone on way too long. It's a Friday. Go out. Have a great weekend. I'm sure we'll have more to talk about next week.
Starting point is 00:29:57 Appreciate you listening or watching, as always. And until next time, peace.
