Big Technology Podcast - Are 95% of Businesses Really Getting No Return on AI Investment? — With Aaron Levie

Episode Date: September 17, 2025

Aaron Levie is the CEO of Box. Levie joins Big Technology to discuss the reports that a vast majority of businesses are not getting a return on their AI investments. Levie shares his takeaways from the reports, gives a rebuttal, and discusses the reality on the ground. Stay tuned for the second half where we separate hype from reality in the AI agent conversation. Tune in for a wide-ranging, post-BoxWorks deep dive on where AI is heading in the coming years. --- Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice. Want a discount for Big Technology on Substack + Discord? Here's 25% off for the first year: https://www.bigtechnology.com/subscribe?coupon=0843016b Questions? Feedback? Write to: bigtechnologypodcast@gmail.com

Transcript
Starting point is 00:00:00 Why are the headlines telling us that businesses are getting no return on AI investment? And are AI agents finally ready to get to work? We'll cover it all with Box CEO Aaron Levie right after this. Oktane is the premier identity event, bringing together the world's leading minds to discuss the future of secure access. Instead of consolidating security into a single platform, a modern identity security fabric is the key to unifying your defenses. At Oktane, you'll learn how to extend that fabric across all types of identities, including the emerging threat of AI agents. Join in person in Las Vegas from September 24th to 26th or catch the keynotes and sessions online. To register and see the full agenda, visit okta.com slash oktane.
Starting point is 00:00:48 That's o-k-t-a dot com slash o-k-t-a-n-e. You're used to hearing my voice on The World, bringing you interviews from around the globe. And you hear me reporting environment and climate news. I'm Carolyn Beeler. And I'm Marco Werman. We're now with you hosting The World together. More global journalism with a fresh new sound. Listen to The World on your local public radio station and wherever you find your podcasts.
Starting point is 00:01:27 Welcome to Big Technology Podcast, a show for cool-headed and nuanced conversation of the tech world and beyond. Well, today we're going to talk about AI and its application in business, whether it's actually making a difference and whether AI agents are a real thing. We have the perfect guest to do it today because we have Aaron Levie back with us, fresh off the BoxWorks AI event. And Aaron, it's great to see you as always. Thank you, Alex. Good to be here. So did I add an AI to the BoxWorks event name, or is it just called BoxWorks? I actually like that you called it an AI event. It is just called BoxWorks, but anytime you want to jam an AI in there, we're good. Okay, sounds good. You had a lot of AI news, and we'll get into that in a moment,
Starting point is 00:02:12 but since you are talking with a lot of folks about ai applications uh in business i want to run this mit study by you and get your perspective on what's real and what's not so uh this is from axios a couple weeks ago, MIT study on AI profits rattles tech investors. Well Street's biggest fear was validated by a recent MIT study indicating that 95% of organizations studied get zero return on their AI investment. They studied 300 public AI initiatives trying to suss out the no-hype reality on AI's impact on business. 95% of organizations said they found zero return despite despite enterprise investment of $30 billion to $40 billion into generative AI. This has been a study that everybody in the business world is talking about.
Starting point is 00:03:01 Do you think there's any validity to it? You're already shaking your head. I'm shaking my head on actually like seven dimensions. We could parse each one. Let's do it. So, I mean, actually, maybe the first one that is maybe the most funny is the kind of Wall Street element. Actually, Wall Street is completely schizophrenic on this dimension. Obviously, a report like that scares them on one dimension, but actually there's an equal amount of kind of Wall Street, maybe, you know, kind of frenetic energy around the idea that AI will be so good that all of software is dead.
Starting point is 00:03:39 So it's this very kind of bipolar state of, you know, where are we in AI adoption versus AI is going to be so powerful that there's not even going to be software business models because everything will just be delivered by AI. AI. And as with most things that have these kind of extreme polarization elements, I think the reality is just, you know, way more nuanced. We are still early in the adoption curve of AI. In the early, you know, curve of all of these types of technologies, you have lots and lots of proof of concepts. You have lots of trials of different technologies. People are trying to figure out which tool works for which use case. So by definition, you're kind of in the Wild West where there's lots of attempts at trying these technologies with various vendors and technology stacks. And many of those projects and pilots will absolutely fail because by definition they're pilots. And we're still in the early phases. One interesting thing about this study was
Starting point is 00:04:34 they saw a significant delta between companies that tried to effectively DIY their AI stack versus going with really kind of applied solutions and use cases. And this is what we tend to find in our customer base. So I think there was maybe an initial theory of, well, AI will be relatively easy to kind of get our arms around. We could build our own AI application. We'll do all of the vector embeddings of our data ourselves. We'll put
Starting point is 00:05:01 into a vector database. We'll have, we'll manage the security and permissions of data access ourselves. And, you know, before you know it, a company that wanted to deploy AI in a particular workflow in their enterprise, they might have 10 or 15 different pieces of software that they have to run and manage just before, you know, before a single user could actually interact with, with, with AI within that organization. So that's probably an architecture that's not going to work. You need to have purpose-built solutions that solve sort of tailored use cases. Those can be very big use cases like all of AI coding,
Starting point is 00:05:33 but you probably don't want to be in a position where you have to kind of bootstrap this or build it all out yourselves. And that was one of the kind of recognitions in the survey. But I obviously wholeheartedly disagree with any of the maybe conclusions other than just you have to get your use cases right, you have to kind of target the most effective areas for AI, and you probably shouldn't be building this technology yourselves. But it's sort of empirical on our end.
Starting point is 00:06:03 We get to talk to customers every single day that are seeing the immediate gains. We've talked to customers where they have had colleagues that can't actually... They can't present the actual ROI savings to their board. The actual kind of expected ROI savings to the board because the board won't believe how they won't believe the numbers based on how good they are. So they actually have to water them down.
Starting point is 00:06:34 So it's actually more pragmatic and believable based on what they're seeing. Isn't that a terrible board? I mean, if the board, it can't hear the truth. Well, the true board. But the truth is so good that it doesn't sound credible. So that is the, like, when the ROI is so good that you actually don't, you aren't going to be believed when you actually explain how this thing's going to work. So we're seeing examples all across the board, at least for our customers. You know, we have the benefit of a very applied use case, which is we take documents and unstructured data.
Starting point is 00:07:06 And then we have AI agents that can operate on that data to do things like extract structured data from your documents. So give us, you know, 100,000 contracts. we'll pull out the structured data fields in those contracts, or give us invoices, and we'll pull out the key details in an invoice so we can help automate a workflow. Those use cases tend to be very high ROI because either you weren't getting that data before or it used to be very expensive to do so, and AI is getting increasingly good at being able to execute that kind of task. And so there's immediate benefit to customers.
Starting point is 00:07:37 You can automate workflows as much more easily as a result. You can lower the cost of operations in some areas. So we tend to see a different set of outcomes. based on the AI adoption within our customer base. But if you zoom out and you kind of think about all projects across the past couple of years, I do think you're going to get a mixed bag just as a reality of how early we are on the space. Yeah, and it says internal builds fail at double the rate of external partnerships. So spot on there, people trying to parse this together on their own versus doing it externally
Starting point is 00:08:07 or having a tough time, which sort of flies in the face of like some of the conventional wisdom. I think the conventional wisdom was you wanted to be able to build internally, maybe with open source, so you could customize to your use case, but it turns out some of the off-the-shelf stuff is actually working quite well. Yeah, I think you have to, you know, a lot of the challenge with either these types of surveys or even talking about architectures is you have to kind of separate the tech industry from the non-tech industry, the non-tech industry being the kind of consumers of these types of technologies and the tech industry being the builders.
Starting point is 00:08:41 So open source is insanely valuable, but not in the sense where a large, law firm should go off and build their own AI project using an open source model. Like, that is just a recipe for disaster if, you know, we think that every single company on the planet is going to go build their own technology to go automate their workflows. And that has been actually the case for a lot of pilots because we've been early in the technology and you haven't had applied solutions that you could go deploy. But open source is actually extremely valuable for a company like Box because, you know, we're, you know, we're powering technology for 120,000 customers.
Starting point is 00:09:15 And so we actually do have the expertise internally to leverage those kinds of capabilities. And so I would say the conclusion from the dimension of open source as an example is just, you know, you probably shouldn't expect that every company in the plan is going to DIY their own AI strategy. And that's a recipe for not getting the returns and gains from an AI adoption standpoint. And then maybe the only final point I think I kind of point out is just there really is a decent amount of change management required to getting real gains from an AI adoption standpoint. AI. There's not a, this is not a panacea type of, of solution where you could take an existing workflow, drop AI directly into it, and then all of a sudden that workflow will be, you know, 3x better. You usually do have to re-engineer the work to take advantage of AI. And the conclusion
Starting point is 00:10:03 I've recently come to more and more is, you know, I think we had this feeling maybe two or three years ago where AI was going to learn everything about how we work, it would be able to adapt to our workflows and then bring automation to our workflows. And I think realistically, increasingly, we probably will have to modify our work, hopefully incrementally, but in some cases, meaningfully, to fully take advantage of AI. And that sounds maybe hard on one hand, but for the companies that do that, the ROI is going to be fairly massive. So if you think about AI coding as maybe the most obvious example right now where you're seeing productivity gains, The way that AI first engineers tend to work is pretty different than how you engineered two or three years ago.
Starting point is 00:10:47 The engineer really becomes more of a manager. You're deploying agents to go off and work on large parts of the code base, and then it's coming back with a bunch of work that you go and review. So if you don't change your workflow as an engineer to take advantage of background agents and how you give them the right kinds of prompts to actually execute on their task, and the new ways you should effectively think about your code base, and, you know, handling the specifications and, you know, rules of what the AI agent should do, if you don't do all of that work, you're probably not going to get a 2x or 5x gain from AI.
Starting point is 00:11:21 And so we will actually have to re-engineer some of our business processes to make agents effective, as opposed to thinking agents will just drop into our processes and automate everything that we're doing. By the way, you've brought up pilots a couple times. And I think it's important to talk about because this study was not just pilots. It was 95% of organizations get zero return on AI investment. So I think the pilot thing is interesting because it's natural that pilots are going to fail. And in fact, we've had some listeners who've given me some feedback that said, because I talk often about how like only 20% of AI pilots or 10 to 20% of AI pilots get out the door into production.
Starting point is 00:11:59 And that might be a good number because you're going to obviously, you know, have some trial and error in the early days. Yeah. And to be clear, I'm using pilots colloquial. in the sense that we're just so early in the technology that when we talk to customers, what a lot of times they have so far deployed is the equivalent of a pilot, just because of literally how organization-wide. Yes. Well, organization-wide is, you know, it's hard for one centralized survey taker to represent
Starting point is 00:12:27 an organization-wide. It's like, like, that's why, again, that's why I don't want to like, the survey is great. It's an interesting, you know, kind of a conversation starter, but, like, if you actually tried to go assess how is the answer, you know, answering this question and what is their way of measuring that productivity and have they actually surveyed all of the end users that are just using Chachabitin in an unsanctioned way and what they're doing. It's like it's not possible to capture all of that. So it tends to more represent the kind of the centralized, you know, heavily sort of, you know,
Starting point is 00:12:58 again, kind of, I think more likely pilot oriented type projects because of just, again, how early we are. The word agents just came onto the scene less than a year ago. So we're just early in a lot of these spaces. But again, I think it's a fantastic survey because it gets a conversation going. But I think if the takeaway was to slow down, you know, using AI or to do anything other than kind of realize what you should mitigate from a risk standpoint, then actually the failure would just be or the problem with that would just be, all it's going to do is cause some companies to to move even more slowly and then you'll have other companies just outrun them so so it's kind of up to the you know it's sort of you know uh at the risk of um you know you know the the risk is now
Starting point is 00:13:42 on the listener to decide what they want to do about that that survey yeah and i can tell you one more thing that i found super interesting about this study which has sort of been underappreciated so it says official lm purchases cover only 40% of firms yet 90% of employees use personal AI daily at least those surveyed, which just is so interesting because it means that, yeah, there's, there's more personal use and more interest among individuals than companies to get this stuff into production. Yeah, you obviously have reaction here. So let's hear it. Yeah. Well, I know. I just think that's that's like empirical revealed preference. So, so like you don't have to like you don't have to survey once you know that. Why are people, you know, going off and and using AI in a personal productivity sense at that rate? It's because they're they're getting value from. So you almost, like, that is sort of now in the baseline of how people are working. It's unquestionable that if you just sort of eliminated AI just today, let's just say, you would just notice, wow, okay, I actually have to go and do that three hours of research
Starting point is 00:14:46 that I used to be able to go and kick off as a deep research project and go and check back in on it, you know, after five minutes. And so, so, you know, it's empirical that we're choosing to use these technologies on a daily basis because they're adding that productivity. And I would argue that that what we've seen with AI thus far is barely scratching the surface of what is going to start to happen as you start to deploy these technologies. But do you think the use in business could it potentially be just individuals using, let's say, chat GPT on their own versus scaled enterprise use of large language models? Or because, or do you think it will be some blend? In the future? You're obviously watching in the future because you're obviously watching this happen on the other side of things. No, the future is, I think that we are in the earliest phases of just even the diffusion of the technology itself, of the basic use cases of, hey, when you're going to go research a customer, you know, why don't you get a full account plan, you know, instead of just saying, okay, this person works at this company and they're interested in these things and these are the trends of that industry, why not ask an AI system to generate the full plan?
Starting point is 00:15:57 And that's super powerful, but also relatively basic if you think about how people work and the full scope of workflows that people do. One really interesting example of, again, how early we are. Claude this week announced a new capability that will generate files for you. And even though we're two and a half years, nearly three years into the chat GPT moment, it's the first time where an AI system can, I believe, generate reliably. a kind of high quality document in the form of a word document or a PowerPoint presentation. So we're nearly three years in, and it's the first time ever that you could generate something that you would sort of look at and say, oh, that looks like a good presentation.
Starting point is 00:16:40 So we are only at the very, very beginning stages. Now imagine it'll still take a couple of years. Now imagine a technology like that begins to ripple through corporations. And in the future, before you go and present whatever product you're selling to a, a customer. Instead of spending one or two hours of doing a bunch of research and making your PowerPoint file, that's your presentation, you go to an AI agent, you say, I'm about to go sell to this customer, generate this presentation for me. You kick that off and again, three minutes later, it's sort of done for you. This is going to just show up in all of our workflows every
Starting point is 00:17:15 single day in almost everything that we're doing. So coders are getting the first lens into what the future looks like, you know, earliest because, you know, they're sort of wired to take advantage these tools and AI coding has been the kind of first breakout use case. But that same dynamic of you're going to go to an interface, you're going to talk to an agent, it's going to go and execute kind of multiple steps of work for you, that will start to emerge within all of knowledge work over the coming years. I actually am probably a pragmatist on this sense that it will not be like this instant overnight transformation of work. It will take years of change management. We just hosted our conference this week, as you noted, and
Starting point is 00:17:54 And it happens to be a crowd, obviously, by definition, that is sort of forward-leaning in kind of early adopters of technology. But that represents a small fraction of the total economy. It will take years before, again, all of the banks, all of the pharma companies, all of the law firms start to get wired up in this AI-first way. But, I mean, unequivocally, it's going to happen. And there's nothing that will kind of slow that train down. All right, let's talk a little bit more about this using Cloud to generate documents use case. I mean, I would imagine, so you talked, the example that you gave was using one of these to go in and sell into a client. Now, I would imagine most organizations, they have like their PowerPoint templates and the data baked in.
Starting point is 00:18:37 So even if I were to go into Cloud and like upload my pricing spreadsheet, my inventory spreadsheet, a document about positioning and say make a PowerPoint based off of this, I'm sure it would do a good job. But how practical is it to then say this is going to be a way that people do their work versus something that might look like a party trick where you're going to use the other documents that you have already when you actually are going to go out into market? Oh, yeah, no, the way that this will actually show up, and I can't represent the exact date that this will happen, but box, you'll just go to box and you'll say, here's my sales presentation template, here's the new client information, please generate a PowerPoint presentation with that. And then you'll just do that with your existing data. This is not sort of, you know, some kind of one-off vibe-coded document. You will use your existing assets as the source material for the next document that you'll generate. And you'll go and review its work, and that'll take you three minutes. But it will have saved you, you know, an hour or two hours of all of the time that it took to do the customer research and move around all the graphics and put the relevant information in place.
Starting point is 00:19:46 That will just be done for you. So, and that will, you know, multiply that over a million people that do that per day in, you know, some sector of the economy. And you'll just see, you know, that's how you'll get tens of millions of hours of productivity gained, you know, within the economy. And how are you feeling about the trustworthiness of these models? Because you've talked a couple times now about how you could use deep research to prepare you for something or you could use these models to generate a PowerPoint and then spend a couple minutes checking them over. Are you at the point now where you think the outputs of these models are trustworthy enough that that's all it takes? I think as long as – and this is where I get very excited about now that obviously what's in the zeitgeist is context engineering. As long as you are really good about what context you're giving the AI and how you are effectively grounding the AI in trustworthy data with the right kinds of prompts and a high enough quality model, you can nearly eradicate the –
Starting point is 00:20:42 you know, all of, if not the vast majority of hallucinations or accuracy issues. So in our case, you know, everything that we do at Box is we think about your existing data as the source material for the AI agent. So it's the source context for the AI agent to be effective. And so if I take an existing PowerPoint document that's our sales presentation and I say modify this for a new customer, and you do that with a, you know, a frontier model that is a reasoning model, you know, with some degree of kind of thinking mode, I would posit that 99% of the time it's going to make, you know,
Starting point is 00:21:19 infinitesimally fall small kind of errors or failures on that. That's just like a solved problem at this point. And so, and it is still easily worth the kind of five-minute trade-off for the couple hours you save to go and review its work. And this is the, like, we actually have this incredible front row seat in watching what the future looks like with coding. So if you talk to, if you talk to the new, like the brand new startups, and I don't know if you, oh, if you do this, but I know that,
Starting point is 00:21:47 you know, you get, you know, to spend your time with the demest of the world and whatnot, but like go talk to a five-person startup that's brand new. And what's exciting is they're working in the craziest ways that I've ever seen in my entire life. I was talking to a nine-person startup the other day that estimates that they're, at a minimum, executing at the size of about a hundred-person company. And that was, again, kind of conservative, probably when, when, you know, do the underlying math. And it's because each of their engineers has the capacity
Starting point is 00:22:16 output now of five or ten or twenty engineers worth of work, but they are working in a completely different way. They are managers of AI agents. They spend their time on writing really good specs for what they want to build. They spend a really good time on the design architecture of their software. And then they spend a lot of time on reviewing the output of the agent. So, you know, not every area of knowledge work will look exactly like that. But if you imagine, you know, in sales, if you imagine in marketing, if you imagine in legal work, and your role is to manage agents that are doing a lot of the underlying data preparation, research, you know, creation type of work, and then your job is to go review that work
Starting point is 00:22:59 and put it together in a broader business process, that will actually be what a lot of work looks like in the future. And this idea of hallucinations or errors will be no different then the fact that I have to sometimes review other people's work and other people review my work. And I have errors in the presentations that I create that somebody catches and they see a misspelling or they see that I change the name of a customer in the wrong way and they change that. We will be doing that for AI agents. So it's this flip of the model where we thought AI agents were going to review our work and kind of incrementally make us more productive.
Starting point is 00:23:33 We will be the reviewers of the AI agents work. We will be the editors. We will be the managers. will be the orchestrators, and that's actually how you then get the productivity gains. So I'd say watch the AI coding space, watch what startups are doing to get leverage, and then think about that against the broader economy. You know, it's really interesting, Aaron, because the last time we spoke, you told me about this person that you knew who was basically building a company on their own
Starting point is 00:23:55 using AI coding tools. And so I was in the process of writing this profile of Dario at Anthropic, which you're quoted in. And I went out unfound a developer doing something quite similar using Claude Code to build on their own. So this is clearly, I mean, to the point where like Anthropic now has to put some rate limits on, but this is clearly a thing that's happening. And this is the thing that, again, I'm still, I love the MIT survey. I think it's great. It's a fun conversation topic.
Starting point is 00:24:24 But the one travesty would be if people miss that what you just said is actually happening on the ground and then not starting to pay attention to what that's going to mean is that ripples through corporations. and how people should probably start to think about re-engineering workflows for a world of AI agents. And, you know, this happens in every single technology, you know, wave, which is actually why you have early adopters and early innovators and why you have laggards is because the early adopters and innovators are going to read, you know, your anthropic piece and see, oh, this actually is a real trend.
Starting point is 00:25:00 And the laggards are going to read the MIT piece saying, oh, I've been vindicated. And some companies will then get those early returns at a much faster rate. and other companies can wait. And, you know, sometimes that means that your company gets disrupted, and sometimes it doesn't because you actually have, you know, some proprietary, you know, capability as an organization. Like, like if Pfizer or Eli Lilly took a little bit longer to adopt AI as a result of, you know, wanting to be more pragmatic,
Starting point is 00:25:26 that'll be totally fine. They're not going to get disrupted. Like, they have enough of market position, they have enough distribution. They can afford to kind of wait for this technology to be more baked. But if I'm a startup right now, I'm probably going to use that as my advantage as much as possible to try and run circles around maybe a larger incumbent. And this is what kind of creates this nice tension in the market that creates creative destruction in every kind of wave of technological change. Okay, I definitely want to speak a little bit more about what the definition of an agent is and how you're rolling them out at box and also get your reaction about GPT-5. So let's do that right after this.
Starting point is 00:26:05 You're used to hearing my voice on the world, bringing you interviews from around the globe. And you hear me reporting environment and climate news. I'm Carolyn Beeler. And I'm Marco Werman. We're now with you hosting the world together. More global journalism with a fresh new sound. Listen to the world on your local public radio station and wherever you find your podcasts. Technology Podcast with Box CEO, Aaron Levy. Aaron, let me start, before we get into agents and before we get into GPT-5, let me just start with a basic question, which is if this is already happening in business, which is basically like you're finding ways to get the AI to do work on its own and pull information
Starting point is 00:26:54 from different data sources and present it coherently, why do you think it's been so difficult for consumer companies like, let's say Amazon with, like, like, like, you? Exa Plus and Apple with Apple intelligence to put this together as something on device or a consumer product that does similar activities because they've all promised it, but it's not quite there yet. Yeah, I think there's the fact that the technology can exist is different from the still execution requirements to bring it to life. And so, you know, we get to all have a front row seat on what the frontier models can do. And you have companies that can pack those up in a way for these applied use cases. But if you're a company with tens of millions or hundreds of millions of
Starting point is 00:27:40 users of your product and consumers that have a certain expectation, and that is a lot of execution gap required to go from the frontier model to how do you deliver that to your end customer in a reliable way that is trustworthy, that is affordable. And so I think that the bigger companies are all going through their own version of that motion. I'd also imagine that given the space is moving so fast. I can sympathize for probably some degree of indecision maybe where one day a model is on top and then the next day a different model is on top. Another day, another model kind of breaks through. And so you probably want to make sure that by the time that you land on a final architecture, you want that to be the sustainable long-term architecture. And so to some
Starting point is 00:28:24 extent, time is on your side up to a point because you might want to wait to see kind of who falls out and who keeps going um but but i i'm i i think that you know the as an example the companies just mentioned like i don't think the the those spaces have been so utterly uh disrupted that that that uh that they can't catch up uh once they land on a final architecture um but you know we'll have to see kind of how they execute through this and so for business it's more that there are more prescribed use cases and i think with with a phone maybe if you're trying to get these proactive notifications, then that you're looking at a massive universe of data, whereas you're more concentrated in business? Or what's the difference?
Starting point is 00:29:06 Well, actually, I wouldn't say there's a difference. I would say even in business, we're insanely early. We have to process how early we are. The, the, the, the, the, the, the, the, the, the, the, the, the, the, the, the, the, the, the, the, the, you know, coding agents for very, very, uh, wired in engineers that, that, that are, you know, very online, they're paying attention, everything going on, and then early adopters across the economy. You know, most of the agents that are being deployed in the enterprise are being done by the, like, maybe you can flash it up or something. Jeffrey Moore came up with this idea of the technology adoption curve, or at least popularized it. It has multiple categories of where a company
Starting point is 00:29:49 or a group of individuals will be. You have these early innovators and early adopters. Then you have a chasm. Then you have, then you basically have kind of pragmatist and early majority, and then you have laggards. And, and we are in the early adopter, kind of the earliest phase of jumping over the chasm on some use cases. But we have to imagine there's this chasm where what happens is the early adopters, the people that, you know, we all hang out with and talk to all day, they're going to try everything. We're going to try these crazy goggles and we're going to put, you know, magnets on our head. And we're going to do the craziest things. We're going to wear Google. glass. And that actually tells you almost nothing about whether the thing will jump over the
Starting point is 00:30:29 chasm. You have to actually see what makes it to the early majority or those pragmatists that really adopt things at scale. And so the kind of technologies that have clearly broken through our chat to BT, products like cursor, products like, let's say, you know, a bunch of these kind of next-gen research agent type things, perplexity is done well in that kind of early majority. But we're so early in terms of AI agents jumping over now the chasm. So some won't make it. Some will. But I would say that business is not particularly moving faster than the examples you just gave. I just think we can see lots of examples of it, but they're usually in that kind of early adopter type category. Right. And so the week we're talking, you at Box are releasing a number of
Starting point is 00:31:14 different agents. Let me start this discussion by just asking you, what is an agent? Because it does seem like it's an overused term. And even myself, who I'm in this all the time, I don't fully have clarity on what that word actually means. I think we should anticipate that it's fully overused. It is now the new term of art for talking to an AI system that is doing work for you. So just we will hear, this will be the main term that we use going forward as an industry. And not because it's a buzzword, but actually it's a useful term. It's a, it's a definable object that is doing automated work for you. That could be, in some cases, as simple as answering a question. But I think most people in the tech industry would generally argue that it should be doing some degree of work
Starting point is 00:32:00 and looping through the AI model multiple times to do that work. And so that could be everything from, you know, very clearly something like Claude Code or Cursor has an agent or Replit has an agent where you give it a task like, build me a website that has these qualities. And it will go off and do, you know, weeks worth of human work in 10 minutes. And that's an agent that is managing that whole process, looping through the model multiple times, keeping track of what it's doing, updating its memory in the process.
Starting point is 00:32:33 And that's effectively an agent. So that's an agent encoding. And we're going to see that same kind of agent architecture emerge in law and healthcare and finance and education where you can deploy agents to go off and do work for you. And there will be, you know, a critical action which is how much work can the agent do before you have to intervene and modify and kind of repoint it in the right direction.
Starting point is 00:32:56 And so a lot of that work right now can be maybe a couple minutes long, but we're seeing examples where agents can be running for tens of minutes or maybe even hours and effectively drive, you know, better and better and high quality, more high quality output. So, so I think that that's a way to think about agents and these are going to be very pervasive in the coming years. But this is really the first year, 2025 is the first year. year where we could even really be talking about it seriously. And I think Andre, you know, Carpathie had a, you know, probably phrased it as we shouldn't
Starting point is 00:33:29 think about this as the year of agents. We should think about it as the decade of agents. That's probably the right way to think about it. This is, this is sort of mobile became the decade of mobile, but then eventually we started using mobile. Yeah, but, but the, and again, when you just said the year of mobile mattered, right, did people say, you know, some people said that was in 2020. but probably the first time it could have been realistic was 2000, sorry, not 2022, 2002.
Starting point is 00:33:53 Some people, but it wasn't really realistic until 2006 and 2007 when you had the iPhone. So, you know, I and I think fairly other, many other people are actually convinced we already have our iPhone for agents. We don't need, we don't need any kind of new breakthrough architecture. We have an architecture that already kind of works as the core scaffolding for agents. So we can start the decade kind of clock now. But it will be a full self-driving type problem. You know, obviously, Waymo got kicked off, I don't know, a decade, decade, a half ago, and only this year is it accessible in suburban Silicon Valley.
Starting point is 00:34:32 So what took a decade or a decade and a half? It was just lots of engineering work, lots of miles on the road, lots of improving every single dimension of the accuracy and the intelligence of the system. We are going to see the same thing for knowledge work. It's going to take years. The early adopters will get the early returns. The pragmatists will use it once it sort of works without a lot of handholding, and everybody will land somewhere in the middle of that spectrum.
Starting point is 00:35:02 Okay. And so I watched a chunk of your presentation this week, and some of the agents that you're talking about enabling companies to deploy will be things that will, for instance, take a look at a application to be involved, to maybe take an apartment out or oh yeah or to look at some property records and then do tasks there or to create reports looking at clinical tests and trying to pull out issues so talk a little bit about how the process to create these works and is this still in the demo phase or is this actually real so maybe second question first so so we made a number of big announcements
Starting point is 00:35:46 this week. Some of the product and capabilities that we announce are fully GA right now, so customers can already start to use it. Some of it, we kind of give a little bit of a crystal ball view into the next couple of quarters of the product that we're getting out there. As an example, we have an AI agent right now that any customer can go and use, which is a data extraction agent, so you can give us, again, contracts or invoices or medical data. And then we have an AI agent that works through that content and pulls out the critical data from those documents and then lets you go and automate a workflow around that. What we announced at BoxWorks was a new capability called Box Automate.
Starting point is 00:36:26 And what the idea of Box Automate is, is it's very, very powerful to have one-off agents that can help you, you know, review a document or generate a proposal or generate a sales plan for a client based on data. That's super powerful. But what's even more powerful is that I can drop many of those agents into a full business process. So what Box Automate lets you do is actually define your business process within Box.
Starting point is 00:36:50 It could be a client onboarding workflow. It could be an M&A due diligence review process. It could be a healthcare patient review process. And you define that workflow within Box Automate. It's a drag and drop kind of workflow builder. And then at any point in the process, you can bring in an AI agent to do work within that process. And so one thing that is very important with AI,
Starting point is 00:37:13 agents is they need the right context to be effective. So our system allows you to get that context to agents from your enterprise content. So your marketing assets, your research data, your contracts, your invoices, that becomes very important context for agents. So Box Automate lets you basically build these agents on demand or on the fly in a workflow that leverages your existing content. And then we can start to help you automate a bunch of knowledge work tasks around the enterprise. Now, a lot of the early reviews around GPT5 was it was sort of built to do these type of things or like as a foundational layer for this type of work, right? Yeah.
Starting point is 00:37:49 The reviews we read early on was that it just does stuff. And there have been people that have noticed that like when you're in chat GPT using GPT5, you like literally can't have an answer where it doesn't say, can I do something for you. So I'm actually curious, Aaron, what your response has been. The last time we spoke was pre-GPT-5. what your what your feeling has been about this new set of models really to set of models and and i'm curious like what you make of the fact that so many people were disappointed early on well um yeah so so um we on on the on the disappointment or kind of online zeit guys
Starting point is 00:38:25 which actually interestingly has already shifted um i think you know quite a bit where a lot of folks have kind of updated their views on on gbd5 and i think codex has come out very strong recently on the coding agentic side, you know, if, I think we have gotten used to, and we've been hooked on these incredible kind of jumps and breakthroughs over the past, you know, year or so. We had, we went from, if you think about it, we went from GPD4 to GPD 40 to 01 and 03 and then GPD 41, and each of those on a different axis was actually a pretty meaningful step function. So if you had just taken GPD 4 and then you jumped to GPD 5, it would have looked insanely exponential. But we got these points along the way that effectively, you know, kind of gave us an early preview into what GPD 5 would ultimately become,
Starting point is 00:39:19 which is a thinking model, a chain of thought, with a way higher quality of coding skills and a bunch of capabilities on critical dimensions of work. And so I think it was mostly just driven by the fact that we got lots of incremental steps or step function steps on the path to GPD5. And then GPD5 was just the culmination of a lot of those breakthroughs. So again, I think it's probably more psychological than kind of empirical. Like I think if we had gone from three to four to five, it would be the most vertical axis we've ever seen. But it was really, again, those steps along the way that maybe caused a little bit of that kind of reaction. In our world, you know, we test every single model on a number of evaluations where we give the model different types of enterprise data, contracts, financial documents, research materials, internal memos, those types of things. And we asked the model a series of questions about that document or data, and we saw meaningful improvements from GPD5 versus GPD4-1 as an example on our e-vow.
Starting point is 00:40:25 So for us, it was multiple points of improvement on a number of our key tests. And those improvements then translate into real-life improvements for, you know, customers where they all of a sudden will mean that when you're a health care provider using GPD-5 on unstructured healthcare data, you're going to get better results than you got before. Or when you're using it on your contracts, you're going to get better results. And so on a number of spaces where either it was kind of expert analysis required in health care or law or financial services, we saw improvements. or in more a general sense, if you needed logic or reasoning or math, it was also an improvement on those dimensions as well. Can I get a quick gut check from you on the economics of the AI industry right now? I mean, we are talking at a moment where we just talked about this on the Friday show with Ron John that open AI's losses are now going to total 115 billion through 2029. Oh, sorry, it's cash burn, 115 billion through 2029, 80 billion higher than it previously expected.
Starting point is 00:41:30 It's expected to make like $10 billion this year, but it just signed a $300 billion deal with Oracle that, like, turned Oracle into a nearly $1 trillion company almost overnight and made Larry Ellison the richest person in the world above Elon Musk. How does this, how does this make sense? Well, I think it makes sense if you believe like I do and certainly others, Jensen, you know, Cooley Sam, even Elon, I think would believe that this is the single biggest technology. that we've probably ever had access to. And so if you think about this as sort of a third industrial revolution, where for the first time ever, we can bring automation to knowledge work. Just think about that for a second. We bring automation to knowledge work.
Starting point is 00:42:17 Everything about the world of knowledge work was always basically limited by how fast we as humans could work. We can type into a computer, put data into a system, somebody else reads that data. it moves along in some kind of process. That was about the speed of knowledge work, was how quickly we could type or read information and then do something in the real world with that data. That was the rate of pace, that was the pace that knowledge work could happen at.
Starting point is 00:42:42 And so every field that we know of in kind of knowledge work, you know, healthcare experts reading, you know, medical diagnoses, life sciences experts that are doing research on clinical studies, lawyers that are trying to find facts about a case or working through intellectual property, an engineer trying to generate code and read product specifications.
Starting point is 00:43:08 All of that work has always been constrained by how fast we as individuals can do that work individually ourselves. For the first time ever, with AI, we can bring automation to effectively all of that work. And that automation can kind of be tuned based on just how much compute we throw at the problem. And then of course, how good our data is and how effective our systems are
Starting point is 00:43:29 and getting that data to the AI. But in a world where you can toggle compute and then get different levels of automation and effective output in work to get done at a way lower cost than what people can do, that is the biggest breakthrough we've ever had in the economy and in the sort of, in the kind of post-industrial world.
Starting point is 00:43:52 And so, you know, $100 billion of loss, let's say, to get to that point of, you know, saturation where that technology is out there. That's a very, it's actually a very small number when you think about the economy and the size of the economy for all of health care, all of law, all of life sciences, all of financial services, all of engineering. So I think that's how these technology companies are underwriting this. And the losses are a choice, to be clear. Like, that's very obvious.
Starting point is 00:44:24 Like, they're choosing to lose that money. They're doing it for a strategic reason, you know, that's at least their decision. The strategic reason is that this is such a valuable market to own and to dominate in that they would rather build up capacity and, in many cases, subsidize usage, let's say in free consumer tiers of Chachibout, then charge everything at today's, you know, kind of rate of cost and then, you know, make sure everything is profitable. That's a choice. They could decide to charge for everything.
Starting point is 00:44:54 They would get less adoption today. They would be, you know, instantly a more sustainable business. But enough people believe that the prize is big enough, that it's worth actually doing all of the research expenses, all of the data center expenses, and the subsidization where necessary to drive that adoption and demand. And it's a go-bigger, go-home type of bet. You know, clearly very, very smart, very economically rational firms, individuals, sovereign wealth funds believe that that bet is. worth it. I'm probably on the side that the bet is worth it because of again how how material of an economic impact this technology can have. And then we'll obviously how we'll see how it plays out with any kind of individual player in the in the space.
Starting point is 00:45:36 Folks, you can learn more about Box's offerings at box.com. There's a video playing on the homepage right now that talks a lot more about the things that Aaron and I have discussed here today. Aaron, so great to see you. Thanks again for coming on the show. Thanks, Alex. All right, everybody, thank you so much for watching. We'll see you next time on Big Technology Podcasts.
