The AI Daily Brief: Artificial Intelligence News and Analysis - My Autumn AI Predictions

Episode Date: September 3, 2025

Back to school season means back to AI predictions! After a summer of skepticism around the MIT study claiming 95% of AI pilots fail, NLW dives into what's really coming this fall and beyond. From ...simmering skepticism to multimodal model progress to the potential for AI M&A, NLW breaks down all the key trends.

Brought to you by:
KPMG - Discover how AI is transforming possibility into reality. Tune into the new KPMG 'You Can with AI' podcast and unlock insights that will inform smarter decisions inside your enterprise. Listen now and start shaping your future with every episode. https://www.kpmg.us/AIpodcasts
Blitzy.com - Go to https://blitzy.com/ to build enterprise software in days, not months
Robots & Pencils - Cloud-native AI solutions that power results https://robotsandpencils.com/
Vanta - Simplify compliance - https://vanta.com/nlw
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.

The AI Daily Brief helps you understand the most important news and discussions in AI.
Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Subscribe to the newsletter: https://aidailybrief.beehiiv.com/
Interested in sponsoring the show? nlw@breakdown.network

Transcript
Starting point is 00:00:00 Today on the AI Daily Brief, as the -ber months begin, you are getting my autumn AI predictions. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. Hello, friends, welcome back to the AI Daily Brief. Quick announcements before we dive in. First of all, thank you to today's sponsors, KPMG, Robots and Pencils, Superintelligent, and Blitzy. To get an ad-free version of the show, go to patreon.com slash AI Daily Brief. And if you are interested in sponsoring the show, send us a note at sponsors at aidailybrief.ai. Now, I have a fun one for you today. The news cycle is slowly starting back up. It is back to school time here in the U.S., which also means back to work time for all intents and purposes.
Starting point is 00:00:46 This is really the time of year that everyone shakes off the summer sunshine and gets back down to business, and I love it. I have since I was very young; the school year beginning felt more like New Year's to me than New Year's. And so rather than just do a retrospective on what happened this summer, which if you were listening to the show, you'll already have heard, I want to translate that into some predictions. So what I'm going to do is I'm going to look at where we are and how I think it plays out over the course of the next couple of months. And I think where we have to start is with continued simmering skepticism around AI. We right now have an unholy alliance of market shorters, AI skeptics, and media headline editors
Starting point is 00:01:24 who are all keeping this MIT study headline alive. This, of course, is the study where MIT found that 95% of all AI pilots were going nowhere and creating no value, which was translated by headline writers into 95% of AI efforts are completely worthless, never mind the fact that this was based on around 50 interviews, and a reading of public financial statements to see whether companies were declaring that AI had increased their revenue or not, a very dubious methodology at best. Now, interestingly, however, you're also starting to see a lot of companies who want to sell you something lean into the MIT story as well.
Starting point is 00:01:58 Basically, the extension of the narrative is that, yes, of course, many pilots are meaningless, but it's not because AI doesn't work, it's because it wasn't implemented well. The implication, of course, being that if you work with them, they can help you implement it well, which honestly, I'm totally fine with this narrative. Secure the bag, my AI entrepreneur friends. But the point is that surrounding all of this, we are heading into this fall with a little bit of AI skepticism. Usually at this time, the summer AI antagonistic narrative has sort of worn off. But of course it wasn't just the MIT study, it was also Sam Altman, sort of a little bit saying we're in a bubble,
Starting point is 00:02:31 although even that was widely overreported, I think. But as I said before, I think that part of the reason that this hit the way that it did when it did is that it came in the context of the broader market cycle. Summer is a historically low liquidity kind of time when there's not all that much market activity and narratives that otherwise wouldn't move the needle have a more outsized impact. Take on top of that anxiety around the rate cutting cycle as well as tariff policy, although really for most of August it was about the rate cutting cycle,
Starting point is 00:03:05 the TLDR for me is that I think that the response had a lot more to do with where specifically the market is than anything that you have to be concerned about if you're thinking about how to use AI for your own benefits or for your company's benefits. So, predictions. First of all, I think that this story has some more legs from where it is right now, but ultimately will feel much less significant in the future than it was presented when it came, same as previous AI skeptic narratives in years past. I think this is especially true if markets get the rate cut that they want in September.
Starting point is 00:03:37 At the same time, I also predict that there is going to be a positive outcome of this, which is a return to focus on some things that have gotten lost, which is the real challenge of actually implementing AI in the enterprise. Remember, another part of the reason that the MIT study hit when it did was that it came right after the announcement of GPT-5, which was, for a time, viewed with a lot of skepticism. Now, at this point, the narrative, I think, has shifted fairly dramatically and people are enjoying GPT-5, figuring out how to get more out of it, et cetera, et cetera. But all of this has combined to create a conversation where companies are
Starting point is 00:04:12 thinking about not just how well-performant a model is, and assuming that a more powerful model means that their experiments are going to work, but instead, this is creating context to go even deeper on the real challenges of implementation, the real challenges and opportunity of the last mile. You're starting to see these other types of pieces pop up, like this one in the Harvard Business Review, that's all about how companies can escape the AI experimentation trap. That's their term, and I think it's a good one. And hold aside their advice, which is basically to focus on the use cases that have real business opportunity and double down on them rather than spreading yourself too thin. The point isn't so much whether that advice is right or not, although I do think
Starting point is 00:04:50 it's fairly sound, and instead the fact that this is the conversation that's coming out of this. Organizations, by and large, are not looking at that MIT study and justifying a lack of behavior. Instead, they're saying, how can we do this better? And I think that's a really positive outcome. I think, in fact, that it is going to create the context for a much broader conversation about infrastructure and why that should be where companies place a lot of their emphasis. Which brings us to our next prediction. I recently said that I thought that 2026 was going to be the year where the big theme in enterprise AI was context orchestration or context engineering. And the idea of context orchestration or engineering is basically
Starting point is 00:05:30 that AI and agents can only get you so far in an enterprise if they don't have good access to the context in which they are operating. In other words, if your agents don't have the relevant background, the relevant data, the relevant information that can help them contextualize their work to your organization, there are just going to be really severe walls that you run up against very quickly. It's very natural at the beginning of a technology cycle to focus on big, flashy pilots and exciting proofs of concept, to basically move the ball down the field by doing things that are fun to show off, rather than a lot of the laborious hidden behind-the-scenes work that ultimately is going to produce good results. Now, frankly, AI has always been a little bit of an exception to this rule in the sense
Starting point is 00:06:10 that pretty quickly organizations started to try to get their data houses in order and understood that there was likely to be a direct relationship between how good at giving AI access to information and context they were and how well it was going to perform for them. But I think that this moment with GPT-5 and the MIT study is going to give enterprises even more narrative cloud cover to really have the new hotness be the messy, difficult infrastructure work that is so necessary for getting to the next level of performance when it comes to AI and agents. Now, a lot of this has already started this year. One of the major themes, I think, when we look back on 2025, will have been MCP, Model Context Protocol, and these new standards around agent access to
Starting point is 00:06:48 information that so many enterprises are adopting quite quickly. You're also seeing conversations spring up around other standards like A2A. Ravin, the CEO of Credal, recently wrote on his LinkedIn, I'm not a fortune teller, but my educated guess on what will dominate AI headlines in 2026 is agent-to-agent workflows. Over the past few months, my conversations with AI experts, customers, and other builders all point in the same direction. Single agents aren't enough. We've seen AI evolve in waves. 2023 was the year of AI chat. 2024 was all about LLM wrappers. 2025 has been dominated by AI agents,
Starting point is 00:07:23 and I believe 2026 will be the year A2A workflows take center stage. He argues that the two reasons why are, first, the massive tool and data sprawl that happens inside companies, which makes a single agent managing all tasks across those systems really unrealistic, and second, the fact that communications infrastructure like A2A helps us build more specialized agents that can then work together. You already started to get a taste this year of MCP showing how infrastructure buildout can be sexy. I think we're going to see that do nothing but increase,
Starting point is 00:07:52 and I think that some of the skepticism that we got in the summer is actually going to create some momentum to move even faster in that area. Then again, I also think that we're starting to see a bunch of counter signals to that skepticism from the summer. And one of my predictions for the fall is that you're going to see a bunch of people race in the other direction over the next few weeks. An example of this came on Monday when Evercore bucked the trend of calling things an AI bubble and instead released a note predicting a 20% rally for U.S. stocks by 2026, driven in large part by excitement around AI. Evercore's report basically said that if you zoom past all of the hand-wringing and nervousness, earnings continue to defy expectations despite tariff and policy uncertainties.
Starting point is 00:08:33 Now, they're not saying it's a sure thing that we're going to see another big rally, but remember, in the last quarter we saw CoreWeave triple their revenue, Microsoft's Azure jumped 39%, and Nvidia was up 56%, which yes, was less than the 200% the year before, but is still a remarkable result for the biggest company in the world. I think that this sort of market positivity is going to surge, especially after we get whatever we're going to get when it comes to rates at the end of this month. And when it does, people are going to look back on some of the stories
Starting point is 00:09:01 that they decided to conveniently not focus on in the summer, like Sam Altman saying that they were expecting to spend trillions on infrastructure quite quickly despite the fact that they knew that economists were going to say they were nuts, and that is going to help fuel this narrative even further. Another counter signal comes around fundraising. This isn't so much a prediction, just an observation, although to the extent that I think there is a prediction here, I think it's going to do nothing but increase. But the TLDR is that at least when it comes to private markets, appetite for AI investment remains unabated. Biggest example of this is that Anthropic has now closed this round that has been increasing in both total size and total valuation over the last couple of months,
Starting point is 00:09:35 coming in at a hot $183 billion valuation. The company ended up raising $13 billion, which was like three or four times as much as it seemed like they were going to at the beginning, and at a significantly juiced valuation from where they started. Now, of course, the reason that Anthropic was able to command such a big increase in their valuation is that the company has just been absolutely crushing it. They 5xed their revenue from $1 billion annualized in January to $5 billion annualized in August,
Starting point is 00:10:00 thanks in large part to the rise of the agentic coding use case. Speaking of which, more evidence of the counter signal in fundraising: vibe coding platform Lovable, the year-old startup fresh off of a raise at a $1.8 billion valuation, is now apparently receiving offers at double that, at a $4 billion valuation. Point is, I don't think these are outliers. I think these are reflective of the fact that we are just going to see nothing but more private investor enthusiasm for hot AI deals. And boy, at this point, we are definitely in the okay but that's so obvious portion of these predictions,
Starting point is 00:10:30 that vibe, or perhaps as it will now be known, agentic coding, is going to continue to dominate as maybe the key theme of 2025. Vibe coding wasn't even a term when we started this year. Tools for AI coding had only just started to really get good enough, but obviously over the last six months, it has become a dominant force. We saw this expressed in Andreessen Horowitz's recent top 100 Gen AI consumer apps, where Lovable jumped from just off the list six months ago to number 23, and was one of the big themes that A16Z called out,
Starting point is 00:10:58 in part because we were seeing an increase not only in traffic to the core websites of these companies, but also to the domains where they hosted users' creations, suggesting that people are actually building real stuff that they're then publishing for other people. I do predict that the language of vibe coding is going to go a little bit by the wayside. I'm already noticing people starting to call it simply agentic coding more often, and I think that's a more accurate reflection of the full range of how it's being used. Now, we might still use vibe coding to refer to people who are not traditional software engineers using these tools, or maybe we'll come up with some other term for that,
Starting point is 00:11:30 but I think that agentic coding will continue to be a force, and it will continue to be used in higher and higher value and more complete production uses. I think that inside companies, it'll start to jump out of the prototype phase, and specific buckets of work will become normalized to be completed with agentic coding. I think that we're going to see more startups
Starting point is 00:11:46 who start to fill in the gaps of agentic coding, such as the just-released code review agent from Lindy that can look at your vibe coding app and see where it needs to be improved. And again, prediction for the fall is that by the time we're doing end-of-year lists, agentic coding will be seen as perhaps the most important force in AI of 2025. What if AI wasn't just a buzzword, but a business imperative?
Starting point is 00:12:09 On You Can with AI, we take you inside the boardrooms and strategy sessions of the world's most forward-thinking enterprises. Hosted by me, Nathaniel Whittemore, and powered by KPMG, this seven-part series delivers real-world insights from leaders who are scaling AI with purpose, from aligning culture and leadership to building trust, data readiness, and deploying AI agents. Whether you're a C-suite executive, strategist, or innovator, this podcast is your front row seat to the future of enterprise AI. So go check it out at www.kpmg.us slash AI podcasts or search You Can with AI on Spotify, Apple Podcasts, or wherever you get your podcasts. Today's episode is brought to you by Robots and Pencils. When competitive advantage lasts mere moments,
Starting point is 00:12:53 speed-to-value wins the AI race. While big consultancies bury progress under layers of process, Robots and Pencils builds impact at AI speed. They partner with clients to enhance human potential through AI, modernizing apps, strengthening data pipelines, and accelerating cloud transformation. With AWS-certified teams across the U.S., Canada, Europe, and Latin America, clients get local expertise and global scale. And with a laser focus on real outcomes,
Starting point is 00:13:17 their solutions help organizations work smarter and serve customers better. They're your nimble, high-service alternative to big integrators. Turn your AI vision into value fast. Stay ahead with a partner built for progress. Partner with Robots and Pencils at robotsandpencils.com. If you are a regular listener, you will have heard about Superintelligent's Agent Readiness Audits at this point. But I wanted to tell you today about the full suite of agent readiness products that go beyond just the initial readiness report. Over the last six months, Superintelligent has built out an entire agent planning suite. We help you move from discovery to planning.
Starting point is 00:13:53 After you've completed your agent readiness audits, we help you double-click on your most important use cases with what we call our use case planning reports. These reports are going to help you understand what sort of technical preparation you need to do to be ready for a use case, what challenges you might face in implementation, and whether you should be thinking about building, buying, partnering, or some combination. After that, you can even get a spec document in what we call our technical blueprint that gives either your developers or the developers of the partner you work with what they need to build exactly the agent that you're looking for. If you want to learn more about Superintelligent's agent
Starting point is 00:14:26 planning suite, we've built a custom GPT to answer your questions. Just go to bit.ly slash super, super, super agent, that's bit.ly slash super super agent, all one word. And if you have any questions, the agent can even help you book an appointment with our team. This episode is brought to you by Blitzy, the enterprise autonomous software development platform with infinite code context. Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise-scale codebases with millions of lines of code. Enterprise engineering leaders start every development sprint with the Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates and pre-compiles code for each task. Blitzy delivers 80% plus of the development
Starting point is 00:15:10 work autonomously while providing a guide for the final 20% of human development work required to complete the sprint. Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-IDE development tool, pairing it with their coding copilot of choice to bring an AI-native SDLC into their org. Blitzy is providing a limited time, 30-day free proof of concept for qualifying enterprises. The team will provide a 5x velocity increase on a real development project in your org. Visit blitzy.com and press book demo to learn how Blitzy transforms your SDLC from AI-assisted to AI-native. That's BLITZY.com.
Starting point is 00:15:45 Let's talk about what we can expect from models. One prediction that I have is that as we get out of the summer doldrums and we get farther away from the release of GPT-5 and further past the MIT study, etc., I think that we will realize that while, yes, we may be seeing the saturation of chatbot-style use cases and progress is moving more incrementally when it comes to the difference between GPT-4.5 and 5, and eventually Gemini 2.5 to 3 and Grok 4 to 5, etc., while we were all focused on that and looking over in that direction, multimodal capabilities were just absolutely exploding all around us. There are a bunch of big examples of this from the summer,
Starting point is 00:16:25 a huge number of which, by the way, come from Google. Most recently, of course, we got Nano Banana. I've talked extensively about Nano Banana, which, by the way, is technically named Gemini 2.5 Flash Image, but where Nano Banana excels is not in how much better it is at native generation, it's how much better it is at making changes to existing photos. I'm working on an episode for some time this week or next week, where I talk about this new idea for a benchmark I have called the Unlock Score that was inspired directly by seeing
Starting point is 00:16:53 how many new use cases this simple change opens up. Basically, while some people have complained that Nano Banana doesn't represent some major increase in raw generative capacity, the fact that it can allow for more precise photo changes creates a huge number of commercially viable use cases that used to either be (a) only possible through an extended set of workflows that were cobbled together or (b) not possible at all. It is, of course, not just images that have gotten a major upgrade. We haven't really gotten an update in a while, but what we are seeing now is Veo 3 actually making its way into production for a lot of different use cases, particularly around social media and advertising. Veo 3 is the first model that has made it fully into production for ads that
Starting point is 00:17:38 are airing on national TV in major places, and that is a trend that I think is going to do nothing but increase, especially as it gets plugged in with models like Nano Banana that make the process of going from image to video more viable and just better. Indeed, a lot of what you're seeing right now on the internet is people combining the capabilities of these models, as well as apps like Higgsfield, to make AI videos that can more natively translate crazy imaginings into something really powerful and clear. There's this other X factor as well of world models. At the beginning of August, we got a preview of Genie 3, which made some major advancements when it comes to the ability to create 3D interactive worlds with just a text prompt. Specifically, we've got much more extended
Starting point is 00:18:18 memory that again opens up a whole variety of new use cases. Most people are barely touching images and video, to say nothing of these world models, but this could be a whole new frontier for creation and is happening at a more rapid clip than I think most people would have predicted. So my prediction for this area of multimodality is that again, as we shake off the slough of debates from the summer about MIT and GPT-5, I think people are going to realize that they have this totally expanded creative canvas that they have barely scratched the surface on. And I think we're going to see a lot of that come to actual production as people dig in and try out these rapidly improved, step-function-improved multimodal models. Now, let's talk about cost. The state of the cost of AI is really
Starting point is 00:19:00 fascinating to me. On the one hand, we're seeing the cost of inference go down and down and down in a way that basically no one would have thought possible. I mean, it just massively exceeds anything that we would have expected or any comparison to Moore's law. At the same time, so far, most people, for most use cases, have opted to just focus on the highest-end models, and so total cost continues to rise. The Wall Street Journal wrote about this over the weekend,
Starting point is 00:19:27 with a piece called Cutting Edge AI was supposed to get cheaper. It's more expensive than ever. And obviously what's going on here is not about AI not getting cheaper. It's actually about the fact that as it gets more powerful and cheaper on a unit basis, we use more of those units. Aaron Levie from Box wrote about this, specifically in the context of the Journal piece, and said, this is precisely Jevons paradox in action in the purest form.
Starting point is 00:19:50 Because the cost of AI tokens have gone down, we can now afford to use far more of them for increasingly complex tasks. The key point thus is not that AI is getting more expensive. Instead, it's that because it's getting cheaper and more capable, we're using more of it to solve problems better. For almost every like-for-like task, we're just using way more tokens to complete the task to deliver far better output. Whether it's writing code, answering a healthcare question, or analyzing a contract, we're using far more AI today to perform that work because we need the additional points of performance.
Starting point is 00:20:19 Getting a 99% correct answer when working with a legal contract is very different from a 90% correct answer, and it's easily worth the 10x to 100x increase in tokens. Now, at some point, we will start to reach plateaus for certain types of tasks, and then the cost per task will go down. For instance, we probably don't need 100x more tokens than we use today for answering a simple medical question or summarizing a document. So then eventually, on a like-for-like basis, these workloads will become cheaper as we're able to capture the efficiency gains from the models.
Starting point is 00:20:47 But the general cycle will go on essentially forever, because we will just keep raising the bar of what we do with AI. As tokens continue to get cheaper due to algorithmic breakthroughs, competition in GPU prices, general compute efficiencies, and open-weight alternatives, we will find the next set of ways to consume the tokens. We'll deploy far more agents in parallel to speed up tasks. We'll use multi-agent systems to compare answers and get to consensus. We'll solve more complex knowledge work problems and will have far longer running agents
Starting point is 00:21:12 in the background. AI will both simultaneously always be getting cheaper and more expensive. It is very rare that someone says exactly what I would have said, but this is exactly one of those times. Going back to this idea that agentic coding is the key use and the breakout use of AI in 2025, we've moved from sitting there, telling a coding tool what to do, to now running multiple coding agents in the background, in a way that just simply consumes more tokens. We also continue to ride the frontier, where each increase in AI capability opens up new uses that weren't possible
Starting point is 00:21:45 before. And to the extent that anyone's most important work is in that set of new use cases, they're kind of priced into using the most expensive versions of the models. However, at some point, there will be entire categories of tasks that can comfortably use models that aren't the state of the art and can take advantage of ridiculously reduced prices in inference. So my prediction, exactly as Aaron said: for a little while at least, even though AI costs are coming down dramatically, the things that get enabled both by new capabilities and by the cost coming down are going to ironically increase the total amount that we spend on AI because we're simply going to be using more of it.
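
The arithmetic behind that argument can be sketched in a few lines of Python. This is only an illustration under made-up assumptions: the function name `total_spend` and every price, token count, and task count below are hypothetical and do not come from the episode, the Journal piece, or the quoted post.

```python
# Illustrative only: every number below is a hypothetical assumption,
# not data from the episode or from any vendor's price list.

def total_spend(price_per_million_tokens: float, tokens_per_task: int, tasks: int) -> float:
    """Total spend in dollars for a given unit price and usage level."""
    return price_per_million_tokens * tokens_per_task * tasks / 1_000_000

# Hypothetical "before": pricier tokens, short single-shot prompts.
before = total_spend(price_per_million_tokens=10.0, tokens_per_task=2_000, tasks=1_000)

# Hypothetical "after": tokens are 10x cheaper, but each task uses 50x more
# of them (longer reasoning, parallel agents) and twice as many tasks are run.
after = total_spend(price_per_million_tokens=1.0, tokens_per_task=100_000, tasks=2_000)

print(f"before: ${before:.2f}, after: ${after:.2f}")
```

With these assumed inputs, the unit price drops tenfold while total spend rises tenfold, which is the "cheaper and more expensive at the same time" pattern described above. Changing the assumed multipliers shows when the effect reverses: if usage grows by less than the price falls, total spend goes down.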
Starting point is 00:22:25 Last state of the models conversation, new data sources. We continue to have lurking around us a conversation that's been going on since about this time last year, which is whether we've reached a plateau in pre-training as a scaling methodology. Certainly for some people, the more incremental performance between something like GPT-4.5 and GPT-5 suggests that yes, we have. And while I think I've clearly shown that there are other areas where models are improving at every bit as exponential a clip as anything we've seen before, there is still this conversation around plateaus in specifically LLMs and what it means in terms of what we can predict for the future. Part of the challenge is the incremental cost of compute, but part of the challenge is also access to data. At this
Starting point is 00:23:04 point, every model has been trained on the entire corpus of publicly available human information. That's just basically what you can assume. And so to train on more novel data, companies are going to have to go find more novel data. I predict that rather than just resting on their laurels, AI companies are increasingly going to be looking for those novel sources of data. One example of this is it's going to be very default, even more default than it is now, for AI to be training on your interaction with it. Anthropic was a long-term holdout around this, but even they have finally changed their terms of service to force an opt-out when it comes to general consumer usage being included in the training data. You're also starting to see
Starting point is 00:23:41 companies go out and look for novel sources of data. The Information today posted this piece, OpenAI and xAI show interest in Cursor's coding data. Startups that sell AI-powered coding assistants have created some of the fastest growing businesses in Silicon Valley, making them ripe acquisition targets for OpenAI and other large AI developers. So far, Cursor's owner, Anysphere, isn't selling. Instead, potential acquirers such as OpenAI, xAI, and Anthropic have discussed a possible deal with the coding startup to license or purchase what could be a gold mine of data, reams of information on how millions of software engineers use Cursor to edit or write their code. My prediction is that we're going to see a renewed interest in any novel source of data that
Starting point is 00:24:19 hasn't been captured yet. Could you see the big labs even go try to cut deals with, for example, the enterprise content management providers? Wouldn't surprise me at all. Now, moving past the state of the models, what are the models that we're waiting for that we might get this fall? Well, one thing of note is that Sam Altman is already talking about GPT-6. He has said explicitly in recent interviews that it will come much more quickly than the time between GPT-4 and 5, although that did lead him to have to walk it back and say that it wasn't coming like tomorrow or anything like that. He's even started talking about where some of the technical emphasis might be. He's talking a lot, for example, about memory as a key feature. This goes back to that context engineering or context orchestration
Starting point is 00:25:02 idea, although put in a consumer dimension. Elon Musk has suggested that they intend to launch Grok 5 by the end of the year. And of course, there has been a ton of conversation around Gemini 3, although most of that, frankly, has not been encouraged by Google and has just been the internet doing its rumor mill thing. So again, when it comes to predictions, sadly, my prediction and my base case is that we will not get any of these major models this year. I think we will see other models launched. I think we'll see multimodal updates and maybe some advancements in open models, particularly from Chinese companies. And my ranked order of most likely to least likely in terms of a new model release in calendar year 2025 would be Grok 5, then Gemini 3, then GPT-6.
Starting point is 00:25:45 The biggest reason that I think that they won't come out is that I think that they're going to have watched what happened with GPT-5 and decided that unless they've made some major, crazy advancement, it's not going to be worth fighting the disappointed initial expectations until the model is really and distinctly better in a way that is incontrovertible. At the same time, I think that we're going to see a bit of a shift in where the labs' emphasis is. Specifically, I think we're going to see, especially the bigger labs, have more focus on individual applications and use cases that take advantage of their models. Now, certainly with OpenAI, they are signaling this very clearly.
Starting point is 00:26:21 They have brought in an entire new CEO of applications named Fidji Simo. She was previously the CEO of Instacart and is going to be focused on the practical, useful applications of ChatGPT and OpenAI's models more broadly. News also just broke of an OpenAI acquisition of a product testing startup, plus you have people like OpenAI's CPO Kevin Weil sharing that he's starting what they're calling OpenAI for Science with the goal to build the, quote, next great scientific instrument, an AI-powered platform that accelerates scientific discovery. Over at Anthropic, over the summer we got Claude for Financial Services, a more custom
Starting point is 00:26:54 skin version for that particular set of applications, and I predict that we're going to see more, not less, of that. In fact, one of the big questions for the next six to 18 months is going to be in what domains the foundation model companies want to compete with their own version of, for example, Claude for Financial Services, and what that means for startups that are building specific applications in those verticals using those models. What room will there be for third-party apps instead of the foundation model companies? I'm not as pessimistic as some. I think that there is a lot of contextual knowledge that someone who is deep in those industries can bring. There may be UX specifications or even
Starting point is 00:27:28 custom data sets that the Anthropics and OpenAIs of the world don't have access to. So I don't think that a priori, Anthropic or OpenAI deciding to move into a space like financial services or healthcare means that third-party AI vertical apps in those spaces are going to be out-competed, but it obviously does change the competitive dynamics. Speaking of competitive dynamics, one theme that people are talking about for the fall is the question of consolidation. It's actually been kind of more than a year since we had a major wave of changes to the AI landscape. Yes, we had the Windsurf acquisition and the Scale sort of acquisition, and those are big, but for example, Inflection being absorbed by Microsoft.
Starting point is 00:28:05 That was over a year ago. Character.AI moving back over into Google, or at least the leadership team moving back over into Google. We haven't really had that type of big surprising shift in the landscape this year. The odds-on favorite for where that could happen is, of course, surrounding Apple. Recent reporting has suggested that some inside Apple have been aggressively pushing acquisitions as a way to get back in the race, with Mistral and Perplexity being the main targets that were discussed, but those deals have had the kibosh put on them by no less than Tim Cook himself. I think outside of Apple, another big open question is Meta. It's very clear that Zuckerberg is going to do whatever it takes to out-compete in this area, and he's got deep pockets for spending as
Starting point is 00:28:44 witnessed not only by his hiring spree, but by his semi-acquisition of Scale AI just to get Alexandr Wang. And so it's not at all inconceivable to me that they could go out and try to buy someone big as well. My prediction, because it's way more fun to have something on record that you were either right or wrong about than just vagaries, is that one, we will see at least one big unexpected M&A deal this fall, but that, two, it will not involve Apple. I just think Apple's not going to get out of their own way and be willing to embrace this very different approach, but I hope I'm wrong about that. And if it doesn't happen in 2025, well, Goldman Sachs is already predicting in general across all different categories record-breaking
Starting point is 00:29:22 M&A in 2026. So maybe that will be the year for AI as well. Anyways, guys, that's it. My Autumn AI predictions. Let me know what you think is going to happen. Anything where you think I'm way off base, anything you think I missed. Let me know in the comments and appreciate you guys listening or watching as always. Until next time, peace. Thank you.
