The AI Daily Brief: Artificial Intelligence News and Analysis - 50 AI Predictions for 2026 - Part 1

Starting point is 00:00:00 Today we are casually talking through 50 AI predictions for 2026. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right, friends, quick announcements before we dive in. First of all, thank you to today's sponsors, KPMG, Blitzy, Superintelligent, and Robots and Pencils. To get an ad-free version of the show, go to patreon.com slash AI Daily Brief, or you can subscribe on Apple Podcasts. To learn about sponsoring the show, visit AIDailyBrief.ai, or send us a note at sponsors at AIDailyBrief. And lastly, if you would like to learn more about our recently released AI ROI benchmarking survey or our forthcoming AIDB Intelligence Service, which includes original research information benchmarks,

Starting point is 00:00:48 check it out at AIDBIntel.com. All right, friends, the time has come to shift from looking backward to looking forward. And I'm thrilled to spend the next two days looking at AI predictions for 2026. Now, originally, I had intended this to be a single episode, but when I got to an hour and 47 minutes of our recording, it was quite clear that two episodes was on the docket. For the visuals, I dumped my outline into both Genspark and Manus to help produce this. And rather than picking one or the other, I decided I'm just going to go back and forth between them, A, so you can get a feel for how these various tools perform, but B, to keep it a little bit more visually interesting, as this is a

Starting point is 00:01:26 particularly talky type of episode. I've organized the predictions into about seven categories: models and capabilities, vibe coding, enterprises plus vibe coding, enterprise trends not including vibe coding, competition, market, and politics. Now, number three, enterprises and vibe coding, probably could have just been in one or the other, but they were distinct enough that I decided to keep them independent. Let's kick off with models and capabilities. Broadly speaking, I think that we are going to stay roughly on the METR line. Now, this is obviously a Genspark made-up chart, and the METR line I'm talking about is this one. This is the chart that measures the length of a task in human hours that different models can complete at 50 and 80% success rates. This line has been fairly consistent

Starting point is 00:02:06 for some time now. For a while, we saw capabilities doubling every seven months, and more recently it's jumped up to closer to four and a half months. You can see here the difference between the seven-month line and the four-month line on both the 50 and the 80% reliability threshold. Now it is at least theoretically possible that we see recursively self-improving AI, but I think it's far more likely that the new Nvidia architecture, which is coming online in the form of Blackwell chips and then eventually Rubin chips, keeps us on something like this trajectory, even as we max out capabilities and move them beyond human capacity in a lot of different areas. Next up, I think we are going to get a lot more models, a lot more frequently.

Starting point is 00:02:42 GPT-5, more than anything, showed that there is just a ton of risk in building up big expectations around a single model release. Now, yes, of course, Gemini 3 was kind of the opposite, but the hit to OpenAI and more broadly the entire AI field that GPT-5 wrought probably could have been avoided by a different approach to release schedules. Of course, to be fair to OpenAI, they had released models in between. We had o3, o4-mini, but they obviously had built a lot of expectations around their big 5.0 model. Subsequent to that, we have gotten 5.1, then 5.1 Codex, then 5.2, then 5.2 Codex, all in very short order from one another. Anthropic, of course, was kind of already on this tip, not only releasing

Starting point is 00:03:22 more sub-variations, but also splitting the releases of their Haiku, Sonnet, and Opus versions in a way that took some pressure off of any one release. Now, for all of us users, this is going to be a little bit of a double-edged sword. On the one hand, we are pretty constantly going to have new toys to play with, but on the other, there is going to be a never-ending slate of new things to test and try and figure out if they actually improve upon the existing models for your particular use cases. What's more, I think especially when it comes to writing-type tasks, or just generally being smart, research, etc., model upgrades are going to be increasingly vibe-based. This is, of course, due to the fact that all of the premier models are really good right now.

Starting point is 00:04:02 When I'm deciding between Gemini 3, Opus 4.5, and GPT-5.2 for some writing or research use case, it's largely going to be stylistic for me and use case by use case. Now, what this may lead to for most users is just picking the one whose vibes they generally like best and sticking with it, knowing that even if one of the other models gets ahead for a moment, there's probably a new release coming right around the corner that will get your preferred model back up to the state of the art. That said, because there's so much saturation and similarity around a lot of those base writing and thinking type of tasks, I think there's going to be a lot more emphasis on multimodal competition.

Starting point is 00:04:37 Already you're seeing that. Nano Banana Pro, which Manus used, of course, to create these images, you can kind of tell, felt every bit as significant to Google's second half of the year as did Gemini 3. And obviously, OpenAI did not wait very long to respond, even moving up the release of their images 1.5 model. It is very clear that OpenAI is not ceding this, even if Google does look like the juggernaut in this particular area. It's worth noting that Grok also isn't ceding this, continuing to push both images and video. The only major lab that has very clearly taken themselves out of this particular race, or actually never entered it, is Anthropic.

Starting point is 00:05:11 Now, in addition to multimodal, I also predict that there will be a lot more emphasis on productization and the interface around models. Again, if you think that the models are pretty commensurate with one another and all kind of at the state of the art, then the choices you're going to make as a user of those models are going to shift to other areas, such as, for example, the user experience and how navigable they are, how much they help you do what you need to do with them. I think the fact that OpenAI put a distinct user experience, even if a very limited one, around the images release is testament to that fact. And of course, I'm talking about even in the context of the foundation model labs, given that

Starting point is 00:05:46 there are already so many of what used to pejoratively be called wrapper companies that have become extremely successful by focusing on specific interfaces for specific industries and use cases. One particular interface that I think that we're likely to see is what I'm calling a NotebookLM for agent building, by which I really mean a really simple studio type of interface for building agents. I fundamentally don't believe that the drag and drop automation type builders that you see with products like Zapier and Lindy, as powerful as they are and as useful for power users as they are, are going to be an interface that takes building agents to the mainstream. Now, NotebookLM might not exactly be the right analogy.

Starting point is 00:06:24 I just mean we're going to have distinct experiences, I think, for building agents, some of which come from the major labs themselves. Google is a good bet to deliver this first given that Google AI Studio is kind of already inching towards this in a number of ways. Still in the Models and Capabilities section, I believe that the focus on coding that we saw throughout 2025 not only won't decrease, it will radically ratchet up. It is both a massive use case, but also a capability set that unlocks lots of other use cases, and you better believe it is going to be very much on the minds of every

Starting point is 00:06:54 single lab with every single model they release. My next prediction is that we're going to learn in 2026 just how valuable it is to have last mile end user data that can help refine your models. Swyx has framed one of the competitive battles, which I'll talk about in the competition section, as the agent labs versus the model labs. The agent labs, of course, are things like Cognition and Cursor, whereas the model labs are OpenAI, Anthropic, etc. At the very end of 2025, we started to see the agent labs moving into the model space, taking advantage of the fact that they have a set of data

Starting point is 00:07:25 that the model labs don't necessarily because of how much of that end usage they have. Will that actually allow them to jump out ahead and become the next generation model labs? I don't know that that'll be decided in 2026, but we're certainly going to have a lot more information about it. Another prediction around where I think the labs are going to focus, memory feels to me like just obviously the biggest opportunity in some ways. Already the very nascent type of memory that we have in these LLMs at the end of 2025 as opposed to, for example, at the end of 2024, has made a major difference. Likewise,

Starting point is 00:07:56 already, that limited memory is maybe the biggest barrier preventing people from model switching. I'm about as voracious a model switcher as they come, with the top level of every subscription across all of the major models. And yet, despite the fact that I try most use cases across most models, there are certain things where the memory that one of the models has about a particular area of business or previous conversations I've had just means it's too much of a pain to transfer from one to the other. Now, this is not a particularly difficult prediction. It's something, for example, that Sam Altman is already talking lots about, but I do think it's going to be an increasingly important focus, especially if and as the other models start to catch up with

Starting point is 00:08:32 ChatGPT, and they're looking for better ways to lock users in. One that you might have heard me talk about a little bit in my review of the a16z big ideas is my thoughts on world models. I think that this is going to continue to be an area that people are really excited about. I think we're going to see some new entrants to the market. Yann LeCun, for example, left Meta and is purportedly raising a half billion dollars at a big valuation to go pursue this opportunity. But I think that in '26 specifically, we're going to continue to get really cool demos and maybe some really early sandboxes, but I don't think that we're going to have a generalist usability type of moment yet. Right now, world models feel a little bit to me like the VR of the AI world, where it's not

Starting point is 00:09:11 hard to understand how powerful they could be in theory, but because they represent some totally new capability set for experiences and are not just a one-to-one replacement for things we used to do, there's just going to be a lot more time to shift that type of behavior. Now, world models are valuable for more reasons than just the end user. Obviously, many people think that they are a better path to AGI than the approaches we're currently taking. So in that way, they're not like VR as some new consumer category, but I still think that when it comes to their maturity, I'd be surprised if we were all using some major model by the end of 2026. I would of course be delighted to be wrong on this one. Lastly, in the Models and Capability section, I think that in

Starting point is 00:09:49 '26, we're going to see the lines between assistants and agents get more blurry, not more clear. What I mean by that is that I think that the way that agents will start to make their way into the real world on a wider array of use cases is still going to be through individual users delegating more to them. I think that users shifting and using agents to manage more complex tasks, like, for example, taking this outline and turning it into a 56-slide presentation, is going to be the way that agentic AI starts to proliferate, particularly in the enterprise. Now, this is not to say that we also won't see lots of progress on fully autonomous agents, but I think in practice, it's more likely that 2026 is the year of agent managers than it is the year of full autonomy.

Starting point is 00:10:36 Sure, there's hype about AI, but KPMG is turning AI potential into business value. They've embedded AI and agents across their entire enterprise to boost efficiency, improve quality, and create better experiences for clients and employees. KPMG has done it themselves. Now they can help you do the same. Discover how their journey can accelerate yours at www.kpmg.us slash agents. That's www.kpmg.us slash agents.

Starting point is 00:11:04 This episode is brought to you by Blitzy, the Enterprise Autonomous Software Development Platform with Infinite Code Context. Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise-scale codebases with millions of lines of code. Enterprise engineering leaders start every development sprint with the Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates and pre-compiles code for each task. Blitzy delivers 80% plus of the development work autonomously, while providing a guide for the final 20% of human development work required to complete

Starting point is 00:11:34 the sprint. Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-IDE development tool, pairing it with their coding copilot of choice to bring an AI-native SDLC into their org. Visit blitzy.com and press get a demo to learn how Blitzy transforms your SDLC from AI-assisted to AI-native. Today's episode is brought to you by my company, Superintelligent. Superintelligent is an AI planning platform. And right now, as we head into 2026, the big theme that we're seeing among the enterprises

Starting point is 00:12:04 that we work with is a real determination to make 2026 a year of scaled AI deployments, not just more pilots and experiments. However, many of our partners are stuck on some AI plateau. It might be issues of governance. It might be issues of data readiness. It might be issues of process mapping. Whatever the case, we're launching a new type of assessment called Plateau Breaker that, as you can probably guess from that name, is about breaking through AI plateaus.

Starting point is 00:12:32 We'll deploy voice agents to collect information and diagnose what the real bottlenecks are that are keeping you on that plateau. From there, we put together a blueprint and an action plan that helps you move right through that plateau into full-scale deployment and real ROI. If you're interested in learning more about Plateau Breaker, shoot us a note, contact at besuper.ai, with plateau in the subject line. Small, nimble teams beat bloated consulting every time. Robots and Pencils partners with organizations on intelligent, cloud-native systems powered by AI. They uncover human needs, design AI solutions, and cut through complexity to deliver meaningful impact without the layers of bureaucracy. As an AWS-certified

Starting point is 00:13:11 partner, Robots and Pencils combines the reach of a large firm with the focus of a trusted partner. With teams across the U.S., Canada, Europe, and Latin America, clients gain local expertise and global scale. As AI evolves, they ensure you keep pace with change. And that means faster results, measurable outcomes, and a partnership built to last. The right partner makes progress inevitable. Partner with Robots and Pencils at robotsandpencils.com slash AI Daily Brief. Next up, let's talk vibe coding. And by the way, I've decided now that I've gone through a full section, jumping back and forth between Manus and Genspark, that I just like the Genspark better in this case. Manus did a great job as well, but it's got a little bit too much of that

Starting point is 00:13:53 obvious Nano Banana Pro sheen for our purposes here. So next section, vibe coding, obviously one of the biggest themes of 2025, so how do I think it's going to change next year? First of all, I think we're going to see a big bifurcation. Right now, we use the same words to describe two totally different things, vibe coding or AI and agentic coding within software engineering organizations, and vibe coding among non-developers. These are wildly different things, and I think that we'll stop treating them as the same thing. Now, moving into what that's going to mean,

Starting point is 00:14:22 I think that on the engineering side, we came into 2025, with there still being a ton of resistance, especially among enterprise engineering departments, to AI and agentic coding. By the end of the year, we've shifted all the way to the conversations being about how to best handle

Starting point is 00:14:38 and organize different types of autonomy, how to manage the new challenges that AI and agentic coding create, where and in what ways organizations think they need to ignore certain capabilities, so their own capabilities don't atrophy. But all of it, I think, amounts to a big reorganization of engineering organizations to take advantage of AI-enabled coding. Now, this might seem obvious, and for those of you who are in startups or who live deep in the AI industry, this has probably just been happening continuously throughout the year. But I think you're going to see it start to jump into even traditional organizations that are really

Starting point is 00:15:09 going to have to reevaluate how they're structured, how they deliver, how they deploy. Next up, and one of the predictions that I feel most strongly about, vibe coding is going to move beyond prototypes into production mode in non-tech areas of the enterprise. That could be things like custom legal contract analyzers, onboarding apps for HR. I think you're going to see a ton of vibe-coded experiences enter the marketing world. And of course, these things may never touch the engineering organization. You might still have engineering departments that make sure these things don't introduce new security risks or are production ready if they're public facing. But I think we're going to see production mode vibe coding enter all the non-tech areas of the enterprise

Starting point is 00:15:47 this year. On the consumer side, I think we're going to see a lot of bespoke personal software. Some people have called this ephemeral software. I don't think that the terminology is exactly figured out yet. But the idea here is basically people building themselves tools because it's easier to chat with Lovable or Replit or whatever they're using and get a thing that is exactly tailored to them than it is to go find and tweak some existing app experience, or maybe that thing just doesn't exist. For example, right now I have a gift tracker that I was using to keep track of what we had got for our kids so we don't end up getting way too much as always happens with me, which is an example of something that just doesn't exist right now. Or honestly,

Starting point is 00:16:25 I didn't even really look to find to see if it did because I knew exactly what I wanted and it was easier to just build it. And I also built myself a simple fitness tracker. Now, I've tried like every different fitness tracker. And it's not that they didn't have the features that I was looking for. I just wanted something very specific that made sense to my particular brain, and it was easy enough to just build for myself. Anyways, I think we're going to start to see a lot more of this personal software start to happen this year. I've been vibe coding all year and it's only just in the last month or so that I felt myself start to naturally ask, could I solve that with software? I think probably a growing number of people will start to have a similar experience, and that'll

Starting point is 00:17:01 lead us in some really interesting places. One of the places I think that'll lead is we'll probably see a new class of AI app entrepreneur. Some number of these things that start off as people building for themselves, they'll probably figure out have kind of a market. And since they never needed to raise venture capital or anything like that, the economics of these things look totally different. Maybe for example, you don't care about subscription costs, and you think people would be happier paying 10 bucks one time than having to think about $2 a month in perpetuity. The other thing that makes this one interesting is, of course, ChatGPT becoming something of an app platform, although I don't think we have any idea yet exactly how that's going to play out, and whether there will actually even be a way for independent

Starting point is 00:17:40 and smaller developers to actually find their way into that flow, or if it's just going to be dominated by the major partners. Another really hyper-specific prediction, I think it is going to be a very tough time for template-based website creation software. Once you have used English to manage your personal website, and when you want something changed, you can just explain it, you're never going back to templates. Now, of course, Wix and Squarespace are both aware of this. Wix bought Base44 and is heavily investing in this area, so it's not a knock on the companies themselves,

Starting point is 00:18:09 but I think this mode of building personal websites is on its very, very last legs. One more super-specific one. I think Shopify potentially has a uniquely important role in the AI ecosystem. Shopify is already how so many people, small creators, small builders, people who don't consider themselves technical at all,

Starting point is 00:18:27 interface with e-commerce, and increasingly just interface with the entire spectrum of their online business. It's not just their store, it's also their website. Shopify has been extremely attuned to the AI opportunity, and I think because they serve such a normie audience who is definitionally not necessarily tech savvy, they have a really important role in transmitting and helping share the value that AI can bring,

Starting point is 00:18:51 not just to tech people, but to regular people who are just trying to run their businesses more effectively. Speaking of businesses, let's move over to the Enterprise World, starting with the section on Enterprises and Vibe Coding. Overall, I think we're going to see a knowledge work vibeification, which basically means we're going to see what happened with software engineering this year, go into all other areas of knowledge work next year. Simply put, we're going to start to make a shift from doing to managing.

Starting point is 00:19:15 This entire presentation is a great example of that. Now, I think that this is a five-to-ten year megatrend, and so I don't want to overstate how dramatically the shift will happen, but I think it will feel distinct even inside big, lumbering, boring old organizations. I also think we are going to see new vibe coding specific roles. Basically, I think companies are going to start hiring people who have an overlap of some particular functional experience and also are good vibe coders. Think of them as internal forward-deployed vibers.

Starting point is 00:19:46 Now, Lenny Rachitsky recently called this out saying that he had seen some of this happening. So maybe I'm cheating by making a prediction. But I definitely think that this is going to be a thing that more and more enterprises hire for in 2026, and the forward-deployed vibers will, of course, help all the different departments and functions figure out how to use coding in ways that they couldn't before. Now, I've talked about personal software, but will companies build their own version of personal software, basically replacement software for their big enterprise software deals?

Starting point is 00:20:14 Klarna very famously a couple years ago scrapped Workday and Salesforce and shifted to their own, and I've always been quite skeptical that that's something that companies are going to do en masse. So here's the nuance. I actually do think that in 2026, we are going to see companies build replacement software. But I don't think it's going to be massive companies ripping out Salesforce. I think this is going to impact small and medium-sized companies. The companies who, if you checked out the AI ROI benchmarking study,

Starting point is 00:20:41 are operating a little bit more nimbly and already seeing more value from AI because they can take full advantage of it more quickly. I think you're going to see those types of companies, who those big lumbering enterprise sales contracts were never necessarily a great fit for, increasingly not only not work with the Salesforces of the world, but also have a pretty high bar for even the long-tail software providers, like in the case of CRM, a HubSpot or something like that. I think more and more you are going to see people who don't have use for 70 or 80% of the features just build the 20% that they want, especially if it's internal facing and it can be a little clunky

Starting point is 00:21:16 and broken. Now this won't be ubiquitous, and of course the SaaS providers are doing a lot to integrate AI features to make their products better, but I do think we are going to increasingly see companies build replacement software, particularly in areas like CRM. Now, moving to enterprises more broadly, to the shock of no one, I think that there is going to be a huge ROI and benchmarking focus. Call 2026 the year of the dashboard. Now, it's not that I think that companies will stop doing AI if they can't get precise measures of ROI, but I do think that they're going to start trying to measure things in a much more distinct and discrete way. In fact, I think it's kind of going to be the wild west of measurement this year until we actually

Starting point is 00:21:56 get some benchmarks under our belt. People are going to explore all sorts of different types of impact metrics and different ways of determining value. But I would expect it to be way more quantitative than qualitative heading into 2027 than it is heading into 2026. I also think that there is going to be a ton of focus on data and context engineering. I think investing in your AI and agent infrastructure is going to be sexy in the enterprise in 2026. Companies are going to realize that to really get full value, especially out of agents, they're just going to have to take the time and make the investment to have their data available to work for those agents. Now, they've known this for a while, but I think it'll really come to the fore and be something

Starting point is 00:22:35 that people talk about and focus on, even to the exclusion of some random test agents in the year to come. Now, the next one is kind of an echo of what we talked about before with NotebookLM for agents, but I think that for enterprises to shift more of their behavior into the agent realm, in other words, out of the realm of assisted AI and automated workflows, it's going to take some serious interface improvements. Again, enterprises are not going to use Zapier-style builders, but I think that as we do get those new interfaces, a lot of opportunity will unlock. In fact, I kind of think that we're going to start to see a bit of a squeeze on that workflow automation this year. One of the things that's happening right now is that a lot of enterprises,

Starting point is 00:23:14 and this makes sense, are trying to use AI to map how their humans currently do things, to allow agents, or realistically automated workflows, to copy that human process. In many cases, there could be a ton of value there. However, I think that it is highly likely that the real destiny will be total process reinvention based on new agentic capability, not just an agent copying what a human did. Agents are not humans. They work in different ways. To get the full value out of them, in most cases, we'll probably need to figure out, or allow them to figure out, the best way of accomplishing a goal without imposing an existing process on them. So I think you're going to start to see a squeeze on automation

Starting point is 00:23:52 from both just the assisted AI on the one hand, which is going to continue to be a huge part of personal productivity gains that then can translate up into the organization, and actual new agentic processes from the other side that start to redefine how a workflow can work. Finally, in the enterprise, I think we are going to start to see the full impact of AI compounding. We're now at a point where the organizations that are leading

Starting point is 00:24:15 are going to start to get farther and farther ahead, not just on their AI usage, but I think that their AI usage will actually start to open up not just efficiency gains in what they do now, but new opportunities, such as new product and revenue lines. As they do that, the distance between them and the AI laggards is going to do nothing but grow. For now, that is going to do it for today's episode. Appreciate you listening or watching, as always. Until next time, peace.
