The AI Daily Brief: Artificial Intelligence News and Analysis - 50 AI Predictions for 2026 - Part 1
Episode Date: December 29, 2025
Part one of a two-episode forecast on AI in 2026, focusing on models and capabilities, release strategy shifts, multimodal races, memory, and the evolution from assistance to agent management. It also explores how vibe coding expands beyond engineering, why bespoke personal software grows, and how these trends start reshaping enterprise adoption next year.
Brought to you by:
KPMG – Discover how AI is transforming possibility into reality. Tune into the new KPMG 'You Can with AI' podcast and unlock insights that will inform smarter decisions inside your enterprise. Listen now and start shaping your future with every episode. https://www.kpmg.us/AIpodcasts
Blitzy.com - Go to https://blitzy.com/ to build enterprise software in days, not months
Robots & Pencils - Cloud-native AI solutions that power results https://robotsandpencils.com/
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.
The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Interested in sponsoring the show? sponsors@aidailybrief.ai
Transcript
Today we are casually talking through 50 AI predictions for 2026.
The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.
All right, friends, quick announcements before we dive in.
First of all, thank you to today's sponsors, KPMG, Blitzy, Superintelligent, and Robots & Pencils.
To get an ad-free version of the show, go to patreon.com/aidailybrief, or you can subscribe on Apple Podcasts.
To learn about sponsoring the show, visit aidailybrief.ai, or send us a note at sponsors@aidailybrief.ai.
And lastly, if you would like to learn more about our recently released AI ROI benchmarking survey
or our forthcoming AIDB Intelligence Service, which includes original research, information, and benchmarks,
check it out at aidbintel.com.
All right, friends, the time has come to shift from looking backward to looking forward.
And I'm thrilled to spend the next two days looking at AI predictions for 2026.
Now, originally, I had intended this to be a single episode, but when I got to an hour and 47 minutes of
our recording, it was quite clear that two episodes was on the docket. For the visuals, I dumped my
outline into both Genspark and Manus to help produce this. And rather than picking one or the other,
I decided I'm just going to go back and forth between them, A, so you can get a feel for how these
various tools perform, but B, to keep it a little bit more visually interesting, as this is a
particularly talky type of episode. I've organized the predictions into about seven categories:
models and capabilities, vibe coding, enterprises plus vibe coding, enterprise trends not including
vibe coding, competition, market, and politics. Now, number three, enterprises and vibe coding, probably
could have just been in one or the other, but they were distinct enough that I decided to keep them
independent. Let's kick off with models and capabilities. Broadly speaking, I think that we are going
to stay roughly on the METR line. Now, this is obviously a Genspark made-up chart, and the METR line
I'm talking about is this one. This is the chart that measures the length of a task in human hours
that different models can complete at 50 and 80% success rates. This line has been fairly consistent
for some time now. For a while, we saw capabilities doubling every seven months, and more recently
it's jumped up to closer to four and a half months. You can see here the difference between the seven-month
line and the four-month line on both the 50 and the 80% reliability threshold. Now it is at least
theoretically possible that we see recursively self-improving AI, but I think it's far more likely that
the new Nvidia architecture, which is coming online in the form of Blackwell chips and then
eventually Rubin chips, keeps us on something like this trajectory, even as we max out
capabilities and move them beyond human capacity in a lot of different areas.
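To make the doubling-time arithmetic above concrete, here is a minimal Python sketch. It assumes plain exponential growth in the task length a model can complete, which is a simplification of the chart described in the episode. The function names and the one-hour starting point are hypothetical, chosen only for illustration.

```python
import math


def horizon_after(months: float, start_hours: float, doubling_months: float) -> float:
    """Task length (in human hours) after `months`, assuming it doubles
    every `doubling_months`. Simple exponential model, an assumption."""
    return start_hours * 2 ** (months / doubling_months)


def months_to_reach(target_hours: float, start_hours: float, doubling_months: float) -> float:
    """Months needed to grow from `start_hours` to `target_hours`
    under the same doubling assumption."""
    return doubling_months * math.log2(target_hours / start_hours)


if __name__ == "__main__":
    # Hypothetical starting point: a model that handles 1-hour tasks today.
    for doubling in (7.0, 4.5):
        one_year = horizon_after(12, start_hours=1.0, doubling_months=doubling)
        print(f"doubling every {doubling} months -> {one_year:.2f} hours after 12 months")
```

Under these assumptions, a seven-month doubling time lengthens the task horizon by roughly 3.3x over a year, while a four-and-a-half-month doubling time lengthens it by roughly 6.3x, which is why the two lines on the chart pull apart so quickly.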
Next up, I think we are going to get a lot more models, a lot more frequently.
GPT-5, more than anything, showed that there is just a ton of risk in building up big expectations
around a single model release.
Now, yes, of course, Gemini 3 was kind of the opposite, but the hit to OpenAI and more broadly
the entire AI field that GPT5 wrought probably could have been avoided by a different approach
to release schedules. Of course, to be fair to OpenAI, they had released models in between. We had
o3, o4-mini, but they obviously had built a lot of expectations around their big 5.0 model.
Subsequent to that, we have gotten 5.1, then 5.1 Codex, then 5.2, then 5.2 Codex, all in very short
order from one another. Anthropic, of course, was kind of already on this tip, not only releasing
more sub-variations, but also splitting the releases of their Haiku, Sonnet, and Opus versions
in a way that took some pressure off of any one release. Now, for all of us users, this is going to be
a little bit of a double-edged sword. On the one hand, we are pretty constantly going to have
new toys to play with, but on the other, there is going to be a never-ending slate of new things
to test and try and figure out if they actually improve upon the existing models for your particular
use cases. What's more, I think especially when it comes to writing-type tasks, or just generally
being smart, research, etc., model upgrades are going to be increasingly vibe-based.
This is, of course, due to the fact that all of the premier models are really good right now.
When I'm deciding between Gemini 3, Opus 4.5, and GPT-5.2 for some writing or research
use case, it's largely going to be stylistic for me and use case by use case.
Now, what this may lead to for most users is just picking the one whose vibes they generally like
best and sticking with it, knowing that even if one of the other models gets ahead for a moment,
there's probably a new release coming right around the corner that will get your preferred model
back up to the state of the art.
That said, because there's so much saturation and similarity around a lot of those base writing
and thinking type of tasks, I think there's going to be a lot more emphasis on multimodal competition.
Already you're seeing that.
Nano Banana Pro, which Manus used, of course, to create these images, you can kind of tell,
felt every bit as significant to Google's second half of the year as Gemini 3 did.
And obviously, OpenAI did not wait very long to respond, even moving up the release of
their images 1.5 model. It is very clear that OpenAI is not ceding this, even if Google does
look like the juggernaut in this particular area. It's worth noting that Grok also isn't
ceding this, continuing to push both images and video. The only major lab that has very clearly
taken itself out of this particular race, or really never entered it, is Anthropic.
Now, in addition to multimodal, I also predict that there will be a lot more emphasis on productization
and the interface around models. Again, if you think that the models are pretty commensurate with one
another and all kind of at the state of the art, then the choices you're going to make as a user
of those models are going to shift to other areas, such as, for example, the user experience
and how navigable they are, and how much they help you do what you need to do with them.
I think the fact that OpenAI put a distinct user experience, even if a very limited one, around
the images release is testament to that fact.
And of course, I'm talking about even in the context of the foundation model labs, given that
there are already so many of what used to pejoratively be called wrapper companies that have become
extremely successful by focusing on specific interfaces for specific industries and use cases.
One particular interface that I think that we're likely to see is what I'm calling a NotebookLM
for agent building, by which I really mean a really simple studio type of interface for building
agents. I fundamentally don't believe that the drag-and-drop automation type builders that you
see with products like Zapier and Lindy, as powerful as they are and as useful for power users as they
are, are going to be an interface that takes building agents to the mainstream.
Now, NotebookLM might not exactly be the right analogy.
I just mean we're going to have distinct experiences, I think, for building agents, some of
which come from the major labs themselves.
Google is a good bet to deliver this first given that Google AI Studio is kind of already
inching towards this in a number of ways.
Still in the Models and Capabilities section, I believe that the focus on coding that we saw
throughout 2025 not only won't decrease, it will radically ratchet up.
It is both a massive use case, but also a capability set that unlocks lots of
other use cases, and you better believe it is going to be very much on the minds of every
single lab with every single model they release.
My next prediction is that we're going to learn in 2026 just how valuable it is to have
last mile end user data that can help refine your models.
swyx has framed one of the competitive battles, which I'll talk about in the competition section,
as the agent labs versus the model labs.
The agent labs, of course, are things like Cognition and Cursor, whereas the model labs are
OpenAI, Anthropic, etc. At the very end of 2025, we started to see the agent
labs moving into the model space, taking advantage of the fact that they have a set of data
that the model labs don't necessarily have, because of how much of that end usage they have.
Will that actually allow them to jump out ahead and become the next generation model labs?
I don't know that that'll be decided in 2026, but we're certainly going to have a lot more
information about it.
Another prediction around where I think the labs are going to focus: memory feels to me like
just obviously the biggest opportunity in some ways.
Already the very nascent type of memory that we have in these LLMs at the end of
2025 as opposed to, for example, at the end of 2024, has made a major difference. Likewise,
already, that limited memory is maybe the biggest barrier preventing people from model switching.
I'm about as voracious a model switcher as they come, with the top level of every subscription
across all of the major models. And yet, despite the fact that I try most use cases across
most models, there are certain things where the memory that one of the models has about a particular
area of business or previous conversations I've had just means it's too much of a pain to transfer
from one to the other. Now, this is not a particularly difficult prediction. It's something,
for example, that Sam Altman is already talking lots about, but I do think it's going to be an
increasingly important focus, especially if and as the other models start to catch up with
ChatGPT, and they're looking for better ways to lock users in. One that you might have heard me talk
about a little bit in my review of the a16z Big Ideas is my thoughts on world models. I think that
this is going to continue to be an area that people are really excited about. I think we're going to
see some new entrants to the market. Yann LeCun, for example, left Meta and is purportedly
raising a half billion dollars at a big valuation to go pursue this opportunity. But I think that in
2026 specifically, we're going to continue to get really cool demos and maybe some really early
sandboxes, but I don't think that we're going to have a generalist usability type of moment yet.
Right now, world models feel a little bit to me like the VR of the AI world, where it's not
hard to understand how powerful they could be in theory, but because they represent some totally
new capability set for experiences and are not just a one-to-one replacement for things we used to do,
it's just going to take a lot more time to shift that type of behavior. Now, world models are
valuable for more reasons than just the end user. Obviously, many people think that they are a better
path to AGI than the approaches we're currently taking. So in that way, they're not like VR
as some new consumer category, but I still think that when it comes to their maturity,
I'd be surprised if we were all using some major world model by the end of 2026. I would of course be
delighted to be wrong on this one. Lastly, in the Models and Capabilities section, I think that in
2026, we're going to see the lines between assistants and agents get more blurry, not more clear.
What I mean by that is that I think that the way that agents will start to make their way into the
real world on a wider array of use cases is still going to be through individual users delegating
more to them. I think that users shifting and using agents to manage more complex tasks,
like, for example, taking this outline and turning it into a 56-slide presentation,
is going to be the way that agentic AI starts to proliferate, particularly in the enterprise.
Now, this is not to say that we also won't see lots of progress on fully autonomous agents,
but I think in practice, it's more likely that 2026 is the year of agent managers than the year of full autonomy.
Sure, there's hype about AI, but KPMG is turning AI potential into business value.
They've embedded AI and agents across their entire enterprise to boost efficiency,
improve quality and create better experiences for clients and employees.
KPMG has done it themselves.
Now they can help you do the same.
Discover how their journey can accelerate yours at www.kpmg.us/agents.
That's www.kpmg.us/agents.
This episode is brought to you by Blitzy,
the Enterprise Autonomous Software Development Platform with Infinite Code Context.
Blitzy uses thousands of specialized AI agents that think for hours
to understand enterprise-scale codebases
with millions of lines of code. Enterprise engineering leaders start every development sprint with the
Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan,
then generates and pre-compiles code for each task. Blitzy delivers 80% plus of the development work
autonomously, while providing a guide for the final 20% of human development work required to complete
the sprint. Public companies are achieving a 5x engineering velocity increase when incorporating
Blitzy as their pre-IDE development tool, pairing it with their coding copilot of choice to bring an AI-native
SDLC into their org.
Visit blitzy.com and press get a demo to learn how Blitzy transforms your SDLC from AI
assisted to AI native.
Today's episode is brought to you by my company, Superintelligent.
Superintelligent is an AI planning platform.
And right now, as we head into 2026, the big theme that we're seeing among the enterprises
that we work with is a real determination to make 2026 a year of scaled AI deployments,
not just more pilots and experiments.
However, many of our partners are stuck on some AI plateau.
It might be issues of governance.
It might be issues of data readiness.
It might be issues of process mapping.
Whatever the case, we're launching a new type of assessment called Plateau Breaker
that, as you probably guess from that name, is about breaking through AI plateaus.
We'll deploy voice agents to collect information and diagnose what the real bottlenecks are that
are keeping you on that plateau.
From there, we put together a blueprint and an action plan that helps you move right through
that plateau into full-scale deployment and real ROI. If you're interested in learning more about
Plateau Breaker, shoot us a note at contact@besuper.ai with plateau in the subject line.
Small, nimble teams beat bloated consulting every time. Robots & Pencils partners with organizations
on intelligent, cloud-native systems powered by AI. They cover human needs, design AI solutions,
and cut through complexity to deliver meaningful impact without the layers of bureaucracy. As an AWS-certified
partner, Robots & Pencils combines the reach of a large firm with the focus of a trusted partner.
With teams across the U.S., Canada, Europe, and Latin America, clients gain local expertise and global
scale. As AI evolves, they ensure you keep pace with change. And that means faster results, measurable
outcomes, and a partnership built to last. The right partner makes progress inevitable.
Partner with Robots & Pencils at robotsandpencils.com/aidailybrief.
Next up, let's talk vibe coding. And by the way, I've decided now that I've gone through a full
section, jumping back and forth between Manus and Genspark, that I just like the Genspark better
in this case. Manus did a great job as well, but it's got a little bit too much of that
obvious Nano Banana Pro sheen for our purposes here. So next section, vibe coding, obviously
one of the biggest themes of 2025, so how do I think it's going to change next year? First of all,
I think we're going to see a big bifurcation. Right now, we use the same words to describe two totally
different things, vibe coding or AI and agentic coding within software engineering organizations,
and vibe coding among non-developers.
These are wildly different things,
and I think that we'll stop treating them as the same thing.
Now, moving into what that's going to mean,
I think that on the engineering side,
we came into 2025,
with there still being a ton of resistance,
especially among enterprise engineering departments,
to AI and agentic coding.
By the end of the year,
we've shifted all the way
to the conversations being about how to best handle
and organize different types of autonomy,
how to manage the new challenges
that AI and agentic coding create, where and in what ways organizations think they need to
ignore certain capabilities, so their own capabilities don't atrophy. But all of it, I think,
amounts to a big reorganization of engineering organizations to take advantage of AI-enabled coding.
Now, this might seem obvious, and for those of you who are in startups or who live deep in the
AI industry, this has probably just been happening continuously throughout the year.
But I think you're going to see it start to jump into even traditional organizations that are really
going to have to reevaluate how they're structured, how they deliver, how they
deploy. Next up, and one of the predictions that I feel most strongly about, vibe coding is going to
move beyond prototypes into production mode in non-tech areas of the enterprise. That could be things
like custom legal contract analyzers, onboarding apps for HR. I think you're going to see a ton
of vibe-coded experiences enter the marketing world. And of course, these things may never touch
the engineering organization. You might still have engineering departments that make sure these
things don't introduce new security risks or are production ready if they're public facing.
But I think we're going to see production mode vibe coding enter all the non-tech areas of the enterprise
this year. On the consumer side, I think we're going to see a lot of bespoke personal software.
Some people have called this ephemeral software. I don't think that the terminology is exactly
figured out yet. But the idea here is basically people building themselves tools because it's
easier to chat with Lovable or Replit or whatever they're using and get a thing that is exactly
tailored to them than it is to go find and tweak some existing app experience, or maybe that
thing just doesn't exist. For example, right now I have a gift tracker that I was using to keep
track of what we had got for our kids so we don't end up getting way too much as always happens
with me, which is an example of something that just doesn't exist right now. Or honestly,
I didn't even really look to see if it did because I knew exactly what I wanted and it was easier
to just build it. And I also built myself a simple fitness tracker. Now, I've tried like every
different fitness tracker. And it's not that they didn't have the features that I was looking for.
I just wanted something very specific that made sense to my particular brain, and it was easy enough
to just build for myself. Anyways, I think we're going to start to see a lot more of this personal
software start to happen this year. I've been vibe coding all year and it's only just in the last
month or so that I felt myself start to naturally ask, could I solve that with software?
I think probably a growing number of people will start to have a similar experience, and that'll
lead us in some really interesting places. One of the places I think that'll lead is we'll probably
see a new class of AI app entrepreneur. Some number of these things that start off as people building for
themselves, they'll probably figure out have kind of a market. And since they never needed to raise venture
capital or anything like that, the economics of these things look totally different. Maybe for example,
you don't care about subscription costs, and you think people would be happier paying 10 bucks one time
than having to think about $2 a month in perpetuity. The other thing that makes this one interesting is, of course,
ChatGPT becoming something of an app platform, although I don't think we have any idea yet
exactly how that's going to play out, and whether there will actually even be a way for independent
and smaller developers to actually find their way into that flow, or if it's just going to be
dominated by the major partners. Another really hyper-specific prediction, I think it is going
to be a very tough time for template-based website creation software. Once you have used English
to manage your personal website, and when you want something changed, you can just explain it,
you're never going back to templates.
Now, of course, Wix and Squarespace are both aware of this.
Wix bought Base44 and is heavily investing in this area,
so it's not a knock on the companies themselves,
but I think this mode of building personal websites
is on its very, very last legs.
One more super-specific one.
I think Shopify potentially has a uniquely important role
in the AI ecosystem.
Shopify is already how so many people,
small creators, small builders,
people who don't consider themselves technical at all,
interface with e-commerce,
and increasingly just interface with the entire spectrum of their online business.
It's not just their store, it's also their website.
Shopify has been extremely attuned to the AI opportunity,
and I think because they serve such a normie audience
who is definitionally not necessarily tech savvy,
they have a really important role in transmitting and helping share
the value that AI can bring,
not just to tech people, but to regular people
who are just trying to run their businesses more effectively.
Speaking of businesses, let's move over to the Enterprise World,
starting with the section on Enterprises and Vibe Coding.
Overall, I think we're going to see a knowledge work vibeification,
which basically means we're going to see what happened with software engineering this year,
go into all other areas of knowledge work next year.
Simply put, we're going to start to make a shift from doing to managing.
This entire presentation is a great example of that.
Now, I think that this is a five-to-ten-year megatrend,
and so I don't want to overstate how dramatically the shift will happen,
but I think it will feel distinct even inside big, lumbering, boring old organizations.
I also think we are going to see new vibe coding specific roles.
Basically, I think companies are going to start hiring people who have an overlap of some
particular functional experience and also are good vibe coders.
Think of them as internal forward-deployed vibers.
Now, Lenny Rachitsky recently called this out, saying that he had seen some of this happening.
So maybe I'm cheating by making a prediction.
But I definitely think that this is going to be a thing that more and more enterprises hire
for in 2026,
and the forward-deployed vibers will, of course, help all the different departments and functions
figure out how to use coding in ways that they couldn't before.
Now, I've talked about personal software, but will companies build their own version of personal
software, basically replacement software for their big enterprise software deals?
Klarna very famously a couple years ago scrapped Workday and Salesforce and shifted to their own,
and I've always been quite skeptical that that's something that companies are going to do
en masse.
So here's the nuance.
I actually do think that in 2026, we are going to see companies build replacement software.
But I don't think it's going to be massive companies ripping out Salesforce.
I think this is going to impact small and medium-sized companies.
The companies who, if you checked out the AI ROI benchmarking study,
are operating a little bit more nimbly and already seeing more value from AI
because they can take full advantage of it more quickly.
I think you're going to see those types of companies,
for whom those big lumbering enterprise sales contracts were never necessarily a great fit,
increasingly not only not work with the Salesforces of the world, but also have a pretty high bar
for even the long-tail software providers, like, in the case of CRM, a HubSpot or something like that.
I think more and more you are going to see people who don't have use for 70 or 80% of the features
just build the 20% that they want, especially if it's internal facing and it can be a little clunky
and broken. Now this won't be ubiquitous, and of course the SaaS providers are doing a lot to
integrate AI features to make their products better, but I do think we are going to increasingly
see companies build replacement software, particularly in areas like CRM.
Now, moving to enterprises more broadly, to the shock of no one, I think that there is going
to be a huge ROI and benchmarking focus. Call 2026 the year of the dashboard. Now, it's not
that I think that companies will stop doing AI if they can't get precise measures of ROI, but I do
think that they're going to start trying to measure things in a much more distinct and discrete way.
In fact, I think it's kind of going to be the wild west of measurement this year until we actually
get some benchmarks under our belt. People are going to explore all sorts of different types of
impact metrics and different ways of determining value. But I would expect it to be way more
quantitative than qualitative heading into 2027 than it is heading into 2026. I also think
that there is going to be a ton of focus on data and context engineering. I think investing in your
AI and agent infrastructure is going to be sexy in the enterprise in 2026. Companies are going
to realize that to really get full value, especially out of agents, they're just going to have to
take the time and make the investment to have their data available to work for those agents.
Now, they've known this for a while, but I think it'll really come to the fore and be something
that people talk about and focus on, even to the exclusion of some random test agents in the year
to come. Now, the next one is kind of an echo of what we talked about before with Notebook LM for
agents, but I think that for enterprises to shift more of their behavior into the agent
realm, in other words, out of the realm of assisted AI and automated workflows, it's going to take
some serious interface improvements. Again, enterprises are not going to use Zapier-style builders,
but I think that as we do get those new interfaces, a lot of opportunity will unlock.
In fact, I kind of think that we're going to start to see a bit of a squeeze on that workflow
automation this year. One of the things that's happening right now is that a lot of enterprises,
and this makes sense, are trying to use AI to map how their humans currently do things, to allow
agents, or realistically automated workflows, to copy that human process. In many cases, there could be
a ton of value there. However, I think that it is highly likely that the real destiny will be total
process reinvention based on new agentic capability, not just an agent copying what a human did.
Agents are not humans. They work in different ways. To get the full value out of them, in most cases,
we'll probably need to figure out, or allow them to figure out the best way of accomplishing a goal
without imposing an existing process on them.
So I think you're going to start to see a squeeze on automation
from both just the assisted AI on the one hand,
which is going to continue to be a huge part of personal productivity gains
that then can translate up into the organization,
and actual new agentic processes from the other side
that start to redefine how a workflow can work.
Finally, in the enterprise,
I think we are going to start to see the full impact of AI compounding.
We're now at a point where the organizations that are leading
are going to start to get farther and farther ahead, not just on their AI usage,
but I think that their AI usage will actually start to open up not just efficiency gains in what they do now,
but new opportunities, such as new product and revenue lines.
As they do that, the distance between them and the AI laggards is going to do nothing but grow.
For now, that is going to do it for today's episode.
Appreciate you listening or watching, as always.
Until next time, peace.
