The AI Daily Brief: Artificial Intelligence News and Analysis - 10 Things GPT-5 Changes
Episode Date: August 10, 2025
This episode explores ten practical shifts in the AI landscape following GPT-5’s release, from the plateau of raw LLM capability gains to the rise of tool-driven performance, consumer-first design choices, and the explosion of “vibe coding.” NLW breaks down how these changes reshape enterprise competition, open up opportunities for rival labs, drive price wars, and signal a future where multi-agent workflows become the norm.
Brought to you by:
KPMG – Go to https://kpmg.com/ai to learn more about how KPMG can help you drive value with our AI solutions.
Blitzy.com - Go to https://blitzy.com/ to build enterprise software in days, not months
AGNTCY - The AGNTCY is an open-source collective dedicated to building the Internet of Agents, enabling AI agents to communicate and collaborate seamlessly across frameworks. Join a community of engineers focused on high-quality multi-agent software and support the initiative at agntcy.org
Vanta - Simplify compliance - https://vanta.com/nlw
Plumb - The automation platform for AI experts and consultants https://useplumb.com/
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.
The AI Daily Brief helps you understand the most important news and discussions in AI.
Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Subscribe to the newsletter: https://aidailybrief.beehiiv.com/
Join our Discord: https://bit.ly/aibreakdown
Interested in sponsoring the show? nlw@breakdown.network
Transcript
Today on the AI Daily Brief, 10 things that change after GPT-5.
The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.
Hello, friends, quick announcements before we dive in.
First of all, thank you to today's sponsors, KPMG, Blitzy, and Superintelligent.
To get an ad-free version of the show, go to patreon.com slash AI Daily Brief.
And if you are interested in sponsoring the show, send us a note at sponsors at aidailybrief.ai.
Now, it was obvious, this being GPT-5 week, that the Long Reads and/or Big Think episode of the week
had to be dedicated to the subject of GPT-5 in some way, shape, or form.
I've obviously been thinking about this topic a lot, and I've been trying to think about
what I think the state of play is post-GPT-5, taking into account people's first reactions,
how I'm seeing it used, what I think it suggests for the direction of the industry.
I came up with a list of 10 things that I think will change or are changing or just being
reinforced now that this model is live.
You'll see these aren't big societal pronouncements that relate to the capabilities of GPT-5.
This is much more practical and specific and about the state of play in the AI space itself.
This is, of course, all just my subjective opinion based on my observations.
And I'll be interested to see what you guys think, especially as you get access to this model
and play around a bit.
Now, the first thing that I think changes, or perhaps is reinforced, is our sense of LLM progress.
There has been a question for some time now of whether we were hitting some sort of plateau.
This really started to be a narrative around the end of last year, frankly around this time last year,
when we got what felt like delays on GPT-5.
There was a lot of scuttlebutt about pre-training just not working as well,
and a big shift that came with the introduction of reasoning models and new
approaches to scaling like test-time compute that put the emphasis at the moment of inference rather
than on pre-training. Subsequent to that, of course, we got a whole new generation of models like
o1 and then o3 and all the reasoning models from the other labs as well that showed that even if
the pre-training technique was reaching its limits, there were still lots of areas of development
to be had. I think with this model though, even though there are all those other areas of development
and places that we are going to continue to get progress, there is broadly a
sense that this paradigm has hit at least some amount of a plateau. Replit CEO Amjad
Masad wrote, can't help but feel the crushing weight of diminishing returns. We need a new S-curve.
AI researcher Jack Morris affirmed this, but also put it more positively. He wrote,
Shortest explanation of GPT-5, this is exactly what the scaling laws predicted. The model is
better. The returns are diminishing. And sadly, absolute general intelligence improvements
will only get smaller. The good news is that there's still so much to do.
Personality, reasoning, memory, and creativity are still open problems.
And that brings me, I think, to the second point,
which is the idea that the emphasis in model improvement is shifting away from just
raw capabilities and towards tool usage and how models can interact with the real world.
This was the core subject of Ben Hylak's essay on Latent Space about GPT-5.
He summed up on Twitter,
GPT-5 is insanely good at using tools. Tools are about to change fundamentally, and this is why
OpenAI just released unstructured function calling. For those of you who didn't listen to my intro
episode on GPT-5, Ben basically made a comparison to the Stone Age. He argued that what marked the
dawn of human intelligence was humans learning how to use tools. As humans, he wrote, we manifest
our intelligence through tools. Tools extend our capabilities. We trade internal capabilities
for external capabilities. It's the defining characteristic of our intelligence.
GPT-5, he wrote, marks the beginning of the Stone Age for agents and LLMs.
GPT-5 doesn't just use tools, it thinks with them, it builds with them.
I think Ben is dead on here, and I think a lot of what we're going to see is optimizations
and improvements that are designed for how models and how their agentic expressions
can actually go use tools.
One of the things you start to notice now, even as companies present the results of their
models on benchmarks, is that they're always going to share the raw models, but then also models that use
tools. For example, when OpenAI presented GPT-5's performance on Humanity's Last Exam, while it got 24.8%
with no tools, with a full slate of tools including Python and search, it got 42%. In other words,
tools represent a whole new frontier of places to get more gains from these models. And so if you
are worried that we are maybe on a raw capabilities plateau, at least with current strategies,
there is very clearly a ton of new areas and new frontiers to explore that will continue to see
greater progress.
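To put those two Humanity's Last Exam numbers side by side, here is a minimal sketch of the arithmetic. The 24.8% and 42% figures are the ones quoted above; the function name is just an illustrative helper, not anything from OpenAI.

```python
def tool_uplift(no_tools_pct: float, with_tools_pct: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = with_tools_pct - no_tools_pct
    relative = absolute / no_tools_pct * 100
    return absolute, relative


# Scores as quoted in the episode: 24.8% without tools, 42% with tools.
absolute, relative = tool_uplift(24.8, 42.0)
print(f"+{absolute:.1f} points, {relative:.0f}% relative improvement")
# prints "+17.2 points, 69% relative improvement"
```

In other words, on this one benchmark the same model scores roughly two thirds higher once it has tools, which is the kind of headroom the tool-usage argument is pointing at.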
Moving on now, one of the things that is most clear from the release of GPT-5 is that this is a
huge boon for the normies.
For people who have mostly only interacted with ChatGPT through whatever model was the
default, like 4o, they are going to have their heads spun by some of the capabilities of GPT-5.
Dan Shipper from Every had his mom test it out, and she was glowing, saying this is
way more comprehensive than the answers I usually get from ChatGPT, the information it gives me is
readable and flows really well. The model is gold. Signal argued that the no-model roulette thing
is actually bigger than most people, especially average people, will clock. Basically, they argue
that the amount of cognitive load it takes for people to understand or try to figure out which
model to use was actually even more damaging than it seemed. One of the things we saw when
DeepSeek launched earlier this year was that even though the model itself was less performant than the
reasoning models that OpenAI had available, OpenAI wasn't giving those models to people as their
base, which means that when people tried DeepSeek, it was the first time they had used a reasoning
model, and the experience blew them away. Now that experience is going to be the norm for everyone,
and I think you're going to see a massive democratization of a lot of the best capabilities of AI,
to huge new pools of audiences that didn't have access to them before.
I want to talk specifically about two use case categories that I think get a major boost
that way. The first is strategy support or strategic thinking. Since the advent of reasoning models,
especially o3, LLMs have for many of us become constant strategic companions. I am literally day in and
day out, weighing different decisions for Superintelligent or for the podcast through the strategic
lens of, previously, o3. Now, of course, this does not mean that I act on the strategy. There are still
big gaps, but the reasoning models really have hit a point where they're extremely adept at
helping you think more comprehensively and holistically about the types of decisions that you're going to make.
GPT-5 improves that meaningfully.
First of all, in my early tests, I've seen the reduction of sycophancy that OpenAI worked so hard on
manifest as a willingness to take a harder line on decisions, which is a key part of
strategic thinking. Whereas previously o3 would try to hedge and explain or justify anything
that I said, it now is much more comfortable actually weighing different options and suggesting
a best course of action. Again, that doesn't mean that I'm necessarily any more likely to take it,
but it's a lot more instructive and informative and useful to see what the AI actually thinks,
not just how it justifies what I think. This type of strategic collaboration was not possible
with the non-reasoning models. And for that reason, the vast majority of ChatGPT users have
not engaged with the models in that way. I think it will be one of the biggest unlocks for people's
personal productivity, for how they manage their careers, for how they manage their part of businesses,
to be able to engage with the strategic capabilities of GPT-5.
Now of course, the other even more obvious thing is that we are about to see an absolute explosion
in vibe coding.
Vibe coding was already an insanely fast-growing area of AI usage, pretty definitively the most
important theme of 2025.
As I said in my initial coverage, OpenAI made it pretty clear:
they believe that about 700 million weekly new vibe coders are going to come online very, very
soon. It was not just, in my opinion, that they really wanted to catch up with Anthropic
for the sake of professional developers. Yes, that was and is absolutely a goal, but I think that they're
thinking about this more broadly. I think that OpenAI have come to the conclusion, one that I
agree with, by the way, that coding is a new lingua franca, and that interacting with code via vibe
coding type tools is going to be simply a standard part of using computers in the future. Now, this
won't happen all at once. Yes, certainly now that GPT-5 is available, some number of their 700 million
weekly users who haven't been vibe coding at all will say things like, hey, build me a game or build me
a website, but it'll take a while for that to become normalized. But normalized, I believe it will
become. And it's clear with GPT-5 that they're placing a lot of emphasis on that type of beginner
usage. A term you hear all over the place with people's first impressions of GPT-5 is one-shotting.
The idea that with very little guidance and very little follow-up, they were able to one-shot
some comprehensive thing that code could create.
Alam writes, what's truly impressive about GPT-5 one-shotting this game isn't the graphics.
It's the flawless prompt adherence and constraint handling.
A fully functional, interactable game generated in a single pass.
This level of instruction following and code generation is wild.
Other people have been sharing all the different things that they've one-shotted with GPT-5,
a space simulator, a meditation app, a Duolingo clone, Windows 95 even.
The more that people share these sorts of examples, the more that regular people are going to
come online, try it out, and realize that this entirely new capability set that they didn't even
consider before is now opened up to them. Now, this doesn't mean that everyone is in full agreement
around OpenAI's emphasis here. Dax, for example, writes, I think the biggest market for
AI coding tools is software engineers. A lot of the industry believes it's non-software engineers,
hence the focus on one-shotting. But in the end, I believe it'll be the smaller numbers.
That created a great conversation. Atlassian engineer Koon Chen writes, it's hard to say. Web 2.0 was
less about professional writers, more about regular bloggers. TikTok was less about filmmakers with
the DSLR, but more random people shooting random stuff. Things that don't feel valuable in aggregate
can add up when there's a long tail. What's for sure is that even if traditional software engineers
remain the majority of software engineers, at least when it comes to important code that gets pushed,
vibe coding for all is a major, major theme, unlocked in a huge way by GPT-5.
AI agents are the buzzword that everyone's talking about, but do you truly understand their significance?
KPMG's agent framework demystifies the concept, offering practical steps to unlock AI agents' immense potential.
Think of it as your GPS for AI strategy. KPMG partners with clients to harness the benefit
of AI agents, guiding you from strategy to execution with a secure architecture, and a plan for
workforce evolution. Check out their comprehensive insights on scaling agent power within your enterprise.
This isn't just about tech, it's a leadership imperative. Go to www.kpmg.us slash agents to learn more.
That's www.kpmg.us slash agents.
This episode is brought to you by Blitzy, the Enterprise Autonomous Software Development Platform with
code context. Blitzy uses thousands of specialized AI agents that think for hours to understand
enterprise-scale codebases with millions of lines of code. Enterprise engineering leaders start every
development sprint with the Blitzy platform bringing in their development requirements. The Blitzy
platform provides a plan, then generates and pre-compiles code for each task. Blitzy delivers 80%
plus of the development work autonomously while providing a guide for the final 20% of human development
work required to complete the sprint. Public companies are achieving a 5x engineering velocity increase
when incorporating Blitzy as their pre-IDE development tool,
pairing it with their coding copilot of choice to bring an AI-native SDLC into their org.
Blitzy is providing a limited time, 30-day free proof of concept for qualifying enterprises.
The team will provide a 5x velocity increase on a real development project in your org.
Visit blitzy.com and press book demo to learn how Blitzy transforms your SDLC
from AI-assisted to AI-native.
That's BLITZY.com.
If you are a regular listener, you will
have heard about Superintelligent's agent readiness audits at this point, but I wanted to tell you
today about the full suite of agent readiness products that go beyond just the initial
readiness report. Over the last six months, Superintelligent has built out an entire agent
planning suite. We help you move from discovery to planning to implementation. After you've
completed your agent readiness audits, we help you double click on your most important use cases
with what we call our use case planning reports. These reports are going to help you understand
what sort of technical preparation you need to do to be ready for a use case, what challenges you
might face in implementation, and whether you should be thinking about building, buying, partnering,
or some combination. After that, you can even get a spec document in what we call our technical
blueprint that gives either your developers or the developers of the partner you work with
what they need to build exactly the agent that you're looking for. If you want to learn more
about Superintelligent's agent planning suite, we've built a custom GPT to answer your questions.
Just go to bit.ly slash super agent. That's bit.ly slash super agent, all one word. And if you have any
questions, the agent can even help you book an appointment with our team.
One sub-theme from all of this,
which isn't exactly new, but which I do think is profound, is let's call it the consumerification
of OpenAI. OpenAI certainly hasn't abandoned any part of their enterprise efforts. In fact,
they used the occasion of their announcement to share that they now had five million businesses
using ChatGPT. COO Brad Lightcap, in his thread, wrote that he thought that developers
and enterprises especially will love it. We also recently got news that OpenAI was developing their
own forward-deployed engineering teams to service big customers who were spending at least $10 million
with them with the sort of very hands-on type of development engagement that's required for
enterprise AI to work. And yet at the same time, it's unignorable that ChatGPT
is AI to a huge portion of the world. In advance of the announcement of GPT-5, the company
shared that they were on track to reach 700 million weekly active users. As Professor Ethan Mollick pointed
out, that's 8.6% of the world's population using ChatGPT every week. That consumer base,
the fact that they are so definitively the model, the company, the application that people
think about when they think about AI, has to go into the decisions that they make about where
they're going to put their emphasis. GPT-5 to me is not a strictly better implementation of AI than the
previous OpenAI models were. They have talked endlessly about how challenging the model selector was,
but it wasn't for us power users. In fact, one of the big complaints that you see all over places
like X right now is that the power users want their menus back. They want to be able to choose between
the different models based on their query. No, instead, OpenAI has made a choice here. They've made a choice to
prioritize the needs and UX features that will most benefit the normal consumer over those of
the power user, which doesn't mean that they're not considering the power user. The people who are
paying $200 a month, like obviously I am, still can go in and select the legacy models to be
available in their menus. And I'm sure there will be other concessions to power users that
come online in the future. But it's very clear that this is a step towards the emphasis of
that base user. The 700 million in general, not the 7 million power users on the top.
The implications for the enterprise, I think, are really interesting.
On the one hand, as we'll see in just a minute, I think it opens up competitive opportunities
for others. On the flip side, there is kind of an argument that we might see a bit of a
convergence of consumer and enterprise AI usage. Basically, it wouldn't ultimately be all that
surprising to me if in practice, the simplicity and the decisions that they made for consumers
with GPT-5 actually improve its utilization in the enterprise as well. I don't think that's for sure,
but I wouldn't be surprised.
Still, like I said, I do think that one of the other things that has changed after GPT-5
is that it really reinforces that there is opportunity for the other big players, for Gemini,
for Grok and Claude.
OpenAI has had such a definitive lead since the beginning, because of the launch of ChatGPT,
because they were first to get to a GPT-4-class model, in fact, we called them GPT-4-class models,
but even at their size and scale, the decisions that they make have tradeoffs to other decisions
they don't make. When it comes to finding the right balance between an intersection of consumer and
enterprise, there are reasons to think that Google Gemini is better positioned than even OpenAI is.
And as much improved as GPT-5 is when it comes to coding, there are lots and lots of coders out there
who are not shifting their daily driver away from Claude. Beyond that, I think the continued focus
on the consumer for OpenAI opens up an opportunity for Claude and Anthropic to continue to peel
off enterprise users from OpenAI, although, as we'll discuss in a minute, there is a cost dimension
here which could make that a little bit trickier. And then for Grok, there's the simple fact
that GPT-5, as good as it is and as much of an improvement as it represents, is not some
crazy knockout blow on performance. Grok 4 Heavy beat it, for example, on a number of the
benchmarks. Tony from xAI wrote, very proud of us at xAI after seeing the GPT-5 release.
With a much smaller team, we are ahead in many ways. Grok 4's world-first unified
model and crushing GPT-5 in benchmarks like ARC-AGI.
OpenAI is a very respectful competitor and still the leader in many areas, but we're fast and
relentless.
Many new models to share in the next few weeks.
And look, for us as consumers, the fact that there are all these opportunities for the
other big labs post-GPT-5 release is awesome.
It means that we are going to get so much advancement in so many areas we're just going
to be drowning in opportunity for the foreseeable future.
Now, one interesting thing that comes out of talking about enterprise and developers and Anthropic's ability
to peel off, for example, users from OpenAI, is that this was very much not just a capability
competition moment, but a price competition moment. People were gagged by how low OpenAI priced this.
Theo writes, I've been using GPT-5 for a bit now. The model broke me, it's so good. I didn't know what the
price was. I assumed it would be o3 Pro price because it's that smart. Nope, truly insane.
Niko Christie writes, this was an attempted Anthropic killshot. Get cozier with Cursor and make pricing
10x cheaper than Opus. Excited to see how Anthropic responds. Personally, I don't mind $15 per million
inputs. Give me the frontier. And at least for now, Niko is far from alone in that. In fact,
in Menlo's mid-year LLM market update, they found very much that at this point, people are not
switching models for price. They are switching only for performance. However, workloads are going up
dramatically. We're now in the multi-agent paradigm, which we'll get into in just a minute,
and even as costs come down, the sheer number of tokens that we are going to be consuming
is likely to be going up faster.
So I'm not sure how long enterprises at least will have the privilege of not thinking about price.
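Here is a rough sketch of why cheaper tokens don't automatically mean a smaller bill. The $15 per million figure and the 10x price cut come from the quote above; the token volumes are made-up numbers purely for illustration, and the function name is hypothetical.

```python
def monthly_spend(price_per_million_tokens: float, tokens_per_month: int) -> float:
    """Total spend in dollars = price per million tokens * millions of tokens used."""
    return price_per_million_tokens * tokens_per_month / 1_000_000


# Hypothetical "before": $15 per million tokens, 2 billion tokens a month.
before = monthly_spend(15.0, 2_000_000_000)

# Hypothetical "after": price drops 10x, but multi-agent workloads use 20x the tokens.
after = monthly_spend(1.5, 40_000_000_000)

print(f"before: ${before:,.0f}, after: ${after:,.0f}")
# prints "before: $30,000, after: $60,000"
```

With these assumed numbers, a 10x price cut still ends in a bill twice as large, because usage grew faster than the price fell. The real ratio for any given enterprise will depend on its own workloads.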
Now, going back to the idea that there's so much competitive opportunity between all the labs
right now, it really reinforces that a lot of the action is going to take place at the app layer.
In a world where all the models are commoditized and very closely clustered together in terms of
capabilities, the actual product experiences that people have are going to be the big drivers
of customer devotion. Mixpanel founder Suhail has been talking about this all year. Back in January,
he shared a tweet from Cursor, where they wrote, o3-mini is out to all Cursor users. We're launching
it for free for the time being to let people get a feel for the model. The Cursor devs still prefer
Sonnet for most tasks, which surprised us. Suhail added to that, the app layer decides which
model is used and which model isn't used now. Defaults matter. A few months later, he shared comments
from Sam Altman and said app layer incoming. One, very smart models will be commoditized. Two,
build the best defining product in the space. Now, clearly OpenAI gets this. It's why they built and
released, to much fanfare, ChatGPT agent. They very clearly value owning the relationship with their
customers, rather than just being the foundation layer that everyone else builds on. Still for builders in
this space, the fact that there is going to be so much opportunity at the app layer is a really
exciting development and confirmation. Once again, for us as consumers, it means that the amount of
choice we have, the amount of people who are building things customized for our use cases,
is just likely to be incredibly, incredibly high. Last thing that changes, or at least gets amplified
in the wake of GPT-5, this kind of goes back to tool usage, this kind of goes back to the way
that agentic coding is evolving. But it's very
clear that a big phenomenon right now is not just autonomous agents doing things for us. It's
groups or parallel sets of autonomous agents doing things with a selector to determine which
output is best or to combine the results. One of the things that Beff Jezos noted, comparing
GPT-5 and Grok 4 on Humanity's Last Exam, is that GPT-5 Pro, even with tools, still did not
beat Grok 4 Heavy. Grok 4 Heavy had 44.4% as opposed to GPT-5 Pro with tools at 42%.
Beff pointed out, however, that given that it's a single agent rather than a
swarm of agents, that's very impressive.
Now, I actually am not totally sure that GPT-5 Pro for that Humanity's Last Exam result
isn't a similar type of structure.
In their research blog, OpenAI writes, for the most challenging complex tasks
we're also releasing GPT-5 Pro, a variant of GPT-5 that thinks
even longer using scaled but efficient parallel test-time compute.
I would imagine that parallel test-time compute involves a process similar to Grok 4 Heavy, where they spin up and deploy multiple
agents to do that work in parallel. And increasingly, this is just going to be the norm.
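To make the "parallel agents plus a selector" idea concrete, here is a minimal sketch. It is not how Grok 4 Heavy or GPT-5 Pro actually work, since neither lab's internals are described here. The `run_agent` function is a hypothetical stand-in that returns canned answers instead of calling a model, and majority vote is just one simple choice of selector.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_agent(agent_id: int, question: str) -> str:
    """Hypothetical stand-in for one agent; a real version would call a model."""
    canned_answers = ["42", "42", "41", "42"]  # made-up outputs for illustration
    return canned_answers[agent_id % len(canned_answers)]


def select_by_majority(answers: list[str]) -> str:
    """Selector: pick the answer that the most agents agreed on."""
    return Counter(answers).most_common(1)[0][0]


def parallel_answer(question: str, n_agents: int = 4) -> str:
    """Run n_agents on the same question in parallel, then apply the selector."""
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        answers = list(pool.map(lambda i: run_agent(i, question), range(n_agents)))
    return select_by_majority(answers)


print(parallel_answer("What is six times seven?"))  # prints "42"
```

The selector is the interesting design choice. Majority vote works when answers are short and comparable; for longer outputs like code or copy, you would more likely need a judge step that ranks or merges the candidates.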
You're starting to see it with coding, with all of these IDEs building in tooling where you can
spin up multiple agents at once. And I think you can view that as a leading indicator of where
everything else is going to go. Right now, you're not using multiple agents at the same time
to write social media copy, but that's only because the interfaces that you have access to
aren't suggesting that you do. I think this is going to be one of the areas where we next saturate
and see how much power we can pull out of it.
And I think that that process starts right now.
So friends, those are 10 things that I can see changing in the wake of GPT-5.
Let me know what you think about these.
Which are the most significant?
Are there any you disagree with?
Are there any that are obvious that I've missed here?
Excited to begin this conversation and excited to really see what this model can do.
But for now, that is going to do it for today's AI Daily Brief.
I appreciate you listening or watching as always.
And until next time, peace.
