The AI Daily Brief: Artificial Intelligence News and Analysis - The AI Model Wars Just Heated WAY Up
Episode Date: August 2, 2025
Today’s AI Daily Brief dives into the escalating model wars between OpenAI, Google, and Apple. OpenAI seems to have leaked GPT-5 and their open weights model temporarily, plus the surprise launch of... Google’s Gemini 2.5 Deep Think, and why Apple is scrambling to catch up—with M&A as its only viable AI strategy. We also explore new AI interface innovations from Manus and Perplexity, plus the implications of China’s probe into Nvidia’s H20 chips.
Ask GPT about our Agent Readiness Audits - https://bit.ly/supersuperagent
Brought to you by:
KPMG – Go to https://kpmg.com/ai to learn more about how KPMG can help you drive value with our AI solutions.
Blitzy.com - Go to https://blitzy.com/ to build enterprise software in days, not months
AGNTCY - The AGNTCY is an open-source collective dedicated to building the Internet of Agents, enabling AI agents to communicate and collaborate seamlessly across frameworks. Join a community of engineers focused on high-quality multi-agent software and support the initiative at agntcy.org
Vanta - Simplify compliance - https://vanta.com/nlw
Plumb - The automation platform for AI experts and consultants https://useplumb.com/
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.
The AI Daily Brief helps you understand the most important news and discussions in AI.
Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Subscribe to the newsletter: https://aidailybrief.beehiiv.com/
Join our Discord: https://bit.ly/aibreakdown
Interested in sponsoring the show? nlw@breakdown.network
Transcript
Today on the AI Daily Brief, the latest in the very much heating up AI model wars.
Before that in the headlines, can Apple actually acquire its way out of AI failure?
The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.
All right, friends, quick announcements for this Friday.
First of all, thank you to today's sponsors, Blitzy, Vanta, Plumb, and Superintelligent.
To get an ad-free version of the show, go to patreon.com slash AI Daily Brief.
And if you're interested in sponsoring the show, hit me at NLW at Breakdown.network.
But with that, let's dive into some Wall Street dealmaking and model intrigue.
Welcome back to the AI Daily Brief Headlines Edition, all the daily AI news you need in around five minutes.
We have a very deal-oriented headlines today, kicking off with Apple, who were the latest big tech company to do their quarterly earnings.
TLDR on this one is that while the company is seeing a big rebound in iPhone sales, their failure on AI is absolutely weighing
them down. Big Tech earnings have been telling a very clear story around AI. The companies that are
betting big like Microsoft and Meta have seen blowout results and massive increases in their stock
prices. Indeed, despite the fact that some analysts remain concerned about this CapEx spending,
as the Wall Street Journal put it, Big Tech's $400 billion AI spending spree just got Wall Street's
blessing. The sort of results that Microsoft and Meta are putting up are, in other words,
justifying even in the short term those costs to Wall Street investors.
And then there's Apple. Strange, strange, strange Apple. On the one hand, the numbers were actually
strong. iPhone sales were up 14% to reach 44.6 billion for the quarter, which was 10% above analyst
forecasts. Top line revenue was up 9.6%, which wasn't nearly as impressive as the cloud giants,
but far from a disaster. Guidance was also strong, with Apple predicting revenue growth in the mid-to-high
single digits for the next quarter, which is well above the previous 3% analyst forecast.
Overall, this was Apple's strongest quarter of revenue growth since December 2021.
In any other era, this would have been a blowout quarter that defied negative views from analysts,
but the market only gave Apple a 2% boost in after-hours trading.
And honestly, if you have to have one takeaway from mega-cap tech earnings this week,
it's that the market does not care how many iPhones you ship,
only how many AI tokens you're serving.
Apple is now squarely in catch-up mode and did, in fact,
articulate something of a plan to investors.
During the earnings call, CEO Tim Cook said,
we see AI as one of the most profound technologies of our lifetime.
We are embedding it across our devices and platforms and across the company.
We are also significantly growing our investments.
Apple has always been about taking the most advanced technologies
and making them easy to use and accessible for everyone,
and that's at the heart of our AI strategy.
Now, of course, this is what we all thought made sense last year
when they announced Apple Intelligence,
but their ability to deliver on that idea has been woefully lacking.
Cook did go on to say that they were, quote,
reallocating a fair number of people to focus on AI features and that they have a, quote,
great, great team and we're putting all of our energy behind it.
A CNBC interview ahead of earnings had focused on mergers and acquisitions as Apple's potential
AI solution. Cook said that Apple would significantly grow and is, quote, open to M&A that
accelerates our roadmap. He touted seven acquisitions made so far this year, but also acknowledged
that none of them were huge in terms of a dollar amount. Later on the earnings call, Apple also
said that they were currently making acquisitions at the rate of one every several weeks,
so clearly they are leaning into this narrative that maybe they can buy their way out of this.
Now, if that is the strategy that they approach, it will represent a big shift for Apple,
which has historically been reluctant to acquire their way to success.
For many observers, including me, it's felt like this was their only path.
Back when Apple Intelligence really started to sputter painfully between Q1 and Q2,
I and many others suggested that Apple should go out and try to buy one of the foundation model companies,
Mistral, Perplexity, or my top choice at the time, Anthropic.
However, since then, the problem is that if that is the strategy they want to pursue, valuations
are now running away from Apple. Anthropic was already a stretch in March, but with its valuation
rumored to have reached $170 billion, it's now basically unobtainable. Even Mistral and
Perplexity have moved from single-digit billions to the mid-teens. What's more, it'd be
surprising if any of the leading AI startups are even really open to an acquisition partner.
I don't want to overstate that case. You never know what's going on behind the scenes. But the point is
that the longer that Apple waits, the harder it gets.
The people who watch Apple most closely just aren't buying this.
Bloomberg's Apple watcher Mark Gurman writes,
to be sure, Cook has said several times that Apple is unafraid of big deals,
and they've still never done one.
While I do think AI changes that,
Apple's pace of M&A has only slowed down dramatically in recent years,
and his comments today aren't a new line of thinking.
Frankly, to me, it sounds kind of like Tim Cook is trying to ride it out for a few years.
Honestly, he kind of reminds me of Jay Powell at FOMC pressers
when he does nothing and says we're just going to have to wait
and see. When an analyst asked Cook whether he thinks AI models will become commoditized,
Cook declined to answer, adding, that gives away some things on our strategy. But honestly,
even if they are playing a wait and see strategy or they have some conviction around when things
get commoditized, they are paying real costs in the meantime. Peter Anderson of Anderson Capital
Management said, Apple's embarrassing AI shows how it has lost its mojo with innovation. And the lack of
innovation speaks to the lack of revenue growth. And that speaks to why we don't see upside in the stock.
Says Glenview Trust Chief Investment Officer Bill Stone, it's hard to be excited about Apple when you can go look to the other Magnificent Seven stocks and find double-digit growth that should continue for a while amid the AI wave, especially since those are often cheaper. Apple would be a lot more interesting if the multiple was lower, but what finally gets growth going again is the biggest question.
Now, moving over to the private market, we have some further reporting on the rumored OpenAI deal. According to New York Times sources, OpenAI has raised $8.3 billion at a $300 billion valuation. The round
was completed months ahead of schedule and was five times oversubscribed.
ARR is not the 12 billion that was reported earlier this week, but in fact 13 billion.
That's up from 10 billion in June and projected to surpass 20 billion by the end of year.
Importantly, given how much conversation there is around Anthropic starting to eat their lunch in this area,
the number of business users who pay for ChatGPT has jumped from 3 million a few months ago to 5 million now.
Meanwhile, Anthropic's revenue number does seem to be 5 billion, and growing by my calculations
at maybe more than a billion a month.
Moving over to the geopolitical side of things,
Chinese authorities have summoned NVIDIA
to discuss alleged security risks with their H20 chips.
The investigations are beginning before the first shipments can even land
following the U.S. reversing their ban on H20 exports.
Yesterday, the Cyberspace Administration of China
called company representatives to discuss what they call serious security vulnerabilities.
The CAC wrote,
U.S. lawmakers have previously called for advanced chips exported from the U.S.
to be equipped with location tracking features.
The location tracking and remote shutdown capabilities on Nvidia computing chips are ready, according to U.S. AI experts.
The regulator is demanding that Nvidia release documentation of this and explain the potential
loopholes and backdoor capabilities of the H20.
Now, the notice has some pretty obvious echoes to the Huawei ban of 2019.
US officials declared the company's networking equipment to be an unacceptable national security risk
and placed Huawei on an import blacklist.
And while there's good reason to believe their claims, the ban had the second order
effect of stopping the Chinese hardware giant in its tracks.
It ensured Western competitors like Cisco and Juniper continued to have a place in the market.
It's entirely possible that Chinese officials are attempting to do the same thing for Huawei in their
domestic AI market. In the six or so months where H20 chips have been banned,
Huawei have made large investments in building their own competitive chips.
In fact, a large part of the rationale for unbanning the H20s was to ensure that Huawei can't
establish a foothold in their domestic market due to a lack of competition.
Forrester analyst Charlie Dai said,
the CAC's scrutiny over H20 security risks could further erode NVIDIA's Chinese market share
amid rising domestic competition. It also aligns with China's broader push to accelerate domestic
semiconductor alternatives for technological self-reliance amid U.S. export controls.
NVIDIA, for its part, denied the allegations saying,
cybersecurity is critically important to us.
NVIDIA does not have backdoors in our chips that would give anyone a remote way to access
or control them.
Lastly today, a new user milestone: GitHub Copilot has crossed 20 million users.
Microsoft's CEO, Satya Nadella, slipped the comment into Wednesday night's earnings call,
although it wasn't immediately clear whether this was weekly or monthly active users or some other metric at the time.
TechCrunch has now confirmed that Nadella was quoting all-time users.
And while obviously a recurring number would give a better picture of what current usage is actually like,
it's still pretty impressive for Microsoft.
It means that 5 million users have tried out GitHub Copilot for the first time over the past three months,
which is a very big ramp up.
By way of comparison, although we haven't heard from Cursor for a while,
back in March, they had a million daily active users.
We can only assume that a lot of these GitHub Copilot users aren't coming back day after
day, but the number still demonstrates just how much of a distribution advantage Microsoft
has over its startup rivals.
For now, though, that is going to do it for today's AI Daily Brief Headlines edition.
Next up, the main episode.
This episode is brought to you by Blitzy, the Enterprise Autonomous Software Development Platform
with infinite code context.
Blitzy uses thousands of specialized AI agents that think for hours to understand
enterprise-scale code bases with millions of lines of code.
Enterprise engineering leaders start every development sprint with the Blitzy platform,
bringing in their development requirements.
The Blitzy platform provides a plan, then generates and pre-compiles code for each task.
Blitzy delivers 80% plus of the development work autonomously,
while providing a guide for the final 20% of human development work required to complete
the sprint.
Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy
as their pre-IDE development tool,
pairing it with their coding co-pilot of choice to bring an AI-native
SDLC into their org.
Blitzy is providing a limited time,
30-day free proof of concept
for qualifying enterprises.
The team will provide a 5x velocity increase
on a real development project in your org.
Visit blitzy.com and press book demo
to learn how Blitzy transforms your SDLC
from AI-assisted to AI-native.
That's BLITZY.com.
As a founder, you're moving fast
towards product market fit,
your next round, or your first big enterprise deal.
But with AI accelerating how quickly startups
build and ship, security expectations are higher earlier than ever. Getting security and compliance
right can unlock growth or stall it if you wait too long. With deep integrations and automated
workflows built for fast-moving teams, Vanta gets you audit-ready fast and keeps you secure
with continuous monitoring as your models, infra, and customers evolve. Fast-growing customers
like LangChain, Writer, and Cursor trusted Vanta to build a scalable foundation from the start.
And look, as someone who lives in the world of enterprise procurement, I love how Vanta makes it
easy to get compliance right. The last thing you need when you're trying to win that big deal is to
have it scuttled by something that Vanta has solved for over 10,000 companies. Go to Vanta.com
slash NLW to save $1,000 today through the Vanta for Startups program and join over 10,000 ambitious
companies already scaling with Vanta. That's VANTA.com slash NLW to save $1,000 for a limited time.
Today's episode is brought to you by Plumb. Are you building with AI? Plumb noticed that every
technical creator tends to hit the same wall. You've got AI workflows people want, but monetizing
them feels impossible because client work doesn't scale. Selling copies gives away your IP. And building
your own platform, that's becoming a software company. It's a hard gap to bridge, and that's why they
built Plumb. Plumb helps creators build an audience of paid subscribers for their AI workflows,
all on a single platform. Think substack for automations. There's no need to build extra infrastructure
just to get paid for your expertise.
Plumb handles that so creators can do what they do best,
solving problems with AI.
Ready to turn your expertise into passive income?
Visit useplumb.com, that's Plumb with a B.
If you are a regular listener,
you will have heard about Superintelligent's Agent Readiness Audits at this point,
but I wanted to tell you today about the full suite of agent readiness products
that go beyond just the initial readiness report.
Over the last six months, Superintelligent has built out an entire agent planning
suite. We help you move from discovery to planning to implementation. After you've completed your
agent readiness audits, we help you double-click on your most important use cases with what we call
our use case planning reports. These reports are going to help you understand what sort of technical
preparation you need to do to be ready for a use case, what challenges you might face in implementation,
and whether you should be thinking about building, buying, partnering, or some combination. After that,
you can even get a spec document in what we call our technical blueprint that gives either your developers
or the developers of the partner you work with,
what they need to build exactly the agent that you're looking for.
If you want to learn more about Superintelligent's agent planning suite,
we've built a custom GPT to answer your questions.
Just go to bit.ly slash super agent.
That's bit.ly slash super agent, all one word.
And if you have any questions,
the agent can even help you book an appointment with our team.
Welcome back to the AI Daily Brief.
This is one of those shows where there is a better-than-I-like chance
that by the time you're actually listening to it,
a huge amount of the information has changed
because something has just been released.
But given that, I'm going to try to get it out as soon as possible.
And I think even if some new models have been released in the meantime,
the broader point that model competition is heating up significantly right now
is going to remain.
So to dig into this, obviously, if you are paying attention to AI right now,
the big thing that everyone is waiting for is GPT-5.
It feels like we are a matter of days away from this.
We've seen more and more examples of what
people think is GPT-5 in the wild. We had a bunch of what we thought were test models that were taken
off the testing arena suggesting that it was getting ready for release. And then just at A,
we got GPT-5 for a few minutes only to see it get removed. Basically, very briefly on Hugging Face,
a new reported OpenAI model called GPT-5 New Proxy API EV3 popped up only to be withdrawn very,
very quickly thereafter. In addition, we got what looked like OpenAI's open-source model.
In fact, it looked like we got two versions of it, GPT-OSS-120B and GPT-OSS-20B, so potentially
a 120-billion-parameter version and a 20-billion-parameter version.
Chedislua writes, the repo only provided three bits of info, D-types, config.json, and the
weights.
Now, people dug in very quickly to the limited information we had, with a lot of people assuming
that this was actually an intentional leak to build up hype.
People are speculating about the architecture, thinking that it's a mixture of
experts model, and other people are just starting to get hyped. Vracer X writes, Elon says
OpenAI betrayed their mission, but meanwhile, OpenAI just leaked o3-level open source models and deleted
them. Too late, 120 billion parameters, MoE with four experts, runs on a single H100, 130K
context, RoPE-scaled, blazing fast, multilingual code-native FP4 trained, no API, no gatekeeping,
just raw weights. They write, this might be the biggest open source moment since DeepSeek.
Mr. Who, matter what? If this is real, the monopoly just cracked.
Now, a lot of the dissection so far is pretty technical. It obviously doesn't have anything to do
with the sort of use case discussion that is the bread and butter of this show. But the point is
that it appears that in addition to GPT-5 coming very soon, we are very much on the verge of getting
their open weights model as well. And at least from the developer community, there is easily as much
excitement about that. Although, to be fair, there is also some amount of lingering skepticism.
Nathan Lambert writes, I welcome all contributions to the open model
ecosystem, but seriously doubt we can rely on OpenAI to be a long-term champion we need in releasing
more models. Happy to be proven wrong, we'll see what the next couple days bring. Now, if a lot of
the excitement is about the theoretical OpenAI models coming soon, Google swooped in with something
that I don't think people really expected. You'll remember that recently we got both OpenAI and
Google achieving the equivalent of a gold medal on the International Math Olympiad with state-of-the-art
versions of their models. While Sam Altman made it clear that exactly that version wouldn't
be coming to consumers anytime soon, Google just dropped their version of that model.
It's called Gemini 2.5 Deep Think.
CEO Sundar Pichai writes,
We're bringing a version of Deep Think that achieved gold medal status at IMO to Ultra
subscribers in the Gemini app, and the official version is now in the hands of mathematicians.
So, editor's note here, it sounds like this is close enough for them to claim it's the same thing,
but there are some differences apparently with the version that some number of mathematicians
have.
Anyways, Pichai continues, toggle it on when reasoning through complex scientific literature,
tackling a coding problem that requires careful consideration of time complexities,
or anything else Demis Hassabis considers a fun Friday night.
Putting my branding hat on for a second, being geeky and playful is a good fit for Google.
They should do more of this.
In their blog post, they talk a little bit more about how Deep Think works.
Basically, they say that it extends Gemini's parallel thinking time.
Just as people tackle complex problems by taking the time to explore different angles,
weigh potential solutions, and refine a final answer, Deep Think pushes the frontier of thinking
capabilities by using parallel thinking techniques. This approach lets Gemini generate many ideas at once
and consider them simultaneously, even revising or combining different ideas over time before
arriving at the best answer. By extending the inference time or thinking time, we give Gemini
more time to explore different hypotheses and arrive at creative solutions to complex problems.
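The parallel thinking idea described here can be illustrated with a simple best-of-N pattern: run several candidate approaches at once, score each result, and keep the best. The sketch below is only an illustration under stated assumptions, not Google's method; `propose`, `score`, and `parallel_think` are hypothetical names, and the toy task (guessing an integer square root) stands in for a real model call. It also only selects among candidates, whereas the blog post describes revising and combining ideas as well.

```python
from concurrent.futures import ThreadPoolExecutor


def propose(problem: int, strategy: int) -> int:
    """Hypothetical stand-in for one reasoning path: guess a square root by dividing."""
    return problem // strategy


def score(problem: int, candidate: int) -> int:
    """Higher is better: how close candidate squared comes to the problem."""
    return -abs(candidate * candidate - problem)


def parallel_think(problem: int, strategies) -> int:
    """Explore every strategy concurrently, then return the best-scoring candidate."""
    with ThreadPoolExecutor() as pool:
        candidates = list(pool.map(lambda s: propose(problem, s), strategies))
    return max(candidates, key=lambda c: score(problem, c))


if __name__ == "__main__":
    # Twenty hypothetical strategies explored at once; the best candidate wins.
    print(parallel_think(144, range(1, 21)))
```

Giving the function more strategies to explore is the toy analogue of extending thinking time: more paths are tried before an answer is chosen.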
We've also developed novel reinforcement learning techniques that encouraged the model to make use
of these extended reasoning paths, thus enabling Deep Think to become a better, more intuitive
problem solver over time. Now, the principle of this makes total sense to me, right? It's not
dissimilar to the core idea underlying my Dr. Strange theory, which is basically that when
you have this much intelligence, one of the really interesting ways to deploy it is to use a bunch
of different scenarios, basically to answer a question in a bunch of different ways, and then see
which one seems to be best for whatever set of criteria. So what are the types of use cases that
this approach opens up? One, says Google, is iterative design and development. They write,
we've been impressed by Deep Think's performance on tasks that require building something complex
piece by piece. For example, we've observed Deep Think can improve both the aesthetics and functionality
of web development tasks. The example prompt they give is design and create a very creative,
elaborate and detailed voxel art scene of a pagoda in a beautiful garden with trees, including some
cherry blossoms. Make the scene impressive and varied and use colorful voxels. Use whatever libraries
to get this done, but make sure I can paste it all into
a single HTML file and open it in Chrome. It shared the outputs of Gemini 2.5 Flash, Gemini
2.5 Pro, and Gemini 2.5 Deep Think. Other areas they highlight for Deep Think are scientific and mathematical
discovery, basically saying it's a powerful tool for researchers, as well as algorithmic
development and code. Specifically, they say that it excels at tough coding problems in which
formulation and careful consideration of tradeoffs and time complexity is paramount.
Now, this being a new model release, of course, there had to be some benchmarks, and according
to Google at least, Deep Think
absolutely crushes them. On Humanity's Last Exam, it meaningfully exceeds Gemini, OpenAI, and Grok
4, which you'll remember just about five minutes ago, was the model that everyone was talking about.
On LiveCodeBench, it also sees a major jump up. And then, of course, in mathematics, on both the
IMO 2025 and the AIME 2025, there is a major step change with this model. Now, so far, not too many
people have popped up sharing their early access experiments with Deep Think, so we're going
to have to wait a couple days to see what people do with it when they get their hands
on it. That said, the people who have had it are reporting pretty favorable first impressions.
Writes Professor Ethan Mollick: Very good model, big gains over standard Gemini 2.5 Pro for a lot of
problems. One of the examples he gives is the Starship Control Panel Prompt that he tries
with every model. We recently shared a version that he thought might be GPT-5, which was very
impressive relative to the previous competitors, but he pointed out that this is the first time
he's seen a model make a 3D interface in response. By the way, if you were watching this, this is the one
that we previously shared. He wrote, the mystery model Summit with the prompt,
Create something I can paste into p5.js that will startle me with its cleverness in creating
something that evokes the control panel of a starship in the distant future.
2351 lines of code first time. He also shared his otter on a plane using Wi-Fi
and draw a unicorn with TikZ, which as he says is a language built for scientific diagrams
and very much not for drawing. Now right now, Deep Think is only available for Ultra
subscribers, and over the next couple of days I'll be keeping an eye out for people who are
experimenting, and frankly, if I don't see enough of it, I'll just dig in there myself.
Now, it's not just raw, state-of-the-art frontier models that are interesting right now.
There's a lot of interface development happening around the models as well.
Manus, for example, on Thursday introduced what they're calling Wide Research, as opposed to deep
research, get it?
They write, earlier this year, the launch of Manus defined the category of general AI agents
and shaped how people think about what agent products can and should be.
But to us, Manus was never just an AI. It has always been a one-of-a-kind personal cloud computing platform.
Traditionally, harnessing cloud compute for custom workflows has been a privilege reserved for
engineers and power users. At Manus, we believe AI can democratize that power. Behind every
Manus session runs a dedicated cloud-based virtual machine, allowing users to orchestrate complex cloud
workloads simply by talking to an agent. From generating tailored rental presentations to safely
evaluating cutting-edge open-source projects, the Turing completeness of the virtual machine is what
gives Manus its generality, and opens the door to endless creative possibilities.
Naturally, we've been asking ourselves, how can we scale the compute available to each user
by 100x, and what new possibilities emerge when anyone can control a supercomputing cluster
just by chatting? After months of optimization, our large-scale virtualization infrastructure
and highly efficient agent architecture have made this vision a reality. Today, we're introducing
the first feature built on top of this foundation, Manus Wide Research. They say that basically
Manus is a way to go after complex, large-scale tasks that might require information on, for example,
hundreds of items. So, for example, if you wanted to understand recent earnings results across all
of the Fortune 500, that's the sort of breadth that wide research is designed for. What the wide
really conveys, and this goes back a little bit to the architecture conversation we were just having
around Deep Think as well, is that this is a system for parallel processing, and those are the
words that they use. In fact, they write, at its core, wide research is a system-level mechanism for
parallel processing and a protocol for agent-to-agent collaboration. The key to wide research they write
isn't just having more agents, it's how they collaborate. Unlike traditional multi-agent systems
based on predefined roles, every sub-agent in Wide Research is a fully capable general-purpose
Manus instance. Which is super interesting. Basically, this means that tasks are not bound by a predetermined
architecture of agents, but the model can theoretically figure that out. And once again, people are only just
starting to get access to wide research, but people are into exploring this approach of breaking
research into subtasks and seeing agents work on them in parallel. One more platform that I view
in this model battle that is, again, more of an interface or an approach update than it is just a
model improvement, is Perplexity's Comet Browser. Now, this is something I've talked about a fair
bit on this show, but I feel like every day that goes on, I'm seeing more and more people
rave about Perplexity Comet and desperately try to get an invite to the system. Take, for example,
Tobi Lütke, the CEO of Shopify, who wrote,
I'm constantly impressed with Perplexity Comet.
Amazing to give it a complex task and watch it claim a tab and toil away at it.
Browsers are interesting again.
People have been sharing their use cases like cleaning up subscriptions,
consolidating information, and more.
I think people are definitely seeing in this,
not just a new type of browser, but the first step to something bigger.
Arvind, and not the Aravind Srinivas who is the CEO of Perplexity, writes,
Comet Browser is getting really innovative.
Now we can easily automate mundane things we do
on the net. I see it as an early precursor to an AI OS. I'm worried Zuck and his Meta who suck all our data
is going to get there first. I hope not. I root for Perplexity winning it. Mostly, if you are an
enfranchised AI user, what a time to be alive. We've got exciting rumors coming every day,
actually super powerful new models to use, agentic systems in practice for the first time,
and whole new interfaces to explore. I highly recommend taking some time this weekend to go check out
some of these new tools, although like I said, it's getting a little pricey given that all of them
are coming to their version of an ultra subscription first. But still, what a time to be alive,
excited to see what we all get to build next. For now, that is going to do it for today's AI Daily Brief.
Appreciate you listening as always. And until next time, peace.
