The AI Daily Brief: Artificial Intelligence News and Analysis - What People Are Actually Using AI For Right Now
Episode Date: December 8, 2025
Today’s episode breaks down a massive new empirical study from OpenRouter and a16z that analyzed more than 100 trillion real-world tokens to reveal what developers and power users are actually doing with AI right now, from the surge in reasoning models to the dominance of coding workloads to the unexpected rise of roleplay in open-source systems. The discussion explores how the shift toward long-context programming tasks, tool-use invocation, and hybrid stacks of closed and open models is reshaping the practical AI landscape and what patterns matter most heading into 2026. Headlines include fresh rumors around GPT-5.2, OpenAI’s UX cleanup efforts, and the latest shake-ups at Apple and Meta.
Brought to you by:
KPMG – Discover how AI is transforming possibility into reality. Tune into the new KPMG 'You Can with AI' podcast and unlock insights that will inform smarter decisions inside your enterprise. Listen now and start shaping your future with every episode. https://www.kpmg.us/AIpodcasts
Gemini - Build anything with Gemini 3 Pro in Google AI Studio - http://ai.studio/build
Rovo - Unleash the potential of your team with AI-powered Search, Chat and Agents - https://rovo.com/
AssemblyAI - The best way to build Voice AI apps - https://www.assemblyai.com/brief
LandfallIP - AI to Navigate the Patent Process - https://landfallip.com/
Blitzy.com - Go to https://blitzy.com/ to build enterprise software in days, not months
Robots & Pencils - Cloud-native AI solutions that power results https://robotsandpencils.com/
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.
The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Interested in sponsoring the show? sponsors@aidailybrief.ai
Transcript
This podcast is sponsored by Google.
Hey folks, I'm Amar, product and design lead at Google DeepMind.
Have you ever wanted to build an app for yourself, your friends,
or finally launch that side project you've been dreaming about?
Now you can bring any idea to life, no coding background required,
with Gemini 3 in Google AI Studio.
It's called vibe coding and we're making it dead simple.
Just describe your app and Gemini will wire up the right models for you
so you can focus on your creative vision.
Head to AI.studio slash build to create your first app.
Today on the AI Daily Brief, what 100 trillion tokens tell us about real-world AI usage.
And before that, in the headlines, could we be getting GPT-5.2 this week?
The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.
All right, friends, quick announcements before we dive in.
First of all, thank you to today's sponsors, Gemini, Robots & Pencils, Blitzy, Rovo, and Superintelligent.
To get an ad-free version of the show, go to patreon.com slash AI Daily Brief, or you can subscribe on Apple Podcasts.
In either case, it's just $3 a month for ad-free.
And lastly, if you are interested in sponsoring the show, send us a note at sponsors@aidailybrief.ai.
Welcome back to the AI Daily Brief Headlines edition, all the daily AI news you need in around
five minutes.
And of course, we are kicking off the day with the recap of the weekend's rumors around
OpenAI's Code Red response to Google.
It appears that the first drop of Code Red will be GPT-5.2.
The Verge's Tom Warren is of the understanding from his sources that GPT-5.2 is
earmarked for release on Tuesday. The release date is, of course, still subject to change due to anything
from server capacity issues to leaks from rival labs. And interestingly, Warren's sources said that the model
was originally slated for this month. So even before Code Red, it was going to come sometime in
December, but it is being fast-tracked because of the pressure of Gemini 3. And as if OpenAI
weren't dealing with enough from the pressure from Gemini 3 and skepticism in the markets,
new data from Sensor Tower also suggests that ChatGPT user growth has slowed down.
According to Sensor Tower, only 7 million new monthly active users were added last month,
that compares to 40 to 60 million being added per month over the summer.
What's more, growth was just 6% between August and November.
Bloomberg also reported that investors are backing companies tied to Google's AI ecosystem
and turning away from bets linked to OpenAI.
Before the release of Gemini 3, their basket of OpenAI-exposed public stocks was up 125%.
That's now down to 74% since Gemini 3
was released. The basket that was exposed to Google was at around 110% year-to-date when Gemini 3
was released and has now surged to 146%. There is also even some chatter that OpenAI stock
has fallen marginally in private markets, although this one I think we need to have even a little
bit more skepticism around, as the signal is really hard to tell in these non-public markets.
Regardless, altogether, the stakes are very clearly high for the next iteration of ChatGPT,
but the buzz is that the model could live up to the hype. On December 6th, Matt Shumer tweeted,
the model landscape is about to be shaken up again.
Reporting from last week suggested that GPT-5.2
was ahead of Gemini 3 on internal testing,
and on Friday, model leaker and suspected insider,
I rule the world, posted a fairly clearly fake benchmark card that went viral.
Now, on the one hand, I think most people assumed
that this was a Nano Banana creation,
but still it seems to me like the general sentiment
is to think that OpenAI might be right back in this
after their next model drop.
The betting markets are also going haywire.
On Polymarket on Friday, in the market for which company would have the best AI model by the end of 2025,
Google was at 87% while OpenAI was at just 10.5%.
Keep in mind that 10.5% was already a fairly big jump from where it had been just a couple days earlier.
Over the weekend, OpenAI jumped to 25%, although they've now fallen slightly back to 18%.
In the coding-specific market, however, OpenAI completely flipped things at the end of last week,
going from 12.4% to Anthropic's 85% on December 5th to now sitting at 75% compared to Anthropic's 19%
as of this morning, December 8th, when I'm recording.
AI Breakfast wrote, the insiders know, and sure enough, it appears that users who exclusively bet on OpenAI-related markets are loading up in anticipation of the GPT-5.2 release.
Still, as much as people may be focused on the new models, efforts to improve the user experience could end up being even more impactful.
The Verge again reports that the focus will shift away from, quote, flashy new features and towards
improving the chatbot's speed, reliability, and customizability. And certainly, it's not hard
to find evidence for the need for that as well. Also, over the last week, we've seen a number of tweets
like this one, with users showing links to integrated apps for Target, Spotify, and Peloton,
in response to completely unrelated queries. And initially in response, OpenAI went with the strategy of
saying, actually, these aren't ads. Head of ChatGPT Nick Turley wrote, I'm seeing lots of confusion
about ads rumors in ChatGPT.
There are no live tests for ads.
Any screenshots you've seen are either not real or not ads.
If we do pursue ads, we'll take a thoughtful approach.
People trust ChatGPT, and anything we do will be designed to respect that.
Unfortunately for them, a lot of people felt like Benjamin De Kraker, who wrote,
It's not an ad if we just keep repeating that it's not an ad.
He shared an image of a recommendation to connect to Target to shop for home and groceries
on a conversation that seems like it was about a computer issue
and said, you guys literally announced a partnership with Target right before this.
You're handling this very badly and people are noticing.
A few hours later, Chief Research Officer Mark Chen took what I think was probably the better tack
and acknowledged that being told to shop at Target in every session feels a lot like advertising,
even if it isn't an ad unit that OpenAI specifically sold.
Chen wrote,
I agree that anything that feels like an ad needs to be handled with care and we fell short.
We've turned off this kind of suggestion while we improve the model's precision.
We're also looking at better controls so you can dial this down or off if you don't find it helpful.
Benjamin De Kraker, whose post I was just mentioning, responded, thank you for taking this
seriously, Mark.
Point of all this is, OpenAI clearly has a lot of work ahead of it, but also, there is lots
of excitement about how they might respond.
Buco Capital summed it up, OpenAI's Code Red is bullish, not bearish.
It's an admission that they were overextended, getting beat, and needed to focus.
That's what great teams do.
All eyes on how they execute Code Red.
And so we'll just quickly go through a couple of other headlines before we move over into
today's main episode.
The first is another big thing that people are talking about, which is more departures from Apple.
Last week, we learned that senior VP of machine learning and AI strategy, i.e. their head of
AI, John Giannandrea, would be leaving the company. A few days later, Meta secured the services
of Alan Dye, Apple's head of UX design. By the end of the week, Apple announced that their general
counsel and head of government affairs would also be moving on. Combined with over a dozen
departures from Apple's AI team, you're talking about a major loss of talent in Cupertino.
Now, Bloomberg's Apple correspondent Mark Gurman reports that senior VP
of hardware technologies Johny Srouji is considering leaving in the near future.
Gurman says that Srouji, who he considers to be one of Apple's most respected executives,
recently discussed leaving the company with CEO Tim Cook.
And while the other departures kind of felt necessary, particularly around Giannandrea,
for this one, it is hard to find a silver lining.
Srouji, as Gurman writes, was the architect of Apple's prized in-house chip efforts.
And frankly, Apple's M-Series chips have been one of the few unambiguous bright spots for the company
over recent years.
Twitter user Nicholas wrote,
Srouji has had AI-capable chips
in hundreds of millions of devices for years
and Apple's software teams
still haven't put them to use
outside the camera app.
I imagine he wants to build chips
relevant to AI today.
Now, Gurman wrote that, unlike with the other executives,
Tim Cook has apparently been working
aggressively to retain Srouji,
an effort that he said
included offering a substantial pay package
as well as the potential
of more responsibility down the road.
One scenario floated internally
by some execs involved
elevating him to the role
of chief technology officer.
Basically, things just continue to be a mess over there, and it still feels very much like we're in the
part before they get things straight.
Lastly, today, a couple of Meta stories.
The first is that they have acquired an AI device startup called Limitless to further their
wearable strategy, or perhaps to cut off the wearable strategy for others.
Limitless was a part of the wave of AI wearables that launched last year.
Their device was a small pendant that recorded the user's conversations throughout the day
and delivered an AI-generated summary.
Now, that segment, of course, so far has fallen flat, and multiple companies
have now been acquired for their talent, leaving their devices to fall by the wayside.
Here again, the Limitless Pendant will no longer be sold, although the device will still be supported
for at least the next year. Subscriptions will be canceled, and existing device owners will have
access to the unlimited plan for free. Other services, including their Rewind software that
records desktop activity and meetings will be sunsetted immediately. Now, Meta doesn't seem to be
acquiring Limitless for their hardware. Instead, the team will join Reality Labs, which produces
the Meta Ray-Bans and other AI-enabled smart glasses. People are trying to figure out the signal in this
one. Is the story Meta stocking up on talent in the wearable space because of their high conviction
and their lead there? Is it them trying to cut off talent to competitors because of their lead there?
Not totally clear. And what's more, when it comes to AI wearables, that is a category that continues
to be in the, let's call it, pre-product-market-fit stage. Lastly today, Meta's chatbot will now
provide up-to-date news content under multiple new media deals. On Friday, Meta announced deals with
CNN, Fox News, USA Today, People Inc., and more. Meta said the deals would, quote, improve Meta AI's
ability to deliver timely and relevant content and information with a wide variety of viewpoints
and content types. One of the stories that has been muted in 2025, relative to where I think
people thought it was going to be, is the story of AI platforms versus copyright holders, but I
imagine we'll get a lot more of that in 2026. Indeed, with Perplexity facing a pair of new
lawsuits from the Chicago Tribune and the New York Times, arguing that Perplexity's web crawlers
have intentionally ignored or evaded technical content protection measures, we have yet another example
of where this is going to be fought out in courts in the coming year.
Now, that is a longer story than we can get into in this particular episode.
So for now, we will close the headlines and move on to today's main episode.
AI changes fast.
You need a partner built for the long game.
Robots & Pencils works side by side with organizations to turn AI ambition into real human impact.
As an AWS certified partner, they modernize infrastructure, design cloud native systems,
and apply AI to create business value.
And their partnerships don't end at launch.
As AI changes, Robots & Pencils stays by your side so you keep pace.
The difference is close partnership that builds value and compounds over time.
Plus, with delivery centers across the U.S., Canada, Europe, and Latin America, clients get local expertise and global scale.
For AI that delivers progress, not promises, visit robotsandpencils.com slash AI Daily Brief.
This episode is brought to you by Blitzy, the Enterprise Autonomous Software Development Platform with Infinite Code Context.
Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise-scale codebases with
millions of lines of code. Enterprise engineering leaders start every development sprint with the Blitzy
platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates
and pre-compiles code for each task. Blitzy delivers 80% plus of the development work autonomously,
while providing a guide for the final 20% of human development work required to complete the sprint.
Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy as their
pre-IDE development tool, pairing it with their coding copilot of choice to bring an AI-native SDLC into
their org. Visit blitzy.com and press get a demo to learn how Blitzy transforms your
SDLC from AI-assisted to AI-native.
Meet Rovo, your AI-powered teammate.
Rovo unleashes the potential of your team with AI-powered search, chat, and agents, or
build your own agent with Studio.
Rovo is powered by your organization's knowledge and lives on Atlassian's trusted and
secure platform, so it's always working in the context of your work.
Connect Rovo to your favorite SaaS apps so no knowledge gets left behind.
Rovo runs on the Teamwork Graph, Atlassian's intelligence layer that unifies data across all of your apps and delivers personalized AI insights from day one.
Rovo is already built into Jira, Confluence, and Jira Service Management Standard, Premium, and Enterprise subscriptions.
Know the feeling when AI turns from tool to teammate? If you Rovo, you know.
Discover Rovo, your new AI teammate powered by Atlassian. Get started at ROV, as in Victory, O,
Today's episode is brought to you by my company, Superintelligent.
Superintelligent is an AI planning platform.
And right now, as we head into 26, the big theme that we're seeing among the enterprises
that we work with is a real determination to make 2026 a year of scaled AI deployments,
not just more pilots and experiments.
However, many of our partners are stuck on some AI plateau.
It might be issues of governance.
It might be issues of data readiness.
It might be issues of process mapping.
Whatever the case, we're launching a new type of assessment called Plateau Breaker
that, as you can probably guess from that name, is about breaking through AI plateaus.
We'll deploy voice agents to collect information and diagnose what the real bottlenecks are
that are keeping you on that plateau.
From there, we put together a blueprint and an action plan that helps you move right
through that plateau into full-scale deployment and real ROI.
If you're interested in learning more about Plateau Breaker, shoot us a note at contact@besuper.ai
with plateau in the subject line.
Welcome back to the AI Daily Brief.
Today, we are looking at what people are actually using AI for right now.
In other words, beyond our suppositions and our guesses,
is there a way to see the specific types of applications that are driving AI adoption?
And last week, we got a study that was trying to do exactly that.
The study comes from a team-up of OpenRouter and a16z,
a16z, of course, being a prominent venture fund,
and OpenRouter being a startup that provides a unified API that gives developers and users
access to hundreds of different LLMs through a standard API gateway.
So to provide a little bit more background on who OpenRouter is,
the service offers a near-complete range of proprietary and open-source models being served
on a range of different infrastructure.
They serve 25 trillion tokens monthly across 300 models to 5 million end users.
One of the big use cases for OpenRouter is consumer-facing AI apps.
So basically developers can use OpenRouter to automatically route requests to the most efficient or appropriate model.
It also provides failover services in case service of a favored model goes down.
So not hard to imagine how you would use this if you are a startup.
Most startups that are providing some sort of consumer or business interface for using AI
are trying to abstract away all the details of which model you're using and things like that.
And so OpenRouter gives them an alternative to plugging into just a single model.
Instead, they can get access to the full suite.
It's more redundant.
It has potential cost efficiencies.
that's the sort of idea here.
Now, individual users can also make use of OpenRouter,
but that definitely tends to be for extreme power users.
By way of example, users can plug their OpenRouter API keys into Cursor
and get full access to models without needing to handle multiple sets of keys.
The study they released last week is called State of AI:
An Empirical 100 Trillion Token Study with OpenRouter.
In the abstract, they write,
we analyzed over 100 trillion tokens of real-world LLM interactions
across tasks, geographies, and time.
The findings underscore that the way developers and end users engage with LLMs in the wild
is complex and multifaceted.
Now, one more note on the methodology before we dive in,
while 100 trillion tokens is absolutely nothing to sneeze at
and is a very meaningful and reasonable sample size to start to infer some patterns,
the caveats are that, one,
that's somewhere between a 10th and a 15th of the number of tokens
Google Gemini was serving per month before the release of Gemini 3.
So while 100 trillion is a lot, it is still a fairly limited sample size overall.
The second thing to note is that this pattern of usage is concentrated around people who are building things.
So if you did a study like this across all the end users who are using ChatGPT and Claude and Gemini and things like that,
it would probably look a little bit different.
So with that out of the way, let's look at what they actually found.
There were a few different things that stood out to me.
The first, which just absolutely defined the year, is that the balance between reasoning and non-reasoning
tokens completely shifted over the course of the year. Remember, it was only at the beginning
of December of 2024 when OpenAI's o1 became broadly available. Since then, and over the course of
2025, reasoning model token usage went from basically negligible to now over 50% of tokens consumed.
OpenRouter calls this a full paradigm shift, and I think that this is absolutely a key part of the
story of AI in 2025. Now, of course, part of what reasoning models open up
is more autonomy and agentic capabilities. And while not as dramatic as the growth in reasoning,
some indications of that are also starting to show up in the data. They write that the share of
requests that invoke tools rose steadily throughout the year, from around 0% at the beginning
of the year to 15% now. Overall, and this will be surprising to no one who is listening to this show,
the dominant use case, by far, has become programming. Early in 2025, programming was around
11% of usage, and now it is over 50%. We are coming up towards end of the year episodes,
and I think any accounting of 2025 has to start with the fact that the dominant and most
important phenomenon of this year in AI was the rise of AI coding. That, unsurprisingly,
is showing up in token consumption in this study. Now, there are some other ways that we see coding
as the major use case showing up in the study. The average number of prompt tokens per request,
in other words, the average prompt length, grew about 4x over the course of the
year, from around 1,500 tokens to 6,000 tokens. OpenRouter translated it for us, saying,
the median request is less, write me an essay, and more, here's a pile of code, docs, and logs,
now extract the signal. Now, the next thing that is notable, and in some ways a lot of this
study is a tale of two use cases, is that the other use case that dominates is roleplay,
basically everything in and around chatting with AI in a fantasy context, from innocent to not-so-safe
for work. That is particularly true for open source models, where roleplay and/or creative dialogue,
as they put it, accounted for more than 50% of OSS usage. Now, actually, before we look more at that,
let's look at the patterns of open source versus closed source overall. Another big story for this year,
at least among developers building AI applications, has been the rise of open source models,
and specifically Chinese open source models. OpenRouter notes that by Q4 of this year,
open weight models had reached about a third of overall usage, but they also noted that they've
plateaued this quarter. Now, this makes sense intuitively, given that this quarter, we've seen
some major advances in the closed weight models like Gemini 3, GPT-5.1, and both Sonnet and Opus
4.5. Still, the landscape looks really different than it did last year at this time in terms of the
composition of these two types of models, which makes sense when you remember back that the first
big story in AI of this year was the DeepSeek moment.
Indeed, the rise of Chinese open source models is one of the big phenomena that OpenRouter
noted. They grew from around 1% to as much as 30% of usage in some weeks. In understated fashion,
OpenRouter notes, release velocity and quality make the market lively. And really what
they're saying and what these numbers are showing is that for developers in 2025, open source
models in general, but particularly Chinese open source models, became a major contender when it
came to choosing what models you were going to use for your applications. Indeed, it turns out
that it's not really an either-or, it's a both-and.
OpenRouter writes,
if you want a single picture of the modern stack,
closed models are for high-value workloads,
and open models are for high-volume workloads.
And as they point out, teams are using both.
Now, going back to the breakdown of what people are using open-source models for,
over 50% of it is role-play and creative dialogue.
Now, I think a lot of people are interpreting this
as developers using the open models for use cases
that clearly have a lot of demand,
but which fall outside the bounds of what closed source providers want their models being used for.
It is notable, though, that over the course of the summer, programming also became a big part of
open source consumption and now sits at between 15 and 20% of usage. Indeed, when it comes to the
Chinese open source models, programming and technology, in aggregate, are now ahead of roleplay,
which is down to 33%. Basically, the current crop of Chinese open source models is being seen as
viable for pretty much every type of use case. One last note from their highlight summary that I think is
interesting, they observe what they call a Cinderella glass slipper effect for new models.
Basically, when a new model gets released, tons of people come in and try it, and the people who
persist create what OpenRouter calls a foundational cohort who resist substitution even as newer
models emerge.
Basically, they create a foundation and a base group for that model moving forward.
So what are other people's observations of the study?
Teng Yan, who runs the Chain of Thought AI newsletter, noted a couple things.
One of them, which he called out specifically, was the division of different models
by different usage. He writes, Anthropic's Claude is used for over 80% of programming and almost zero
roleplay. It is the serious work model, while DeepSeek is the entertainment king with two-thirds
roleplay traffic. He also noted that although people are willing to try new models, as he puts
it, quote, a model that's the first to nail a painful workload creates near permanent lock-in.
Early 2025 cohorts of Claude 4 Sonnet and Gemini 2.5 Pro still retain 40 to 50% of users six
months later, while every later cohort churns. Relatedly, he points out, demand is wildly
price inelastic. Users happily pay 10 to 50x more per token for Claude or GPT-5 if it saves them 10
minutes of debugging. Being cheap is nowhere near enough. Going back to this idea of different models
for different uses, he noted that there is a new medium-sized model sweet spot in the 20 to 70 billion
parameter range. Token Bender points out that while this study is super useful for understanding
the breakdown of different open source model usage,
we probably shouldn't extrapolate their patterns overall
because OpenRouter is a less preferred option
for the closed model providers.
Most people were focused on the use cases.
Anan Chaudry writes,
OpenRouter reported what everyone building tools already knows.
AI usage is mostly long-running coding jobs with tool calls.
Jay Little writes,
Heard DeepSeek was good at roleplay
but didn't think 80% of the use would be that, lol.
Sean Chahan writes,
roleplay and creative writing is 52% of open source usage.
While VCs fund productivity, humans are using AI to write fan fiction and debug code.
The market gap versus reality gap is hilarious.
I don't know if that's totally fair.
If, for example, you look at the internet, the fact that there are massive amounts
of adult content doesn't mean it's not also super useful for productivity, although it certainly
does suggest that there's probably capital opportunities that aren't being taken advantage
of because of particular norms and morals.
One subpart of the conversation was about how Grok dominated total consumption charts,
but this is potentially a little bit dismissable, and it's where the limits of this study show up most to me.
Grok made tokens available for free for some time on OpenRouter as part of a promotion strategy,
which was obviously successful as a way to get people to try it, but which warps the model results at least a little bit.
One really interesting reflection came from Brian Kintano, who actually got meta on the success of OpenRouter in general.
Brian writes, I really thought Cursor and OpenRouter would not become big.
Cursor is just a fork of VS Code.
OpenRouter is just a wrapper on top of model APIs.
I was very wrong. I'm realizing that my baseline visceral skepticism of scaffolds and wrappers
needs to be unlearned. The AI market, he continues, is special in its sensitive differentiation.
It's easy to switch between providers, but evaluating any model or provider is sensitive.
Small changes in input cause large changes in output. This is true at the prompt level and at
the model level. GPT-5 versus Claude 4.5 as inputs to write my code will yield vastly different
results. So buyers in a sensitively differentiated market have the following problem. It's
easy to switch between providers and the models are always getting better. In addition, because this
market is so new, none of the models are sticky yet. This might change with memory, et cetera. So you end up
needing wrappers and scaffolds to do your work over time. Otherwise, you lose out on optionality in a
rapidly changing provider market. I keep expecting one model to win, but this hasn't ever really
happened. Teng Yan again made this point as well. There is no single best model. The top ten models
by volume are from eight different labs. So overall, this is a super interesting study that
while focused on a particular audience of app developers and power users,
and drawing on a relatively limited sample of 100 trillion tokens,
still shows some of the big changes that we've been feeling throughout the year.
If you want to check out the study for yourself,
you can find it at openrouter.ai.
It's on a banner right on top of the website.
Thanks to the team there and at a16z for putting this all together.
For now, that's going to do it for today's AI Daily Brief.
Appreciate you guys listening or watching, as always.
And until next time, peace.
