The AI Daily Brief: Artificial Intelligence News and Analysis - The Era of AI Mass Intelligence Arrives

Starting point is 00:00:00 Today on the AI Daily Brief, the era of mass intelligence arrives. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. Hello, friends, quick announcements. Thank you to today's sponsors, KPMG, High Touch, and Blitzy. And to get an ad-free version of the show, go to patreon.com slash AI Daily Brief. Well, friends, it is the end of the summer. Here in the U.S., at least, we are in the midst of the Labor Day weekend, the historical end mark of the summer before we head back to school, back to work,

Starting point is 00:00:40 thinking markets are going to return in September, but actually it's just a bunch of people who come back and dump all the things that they didn't want to sell into low liquidity during the summer, leading to poor performance in the stock market in September before it finally comes back in October and then we get ripping again. But in general, the point is that we are at one of those seasonal inflection point moments where I think it is useful to reflect on where we've been and where we're going. This has been anything but a slow summer. We have had numerous model announcements. GPT-5, notably, but then also more recently, and as we'll discuss today, Google's image model that's affectionately known as Nanobanana. We've had lots and lots of

Starting point is 00:01:15 policy announcements. We've had intrigue in the geopolitics of AI. The China-US AI divide continues in these interesting ways. We've had scuttlebutt about enterprise AI. I will be talking about this MIT 95% survey till the day I die, it seems at this point. And so the question is where this all nets out. To help frame the conversation, I'm going to do something that we haven't had a chance to do for a while, which is read a piece by Professor Ethan Mollick from his One Useful Thing blog, simply called Mass Intelligence. It is his rumination on this exact question, what it means when so many people have access to these hyper-powerful models. And so let's go read that piece, which will be me, not AI, as I might skip around or add annotations a little bit,

Starting point is 00:02:00 and then we'll come back and discuss it a bit more. Ethan writes, more than a billion people use AI chatbots regularly. ChatGPT has over 700 million weekly users, Gemini and other leading AIs add hundreds of millions more. In my posts, I often focus on the advances that AI is making, for example, in the past few weeks both OpenAI and Google AI chatbots got gold medals in the International Math Olympiad, but that obscures a broader shift that's been building. We're entering an era of mass intelligence, where powerful AI is becoming as accessible as a Google search. Until recently, free users of these systems, the overwhelming majority, had access only to older, smaller AI models that frequently made mistakes and had limited use for complex work.

Starting point is 00:02:44 The best models, like reasoners that can solve very hard problems and hallucinate much less often, required paying somewhere between $20 and $200 a month. And even then, you needed to know which model to pick and how to prompt it properly. But the economics and interfaces are changing rapidly, with fairly large consequences for how all of us work, learn, and think. There have been two barriers to accessing powerful AI for most users. The first was confusion. Few people knew how to select an AI model. Even fewer knew that picking o3 from a menu in ChatGPT would get them access to an excellent reasoner AI model, while picking 4o, which seems like a higher number, would give them something far less capable. According to OpenAI, less than 7% of paying customers selected o3 on a regular basis,

Starting point is 00:03:27 meaning even power users were missing out on what reasoners could do. Another factor was cost. Because the best models are expensive, free users were often not given access to them, or else given very limited access. Google led the way in giving some free access to its best models, but OpenAI stated that almost none of its free customers had regular access to reasoning models prior to the launch of GPT-5. GPT-5 was supposed to solve both of these problems, which is partially why its debut was so messy and confusing. GPT-5 is actually two things. It was the overall name for a family of quite different models, from the weaker GPT-5 nano to the powerful GPT-5 Pro. It was also the name given to the tool that picked which model to use and how much

Starting point is 00:04:06 computing power the AI should use to solve your problem. When you are writing to GPT-5, you are actually talking to a router that is supposed to automatically decide whether your problem can be solved by a smaller, faster model or needs to go to a more powerful reasoner. You could see how this was supposed to expand access to powerful AI to more users. If you just wanted to chat, GPT-5 was supposed to use its weaker specialized chat models. If you were trying to solve a math problem, GPT-5 was supposed to send you to its slower, more expensive GPT-5 thinking model. This would save money and give more people access to the best AIs. But the rollout had issues. This practice wasn't well explained and the router did not work well at first. The result is that one person

Starting point is 00:04:46 using GPT-5 got a very smart answer, while another got a bad one. Despite these issues, OpenAI reported early success. Within a few days of launch, the percentage of paying customers who had used a reasoner went from 7% to 24%. And the number of free customers using the most powerful models went from almost zero to 7%. Part of this change is driven by the fact that smarter models are getting dramatically more efficient to run. The graph below shows how fast this trend has played out, mapping the capability of AI on the y-axis, and the logarithmically decreasing costs on the x-axis. When GPT-4 came out, it was around $50 to work with a million tokens. Now it costs around 14 cents per million tokens to use GPT-5 nano, a much more

Starting point is 00:05:26 capable model than the original GPT-4. The efficiency gain isn't just financial, it's also environmental. Google has reported that energy efficiency per prompt has improved by 33X in the last year alone. The marginal energy used by a standard prompt from a modern LLM in 2025 is relatively established at this point from both independent tests and official announcements. It is roughly 0.0003 kilowatt hours, the same energy used as 8 to 10 seconds of streaming Netflix or the equivalent of a Google search in 2008. Interestingly, image creation seems to use a similar amount of energy as a text prompt. How much water these models use per prompt is less clear, but ranges from a few drops

Starting point is 00:06:04 to a fifth of a shot glass. These improvements mean that even as AI gets more powerful, it's also becoming visible to more people. The marginal cost of serving each additional user has collapsed, which means more business models like ad support become possible. Free users can now run prompts that would have cost dollars just two years ago. This is how a billion people suddenly get access to powerful AIs, not through some grand democratization initiative, but because the economics

Starting point is 00:06:29 finally make it possible. However, getting access to a powerful AI is not enough. People need to actually use it to get things done. Using AI well used to be a pretty challenging process, which involved crafting a prompt using techniques like Chain of Thought, along with learning tips and tricks to get the most out of your AI. In a recent series of experiments, however, we've discovered that these techniques don't really help anymore. Powerful AI models are just getting better at doing what you ask them or even figuring out what you want and going beyond what you ask. And no, threatening them or being nice to them does not seem to help on average. And it isn't just text models that are becoming cheaper and easier to use. Google released a new image model with

Starting point is 00:07:07 the codename nanobanana and the much more boring official name Gemini 2.5 Flash Image generator. In addition to being excellent, though better at editing images than creating new ones, it is also cheap enough that free users can access it. And unlike previous generations of AI image generators, it follows instructions in plain language very well. As an example of both its power and ease of use, I uploaded an iconic and copyright-free image of the Apollo 11 astronauts and a random picture of a sparkly tuxedo and gave it the simplest prompt: Dress Neil Armstrong on the left in this tuxedo. Editor's note: Ethan then shows the output, which indeed has Neil Armstrong in the tuxedo. He continues, there are issues that someone with an expert eye would spot, but it's still impressive to see the

Starting point is 00:07:50 realistic folds of the tuxedo and how it's blended into the scene. It even has a NASA pin on the lapel. There is still a lot of randomness in the process that makes AI image editing unsuitable for many professional applications, but for most people, this represents a huge leap in not just what they can do, but how easy it is to do it. And we can go further. In his next prompt, Ethan writes, now show a photograph where Neil Armstrong and Buzz Aldrin, in the same outfits, are sitting in their seats on a modern airplane. Neil looks relaxed and is leaning back playing a trumpet. Buzz seems nervous and is holding a hamburger. In the middle seat is a realistic otter sitting in a seat and using a laptop. Ethan then shares the picture of exactly that

Starting point is 00:08:28 scenario. He continues, this is many things. A pretty impressive output from the AI, a distortion of the famous moment in history made possible by AI, and a potential warning about how weird things are going to get when these sorts of technologies are used widely. Concluding section, the weirdness of mass intelligence. When powerful AI is in the hands of a billion people, a lot of things are going to happen at once. A lot of things are already happening at once. Some people have intense relationships with AI models, while other people are being saved from loneliness.

Starting point is 00:09:00 AI models may be causing mental breakdowns and dangerous behavior for some, while being used to diagnose the diseases of others. It is being used to write obituaries and create scriptures and cheat on homework and launch new ventures and thousands of other unexpected uses. These uses, and both the problems and benefits, are likely to only multiply as AI systems get more powerful. And while Google's AI image generator has guardrails to limit misuse, as well as invisible watermarks to identify AI images, I expect much less restrictive AI image generators will likely get close to nanobanana in quality in the coming months. The AI companies,

Starting point is 00:09:33 whether you believe their commitments to safety or not, seem to be as unable to absorb all of this as the rest of us are. When a billion people have access to advanced AI, we've entered what we might call the era of mass intelligence. Every institution we have, schools, hospitals, courts, companies, governments, was built for a world where intelligence was scarce and expensive. Now every profession, every institution, every community has to figure out how to thrive with mass intelligence. How do we harness a billion people using AI while managing the chaos that comes with it? How do we rebuild trust when anyone can fabricate anything? How do we preserve what's valuable about human expertise while democratizing access to knowledge? So here we are. Powerful AI is cheap enough to give

Starting point is 00:10:15 away, easy enough that you don't need a manual, and capable enough to outperform humans at a range of intellectual tasks. A flood of opportunities and problems are about to show up in classrooms, courtrooms, and boardrooms around the world. The mass intelligence era is what happens when you give a billion people access to an unprecedented set of tools and see what they do with it. We are about to find out what that is like. What if AI wasn't just a buzzword, but a business imperative? On You Can with AI, we take you inside the boardrooms and strategy sessions of the world's most forward-thinking enterprises.

Starting point is 00:10:49 Hosted by me, Nathaniel Whittemore, and powered by KPMG, this seven-part series delivers real-world insights from leaders who are scaling AI with purpose. From aligning culture and leadership to building trust, data readiness, and deploying AI agents. Whether you're a C-suite executive, strategist, or innovator, this podcast is your front-row seat to the future of enterprise AI. So go check it out at www.kpmg.org.us slash AI podcasts, or search You Can with AI on Spotify, Apple Podcasts, or wherever you get your podcasts. If you are a regular listener, you will have heard about Superintelligent's agent readiness audits at this point. But I wanted to

Starting point is 00:11:28 tell you today about the full suite of agent readiness products that go beyond just the initial readiness report. Over the last six months, Superintelligent has built out an entire agent planning suite. We help you move from discovery to planning to implementation. After you've completed your agent readiness audits, we help you double-click on your most important use cases with what we call our use case planning reports. These reports are going to help you understand what sort of technical preparation you need to do to be ready for a use case, what challenges you might face in implementation, and whether you should be thinking about building, buying, partnering, or some combination. After that, you can even get a spec document in what we call our technical blueprint that gives either your developers

Starting point is 00:12:07 or the developers of the partner you work with what they need to build exactly the agent that you're looking for. If you want to learn more about Superintelligent's agent planning suite, we've built a custom GPT to answer your questions. Just go to bit.ly slash super super super agent. That's bit.ly slash super super agent, all one word. And if you have any questions, the agent can even help you book an appointment with our team. This episode is brought to you by Blitzy, the Enterprise Autonomous Software Development Platform with infinite code context. Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise scale code bases with millions of lines of code. Enterprise engineering leaders start every development sprint with the Blitzy platform,

Starting point is 00:12:50 bringing in their development requirements. The Blitzy platform provides a plan, then generates and pre-compiles code for each task. Blitzy delivers 80% plus of the development work autonomously while providing a guide for the final 20% of human development work required to complete the sprint. Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-IDE development tool, pairing it with their coding co-pilot of choice to bring an AI-native SDLC into their org. Blitzy is providing a limited time, 30-day free proof of concept for qualifying enterprises. The team will provide a 5x velocity increase on a real development project in your org.

Starting point is 00:13:24 Visit blitzy.com and press book demo to learn how Blitzy transforms your SDLC from AI assisted to AI native. That's BLITZY.com. All right, so first of all, thanks to Ethan for another great post. If you don't follow One Useful Thing yet, you should. It's at oneusefulthing.org. And broadly speaking, I think that there's kind of two types of posts that Ethan has. One is a sort of almost in medias res, fast reaction to some new model or some new study that has just come out. And the other is this sort of much bigger zoom-out kind of pondering piece, which, as you might be able to tell, are the types of things that I love, especially for these big-thinking type of long reads episodes.

Starting point is 00:14:02 In any case, I really do think that when we look back at the story of 2025 and AI, the democratization of access to powerful models, specifically via the reduced cost of models, is going to be the big blinking story of this era. Think about it. The year kicked off with the DeepSeek moment, where this new model out of China, purportedly trained at a tiny fraction of the cost of all the other big frontier models, all of a sudden was rocketing up the app charts, in fact beating out ChatGPT as the App Store's top app.

Starting point is 00:14:32 Now, what was happening, of course, was that for the first time, people were experiencing a reasoning model. Despite the fact that OpenAI's o1 had been out for a few months, and that it was even at the time still better by most metrics than DeepSeek's R1, it was hidden, as Ethan points out, behind the paywall for the vast majority of people.

Starting point is 00:14:51 What's more, if you weren't paying attention, even if you did pay for ChatGPT, having to keep track of what the difference was between the o-series of models and the numbered series of models was something that most busy people just weren't willing to do. In any case, in practice, what happened is that all of a sudden people had this cute little app that was explaining its thinking, quote unquote, as it was thinking, which, by the way, people love that little bit of anthropomorphism, and then it was producing really good responses because reasoning models are just a step change relative to what most

Starting point is 00:15:18 people have experienced. Now, of course, subsequent to this, DeepSeek itself has fallen off a little bit. We haven't gotten R2. The company seems to be being impacted by limited access to advanced chips. And despite its popularity, which continues, it's not nearly as disruptive a force as some thought it might be back in January. However, its legacy is profound. One of the outcomes was OpenAI getting less restrictive about giving access in the free version to more powerful models. And I think it certainly impacted how they thought about the launch of GPT-5. Now, speaking of GPT-5, another big story, obviously, probably the biggest in some ways from the summer, was the absolute dust-up following the launch of GPT-5.

Starting point is 00:15:57 The response to the model taught us a lot of things, I think. One of them was simply that trying to accommodate 700 million different types of users all with different types of expectations and different types of usage patterns is an extraordinarily hard feat. Another was that people have very specific devotion to the models that they work with, which is a phenomenon that could get more pronounced and more powerful over time. But a third, I think to me, is that it shows the problems of being over-reliant on thinking exclusively about AI progress in terms of how much the latest-numbered GPT is obviously and

Starting point is 00:16:31 clearly better than the past numbered GPT. I think that the launch of GPT-5 revealed just how limited a view of AI progress that is, and I think the limitation is more serious than just the ego hit for the people who built the models and any financial implications. I think it revealed that we are fundamentally misthinking how to judge AI progress right now. Specifically, our obsession with just asking how much better the five version is of the thing than the four version, whether it's GPT now or Grok in the future, or Gemini 3 when it comes out and displaces 2.5, our obsession with that sort of strict

Starting point is 00:17:06 numbered foundation model progress is obscuring a bunch of other versions of progress that are, in practice, just as if not more significant. The first dimension that we're missing out on is cost reduction. In his post, Ethan gave the dramatic example of the depreciation in cost of GPT-4-quality tokens from $50 per million tokens when GPT-4 first came out to now 14 cents per million tokens, around one 300th of that cost for an even more performant model in GPT-5 nano. Now, we've seen this year with studies like Menlo's Enterprise Update that at the moment, cost is simply not a huge driver of major AI behavior. By which I mean, right now, when companies or individuals are choosing the models they use, they're largely focused on using the state of the art. Performance beats out cost of

Starting point is 00:17:56 ownership 61 to 36 when it comes to why enterprises switch models, for example. And they don't do that switching all that much. However, I believe that this actually undersells an important thing that's happening. Right now, we are riding the frontier of capability. Every new model that gets released unlocks some meaningful set of new opportunities and use cases that wasn't really possible before. However, the more performant that we get, the more that it means that models that are just behind the state of the art, which potentially have a totally different economic cost structure, can be used effectively for some variety of use cases. And more than that, so much of our usage right now, especially when you look at enterprise usage, is still in the pilot experimental or

Starting point is 00:18:43 very early single-use kind of deployment phases. In those scenarios, it makes sense that companies are optimizing maximally for performance, especially as they're trying to think about replacing big chunks of human tasks with AI or agents. In many, if not most cases, companies are going to want the highest-performing models. At the same time, we're also starting to see another phenomenon, which is more and more autonomously capable agents doing things in the background while humans do other work. In other words, we're starting to see a fork, from people using AI to help themselves to people instead setting up AIs that simply work in the background while humans are working on other tasks or accomplishing other goals. I believe that you're starting to see this

Starting point is 00:19:29 pattern of autonomous background agents, specifically in coding, start to show up in token consumption. Between May and July this year, Google's monthly token processing jumped from 480 trillion to 980 trillion tokens, a 104% growth in just two months. I think that that reflects this mass expansion of coding workloads, and I think you're going to start to see that across a variety of use cases in the enterprise very, very soon. As that happens, it seems very likely to me that, in practice, the most important frontier for enterprises will not have been simply performance but, in other words, the relationship between performance and cost. When you're an enterprise seeking to deploy millions or even billions of tokens across a variety of uses, you better believe that the cost profile is going to start

Starting point is 00:20:20 to matter. The point being that by simply focusing on how much better on the benchmarks GPT-5 was versus GPT-4.5, we're really missing out on this entire other dimension. Now, OpenAI even tried to explain this aspect of cost in interviews following the announcement of GPT-5, but frankly, they didn't do a very good job of explaining how significant this was when it comes to opening up new types of usage. The second problem, I think, with the strict "how much better is 5 than 4.5" type of analysis, is that it misses out on how small changes can open up entirely new use cases. As people started to excitedly share all of their

Starting point is 00:20:56 nano-banana generations, switching out the clothes in different images like you saw with Ethan's post, creating 3D game asset models of real-world photos, or a variety of other uses like that where they were editing photos, you started to see some number of posts like this one from David Shapiro, who wrote, I really don't get the nanobanana hype. It's an editor. It doesn't generate breathtaking images unless it looks like a stock photo. It's certainly a step in an interesting direction,

Starting point is 00:21:21 but I was expecting an all-in-one model that blasted everything else out of the water. This is not that. Now, as I said, this is not to pick on David at all. But what's clear to me watching how people are using nanobanana, is that it doesn't matter that it's quote-unquote just editing. It's that the type of editing that it enables unlocks a huge array of new use cases that are extremely valuable, both on a personal and on an economic level that were not possible in any sort of efficient way before. When everyone can do the most powerful type of editing that Photoshop ever enabled,

Starting point is 00:21:50 without requiring any of the learning and intensive time on task that it took to get good enough at Photoshop to do those things, that is an economically significant and impactful development. I'm working on, and will have a show in the next couple of weeks about, a new benchmark proposal I'm calling the unlock score. The idea is pretty simple. Instead of focusing on either academic benchmarks that can be gamed and that are mostly saturated right now, or on subjective preference benchmarks like LM Arena, which I think are valuable, the unlock score would basically look at what new use cases a new model unlocked, how valuable they were, and how widespread they were.
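Editor's note: the episode does not give a formula for the unlock score, so what follows is only a rough sketch of how the three ingredients named above (which use cases are new, how valuable they are, how widespread they are) might be combined. Everything in it is an assumption made for illustration: the UseCase fields, the 0-to-1 scales, the example numbers, and the choice to multiply value by reach and sum over use cases the predecessor model could not already handle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    name: str
    value: float  # hypothetical 0-1 rating of how valuable the use case is
    reach: float  # hypothetical 0-1 share of users who could benefit from it


def unlock_score(new_model: list[UseCase], predecessor: list[UseCase]) -> float:
    """Sum value * reach over use cases the predecessor did not already cover."""
    already_possible = {uc.name for uc in predecessor}
    return sum(
        uc.value * uc.reach
        for uc in new_model
        if uc.name not in already_possible
    )


# Made-up example data: only the photo-editing use case is new.
predecessor = [UseCase("text summarization", 0.6, 0.8)]
new_model = [
    UseCase("text summarization", 0.6, 0.8),
    UseCase("plain-language photo editing", 0.7, 0.5),
]

score = unlock_score(new_model, predecessor)
print(f"unlock score vs. predecessor: {score:.2f}")
```

Because the score only counts what the predecessor could not do, a model that is better on benchmarks but enables nothing new would score zero in this sketch, which is one way to read the comparative framing in the episode.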

Starting point is 00:22:24 In order to help people understand not an overall metric of a model's capabilities, but a comparative metric of how much it changes versus its most proximate predecessors. And the broader point here, I think, again, is that by overly focusing on how much better GPT-5 was versus 4.5, we're missing out on all of the multimodal innovation that's happening right now as well. For as much as we've saturated some of the text-based use cases of LLMs, we've barely scratched the surface on what a lot of the visual, audio, and video possibilities are going to be in the near future. And to bring it back to Ethan's post, and this idea of entering into the new era of mass intelligence. Part of the power of this moment and part of what I think will be the

Starting point is 00:23:04 distinct hallmark of 2025 is the fact that the changes in the efficiency and cost multiplied by the changes in capabilities, which yes, especially when we take a step away, have been significant, means that the total audience of people who are using these powerful tools to go create new use cases and do new things is expanding rapidly. Now, this is not the end of year episode. This is an end of summer episode. So we'll refrain from trying to make conclusions that are too big. But this is certainly what I'm watching heading into this fall. TLDR, I am not strictly focused on how much better the next model is going to be.

Starting point is 00:23:39 When it comes to Grok 5 or Gemini 3, or even GPT-6, if they really rush it, what I'm interested in is how it opens up new use cases and changes the cost of doing those things at scale. Let me know what you think about all this. Shoot me a note in the comments or on YouTube. And I hope that all of you are having a great weekend, whether it's a long one in America or a normal one somewhere else. Appreciate you listening or watching as always.

Starting point is 00:24:02 Until next time, peace.
