The AI Daily Brief: Artificial Intelligence News and Analysis - The Perils of the AI Exponential

Starting point is 00:00:00 Today on the AI Daily Brief, the perils of the AI exponential. And before that in the headlines, and definitely not related at all, Claude Code turns one. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right, friends, quick announcements before we dive in. First of all, thank you to today's sponsors, KPMG, Mercury, AIUC, and Blitzy. To get an ad-free version of the show, go to patreon.com. If you were interested in sponsoring the show, send us a note at sponsors

Starting point is 00:00:38 at aidailybrief.ai. You can also find out all about the AIDB ecosystem on aidailybrief.ai. The one thing that I would point you to today is that the newsletter is officially back. Rather than making this thing complicated, we decided to just give you guys what people have been requesting forever, which is the links to all of the things that we discuss in the show. So if you are ever looking for some tweet that I mentioned or for an article that I'm referencing, you should go subscribe to the newsletter because it's going to be there.

Starting point is 00:01:04 Again, you can get a link to that as well as everything else on aidailybrief.ai. Now with that out of the way, let's dive into the headlines. We kick off today with another reminder of just how fast things are changing. Claude Code, this platform that has become so integral to the changing of the world and the shift in how business gets done, is just one year old. In fact, this weekend, Anthropic threw it a first birthday party to celebrate. There was clearly something in the air in February of last year. At the beginning of the month, Andrej Karpathy coined the term vibe coding,

Starting point is 00:01:35 and it was a capability set that had clearly just started to come into its own with the latest generation of models. At the time, agentic coding was still seen as something of a fascination. It was something quirky that might help non-technical people build some fun, personal apps, but was very clearly too unreliable to be used in production environments. Fast forward just a year, and on any given day on this show, you're going to hear about the extent to which agentic coding is disrupting not only the software industry, but also infiltrating other areas of work as well. For Anthropic, Claude Code has fundamentally changed the destiny of the company. What started as a side project for developer Boris Cherny has become the central pillar of their strategy. Not only is Claude Code generating $2.5 billion

Starting point is 00:02:14 in ARR, it's also being used to code its own upgrades and develop new products at a staggering pace. In a recent interview, Cherny recalled the early weeks of internal release. He said, I remember Dario asking like, hey, are you forcing engineers to use this? Why is everyone using it? Cherny responded that all he needed to do was make it available and everyone voted with their feet. The same distribution method is working for developers the world over. Anthropic's recent analysis of their API figures found that almost half of all tool calls are related to software engineering. In other words, AI coding is the biggest use case for Anthropic's models, and it's not close.

Starting point is 00:02:46 What's more, Claude Code transformed the AI industry. It completely eliminated the argument that AI is just fancy autocomplete or a better version of Google. Watching the changes happen from inside, Boris believes this is a fundamental phase shift for software engineering. In a recent interview, he commented, continuing to trace the exponential, I think what will happen is coding will be generally solved for everyone. Today, coding is practically solved for me, and I think it will be the case for everyone

Starting point is 00:03:09 regardless of domain. So, happy birthday to Claude Code one year in, and it already changed the world. Now, changes in the world are rarely simple. In fact, they are more often chaotic and even violent. On that front, Anthropic's new security tool sent cybersecurity stocks into a tailspin last week, raising new questions about the software sell-off. On Thursday, Anthropic unveiled Claude Code Security, another new plugin to extend the tool's capabilities. Anthropic said the feature scans codebases for security vulnerabilities and suggests patches, allowing developers to find and fix security issues that traditional methods often miss. Their phrasing, obviously.

Starting point is 00:03:45 Friday's market action saw cybersecurity stocks decimated. These companies had so far been resistant to the broader software sell-off, with the First Trust cybersecurity index losing 11% over the past six months, compared to 24% for other software indices. Friday alone, however, saw CrowdStrike lose 8%, Okta lose 9%, and Cloudflare lose 7%. Many were totally incredulous, with Kenton Varda, a tech lead for Cloudflare, posting, LOL at investors who think all forms of security are fungible and so the release of Claude Code Security, a tool for finding security bugs in your code, means Okta, CrowdStrike, and others should lose

Starting point is 00:04:19 5% of their stock value. Now, of course, a big part of the pushback during the software sell-off has been that even if companies can theoretically vibe code their own SaaS products, it's not clear they would want to take on the task of maintaining and supporting internal software. One might imagine that this goes double for cybersecurity, which adds a ton of insurance and liability issues on top. In this case, and this is the point that Kenton was making, the features of Claude Code Security don't even overlap with the products offered by these firms. What was released by Anthropic is only designed to audit and monitor internal code for vulnerabilities. Cloudflare and CrowdStrike largely provide security for

Starting point is 00:04:52 customer-facing services, preventing downtime from internet-based cyber attacks, and the Okta drawdown is even more puzzling, given that they provide two-factor authentication services. Anthropic didn't even hint at anything regarding any of these aspects of cybersecurity. Still, while it might be easy to dismiss this as irrational markets acting irrationally, for investors, there's a lot of signal in how this crash is playing out. Dennis Dick of Triple D Trading said, there's been a steady selling in software and today it's security that's getting a mini flash crash on a headline. This kind of market is scary for investors because things are just moving relentlessly to the downside as soon as you get a hint of disruption. It's rational to be cautious because people were saying a while ago that the software drop was overdone,

Starting point is 00:05:30 and yet it keeps going down. Bucco Capital put the logic in even more fundamental terms, writing, I think it's fine to sell Cloudflare and CrowdStrike, actually, even if the Anthropic news doesn't impact them today, because maybe you shouldn't pay 25x revenue when the landscape is shifting this quickly. This is sort of the closest to my take, very broadly defined, which is that even if, yes, it does seem like almost all of these moves are overblown in the short term and the catalysts don't really warrant them, I don't think that they're really about the specific catalysts.

Starting point is 00:05:58 I think that they are very clearly part of a broader-based repricing going on right now. That's about exactly what Bucco is talking about here, a question of how to value software when it is changing so quickly. Neither I nor the market knows where things are going to land and where they're going to feel comfortable, and so until that reset point is hit, if it even can be, you're just going to see a lot of these weird moments like this. Still, say Sassy thought that Anthropic could make better use of their newfound power to crash markets asking,

Starting point is 00:06:23 can Anthropic publish a blog post about how they're going to replace four-bed, 4.5 bath homes and walkable neighborhoods with good schools? Now, if you were tracking the model releases over the last week or so, you've probably noticed that Google, Anthropic, and xAI have all thrown new models onto the pile. That means, of course, that it's just about time for the next frontier model from OpenAI. Now, rumors about this one have been coming for a little while. This is the model known as Garlic internally, which was the main focus of Sam Altman's code red push, which began in December.

Starting point is 00:06:51 We've been hearing this is coming every week for a few weeks now, with the latest rumor being that GPT-5.3, aka Garlic, will be released on Thursday. We, of course, already got the coding-focused version, GPT-5.3 Codex, at the beginning of the month, with that model bumping up the coding benchmarks, being competitive with, if not ahead of, Opus 4.6, which released on the same day. At the same time, it also improved on reasoning benchmarks, suggesting there is much to be transferred over to the core version of GPT-5.3. Covering the rumors AI engineer Dan Mack wrote,

Starting point is 00:07:19 It surpasses human baseline on SimpleBench of 83.7%. In fact, it blows every previous model out of the water on all non-coding benchmarks. Word has it, it is a huge leap, a GPT-3 to GPT-4 moment again. OpenAI has long had the best reinforcement learning pipeline, which makes sense since they were the first lab to train LLMs for inference-time reasoning using RL with o1. Now they've got their mojo back when it comes to pre-training too. Public comments from Sam Altman also point in the direction of major progress. This could be the big one. It may be deserving of a major version bump. AI rumor account I rule the world added their take, saying, just heard from separate sources that this is accurate.

Starting point is 00:07:54 This feels like what we expected from the initial GPT-5 release. Expect it quicker, smarter, video and audio in. Start preparing for a big week. They've hidden just how much progress they've made. For those tracking out there, I think there are two very separate things. One, whether we're going to get a model this week, and two, how big a deal it is. I would anticipate that even if it is a big change, I would be very surprised if it was named anything other than 5.3, given how burned OpenAI has

Starting point is 00:08:19 been in the past with bigger-jump naming conventions. Staying in OpenAI land for a minute, a new financial forecast from the company suggests surging revenue alongside rapidly escalating costs. The Information got their hands on the latest set of projections headed to OpenAI investors. The company is now forecasting $282.5 billion in revenue by 2030, a 27% jump from their previous round of projections. For context, that would put them ahead of where Meta is currently, and implies roughly 100% revenue growth in each of the next three years, followed by two years of around 55% growth. OpenAI expects this year's revenue to come in at $30.1 billion, more than doubling 2025's total. They anticipate another doubling in 2027 to reach

Starting point is 00:08:58 $62 billion. And yet, OpenAI also doubled their forecast for cash burn, reaching a peak of $85 billion in 2028 and a total of $665 billion over the next five years. They still expect to reach profitability by 2030, but they are anticipating much greater costs along the way. One big part of the adjusted figures was spiraling inference costs in 2025. OpenAI said that the cost to serve their models quadrupled over the past year, causing a compression in gross margins. Margins fell from 40% in 2024 to 33% in 2025. Interestingly, OpenAI had originally forecast margin expansion for 2025,

Starting point is 00:09:33 expecting model efficiency to boost margins to 46%. They lowered margin expectations across the five-year forecast as a result, but are still forecasting margin expansion each year. While inference costs are expected to rise to $14 billion this year, model training is expected to quadruple to $32 billion. Training costs for 2027 are expected to double again to reach $65 billion, which is $44 billion more than forecast last summer. In total, OpenAI expects to spend $440 billion on model training through 2030. The financial presentation also included some user metrics, which came in a little soft due to increased competition from Anthropic

Starting point is 00:10:07 and Google last year. Weekly active ChatGPT users are now at 910 million, falling short of the billion user target for 2025. OpenAI leaders pointed to a slowdown around the release of GPT-5 as one of the major stumbles for user growth last year, which of course led to the code red being declared in December. Now, of course, last week we got this chart from Epoch, suggesting that Anthropic could overtake OpenAI in revenue as early as this year, which maybe goes some way to explaining why Dario and Sam wouldn't hold hands in India last week. One more little OpenAI update, the company's device plans are coming into focus with several new details, including pricing. One of the interesting tidbits from OpenAI's financials was a forecast that hardware sales will contribute $1.3 billion

Starting point is 00:10:47 in revenue next year. To achieve that, OpenAI will need to actually bring their AI devices to market, and according to some new reporting, they seem to be well on their way. The Information again reports that a team of 200 people is now working on OpenAI's family of devices. Sources said the family includes a smart speaker and possibly smart glasses and a smart lamp. Notably absent from the reporting was the behind-the-ear capsule-shaped device said to carry the codename Sweetpea. The smart speaker is reportedly going to be the first device released by OpenAI and will be priced between $200 and $300. Amazon Echo smart speakers are currently priced between $50 and $220, so the OpenAI device would be competing at the top end of the market. Sources said the speaker will be equipped with

Starting point is 00:11:22 a camera, allowing it to draw context from its immediate surroundings, and also said that the camera would allow people to use facial recognition to approve purchases. None of the devices will feature a screen of any kind. No new reporting on a timeline for the smart speaker, though previous reporting suggested we won't see the first OpenAI device until early next year. The smart glasses, meanwhile, are expected to be released in 2028 at the earliest. Prototypes are said to be available inside the company with the smart lamp specifically mentioned. Sources emphasize that design is still in early stages across the board, so feature details aren't set in stone. One interesting little tidbit is that the device is being designed at a separate office,

Starting point is 00:11:57 away from OpenAI's headquarters, and some at OpenAI have complained that Jony Ive's design studio LoveFrom is slow to revise the designs and shares few details with the main organization. This is, of course, similar to Apple's design culture, where early details of new devices are shared on a need-to-know basis. Given that every week we get new form factors, I would say at this stage, what you should assume is that basically every object and device that you interact with in the real world is probably going to be tested for its capability to become an AI device before we actually get the final OpenAI product. With that, though, we end the headlines. Next up, the main episode. Agentic AI is powering a $3 trillion productivity revolution,

Starting point is 00:12:39 and leaders are hitting a real decision point. Do you build your own AI agents, buy off the shelf, or borrow by partnering to scale faster? KPMG's latest thought leadership paper, Agentic AI Untangled: Navigating the Build, Buy, or Borrow Decision, does a great job cutting through the noise with a practical framework to help you choose based on value, risk, and readiness, and how to scale agents with the right trust, governance, and orchestration foundation. Don't lock in the wrong model.

Starting point is 00:13:03 You can download the paper right now at www.kpmg.us. Again, that's www.kpmg.us. This episode is brought to you by Mercury, banking for people who expect more from the tools they rely on. If you're building a modern business but still using a traditional bank, it just doesn't make sense. I use Mercury for all of my AIDB family of companies, and it honestly feels like financial software built for how people actually operate today. It's fast, clean, no in-person visits, no minimum balances, and the things that used to take forever, like sending wires or spinning up new accounts, take seconds. Everything lives in one dashboard, cards, payments, invoices, team permissions, and you can automate a lot of the busy work so you're not constantly manually managing your money. Of all of the services

Starting point is 00:13:47 I use to run AIDB, I never thought banking would be one of my most painless and most happy experiences, but with Mercury, that's exactly what it is. Visit Mercury.com to learn more and apply online in minutes. Mercury is a fintech company, not an FDIC-insured bank. Banking services provided through Choice Financial Group and Column N.A., members FDIC. There's a new standard that I think is going to matter a lot for the enterprise AI agent space. It's called AIUC-1, and it bills itself as the world's first AI agent standard. It's designed to cover all the core enterprise risks, things like data and privacy, security, safety, reliability, accountability, and societal impact, all verified by a trusted third party. One of the reasons it's on my radar is that ElevenLabs,

Starting point is 00:14:27 who you've heard me talk about before and is just an absolute juggernaut right now, just became the first voice agent to be certified against AIUC-1 and is launching a first-of-its-kind insurable AI agent. What that means in practice is real-time guardrails that block unsafe responses and protect against manipulation, plus a full safety stack. This is the kind of thing that unlocks enterprise adoption. When a company building on ElevenLabs can point to a third-party certification and say our agents are secure, safe and verified, that changes the conversation. Go to AIUC.com to learn about the world's first standard for AI agents. That's AIUC.com. With the emergence of AI code generation in 2022, Nvidia Master Inventor and Harvard engineer

Starting point is 00:15:06 Sid Pardeshi took a contrarian stance. Inference-time compute and agent orchestration, not pre-training, would be the key to unlocking high-quality AI-driven software development in the enterprise. He believed the real breakthrough wasn't in how fast AI could generate code, but in how deeply it could reason to build enterprise-grade applications. While the rest of the world focused on copilots,

Starting point is 00:15:24 he architected something fundamentally different. Blitzy, the first autonomous software development platform leveraging thousands of agents that is purpose-built for enterprise-scale codebases. Fortune 500 leaders are unlocking 5X engineering velocity and delivering months of engineering work in a matter of days with Blitzy. Transform the way you develop software. Discover how at blitzy.com. That's B-L-I-T-Z-Y.com. Welcome back to the AI Daily Brief.

Starting point is 00:15:51 Today we are talking about an update in the METR Moore's Law for AI agents chart. Opus 4.6, among others, is finally on the chart, with everyone scurrying around to understand the implications. Combined with that, a new research note from Citrini Research, which is rocketing around the pages of X and the internet more broadly, and it's an interesting case study in the moment we are in. Now, by way of background, I'm sure at this point that the vast majority of you are familiar with this chart from METR, the Model Evaluation and Threat Research lab. The chart comes from a continuous study and shows the longest time horizon tasks an AI agent can handle. It was first released in March of last year, and at the time,

Starting point is 00:16:30 Sonnet 3.7 was the most advanced AI model available. METR conducted their study going all the way back to GPT-2 and found that the time horizon of agentic tasks was reliably doubling roughly every seven months. That's where the idea of this being a kind of Moore's Law for AI agents came from. That original report even suggested that the speed of improvement was accelerating, with the more recent models at the end of 2024 and early 2025 implying a doubling rate as fast as three months. The chart was a huge part of the discourse at the time and became even more significant towards the end of the year as we were overwhelmed by AI bubble talk. Now, as we've discussed, 2025 was the first year that we started to get some wobbles in the AI

Starting point is 00:17:09 narrative. It started with the DeepSeek moment, which wiped $600 billion off of Nvidia's market cap in January. And throughout the year, there was this kind of ping pong back and forth between excitement and increasing skepticism. Now, one of the flavors of skepticism that is particularly relevant for those proclaiming AI bubbles in the markets had to do with performance plateaus and scaling walls. Basically, the short of it is that if AI actually hit a scaling wall, where performance just wasn't really getting better anymore, that would make the bubble idea much more likely. The gist of it is that if AI can't improve from here, how could it possibly hope to justify these huge infrastructure deals that were predicated on the idea that it kept being more and more

Starting point is 00:17:45 of a significant force in the economy? This is why by the end of the year, as the bubble narrative took hold, many were calling it the most important chart in the economy. It was in many ways the bulwark holding back the full tide of AI bubble pop narratives. Now, before we dig into the latest findings, it is worth noting a little bit on what the METR studies actually say and what they do not say. The studies are designed around a set of software engineering tasks ranging from the trivial to the complex. Human engineers were tasked with solving each problem and their times were used as a benchmark. For example, if a human engineer takes two hours to complete a task, that task has a time horizon of two hours regardless of how quickly an AI can complete it. In other words, and this is

Starting point is 00:18:24 the mistake you see most often on the internet, the metric is not a measure of how long an AI agent can continuously work. It is a measurement of how difficult a problem an agent can solve, measured in comparative human time to solve the same problem. If a task that takes a human coder two hours is solved by Claude in two minutes, it still yields a two-hour time horizon. The other element of the study worth understanding is how METR determines success for a task. The researchers aren't looking for perfect reliability, as they're trying to measure the capability frontier. Instead, their core finding that features on the viral chart requires an AI agent to produce a correct answer 50% of the time. METR also has a secondary finding that requires an agent to deliver

Starting point is 00:19:02 the correct response 80% of the time, which, as you would imagine, results in a much lower time horizon. To avoid saying it every time, whenever I'm referring to time horizon in the METR report, I am referring to that standard 50% success rate unless I say otherwise. The point is that these metrics aren't about benchmarking the model's ability exactly. They're about showing the relative improvements across model generations. A 50% success rate is never going to be good enough for an AI coding agent in production, but what matters for the benchmark is the consistent measure and the shift over time. Now, with all of that background out of the way, there was a ton of anticipation around the current generation of models. Google, OpenAI, and Anthropic have all focused

Starting point is 00:19:40 on improving coding agents over recent months, but METR has been relatively quiet. They published results for GPT-5.1 Codex in November, but the results weren't overwhelming. The model had a time horizon of 2 hours and 40 minutes, which was barely better than GPT-5. On the other hand, next came the Opus 4.5 result in December, which showed a 4-hour and 49-minute time horizon. This was a big improvement, almost a doubling all on its own. Now, there was a sense that this might be a one-off change due to improvements in the way Anthropic was post-training their models for coding tasks, but obviously the market vindicated their findings. In fact, in January, swyx wrote, Evals should be validated by vibes. I think not enough people give sufficient credit to METR

Starting point is 00:20:18 for clearly identifying and quantifying the Opus 4.5 outperformance. On paper, GPT-5.2 Thinking outperforms Opus 4.5 by 55.6 versus 52% on SWE-Bench Pro. In practice, METR's long evals benchmark, while getting increasingly sparse in the long tail, clearly called out the huge jump that many devs are now experiencing a month later. In fact, it is such an outlier that the curve fit was probably wrong and needs to be restarted as a new epoch. And yet, of course, even if January's conversation was dominated by Opus 4.5, we've since had the twin releases of Opus 4.6 and GPT-5.3, which both seemingly represented another big jump in coding capabilities. That much was obvious from using the models, but people still wanted to see what

Starting point is 00:20:58 METR would say once testing was complete. On Friday, METR released the results for both models simultaneously and showed that model quality is accelerating faster than it ever has before. GPT-5.3 Codex achieved a time horizon of 6.5 hours at a 50% completion rate, exceeding Opus 4.5. The results for 4.6 were even more dramatic, achieving a time horizon of around 14.5 hours. This is the largest generational jump of any model in METR's study. Opus 4.6 has more than tripled the time horizon of 4.5, implying the time horizon is now doubling every one and a half months.
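For anyone who wants to check the doubling-time arithmetic, here is a minimal sketch in Python. The two time horizons are the figures quoted in the episode; the gap between the two model releases is not stated in the episode, so the 2.4-month value below is an assumption for illustration, and the function and variable names are hypothetical.

```python
import math


def doubling_time_months(old_hours: float, new_hours: float, gap_months: float) -> float:
    """Months per doubling implied by growth from old_hours to new_hours over gap_months."""
    doublings = math.log2(new_hours / old_hours)
    return gap_months / doublings


# Time horizons quoted in the episode (50% success rate).
opus_4_5_hours = 4 + 49 / 60  # 4 hours 49 minutes
opus_4_6_hours = 14.5         # "around 14.5 hours"

# ASSUMPTION: roughly 2.4 months between the two model releases.
# The episode does not give this number; change it to test other gaps.
assumed_gap_months = 2.4

ratio = opus_4_6_hours / opus_4_5_hours
implied = doubling_time_months(opus_4_5_hours, opus_4_6_hours, assumed_gap_months)

# For comparison: growth expected over the same gap at the original seven-month doubling trend.
trend_ratio = 2 ** (assumed_gap_months / 7)

print(f"Growth between models: {ratio:.2f}x")
print(f"Implied doubling time: {implied:.2f} months")
print(f"Growth expected at a 7-month doubling: {trend_ratio:.2f}x")
```

Under that assumed gap, a slightly more than threefold jump works out to about one and a half months per doubling, which matches the figure quoted above. The result is sensitive to the gap you plug in, and it is a two-point estimate, so it inherits all of the measurement noise discussed later in the episode.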

Starting point is 00:21:32 Responses came in fast and furious. Investor Nic Carter wrote, This is the most important chart in the world, and it's going absolutely ballistic. Even Bernie Sanders mentioned it in a recent talk at Stanford. In fact, the jaws dropped so hard that many people raced to give some caveats. Dean Ball wrote,

Starting point is 00:21:48 For what it's worth, I don't take the METR chart that's been going around as much of an update. METR itself has been signaling their decreasing confidence in the benchmark for a while now, both because of saturation and limited long-duration tasks in the benchmark. It's certainly impressive and signals that nothing is decelerating, but I don't see it as strong evidence in and of itself that we are in some radically faster progress regime. Indeed, METR themselves heavily caveated the results. Codex results, for example, showed some issues with METR's scaffold. They tested the task set again with OpenAI's scaffold,

Starting point is 00:22:15 and while they got similar results, they still found the issues noteworthy enough to point out. For Opus 4.6, METR noted that the model has basically saturated their task set, leading to some unintuitive results. The upper band of their confidence interval is now 98 hours, practically infinite when it comes to this measurement. You can imagine that their task set has very few tasks that would take a human coder more than 14 hours to complete, so it stands to reason that the benchmark is starting to get a little saturated. Researcher David Rein writes, seems like a lot of people are taking this as gospel. When we say the measurement is extremely noisy, we really mean it. Concretely, if the task distribution we're using here was just a tiny

Starting point is 00:22:48 bit different, we could have measured a time horizon of eight hours or 20 hours. Now, overall, METR says that they are updating their methodology to address the issue, but cautioning against overly focusing on these particular results. I think Visimodino really summed it up when they wrote, it's possible that, one, there really is something massive happening right now, and the METR graph really does capture that fact. And two, some small subset of people are mistakenly thinking it's even bigger than it actually is, but that doesn't mean it's actually not very, very big. Now, that sense that something very, very big is happening was exemplified in the response to a new piece from Citrini Research called The 2028 Global Intelligence Crisis.

Starting point is 00:23:23 Citrini is a well-regarded research firm among FinTwit, largely doing thematic research and having been very early to several key themes during the AI boom. This latest article covers the implications of abundant intelligence. It essentially took Dario Amodei's concept of a country full of geniuses in a data center and applied it to the real world. Among other things, the piece predicted that we'll see AI start to consume the entire economy, moving from sector-specific to broad application of cheap machine intelligence. Citrini's thesis is essentially that capital owners are about to reap the massive benefits of AI,

Starting point is 00:23:51 while workers in every stratum of the economy will be left jobless and purposeless. The economy transforms from being household-based to being capital-based. This eventually leads to a massive collapse in the stock market, a massive rise in unemployment, and general immiseration across society. Now, given that I'm ripping through this, you can probably tell that what's more interesting to me than the particulars of the piece is the response that it's getting. This is the latest in a long line of future-oriented AI-doomer sci-fi. What's notable this time is that it turns out that many investors already believe some version of this thesis.

Starting point is 00:24:21 So the incredible response to Citrini's piece is because it's acting as a confirmation that the worst nightmares of an AI-driven economic crisis are possible. Previous reports were met by a lot of skepticism, whereas this article is being met with much more widespread acceptance, or one might suggest confirmation bias. Felix Jauvin writes, I think what's fascinating about Citrini's piece is it isn't necessarily new ideas for those that have been tapped into what's going on and thinking about it all, but smashes the common knowledge game around it, and now it's becoming something that everyone knows everyone knows. A tiny fraction of the population knows what OpenClaw is and an even smaller subset has set one up. There's a lot for people to come to terms with. Unemployed Capital Allocator writes,

Starting point is 00:24:59 The final boss of hysteria is entering the arena. In two weeks, it will be all over LinkedIn. In four weeks, Wall Street Journal and Financial Times. Every analyst will be typing unemployment when into the ChatGPT chat box. Citrini will be appointed the AI policy czar. Just remember, it might be peak fear, it might be dumb. Markets are primed to buy it and drive things down. There are no atheists in foxholes. Now, of course, there are plenty of people who took issue with specific parts of this. Dan Hockenmaier writes,

Starting point is 00:25:25 This piece shows a profound lack of understanding of how marketplaces work and why they are defensible. Quoting from the piece, he says, A competent developer could deploy a functional competitor in weeks and dozens did, enticing drivers away from DoorDash and Uber Eats by passing 90 to 95% of the delivery fee through to the driver. Dan picks up, anyone could have done that at any time in the last 10 years. Why was no one able to? Because the hard part has nothing to do with building the app or attracting the drivers. The hard part is building a liquid marketplace with all the best supply in a massive series of optimizations and investments to drive down prices and delivery times and drive up reliability and quality.

Starting point is 00:25:56 DoorDash and Eats have built this when no one else could, and they will not allow agents to transact on their apps, nor will they have a legal requirement to allow it. But the real story isn't as sensational, so it doesn't get the engagement. Economist Guy Berger writes, this was an interesting read, but I'm not sure it's internally consistent. One question that comes to mind, those who own the agents, what are they doing with the money they're making? Why isn't that fueling employment, GDP, and stock prices? Now again, I have a feeling that we're going to be talking about this one more in the weeks to come, so I'm not trying to go fully in-depth today.

Starting point is 00:26:25 I think what's important here is the way that these individual elements all add up to something more. The story of early 2026 so far is a broad-based sense that, to quote that viral piece from about a week ago, something big is happening. The capability set of the coding models has increased dramatically, which has opened up agents as a real force. Those two things combined have moved the impact of AI and agents from just software engineering to everything else. Markets are starting to reprice things as a consequence, and nothing seems like it's going to slow down at all. And because of that, everyone is trying to figure out what next. Now, I would argue that we are desperately in need of the non-doomer version of the Citrini

Starting point is 00:27:00 piece, which is something that I'm trying to work on in the background. So keep an eye out for that. For now, it remains a really interesting anthropological study of the moment that I think can tell us a lot about where general sentiment is. That, however, is going to do it for today's AI Daily Brief. Appreciate you listening or watching, as always. Until next time, peace.
