The AI Daily Brief: Artificial Intelligence News and Analysis - What OpenAI and Anthropic Think Happens Next With AI

Episode Date: June 5, 2026

Today on the AI Daily Brief, NLW breaks down new pieces from OpenAI and Anthropic that reveal how the leading AI labs think about recursive self-improvement, frontier AI governance, and what happens next as AI starts accelerating its own development. In the headlines: reports that the U.S. government is discussing taking equity stakes in major AI labs, OpenAI upgrades ChatGPT memory, and rumors swirl around GPT-5.6 and Anthropic’s Mythos.

Sign up for AI Executive Catchup: https://aiexecutivecatchup.com/

Brought to you by:
KPMG - Research from KPMG and the University of Texas at Austin shows the highest-impact AI users treat AI like a reasoning partner, and those skills can be taught at scale. Learn more at kpmg.com/us/Sophisticated
Bolt - Claim a free month of Bolt Pro - https://bolt.new/partner/aidb/
Outsystems - Stop wondering how AI will change your business and start building the agents that will lead it - http://outsystems.com/
Scrunch - The AI customer experience platform - https://scrunch.com/
Zenflow Work - Agents for knowledge work - https://zenflow.free/
Blitzy - Want to accelerate enterprise software development velocity by 5x? https://blitzy.com/
AssemblyAI - The best way to build Voice AI apps - https://www.assemblyai.com/brief
Robots & Pencils - Cloud-native AI solutions that power results - https://robotsandpencils.com/

The AI Daily Brief helps you understand the most important news and discussions in AI.
Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Our Newsletter is BACK: https://aidailybrief.beehiiv.com/
Interested in sponsoring the show? sponsors@aidailybrief.ai

Transcript
Starting point is 00:00:00 Today in the AI Daily Brief, what OpenAI and Anthropic think about what happens next in AI. Before that in the headlines, is the U.S. government going to take a stake in the big AI labs? The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right, friends, quick announcements before we dive in. First of all, thank you to today's sponsors, KPMG, Section, Zencoder, and OutSystems. To get an ad-free version of the show, go to Patreon.com slash AI Daily Brief, or you can subscribe on Apple Podcasts. To learn more about sponsoring the show, send us a note at sponsors at aidailybrief.ai. You can also find out about everything else going on in the ecosystem, including a bunch of free education programs like the Agent OS program, or you can
Starting point is 00:00:45 check out some of the paid programs that we've been building with Nufar Gaspar leading, including the upcoming four-week AI executive sprint called Executive Catchup. That is registering folks right now, but is going to be closing very soon. So if you want to check that out and get up to speed fast, check out the link in the show notes. Welcome back to the AI Daily Brief Headlines edition. All the AI news you need in around five minutes. And boy, do we have a juicy little end to this week. First up, bombshell reporting claims that the U.S. government is in talks to acquire equity in major AI companies. Writes NOTUS, senior U.S. officials have held preliminary discussions with major artificial intelligence companies about the potential for the federal government to acquire some
Starting point is 00:01:25 shares in their firms. The reporting is sourced to three people familiar with the matter. NOTUS adds that Sam Altman has discussed this idea periodically with senior administration officials since the beginning of the second Trump term. In fact, Altman is said to have pitched the idea directly to the president in early 2025. Discussions have reportedly continued with senior officials in recent weeks. Sources said that Altman viewed this as a way to more broadly distribute the economic benefits of AI to the public. And what's more, people familiar with the discussion said that it currently centers on the idea of AI labs,
Starting point is 00:01:56 quote, voluntarily ceding shares to the government. The shares would then produce returns that could be directed to public purposes, such as cutting an AI dividend check to all American households. Now, that language of voluntarily ceding makes it a little unclear whether the government would pay for the shares, and sources also state that Anthropic is not involved in discussions about providing equity at this time. Now, for those of you with an eyebrow raised on the sourcing, NOTUS is a relatively new publication but has extremely strong credibility, founded by Politico founder Robert Allbritton.
Starting point is 00:02:24 Former Washington Post reporter Jeff Stein is the lead journalist on this story, and Stein is generally considered to be one of the most well-sourced and highly regarded journalists in Washington. Now, overall, while some of this is fairly dramatic, it's also not entirely unexpected. This administration has floated the idea of building a sovereign wealth fund and taken steps towards it by acquiring minority stakes in numerous companies, including Intel. It's also not, as we saw earlier this week, entirely novel thinking in Washington, with Bernie Sanders floating the idea of the government taking a 50% stake in AI companies through a one-time tax.
Starting point is 00:02:55 The idea has made for some strange political bedfellows, with figures on the populist right finding themselves aligning pretty closely with Sanders. Responding to the Sanders proposal, Steve Bannon said this week, you can smell the stench of desperation emanating from the oligarchs as they run heedlessly to a public market takeout. We should not take tip money but force them to cough up 50% of the equity, to be dispersed to American citizens. The horseshoe theory of American politics is well and truly intact. Now, in terms of the public reaction, a lot of folks simply questioned why taxation wasn't the right way to do this. Georgetown Law Professor Peter Harrell writes,
Starting point is 00:03:30 The government should tax and regulate them and potentially distribute the taxes as a dividend, but ownership risks giving the government control outside of public view and potentially the wrong incentives. Boeback Mcuffin writes, What if they set up some system where the company sent a certain percentage of their profits to the federal government every year?
Starting point is 00:03:49 In quarterly installments, maybe the states could also choose to tax the companies if they operate in that state. Joel Griffith was more blunt, writing, more quote-unquote capitalism but with Chinese Communist Party characteristics, brought to you not by AOC and Bernie Sanders, but by the current United States president. Even Axios business editor Dan Primack wrote, this is basically the Bernie proposal. Really never expected that electing Trump would push the U.S. government so far towards
Starting point is 00:04:13 actual socialism. Not even a judgment call, just surprise. Now, heading on over to a completely different type of topic, OpenAI has shipped a huge update for their memory system, which they're calling dreaming. Now, some version of memory has been available in ChatGPT for a little over two years and has already come a long way. Early versions of memory were very manual and pretty clunky, relying on a list of saved memories. Users often needed to tell the chatbot to remember specific things, and actively cull useless information from the list. Indeed, one of the big challenges of memory is when it remembers details that are no longer relevant. Last April, OpenAI integrated the
Starting point is 00:04:48 first elements of the system that would become dreaming. It allowed ChatGPT to actively curate memories in the background, slowly building a more accurate picture of the user's preferences. This upgrade made the process feel more natural and continuous, eliminating many of the early pain points. With this release, OpenAI said that they have made the memory system much more capable and compute efficient. Individual saved memories are gone. Instead, the dreaming system will maintain a summary that provides richer context about the user. The summary is fully accessible to the user and can be edited directly to make corrections or add more information. OpenAI provided a simple example of where memory is useful in the context of asking
Starting point is 00:05:22 ChatGPT about buying peripherals for a photography setup. Without memory, ChatGPT provides generic information and makes standard recommendations. With memory, the chatbot can tailor its suggestions to the gear the user already owns. Now, OpenAI devised a new benchmark to test their system based on asking questions that required the model to recall relevant facts. With the 2024 version of memory, which again was just the saved list of facts, the model succeeded on 41.5% of tasks. The 2025 version, which added the early version of dreaming, kicked that success up to 67.9%. With the version of dreaming announced this week, OpenAI found the model could succeed at 82.8% of tasks that required the recollection of relevant facts.
Starting point is 00:06:00 Now, if you have done either our ClawCamp or AgentOS program, you might be thinking to yourself, this system looks a lot like building the Memory.md file for an agent, or even something like the Personal Context Project we ran through the AIDB operators community. And indeed, one could be forgiven for thinking that memory is just about creating and maintaining a markdown file that stores crucial information about the user. The key difference is that ChatGPT is now running this process automatically through the back end, requiring far less of the average user to take full advantage. OpenAI also noted that the huge efficiency gains from their new setup will allow them to provide
Starting point is 00:06:31 dreaming to free users for the first time. Last summer, when paid subscribers gained access to dreaming-style memory, free users only had access to basic saved memories. OpenAI says that they've been able to cut down the compute requirements for dreaming by 5x, meaning it's now practical to serve at scale. Mark Kretschman argues that this is a bigger deal than it sounds, writing, the more ChatGPT becomes an actual work partner, the less sense it makes to restart from zero every time. Projects, preferences, constraints, tools, writing styles, codebase details, all of this should carry forward.
Starting point is 00:07:01 Sounds small, but it changes the product. A chatbot with real memory becomes much closer to a persistent agent. I also think it's interesting in our context of how companies are adapting to the token scarcity era. You remember, Arvind Jain, the CEO of Glean, wrote a piece about the token economy that he called Your Token, is an AI architecture problem, and in it he discussed a lot of similar themes around token efficiency. In short, all the time that you spend getting a model to remember all the relevant context each and every time is wasted turns and wasted tokens that could theoretically be fixed through better memory systems. So again, in some ways, a small update, but one with potentially bigger implications.
Starting point is 00:07:35 Now, speaking of dealing with the token scarcity era, TSMC has warned that there's only so much they can do to alleviate the chip shortage that is expected to last all of this decade. Speaking candidly at their annual shareholders meeting on Thursday, CEO C.C. Wei said, customer demand is so high and we can only support so much. We're already working very hard. We're doing our best to ensure TSMC does not become a bottleneck. Now, TSMC has already committed to building multiple new fabs in the U.S. as well as more capacity in Taiwan, but construction takes time.
Starting point is 00:08:05 Wei discussed a series of roadblocks that have caused the U.S. plans to fall behind schedule, including environmental permitting and a shortage of construction workers. TSMC has expanded plans for six new fabs in Arizona, adding another four facilities to their construction plan. Still, Wei commented that construction and operational progress in Arizona is proceeding better than originally expected. While Wei didn't forecast how long he expects the overall shortage to last, he did comment, it will be a long time before we can meet customer demand. When a shareholder asked Wei if he plans to raise prices given the shortage, he said he would like to do that, but wants to avoid the abrupt price hikes that
Starting point is 00:08:39 have been seen in memory chips. TSMC is famously a relationship-based company, with Nvidia CEO Jensen Huang commenting that he's never signed a contract despite becoming TSMC's largest customer. Still, referencing the memory chip suppliers, Wei said, I envy their 80% gross margins, but I would never do that. One interesting little one, Airbnb CEO Brian Chesky is planning to launch a new AI lab. Bloomberg said the lab would be focused on user interaction and design, although it's not entirely clear what that means. It doesn't sound like the pitch for a new foundation model lab,
Starting point is 00:09:09 so there's speculation it will be some kind of agent lab. Now, Chesky, despite a very prominent role in Silicon Valley, has so far been mostly on the periphery of the AI boom. He was considered for a role on the OpenAI board and was a key power broker in negotiating Sam Altman's return as CEO after his brief ouster back in 2023. Sources said that Chesky wouldn't be going founder mode for this venture. Instead, he plans to remain CEO of Airbnb and hire a new leader to helm the lab. Bloomberg said that the unnamed startup is in the early stages of fundraising, and honestly
Starting point is 00:09:38 for most people, the jokes write themselves. Taylor writes, Soon you can rent your Airbnb as a data center. But others are interested to see what comes out of this. Saksham writes, instead of the nth lab to focus on coding benchmarks, we might get a model that is actually great at coming up with new UI/UX primitives, huge alpha in just having a model that makes great UI/UX experiences.
Starting point is 00:09:58 Lastly today, as we head into the weekend, X is abuzz with rumors of new models coming soon. While some expected GPT-5.6 to arrive this week, it's looking like we'll have to wait a little while longer. The OpenAI account dropped a new promotional video with the tagline Time to Fly, and the OpenAI Developers account posted a still from the video with the caption, look closely, there's more in the showcase. Many thought this was referring to a solid diamond symbol next to the model selector,
Starting point is 00:10:22 perhaps indicating a new ultra-fast speed mode. Still, the OpenAI release that everyone is holding their breath for is, of course, GPT-5.6. On the Anthropic side of the house, it's all about Mythos. Leo at Synthwaived posted, Anthropic is gearing up for the public launch of a new version of Mythos, better than Mythos Preview. A checkpoint of the model codenamed Oceanus was made available to red teamers yesterday. I'm told these programs typically begin seven days after the wider launch. Lassan on Twitter, meanwhile, dug up another API endpoint serving the model, noting the sky-high
Starting point is 00:10:50 pricing of $16 per million input tokens and $80 per million output tokens. That would price the model at around three times the cost of Opus 4.8, but slightly below the reported pricing of Mythos Preview, which was $25 per million input and $125 per million output. Whatever the details, it is clear that Mythos is coming soon, with Andrew Curran writing, public release is almost here. I predicted this for the 16th and I'm feeling pretty good about it. Now, usually rumors of upcoming models aren't all that interesting, but in this case, I think they actually have an interesting story to tell about how the labs see competition. Specifically, we know that some version of Mythos is coming, and that theoretically it's much more powerful than Opus 4.8. And so what's interesting to me is when OpenAI decides to
Starting point is 00:11:29 release their next version of GPT. Right now, they're not under a ton of pressure to do so. The release of Opus 4.8 didn't all of a sudden catapult Anthropic into the agreed-upon leader position once again. Many people still think 5.5 is better, and it hasn't really shifted the pro-coder momentum around 5.5 at all. What that means is that if they release 5.6 right now, it is not a response to 4.8, but a preemption of Mythos, meaning that they think that 5.6, or whatever the version is called, probably won't be able to hang with Mythos when it comes. Because you have to think that if 5.6 is as good as or better than Mythos in the estimation of OpenAI, they would wait until right after Mythos came out to try to clip off the new momentum that
Starting point is 00:12:10 Mythos will inevitably give Anthropic. So in this case, more than normal, the timing will tell us a lot about how companies see where the state of the art is relative to one another. In any case, lots of fun coming up in the early summer of 2026, but that is going to do it for today's slightly extended headlines. Next up, the main episode. One of the most important AI questions right now isn't who's using AI, it's who's using it well. KPMG and the University of Texas at Austin just analyzed 1.4 million real workplace AI interactions and found something surprising.
Starting point is 00:12:46 The highest impact users aren't better prompt engineers. They treat AI like a reasoning partner. They frame problems, guide thinking, iterate, and push for better answers. And the good news? These behaviors are teachable at scale. If you're trying to move from AI access to real capability, KPMG's research on sophisticated AI collaboration is worth your time. Learn more at KPMG.com
Starting point is 00:13:07 slash us slash sophisticated. That's KPMG.com slash us slash sophisticated. Here's a harsh truth. Your company is probably spending thousands or millions of dollars on AI tools that are being massively underutilized. Half of companies have AI tools, but only 12% use them for business value. Most employees are still using AI to summarize meeting notes. If you're the one responsible for AI adoption at your company, you need Section. Section is a platform that helps you manage AI transformation across your entire organization. It coaches employees on real use cases, tracks who's using AI for business impact, and shows you exactly where AI is and isn't creating value. The result, you go from rolling out tools to driving measurable AI value. Your
Starting point is 00:13:49 employees move from meeting summaries to solving actual business problems, and you can prove the ROI. Stop guessing if your AI investment is working. Check out Section at sectionai.com. That's S-E-C-T-I-O-N-AI.com. So coding agents are basically solved at this point. They're incredible at writing code. But here's the thing nobody talks about. Coding is maybe a quarter of an engineer's actual day. The rest is stand-ups,
Starting point is 00:14:15 stakeholder updates, meeting prep, chasing context across six different tools. And it's not just engineers. Sales spends more time assembling proposals than selling. Finance is manually chasing subscription requests. Marketing finds out what shipped two weeks after it merged. Zencoder just launched Zenflow Work. It takes their orchestration engine,
Starting point is 00:14:32 the same one already powering coding agents, and connects it to your daily tools. Jira, Gmail, Google Docs, Linear, Calendar, Notion. It runs goal-driven workflows that actually finish. Your stand-up brief is written before you sit down. Review cycle coming up? It pulls six months of tickets and writes the prep doc. Now, you might be thinking, didn't OpenClaw try to do this?
Starting point is 00:14:50 It did, but it has come with a whole host of security and functional issues, which can take a huge amount of time to resolve. Zencoder took a different approach. SOC 2 Type 2 certified, curated integrations, tighter security perimeter, enterprise grade from day one, model agnostic and works from Slack or Telegram. Try it at zenflow.free. This episode of the AI Daily Brief is brought to you by OutSystems,
Starting point is 00:15:12 a leading agentic systems platform built for the enterprise. Organizations all over the world are building, orchestrating, and governing agentic systems on the OutSystems platform, and with good reason. OutSystems' open and unified platform allows teams to architect, deliver, and scale governed agentic systems with agility. Teams of any size and technical depth can use OutSystems to build and manage AI apps and agents quickly and cost-effectively without compromising reliability and security. With OutSystems, you can rapidly launch ideas from concept to completion. It's the leading
Starting point is 00:15:43 agentic systems platform that is unified, agile, and enterprise proven, allowing you to accelerate growth, reduce operational friction, and deliver real enterprise impact with AI. OutSystems. Build your agentic future. Welcome back to the AI Daily Brief. At the close of the headlines today, I mentioned that the particular release sequence of the next upcoming models from Anthropic and OpenAI would go a long way to telling us what those labs think about where the state of the art is and implicitly where their competition with one another lies. And that theme of the big labs revealing more about the state of the world as they see it is the subject of our main episode as well.
Starting point is 00:16:24 This episode is going to be anchored around two pieces of writing that came out, one each from Anthropic and OpenAI. The first, from Anthropic, is called When AI Builds Itself. It is a bit of a meditation, I guess you would say, around the state of AI development and what comes next. The OpenAI document is a little bit more pointed. It is a policy document called Democratic Governance of Frontier AI, a Blueprint for a Federal Framework, but it actually starts from a similar place of giving us a picture of where OpenAI thinks we are when it comes specifically to AI development.
Starting point is 00:16:54 Now, to do a little bit of upfront contextualization, here's how Ethan Mollick framed the Anthropic piece, When AI Builds Itself. He said, I think it is really worth reading this piece on RSI at Anthropic. There's a bit of navel-gazing, some marketing, and a lot of very sincere beliefs about what Anthropic thinks is likely in the near future of AI that you probably want to be aware of. The big theme of the piece is RSI, or recursive self-improvement. And what Anthropic is pointing to at core is an inflection point coming very quickly around how the next best AI gets built. Anthropic writes, for most of AI's history, humans drove every step in its development cycle, but at Anthropic, we are delegating a growing share of AI development to AI systems themselves,
Starting point is 00:17:36 which is speeding up our work. Taken far enough and given enough compute, that trend points to an AI system capable of fully autonomously designing and developing its own successor. This is called recursive self-improvement. We're not there yet, and recursive self-improvement is not inevitable, but it could come sooner than most institutions are prepared for. Now, a lot of what you're going to see on social media from this is big surprising numbers. For example, they write, Anthropic engineers on average ship 8x as much code per quarter as they did
Starting point is 00:18:05 from 2021 to 2025. Another big number is 80%. That is the percentage of Claude's production code that is authored by Claude itself. They also note that, as they put it, the code that Claude writes is good and improving. Good code, they say, means two things. It works and is written in a manner that allows another engineer to understand it and build upon it. On the first criterion, they say, the evidence is clear. The rate at which Anthropic staff correct, redirect, or take over mid-task from Claude has been falling steadily for a year, including on the most complex and open-ended tasks. This means problems with no clear specification, where the engineer isn't sure what the answer looks like. As evidence, they point to a chart of Claude Code session success rate, where across trivial
Starting point is 00:18:45 tasks, routine tasks, substantial tasks, and open-ended problems, the success rate of all of those has climbed well above 60%, and for the trivial, routine, and substantial tasks, well above 80%, from a much lower place less than a year ago. They also note that the mode of how Claude interacts with the codebase is changing. Claude, they write, is getting better at proposing its own experiments. They point to research that was published in April of this year that was exploring whether a weaker AI could manage a stronger AI. The evidence suggests that the human role is narrowing at each step in the AI development process. Once human- and AI-authored code quality reach parity, humans will stop writing code entirely and shift to only reviewing it. But if they can't review
Starting point is 00:19:24 code as quickly as Claude can generate it, human review will become the bottleneck to AI development. Similarly, once Claude can run experiments, the question shifts towards which of those experiments is worth running. Put simply, the doing, writing the code, running the experiment, producing the result, now costs almost nothing in human time, even if it still has costs in compute. An area of human comparative advantage for now is research taste and judgment, including choosing which problems matter, which results to trust, and when an approach is a dead end. Indeed, they continue, the work that is still in human hands, choosing which problems to work on, is what matters most. Without that judgment, Claude is a capable assistant, but not a system
Starting point is 00:20:00 that could drive AI progress on its own. They write, it's genuinely unclear whether today's training methods and architectures could unlock that capacity. But even if we suppose that Claude never achieves good research taste, a conservative reading of our evidence still implies compounding acceleration. Now, the meat of the piece and the part that's generating the most discussion is the last section on three possible futures. The first possible future, which they say they include for completeness but don't believe is likely, is one in which the trend stalls, but today's AI capabilities are widely diffused. They write, this article features many exponential trajectories, but these trajectories may actually turn out to be S-curves. We may be approaching the bend in the
Starting point is 00:20:38 curve, where returns to scale diminish and the line straightens, then flattens. The judgment that separates a competent researcher from a great one might be a capability that cannot come from scaling up training inputs like compute and data. If so, getting past this bottleneck would require a new idea, like an architectural approach that supplants the transformer architecture that all current frontier models use. Alternatively, and this is my editorialization, but something I think we're seeing lots of evidence of right now, the binding constraint to AI progress could be in the supply chain, not the model. Advancing and diffusing the frontier may require more energy and compute than presently exists. The pace of chip fabrication, grid expansion, or interconnect bandwidth may be the
Starting point is 00:21:12 constraint rather than the intelligence itself. Now still, they say, even if model capabilities were frozen at today's level, we would expect major changes to occur in the world. They point to the example of Mythos Preview finding more than 10,000 high- and critical-severity software vulnerabilities across many of the world's most important systems. Still, like I said, they don't believe that this scenario is particularly likely. Every capability we can measure, they write, has so far followed the same curve. We've not yet seen that curve bend. Of the three futures we consider, this one would give governments and societies the most time to adapt. We are more worried, they continue, about the next two, which would move faster and leave far less room for preparation. Scenario two then is the AI labs
Starting point is 00:21:51 continuing to see compounding efficiency gains. In this scenario, they say AI development becomes substantially automated, but humans continue to set research directions and judge results. In this scenario, 100-person companies could do the work of 10,000 or 100,000-person organizations. They say this would revolutionize knowledge work and government services, but could also be turned to harmful ends, from authoritarian surveillance of whole populations to influence operations that tailor manipulation to each individual and run at a scale no human team could match. Now, interestingly, and this is where I wish there was a bit more of a discussion, they write that while this is the scenario that is most likely based on the evidence that they've
Starting point is 00:22:24 seen, they also note, speeding up one part of a process often just shifts the bottleneck elsewhere. Overall pace is capped by the parts that haven't sped up. In computing, they write, this is known as Amdahl's law, and the same logic can apply to organizations. Anthropic has already encountered one signature of Amdahl's law: as we've begun to push more code around the organization, human code review has become a new bottleneck. We've also encountered this friction outside engineering. There's been an explosion of new ideas, initiatives, tools, and simulations as a result of Anthropic employees working with highly capable models, far more than we have the capacity to pursue. The rate at which organizations can spot and fix these bottlenecks may be a skill that improves
Starting point is 00:23:00 over time, and it may become the most important skill for any organization. This gets into a lot of the ideas that I've explored around the infinite backlog, and why I don't think we're all of a sudden not going to have jobs. Aaron Levie from Box commented on this part, saying that it points to the key element of the optimistic scenario for AI. AI, he writes, lowers the barrier dramatically to allowing us to do more. As a result of that, we have far more ideas than we can pursue, and the ones that we want to pursue, we're ultimately limited by our ability
Starting point is 00:23:26 to go take on the surrounding work to execute those ideas. There's almost no amount of AI progress that can happen where that goes away. AI is going to let us build much more software, launch more marketing campaigns, research more drugs, and so on. All of this work, even when augmented by agents, still ultimately requires people to manage. Now, back to Anthropic, the third scenario they point to is the full recursive self-improvement scenario, where AI systems start to build their own successors. Now, this scenario is where you see the most handwaviness from Anthropic, with them just not really knowing how to guess at all the implications of this.
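As a rough illustration of the Amdahl's law point quoted from Anthropic's piece above, here is a minimal sketch of the standard formula, where a fraction p of the work is sped up by a factor s and the rest is unchanged. The specific numbers (40% of the workflow accelerated 10x) are hypothetical and chosen only for illustration; they do not come from either lab's document.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of the work runs s times faster.

    Standard Amdahl's law: 1 / ((1 - p) + p / s).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if s <= 0.0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


def speedup_ceiling(p: float) -> float:
    """Upper bound on overall speedup as s grows without limit."""
    return float("inf") if p == 1.0 else 1.0 / (1.0 - p)


if __name__ == "__main__":
    # Hypothetical split: 40% of the workflow (say, writing code) gets 10x
    # faster, while the other 60% (say, human review) does not speed up.
    p, s = 0.4, 10.0
    print(f"overall speedup: {amdahl_speedup(p, s):.4f}")
    print(f"ceiling even with infinite speedup: {speedup_ceiling(p):.4f}")
```

Under those assumed numbers, the overall gain stays well below the 10x applied to the accelerated part, which is the sense in which the unaccelerated work (human review, in Anthropic's example) caps the pace.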
Starting point is 00:23:57 The final section is the one that has been jumped all over, especially by AI safety advocates. They write, if it were possible to effectively slow the development of this technology to give ourselves more time to deal with its immense implications, we think that would likely be a good thing. We believe it would be good for the world to have the option to slow or temporarily pause frontier AI development to enable societal structures and alignment research to keep up with the advance of the technology. But they write in the same breath, if a slowdown simply lets the least cautious actors catch up technologically, it could leave everyone less safe. Without a global coordination mechanism, companies and governments will have to make difficult decisions about safety while under competitive and geopolitical pressures.
Starting point is 00:24:36 They go through and talk about all that would be required for a slowdown or pause, noting that none of it would necessarily be impossible in principle, and pointing to, for example, the Intermediate-Range Nuclear Forces Treaty. Quote, those regimes took decades to build both the infrastructure and the trust. We don't have that long, they write. A unilateral pause by one lab by contrast is achievable immediately, but accomplishes much less. It would change who the frontrunner is, but would not create the wider deliberative process that is currently missing. In the coming months, we will organize conversations where policymakers, researchers, civil society, and other AI companies can help answer some of the questions this piece raises, especially around full recursive self-improvement and how to create better options for coordination and deliberation. The window to investigate the questions together is here,
Starting point is 00:25:18 and people outside AI companies should be involved in this deliberation. Now, the responses to this fully run the gamut. Some in the AI safety community are thrilled. The AI Safety Memes account sums up, holy blank, let's blank and go. Yet others, like If Anyone Builds It, Everyone Dies co-author Nate Soares, are more critical. He writes, one big quibble is that they aren't thinking big enough. The tone reads like RSI could happen but don't fret too much, it'll probably be fine, rather than OMFG we're possibly on the brink of AIs that make smarter AIs. Society needs to act.
Starting point is 00:25:47 Others, though, find that the whole thing kind of leaves a bad taste in their mouth. Sean Ralston writes, No way that Anthropic slows or temporarily pauses frontier AI development. What an insincere and silly sentiment. If they really feel that way, then let them act that way. Corey Quinn writes, Asking your competitors to pause development right after you file your S-1 is the single most effective moat building exercise I've seen pitched as ethics.
Starting point is 00:26:10 Did they not realize the quiet period is for them, not homework they assign their competitors? Now, there's an even broader critique of Anthropic that recently came out on the All-In podcast from legendary investor Bill Gurley, who spent 30 days reading everything that Anthropic had ever written, coming to the conclusion, I don't think they're writing software. I think they're midwifing a deity here. I don't know which one I'm more afraid of. The regulatory capture or this Dr. Frankenstein theory. Jason Calacanis chimed in, these are delusions of grandeur. Let's call it what it is. They believe they're so
Starting point is 00:26:39 powerful they can create God. Then the God you create is going to be so benevolent and perfect that it will give you your little pellet of resources. Now, even if you don't think that that's exactly what's happening, the fact that that idea is coming up in mainstream conversation will give you some idea why the public discourse gets so frustrated with these companies who talk about the huge implications of their work, and yet proceed on with it at an ever-increasing pace. Former AI czar David Sacks wrote, Signs you might be trying to get your frontier AI lab nationalized. You compare it to nukes, threaten half of white-collar jobs, warn recursive self-improvement could end humanity,
Starting point is 00:27:12 then race ahead anyway. In other words, you want the government to save us from you. Now, like I said at the beginning, if Anthropic's document is more a meditation on the state of the world, OpenAI's policy document about democratic governance is a little bit more precise. And yet still, one part that people noted is that it also mentions RSI as a starting frame of reference. In the first paragraph, OpenAI writes, we also see signs of recursive self-improvement in today's systems, where AI development is itself accelerated by AI. We expect this to increase competitive pressures among developers and nations and create governance challenges that existing institutions are not equipped to address.
Starting point is 00:27:52 Writes Chubby, the vibe is changed. Something is happening. Now, the main thrust of OpenAI's paper is that democracies specifically have a key role to play in solving these very complex and difficult problems of advanced AI and larger society. They propose three broad policy directions. The first, they call building a national framework
Starting point is 00:28:12 through reverse federalism, basically arguing that instead of the national system preempting state rules, Congress should in fact adopt and scale up the best pieces of state regulations. Now, the second policy priority is one that actually runs a little bit counter to the executive order. The recent executive order put the locus of the voluntary testing regime in the NSA, whereas OpenAI is arguing that we need to be investing in civilian institutions, specifically groups like the CAISI, the Center for AI Standards and Innovation.
Starting point is 00:28:41 They also argue, contra the EO, that at least eventually there should be a mandatory evaluation process, not just a voluntary one. Their last policy priority is, quote, mobilizing a whole-of-government resilience strategy, writing that frontier AI should be treated as a national priority, requiring coordination across national security, public health, cybersecurity, scientific, diplomatic, and economic agencies, as well as with international partners. AI policy expert Dean Ball writes, this seems reasonable. Having CAISI, a civilian agency, conduct this testing in primarily non-classified ways is the way to ensure it does not become a licensing regime. The Trump EO's classification of the process raises the risk that testing morphs into a de facto mandatory permitting and licensing system.
Starting point is 00:29:21 Now, in addition to this document, which is being treated in some ways as a response to the recent executive order, even though it feels fairly likely that it was being worked on before that EO was finalized, Congress is also starting to get a little bit more active on what they think AI regulation should look like. On Thursday, Republican Jay Obernolte and Democrat Lori Trahan unveiled their bipartisan AI bill in the House. The comprehensive 269-page bill aims to set up a federal regulatory framework that would override the growing number of state AI laws. The bill would require leading AI labs to create and implement plans to deal with catastrophic risks posed by their models. Third-party auditors would be required to ensure compliance. Now, while this seems in line with
Starting point is 00:30:00 the AI law recently passed in Illinois, much of the controversy right now around any AI regulation is about a federal bill preempting state authority. Representative Trahan has received pushback from fellow Democrats for supporting this bill, particularly in the Northeast. New York has already passed its own laws, and her home state of Massachusetts is quickly moving to do the same, with Brad Carson, the president of Americans for Responsible Innovation and former Arizona Democrat, arguing that cutting state legislators out of the process would be a, quote, generational mistake. Still, I would say that if you're reading the tea leaves on average, this bill is a little bit less dead on arrival than most, at least in terms of the substance. The problem is the
Starting point is 00:30:36 current timeline. Politico reporter Meredith Lee Hill writes, lots of skepticism in House GOP leadership about the Obernolte AI framework, and also getting any AI bill to the floor before midterms. Speaker Johnson, when asked if he was committed to putting an AI bill on the floor before November, said, well, we're going to do it as soon as we're able to build consensus around a package. So, I mean, I would consider it a high priority, but I don't know yet on the timing. In other words, I wouldn't hold my breath. Anyways, friends, there is a lot going on in both the technology and policy of AI, but for now, that is going to do it for today's AI Daily Brief.
Starting point is 00:31:09 Appreciate you listening or watching, as always, and it takes you. Until next time, peace.
