The AI Daily Brief: Artificial Intelligence News and Analysis - AI Generated Code Reaching 50% in Some Companies

Starting point is 00:00:00 Today on the AI Daily Brief, for some companies, AI-generated code now exceeds 50%. Before that in the headlines, ChatGPT makes a movie. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right, friends, quick announcements before we dive in. First of all, thank you to today's sponsors, Superintelligent, Vanta, and Robots and Pencils. And to get an ad-free version of the show, go to patreon.com. Welcome back to the AI Daily Brief Headlines edition, all the daily AI news you need in around five minutes.

Starting point is 00:00:35 We kick off today with something that will seem to many to be a completely natural evolution and totally expected, and which will for others be everything wrong with the world that we are heading into. The TLDR is that OpenAI is backing a feature film in hopes of showcasing the promise of AI filmmaking tools. The movie is called Critterz, with a Z, and is, according to the Wall Street Journal, about forest creatures who go on an adventure

Starting point is 00:00:57 after their village is disrupted by a stranger. Now, the movie actually has OpenAI origins. Chad Nelson, who is a creative specialist at OpenAI, started sketching out the characters a few years ago while trying to make a short film with OpenAI's original DALL-E image generation model. That short used AI for character and background design, but used regular methods for voice acting and animation. Now the goal is a feature-length film,

Starting point is 00:01:21 and with the help of a pair of animation studios, Nelson hopes to debut the film at next year's Cannes Film Festival. The goal is to complete the movie in nine months, which is significantly shorter than the more typical around three-year production time for similar types of animated films. They're also working with a comparatively modest $30 million budget. The goal is to leverage the full suite of OpenAI's models and really put them on display in a professional setting. Said Nelson, OpenAI can say what its tools do all day long, but it's much more impactful if someone does it. That's a much better case study than me building a demo.

Starting point is 00:01:51 Now, details are a little scarce on exactly what is going to be AI and what is not. The plan appears to be to cast human actors for character voices, as well as to hire human artists to draw sketches that are then fed into OpenAI's tools. How much of the video will actually be generated by AI isn't clear. And part of why it's unclear is that the team behind it just doesn't know all these things yet. James Richardson, the co-founder of production studio Vertigo Films, which is working with Nelson on the movie, said, quote, I have never been in this position in my life where we are starting a movie, and I have no idea what's about to happen. It's a very ambitious, massive experiment. I think investor Hirsch DeSai makes an important point when he writes,

Starting point is 00:02:28 I really hope the Critterz team doesn't lean into the AI angle here. The bar isn't, this was good for an AI-generated movie, but this was a good movie. People care about the output, not what tools you used. I think, in fact, if anything, Critterz is going to face a higher threshold because of the latent skepticism that many people in the creative industry are going to have coming into this. The real win here would be to model how a full team of creative people can make a movie that they're really excited about, not in a way that replaces all of their jobs, but in a way that allows them to move much faster and shows the possibility for more creative output. Now, speaking of visual AI, Google VP Josh Woodward just gave an update post-Nano Banana,

Starting point is 00:03:08 and it looks to have been a wildly successful release. Woodward said that in the last four days, the Gemini app had added 13 million more first-time users and 300 million more images. I think between this and OpenAI's breakout Studio Ghibli trend, image generation appears to be a powerful onboarding tool for consumer adoption. And this makes sense, right? Social media is driven by easy-to-consume visual imagery, and so those outputs of AI would go more viral than comparable text outputs.

Starting point is 00:03:36 The next thing to watch, I think, given all this, is what happens when Meta releases a competent AI assistant with Midjourney built in natively. Next up, let's move over to some big deal stories, starting with a huge one in the neocloud space. Microsoft has signed a $17.4 billion deal with AI cloud provider Nebius. The deal will run over five years and is the first deal anywhere close to this size for the company. In fact, Nebius right now only has a market cap of about $15 billion, meaning that the deal

Starting point is 00:04:06 is larger than their entire company's value. The market cap has tripled over the past year, but they're still only a third of the size of CoreWeave. This makes their deal with Microsoft even more impressive. Now, on Microsoft's side, the deal seems to indicate that Microsoft is quite happy continuing to be a renter rather than an owner of AI infrastructure. You might remember that back in the first quarter of this year, there was a huge news cycle around Microsoft canceling construction plans and dropping some of their data center leases. Their CFO insisted that this was just tinkering at the margins and

Starting point is 00:04:34 didn't meaningfully impact the company's plans. Still, the stories drove a ton of concern that Microsoft was stepping back from AI infrastructure because it was being overbuilt. Fast forward six months, and the infrastructure buildout has only accelerated while GPU shortages have only gotten worse. Whether they're renting it or building it, this deal definitely suggests that Microsoft foresees a lot of continuing demand for AI compute over the short term, and that they are willing to pay up in order to service it. Shano Matthew wrote, huge validation of the AI neocloud model too. CoreWeave and Nebius clearly have value and compute is constrained if hyperscalers continue to sign new deals with them. Continuing on the deals theme, Databricks is set to close their latest

Starting point is 00:05:11 round of fundraising at a $100 billion valuation, and as they do so, have reported $4 billion in annualized revenue. Revenue at the company is up 50% from the prior year, with AI-related sales contributing a billion dollars to the total. CEO Ali Ghodsi said that the company had positive free cash flow over the past 12 months and expects to continue operating profitably from here. He commented that, quote, the vast majority of our costs go to wages and it's definitely expensive to spend on AI talent. Now, one little side note on this, in and around the AI bubble talk, one of the concerns that

Starting point is 00:05:40 you will hear is that AI startups are just cash incinerators with no hope of justifying the expense with AI revenue. Some people picked up this story, for example, from The Information, which claimed that AI startups were generating around $15 billion in revenue. They noted the huge gap between that and the trillions of dollars in infrastructure spending forecast in the coming years. Those figures, however, didn't include the big tech hyperscalers, who have been growing at around 30% since the AI wave took off. They also don't include this additional billion dollars from Databricks, which has transformed into one of the largest AI startups without fitting the same mold as OpenAI or Anthropic.

Starting point is 00:06:11 The point here is that there is lots and lots of AI revenue that isn't being captured in the alarmist headlines and that is growing incredibly quickly. Speaking of which, audio generation startup ElevenLabs has made a tender offer to staff allowing them to sell stock at a $6.6 billion valuation. Up to $100 million in liquidity is available, with staff who have worked at the startup for at least a year able to access the offer. The secondary round will be led by ICONIQ and Sequoia with participation from Andreessen Horowitz. This round would double their last valuation, received in January. CEO Mati Staniszewski said, earlier this year we surpassed 200 million in ARR and we expect to top 300 million by year end. We're also rapidly approaching a

Starting point is 00:06:49 50-50 revenue split between our enterprise and self-serve customers, with enterprise revenue having grown more than 200% in the last year. Now, he then goes on to laud all the employees that they are giving liquidity to. But the thing I wanted to hone in on here, just given the themes of what we were just talking about a moment ago with Databricks, is once again the fact that revenue is growing extremely quickly across so many different AI startups. Mati said the company will be at 300 million by the end of the year, meaning about 150 million of that will be enterprise, which are serious numbers for a startup that's just a few years old. Now shifting gears one more time to the policy side,

Starting point is 00:07:22 Anthropic is endorsing a new AI safety bill out of California. If you were listening last summer, you probably heard me talk a lot about SB 1047. It was California's first big attempt to regulate the AI industry, and there was a ton of antagonism towards the bill. SB 1047 was ultimately passed by the California legislature, but vetoed by Governor Gavin Newsom. Part of the logic was that a patchwork of state laws would be untenable, so it was better to wait for federal legislation.

Starting point is 00:07:47 However, almost a year since the veto, we've seen no indication that Washington will move forward with legislation, changing the calculus for some about state-level governance. According to Anthropic, the new bill, SB 53, aims to cover much of the same ground as SB 1047. However, they claim it does so through, quote, disclosure requirements rather than the prescriptive technical mandates that plagued last year's efforts. The stated focus of the bill is on preventing AI models from contributing to catastrophic risks, which it defines as the death of at least 50 people or a billion dollars in damage. The bill is expected to go to a vote sometime this month, and while the AI lobby and companies are fighting it, the opposition does seem

Starting point is 00:08:21 a bit more muted than the fight around SB 1047. Could this then be a compromise that everyone can live with? Well, interestingly, former Trump AI advisor and chief opponent of SB 1047 Dean Ball said, in SB 53, the drafters have shown respect for technical reality, mostly reasonable intellectual humility appropriate to an emerging technology, and a measure of legislative restraint. Whether you agree with the substance or not, I believe all of this is worthy of applause. Now then again, people are just starting to take notice of this, so maybe the noise around it starts to get louder. We'll just have to wait and see. And lastly, there is one interesting little nugget in America's annual defense bill,

Starting point is 00:08:57 caught by Jacob Krause, who writes, America's annual defense bill might establish a temporary artificial general intelligence steering committee comprised of military officials. Jacob continues, one responsibility for the committee would be analyzing the military applications and implications of artificial general intelligence for the Department of Defense. AGI is defined as artificial intelligence, capable systems, with the potential to match or exceed human intelligence across most cognitive tasks, distinct from narrow artificial intelligence systems designed for specific tasks and defined

Starting point is 00:09:25 domains. Said Carnegie fellow Jon Bateman, some might say this is just another committee or report. Actually, it's a smart way to force the people who would need to respond to and/or shape AGI to proactively think about it and talk with each other sooner and more frequently. Now, if you've ever followed the evolution of a big must-pass bill like this one, you will know why it's probably not worth spending too much time and attention on this just yet, but still I think it's interesting and something I'm certainly going to keep an eye on. With that, however, let's shift over to the main episode. If you are a regular listener, you will have heard about Superintelligent's agent readiness audits at this point, but I wanted to tell you today about the full suite of

Starting point is 00:10:03 agent readiness products that go beyond just the initial readiness report. Over the last six months, Superintelligent has built out an entire agent planning suite. We help you move from discovery to planning to implementation. After you've completed your agent readiness audits, we help you double-click on your most important use cases with what we call our use case planning reports. These reports are going to help you understand what sort of technical preparation you need to do to be ready for a use case, what challenges you might face in implementation, and whether you should be thinking about building, buying, partnering, or some combination. After that, you can even get a spec document in what we call our technical blueprint that gives either your developers or the developers of the

Starting point is 00:10:42 partner you work with what they need to build exactly the agent that you're looking for. If you want to learn more about Superintelligent's agent planning suite, we built a custom GPT to answer your questions. Just go to bit.ly slash super super super agent. That's bit.l.combe, and if you have any questions, the agent can even help you book an appointment with our team. As a founder, you're moving fast towards product-market fit, your next round, or your first big enterprise deal. But with AI accelerating how quickly startups build and ship, security expectations are higher earlier than ever. Getting security and compliance right can unlock growth, or stall it if you wait too long. With deep integrations and automated workflows built for fast-moving teams,

Starting point is 00:11:27 Vanta gets you audit-ready fast and keeps you secure with continuous monitoring as your models, infra, and customers evolve. Fast-growing customers like LangChain, Writer, and Cursor trusted Vanta to build a scalable foundation from the start. And look, as someone who lives in the world of enterprise procurement, I love how Vanta makes it easy to get compliance right. The last thing you need when you're trying to win that big deal is to have it scuttled by something that Vanta has solved for over 10,000 companies. Go to Vanta.com slash NLW to save $1,000 today through the Vanta for Startups program and join over 10,000 ambitious companies already scaling with Vanta. That's VANTA.com slash NLW to save $1,000 for a limited time.

Starting point is 00:12:08 AI isn't a one-off project. It's a partnership that has to evolve as the technology does. Robots and Pencils works side by side with clients to bring practical AI into every phase: automation, personalization, decision support, and optimization. They prove what works through applied experimentation and build systems that amplify human potential. As an AWS-certified partner with global delivery centers, Robots and Pencils combines reach with high-touch service. Where others hand off, they stay engaged, because partnership isn't a project plan. It's a commitment. As AI advances, so will their solutions. That's long-term value. Progress starts with the right partner. Start with Robots and Pencils at robotsandpencils.com.

Starting point is 00:12:50 Welcome back to the AI Daily Brief. Today we have a bunch of interesting stories in and around AI coding. We've got some anecdotes from inside big companies around just how much code is being written by AI. We've got a monster funding round for an agentic coding company, alongside a really interesting personnel move, where one of the leading voices around AI engineering is going to that company, and we also have some really interesting and impressive results on a coding benchmark that I think altogether can help us understand where AI and agentic coding is right now and where it's headed. Now, to get into this, I want to take a step back to where we were. It's been clear throughout most of the year that agentic coding was one of the most important and breakout AI use cases.

Starting point is 00:13:29 Early on in the year, we had the CEOs of Microsoft and Alphabet talking about how much of their code was being written by AI. The numbers that were shared in and around the March and April time frame were around 30% in both of those cases, although by June, Microsoft officials had upped that number to 40%. Around the same time, Zuckerberg said that while he wasn't exactly sure what percentage of code was being written by AI, estimating between 20 and 30%, he thought that they were on a trajectory to be at around 50% by the next year, in 2026. Now, it was also around this time that Anthropic CEO Dario Amodei made a really big and bold prediction that over just the next three to six months, AI might be writing up to 90% of all code. At a Council on Foreign Relations event

Starting point is 00:14:10 in March, he said, I think we will be there in three to six months where AI is writing 90% of the code. And then in 12 months, we may be in a world where AI is writing essentially all of the code. Now, one of the things that even then people got was that the prediction was probably more about where AI's coding capabilities were and less about the complex integration and organizational inertia which would hold it back. In other words, to the extent that that number wasn't being hit, it would be less about AI's capabilities and more about just the normal slowness in human adoption of new technologies. Now, recently, people have been arguing that this prediction is just kind of off. I shared recently The Information piece, which actually asked Claude to grade

Starting point is 00:14:47 the prediction and gave it an F. It's very clear that even if in practice the number was an overshoot, directionally, Dario seems oriented towards the future that we're getting. Indeed, one of the problems with quoting such a high number is that it then makes numbers that otherwise would seem extreme seem not so extreme. For example, Coinbase CEO Brian Armstrong recently tweeted that around 40% of code was now being written by AI at Coinbase and that he was aiming for that to be 50% by October. Well, now we've got another report from Robinhood's CEO Vlad Tenev. In a recent podcast appearance, when asked what percentage of Robinhood's new code is generated by AI, he said he thought that it was around 50%, although the nature of the new systems makes it a little hard to know.

Starting point is 00:15:26 He said, we've moved from GitHub Copilot, which is an auto-complete system, to Cursor and now things like Windsurf, where nearly all of the code is written by AI. It's hard to even determine what the human-generated code is. He estimated, in fact, that the new code that was written by humans was now officially in the minority, i.e. less than 50%. He also said that among the company's engineers, close to 100% were using AI code editors. Now, you'll note that one of the benefits that a company like Robinhood has is that it sounds like they're not locking their people into tools like GitHub Copilot, and the high adoption rate probably reflects that. Now, obviously, Robinhood and Coinbase are tech-forward companies. They were what we would have called startups just a few years ago, and so some big enterprises

Starting point is 00:16:06 might be tempted to dismiss this high percentage of code now being written by AI as a byproduct of their natural tech forwardness. I tend to think, however, that savvy organizations will take the exact opposite response and think to themselves that if the guys closest to technology are getting up to 40 or 50% of their code written by AI, those might serve as interesting goals for our organization or enterprise to go after as well. Even Dario isn't giving up the ghost on his 90% number. In a recent interview with BBC,

Starting point is 00:16:32 he said that that's about what it was at Anthropic, 90% of the code being written or at least suggested by AI. And part of the reason that this is on such an upswing is that the tooling is just getting better. One of the really interesting phenomena coming after the GPT-5 launch is the way that we've gone from the phase of people being disappointed with GPT-5,

Starting point is 00:16:49 to OpenAI's coding platform Codex starting to see a dramatic upswing and a reclaiming of certain ground that they had lost to Anthropic's Claude Code. It was very clear during the GPT-5 launch that the use case that they cared most about and were most trying to catch up in was AI-assisted and agentic coding, and that play seems to be paying off. Last week, Sam Altman tweeted, really cool to see how much people are loving Codex. Usage is up around 10x in the past two weeks.

Starting point is 00:17:16 Lots more improvements to come, but already the momentum is so impressive. On the same day, AI researcher Yam Peleg wrote, The Codex CLI hype is real. I just tried it. GPT-5 high in Codex is great. It stays on track much longer than Opus, never gives up on your task even if it takes a while, much longer context window,

Starting point is 00:17:33 no arguing or "you're absolutely right." Pieter Levels wrote just this morning as I'm recording this on Tuesday, September 9th, things in AI change so fast. In July, everyone switched from Cursor to Claude Code, in September, everyone switches back from Claude Code to Cursor or Codex, after Anthropic allegedly decreased the quality of responses to save money. By the way, Anthropic is strenuously denying that claim of intentional model degradation.

Starting point is 00:17:56 The point is just that the tooling is getting way, way better. Whether you're on Claude Code or Cursor or Codex, all of these things are just light years ahead of where the tooling was even just nine or 12 months ago. Now, speaking of tooling, we got a big funding announcement yesterday, this time from Cognition, who are the makers of Devin. The company raised a fresh $400 million at a whopping $10.2 billion post-money valuation. The company has seen incredible growth in just a year. Eric Glyman from Ramp wrote,

Starting point is 00:18:23 Last year, pre-revenue, bold demo, borrowed Ramp's office on a Sunday to film a launch video. This year, $10 billion company, revolutionary product, core contributor to Ramp's codebase. Congrats to the team at Cognition, living proof of how fast AI is moving. Now, alongside the announcement, we got some really interesting details from the Cognition team. They took the chance to talk about how much has changed when it comes to agentic software engineering in just a year. They wrote, We founded Cognition last year to build the future of software engineering. We envision a world of software abundance where engineers become architects, solving the most

Starting point is 00:18:55 challenging problems, and focusing on their creative visions while tasking an army of autonomous agents to support them on everything else. Last March, when we launched Devin, the AI software engineer, it wasn't clear this future would ever become a reality. While Devin could knock some tasks out of the park, for the most part, it was still a very junior engineer. Today, the baseline has changed drastically. While we are still in the earliest innings of AI code, agents are already doing real work alongside individual developers and within large enterprise engineering teams. As with many technologies throughout history, what first seemed a fringe theory quickly became an obvious reality. The company shared that Devin's ARR had grown from a million

Starting point is 00:19:30 dollars annualized back in September 2024 to 73 million in June of this year. What's more, the company's last-minute acquisition of Windsurf, which happened in the wake of the deal between Windsurf and OpenAI falling through, seemed to provide a windfall for the company. They wrote, our acquisition of Windsurf more than doubled our ARR. More importantly, it gave us the complete product suite for AI coding. Today, the two main forms of AI coding tools are IDEs and agents. Engineers naturally want both. The IDE for when you want to make each decision yourself,

Starting point is 00:20:00 with a speed up from AI assistance, and agents to delegate complete tasks asynchronously. The combination seems to be working, as they say their acquisition of Windsurf more than doubled their ARR, and their combined enterprise-focused ARR is up 30% in the seven weeks since buying Windsurf. Now, this combination of offerings was a big part of why Shawn Wang, better known to you guys as swyx, the host of Latent Space, the curator of the AI Engineer Summit and the AI Engineer

Starting point is 00:20:25 World's Fair, announced alongside the deal that he would be joining Cognition full-time. He wrote a post, which you can find at swyx.io slash cognition, about his reasons for doing so, and it is absolutely not just some LinkedIn post about taking a new job. Instead, it's a meditation and a reflection on the state of AI coding right now. He actually divided his reasoning into five different buckets: a non-technical thesis, an agents thesis, an engineer thesis, a business model thesis, and a team thesis. The combination of Devin and Windsurf was a big part of the engineer thesis. He called it local agent speed plus cloud agent capacity.

Starting point is 00:21:02 Windsurf for individual speed, Devin for unlimited parallel capacity. And what's interesting to me about this is that I think in this case, AI coding is modeling what are likely to be broader adoption patterns across other use cases as well. Basically, what Cognition has done with this acquisition is combine the individual productivity use case, i.e. the human augmentation use case, with the automated non-human opportunity of agents as well. Point being that it's not going to be one or the other, it's going to be a both-and. swyx, by the way, called this owning the sync/async spectrum. A couple of other things that I thought were really interesting from this post and that are

Starting point is 00:21:36 worth chewing on. Shawn said that his central realization was, Code AGI will be achieved in 20% of the time of full AGI and capture 80% of the value of AGI. What this led him to was to, quote, just do code AI now rather than later. He also makes an interesting argument about why agent labs versus model labs have some advantages heading into the next generation of AI competition. Shawn wrote, from 2015, when OpenAI was founded, to 2025, the right place to work was clearly at model labs, seeing through the three paradigms of pre-training, scaling, and reasoning. Now, however, he wrote, the business justification for agent labs is simple.

Starting point is 00:22:11 I can't keep up with all this AI stuff. Let me hire the guys who nerd out about this all day to keep us on top of things. That is about the level of abstraction that the lower 90% of the AI bell curve can deal with. He then explores a few different reasons why agent labs will have relative alpha compared to model labs. He argues that agent labs are product first, and that while model labs create frontier models, it's the job of the agent labs to adapt them to domains they don't fully solve yet. Like I said, it's a really interesting rumination, and honestly, I think that if there's any takeaway, it's that even though this round is being done for Cognition at a $10 billion valuation, this still feels so early innings for what agentic coding and AI coding are

Starting point is 00:22:49 actually going to do in the world. On that front, a last note before we head out for today, Blitzy, who is of course one of the sponsors for this show, just announced some wild results on the SWE-bench Verified test. The company is claiming 86.8% performance, which would represent more than a 13% improvement over the previous best. And to be a bit overly reductive, they basically solved it by radically extending the thinking time available to agents to solve what were previously considered unsolvable problems. Said Sid Pardeshi, CTO and co-founder, the unsolvables weren't actually unsolvable. They just required deeper thinking than System 1 AI could provide.

Starting point is 00:23:26 He continued, by design, Blitzy enables AI to think for hours or days rather than seconds or minutes, unlocking solutions to problems that stumped every previous approach. This validates inference time scaling as the key to exponential capability improvements. Now, given how much I've talked about how benchmarks are washed and how consistently we talk about where the next big improvements in AI performance are going to come from, I think we're going to dig into this a little bit further. But again, the reminder here is that for as obvious as AI and agentic coding have become, we are still so barely scratching the surface on how it collectively is going to change the world.

Starting point is 00:23:59 For now, though, that's going to do it for today's AI Daily Brief. Appreciate you listening or watching as always. Until next time, peace.
