The AI Daily Brief: Artificial Intelligence News and Analysis - AI to Write 90% of All Code Soon?

Starting point is 00:00:00 Today on the AI Daily Brief, a prediction that AI will write basically all the code in the world in under a year. Before that in the headlines, OpenAI releases new tools for building agents. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. To join the conversation, follow the Discord link in our show notes. Today, we have a story that could easily have been the main episode. I just thought the conversation around Dario Amodei's predictions around AI coding was so interesting that I wanted to go a little bit deep on it. OpenAI has released a new suite of agentic tools which absolutely are going to accelerate the agent platform wars.

Starting point is 00:00:42 They released this with a huge new breakdown of everything that was included. The tool set includes their new Responses API, which they say combines the simplicity of the Chat Completions API with the tool use capabilities of the Assistants API for building agents, built-in tools including web search, file search, and computer use, a new Agents SDK to orchestrate single-agent and multi-agent workflows, and integrated observability tools to trace and inspect agent workflow execution. Now, sometimes with an announcement like this, the Twitter threadboys are useless because they're so caught up in their own desire to hype things up that they don't actually have anything

Starting point is 00:01:14 substantive, but sometimes, especially when it's an extremely dense technical announcement, they can be very useful for breaking it down. So let's turn to Elvis here because he did a great summary of what was actually released. He writes, OpenAI has already launched two big agent solutions like Deep Research and Operator. The tools are now coming to the APIs for developers to build their own agents. The first built-in tool is called the web search tool. This allows the models to access information from the internet for up-to-date and factual responses. It's the same tool that powers ChatGPT search.

Starting point is 00:01:44 Powered by a fine-tuned model under the hood. The second tool is called the file search tool. This is useful for agentic RAG-related use cases. It now supports metadata filtering and a direct search endpoint, which enables direct search to your vector databases. The third tool is the computer use tool. This is like Operator, available via the APIs. It allows you to control the computer you operate. This comes with the computer use model that's used by Operator.

Starting point is 00:02:08 Elvis continues, they also announced the Responses API. Unlike the traditional Chat Completions API, this new API is flexible enough to support multiple turns and tools more natively. Elvis continues, you can also pair tools together with the Responses API. It can call multiple tools at once and give you a final response in one request. The computer use tool can also be used with the Responses API. You can add instructions and customize the display. What about those multi-agent systems? Well, Elvis continues, OpenAI has also made Swarm, their agent orchestration framework, more production-ready.

Starting point is 00:02:39 It has been rebranded to the Agents SDK. It uses the Responses API under the hood, but other vendors are also supported. The Agents SDK, which is open source, supports building multi-agents out of the box. The triage agent can hand off tasks with the relevant context to execute tasks. It also supports monitoring and tracing out of the box, which can be used for debugging your agents. The tracing UI is also available to track traces of your agentic workflows. Big thanks to Elvis, who is himself an agent builder, for that simpler breakdown. Basically, what's going on here is that OpenAI is asserting its place and its offering for developers in the white-hot agent building space. It's very clear that even though OpenAI is

Starting point is 00:03:17 absolutely, I think, going to build some number of agents that they want to own themselves to keep close to the relationship with the customer, they also recognize that they're not going to be able to build everything, but they do want a piece of everything. Olivier Godement writes, there are some agents that we will be able to build ourselves, like Deep Research and Operator. But the world is so complex, there are so many industries and use cases, and so we're super excited to provide those foundations, those building blocks for developers to build the best agents for their use cases and their needs. Trying to explain the relationship between the Responses API and the Agents SDK,

Starting point is 00:03:49 Project Manager Nikunj Handa writes, The Responses API is like this atomic unit of using models and tools to do a particular thing. The Agents SDK is having multiple of those atomic units work together to solve even more complicated tasks. But what does this actually mean in practice? Simon Taylor writes, OpenAI's Responses API and Agents SDK is a huge moment for the AI Platform Wars. The goal is to make building workflow agents trivially easy. You can do things like connect to browsers, files and apps, chain multiple agents together, and monitor performance in real time. Most startups spent the last year building what OpenAI just gave away for free. Here's what it replaces. Months of prompt engineering and

Starting point is 00:04:25 iterating, complex orchestration logic, endless fine tuning and testing, i.e. observability and eval, and ultimately this means that OpenAI is trying to be the all-in-one platform. Will it work? The bargain is we'll make the tooling easy if you use our LLM, but you can't use Claude 3.7, which many like, yet for many developers this will be tempting. This isn't the end of the competition, it's the beginning. There's now two visions for the world,

Starting point is 00:04:47 Claude's open Model Context Protocol, and OpenAI's tool use SDK and Responses API. And I think he's absolutely right that this is a major, major moment in the agent platform wars, which will dictate the shape of a lot of things to come in the coming months. That was, in fact, not the only OpenAI news, however. One of the things that GPT-4.5 is clearly better at is writing. And yet, OpenAI seems to also have a new writing-focused agent, or at least a new model in development. Yesterday, Sam Altman tweeted, We trained a new model that is good at creative writing, not sure yet how and when it will get

Starting point is 00:05:20 released. This is the first time I've been really struck by something written by AI. It got the vibe of metafiction so right. Now, for the sake of the headlines, I will not read the short story that Sam attached, but you can be assured that the fourth wall was decimated by this model's metafiction. Another rumor is percolating from a subtle mention in OpenAI's API changelog. The post referenced a model called o3 Mini Pro. When prompted to fix the typo, Adam.GPT, who does go-to-market for OpenAI, commented, I don't see any typos. Although we don't have any official information, you can probably figure out what the model does based on the name. If it follows the same convention as o1 Pro, it will be a more capable version of the underlying model that uses

Starting point is 00:05:58 significantly more inference. Still speaking about the naming convention, Chubby commented, Please don't. Don't make an o3 Mini Pro next to o1 Pro and o3 Mini and o3 Mini High and o3 and o3 Pro. Please don't, OpenAI. Lastly today, Meta has begun testing their in-house chips designed for AI training. According to Reuters, the first batch has arrived from TSMC and Meta has set up a small cluster for testing. One source mentioned that the chip is a dedicated AI accelerator rather than a GPU, which could make it more power efficient. This is the first so-called tapeout for the chip, the process of finalizing the design and completing the first test run. It's very common for chips to go through multiple tapeouts to refine the design and fix

Starting point is 00:06:34 issues before production is ready to ramp up. Each tapeout typically takes between three and six months. Meta has deployed custom AI chips before, but only for inference rather than training. Indeed, one effort to develop an inference chip in 2022 went pretty badly awry, leading Meta to scrap the project and pivot to becoming Nvidia's largest customer in an effort to catch up in the AI race. If this test is successful and Meta can ramp up production, it will be a big step towards reducing reliance on Nvidia. The timeline for that is still at least six months away, even if everything goes according to plan.

Starting point is 00:07:04 Still, the infrastructure buildout continues apace. For now, that is going to do it for today's AI Daily Brief headlines. Next up, the main episode. Today's episode is brought to you by Vanta. Trust isn't just earned, it's demanded. Whether you're a startup founder navigating your first audit or a seasoned security professional scaling your GRC program, proving your commitment to security has never been more critical or more complex.

Starting point is 00:07:28 That's where Vanta comes in. Businesses use Vanta to establish trust by automating compliance needs across over 35 frameworks like SOC 2 and ISO 27001. Centralize security workflows, complete questionnaires up to 5X faster, and proactively manage vendor risk. Vanta can help you start or scale up your security program by connecting you with auditors and experts to conduct your audit and set up your security program quickly. Plus, with automation and AI throughout the platform, Vanta gives you time back, so you can focus on building your company.

Starting point is 00:07:59 Join over 9,000 global companies like Atlassian, Quora, and Factory, who use Vanta to manage risk and prove security in real time. For a limited time, this audience gets $1,000 off Vanta at vanta.com slash NLW. That's VANTA.com slash NLW for $1,000 off. There is a massive shift taking place right now, from using AI to help you do your work, to deploying AI agents to just do your work for you. Of course, in that shift, there is a ton of complication. First of all, of these seemingly

Starting point is 00:08:32 thousands of agents out there, which are actually ready for prime time, which can do what they promise? And beyond even that, which of these agents will actually fit in my workflows? What can integrate with the way that we do business right now? These are the questions at the heart of the Superintelligent Agent Readiness Audit. We've built a voice agent that can scale across your entire team, mapping your processes, better understanding your business, figuring out where you are with AI and agents right now in order to provide recommendations that actually fit you and your company. Our proprietary agent consulting engine and agent capabilities knowledge base will leave you with action plans, recommendations, and specific follow-ups that will help you make your next steps

Starting point is 00:09:12 into the world of a new agentic workforce. To learn more about Super's Agent Readiness Audit, email agent@besuper.ai, or just email me directly, nlw@besuper.ai, and let's get you set up with the most disruptive technology of our lifetimes. Today we are talking about a topic that has absolutely lit up AI Twitter for the past day or so, which is a prediction from Anthropic CEO Dario Amodei that AI will write 100% of code or nearly 100% of code within a year. Where this comes from is Amodei sat down at the Council on Foreign Relations for a wide-ranging interview. The discussion covered the future of AI leadership, the role of innovation in geostrategic competition, and the outlook for frontier models.

Starting point is 00:09:54 Still, it was his comments about the pace of worker replacement in the tech sector that have gone viral. Dario said, if I look at coding, which is one of the areas where AI is making the most progress, what we're finding is that we are not far from a world, I think we'll be there in three to six months, where AI is writing 90% of the code. And then in 12 months, we may be in a world where AI is writing essentially all of the code. Now, for those of you who are now using all of these text-to-code tools, it may seem completely obvious. Still, for Dario, it's a significant acceleration of the timelines that he had previously expressed. In fact, I think it's the first time, at least the first time that I've seen, that he's actually given a concrete

Starting point is 00:10:30 forecast for the adoption of automated AI coding. When he was doing the interview circuit back in Davos in January, Dario only spoke in general terms about the overall workforce, for example, stating, I've never been more confident than ever before that we're close to powerful AI systems. What I've seen inside Anthropic and out of that over the last few months led me to believe that we're on track for human-level systems that surpass humans in every task within two to three years. So what's been happening since then that might make this timeline feel like it's accelerating, at least when it comes to coding? Dario's company Anthropic is obviously a big part of this. They released Claude 3.7 Sonnet alongside their agentic coding tool, Claude Code. Both

Starting point is 00:11:06 releases represent major progress for AI coding assistance. Of course, surrounding these tools, we've also seen the rise of vibe coding. This is the idea of people who were not coders before being able to use a tool like a Lovable or a Bolt to actually build applications. It's got people thinking in completely new ways about what it means to build software, but also what it means to be a creator more broadly. Riley Brown, who got his start as the biggest AI TikToker, has completely shifted to building apps as a form of content. He's convinced that this is where all creators are headed,

Starting point is 00:11:38 and a few weeks ago on February 18th, he shared a Google search comparison of prompt engineering as a term versus vibe coding as a term and said, checking on this in one year. He then came back yesterday to show that vibe coding had actually already started to surpass prompt engineering as a search term. Riley added, it only took three weeks. Very clearly then we are in the middle of a paradigm shift for software engineering work, and it feels like somewhere along the way we might have passed an inflection point. Now, this has been a big topic in industry conversations that are going on right now. For example, the discussion about AI coders replacing human engineers

Starting point is 00:12:14 was a big topic at South by Southwest, which is currently still happening down in Austin. Responding to Amodei's prediction, IBM CEO Arvind Krishna was very skeptical. He commented, I think the number is going to be more like 20 to 30% of the code could get written by AI, not 90%. Are there some really simple use cases? Yes, but there's an equally complicated number of ones where it's going to be zero. Now, while, as you'll see, I'm not sure I agree with that point, he did add something that I think is an extremely important narrative and something that you've probably heard on the show before, saying, if you can do 30% more code with the same number of people, are you going to get more code written or less? Because

Starting point is 00:12:50 history has shown that the most productive company gains market share, and then you can produce more products, which lets you get more market share. This is basically a way of him saying, the winners of AI will not be those who choose to do the same with less, but those who choose to do either more with the same or way more with a little more. So Arvind, if you happen to be a listener, thanks for amplifying this point. Mark Cuban had a similar take during his panel at the conference. While he didn't touch on AI coding, he made a more general point about work replacement saying AI is never the answer. AI is the tool. Whatever skills you have, you can use AI to amplify them. And while I don't disagree with that, I think the evidence is starting to show something a little bit different. For example,

Starting point is 00:13:28 back in October, when frankly coding assistants were less capable than they are now, Google's Sundar Pichai said during Google's Q3 earnings call, today more than a quarter of all new code at Google is generated by AI, then reviewed and accepted by engineers. So if the IBM CEO is saying that it'll only be 20 to 30 percent, but back in October, Google was already seeing 25 percent of new code coming from AI, someone's going to be wrong here. From a technical perspective, one of the questions raised by Amodei's comments is what improvements do we need to see over the next three to six months to get to 90 percent of code being AI generated? And by extension, what are the challenges that need to be solved to get that number to 100 percent within a

Starting point is 00:14:07 year's time? One of the big divides in this moment is the entrepreneurs and enthusiasts spinning up new products, this huge rise of vibe coders, for example, and on the other hand, the professional programmers working with enterprise codebases. Many of the current tools are unbelievably good at building prototypes, hunting bugs, letting people go from zero to one really quickly. That's not the same as being able to scale to huge enterprise codebases where lots of people have to contribute. If you go check out the Claude Reddit, for example, you can find lots of examples of struggles with situations that deal with things like multiple files across an enterprise code base, and that's without even getting into issues involving coordinating multiple engineers working on

Starting point is 00:14:44 the same codebase in parallel. There are also more mundane constraints on how fast this shift will happen. Melanjanan, for example, pointed out, you can't sign a contract saying your code is secure if none of your employees have read it or understand how it works. Of course, none of these issues are insurmountable. They make it more difficult to go full AI at the enterprise level, but they also create a pretty damn big honeypot for someone who wants to take on this particular set of challenges. Developer Nick Dobos wrote, Keep seeing versions of AI coding is great until your app gets too complex for the AI to handle.

Starting point is 00:15:15 Some thoughts. One, codebases are going to 100x to 1,000x in size over the next 10 to 20 years. Keeping it all in your head will not be viable. Two, this unknown large codebase is already the norm for large companies. The size, maturity, and turnover of modern companies as well as use of libraries and packages means huge pieces are and always will be unknown to you.

Starting point is 00:15:33 You still need to be productive in this labyrinth. Three, AI coding will get more powerful, and with it the ability for an AI to understand this labyrinth will grow. You simply need to ask for summaries and ask questions about the code as you go. Trying to keep this much info memorized in your brain will not work. Four, the number one skill of an AI programmer right now is constraining AI and providing just enough context when giving commands for the AI to do things correctly. Five, this will grow to include knowing what questions to ask your codebase,

Starting point is 00:15:58 so you have only enough info and context to do your tasks correctly. What's more, Sicky Bilstein, the CTO of Maza, suggested that some amount of this may be a skill issue. He wrote, it's absolutely insane how fast you can move in a well-structured codebase with Cursor. It is absolutely a travesty how slow it is to move in a Gordian knot of spaghetti code. And so, of course, one of the questions becomes, if AI is writing all the code, does that mean we don't have any software engineers? Dario actually addressed this in the next part of the interview.

Starting point is 00:16:27 He said, the programmer still needs to specify, what are the conditions of what you're doing? What is the overall app you're trying to make? What's the overall design decision? How do we collaborate with other code that's been written? How do we have some common sense on whether this is a secure design or an insecure design? So long as there are small pieces that a human programmer needs to do that an AI isn't good at, I think human productivity will actually be enhanced. I think that's true.

Starting point is 00:16:49 But I also think that when people hear statements like this, they have a tendency to lump them in together with some common tropes that you hear around AI right now. Things like AI is just going to replace the tedious tasks, or this little chestnut that you probably see on LinkedIn about 20 times a day, AI won't replace you, a person using AI will. To put my cards on the table, I think these sentiments reflect staggering amounts of cope and/or not seeing where things are headed. When people who are heading these foundation labs are talking about AI being better at every

Starting point is 00:17:20 task than humans in two to three years, that means every task, not just the tasks that are tedious. My base case then is that effectively 100% of the tasks, quote unquote, that we do today, are going to be done by AI in the future. Where I think people lose the plot and lose the nuance here is that that does not mean a priori that jobs go away. Instead, I think that on a fundamental level,

Starting point is 00:17:46 jobs change. And I think about 100% of our jobs are going to change. Simply put, we are not going to be task executors and doers in the future. People are going to be generals of their own little armies and CEOs of their own little companies where agents and AIs are doing the tasks that they once had done themselves.

Starting point is 00:18:02 The reason that I'm so bullish is that I think that all of this leads to just more being produced, exactly what the CEO of IBM was talking about, which is a recognition that market winners always produce more and win market share by producing more better things. That's what AI is going to enable, but the path to it enabling that is going to be a complete, effective 100% replacement of the way that jobs are structured now with a totally new way of working. And by the way, from where I'm sitting, I don't think that Dario's timelines are all that crazy. Yes, I think that there are very real structural constraints and human constraints and legal constraints and inertia constraints that will slow this down in the

Starting point is 00:18:43 enterprise. I think that is absolutely undeniable. But when you get outside of the enterprise sector, it is not hard to see examples of how fast this is changing. Y Combinator partner Jared Friedman recently said that one quarter of YC founders said that 95% of their codebase was AI-generated. That means a quarter of the companies that have come out of the most important accelerator in the world are seeing nearly all of their code generated by AI. Sahil from Gumroad says, we're already at 100%. If you're writing code, you're making a conscious choice not to ask AI to write the code for you. Fine, but it's a choice, similar to doing your dishes by hand instead of using a dishwasher.

Starting point is 00:19:20 Like I said, I do not think that this means that all software engineers are going to lose their jobs. As Adi from Trade Your Mean put it, software engineering has never been primarily about writing code. The code is downstream to thinking precisely about how to model a particular domain. I think he's right that that's going to be an even more important skill in the future. What's more, there is also this other interesting lurking question for me, which is that if AI does write all the code, but AI remains in a place where it's not particularly good at inventing new things, it's just good at pattern matching old things, does that mean that we're going to be stuck with the programming languages we have right now in the state that they are forever?

Starting point is 00:19:54 If engineers aren't in there actively working with the code, do we lose out on some of that progress? Gugliot-Rose writes, Interesting thing, Dario's prediction will not be true for the most critical software, like Linux. There is no AI to write that code which has never been written before and needs to be as efficient as possible. So there are still interesting things to explore even in this 100% vision of the world. Anyway, I think at this point you can probably see why this has been such fodder for conversation. It is, yes, about coding, about this huge important breakout area of AI and agent innovation, which has become even more broadly important with the

Starting point is 00:20:28 rise of vibe coding, which has allowed all the people who are non-technical and non-developers to start participating in it. But it's also touching on the broader questions of job displacement more generally and how work looks in the future. Hopefully your mind's now whirring with some new thoughts. Let me know in the comments, of course, whether you think the prediction is right on, crazy, or not ambitious enough. For now, though, that is going to do it for today's AI Daily Brief. Appreciate you listening as always. And until next time, peace.
