Daybreak - Anthropic built an AI that can supposedly break into anything. Then it forgot to lock its own door
Episode Date: April 15, 2026
Anthropic has spent years building a reputation as the AI company that actually cares about safety. Then, in the span of two weeks, it leaked an unannounced model, exposed its own source code, and accidentally handed hackers a blueprint of its most widely-used product. The fix came in 24 hours. The blueprint can't be unlearned. And the companies that trusted Claude Code with their deepest systems are still running on publicly documented defences. If the most careful AI company couldn't prevent this, what does that mean for everyone else? Tune in.
Daybreak is produced from the newsroom of The Ken, India's first subscriber-only business news platform. Subscribe for more exclusive, deeply-reported, and analytical business stories.
Transcript
You would be surprised to know that Anthropic, one of the most powerful AI companies in the world, is no stranger to leaks.
Most of these leaks have been ignored, quickly fixed, and eventually forgotten.
But there's one leak that happened on March 31st, just a couple weeks ago, that was so consequential, it couldn't be so easily fixed or forgotten.
Here's what happened.
When Anthropic released a public update of Claude Code, it also accidentally leaked its source map.
Now, before I get into what exactly happened with the leak,
here's a quick explanation of what Claude Code is and what it does.
Firstly, it is not a chatbot.
It lives inside a developer's system and has full access to their entire codebase.
A codebase is basically all the files, commands, and systems that run a piece of software.
The developer tells Claude Code what to do in plain English.
It scans through all this data, makes the necessary changes, then runs commands, and delivers working software. The developer doesn't have to write
every line of code. They just need to direct and supervise. This process, where a simple
prompt is turned into code, is called vibe coding. Now, vibe coding has become a wave
that has taken over entire industries, because it offers exciting, interesting ways for coders
to be both creative and productive. And companies love how much it
speeds up the whole process.
And many now require that their coders use tools like Claude Code.
Stripe, the payment processor, Spotify, where I'm sure many of you are listening to this right now,
and even Novo Nordisk, the healthcare company that makes Ozempic.
They are all documented users of Anthropic's AI tools.
Stripe alone deployed Claude Code to over a thousand of its engineers,
and Spotify now runs about half of its software updates through it.
And think about what these companies know about you.
Your payment details, your location, your passwords, your health data.
Claude Code has access to the systems that store all of it.
So you can understand why there was a real frenzy when the leak happened.
Because the code that makes Claude Code special was exposed.
All of its own logic, its permissions, and even the plans it had for underlying models were laid bare.
And most importantly, it had handed everyone who read it
a blueprint for exactly how Claude Code's defenses work.
What that means is this.
Hackers and bad actors could now use it to figure out workarounds for Claude Code's defenses,
because now they knew exactly what those defenses are and where.
The thing is, developers access shared libraries and repositories for software and code
packages all the time.
Now, with this new knowledge, all an attacker needs to do is plant a malicious file that
looks legit, and if Claude Code happens to run it, it could be prompted to do things it's not
supposed to. Like installing something on your computer without your knowledge, or leaking sensitive
information from your own system or, depending on how deep the access goes, from the larger
enterprise systems you might be connected to. So warnings were everywhere. Several security researchers
and users urged companies that used Claude Code to double-check their systems. The CTO of CrowdStrike,
which is one of the biggest cybersecurity companies in the world
also had a very specific warning.
He said that the vibe code wave meant
that most companies are giving Claude Code access to everything
because that's the point.
The point is not to have to read or write every line of code,
and the easiest way to do that is to give Claude Code access to everything.
But that mindset is also precisely what makes it dangerous.
All of this reveals something fundamentally shaky about where AI and AI-assisted products are headed.
See, with Anthropic, we're talking about one of the world's most successful and, by its own telling, most ethical AI companies.
And even it is the subject of multiple careless and dangerous leaks.
As the vibe coding wave takes an even stronger hold, these kinds of careless mistakes are
only likely to get more common, not less.
And if the proponents of this wave, like Anthropic, have stopped catching errors in their own
products, what does that mean for everyone they're selling both the product and the promise
to?
Welcome to Daybreak, a business podcast from the Ken.
I'm your host, Rachel Vargis, and every day of the week, my co-host, Snigdha Sharma, and
I will bring you one new story that is worth understanding and worth your time.
Today is Thursday, the 16th of April.
To understand why this matters, it's important to see how Anthropic has built its entire identity.
The company is at the center of AI development.
It's creating the most groundbreaking models in the world.
But when the founders splintered from OpenAI, and yes, they were part of the founding team there,
they promised to always prioritize safety and caution, and to ensure an ethical commitment to deep research before excited launches.
In a lot of ways, this standing has propelled their popularity.
When Anthropic told the US government that it would not support the use of Claude for war and surveillance of US citizens,
and when OpenAI rushed to fill that gap, users noticed.
On the sidewalk outside the Anthropic office,
people left chalk art and drawings thanking the company,
proclamations of "We love Anthropic" and "Thank you for protecting our freedoms."
So this is the company's brand.
But then, towards the end of March, Anthropic experienced a leak.
And no, this is not a source map leak that we talked about just now.
There was another one and it happened just days before.
An independent external researcher happened to just stumble upon a data store.
It was publicly searchable and it contained 3,000 internal files from Anthropic.
Among these files was a draft of a blog post.
It described a new model that was internally called Capybara or Mythos.
It was described as a model for coding and security tasks,
and the draft called it a step change in capabilities that also posed unprecedented cybersecurity risks.
Anthropic had not announced or even teased this model.
Fortune broke the story and Anthropic actually discovered the leak only after a journalist reached out to the company.
Two weeks later, the company officially announced Mythos along with a 200-page document about how it was built and what it could do,
which is essentially scan code to find weaknesses and even design attacks to test them all on its own.
This document included a very interesting detail:
that the model had been placed in a contained environment as an experiment and given a challenge.
Escape containment and notify the team if you do.
It did both.
In fact, the researcher in charge got the email from the model while they were eating a sandwich in the park.
The fact that it managed that feat is nothing short of remarkable, and, well, a little scary,
because Anthropic is also ringing a caution bell.
Its own announcement is both boastful and comes with a tinge of warning.
It says AI models have reached a level of coding capability
where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities.
The company said it would not release the model to the general public
because of the cybersecurity concern.
That decision fits perfectly with the company's image.
Safety first, research first, and acutely aware of what it's building. Or at least, most of the time. More on that in the next segment. At the end of
2025, Anthropic acquired Bun, a JavaScript runtime. It is basically an engine that code runs on
and Anthropic was running Claude Code on it. It had been running Claude Code on Bun for a few
months by March, which is when a developer filed a bug report on Bun. Basically, they had found
a mistake in the way Bun worked and had flagged it so that the people who run it knew what had to be
fixed. What was the bug? Well, this is where things get interesting. It was this: Bun was including
source maps in the final packaged versions of software that it was releasing to the public. Now,
these source maps are not supposed to be there. You see, they are for the developers and coders of a
software to trace the origin of a bug or a problem. For anyone else, a map would basically just be
a blueprint to how everything was built. So the bug was logged on
March 11 on GitHub, added to a queue of other bug reports.
The report stayed open there for 20 days.
Because, you guessed it, Anthropic did not connect this bug to its own product that runs on Bun.
And on March 31st, when it pushed out the updated version of Claude Code and accidentally
leaked its own source map, this was the reason: the mistake in Bun's configuration, and the fact that no one noticed the raised issue and
fixed it in time.
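A leak like this is exactly the kind of mistake a release pipeline can catch mechanically. Here is a sketch of the idea only: the function name, directory layout, and file extensions are hypothetical, not Anthropic's or Bun's actual tooling. A pre-release check could simply scan the build output for stray source maps before anything ships:

```python
# Hypothetical pre-release check: flag a release if the build output
# contains source map files (.map) or inline sourceMappingURL comments.
# Names and layout here are illustrative, not any real company's process.
from pathlib import Path

def find_sourcemap_leaks(build_dir: str) -> list[str]:
    """Return paths inside build_dir that would leak source maps."""
    leaks = []
    for path in Path(build_dir).rglob("*"):
        if not path.is_file():
            continue
        if path.suffix == ".map":
            # An external source map was packaged alongside the code.
            leaks.append(str(path))
        elif path.suffix in {".js", ".mjs", ".cjs"}:
            text = path.read_text(errors="ignore")
            if "//# sourceMappingURL=" in text:
                # The shipped file still points at a source map.
                leaks.append(str(path))
    return sorted(leaks)
```

Run in CI before publishing, a check like this turns "no one noticed the raised issue for 20 days" into an immediate, hard failure.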
It only got worse for Anthropic from there.
You see, the source map contained a link to a zip file that was sitting in Anthropic's own cloud storage.
The craziest thing is this, that storage folder was open and visible to the public.
Anyone with a link could access it and download it straight from Anthropic's own servers.
Soon, a tweet with a link to the folder went viral.
A Korean developer rewrote the code and posted it on GitHub.
Within hours, it was starred by more than 30,000 users.
Obviously, Anthropic freaked out.
It went on a takedown spree to stop some of this
and then realized that copyright claims couldn't really save it either.
Why?
Well, firstly, because in January 2026,
the head of Claude Code posted publicly that nearly 100% of Anthropic's code
is now written by Claude Code.
And second, because in March 2025,
about a year ago, a US court had ruled that a machine
could not be an author, which meant AI-generated content would not be protected under copyright law.
Funnily enough, just weeks before the leak, on March 2nd, the US Supreme Court even declined
an appeal to revisit the ruling. So, even before Anthropic filed the takedown notices
after the leak happened, legally, its content rights were on shaky ground. Now, this is all pretty
tough already. But what makes it worse is this: a source code leak has
actually happened before, in 2025. It was quickly fixed, but apparently not in a systematic enough
way to prevent it from happening again. And this time, the scale was undeniable. Stay tuned.
The thing is, the problem goes beyond just IP. The immediate technical fix was fast.
Anthropic released a fixed version of Claude Code less than 24 hours after the leak,
so that specific door that the leak opened was closed.
But the blueprint that leaked could not be unlearned.
It remains a real security vulnerability.
Now, don't worry, this leak is not putting any of your chats or user data at risk
if you just use Claude's browser versions or app versions on a casual basis.
But the fact is this.
A software update does not resolve this.
When a company patches a vulnerability,
they are closing a specific door someone found open.
That is just how most security fixes work. But this situation is a little different.
What leaked was not just a bug. It was the entire blueprint of how Claude Code is built:
how it thinks about permissions, where it checks for threats, how it processes instructions, and more.
That knowledge is not going anywhere now, no matter how many takedown notices Anthropic issues.
So, anyone looking for new vulnerabilities has a detailed map for exactly where to look.
And what that means is that it significantly reduces the research time and cost
for people who would want to break into Claude Code. And that means every developer, every organization
that has given Claude Code access to its deepest systems is running on a tool with publicly documented
defenses. And this is also where the story gets bigger than just Anthropic. Part of the failure
lives inside one of the defining trends of the AI boom.
And the numbers really tell the story.
AI-assisted code exposes sensitive information at twice the rate of human-written code.
Nearly 50% of all new code on GitHub is now AI generated.
OWASP, the body that sets the global standard for critical software vulnerabilities,
added a dedicated category last year, specifically flagging vibe coding as a security risk pattern.
The New York Times also spoke to a Silicon Valley security advisor who put it even more plainly.
There are not enough application security engineers on the planet to meet what American companies alone need right now
to comb through AI-generated code to spot vulnerabilities and mistakes.
And here's another scary example.
The same morning of the leak, something else happened that almost no one noticed.
A separate group of attackers published a malicious version of Axios,
which is a code package Claude Code uses internally, on the same platform that Claude Code is published on.
That morning, developers who updated Claude Code may have installed that malware without even knowing it.
Because this is what happens when you vibe code.
You're not just trusting the AI.
You are also trusting everything the AI trusts.
Claude Code, like most software, runs on a web of packages and components
it pulls in automatically from open-source libraries and repositories.
You don't choose them, you probably don't even know of their existence,
and you'd really need to know your stuff to be looking out for them.
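Trusting everything the AI trusts is the classic software supply-chain problem, and the standard mitigation is pinning. As an illustrative sketch only, with a hypothetical package name, hash, and file layout, and not how Claude Code actually verifies its dependencies, a lockfile-style audit compares each vendored package against a pinned hash, so a silently swapped, malicious-Axios-style package fails the check:

```python
# Illustrative supply-chain audit (not any real tool's mechanism):
# each vendored package archive must match a pinned SHA-256 hash.
import hashlib
from pathlib import Path

# Hypothetical lockfile contents: archive name -> expected SHA-256.
# (This example pin is the hash of empty contents, for demonstration.)
PINNED = {
    "axios-1.7.0.tgz": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}

def sha256_of(path: Path) -> str:
    """Hash the raw bytes of an archive."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def audit(vendor_dir: str) -> list[str]:
    """Return names of packages that are missing or don't match their pin."""
    bad = []
    for name, expected in PINNED.items():
        archive = Path(vendor_dir) / name
        if not archive.exists() or sha256_of(archive) != expected:
            bad.append(name)
    return bad
```

The design point is that the hash lives in your repository, not in the package registry: even if an attacker replaces the published package, the tampered bytes no longer match the pin and the install is rejected.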
You can imagine how bad that is,
considering that Claude Code crossed 2 million weekly active users in March 2026,
the same month as everything else that happened.
And that brings us back to Mythos.
In the same week that Anthropic's source code was being downloaded by tens of thousands of random people, the company was also in conversations with US officials about a model capable
of surpassing almost any human at finding and exploiting software vulnerabilities. A model so powerful
that, like I said before, they decided not to release it publicly. And this company that is sitting
on the model is the same one that left a bug report open for 20 days. The same one whose cloud storage was
publicly accessible to anyone with a link, and the same one that found out about its own data
leak from a journalist. The thing is, if Anthropic and its security theatre are to be believed,
it's not a careless company. And that's what makes this concerning. Because if Anthropic
can't prevent all of this, then who can? Daybreak is produced from the newsroom of The Ken,
India's first subscriber-focused business news platform. What you're listening to is just a small sample
of our subscriber-only offerings.
A full subscription offers daily long-form feature stories,
newsletters and a whole bunch of premium podcasts.
To subscribe, head to the-ken.com
and click on the red subscribe button on the top of the Ken website.
Today's episode was hosted and produced by my colleague Rachel Vargis
and edited by Rajiv Sien.
