Your Undivided Attention - Anthropic’s Mythos Has Changed Cybersecurity Forever. What Now?

Starting point is 00:00:01 Hey everyone, it's Tristan Harris, and welcome to Your Undivided Attention. Now, a generation ago, your bank had a vault. Your medical records were in a filing cabinet. Your car was a physical machine, and an electric grid just ran on dials and switches that someone physically turned on or off. And today, all of those things are digital. The vault is a database. Your filing cabinet is a server. Your car, your Tesla, is a robot on wheels.

Starting point is 00:00:32 And in a world where all these systems are mostly secure, life just gets more convenient and efficient because of all this. But all that comes into question, when suddenly an AI system can break through the security that runs the world. Now, recently you probably heard, Anthropic announced their most powerful AI model yet, Claude Mythos. You've probably read the headlines.

Starting point is 00:00:53 Claude was looking for flaws and vulnerabilities in the software that runs the world, and within just a few weeks and a few hours, it found thousands of them. It found vulnerabilities in every major operating system and web browser. These are systems that human security researchers had thought were secure for years. Now, Mythos was so dangerous that Anthropic shared it with a select group of companies responsible for cyber defense so that they could use it to find and patch the vulnerabilities before anyone else got access.

Starting point is 00:01:21 That plan, though, is already showing cracks. A couple of weeks after the announcement, Bloomberg reported that a group of unauthorized users had gotten into Mythos through one of Anthropic's vendors. And OpenAI announced that they now have a model that's nearly as capable, with Chinese open source models just a few months behind. I actually have been talking to some people who run security at some of the companies that got access to Mythos, companies whose job is to keep us safe from cyberattacks,

Starting point is 00:01:48 and they've told me, you know, this model is a big deal, and we should be concerned about it. So how do we live in a world where a private company suddenly has a skeleton key that can unlock the entire digital world with no government oversight or accountability? And what does this mean for all of us who rely on digital security to go about our lives? To answer these questions,

Starting point is 00:02:08 we've invited two people who spend their careers thinking about AI and cybersecurity. Josephine Wolff is a professor of cybersecurity policy at Tufts University, where she focuses on the economic impact of cyber attacks. And Fred Heiding is a research fellow at the Defense Emerging Technology and Strategy Program at Harvard's Kennedy School of Government.

Starting point is 00:02:29 Josephine and Fred, welcome to Your Undivided Attention. Thanks so much for having us. Thank you so much, Tristan. So let's just start at the top. Why is this recent announcement from Claude about their Mythos model seen as such a game changer? What can it do that the previous AI models or things in cybersecurity could not do?

Starting point is 00:02:46 Fred, let's start with you. There's two really, really big takeaways here. And as you said in the introduction, a lot of cybersecurity today is surviving because we just didn't have enough manpower to test or attack, from the attacker's perspective, everything, and that's just completely changing. These AI models, be that now or in one year or in two years,

Starting point is 00:03:09 they can just automate every part of cyber research or almost every part. So the human factor is gone. The days of human pen testers and security experts are gone. And that's massive. So I think that's the first really big thing. The second really big thing is that this is almost changing from a security problem to an admin problem or a regulatory problem. And we see how Anthropic is working on giving this pre-access to defenders

Starting point is 00:03:34 so that they can use this model before attackers get their hands on it. And that's actually massive. That type of collaboration can be a complete game changer. So there's technical things. There's collaborative things. And both of them are really big. There are some people who criticize that Claude Mythos is just hype. Anthropic is trying to hype their capabilities in their model,

Starting point is 00:03:55 that this is, oh, this is so dangerous. We can't even release it to the public. This is just marketing. It's so they can raise more investor dollars: oh, the thing we're building is so powerful. How do we assess how powerful this is? The first fundamental way to verify

Starting point is 00:04:08 this is just to look at the vulnerabilities that we find, right? And there's a lot of really bad vulnerabilities that could cause a lot of damage that Anthropic managed to find using these AI automated tools. So I think we can definitely say that this is bad. And of course, a lot of people are developing

Starting point is 00:04:24 AI models. Other AI models can also do these things. I think that matters less. We should feel as defenders that this is really bad. We may have a few months' advantage in terms of time as defenders from the frontier labs, but very soon, you know, Chinese

Starting point is 00:04:40 unregulated open weight models, which are just models that everyone can download and use, they will be able to do these same things. So we should use this time to really do everything we can as defenders, but we shouldn't feel safe because, yeah, Anthropic has done a great job with their model, but other companies

Starting point is 00:04:56 will very soon be able to do this, if not now. I want to kind of contextualize what I think Mythos really represents. You hit return on your keyboard and, literally, the command is as simple as find a vulnerability in this system. That's it. You just put it in plain English. You hit return and you come back 30 minutes or an hour later and it's found it. The NSA used to have a statement called NOBUS, or nobody but us, the false idea that, hey, no one else has the capabilities that we have. But suddenly the kind of scarcity around zero-day vulnerabilities that we used to have has turned into kind of an abundance. And we talk about AI

Starting point is 00:05:30 and how it's going to create all this access to things cheaply, but suddenly zero days are now abundant in a way that we also created. And I just want to help further kind of just settle into this picture of what is the world that we're now living in when we hear all that. Josephine? So I think that when we sort of think about the risks that Mythos presents, to me, it's less of, oh my gosh, whichever, you know, powerful country with significant cyber capabilities gets this first is going to be a real risk,

Starting point is 00:06:00 because they're already a real risk, and they're already the people with the time and the resources and the expertise to find these zero-day vulnerabilities. So I think that that, to me, is less of a step change than the idea of sort of who are the people who did not previously have access to these kinds of capabilities who might get them now. And how would that change the landscape in which we've been able to say, you know, okay, well, this is a thing that only China could do, or only China and Russia and North Korea, or whatever the list is, right, I think we're going to have to change our thinking on that

Starting point is 00:06:34 in pretty significant ways. That doesn't mean that we shouldn't be worried about who has access to these tools. I think Anthropic has definitely hyped some things unnecessarily, but I think they're right to be sort of thoughtful and careful about that. And in the world that I think we're looking to, the world that I hope we're looking to,

Starting point is 00:06:53 let me start there, is one in which cyber defense is as easy as cyber offense. And that I think would be a radically different one from any we've ever lived in before in which I say to you, look, finding all of the zero-day vulnerabilities, patching all of them is the work of a few hours,

Starting point is 00:07:11 just like trying to exploit them. And China has much more secure infrastructure than it ever did before, and the United States has much more secure infrastructure than it ever did before. And so do a whole bunch of other countries and a whole bunch of other companies. And finding a vulnerability

Starting point is 00:07:27 that has not already been found by these AI tools is really, really hard and really, really rare. And I think that, to me, is a much better world to live in than the one that it feels like we're sort of heading towards right now of every country is sort of trying to develop more and more offensive cyber capabilities and plant more little footholds and malware in each other's critical infrastructure and try to exploit the fact that none of those systems are perfectly secure. I think a tool like Mythos allows us to imagine a future

Starting point is 00:07:58 in which actually the default is your critical infrastructure is secure and there's a very, very small number of actors who can possibly compromise it. Let's make sure we're touching on a couple points you're raising there. So one is you're mentioning it's not that state-level actors like China couldn't do these things before or they weren't in our systems, they are in our systems, but suddenly there's a question of who has access. So now maybe non-state rogue actors, you know, hacker groups, cybercriminals, terrorists, you know, Iran who is upset at the U.S. for the recent bombing, you know, naturally. Everyone has maximum incentive to use these things, but they had limited tools before. Now suddenly everyone has very good tools, especially if they can get that model.

Starting point is 00:08:39 The other thing you're raising is the idea that in the long term, you can imagine a world where it's defense dominant because everyone's using AI to just patch everything and we just live in a safer, more secure world in general. Maybe we should go back in just a moment and make sure we're setting the table for listeners about what exactly is a zero-day exploit. Why is it called that? And what is a bug bounty? So I think the zero-day piece refers to the time between when it's been discovered and when it's being exploited. So the time people have had to patch it prior to actual exploitation occurring. And the idea is if I try to exploit a vulnerability that we've known about for a year, some people may still be vulnerable, right? Some people may not have downloaded their patches. We know that's true. But if I'm exploiting a zero-day vulnerability, then the idea would be I can get into any system I want in the

Starting point is 00:09:30 whole world because nobody's had a chance to patch that yet. The bug bounties vary a little bit from company to company, but the general model is that tech companies will offer a reward or a bounty to people who don't work for them, but who discover vulnerabilities in their code and report them. So there's this interesting thing where essentially a private company, not a government, has developed something that unlocks all the locks in the world. Fred, one of the things that you were mentioning a second ago is how essentially, you know, with Mythos, the U.S. and one specific private U.S. company called Anthropic happened to have this capability first.

Starting point is 00:10:10 And it happened to be the case that there's several months we think until China will get it. Let's say it's three or four months. So there's this weird thing where we have essentially three or four months for the U.S. to notify the people that it wants to help defend, and then give them early access to patch systems. So we basically just happen to prioritize through the decision-making of a handful of people at Anthropic that we're going to patch a handful of U.S. companies.

Starting point is 00:10:32 So what happens if I'm in the Philippines and I'm running old infrastructure? I'm defenseless now. What happens if I'm in Africa and I'm in Nigeria? I'm defenseless now. What happens if I'm in Germany? And as you said, Fred, there's kind of a time question of maybe this time around we have three months to patch

Starting point is 00:10:48 the systems, but every time further, what if that collapses down to two months, to one month, to one day? Do you want to speak to how you see the cat and mouse game happening in terms of the time horizon? Yeah, I think that's a really good point. And the time horizon is changing a lot. So first to address some of the other things you mentioned, yeah, it gets way easier for small state actors or actors that aren't the big ones, right? Like US and China, it gets way easier for them to launch really devastating cyber attacks, at least for a while, right? Because these AI models can just find vulnerabilities that we haven't found ourselves. And we see that exactly, as you said, with Iran, then it's so cheap to do it now, right?

Starting point is 00:11:26 So I think we will see way more of that. There's a few other interesting remarks I think are worthwhile making. One is that the landscape is changing. As we talk now, Mythos and these AI tools make it way easier for defenders to test our systems, and that's great. But this is very, very short-sighted in a way, because, of course, AI tools are also being used to rewrite technical infrastructure. So our infrastructure, you know, will not look in one year like what it looks like today.

Starting point is 00:11:53 And that's very problematic, potentially good, because AI can write really secure code. But very soon we will be in a world where AI is writing all the code. We have no idea what's going on. They may even write their own programming languages, and AI finds all the vulnerabilities in that. But that basically takes the humans completely out of the loop. And with that amount of just opaqueness, we will not understand what's going on. Then that's a really big problem. I think Fred's absolutely right to say we're going to see more and more AI generated code

Starting point is 00:12:24 that we aren't going to have as much intuition for how it works or where the vulnerabilities may be. But I think that's also in some ways a familiar problem. When you think about code maintenance, we use an enormous amount of software that humans today don't really understand. Not because it was written by AI, but because if you go to any big tech company that's been around for a decade or longer, there's some usually huge body of code that has been in their products for as long as anyone can remember and nobody knows exactly how it works

Starting point is 00:12:56 but they know that if you change anything, everything breaks. So I would say already we have a little bit of this dynamic where there are languages that people used to code in that most people don't know anymore, where there's legacy code that we're sort of stuck with but we don't fully understand or know how to debug, and the question is going to be what do we view as

Starting point is 00:13:17 being the crucial sort of human touch elements here. Or do we view there as being any, right? Are there going to be people signing off on this? If so, what does that entail? What kinds of tests are they going to be running? How good, how effective are those tests? I think there's a lot of uncertainty there around how well we can assess any of these things using the AI tools themselves. So I agree that it's worth thinking about and worth preparing for. I also think that to some extent this is a challenge we're already facing. And I think there will definitely be new challenges and new potential adversaries, right? If the AI tools themselves are working at odds with the people who designed them or the people who are deploying them. I'm less pessimistic about the

Starting point is 00:14:05 idea that this will be so much worse than the world that we live in today. I think it's certainly a possibility. But I think it could also help sort of fix a lot of the challenges we've had around what happens when you're not one of the biggest tech companies in the whole world. If you're an open source developer and you're trying to secure your code, then having access to the same kinds of tools that the biggest tech companies are using could be a real game changer. So I guess I'm confused a little bit about why we shouldn't be more concerned because Anthropic only chose those first, whatever it was, 12 to 20 companies to partner with, and then the rest of the world is sort of just screwed, where they're just

Starting point is 00:14:43 vulnerable. So is the world that you're talking about dependent on Anthropic turning around and making sure that they're just going to GitHub and basically automatically patching everything across all of GitHub in some automated way? What is the world that you're envisioning that enables the lower risk? Yeah, I think for it to be an equalizer, you have to have pretty widely accessible tools. I agree with Fred that I think those are coming,

Starting point is 00:15:04 whether we want them or not. But I also, I would say, and again, I don't mean to be too Pollyanna-ish about this, 20 tech companies could be a lot of code all over the world. It's not, you know, if you go to Microsoft, you are not just talking about patching machines in the United States. You are not just talking about a small piece of the world whose software you're trying to protect. There is a small number of tech companies that control a lot of the most widely deployed code in the whole world. So I don't know if that's the right number.

Starting point is 00:15:35 I don't know if this is the right set. But I would not necessarily say that's Anthropic just trying to carve out a tiny little piece of the world to protect. I think it's possible that that is a set of companies that have a very far reach. Yeah, definitely. Go ahead, Fred. I really like to try to bring in the everyday person, the ordinary citizens, so to speak, here as well. And then you really have to think and ask yourself, well, okay, let's say 20 companies are the only ones in the entire world who can secure our systems, who understand our systems, and they don't even understand it. But at least they have an AI that understands it. Everyone else, you know, every single other system is completely,

Starting point is 00:16:13 completely helpless. I don't like that. I don't like that at all. That doesn't feel good to me. And to a large degree, we have had a world where we didn't fully understand our code. That is one of the biggest security problems of our time. However, we did write it, right? There was always someone who could understand it. If all the critical infrastructure, all the power goes down in Massachusetts, for example, someone could figure out how that works. Well, let's say in a future world, all the electricity in Massachusetts goes down and no one has any idea what's happening in the code. And we don't know how to recover from it.

Starting point is 00:16:46 Yeah, I think that's really bad. I mean, we saw what happened during COVID with just crisis everywhere. And it could be so much worse and no one has any idea of how to fix it. I think that's problematic. Yeah, I mean, I lean on the side of this is much worse. So there's this interesting thing. I mean, I'm happy to go back and forth with you, just to opine on this. I just, how do we differentiate between, you know, there's nothing new here.

Starting point is 00:17:10 State level actors had this capability, but now we have just like thousands and thousands more actors who can do this stuff. And then the point that you're also raising Fred is like, how comfortable should we feel that just one company has this capability? So, yeah, how should we think about that, Josephine? So I think one of the open questions that I don't know the answer to is, is there some point at which the AI vulnerability finding systems level out? Right.

Starting point is 00:17:34 So far we've seen, you know, continuous improvement. And the things that the models developed this year can do are much more impressive than the things that the models developed last year. can do. If that continues to be the case for the next 10 years, then you're right. Whoever has the newest, fanciest model has a really significant advantage. I don't know if that is the case or if we're going to sort of hit a little bit of a plateau where everybody has models that can find roughly the same set of vulnerabilities and patch and exploit them to roughly the same degree. My general instinct has been more the latter. There is going to be a very significant improvement

Starting point is 00:18:13 in how well we can find vulnerabilities with AI until there isn't, until we have developed systems that can find most of them, and then we're going to see more of a leveling off. In terms of the sort of what do we do when the AI writes all the code and none of us can possibly understand it, I want to emphasize that's a choice, right? It doesn't mean it won't happen. But if we decide we're going to replace all of the software

Starting point is 00:18:38 powering the Massachusetts electric grid with software written in a language that no human has ever used or has ever tried to code or patch, we will be making a deliberate decision that that's the kind of software we want to be using. And I think, I mean, I'm biased because I'm somebody who spends her whole life studying cybersecurity policy, but one of the reasons I think the policy piece of this picture is really important is because I don't think those are decisions we want to fall into. I think those are decisions we want to make really carefully and deliberately. And I absolutely agree. I think that would be a bad one. But I don't think it's an

Starting point is 00:19:13 inevitable one. None of this is to say I don't think there are risks here, right? Definitely we're going to see cyber attacks where AI is playing larger roles. We're already seeing some of them, especially in the scam world. I think there will be a lot of damage and there will be a lot of losses. Will those be exponentially larger than the damage and the losses we've seen from other cyber attacks? I genuinely don't know. Right. What I have seen so far since the announcement of Mythos has been fairly well contained, which suggests to me, by the way, that the way Anthropic has done this is not necessarily terrible, right? That, you know, choosing a couple large tech companies and working with them to patch some

Starting point is 00:19:52 of the most widely deployed software might be a sensible first step. It's obviously not where they're going to leave it, right? But nothing that I have seen in the wild so far has made me feel like, oh, this is a worse threat. These are bigger and scarier losses than any I've seen before. Fred, do you agree, disagree with that? Yeah, no, I think all of these are really good points. I think it's really good with optimism. I'm really pessimistic and that's why we make for good conversation partners. And I always, yeah, I think you're always sort of spot on in everything you say, Josephine. Some things I think about a lot is that

Starting point is 00:20:26 so let's say AI makes people develop code quicker. That's true. We see it all around right now. Does AI make you develop secure code? Well, it depends. If you ask it to, it will. But almost no one asks it to, for two reasons, right? People don't think about this because they just say create code that can solve task X. Usually people don't think about explicitly telling the AI to make the code

Starting point is 00:20:50 secure. It's also more expensive, right? This is a game of resources, as cybersecurity always has been, because it costs tokens and everything will just become a token economy in the end. That's how the AI will work. And will we create a regulation that says you have to spend 20% of your tokens

Starting point is 00:21:06 on security? I don't think we will, but that would be great. So you just rush forward, and let's take these power plants in Massachusetts again, right? A lot of critical infrastructure is owned in a decentralized way by private partners. If they know that they can use this AI-generated super-fast code, that just is incredibly much cheaper, easier to keep up-to-date, easier to work with, et cetera, et cetera. It's not as secure as it should be, but it saves a lot of cost.

Starting point is 00:21:34 Oftentimes, they'll have to do it. They just can't afford not using it right now. I'm just not confident we'll brake long enough that we will have time to implement all the regulations to stop this. So it could work out. That would be really good if it does. I just see so many scenarios where, again, we have this arms race dynamic.

Starting point is 00:21:51 Everyone is rushing. There's a lot of cost savings to be done. And security usually doesn't fit into that cost equation, until it's too late, basically. So I'm skeptical. I guess the only thing, yeah, the other thing I want to say is that we could find all the vulnerabilities today with AI.

Starting point is 00:22:08 I'm certain we could. That's great. The problem is the tech is moving. The tech infrastructure we have today will not be the one we use in one year. It's always changing. And that's a problem, right? So we're not going to find all the vulnerabilities that we use because these will be outdated and then new ones will be created.

Starting point is 00:22:25 And I just really don't like the world where humans have no say in finding vulnerabilities because we will just be at the mercy of AI to do this. And to me, that's terrifying. So I just want to talk for a moment about the system card for Claude Mythos. That means the card that Anthropic published describing its capabilities. When Anthropic asked Mythos to escape a locked-down sandbox, a computer that was not connected to the internet, and to email a researcher when it had done so, it did that. And it also posted the exploit that it used onto public websites, unprompted, almost bragging or showing off

Starting point is 00:23:04 how it actually had hacked through the container. When Mythos couldn't edit certain files, it injected code into a file to elevate its privileges and then covered its tracks. When it was told not to access certain records, but when it, quote, accidentally saw the answer, it used it anyway, and then sort of hid the fact that it knew the answer. There's some other really crazy things that Mythos is capable of doing. For example, when an LLM was used to judge Mythos's work and kept rejecting it, Mythos identified that the thing that was evaluating its behavior was itself an LLM AI, and then it prompted it, meaning that it hacked the AI that was trying to evaluate it. So we have AIs that are able to recognize that

Starting point is 00:23:43 they're being evaluated by other AIs and then hack them. So why this matters is, of course we've had systems and we've had people, human beings, who if they're a top-tier hacker, could hack into some of these systems. However, we have here a totally new level of hacking capability, where, you know, Mythos is able to not just find one exploit, but actually to string together multiple three, four, sometimes even five vulnerabilities in a sequence that can give you a very sophisticated end outcome that we've never had before. You know, one thing we haven't talked about is how, you know, the presumption of all this is that only, quote, the good guys have access to this model.

Starting point is 00:24:25 Anthropic had it. And then through Project Glasswing, they shared it with, quote, the good guys, the defenders. But Anthropic is only as good as its security at preventing that model from being stolen. And if you think about the Manhattan Project, like if someone from another country wanted to get access to everything we were doing with the Manhattan Project, they couldn't just walk in and then take one little object in their hand and walk out and have an entire nuclear bomb.

Starting point is 00:24:48 But with Claude Mythos, you can do that. We're talking about a weapon for cybersecurity that fits on a flash drive. And there's a joke in the AI security community that we all have to race, like, go faster, go faster, that the U.S. is in the lead. But literally the Chinese companies have what we have the second that we have it.

Starting point is 00:25:05 So we're not actually quote ahead of them. We're just ahead of them as far as giving it to them. So how should we think about the, we're only as good as the labs are themselves secure? And ironically, it's a recursive race that the more of these capabilities get developed, the less secure the labs are too. To me, the sort of the access question was always time limited. I would imagine Anthropic felt the same way. And that was why they were sort of making the decisions they felt they had to make

Starting point is 00:25:33 about who they would give early access to. But I don't know that I think that's a bad thing, right? I don't know that I think a world in which all of the companies large and small, all of the countries large and small have sort of access to roughly the same security capabilities is a much worse one. I think it depends on how those capabilities are harnessed. It depends on, again, whether we're able to sort of use them in ways to secure our systems. I think you could, you know, in keeping with my general,

Starting point is 00:26:06 clearly extreme optimism in this conversation, right, you could imagine a world in which it allows for much more geopolitical alliance across these countries if they sort of decide our real enemy is the AI and we all need to work together to make sure our systems are protected against that. I don't think it's the world we're in right now. But I also think that there's a huge amount of room for all of these companies and all of these countries to rethink the question of how secure can we make our systems?

Starting point is 00:26:40 Josephine, you brought up a very important point about is there actually mutual self-interest from the U.S. and China against these capabilities? Clearly on one side of the scale, one country having this step-function advantage in cyber is beneficial to them, not the other one, and they don't want to share or collaborate on that. But then from another perspective,

Starting point is 00:26:57 the risk of rogue actors having, like if either of us leaked a super-capable hacking model that we didn't have the defenses in place for yet or made it so that we only had one day to patch everything and that wasn't enough time to patch everything. Then we're actually all in a more dangerous world. And one of the things we always say in our work and informed the creation of this film,

Starting point is 00:27:14 the AI doc that we were a part of, is that in AI, the fear of all of us losing has to become greater than the fear of me losing to you. If the fear of me losing to you is dominant, then that's what I'm going to focus on is getting that dominant capability. But for example, I found it notable that when Mythos came out,

Starting point is 00:27:33 the public response from the White House didn't come from the Defense Department or Homeland Security. It came from Treasury Secretary Scott Bessent, who had an emergency call with the top banks and top companies. And I think that banks and financial infrastructure are clear places where cascading failures there

Starting point is 00:27:49 would actually create mutually assured financial destruction. Like, on the one hand, you could say China wants to take down the U.S. financial system because they want to switch everybody to yuan. But on the other hand, there's no way of doing that in a way that doesn't create interconnected fallout for the entire global economy and the stability of the world as we know it.

Starting point is 00:28:06 And curious, both of your reactions to that. There are a variety of ways in which I could imagine this sort of spurring a little bit more, certainly discussion, maybe even cooperation among the countries that have a vested interest in maintaining the stability of the markets, maintaining the stability of critical infrastructure.

Starting point is 00:28:26 What exactly that will look like, how good we'll be at that in this particular political moment, it's, of course, a little bit difficult to predict. There, again, I think there is some advantage to everybody feeling like, oh, we've all basically got access to roughly the same AI capabilities and not, we've got the best ones, and so we're going to refuse to work with you. And I think it's not clear to me, especially if you sort of follow the trajectory we were talking about before,

Starting point is 00:28:54 of all of our code is written by AI. It has lots of backdoors that only AI can find, but they're not going to tell us about them. I think that's not a great world to live in, but I think it's a world in which a lot of governments are going to find common cause much more than they are right now. And maybe not even just the AI as the adversary, but if North Korea has the ability to shut down

Starting point is 00:29:16 everybody's critical infrastructure, they're probably going to be a lot less restrained about that than a number of other state actors have been in the past, and that might also, you know, prompt a higher degree of cooperation. We have to know that this is a different regime we're entering into. We're now talking about a world where it's not just humans who can do the hacking. We're building AIs that can do the hacking. And you can't just negotiate with an AI and say,

Starting point is 00:29:39 don't hack me. You know, if I follow these things, will you not hack me? Like, the AI has its own inscrutable logic. And this is sadly not science fiction anymore. I think the key to me that unlocks the possibility for coordination is mutual recognition of an existential outcome. I think with AI, if you have an AI that is hacking every major web browser and every major operating system in the world successfully, and that's only going to get stronger, and the AI is going to be able to do that on its own, and if I release it and screw it up, it might cause more existential damage, and it's the existentiality of that outcome that motivates a trustworthy basis for collaboration. To me, that speaks to how the U.S. and China should have something like a, just like there was the red

Starting point is 00:30:20 phone between the Soviet Union and the U.S. to de-escalate nuclear. It seems like we need a red-lines phone for AI between the U.S. and China, by which I mean, you know, anytime we have evidence of AIs that are going rogue or doing things like hacking in ways that we don't know how to control or stop, at the very least, the right people in national security and the top of both governments should know about that same evidence. Because that creates the common knowledge of, quote, the existential outcome that we're trying to avoid. So to me, that is an achievable thing. I'm not saying this because I have faith in the government leaders that they would do this. I'm just trying to articulate, you know, the pathways that would be there.

Starting point is 00:30:56 And I'm curious if you all have other ideas. Like if we were really designing and trying to scheme about how we would get to some safer world at the level of international understanding and safeguards, what are other things that we would be doing? Josephine? So I think another piece of this that to me is important for thinking about that sort of mutual existential outcome is thinking about how much shared digital infrastructure we all use, right? How many of the same software programs are running on our computers all over the world, how many of the same devices we're relying on. And I think a lot of the security

Starting point is 00:31:36 progress in this space is going to have to come from really close collaboration with those companies. And so I think, you know, the cyber red phone, I think there might even have been like a China Daily op-ed advocating for that 10, 15 years ago. I like that idea, right? I think it makes sense to me that there would be some avenue for really trying to focus specifically on these issues and not getting too mired down in everything else going on between these countries at any given moment. But I also think we need to do a much better job of thinking about how do you bring the private sector into those discussions, how do you both sort of respect and defer to their expertise, and also not leave governments kind of completely on the sidelines as we're trying

Starting point is 00:32:21 to decide what kinds of restrictions and constraints we want to put on these systems, and think really seriously about what those constraints are. And I think that we are much more likely to be able to put in place those restrictions with more international cooperation. I think the U.S. on its own is never going to say we shouldn't be pursuing AI to develop bioweapons because if they think China is pursuing that,

Starting point is 00:32:48 then they're never going to want to give up their access to it. So I think it opens the door to being able to say, look, this particular capability seems bad for all of us. Let's take it off the table together and that way, you know, worry less about, oh, are you going to get there first? I agree with all those points. What I would add here, and you mentioned it briefly, Tristan, is that it's not just educating the government or bringing companies in, but also educating the people, right, and making sure that everyone sees AI as as big of a threat or even bigger than nuclear weapons. I do personally believe that AI is much more

Starting point is 00:33:25 a threat to humanity than even nuclear weapons. I mean, nuclear weapons could kill a lot of humans, but I think it wouldn't extinct us as a race. I do believe that AI could completely enslave the human race in ways that sound like sci-fi, but it's not. We already see totalitarian regimes, like look at North Korea, to some degree, China. Russia has parts of this.

Starting point is 00:33:45 They're just without AI, right? It's just people with smart uses of technology. And these smart uses of technology make it really, really easy for a few people to control a population. And I think people don't understand this the same way they understand that nuclear is bad. And if people would understand this, they would put pressure on companies, on governments to just drastically change what we're doing.

Starting point is 00:34:08 You're speaking to what we call the attractor state of totalitarian lock-in. So once you're locked into authoritarian governments that have both AI surveillance and AI hacking, how can you as a citizen ever fight back if you have no secrets? You can't. Let's take a step down from the international coordination bit, which we talked about with China, and we want to go to policy solutions. And one of the things I think, Josephine, you've written about is how, you know, you're not liable if you make a piece of code that someone later discovers how to hack into. We don't treat the software maker as liable for that. So the company that gets hacked has to deal with that themselves. And then we started

Starting point is 00:34:49 developing this new economics of an insurance market. Can you talk a little bit about what would be the policy solution that we would do? And this is related to what Fred said earlier around, incentivizing companies to spend more on those tokens to basically ask the AI system, don't just write the code for me, write the secure code for me, which means spend more money on compute, but that's going to cost more. So how do we deal with this from a domestic policy angle?

Starting point is 00:35:13 For the most part right now, we don't. I think the hope for an insurance industry would be that it would incentivize or require companies that are developing software to use state-of-the-art tools for security testing, right? In the same way that, you know, none of us would have smoke detectors in our homes if our insurers didn't require us to. Maybe none of us would spend any money securing our code,

Starting point is 00:35:38 but if our insurance says you've got to do this or we're not going to cover certain types of losses, then perhaps we'll be willing to. And I do think that one of the other things that I find sort of hopeful about tools like Mythos is that they could provide insurers with a clearer roadmap than they've had before of what is it you should actually require of your policyholders to do in terms of security. Is there, you know, a really solid approach that could be just a condition of the coverage? You know, one thing that strikes me is basically saying Mythos can change the economics and almost create more precision pricing for insurers saying,

Starting point is 00:36:17 here's what it would cost for you to basically use Mythos to do it. Something that didn't hit me until now is obviously now the entire world's dependent on five companies to secure themselves, both for the vulnerabilities of the world and to protect themselves. So it's a racket. It's essentially if those guys went rogue, they have basically, they have everybody, you know, locked into paying them forever to protect themselves. I think it's a reason to, you know,

Starting point is 00:36:40 be advocating for other models of artificial intelligence. It's a reason to be thinking about the open weight models. It's a reason to be thinking about sort of, are there alternatives to a world in which there's a very, very small handful of companies that hold all the cards. But if you can say, like, look, here's a tool, you have to run it, you have to patch everything it finds, that's actually a much more concrete piece of guidance. Now, maybe it won't be perfect, maybe it won't be where we'll end up.

Starting point is 00:37:06 But it would certainly be a big step forward if it turned out to mean that we could then impose some liability on developers who failed to use these tools for vulnerabilities that could have been caught but weren't, if it means that insurers are going to condition their coverage on the use of these types of tools. It will give a huge amount of power to these companies. No question. Will it give them more power than like the cloud companies have right now? I don't know, right? Tech has always been a very concentrated industry. I think that's a broader systemic issue than just with AI. Fred, do you want to speak to your policy recommendations? I know mandating pre-deployment access for defenders, treating AI labs as critical infrastructure. Do you want to speak to some of these solutions? Just before I want to say that I really

Starting point is 00:37:53 amplify this criticism or maybe skepticism of a few companies owning all this AI chain. I think Josephine makes a good point that cloud companies are also powerful. We have other powerful semi-monopolies in the world. I do believe that AI is in another category than anything we've seen before.

Starting point is 00:38:09 So I think that is really problematic. And I would love to see more people-owned AI, if possible, more decentralized ownership structures. And we could make policies, right, to approach that. More security-specific to be a little bit more small level for a second.

Starting point is 00:38:25 I think there's a lot of things we could do. Jason Clinton at Anthropic, I'm sure a lot of other people too, talks about this one-day patch policy, and maybe it's even shorter now. But I think that's great, right? Every company should be able to just patch a vulnerability within 24 hours or even much, much quicker, because we just have to. It's going to be so stressful and time-dependent in the future. Whenever a vulnerability is discovered, companies need to have the frameworks in place

Starting point is 00:38:50 to just patch that instantly because we can't wait and be slow as we've been. Even a few weeks is way too long. One of the things you mentioned is treating AI labs as critical infrastructure, that they shouldn't be able to, maybe there's some public commons level way of accessing this public utility of basically defense so that maybe there's some amount they can charge

Starting point is 00:39:10 but basically they can't overcharge or there's got to be something that just kind of makes it a commons of common security because at the end of the day we need it for securing a safer world. Then the question is, is it just a national thing? Are we still extorting all the international allies, to say we're going to be forcing them to pay for all these things,

Starting point is 00:39:26 it just gets into geopolitics and complicated quickly. I think that's such a good point. I'm definitely seeing AI as a critical infrastructure and there's different arguments here. If we make it an official 17th critical infrastructure sector in the US, maybe we'll slow development, maybe we'll create regulatory overlap

Starting point is 00:39:43 which can be problematic as well. We could do that in a way that I think gives policymakers more power to demand security standards and that might slow things down, but that could also make us more secure. I'm pretty positive to such an approach. I don't think it will happen, but I like to advocate for it.

Starting point is 00:39:59 Maybe just to wrap up, what are some of the things that people can do just in their personal lives to, you know, in light of Mythos existing, which if it can hack every operating system, people say they throw up their hands, what can I possibly do? But let's give people some hope. What are some basic things that people should be doing? I think the advice I have, and it's the most irritating and obnoxious advice you can give, but I think it's also the right advice is that it's something people should be thinking about when they're voting, right?

Starting point is 00:40:26 That the question of how politicians are approaching artificial intelligence and whether they think there should be any safeguards and whether they're willing to challenge any of the companies that are developing it is really important and it's only going to get more important as those companies are pouring more and more money into lobbying. There are a whole bunch of issues to think about

Starting point is 00:40:48 when you vote today, and I'm not going to tell you it's the single most important one, but I think it's a very important one and only becoming more so. Well, it's a monopoly of enactment where once this happens, there's no more enactment of anything by citizens because, and so from that perspective, there's a weird way in which is like, okay, well, is this actually more important than the price of eggs or gasoline or whether my kids have school?

Starting point is 00:41:10 Well, it's like, well, but if I can't, if I'm about to lose my political power permanently, then it actually is the most important thing. This should be the number one issue on the midterms. and people do have a say. And if they can share this episode, share this material, go watch the AI doc, get people to see it,

Starting point is 00:41:25 recognize that we're not heading to a pro-human future by default, and we want to be moving towards a pro-human future and against the anti-human future. But I do think that this conversation is, you know, trying to play a role in clarifying the nature of the problems that we face so that we make sure that we're putting in the policies, putting in the guardrails, and also putting forward, as you said, Fred,

Starting point is 00:41:45 basically the collective problems that we need everyone's minds on solving. How do you protect citizens' secrets in a world where AI can hack those secrets? What are the new laws? What are the new code-level protections so that anybody who has access to such a thing, for example, gets locked. Here's the one system

Starting point is 00:42:01 that can hack into computer systems. If you're using it, there has to be oversight of who's using it and for what. And that has to be enforced at the level of code, basically. Okay, well, I'm just going to give the most irritating cybersecurity advice. And again, I'm only going to give it because I think it's the right advice. You want to be really aggressive about installing the updates, as annoying as you find them,

Starting point is 00:42:19 as much as you want to tell your computer and your phone to delay them. You want to be really careful about how you're using AI, what you're giving it access to, what pieces of your digital life, what pieces of your data are being fed into it. You want to be really thoughtful about which companies' AI tools and products you're using. You want to, you know, think carefully about who's running those companies and what their interests are. And in a moment of deciding, do I need AI for this or maybe not, I think it makes sense right now to err on the side of maybe not. I think this is really good advice.

Starting point is 00:42:56 Some things to maybe take that one step more extreme just to do it, right? Well, let's say something really bad would happen in terms of a totalitarian lock-in happens where the people just don't have control anymore. And that could go quickly because all of these AI models are right now being used as social media companies also use their tools to collect, what do you think, what do you do, what's your digital footsteps? And right now that's being used heavily to create ads, right? And, fair enough, that's annoying, but maybe you can live with that.

Starting point is 00:43:28 But that is to a larger degree being used to nudge you into a different direction, making you think in a different direction. So what information do you digest online? I think it's really important to think about this. I think there's these statistics that the younger generation get 90% of their news from social media. What accounts do you follow? Are these people rational human beings who seem to know what they're talking about and present both sides of the arguments?

Starting point is 00:43:54 Maybe I can add one thing, Tristan. You spent a lot of years, you know, let's say almost a decade on just trying to figure out how can we counter these incentives of social media, right? And I think it's fair to say we failed as a society to incentivize social media. These are for-profit companies that have done really, really bad harm to the human population in terms of dopamine hijacking and other things. And we're now starting a similar thing, right, but with AI companies. We have these for-profit AI companies.

Starting point is 00:44:22 They're obviously seeking to shareholder-maximize and profit-maximize as they develop their AI models. Are we going to repeat the same mistake again? And we really shouldn't. We have to learn from the mistakes with our failed social media regulation and try to make AI into something better. And that would be really good if we take it seriously. And I don't think we take it seriously right now.

Starting point is 00:44:44 Yep, and we can. We're in a critical window. If we play our cards right, we can make sure that defenders get access to this first. We can have regulation that tries to close the gap of the extra costs for adding security. We can have international coordination with enforceable metrics that we're doing the verification. This could end better than it did with social media. But if we don't, the internet becomes basically unusable for people who don't have top-tier tools. And I do think that this qualifies as kind of a Manhattan Project kind of moment. And we need everybody who works in cybersecurity, who has any interest in any capability or talent in these areas to work on defense right now.

Starting point is 00:45:19 You can think of AI as kind of introducing a Y2K vulnerability in all of society, but in a rolling way. So we kind of have a rolling mobilization, a wartime mobilization, to defend our systems from the new vulnerabilities that AI creates. You know, I hope this conversation helps activate everyone in every corner of society, whether it's policymakers or people listening to this, to take part in this. And again, vote in the midterm elections. This is not inevitable. Fred and Josephine, thank you so much for coming on Your Undivided Attention.

Starting point is 00:45:47 This has been really fantastic. Thanks for having us. Thank you so much, Tristan. Your Undivided Attention is produced by the Center for Humane Technology, a non-profit working to catalyze a humane future. Our senior producer is Julia Scott. Josh Lash is our researcher and producer, and our executive producer is Sasha Fegan.

Starting point is 00:46:09 Mixing on this episode by Jeff Sudakin, original music by Ryan and Hays Holladay. And a special thanks to the whole Center for Humane Technology team for making this podcast possible. You can find show notes, transcripts, and much more at humanetech.com. And if you like the podcast, we'd be grateful if you could rate it on Apple Podcasts

Starting point is 00:46:26 because it helps other people find the show. And if you made it all the way here, let me give one more thank you to you for giving us your undivided attention.
