Your Undivided Attention - Anthropic’s Mythos Has Changed Cybersecurity Forever. What Now?
Episode Date: May 14, 2026

A generation ago, the world's critical infrastructure was physical. Today, it’s largely digital. Your bank vault is a database, your filing cabinet is a server, your car is a robot on wheels. And in a world where these systems are mostly secure, life is more convenient and efficient. But all that comes into question when an AI system can break through the security that runs the world. That’s what’s happened with Claude Mythos, Anthropic’s most powerful AI model yet. In a very short time, Claude found thousands of flaws and vulnerabilities in the software that runs the world, in every major operating system and web browser, systems that human security researchers had thought were secure for years. How do we live in a world where a private company suddenly has a skeleton key that can unlock the entire digital world with little oversight or accountability? And what does Mythos mean for all of us who rely on digital security to go about our lives? In this episode, we speak with two cybersecurity experts to answer these questions: Josephine Wolff is a professor of cybersecurity policy at Tufts University, where she focuses on the economic impact of cyberattacks. Fred Heiding is a research fellow at the Defense, Emerging Technology, and Strategy Program at Harvard's Kennedy School of Government.

Your Undivided Attention is produced by the Center for Humane Technology. Follow us on X: @HumaneTech_ and subscribe to our Substack.

RECOMMENDED MEDIA
The Claude Mythos System Card
The Project Glasswing announcement
“Black-hat LLMs,” a talk on AI’s hacking capabilities by senior Anthropic researcher Nicholas Carlini
You'll See This Message When It Is Too Late: The Legal and Economic Aftermath of Cybersecurity Breaches by Josephine Wolff
“America’s Endangered AI: How Weak Cyberdefenses Threaten U.S. Tech Dominance,” by Fred Heiding and Chris Ingles

RECOMMENDED YUA EPISODES
America and China Are Racing to Different AI Futures
“Rogue AI” Used to be a Science Fiction Trope. Not Anymore.
The Self-Preserving Machine: Why AI Learns to Deceive
Transcript
Hey everyone, it's Tristan Harris, and welcome to Your Undivided Attention.
Now, a generation ago, your bank had a vault.
Your medical records were in a filing cabinet.
Your car was a physical machine, and the electric grid just ran on dials and switches that someone physically turned on or off.
And today, all of those things are digital.
The vault is a database.
Your filing cabinet is a server.
Your car, your Tesla, is a robot on wheels.
And in a world where all these systems are mostly secure,
life just gets more convenient and efficient because of all this.
But all that comes into question
when suddenly an AI system can break through the security that runs the world.
Now, recently you probably heard,
Anthropic announced their most powerful AI model yet,
Claude Mythos.
You've probably read the headlines.
Claude was looking for flaws and vulnerabilities in the software that runs the world,
and within just a few weeks and a few hours,
it found thousands of them.
It found vulnerabilities in every major operating system and web browser.
These are systems that human security researchers had thought were secure for years.
Now, Mythos was so dangerous that Anthropic shared it with a select group of companies
responsible for cyber defense so that they could use it to find and patch the vulnerabilities
before anyone else got access.
That plan, though, is already showing cracks.
A couple of weeks after the announcement, Bloomberg reported that a group of unauthorized users
had gotten into Mythos through one of Anthropic's vendors.
And OpenAI announced that they now have a model that's nearly as capable
with Chinese open source models just a few months behind.
I actually have been talking to some people who run security at some of the companies
that got access to Mythos,
companies whose job is to keep us safe from cyberattacks,
and they've told me, you know, this model is a big deal,
and we should be concerned about it.
So how do we live in a world where a private company suddenly has a skeleton key
that can unlock the entire digital world
with no government oversight or accountability?
And what does this mean for all of us
who rely on digital security to go about our lives?
To answer these questions,
we've invited two people
who spend their careers thinking about AI and cybersecurity.
Josephine Wolff is a professor of cybersecurity policy
at Tufts University,
where she focuses on the economic impact of cyber attacks.
And Fred Heiding is a research fellow
at the Defense, Emerging Technology, and Strategy Program
at Harvard's Kennedy School of Government.
Josephine and Fred, welcome to Your Undivided Attention.
Thanks so much for having us.
Thank you so much, Tristan.
So let's just start at the top.
Why is this recent announcement from Anthropic
about their Mythos model seen as such a game changer?
What can it do that the previous AI models
or things in cybersecurity could not do?
Fred, let's start with you.
There's two really, really big takeaways here.
And as you said in the introduction,
a lot of cybersecurity today is surviving
because we just didn't have enough manpower
to test or attack, from the
attacker's perspective, everything, and that's just completely changing.
These AI models, be that now or in one year or in two years,
they can just automate every part of cyber research or almost every part.
So the human factor is gone.
The days of human pen testers and security experts are gone.
And that's massive.
So I think that's the first really big thing.
The second really big thing is that this is almost changing from a security problem
to an admin problem or a regulatory problem.
And we see how Anthropic is working on giving this pre-access to defenders
so that they can use this model before attackers get their hands on it.
And that's actually massive.
That type of collaboration can be a complete game changer.
So there's technical things.
There's collaborative things.
And both of them are really big.
There are some people who say that Claude Mythos is just hype.
Anthropic is trying to hype their capabilities in their model,
that this is, oh, this is so dangerous.
We can't even release it to the public.
This is just marketing.
It's so they can raise more investor
dollars, oh, the thing we're building is so powerful.
How do we assess how
powerful this is? The first
fundamental way to verify
this is just to look at the vulnerabilities
that we find, right? And there's a lot of
really bad vulnerabilities that could cause
a lot of damage that Anthropic has managed
to find using these AI automated
tools. So I think we can
definitely say that this is bad. And
of course, a lot of people are developing
AI models. Other AI models can
also do these things. I think that
matters less. We should feel as
defenders that this is really bad.
We may have a few months
advantage in terms of time as
defenders from the frontier labs, but
very soon, you know, Chinese
unregulated open-weight models,
which is just models that everyone can
download and use, they will be able
to do these same things. So we should
use this time to really do everything
we can as defenders, but we shouldn't feel
safe because, yeah, Anthropic has done a great
job with their model, but other companies
will very soon be able to do this, if not
now. I want to kind of
contextualize what I think Mythos really represents. You hit return on your keyboard and, literally,
the command is as simple as "find a vulnerability in this system." That's it. You just put it in plain English.
You hit return and you come back 30 minutes or an hour later and it's found it. The NSA used to have a
term called NOBUS, or "nobody but us," the false idea that, hey, no one else has the
capabilities that we have. But suddenly the kind of scarcity around zero-day vulnerabilities that we
used to have has turned into kind of an abundance. And we talk about AI
and how it's going to create all this access to things cheaply,
but suddenly zero days are now abundant in a way that we also created.
And I just want to help further kind of just settle into this picture
of what is the world that we're now living in when we hear all that.
Josephine?
So I think that when we sort of think about the risks that Mythos presents,
to me, it's less of, oh my gosh, whichever, you know, powerful country
with significant cyber capabilities gets this first is going to be a real risk,
because they're already a real risk, and they're already the people with the time and the resources
and the expertise to find these zero-day vulnerabilities. So I think that that, to me, is less of a
step change than the idea of sort of who are the people who did not previously have access to
these kinds of capabilities who might get them now. And how would that change the landscape
in which we've been able to say, you know, okay, well, this is a thing that only China could do,
or only China and Russia and North Korea,
or whatever the list is, right,
I think we're going to have to change our thinking on that
in pretty significant ways.
That doesn't mean that we shouldn't be worried
about who has access to these tools.
I think Anthropic has definitely hyped some things unnecessarily,
but I think they're right to be sort of thoughtful
and careful about that.
And in the world that I think we're looking to,
the world that I hope we're looking to,
let me start there,
is one in which cyber defense
is as easy as cyber offense.
And that I think would be a radically different one
from any we've ever lived in before
in which I say to you,
look, finding all of the zero-day vulnerabilities,
patching all of them is the work of a few hours,
just like trying to exploit them.
And China has much more secure infrastructure
than it ever did before,
and the United States has much more secure infrastructure
than it ever did before.
And so do a whole bunch of other countries
and a whole bunch of other companies.
And finding a vulnerability
that has not already been found by these AI tools
is really, really hard and really, really rare.
And I think that, to me, is a much better world to live in
than the one that it feels like we're sort of heading towards right now
of every country is sort of trying to develop more and more offensive cyber capabilities
and plant more little footholds and malware in each other's critical infrastructure
and try to exploit the fact that none of those systems are perfectly secure.
I think a tool like Mythos allows us to imagine a future
in which actually the default is your critical infrastructure is secure
and there's a very, very small number of actors who can possibly compromise it.
Let's make sure we're touching on a couple points you're raising there.
So one is you're mentioning it's not that state-level actors like China couldn't do these things before
or they weren't in our systems, they are in our systems, but suddenly there's a question of who has access.
So now maybe non-state rogue actors, you know, hacker groups, cybercriminals, terrorists, you know, Iran, who is upset at the U.S. for the recent bombing, you know, naturally.
Everyone has maximum incentive to use these things, but they had limited tools before.
Now suddenly everyone has very good tools, especially if they can get that model.
The other thing you're raising is the idea that in the long term, you can imagine a world where it's defense dominant because everyone's using AI to just patch everything and we just live in a safer, more secure world
in general. Maybe we should go back in just a moment and make sure we're setting the table for listeners
about what exactly is a zero-day exploit. Why is it called that? And what is a bug bounty?
So I think the zero-day piece refers to the time between when a vulnerability is discovered and when it's
exploited. So the time people have had to patch it prior to actual exploitation occurring. And
the idea is if I try to exploit a vulnerability that we've known about for a year, some people may still be
vulnerable, right? Some people may not have downloaded their patches. We know that's true. But if I'm
exploiting a zero-day vulnerability, then the idea would be I can get into any system I want in the
whole world because nobody's had a chance to patch that yet. The bug bounties vary a little bit
from company to company, but the general model is that tech companies will offer a reward or a bounty
to people who don't work for them, but who discover vulnerabilities in their code and report them.
So there's this interesting thing where essentially a private company, not a government,
has developed something that unlocks all the locks in the world.
Fred, one of the things that you were mentioning a second ago is how essentially, you know,
with Mythos, the U.S. and one specific private U.S. company called Anthropic
happened to have this capability first.
And it happened to be the case that there's several months we think until China will get it.
Let's say it's three or four months.
So there's this weird thing where we have essentially three or four months
for the U.S. to notify the people that it wants to help defend,
and then give them early access to patch systems.
So we basically just happen to prioritize
through the decision-making of a handful of people at Anthropic
that we're going to patch a handful of U.S. companies.
So what happens if I'm in the Philippines
and I'm running old infrastructure?
I'm defenseless now.
What happens if I'm in Africa and I'm in Nigeria?
I'm defenseless now.
What happens if I'm in Germany?
And as you said, Fred, there's kind of a time question
of maybe this time around we have three months to patch
the systems, but every time further, what if that collapses down to two months, to one month,
to one day? Do you want to speak to how you see the cat and mouse game happening in terms of the
time horizon? Yeah, I think that's a really good point. And the time horizon is changing
a lot. So first to address some of the other things you mentioned, yeah, it gets way easier for
small state actors or actors that aren't the big ones, right? Like US and China, it gets way
easier for them to launch really devastating cyber attacks, at least for a while, right? Because
these AI models can just find vulnerabilities that we haven't found ourselves.
And we see that exactly, as you said, with Iran. It's so cheap to do it now, right?
So I think we will see way more of that.
There's a few other interesting remarks I think are worth making.
One is that the landscape is changing.
As we talk now, Mythos and these AI tools make it way easier for defenders to test our systems,
and that's great.
But this is very, very short-sighted in a way, because, of course,
AI tools are also being used to rewrite technical infrastructure.
So what our infrastructure looks like today, you know, it will not look like in one year.
And that's very problematic, potentially good, because AI can write really secure code.
But very soon we will be in a world where AI is writing all the code.
We have no idea what's going on.
They may even write their own programming languages, and AI finds all the vulnerabilities in that.
But that basically takes the humans completely out of the loop.
And with that amount of just opaqueness, we will not understand what's going on.
Then that's a really big problem.
I think Fred's absolutely right to say we're going to see more and more AI-generated code
that we aren't going to have as much intuition for how it works or where the vulnerabilities may be.
But I think that's also in some ways a familiar problem.
When you think about code maintenance, we use an enormous amount of software that humans today don't really understand.
Not because it was written by AI, but because if you go to any big tech company that's been around for a decade or longer,
there's some usually huge body of code
that has been in their products
for as long as anyone can remember
and nobody knows exactly how it works
but they know that if you change anything, everything breaks.
So I would say already we have a little bit of this dynamic
where there are languages that people used to code in
that most people don't know anymore
where there's legacy code that we're sort of stuck with
but we don't fully understand or know how to debug
and the question is going to be
what do we view as
being the crucial sort of human touch elements here? Or do we view there as being any, right?
Are there going to be people signing off on this? If so, what does that entail? What kinds of tests
are they going to be running? How good, how effective are those tests? I think there's a lot of uncertainty there
around how well we can assess any of these things using the AI tools themselves. So I agree that
it's worth thinking about and worth preparing for. I also think that to some
extent this is a challenge we're already facing. And I think there will definitely be new challenges
and new potential adversaries, right? If the AI tools themselves are working at odds with the
people who designed them or the people who are deploying them, I'm less pessimistic about the
idea that this will be so much worse than the world that we live in today. I think it's certainly a
possibility. But I think it could also help sort of fix
a lot of the challenges we've had around what happens when you're not one of the biggest tech
companies in the whole world. If you're an open source developer and you're trying to secure your
code, then having access to the same kinds of tools that the biggest tech companies are using
could be a real game changer. So I guess I'm confused a little bit about why we shouldn't be
more concerned because Anthropic only chose those first, whatever it was, 12 to 20 companies to
partner with, and then the rest of the world is sort of just screwed, where they're just
vulnerable. So is the world that you're talking about dependent on
Anthropic turning around and making sure that they're
just going to GitHub and basically automatically patching
everything across all of GitHub in some automated way?
What is the world that you're envisioning that enables the lower risk?
Yeah, I think for it to be an equalizer,
you have to have pretty widely accessible tools.
I agree with Fred that I think those are coming,
whether we want them or not.
But I also, I would say, and again,
I don't mean to be too Pollyanna-ish about this,
20 tech companies could be a lot of code all over the world.
It's not, you know, if you go to Microsoft, you are not just talking about patching machines in the United States.
You are not just talking about a small piece of the world whose software you're trying to protect.
There is a small number of tech companies that control a lot of the most widely deployed code in the whole world.
So I don't know if that's the right number.
I don't know if this is the right set.
But I would not necessarily say that's Anthropic just trying to carve out a tiny little piece
of the world to protect. I think it's possible that that is a set of companies that have a very
far reach. Yeah, definitely. Go ahead, Fred. I really like to try to bring in the everyday person,
the ordinary citizens, so to speak here as well. And then you really have to think and ask yourself,
well, okay, let's say 20 companies are the only ones in the entire world who can secure our systems
who understand our systems, and they don't even understand them. But at least they have an AI
that understands it. Everyone else, you know, every single other system is completely,
completely helpless. I don't like that. I don't like that at all. That doesn't feel good to me.
And to a large degree, we have had a world where we didn't fully understand our code. That is one of
the biggest security problems of our time. However, we did write it, right? There was always someone
who could understand it. If all the critical infrastructure, all the power goes down in
Massachusetts, for example, someone could figure out how that works. Well, let's say in a future world,
all the electricity in Massachusetts goes down and no one has any idea
what's happening in the code.
And we don't know how to recover from it.
Yeah, I think that's really bad.
I mean, we saw what happened during COVID with just crisis everywhere.
And it could be so much worse and no one has any idea of how to fix it.
I think that's problematic.
Yeah, I mean, I lean on the side of this is much worse.
So there's this interesting thing.
I mean, I'm happy to go back and forth with you, just to opine on this.
I just, how do we differentiate between, you know, there's nothing new here.
State level actors had this capability, but now we have just like thousands
and thousands more actors who can do this stuff.
And then the point that you're also raising, Fred, is like,
how comfortable should we feel that just one company has this capability?
So, yeah, how should we think about that, Josephine?
So I think one of the open questions that I don't know the answer to
is, is there some point at which the AI vulnerability finding systems level out?
Right.
So far we've seen, you know, continuous improvement.
And the things that the models developed this year can do
are much more impressive than the things that the models developed last year
can do. If that continues to be the case for the next 10 years, then you're right. Whoever has
the newest, fanciest model has a really significant advantage. I don't know if that is the case
or if we're going to sort of hit a little bit of a plateau where everybody has models that can
find roughly the same set of vulnerabilities and patch and exploit them to roughly the same degree.
My general instinct has been more the latter. There is going to be a very significant improvement
in how well we can find vulnerabilities with AI until there isn't,
until we have developed systems that can find most of them,
and then we're going to see more of a leveling off.
In terms of the sort of what do we do when the AI writes all the code
and none of us can possibly understand it,
I want to emphasize that's a choice, right?
It doesn't mean it won't happen.
But if we decide we're going to replace all of the software,
powering the Massachusetts electric grid,
with software written in a language that no human has
ever used or has ever tried to code or patch in, we will be making a deliberate decision that
that's the kind of software we want to be using. And I think, I mean, I'm biased because I'm somebody
who spends her whole life studying cybersecurity policy, but one of the reasons I think the policy
piece of this picture is really important is because I don't think those are decisions we want
to fall into. I think those are decisions we want to make really carefully and deliberately.
And I absolutely agree. I think that would be a bad one. But I don't think it's an
inevitable one. None of this is to say I don't think there are risks here, right? Definitely we're
going to see cyber attacks where AI is playing larger roles. We're already seeing some of them,
especially in the scam world. I think there will be a lot of damage and there will be a lot of losses.
Will those be exponentially larger than the damage and the losses we've seen from other cyber attacks?
I genuinely don't know. Right. What I have seen so far since the announcement of Mythos has been
fairly well contained, which suggests to
me, by the way, that the way Anthropic has done this is not necessarily terrible, right?
That, you know, choosing a couple large tech companies and working with them to patch some
of the most widely deployed software might be a sensible first step. It's obviously not
where they're going to leave it, right? But I, nothing that I have seen in the wild so far has
made me feel like, oh, this is a worse threat. These are bigger and scarier losses than any I've
seen before. Fred, do you agree, disagree with that? Yeah, no, I think all of these are really good
points. I think it's really good with optimism. I'm really
pessimistic and that's why we make for good conversation partners.
And I always, yeah, I think you're always sort of spot on in
everything you say, Josephine. Something I think about a lot is that,
so let's say AI makes people develop code quicker. That's true. We see it
all around right now. Does AI make you develop secure
code? Well, it depends. If you ask it to, it will.
But almost no one asks it to for two reasons, right? People don't
think about this because they just say
create code that can solve task X.
Usually people don't think about
explicitly telling the AI to make the code
secure. It's also more expensive, right?
This is a game of resources as cybersecurity always
has been because it costs
tokens and everything will just become a token
economy in the end. That's how the AI will
work. And will we
create a regulation that says
you have to spend 20% of your tokens
on security? I don't think we will
but that would be great. So you just
rush forward and let's take these power plants
in Massachusetts again, right?
A lot of critical infrastructure is owned in a decentralized way by private partners.
If they know that they can use this AI-generated super-fast code,
that just is incredibly much cheaper, easier to keep up-to-date, easier to work with, et cetera, et cetera.
It's not as secure as it should be, but it saves a lot of cost.
Oftentimes, they'll have to do it.
They just can't afford not using it right now.
I'm just not confident we'll brake long enough that we will have time to implement
all the regulations to stop this.
So it could work out.
That would be really good if it does.
I just see so many scenarios where, again,
we have this arms race dynamic.
Everyone is rushing.
There's a lot of cost savings to be done.
And security usually doesn't fit into that cost equation,
until it's too late, basically.
So I'm skeptical.
I guess the only thing, yeah,
the other thing I want to say is that we could find all the vulnerabilities
today with AI.
I'm certain we couldn't.
That's great.
The problem is the tech is moving.
The tech infrastructure we have today will not be the one we use in one year.
It's always changing.
And that's a problem, right?
So we're not going to find all the vulnerabilities in what we use
because these will be outdated and then new ones will be created.
And I just really don't like the world where humans have no say in finding vulnerabilities
because we will just be at the mercy of AI to do this.
And to me, that's terrifying.
So I just want to talk for a moment about the system card for Claude Mythos.
That means the card that Anthropic published about its
capabilities. When Anthropic asked Mythos to escape a locked-down sandbox, a computer that was not
connected to the internet, and to email a researcher when it had done so, it did that. And it also
posted the exploit that it used onto public websites, unprompted, almost bragging or showing off
how it actually had hacked through the container. When Mythos couldn't edit certain files, it injected
code into a file to elevate its privileges and then covered its tracks. When it was told not to access
certain records but, quote, accidentally saw the answer, it used it anyway, and then
sort of hid the fact that it knew the answer.
There's some other really crazy things that Mythos is capable of doing. For example, when an
LLM was used to judge Mythos's work and kept rejecting it, Mythos identified that the thing
that was evaluating its behavior was itself an LLM AI, and then it prompted it, meaning that
it hacked the AI that was trying to evaluate it. So we have AIs that are able to recognize that
they're being evaluated by other AIs and then hack them.
So why this matters is, of course we've had systems and we've had people, human beings,
who if they're a top-tier hacker, could hack into some of these systems.
However, we have here a totally new level of hacking capability,
where, you know, Mythos is able to not just find one exploit,
but actually to string together multiple vulnerabilities, three, four, sometimes even five, in a sequence
that can give you a very sophisticated end outcome that we've never had before.
You know, one thing we haven't talked about is how, you know, the presumption of all this is that only, quote, the good guys have access to this model.
Anthropic had it. And then through Project Glasswing, they shared it with, quote, the good guys, the defenders.
But Anthropic is only as good as its security at preventing that model from being stolen.
And if you think about the Manhattan Project,
like if someone from another country
wanted to get access to everything we were doing
with the Manhattan Project,
they couldn't just walk in and then take one little object
in their hand and walk out and have an entire nuclear bomb.
But with Claude Mythos, you can do that.
We're talking about a weapon for cybersecurity
that fits on a flash drive.
And there's a joke in the AI security community
that we all have to race, like, go faster, go faster,
that the U.S. is in the lead.
But literally the Chinese companies have
what we have the second that we have it.
So we're not actually quote ahead of them.
We're just ahead of them as far as giving it to them.
So how should we think about the, we're only as good as the labs are themselves secure?
And ironically, it's a recursive race that the more of these capabilities get developed,
the less secure the labs are too.
To me, the sort of the access question was always time limited.
I would imagine Anthropic felt the same way.
And that was why they were sort of making the decisions they felt they had to make
about who they would give early access to.
But I don't know that I think that's a bad thing, right?
I don't know that I think a world in which all of the companies large and small,
all of the countries large and small have sort of access to roughly the same security capabilities
is a much worse one.
I think it depends on how those capabilities are harnessed.
It depends on, again, whether we're able to sort of use them in ways to secure our systems.
I think you could, you know, in keeping with my general,
clearly extreme optimism in this conversation,
right, you could imagine a world in which it allows for much more geopolitical alliance
across these countries if they sort of decide our real enemy is the AI
and we all need to work together to make sure our systems are protected against that.
I don't think it's the world we're in right now.
But I also think that there's a huge amount of room
for all of these companies and all of these countries
to rethink the question of how secure can we make our systems?
Josephine, you brought up a very important point about
is there actually mutual self-interest from the U.S. and China
against these capabilities?
Clearly on one side of the scale,
one country having this step-function advantage in cyber
is beneficial to them, not the other one,
and they don't want to share or collaborate on that.
But then from another perspective,
the risk of rogue actors having,
like if either of us leaked a super-capable hacking model
that we didn't have the defenses in place for yet
or made it so that we only had one day to patch everything
and that wasn't enough time to patch everything.
Then we're actually all in a more dangerous world.
And one of the things we always say in our work
and informed the creation of this film,
the AI doc that we were a part of,
is that in AI, the fear of all of us losing
has to become greater than the fear of me losing to you.
If the fear of me losing to you is dominant,
then that's what I'm going to focus on
is getting that dominant capability.
But for example, I found it notable
that when Mythos came out,
the public response from the White House
didn't come from the Defense Department
or Homeland Security.
It came from Treasury Secretary Scott Bessent,
who had an emergency call
with the top banks and top companies.
And I think that banks and financial infrastructure
are clear places where cascading failures there
would actually create mutually assured financial destruction.
Like, on the one hand, you could say China wants to take down
the U.S. financial system
because they want to switch everybody to the yuan.
But on the other hand, there's no way of doing that
in a way that doesn't create interconnected fallout
for the entire global economy
and the stability of the world as we know it.
And I'm curious about both of your reactions to that.
There are a variety of ways
in which I could imagine this sort of spurring
a little bit more, certainly discussion,
maybe even cooperation
among the countries that have a vested interest
in maintaining the stability of the markets,
maintaining the stability of critical infrastructure.
What exactly that will look like,
how good we'll be at that
in this particular political moment,
it's, of course, a little bit difficult to predict.
There, again, I think there is some advantage to everybody feeling like,
oh, we've all basically got access to roughly the same AI capabilities and not,
we've got the best ones, and so we're going to refuse to work with you.
And I think it's not clear to me, especially if you sort of follow the trajectory we were talking about before,
where all of our code is written by AI.
It has lots of backdoors that only AI can find, but they're not going to tell us about them.
I think that's not a great world to live in,
but I think it's a world in which a lot of governments
are going to find common cause
much more than they are right now.
And maybe not even just the AI as the adversary,
but if North Korea has the ability to shut down
everybody's critical infrastructure,
they're probably going to be a lot less restrained about that
than a number of other state actors have been in the past,
and that might also, you know,
prompt a higher degree of cooperation.
We have to know that this is
a different regime we're entering into. We're now talking about a world where it's not just humans who can do
the hacking. We're building AIs that can do the hacking. And you can't just negotiate with an AI and say,
don't hack me. You know, if I follow these things, will you not hack me? Like, the AI has its own
inscrutable logic. And this is sadly not science fiction anymore. I think the key to me that
unlocks the possibility for coordination is mutual recognition of an existential outcome. I think with
AI, if you have an AI that is hacking every major web browser and every major
operating system in the world successfully, and that's only going to get stronger, and the AI is going to be
able to do that on its own, and if I release it and screw it up, it might cause more existential damage,
and it's the existentiality of that outcome that motivates a trustworthy basis for collaboration.
To me, that speaks to how the U.S. and China should have something like a, just like there was the red
phone between the Soviet Union and the U.S. to de-escalate nuclear.
It seems like we need a red-lines phone for AI between the U.S. and China, by which I mean, you know,
anytime we have evidence of AIs that are going rogue or doing things like hacking in ways that we don't know how to control or stop,
at the very least, the right people in national security and the top of both governments should know about that same evidence.
Because that creates the common knowledge of, quote, the existential outcome that we're trying to avoid.
So to me, that is an achievable thing.
I'm not saying this because I have faith in the government leaders that they would do this.
I'm just trying to articulate, you know, the pathways that would be there.
And I'm curious if you all have other ideas.
Like if we were really designing and trying to scheme about how we would get to some safer world
at the level of international understanding and safeguards, what are other things that we would be doing?
Josephine?
So I think another piece of this that to me is important for thinking about that sort of mutual existential outcome
is thinking about how much shared digital infrastructure we all use, right?
How many of the same software programs are running on our computers all over
the world, how many of the same devices we're relying on. And I think a lot of the security
progress in this space is going to have to come from really close collaboration with those
companies. And so I think, you know, the cyber red phone, I think there might even have been
like a China Daily op-ed advocating for that 10, 15 years ago. I like that idea, right? I think it
makes sense to me that there would be some avenue for really trying to focus specifically on
these issues and not getting too mired down in everything else going on between these countries
at any given moment. But I also think we need to do a much better job of thinking about how do you
bring the private sector into those discussions, how do you both sort of respect and defer to
their expertise, and also not leave governments kind of completely on the sidelines as we're trying
to decide what kinds of restrictions and constraints we want to put on these systems,
and think really seriously about what those constraints are.
And I think that we are much more likely
to be able to put in place those restrictions
with more international cooperation.
I think the U.S. on its own is never going to say
we shouldn't be pursuing AI to develop bioweapons
because if they think China is pursuing that,
then they're never going to want to give up their access to it.
So I think it opens the door to being able to say,
look, this particular capability seems bad for all
of us. Let's take it off the table together and that way, you know, worry less about, oh,
are you going to get there first? I agree with all those points. What I would add here,
and you mentioned it briefly, Tristan, is that it's not just educating the government or bringing
companies in, but also educating the people, right, and making sure that everyone sees AI as
a threat as big as, or even bigger than, nuclear weapons. I do personally believe that AI is much more
a threat to humanity than even nuclear weapons. I mean, nuclear weapons
could kill a lot of humans,
but I think they wouldn't make us extinct as a race.
I do believe that AI could completely enslave the human race
in ways that sound like sci-fi, but it's not.
We already see totalitarian regimes,
like look at North Korea, to some degree, China.
Russia has parts of this.
They're just without AI, right?
It's just people with smart uses of technology.
And these smart uses of technology
make it really, really easy for a few people
to control a population.
And I think people don't understand
this the same way they understand that nuclear is bad. And if people would understand this,
they would put pressure on companies, on governments to just drastically change what we're doing.
You're speaking to what we call the attractor state of totalitarian lock-in. So once you're
locked into authoritarian governments that have both AI surveillance and AI hacking, how can you
as a citizen ever fight back if you have no secrets? You can't. Let's take a step down from the
international coordination bit, which we talked
about with China, and we want to go to policy solutions. And one of the things I think,
Josephine, you've written about is how, you know, you're not liable if you make a piece of code
that someone later discovers they can hack into. We don't treat the software maker as liable
for that. So the company that gets hacked has to deal with that themselves. And then we started
developing this new economics of an insurance market. Can you talk a little bit about what would be
the policy solution that we would do? And this is related to what Fred said earlier around
incentivizing companies to spend more on those tokens
to basically ask the AI system,
don't just write the code for me, write the secure code for me,
which means spend more money on compute,
but that's going to cost more.
So how do we deal with this from a domestic policy angle?
For the most part right now, we don't.
I think the hope for an insurance industry
would be that it would incentivize or require
companies that are developing software
to use state-of-the-art tools
for security testing, right? In the same way that, you know,
none of us would have smoke detectors in our homes if our insurers didn't require us to.
Maybe none of us would spend any money securing our code,
but if our insurance says you've got to do this or we're not going to cover
certain types of losses, then perhaps we'll be willing to.
And I do think that one of the other things that I find sort of hopeful about tools like
Mythos is that they could provide insurers with a clearer roadmap than they've had before
of what is it you should actually require of your policyholders to do in terms of security.
Is there, you know, a really solid approach that could be just a condition of the coverage?
You know, one thing that strikes me is that you're basically saying Mythos can change the economics
and almost create more precision pricing for insurers, saying,
here's what it would cost for you to basically use Mythos to do it.
Something that didn't hit me until now is that obviously the entire world is now dependent on five companies to secure themselves,
both for the vulnerabilities of the world and to protect themselves.
So it's a racket.
It's essentially if those guys went rogue,
they have basically, they have everybody, you know,
locked into paying them forever to protect themselves.
I think it's a reason to, you know,
be advocating for other models of artificial intelligence.
It's a reason to be thinking about the open weight models.
It's a reason to be thinking about sort of,
are there alternatives to a world in which there's a very, very small handful
of companies that hold all the cards.
But if you can say, like, look, here's a tool, you have to run it, you have to patch everything it finds,
that's actually a much more concrete piece of guidance.
Now, maybe it won't be perfect, maybe it won't be where we'll end up.
But it would certainly be a big step forward if it turned out to mean that we could then impose some liability
on developers who failed to use these tools for vulnerabilities that could have been caught but weren't,
if it means that insurers are going to condition their coverage on the use of these types
of tools. It will give a huge amount of power to these companies. No question. Will it give them more power
than like the cloud companies have right now? I don't know, right? Tech has always been a very concentrated
industry. I think that's a broader systemic issue than just with AI. Fred, do you want to speak to
your policy recommendations? I know mandating pre-deployment access for defenders, treating AI labs
as critical infrastructure. Do you want to speak to some of these solutions? Just before that, I want to say that I really
amplify this criticism
or maybe skepticism of a few companies
owning all this AI chain. I think
Josephine makes a good point that cloud companies
are also powerful. We have other
powerful semi-monopolies in the world.
I do believe that AI is in another
category than anything we've seen before.
So I think that is
really problematic. And I would love
to see more people-owned AI, if
possible, more decentralized
ownership structures. And we could make
policies, right, to approach that.
More security-specific, to be a
little bit more small-scale for a second,
I think there's a lot of things we could do.
Jason Clinton at Anthropic, I'm sure a lot of other people too,
talks about this one-day patch policy, and maybe it's even shorter now.
But I think that's great, right?
Every company should be able to just patch a vulnerability within 24 hours or even much,
much quicker, because we just have to.
It's going to be so stressful and time-dependent in the future.
Whenever a vulnerability is discovered, companies need to have the frameworks in place
to just patch that instantly because we can't wait and
be slow as we've been. Even a few weeks
is way too long.
One of the things you mentioned is treating AI labs
as critical infrastructure that they shouldn't be able to
maybe there's some public commons level way
of accessing this public utility of basically
defense so that maybe there's some amount they can charge
but basically they can't overcharge or there's got to be something
that just kind of makes it a commons of common security
because at the end of the day we need it
for securing a safer world.
Then the question is, is it just a national thing?
Are we still extorting all the international
allies, forcing them to pay for all these things?
It just gets into geopolitics and gets
complicated quickly. I think
that's such a good point. I'm definitely seeing AI
as a critical infrastructure and there's different arguments
here. If we make it an official
17th critical infrastructure sector
in the US, maybe we'll slow development,
maybe we'll create regulatory overlap
which can be problematic as well.
We could do that in a way
that I think gives
policymakers more power to demand security standards
and that might slow things down, but
that could also make us more secure.
I'm pretty positive about such an approach.
I don't think it will happen, but I like to advocate for it.
Maybe just to wrap up, what are some of the things that people can do just in their personal lives, you know, in light of Mythos existing?
If it can hack every operating system, people throw up their hands and say, what can I possibly do?
But let's give people some hope.
What are some basic things that people should be doing?
I think the advice I have, and it's the most irritating and obnoxious advice you can give,
but I think it's also the right advice
is that it's something people should be thinking about
when they're voting, right?
That the question of how politicians
are approaching artificial intelligence
and whether they think there should be any safeguards
and whether they're willing to challenge any of the companies
that are developing it is really important
and it's only going to get more important
as those companies are pouring more and more money into lobbying.
There are a whole bunch of issues to think about
when you vote today,
and I'm not going to tell you it's the single most important one,
but I think it's a very important one and only becoming more so.
Well, it's a monopoly of enactment where once this happens,
there's no more enactment of anything by citizens because,
and so from that perspective, there's a weird way in which it's like,
okay, well, is this actually more important than the price of eggs or gasoline
or whether my kids have school?
Well, it's like, well, but if I can't,
if I'm about to lose my political power permanently,
then it actually is the most important thing.
This should be the number one issue in the midterms,
and people do have a say.
And if they can share this episode,
share this material,
go watch the AI doc, get people to see it,
recognize that we're not heading to a pro-human future by default,
and we want to be moving towards a pro-human future
and against the anti-human future.
But I do think that this conversation is, you know,
trying to play a role in clarifying the nature of the problems that we face
so that we make sure that we're putting in the policies,
putting in the guardrails,
and also putting forward, as you said, Fred,
basically the collective problems
that we need everyone's minds on solving.
How do you protect citizen secrets in a world
where AI can hack those secrets?
What are the new laws? What are the new
code level protections so that
anybody who has access to such a thing, for example,
gets locked. Here's the one system
that can hack into computer systems.
If you're using it, there has to be oversight of who's using it
and for what. And that has to be enforced
at the level of code, basically.
Okay, well, I'm just going to give the most irritating
cybersecurity advice. And again, I'm
only going to give it because I think it's the right advice.
You want to be really aggressive about installing the updates, as annoying as you find them,
as much as you want to tell your computer and your phone to delay them.
You want to be really careful about how you're using AI, what you're giving it access to,
what pieces of your digital life, what pieces of your data are being fed into it.
You want to be really thoughtful about which companies, AI tools and products you're using.
You want to, you know, think carefully about who's running those companies and what their interests are.
And in a moment of deciding, do I need AI for this or maybe not,
I think it makes sense right now to err on the side of maybe not.
I think this is really good advice.
Some things to maybe take that one step more extreme just to do it, right?
Well, let's say something really bad happens, in terms of a totalitarian lock-in
where the people just don't have control anymore.
And that could go quickly, because all of these AI models are right now being used,
just as social media companies also use their tools, to collect
what you think, what you do, what your digital footprints are.
And right now that's being used heavily to create ads, right?
And, fair enough, that's annoying, but maybe you can live with that.
But that is to a larger degree being used to nudge you in a different direction,
making you think in a different direction.
So what information do you digest online?
I think it's really important to think about this.
I think there are these statistics that the younger generation gets 90% of their news from social media.
What accounts do you follow?
Are these people rational human beings who seem to know what they're talking about
and present both sides of the arguments?
Maybe I can add one thing, Tristan.
You've spent a lot of years, you know, let's say almost a decade,
on just trying to figure out how can we counter these incentives of social media, right?
And I think it's fair to say we failed as a society to incentivize social media.
These are for-profit companies that have done really, really bad harm to the human population
in terms of dopamine hijacking and other things.
And we're now starting a similar thing, right, but with AI companies.
We have these for-profit AI companies.
They're obviously seeking to maximize shareholder value and profit
as they develop their AI models.
Are we going to repeat the same mistake again?
And we really shouldn't.
We have to learn from the mistakes with our failed social media regulation
and try to make AI into something better.
And that would be really good if we take it seriously.
And I don't think we take it seriously right now.
Yep, and we can. We're in a critical window. If we play our cards right, we can make sure that defenders get access to this first.
We can have regulation that tries to close the gap of the extra costs for adding security.
We can have international coordination with enforceable metrics that we're doing the verification.
This could end better than it did with social media.
But if we don't, the internet becomes basically unusable for people who don't have top-tier tools.
And I do think that this qualifies as kind of a Manhattan Project kind of moment.
And we need everybody who works in cybersecurity,
who has any interest or any capability or talent in these areas, to work on defense right now.
You can think of AI as kind of introducing a Y2K vulnerability in all of society, but in a rolling way.
So we kind of have a rolling mobilization, a wartime mobilization, to defend our systems from the new
vulnerabilities that AI creates.
You know, I hope this conversation helps activate everyone in every corner of society, whether it's
policymakers or people listening to this, to take part in this.
And again, vote in the midterm elections.
This is not inevitable.
Fred and Josephine, thank you so much for coming on Your Undivided Attention.
This has been really fantastic.
Thanks for having us.
Thank you so much, Tristan.
Your Undivided Attention is produced by the Center for Humane Technology,
a non-profit working to catalyze a humane future.
Our senior producer is Julia Scott.
Josh Lash is our researcher and producer,
and our executive producer is Sasha Fegan.
Mixing on this episode by Jeff Sudakin,
original music by Ryan and Hays Holladay.
And a special thanks to the whole Center for Humane Technology team
for making this podcast possible.
You can find show notes, transcripts,
and much more at humanetech.com.
And if you like the podcast,
we'd be grateful if you could rate it on Apple Podcasts
because it helps other people find the show.
And if you made it all the way here,
let me give one more thank you to you
for giving us your undivided attention.
