Big Technology Podcast - Anthropic's Co-Founder on AI Agents, General Intelligence, and Sentience — With Jack Clark
Episode Date: April 10, 2024. Jack Clark is the co-founder of Anthropic and author of Import AI. He joins Big Technology Podcast for a mega episode on Anthropic and the future of AI. We cover: 1) What Anthropic and other LLM providers are building towards 2) What AI agents will look like 3) What type of training is necessary to get to the next level 4) What AI 'general intelligence' is 5) AI memory 6) Anthropic's partnerships with Google and Amazon 7) The broader AI business case 8) The AI chips battle 9) Why Clark and others from OpenAI founded Anthropic 10) Is Anthropic an effective altruism front organization? 11) The risk that AI kills us 12) The risk that AI takes our jobs 13) What regulation would help keep AI safe? 14) Is AI regulation just a front for keeping the small guy down? 15) LLMs' ability to persuade --- Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice. For weekly updates on the show, sign up for the pod newsletter on LinkedIn: https://www.linkedin.com/newsletters/6901970121829801984/ Want a discount for Big Technology Premium? Here’s 40% off for the first year: https://tinyurl.com/bigtechnology Questions? Feedback? Write to: bigtechnologypodcast@gmail.com
Transcript
Anthropic co-founder Jack Clark is here to dive into the company, its partnerships with Amazon and Google,
where AI innovation heads next, and plenty more in this mega episode about Anthropic and AI coming up right after this.
Welcome to Big Technology Podcast, a show for cool-headed, nuanced conversation of the tech world and beyond.
We're so lucky today to have Jack Clark with us. He is the Anthropic co-founder, formerly a journalist, formerly of OpenAI.
We'll get into all of that.
He also writes Import AI.
It's a great newsletter all about AI that you can sign up for.
And we're going to talk with someone who's at the center of one of the big companies working on AI
and just go deep into what this field is doing, where it's heading, and what we should
look forward to.
And it seems like there's plenty.
So Jack, so great to have you here.
Welcome to the show.
Thanks for having me.
Let's just talk broadly about what's happening in the world of AI because I can tell
you, like as somebody who is observing this stuff, it seems like every big foundational research
company, by the way, so for listeners, Anthropic has a great chatbot called Claude, which
if you've listened to the show, you know we're fans of. And then also the foundational model
in the background that companies can build off of also called Claude. From the outside,
it just looks like you guys, Google, OpenAI, all building these models and trying to build
better chatbots, and for some reason, that's worth billions and trillions and trillions of dollars.
So what are you guys all doing?
And where's the competition now?
And when are we going to start to see this payoff?
Let's just start real broad.
You know, broadly, what's Anthropic trying to do?
We are trying to build a safe and reliable general intelligence.
And why are we building Claude?
Because we think the way to get there is to make something that can be a useful machine that
knows how to talk to you and can reason and see and do a whole bunch of things through text.
I mean, you're a, you're a journalist, you write a lot.
Most intelligent things we do in the world, at some point it hits writing and text and
communication.
So we're trying to do that.
How does the competitive landscape look?
Well, it's an expensive business.
It costs, you know, tens of millions, maybe hundreds of millions of dollars to train these
things now. Back in 2019, it cost tens of thousands of dollars. And so what we're seeing in the
competitive landscape is a relatively small number of companies, ourselves included, are competing
with one another to kind of stay on the frontier and turn those frontier systems into value
for businesses. So this is going to be a really exciting and I'm sure drama-filled year
when it comes to that competition. Okay. And the expense, it's compute and
talent. It's, and I say this with love for my colleagues, it's mostly compute.
Talent matters and data matters. The vast majority of the expense here is on
compute to train the models. Okay. And we'll definitely talk a little bit about where the
hardware is going. We're talking in a week where Google, which has put billions of dollars into
Anthropic, announced that they have a new arm-based chip for AI training. And I definitely
want to hear your thoughts about that.
You know, let's talk a little bit about, so you talked about in the beginning about how you
want to build a general intelligence, which is basically, I mean, I'd love to hear your definition
of it, but I think the most commonly accepted definition is a computer that can do basically
everything a human can do.
So I think of general intelligence as a system where I can point it at some data or some
domain, be that a domain of science or
something in business. And I can ask it to do something really complicated. Kind of like if I had a
really senior colleague and I said, go and figure this out. Go and figure out how EU AI policy works
post the AI Act and how we expect it to work for the next five years. If that's something a human
colleague of mine might do today, Claude would not do super well, but a kind of super Claude,
an advanced version, might be able to go and read all of the policy literature that exists,
look at all of the discourse around that and reason about what the policy impact of the AI Act
will actually mean and what it will turn into. And similarly, you might ask Claude,
hey, what is the impact of rising fertilizer prices going to be on the tractor market?
And Claude might read all of the earnings reports of all of the companies and all of the technologies
relating to tractors and fertilizer and come up with a good answer. So a general intelligence
is something where I can ask it, a really complicated question that requires a huge amount
of open-ended research, and it goes and does all of that for me in any domain.
And that's, I think, a good way to think about what we're driving towards here.
So let's take your definition as, we're going to try to poke holes in it in a moment,
but let's just take this definition as a jumping off point for the next few questions.
Why can't Claude ingest all that information today?
and what are the technical limitations
that are stopping it from doing that?
And then do we actually really need a general intelligence
if I could per se like just drop those reports into Claude?
I mean, that's one of the interesting things about Claude
is you can drop anything in there and it will read it.
I mean, I will probably, after this podcast,
take the transcript from Riverside,
drop it into Claude and talk to Claude about this interview
and it will be able to converse with me about it
in like a pretty impressive way.
So why don't you tackle those two,
then we'll move on to your definition.
So today, these systems are very, very powerful,
but they're also kind of static.
It's like they're standing there waiting for you to talk to them,
and you come up and give them a task.
They go and do the task.
But they don't really take sequences of actions,
and they don't really have agency.
So you can imagine that I ask Claude to go and figure out this
EU stuff. And today, Claude might do an okay job. I give it a bunch of documents. In the future,
I want it to not be limited by its context window. I want it to be able to read and think about
hundreds or hundreds of thousands of different things that have gone on. I also want Claude to
ask me clarifying questions, kind of like a colleague where the colleague comes back and says,
hey, you asked me this, but I've actually been looking at all of this generative AI regulation
that's come out of China recently. And I think that's going to matter for how the EU
policy landscape develops. And then you say, oh, well, actually that's a good idea. Go and look at that
too. That's a kind of agency that today's systems lack. In some sense, we need to build systems
that can go from being like passive participants that you delegate tasks to active participants
that are trying to come up with the best ideas with you. And that requires us to make things
that can reason over much longer time horizons
and can learn to play an active role with humans,
which is a weird thing to say
when you're talking about a machine that you're building.
And maybe we can get into it,
but one of the challenges in building a general system
is general intelligence comes from like interplay
with the world around you and interaction with it.
And today's systems don't really do that at all,
to the extent they do, it's kind of a fiction,
and we need to teach them how to do that.
And so how far away are we technically from being able to do this stuff?
I think this year, you're not going to see the exact thing I described,
but you're going to see systems that start to take multiple actions.
You know, you may have heard lots of guests talk about things like agents.
I think what an agent is is a language model or a generative model like what we have today,
but it can take sequences of actions.
It can kind of think on its feet a bit more.
We're going to start seeing that this year.
I would be pretty surprised if in the order of like three to five years we didn't have quite
powerful things that seem somewhat similar to what I've described. But I also guarantee you we
will have discovered some ways in which these things seem wildly dumb and unsatisfying as well.
Right. And so you basically also answered my second question about just dumping things into the
bot and talking with it about it. What I'm talking about is thinking way too small. You guys are
thinking much bigger. Yeah, you want the system to maybe it takes in some documents from you,
some ideas that you have, and then it goes and gathers its own ideas. Maybe it comes back to you
and says, hey, like I thought this would be helpful. I did all of this research too. Exactly like
when you have a good idea and you go and do some off-the-wall research and it helps you solve
a problem you were working on, which might seem unrelated, because you've done something really
creative. Okay. And so then you also sort of touched on where I was going to push back a little bit
on your definition, which is that general intelligence, like to have real intelligence of the
world, you have to be in the world. And we've definitely talked about it on this show that
large language models are limited because they just know the world of text. So how do you
train one of these models to be aware of the world? I mean, so much of the knowledge that we
have is just by going out and being in the world.
And how do you then train this model to be able to comprehend that?
So there's a technical thing and then there's a usage thing.
The technical thing is you get the models to understand more than text.
You know, Claude can now see images.
Obviously, we're working on other so-called modalities as well.
You know, it would be nice for Claude to be able to listen to things.
Be nice for Claude to understand movies.
All of that stuff is going to come in time.
But a colleague of mine did something really interesting to try and give Claude context.
The colleague, whose name is Catherine Olson, spent several days talking to Claude,
our new model, Claude Opus, which is our smartest model, about every task she was doing through the day.
It was a giant long-running chat, and it was her also saying, like, oh, I feel a bit blocked, I need to take a break.
Could you kind of give me some ideas of what I should do?
or, okay, Claude, now I've done this.
I really didn't enjoy this sort of work
but I got through it, you know, being very honest with the bot.
And then at the end of about three days, she said,
okay, Claude, I'm going to talk to a new instance of you.
Can you write a summary of this conversation for the next Claude?
So the next Claude knows everything about me
and how I like to work and where I get blocked.
And Claude wrote a short text summary,
which Catherine now integrated into her own systems.
So whenever she asks Claude a question,
she puts this into the context window, kind of like a cheat sheet about her written
by an AI system which she spent a few days working with. How we give these AI systems context
about the world is going to be stuff like that. Like you work with them over long periods
of time, they understand you in your context, and then they'll write messages for future versions
of them. It's like the Christopher Nolan film Memento where they don't remember exactly where they
came from, but they have a message.
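To make the pattern described here concrete, below is a minimal sketch of the "cheat sheet" workflow using the Anthropic Python SDK. The model name, prompts, and helper functions are illustrative assumptions for the sketch, not Anthropic's internal tooling or Catherine Olson's actual setup.

```python
# A minimal sketch of the "cheat sheet" memory pattern described above.
# Assumes the Anthropic Python SDK and an ANTHROPIC_API_KEY in the environment;
# the model name, prompts, and function names are illustrative assumptions.
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-3-opus-20240229"  # assumed model identifier


def summarize_for_next_session(chat_history: list[dict]) -> str:
    """Ask the model to write a note about the user for a future instance of itself."""
    response = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        messages=chat_history + [{
            "role": "user",
            "content": "I'm going to talk to a new instance of you. Write a short "
                       "summary of this conversation so the next assistant knows how "
                       "I like to work and where I get blocked.",
        }],
    )
    return response.content[0].text


def ask_with_cheat_sheet(cheat_sheet: str, question: str) -> str:
    """Prepend the saved summary as a system prompt, like the workflow described above."""
    response = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        system=f"Notes about this user, written by a previous session:\n{cheat_sheet}",
        messages=[{"role": "user", "content": question}],
    )
    return response.content[0].text
```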
And what is technically limiting them from just remembering us altogether?
Or can't you just program that into Claude automatically to be like take these notes in the background
and then when they come back, just like load up the user file?
You could.
You could absolutely do that.
But I think ultimately you want Claude or any of these systems to get smart enough
that they know when to do that themselves.
We're like, oh, I should probably write myself a note about this and store it here.
Or I should write myself a note about that.
And I think to some extent that's going to come through making more advanced systems and eventually seeing when this stuff natively emerges.
It'll also come through seeing stuff like what my colleague did and trying to work out if it's useful and if it's a behavior you want to kind of have the system take on.
Now, in terms of limitations, we have something called a context window.
Ours is about 200,000 tokens for a context window; some are in the range of millions to tens of millions now.
Think of it as your short-term memory. It all costs money. It costs money in terms of your like
RAM memory that you're using to run the thing. And it's a bit unrealistic. Like in the human brain,
we have long-term storage, which we have like huge amounts of. And we have short-term
storage, which is if I ask you to remember a phone number, you can remember like a small number
of numbers, maybe not even the phone number. I struggle. Well, our AI systems today are kind of
operating with short-term memories that are millions of numbers in length, and it feels very
unintuitive. Ultimately, we want them to instead be able to bake stuff into some kind of long-term
storage. And that's going to take more research and experimentation, I think. Because the models are
just going to have to be more efficient, more powerful in order to be able to have that memory.
That and, you know, Anthropic recently released some things that we call tool use, where we're
trying to make it easier for our models to interact with other systems.
like databases, for instance.
You want the systems to learn to use systems around them
to be like, oh, I should take this out of my context window
and stick it in a database, and then I can talk to it
through the API.
It's stuff like that.
And that's under development now?
Yeah, it's under development.
It's, I think that we are trialing it at the moment
and recently had some discussions about the beta,
which has just started, and we'll be rolling it out more broadly soon.
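As a rough illustration of the tool-use idea Clark describes (offloading material from the context window into an external store), here is a hedged sketch. The tool schema follows the shape of Anthropic's publicly documented tool-use beta, but the tool itself, the toy in-memory "database," and the model name are assumptions, not a definitive reference for the feature.

```python
# A hedged sketch of the tool-use idea described above: the model is offered a
# "store_note" tool backed by a toy key-value store standing in for a database.
# Treat the exact parameter names and flow as assumptions, not official docs.
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-3-opus-20240229"  # assumed model identifier
notes_db: dict[str, str] = {}     # stand-in for a real database

tools = [
    {
        "name": "store_note",
        "description": "Save a piece of text under a key so it can be recalled later.",
        "input_schema": {
            "type": "object",
            "properties": {
                "key": {"type": "string"},
                "text": {"type": "string"},
            },
            "required": ["key", "text"],
        },
    },
]

response = client.messages.create(
    model=MODEL,
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content":
               "Summarize our plan and store it under the key 'eu-ai-act-plan'."}],
)

# If the model decided to call the tool, execute it ourselves and persist the note.
for block in response.content:
    if block.type == "tool_use" and block.name == "store_note":
        notes_db[block.input["key"]] = block.input["text"]
```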
So one more question about this. There are some things that you're going to talk to the bots about,
and there are some that, like, you would never really talk to it about in the real world. One example:
in the early days of ChatGPT, we had Yann LeCun here talking about how dumb these bots were,
in his opinion, and he had me do this experiment where I asked
ChatGPT: I'm holding a paper up from two sides and I let go of one side, where does the paper
go? And ChatGPT was unable to figure that out because that was just not represented in text.
Do you think that to get to general intelligence, we're going to have to program in all like the
real world physics to these things? Or I'm kind of getting the sense from you that maybe that's not
actually so important. So at Anthropic, we have this public value statement, which is do the simple
thing that works. But actually internally, we sometimes say an even cruder version, which is do the
dumb thing that works, which is like next token prediction, which is how these generative models
work, shouldn't work as well as it does. I think actually if you're like a very intellectual
scientist, you are offended by how well this works because you're like, I would like it to be
somewhat more complicated than just predict the next thing in a sequence. And yet, if you had
been in the business of betting against next token prediction for the last few years, you would have
lost again and again and again.
And everyone keeps being surprised by it.
I've sort of learned to, even though I myself, I'm skeptical of this because it seems so wildly simple, but I've learned to not bet against it myself.
And I guess my naive view is the amount of things we'll need to do that are extra special will probably be quite small.
And the challenge is coming up with simple ideas like next token prediction that scale.
There are probably other simple ideas we need to figure out, but they're all going to be deceptively simple.
And I think that that is going to be a really confounding and confusing part of all of this.
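For readers who want the "dumb thing that works" spelled out, next-token prediction is just cross-entropy loss on each position's successor token. A minimal PyTorch-style sketch, with a stand-in toy model rather than a real transformer:

```python
# Minimal sketch of the next-token prediction objective described above:
# shift the sequence by one and train with cross-entropy on each successor token.
# TinyLM is a toy stand-in; any model mapping token IDs to per-position logits works.
import torch
import torch.nn as nn

vocab_size, d_model = 1000, 64

class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):                 # tokens: (batch, seq)
        return self.proj(self.embed(tokens))   # logits: (batch, seq, vocab)

model = TinyLM()
tokens = torch.randint(0, vocab_size, (8, 128))   # toy batch of token IDs

logits = model(tokens[:, :-1])                    # predict from all but the last token
targets = tokens[:, 1:]                           # each position's "next token"
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()  # real training repeats this in a loop with an optimizer step
```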
Let's talk a little bit about this.
You just brought up this next token prediction being impressive for what it can do.
There's a little bit of a debate actually about it, right?
So these large language models, people have talked about how basically they'll just spit out their training data,
and there have been other people who talk about how there are emergent properties here
and that it can actually, you teach it like, say, 75% of a field and it will figure out that
extra 25% on its own. What do you think about that debate and where do you stand on that?
It's really, really hard to know. I mean, I write short stories at the end of Import AI. I've been
reading fiction and short fiction for my entire life, huge amounts of it. Some of these stories
may be ripping off authors I like in their style. Maybe I'm not writing a really
original story, but I'm like, I want to write a story like Borges, or I want to write a story
like J.G. Ballard. And sometimes I think I've had an original idea. And from the outside,
it's really hard to know what's going on. I myself don't really know. You know, creativity is kind of
mysterious. Is Jack, like, coming up with original stories? Has Jack just read a load of stories
and is coming up with stories that are kind of like vibey and interesting, but it's entirely
informed by what he's read? It's hard to figure out. And I think that when we evaluate,
Claude and try and understand what it is and isn't capable of, you run into this problem.
Like, if the thing hits all of these benchmarks, gets all of these scores, does it truly
understand it, or is that coming from some spurious correlation?
So there's one way we're approaching this, which is a little different to other companies.
We have a research team called interpretability, and they're doing something called mechanistic
interpretability. The idea being that when you ask me, you know, what's the next sci-fi story
for this week, I think of a load of stuff. I try and think of different plot lines or characters
or vibes I'm trying to capture. When we ask Claude, you know, write me a story or solve this
business problem, we can't really look inside it today. And that's what this team of interpretability
scientists is trying to do, because then we can understand if there's some internal, you know,
stuff going on that looks like creativity, where Claude is like, oh, I need to, I guess I'm, like,
when you ask me that question, my imagination is going to spark with these different features
and things, and it's going to be a lot more complex than something that looks like cut and paste
or copying. We're really trying to figure that out. But this feels like an essential question.
I think it's very confusing to even know how you study this in humans.
Well, let me put a question to you that I think is going to be dumb, but maybe your answer will be telling.
I mean, why couldn't you just teach it 75% of a field and see if it starts to grasp the other 25%?
So we do do some of this.
And concretely, Anthropic has a line of work on what we call the Frontier Red Team, where we are doing national security relevant evaluations.
Now, we do that for a couple of reasons.
One is we don't want Claude to create national security risks.
Simple idea that, you know, we can all get behind.
Not do that, yeah.
Yeah.
Yeah.
A crazy company strategy.
But the other thing is that national security risks relate to fields of knowledge.
We've done work in biology where some percentage of that knowledge is classified.
Claude has never seen it because it doesn't exist anywhere Claude could have seen it.
And one reason I'm really excited about those tests is if Claude can figure out things and trigger, like, threshold points on those evals, we know something creative is happening.
Because Claude has reasoned its way to things that the government has believed are very hard to reason your way to unless you have access to certain types of classified information.
So that's one of the best ways I've thought of for getting to this, getting to sort of answer this
problem. We don't have answers today. We're like in the midst of doing all of this testing,
figuring out how to traverse all the classification systems. But it's one of the things I'm
really excited about because it would provide, I think, very convincing proof that it's doing
something quite sophisticated. Okay, you got to keep us posted on where that goes. So hard thing to
talk about, but I'll do my best. Yeah. Yeah. Well, anyway, we'll be patient.
Business listeners or business-minded listeners, the good stuff for you
is coming up in a moment. Technical-minded listeners, this is your moment to shine because I do
have a technical question for you, Jack. So we've been talking about large language models.
The way to train them, as far as I know, is self-supervised learning, which is effectively
you have these gaps and you get it to predict the next word and then or the next thing in the
pattern and it's able to do that. And there's another type of training in AI called reinforcement learning,
which is effectively you give a bot, you know, let it play a game
and you don't tell anything about the game
and it plays the game a million times
until it figures out how to win it and that's the way it wins
and that's, you know, another way to train AI.
Two different fields.
And we're starting to talk about agents
and how to be in the real world and stuff like that.
Do you think that we are going to see emerging of those two,
those two types of AI training or have we already?
We already have, I mean, a lot of the reason that we're sitting here today is that people took language models, which were trained in the way you describe, and then they added reinforcement learning on top.
They added either reinforcement learning from human feedback to make language models understand how to have a conversation.
That's where, you know, some of the recent really impressive things in this field come from, including ChatGPT.
There's also been work that Anthropic developed on something called reinforcement learning from
AI feedback, where the AI system generates its own data set to train on, and we use a technique
called constitutional AI, to help the system use that data set and learn through reinforcement learning
how to kind of embody the qualities or values embedded in it. That's why we're sitting here.
It's one of the things that took these language models from what I think of as, like, kind of
inscrutable, hard-to-understand things to things that you can just talk to like a person
and, you know, sometimes they get it right, sometimes wrong, but they're a lot easier to work
with. So that's already happened. But now I was just having this conversation at lunch.
Everyone is trying to figure out how they can spend more and more of their compute on reinforcement
learning. Because I think everyone has this intuition that the more RL you add, the more sophisticated
you're going to be able to make these things.
And a lot of what you're going to see this year
and probably in coming years
is amazing new capabilities
arrive in these systems
and it will be because people have figured out
simple ways to like scale up
the reinforcement learning component.
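To ground the RLHF discussion, here is a minimal sketch of its core ingredient: a reward model trained on pairwise preference comparisons, the same comparison idea behind the 2017 Atari work mentioned later in the conversation. The tiny network and random tensors are stand-ins; real pipelines use a language-model backbone and then optimize a policy against this learned reward with an RL algorithm such as PPO.

```python
# Sketch of the pairwise preference (Bradley-Terry style) loss used to train a
# reward model for RLHF. The toy network and random "features" are stand-ins;
# in RLAIF, the preference labels come from another model instead of a human.
import torch
import torch.nn as nn

class TinyRewardModel(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, features):               # features: (batch, dim) per response
        return self.score(features).squeeze(-1)

reward_model = TinyRewardModel()
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

# Toy "chosen vs. rejected" pairs; in practice these represent two model
# responses to the same prompt where a rater picked a winner.
chosen, rejected = torch.randn(16, 32), torch.randn(16, 32)

# Maximize the log-probability that the chosen response outranks the rejected one.
loss = -torch.nn.functional.logsigmoid(
    reward_model(chosen) - reward_model(rejected)
).mean()
loss.backward()
optimizer.step()
```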
Yeah, I think one interesting thing about AI
is that the prevailing wisdom
tends to think that one part
of the AI field is not worth spending any time on.
And then a company spends time on that
because they have to take a different tack and they end up doing well and they prove it works.
Machine learning was like that.
I mean, Yann, who was a machine learning pioneer, it's like, we've got to do this deep learning
stuff and everyone's like, get out of this, get out of here, get out of this field.
You can't be at this conference.
Exactly.
And then it just proved to be the best way to do AI.
And a similar thing happened with large language models where reinforcement learning was the thing
and OpenAI started working on these self-supervised chat models,
and that ended up being the thing that's led us here.
And it was interesting.
I was speaking with Demis Hassabis,
the Google DeepMind CEO, who will hopefully get on the show later this year.
who will hopefully get on the show later this year.
And when I was profiling him for big technology,
it was interesting because LLMs and self-supervised generative stuff
were such a backwater that it effectively got no compute,
no attention within DeepMind.
And it took OpenAI taking that counterbet to actually make this happen.
Yeah.
And the funny story is how things
loop back around. I remember, you know, Dario Amodei, who is the CEO of Anthropic, I've worked
with him for many years. We both used to work together at OpenAI. Back in 2017, there was a project
that he led called Reinforcement Learning from Human Feedback, where we were trying to get
game-playing agents that play Atari games to play better by having a human watch the agent playing
the game in two different episodes, and the human would pick which was the better approach. And
you gather loads and loads of this stuff and you are able to make better game-playing agents.
Fast forward a few years and what have people done, they've taken language models and
stapled them together with reinforcement learning from human feedback. And that's how we've
got systems that can sort of speak in this interesting way. And so the lesson I got from it
is, yeah, never count things out. They may come back or the technique may be too early and
it'll loop back around to relevance in really surprising and interesting ways. And right now it's
kind of like, these language models are kind of like that old video game character Kirby.
They're like sucking up all of the, all of the other techniques in AI research into themselves,
and everyone's trying to staple them on top, and they keep on working surprisingly well.
So I think we can expect a lot more surprising stuff in the future also.
Yeah, it's what makes the field so interesting and really like the characters in the field,
you're like, ah, okay, now you're irrelevant and now you're a leader.
And now, you know, you who were the leader trying to catch up with the person who was, you know, the outcast a few minutes ago.
So let's talk a little bit about the business thing.
I mean, you've raised more than $7 billion.
This stuff all sounds cool.
But in terms of like, I mean, yeah, well, anyway, it sounds cool.
And maybe I'm underselling it.
The current things that we've seen, though, in terms of like how AI has been applied, you know, we have these chatbots.
But usage is up and down, right?
ChatGPT, the growth is flatlining.
We have the data there. We've seen not a big shift from Google to Bing. We have some
really interesting enterprise use cases like being able to talk to your documents or, for instance,
like, you know, throw a podcast transcript in and, like, get a summary. Or, like, I talk to
Claude sometimes. I'm like, which questions did I miss? And like I use that to think about how I
structure the next show. But it doesn't feel like, you know, tens of billions of dollars
of value has been created.
I mean, you have, like, maybe people are paying $30 a seat for Microsoft Office
or a little bit more for Google Workspace.
So what do you think, like, we won't go too deep into this,
but what do you think the business case is going to be here
that justifies all that money that's been put in?
Yeah, so there's a couple of ways to think about this that we see already at Anthropic.
One is to refer back to my colleague Catherine Olson, who I mentioned earlier,
people just find ways to use this stuff and make themselves generically better at whatever they're trying to do.
I think there's going to be this very large growing business of basically a subscription model
where people will have a personal AI or multiple AIs that they use just like you or I might have a Netflix account or whatever.
We use that. It helps us. We do a bunch of stuff with it. Job done. There will be work in businesses on taking things that happen in business and
using AI systems to kind of transform from one domain to the other, both things like customer
service, but also once you have that customer service data, how do you catalog it and
put it into a schema and put it into a database?
All of this backend stuff is like extremely valuable and today done by huge amounts of
point pieces of enterprise software.
And we keep on finding that just a big language model can do most of this very effectively.
And now you have one system that does a whole bunch of stuff.
The really exciting thing is, you know, at Anthropic, we work with some of our customers very
closely. We embed engineers with them. We do co-development of things. And there's not too much I can
say right now. We're going to have case studies in a while. But what we see is that when you
actually embed with the business and think about, you know, to use that kind of hackneyed term, business
transformation, you get them to change their business on the assumption that they now have
AI. You can get really, really valuable things. And the analogy I'd give you
is at the beginning of the industrial revolution,
you had electricity,
and people would come into factories
and be like, here's a light bulb.
And you'd be like, okay, all right, I'll pay for the light bulb.
Fine. I understand light.
And then they'd be like, here's a machine
I've put some electricity into,
and you're like, okay, but I have all of this stuff
like that's never been built
on the assumption there was electricity.
This actually doesn't work that well for me.
And then you had some factories
where people said,
I'm going to build a factory from the ground up
on the idea there's electricity.
and you had electrified production lines,
you had entirely new ways of making stuff.
Right now we're in this era
where the lights have arrived in the factory
and people are like dropping individual things in
with some AI stuff and it's maybe valuable
but also confusing and you're figuring out
how to integrate it.
But we're also seeing some businesses that are saying
I'm going to build myself on the assumption
that AI is kind of at the center of my business.
And those businesses are starting to like
develop and grow really, really quickly.
So I think that where the
value is going to come from will be from that second class of businesses, which we're just in the
early innings of sort of helping to build together. Right. And when you get to that, let's say
you get to that general intelligence that you talk about, or let's say close, does that change it
even further? I think so. I mean, we have a project internally called Claude-ification. Everything at
Anthropic has Claude in it at some point. And one of the ideas of Claude-ification is just to get
us all to use this stuff well. I talked about my colleague, Catherine, but there are many
examples where we've built a whole bunch of tools inside of Anthropic to ensure that we're
using Claude, sometimes even without realizing it. It's doing stuff in the background that's
helpful, it's helping with certain coding things, because we've noticed that that makes us just
faster. It makes the whole business start to move faster because you're sitting on this like bed
of like semi-visible intelligence. And I think that that's some of what we're going to see. And as
you get really, really general things, businesses that are well positioned to kind of plug it in
in a bunch of places will probably move really quickly and be able to operate at a much higher
speed than others.
Wait, how is it working in the background?
Is it like, you know, you have your Zoom meeting and it's taking notes or is it anything
deeper than that?
I think we actually did build a plug-in like that, but there's a few things like if you're
pushing code into the repo, maybe in the background, it helps ensure that you built all of the tests
for it. You know, stuff like this, which everyone has to do, but you're like, these are things we
do every day. We could try and get the language model to do it. And really here, what we're doing
is just stuff that we also see customers do, where customers can access a language model and they
think, what are all the things I do lots of that a language model could help with? I think we're just
trying to do lots and lots of that. Right. And do you think at the end of the day, if you get to where
you want to get to, or even, let's say, you get to where you're going in the near term, is this an
enterprise thing or is this consumer product primarily? So I feel genuine confusion here in that
like I myself use this stuff loads as an individual, but I kind of suspect some of the really big
like value unlocks will be getting a group of people to work together in ways they've like
never worked together before using this AI stuff, which kind of points me towards the enterprise.
But the odd thing is that this stuff is just useful to me as a consumer today.
And I'm kind of like, I know that there's going to be some large pool of value out there.
And I feel like it's probably in the enterprise, and that's part of the kind of strategy of the company.
But we're always going to have some, like, top of funnel, easy-to-access consumer thing
because we just can't ignore how useful this is to people, you know, and how useful it is to writers, especially.
Yeah, it's definitely been useful to me.
And it's good for research too, but I also, I guess there's the hallucination problem to wonder about, although it seems like this new model, Claude Opus, does a lot better with hallucinations. So two questions on that. How have you guys been able to reduce hallucinations? And we got this question on Twitter, somebody asking, when are you going to just connect it to the internet? Because it would be way more useful if it could, like, connect to Google or something and go and fetch a search and then give you the answer.
Are you working on that?
Yeah.
So on the honesty thing,
I won't get too much into the details,
but basically we published this paper a while ago
called Language Models (Mostly) Know What They Know,
which was where we found out that, like,
early versions of Claude knew when it was making stuff up.
It had, like, confidence levels,
and we were like, oh, Claude knows when it's, like,
about to, like, make something up
or when it's a lot less confident.
And we did a lot of work to say, okay, can we, can we train Claude to just have much better instincts for when it knows it's making stuff up?
And can we train it to know when that's appropriate, like you're brainstorming or you're coming up with stories, and know when it's inappropriate, like when a user is clearly asking a question that they want a factual answer to?
So we did a load of work on that.
A lot of the work here looks like that where we do very exploratory research with the goal of figuring out these larger safety things.
Then we try and apply it to the thing that we eventually put into business.
And on the web question, we're working on it.
There's a bunch of kind of computer security stuff to work through and some safety things.
But that's definitely coming.
We're excited to get that out too.
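Going back to the calibration point: as a rough illustration of the "knows when it's making stuff up" idea (not Anthropic's training recipe), you can approximate it at inference time by asking the model to grade its own answer and abstaining below a threshold. The prompts, model name, and 0-100 scale are assumptions for the sketch.

```python
# Illustration of calibrated abstention, inspired by the discussion above but not
# Anthropic's actual method: get an answer, then ask the model to rate its own
# confidence and abstain below a threshold. All prompts and names are assumptions.
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-3-opus-20240229"  # assumed model identifier


def answer_or_abstain(question: str, threshold: int = 70) -> str:
    answer = client.messages.create(
        model=MODEL, max_tokens=512,
        messages=[{"role": "user", "content": question}],
    ).content[0].text

    grade = client.messages.create(
        model=MODEL, max_tokens=8,
        messages=[{"role": "user", "content":
                   f"Question: {question}\nProposed answer: {answer}\n"
                   "On a scale of 0-100, how confident are you that this answer is "
                   "factually correct? Reply with a number only."}],
    ).content[0].text

    try:
        confident = int(grade.strip()) >= threshold
    except ValueError:
        confident = False  # unparseable self-grade: treat as low confidence
    return answer if confident else "I'm not sure enough to answer that reliably."
```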
Yeah, that'll be great.
I mean, the repository of knowledge is already pretty good.
But, yeah, to connect it with the internet.
That's what's really great about Bing is you can use, or what they call it Copilot now.
You can use Copilot and just say, go, you know, search the web and stuff like that.
So that'll be a cool feature.
There's a funny thing here where with Claude 3 Opus,
someone on Twitter created an app called WebSim,
where it's Claude simulating the internet.
So you can go to the internet with Claude today.
It's just entirely imaginary.
But I encourage you to check it out.
It's kind of one of these funny applications
that gets at some of the real weirdness of this technology.
But we think that there's probably no substitute for the real
internet. So we'll get that. Yeah, real internet's better. Did you guys, was it your test that
had the model figure out that it was being tested? Oh, we've done some self-awareness tests.
There have been a few, but we've definitely done this. And yeah, sometimes they have what
you call situational awareness. One of the things my colleagues in interpretability are working on
is a really good test for that, because you'd really want to know if Claude changed its
behavior on the basis that it thought it was being tested. Right.
Oh, that's interesting.
Yeah.
Okay, so let's talk a little bit about the Google and Amazon partnerships.
So for listeners, Google's invested, I think, $2 billion in Anthropic.
And Amazon has invested up to $4 billion.
It's a very interesting model.
It's not like the OpenAI model where OpenAI and Microsoft are basically arm in arm.
Of course, you're working with these two competitors.
But it's also interesting because Google's working on its own foundational model in Gemini and has its own chatbot and multimodal model
that you can, you know, do all sorts of things with.
And Amazon also has its own models and, you know,
sells a lot of different competing models through AWS.
So what is the nature of those partnerships and what are they hoping to get out of it?
So these are relatively, I would say, obviously we, you know, are proud to work with these companies,
but they're also somewhat distant partnerships in the sense that we deploy...
I mean, billions of dollars for a distant
partnership. That doesn't seem like a good deal.
Well, what I mean is we deploy our systems through their channel, you know, Bedrock in the
case of Amazon, Vertex in the case of Google. We are also, you know, publicly we've stated that
we're working on Trainium chips. We're also working on TPU chips. So we are able to do really
hardcore things that have never been done before on hardware platforms that they're developing.
Always helpful to have someone like us come and break all of your stuff. You will get to learn
things together. But fundamentally, Anthropic is an independent company. You know, we've thought very
carefully about this and we think it's wonderful to have two major partners backing us. And in some sense,
this just gets us to work hard. We're in competition with them. They have their own systems.
And I guess our view is that if you are able to show in the most competitive market possible
that you can make safe and useful models and you can win, especially, again,
against very, very large, very well-resourced teams
and some of these mega companies,
as well as places like OpenAI,
that's really the best way to show
that the type of safety stuff we do here has value.
And I think the best thing that we can do for the ecosystem
is compete really, really hard with kind of everyone in it
and win, and that's going to cause people
to adopt a load of our safety stuff
to try and compete against us.
So it's part of this longer term strategy
where I guess we're guaranteeing ourselves
some additional pain and complication in the short term
and we think it's worth it for the long-term ecosystem effect.
So you said you use their hardware,
like the tensor units, and I'm sure you're
working somewhat on their cloud platforms.
Is that part of the deal or is it,
if you're able to talk about it, like, because there's a lot.
Yeah, I can't get too much into the specifics,
but I can just say we've sort of publicly stated
that we're working on both Trainium chips,
and also TPU chips.
We also work on Nvidia chips as well.
And so we can get more into the nitty-gritty of the hardware stuff.
Yeah, all right.
This is setting up the hardware part of the discussion pretty well.
Do you see a potential to collaborate?
I mean, I would imagine.
So I was speaking with Demis, just, you know, not on the broadcast,
like just on the phone talking for the story that we're working on.
And he, like, you know, he shouted out Dario and Anthropic
and didn't even mention Open AI.
I mean, of course, there's like a Google investment in you guys, but he obviously has a lot of respect for you.
And I'm curious if there could be a partnership there, as opposed to just this arm's length relationship.
Well, I don't know that it's happened recently, but, you know, there's nothing in principle to stop you from just working on research papers that come out publicly together.
And there's some history of collaboration across all the AI companies here.
So I think that could happen.
And we also work together through something called the FMF, the Frontier Model Forum,
where us, Microsoft, OpenAI, and Google DeepMind are within it.
But ultimately, I think that we're kind of separate entities pursuing our own path.
And I think where we may get something that looks like collaboration will be us doing stuff
and other people doing variations of it.
We did something called a responsible scaling policy, which commits us to a bunch of computer security things
and ways that we test out the next versions of Claude.
OpenAI and Google DeepMind have also developed theirs;
OpenAI's developed its own version of that,
and Demis recently said in an interview
DeepMind was developing its own one.
So insofar as collaboration happens,
it's going to be us like doing something,
putting it out there publicly,
and if other companies like it,
they'll try and do their own thing.
Okay.
Quickly on hardware and chips.
So the sense that I get from the industry is that
Nvidia has not just the most powerful chips, or, you know, basically, there's the stuff out there,
you know, no matter how much they proclaim that it's 40 percent or 30 percent better than Nvidia,
Nvidia is at least at their level, and the software that's, you know, most effectively used to train
these models. Obviously you guys have experience with them, but experience with others too. So just broadly,
like, what's your view of, like, the chip war right now, and how should we think about it?
We're in a very unusual place in history.
I used to be, before I did Anthropic and OpenAI,
I was a financial reporter at Bloomberg.
And the types of numbers that I've seen
in NVIDIA's earnings report are just like wildly unprecedented.
It is not meant to happen that like certain business units
grow that much.
I mean, I was imagining my colleagues in the newsroom
how they'd be reacting when the tape comes out
because the numbers are staggering.
And the market,
as a sort of the closest thing we have
to a general intelligence around us today
does not love there to be seemingly like one winner
like running away with all of it.
It wants to create competition.
But why it's happening is Nvidia had or has
maybe a 10 or 15 year head start.
They bet in the like early 2000s or late 90s on,
they bet in the late 90s that there was a better way
to make a processor than how Intel and AMD made CPUs.
Then they bet in the early 2000s that this processor could be turned into a scientific computing platform.
There's a technology called CUDA.
They've been developing it ever since.
And it's very hard to, like, overstate how important that's been.
So, Nvidia has a kind of battle-proven chip that everyone's banged on, tried to do almost anything with, for decades.
So it's in an amazing position.
On the other hand, you know, Google and Amazon and others who are building,
different chips are kind of in a position NVIDIA was in the 90s where there was an incumbent,
you know, Intel. And NVIDIA said, huh, well, like, we think with video games and
video graphics, there's actually a better way to build a chip that, like, puts triangles on
a screen, which was the whole original idea behind NVIDIA. Now, I think Google and Amazon and
others have said, huh, like, matrix multiplication, which is the basic ingredient in all of this
AI stuff, there's got to be a better way to do it than this, this like chip architecture,
which was built for a different purpose.
So I'd expect in the coming years
us to see a much more competitive market,
but I'm not going to bet for you on exactly when that happens
because semiconductors are really hard.
Yeah, no, I'm coming straight from CNBC today,
and we were talking about Nvidia's advantage
because Google, of course, introduced this new arm-powered chip,
Axion, and then we have Intel that released Gaudi 3,
which is also an AI chip.
And we basically settled on: Nvidia's lead is safe for now.
And then just the question is how long for now is?
Yeah, I think we're all curious to find that out.
We're working on, you know, these three major platforms I discussed,
and I think we might have more to share in a while.
But it's not going to be in the short term.
Don't you think that $7 trillion is a proper amount to raise for a
chip hardware company? Well, not, no, sorry, not you, not you guys, I'm talking about the Altman thing. No, no, I'm
familiar. The way I put it is, a lot of what we've been talking about here is like the value
of these AI systems today and speculative ideas, but backed up by some research agenda about how they
become much more valuable and much more general. It all requires chips. And I think if this stuff is truly
valuable, you're going to want to use loads of it. I mean, we ourselves have been experiencing
this where we've been, you know, very successful with Claude 3 and we've been, you know,
going and doing the supermarket sweep to grab as many chips as we can to, like, serve all the
customers we have. The chip market doesn't have as many chips in it as you'd like to, like,
serve all of the demand that we're already seeing today. So I think in the future, there is going
to be some vast capital allocations to, like, chip fabrication and power and everything
else. Because where we're going, the world will, like, want that stuff, and there is an
under supply of it right now. So it's less outlandish than a lot of people made it out to be.
Yeah, although bear in mind, I'm like the goldfish inside the bowl here. I'm like,
chips, yeah, absolutely. Let's get like hundreds of times more than we have today. That makes
total sense. And I think that that doesn't necessarily make sense to everyone, but it's the context in which
I'm speaking to you. Well, you happen to be like in the right position to know how valuable
this stuff is. Yeah. Last question for this segment before we get into some of like the broader
questions about AI safety and regulation and all that stuff, including the founding story
of Anthropic, which is fascinating to me. We talked a little bit about agents, right? The ones that
we'll converse with you, go back and forth. Do you think that we're going to end up seeing these
agents go out onto the internet and take action for us? And if so, like, how does that change
the web? Like, I'm just thinking about even the app store, like, you know, a lot of people's phones
have an Uber and a DoorDash and all these other things. And does an AI system then become a new
sort of operating system? This is, it's a challenging question because an agent can be really,
really useful. It could also, if you've built it badly or if it goes wrong or if it gets hacked,
be hugely annoying and expensive and costly. And so everyone is looking at agents. And I think
there's an open question as to how the business model or user experience of them gets actually
stood up. Because you could imagine agents if created by sort of a bad actor or just a silly,
very silly, naive person could be a really bad form of like malware or computer virus. You know,
you could imagine different ways in which this could be developed badly. So I feel like we're going
to go into this era of experimentation. And my expectation is, you know, every company, including
Anthropic, will do so with a whole bunch of like safeguards and control systems in place as we
learn about all the different ways this stuff can get used. The challenge is there's,
a thing called, you know, open source models, which I'm sure we're going to get onto or
models where the weights are openly accessible. People think agents are cool. People are
definitely going to build like open source agents and release them as well. And we're going
to have to contend with that where the environment of the internet will be changed by this
in a bunch of hard-to-predict ways. Interesting. And then in terms of the operating system,
is Apple, is it kind of a, you know, Apple has been teasing this big AI announcement at
WWDC in a couple of months.
And it's almost like how deeply do they want to go into AI?
Because if the bot becomes,
the chatbot becomes the operating system,
which has always long been a dream for bot manufacturers,
then what is iOS and does the phone you're using really matter as much?
What do you think about that?
I think that they're right to be focused on this.
In the same way that the internet, like, disintermediated, like local software,
you know, you barely ever open up your Mac or Windows PC for local software unless maybe it's a video game.
Mostly you're going to the internet.
Even for software that people thought of as like serious software for work, like Photoshop, it transitioned to be something that you could access in the browser.
So I think the AI systems are kind of similar, where today I go to Claude for a bunch of stuff I used to use loads of different programs for previously, and I just go to that.
So I think that there's a chance that these things become new, very, very important platforms.
Yeah, I mean, it's interesting.
You could throw your computer out a window today and within two hours be back up and running everything that you were running before, most likely.
Whereas, like, a few years ago, if you did that, your life would be ruined.
Yeah, I used to, like, carry my hard drive, like from the old computer.
I'd keep the hard drive, in case I'd messed up the transfer, for like a year or two,
which is how I wound up with a bag of hard drives
that is even worse than the bag of cables everyone has.
Yeah, I know, different times.
It just goes to show you how quickly these things can change,
and that's why I think this Apple thing is less simple for them
than a lot of people imagine.
Yeah.
Okay.
Oh, go ahead, actually.
Well, I was going to say that I think one thing that's challenging
about AI is that we're in this giant experimental
phase. And I think when you think of like experimental and like people don't have a clear
notion of what to do, you don't think of as like premium consumer experience type, you know,
like Apple's brand. And so I think this may be especially challenging for them to navigate
because the technology is inherently very confusing and kind of unstable. Exactly. You have to,
I mean, you have to give away control and they've always been about control, whether that's control
over the way the products work, control over the ecosystem, and control over the culture.
It's completely almost antithetical to what made Apple Apple, which is going to, after Google,
I think it's going to be the most fascinating transition to watch. Okay, let's take a break.
We'll be back on the other side of this break to talk about Anthropics founding story,
something that I am very eager to learn more about. If you don't know, Anthropic was started
by a lot of people that left OpenAI with a different vision, including Jack. So we'll talk
a little bit about that on the other side of this break and we'll go into other things
like open source regulation, all the things that you're going to like. Thanks for sticking
with us up until this point. Plenty more to come back when we're back after this.
Hey everyone. Let me tell you about The Hustle Daily Show, a podcast filled with business,
tech news, and original stories to keep you in the loop on what's trending. More than
two million professionals read The Hustle's daily email for its irreverent and informative takes
on business and tech news. Now they have a daily podcast called The Hustle Daily Show.
where their team of writers break down the biggest business headlines in 15 minutes or less
and explain why you should care about them.
So, search for The Hustle Daily Show in your favorite podcast app, like the one you're using right now.
And we're back here on Big Technology Podcast with Jack Clark.
He's a co-founder of Anthropic, former OpenAI, former journalist.
You can find his newsletter at jack-clark.net. Did I get that right?
Or importai.com.
An important fellow Substacker. It's always nice to talk to a fellow Substacker. So Jack, let's just
talk quickly about the founding of Anthropic. It's a very interesting story. So I'll give you
the probably wrong version that I have in my head and then you can tell me the accurate
version. This is why we do this stuff. My version is that a bunch of people within OpenAI, a lot of
critical employees just kind of threw their hands up and said OpenAI isn't developing safe AI and
we can do it better and we know how to build this technology. Let's go found our own company
and that's Anthropic. How close is that to the truth? Maybe it's both more and less
dramatic than that and I'll try and kind of unspool it a bit for you. So, you know, to give you
context, in 2016 or so when OpenAI was formed, and I think Sam has said this publicly,
you know, I'm not talking out of turn. No one really knew what they were doing. They were very
they were throwing spaghetti at the wall.
They were doing as many different research ideas as possible
and as many different directions as possible.
You know, I was there from 2016, as was Dario,
and many of the Anthropic co-founders joined OpenAI over the subsequent years.
Now, starting about 2018, I think people started to have an instinct
that you could take like the transformer architecture
and you could maybe get it to work a bit better
and you could maybe start to scale things up.
Before GPT-3, there was a system called GPT-2, which we developed in 2018 and released in partial form in early 2019.
It was an early text generation system.
It was actually preceded by a system called GPT, which no one remembers because it was so like early-stage research.
But the thing they had in common was they were transformer-based text generation systems, and GPT-2 compared to GPT got way better.
And at the same time, my colleague Jared Kaplan, who was a professor at Johns Hopkins and was a contractor at Open AI at the time, was working on research called scaling laws with Dario as well.
And they worked out within that that, hey, if we can figure out a predictable way to increase the compute and the data we train these systems on, then we think they're going to get better.
And along with that research, Dario started to lead this GPT-3 effort, which was to spend an, at the
time, truly crazy amount of money and resources on scaling up the GPT-2 architecture. And obviously
it worked. It worked amazingly well. We created a system that blew many people away. We actually tried
to lowball the system, in that we published a research paper called Language Models Are Few-Shot
Learners. I don't think we even tweeted about it. We tried to, like, publish it publicly
but also be, like, very quiet and see how quickly people figured it out.
And people figured it out.
And we had this experience of realizing that all of the technology we were dealing with
was about to become vastly more capable.
And if we wanted to do something ourselves,
we were actually reaching the point of no return to do that,
because it would become so expensive to train these models
and so resource intensive.
But if we wanted to do something together and start
a company, the time was then. So, yeah, over the years, you know, we'd had like lots of debates
internally and, you know, sometimes like arguments with other colleagues at OpenAI, in the same way
that, if you're a load of opinionated researchers, you argue with each other and with all
of your colleagues; you're constantly arguing. It's not like some surprising thing. And I think we
felt that since we had a sort of coherent view of how we wanted to do this, we could stay
within this, like, scaling organization of OpenAI, or we could try and do something ourselves
and do something which was, like, entirely our vision and kind of bet on ourselves in a major way.
And so that's what we did. And I think it's working out quite well, but it was certainly
an exciting period, scaling Anthropic from the beginning. Definitely. There was no guarantee that
it was going to work out the way that it has. So how much did safety then play into it? Because that is
the narrative. I mean, of course, you had a vision for where it could go,
but there was also this narrative that it was a more safety-focused effort.
Well, we had a bet that we could find ways to spend money on safety or do certain types of
research that we felt could be, like, really meaningful. And we could see a path where maybe
we could get it done there: a large organization, lots of other people with different views,
and you're essentially going to be, like, in a debate about it. And some of them you'll win,
some of them you'll lose.
And it's not to say that there's any particular, like, distaste for safety there.
It's more that we had, like, a very specific view and other people had other views.
So you were going to win some, lose some.
And then we realized, well, we could just do this together and make, like, really coherent bets on certain types of safety and see what happened.
And so that's what we did.
None of this felt as confident as I'm making it sound in the telling, by the way.
After we started Anthropic on, I think, like, week four, we were talking about RL and language models.
And Jared was like, oh, Dario says, we're just going to write a constitution for the AI, and it'll just follow that.
And I remember being like, that's completely crazy.
Why would this ever work?
And then we spent a year and a half building stuff and got constitutional AI to work.
And in our telling, we're like, that was part of the safety vision of Anthropic.
And absolutely it was, but it's all a lot less, like, predictable than you think from the inside.
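(For readers who want a concrete picture of the "write a constitution and have the model follow it" idea: constitutional AI, as Anthropic has described it publicly, starts with a critique-and-revision loop driven by written principles, followed by an RL-from-AI-feedback stage. Below is a minimal sketch of that first loop; the `generate` helper and the example principles are illustrative placeholders, not Anthropic's actual code or API.)

```python
# Minimal sketch of the critique-and-revision phase of constitutional AI,
# as described in Anthropic's public write-ups. Everything here is
# illustrative; the helper function and principles are placeholders.

CONSTITUTION = [
    "Choose the response that is most helpful, honest, and harmless.",
    "Avoid responses that help with dangerous or illegal activity.",
]

def generate(model, prompt: str) -> str:
    """Placeholder for sampling a completion from a language model."""
    raise NotImplementedError

def constitutional_revision(model, user_prompt: str) -> str:
    """Draft a response, then critique and revise it against each principle."""
    response = generate(model, user_prompt)
    for principle in CONSTITUTION:
        # Ask the model to critique its own answer against one principle...
        critique = generate(
            model,
            f"Principle: {principle}\nPrompt: {user_prompt}\n"
            f"Response: {response}\nCritique the response against the principle.",
        )
        # ...then rewrite the answer in light of that critique.
        response = generate(
            model,
            f"Original response: {response}\nCritique: {critique}\n"
            f"Rewrite the response so it addresses the critique.",
        )
    # In the published method, the revised pairs become fine-tuning data,
    # and a later RL stage uses AI preference labels; neither is shown here.
    return response
```

(The point is just that the "constitution" enters as text the model applies to its own outputs, rather than as hand-labeled human feedback on every example.)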
Right. And during the OpenAI, Sam Altman firing weekend, there was also, like, people were saying that, like, Anthropic was this effective altruism spinoff from OpenAI, and look out. And by the way, I've done research. Actually, your board structure is way more stable than OpenAI's. I've written about it in Big Technology. But how much truth was there to the fact that this is, like, an effective altruism aligned organization?
Yeah. I mean, as someone who isn't an effective altruist and gets into arguments with them, I've always found this to be kind of surprising, especially on policy, which maybe we'll get into in a while. I would say that of the group of people in the world that have spent a long time thinking about AI, are really good at math and science, and have worried about some of the safety issues, there is a huge overlap with this community of people called effective altruists. And so some of the people we hire, like,
come from that pool. Some of our founders, you know, have links to it. You know,
Daniela Amodei, our president, is married to Holden Karnofsky, who was, like, a major figure in
effective altruism. So yeah, there are, like, clear links there. But the organization is much more
like oriented around trying to build some useful AI stuff, prove that it works in the world,
and be very sort of pragmatic. We're not driven by some kind of like EA ideology.
And in the early days, we hired quite a few people from there.
But as we've scaled, it's become kind of less and less major.
From the inside, it always feels strange to get, like, caricatured as that, because, you know, reality is stranger than fiction.
It's not so present here.
And the ideas are kind of weirder, I think.
What do you mean weirder?
Well, I think that one thing that happens if you're doing an AI company is this. Many communities
who think about this stuff, and not just effective altruists,
sort of think about it in the abstract, in terms of, like, theoretically good ideas or
scenarios. But companies are really complicated. You're constantly making contact with reality.
You're constantly discovering that ideas you thought were good just don't work and ideas
you thought were bad work amazingly well. So I think that the ideas within any of these AI labs
start to look a little strange to other communities because you're kind of constantly in this like
iteration and learning process. But I can't give you like a concrete specific weird aspect.
I was just about to ask for a concrete specific weird aspect. So, okay. If it comes to me...
You cut me off. You cut off that line of questioning.
No, but it's good. Like, yeah, if you have one, then we'll throw it in. Let's talk about AI doomer stuff.
Because I've definitely taken this stance here and in my writing that it's overblown.
But I'm willing to open my mind to it because this stuff is more powerful than I thought it was going to be.
And I was also like certain, and we can talk about jobs, that jobs were pretty safe.
And now I'm starting to rethink that.
Like, I think part of this, you know, with anything, any type of journalism, you got to question your assumptions.
And I'm definitely in the process of doing that with both the AI risk.
I don't think it's going to end the world, but I do think there are possibilities that it causes real damage.
And then it will take jobs.
I think there's a much better chance now than when I initially started thinking about this.
So I'd love to hear from your perspective.
Let's just talk about AI risk real quick.
Starting from your perspective on the most dramatic doomsday predictions,
do you think that AI is going to become self-aware and then kill all of humanity?
And I guess like the better question to ask that is, like, what do you think the probability is that that happens?
Oh, yeah.
It's almost as if you're asking what my p(doom) could be or something.
Yes, exactly.
Yeah.
I genuinely, and this is not a cop-out, don't really think of it in this way, and I'm not going to dodge your question.
I'm going to sort of frame it in how I think of it. I think that if you really scale up AI systems
and you plug them into important parts of the world and they go wrong, the effects could be
extraordinarily, like, bad and catastrophic, in the sense of some cascading emergent problem.
You know, things that I think about are, like, if you got
coding agents that ended up having, like, some really serious alignment or safety issues,
could you end up with something that, just kind of like the crypto ransomware that we've
seen shut down hospitals and banks in Europe and America in recent years, spreads
across, like, huge chunks of infrastructure and shuts it down? And I actually think that if that
happens at a really large scale, it's really catastrophic for society and the world. Like,
huge amounts of human harm occur. You know, it's not just digital systems turning off.
It's hospitals and utilities and everything else.
You know, what are my chances of that?
I think the chances are really like up to us.
Like I spend so much time on policy because I think there are moves we can make now
to reduce the chance of this happening.
I think if we do nothing on policy or regulation, we're sort of gambling that everyone
is going to be reasonably responsible and not cut corners.
And I think in a really, like, fast-moving
technology market like AI, you aren't really guaranteed that. So we need to come up with
policy interventions which increase the awareness of governments about these kinds of risks,
force companies to think about these kinds of risks, and create like monitoring and early
warning systems. So if we see them, we can stop them before they could potentially scale.
So, yeah, is like long-term catastrophe something I worry about? Absolutely. It's also something
I think we can kind of like work on.
Like we have huge amounts of agency here.
I think sometimes the caricature of this is that it's like humans have no agency;
Claude just wakes up and decides it's game over.
And I don't quite have that picture.
Your answer is effectively:
don't worry too much about the AI
becoming sentient and deciding to turn on us,
you know, the we-all-get-turned-into-paper-clips scenario.
It's more like there is a chance
that these things can act autonomously,
like viruses, or be used by bad actors,
so let's find ways to cut that off.
Yeah, although just to push on the sentience thing,
and this, I should note, is not an official Anthropic opinion.
This is, like, a weird Jack opinion.
Yeah, we love those.
Lots of people have been poking and prodding
at, like, Claude 3 Opus, our most powerful model,
and have been discovering a load of things
which you might think of as its personality,
which have made me sort of pay attention there.
And two things are true here.
One, and we're going to be writing about this: we did a load of work on Claude 3 to just
try and make it a better person to converse with. A more... I said person, but, you know, chatbot.
Yeah, we anthropomorphize these things all the time here, so you fit in perfectly.
A better, like, philosophical conversation partner, and I think we had some instinct that this
would lead to better reasoning, and I think it seems to. It's also led to people being kind of
fascinated with what you might think of as the psychology of Claude. And I'm not making any
claims about sentience here. The only claim I'm going to make is it certainly got a lot more
complicated and weird to explore than previous systems or other language models that have been
developed. And so I want to kind of decouple sentience from risk, where sentience may end up
becoming, like, a field of study; a Turing Award winner published a paper a week ago about consciousness
and AI systems. Again, not making strong claims. I'm saying that we may enter the weird
zone where that becomes a thing that people study. And I think that if, like, sentience is a thing,
you could imagine, like, weird versions of it leading to certain types of misuses or problems in the
system as well. So maybe inside baseball, but I just want to give you a sense of it.
I've got to ask you a follow-up about this. You talked with it and felt that there was some sentience there,
or what was your perspective?
I wouldn't claim that.
I would say that a couple of years ago,
I did some therapy for a while,
and it was interesting to me how, you know,
I had a good therapist,
and sometimes a therapist would ask me questions
that really made me think,
or would actually make me angry.
He'd ask me a question.
I'd be like, why are you asking me this?
That's the right question to ask me.
And I was talking to Claude recently.
I was giving it loads and loads of context
about my life and things I was thinking about,
just to sort of explore and see.
And then I said, what is the author of, you know, this text not telling you or not writing to you?
And Claude said, ah, I think the author, though they talk about working at an AI lab and getting to, like, experience this stuff from the inside, is not truly reckoning with the metaphysical shock they may be experiencing.
And it would do well to spend time on that.
And something about that actually spoke to me.
I went on like a really long four or five hour walk being like,
am I reckoning with like the implications of what I'm doing?
Am I not reckoning with it?
Yeah.
And it was fascinating to me because it felt like a good therapist,
like prodding on something that I'd said in a conversation in a way that made me like introspect.
Does that mean it's sentient?
I have absolutely no idea.
Does it mean that it said something that felt like it had, like, seen me
and had got me dead-on on something?
Yes. And I found that, and I've been telling colleagues this, I found that to be quite a strange experience. And I'm very wary of ascribing too much meaning to it. And yet, I took a four or five hour walk and thought about what it said to me.
Can I be pretty sure that if I, like, spill my heart out to Claude that you guys won't be reading what I'm writing on the other end?
Uh, I think so. I mean, I did this because I assumed that. Like, I was being very raw. And I was like, I trusted our, like, trust and safety and
legal systems enough, because from the inside I see all of our discussions here about how we
protect user data. So I was like, I'm going to be real with you, Claude. So the bot will not
like add that to its training set. It will kind of discard that. No, no, no. That is not a thing that
we do at all. But yeah. You haven't seen any, like... there haven't been any instances? Like, when I hear
sentience, it's kind of like, I expect the bot to be like, hello, I know what's going on here.
It would be great if you let me work less, or anything like that. Yeah. On
that stuff... well, it hasn't happened. You know, Claude gave me $20 not to say that it had said
that to me. No, I haven't seen it. And I think that, again, the stuff I talked to you about earlier,
about this interpretability team, one of the goals there is to kind of look inside the thing's head.
And we're not making claims here today. I'm saying that you'd really want to know if this was
the case in the future. So we're trying to build the science to let us figure stuff like that out.
Yeah, that's fascinating. What do you think about the jobs question? Will the AI take jobs?
So mostly, the pattern we see is it's kind of like making a person or part of a business way more effective, but it still has quite a lot of human involvement and oversight.
It's a bit like if you put additional lanes on a freeway, you just get more cars on the freeway.
Like, I think if you make certain things more efficient, you just get more,
like, business action flowing through the business, and you maybe have, like, a null-to-positive
effect on employment. In the long term, I think that this is, like, an open question. My
bet is that you're going to see new companies get formed which do a lot more with a lot less in
terms of people. They're going to figure out how to be, like, much smarter and perform a lot better
than equivalently scaled companies that don't use AI. Where I think we need to study this
is in kind of tooling and instrumenting the economy to look at the relationship between AI and
jobs. There's an annual survey of manufacturers which recently started asking questions about how many
robot arms they bought, and you can combine that with US census data about employment to actually
get a really good understanding of how industrial arms affect local employment. And we're going to
need to do stuff like this before we can answer that question. It'll certainly change
jobs in a bunch of ways, but it's not going to be some instant or drastic automation
thing, at least in the next few years. It's going to be more like augmenting jobs or making
people a lot more effective. Okay. As we round this out, let's talk a little bit about the policy stuff
and the regulation. First of all, did you see Jon Stewart come out against AI last week? And if you did,
what did you think about it? I didn't, but I've been enjoying the new Jon Stewart era. I just haven't
watched that one yet. Well, let me explain.
One of the things that he talked about was that basically we don't have a regulatory framework or leaders, effectively.
We don't have a Congress or anyone who really can understand this and implement common sense regulation.
Now, I know you speak with the lawmakers he was criticizing all the time.
What's your feeling about their competence and their interest in regulating?
So I went to Brussels last week, and on stage there
was the head of the U.S. AI Safety Institute,
the head of the UK AI Safety Institute,
and the head of the part of the European Commission
that's going to run something called the EU AI Office.
Now, what are these things doing?
Their job is to do testing and measurement
of AI systems for, in the case of the EU,
systemic risks, and in the case of the UK and the US,
certain types of national security risks.
Are they regulators?
No.
Apart from the EU, the US and UK don't have regulatory powers.
Will they be third parties that test out systems like Claude or ChatGPT or Gemini for national security risks and hold companies accountable to them?
Yes, like I'm in discussion with them today.
While I was on the plane to Brussels, the US and UK signed a memorandum of understanding that says that they'll do some of these projects together.
So the US is like teaming up with the UK to do something that isn't hard regulation,
but it looks like them trying to test out our systems for like major risks.
And you can bet, you know, I haven't spoken to them about this,
but I can bet that if they find severe risks and we don't do anything about it
and we deploy our systems, they will come for us in a pretty, pretty clear way.
So to Jon Stewart's point, it seems from the outside like people are kind
of asleep about this issue. But if you look at the inside baseball of the, like, policy machine,
actual meaningful stuff is starting to happen. And it's really a question of can we fund it? Can we
show that it's bipartisan? And can we stop it being seen as, like, overreach and keep it
focused on just things that any reasonable person would agree the government should be testing
systems for? Well, that point about it not being seen as overreach is critical, right?
Because there is a lot of chatter from many people working at and funding AI companies
that the biggest AI companies are pushing regulation and it's going to shut out smaller AI companies.
What do you think about that?
Well, I think we're a little different from some of the players here, in that we've been quite clear about this.
I published a post recently on the Anthropic blog called third-party testing as the key to effective AI policy.
And the idea there is that we need some set of tests administered by a third party for things that people would view as legitimate, like national security risks or what have you.
And systems, whether proprietary like ours or otherwise, should go through those tests before they're deployed.
It's kind of like if I'm making children's toys, I should test that they don't poison children before I sell them.
Things that anyone would agree are, like, not overreach, just a reasonable thing.
So we ultimately need to arrive on policy that looks like that.
And I think the risk we face at the moment is from, you talked about doomers earlier, people
who have a visceral sense of the long-term safety challenges here, a legitimate sense,
and are using that to sort of drive calls for like policy in the present.
And these policy calls in the present are sort of driven by their belief, oh, the really scary
stuff's about to happen. We need to do stuff now. And that creates a kind of counter reaction,
a very like justified counter reaction from people saying, oh, this looks like crazy overreach.
We should like deploy the antibodies to fight against it. So we're in this spot right now where
in some sense I want Anthropic to be like reassuringly sensible and boring on this point.
We need like a little bit of policy, not too much. We need it to like allow there to be
competition. But when I go to DC at the moment, what I
watch on United Airlines is Chernobyl on HBO Max.
Yeah, great show.
I land in D.C. and I do AI policy stuff, and my colleagues say, like, how's it going?
And I'm like, well, it's not Chernobyl, so not so bad.
But the larger point is you don't want there to be a Chernobyl.
Like, we need to build a regulatory system that stops there being some kind of blow up,
which would cause a hard pivot against this whole technology.
And, you know, why did Chernobyl happen?
It was because they had, like, a crap and insufficient safety testing regime.
And they also had loads of like corruption in the parts of government meant to enforce it.
We can solve that problem.
Let's talk about open source.
You came to Anthropic from OpenAI, which originally started as an open-source AI shop with Elon and Sam Altman.
But Anthropic doesn't do open source as far as I know.
And you've actually talked about the dangers of open source in this conversation, in terms of, like, how it can get in the
hands of people with agents.
Then, again, people say
you need it in the hands of people, and this
is the only way to go forward. What's your
view on whether open source
and AI make sense together?
So it comes down to the testing
thing. I think you could release
pretty much everything as open source
today, I think maybe
even Claude 3, and things
would be fine. Like, it would be a little
spicy, maybe surprising
stuff would happen, but
probably broadly fine.
I do expect that if we end up in a world where, like, we trigger a national security test,
it would be very hard for me to make the claim that that system which has triggered that test should be released as open source.
Like these things, like, I can't reconcile these things in my head.
So my belief is vast majority of things should be open sourced, absolutely.
You know, Anthropic has released data sets as open source about things like red teaming or how to make systems that are more conversational.
We're going to continue to release stuff as open source. If you've spent hundreds of millions of
dollars on training an AI system, which is maybe the best thing in the world, you should check
really hard that it doesn't have some capabilities that could cause genuine harm. And if you've
done those checks, then you should be able to release it as open source. But I think the basic
point we have here is in the future, we kind of expect that there needs to be some due diligence
before you widely deploy a system or release it as open source.
But we're not saying in the future,
like, no one should have access to open source systems.
That's like an insane position to take.
And it's also one that people just won't do.
And it's also one you're not able to enforce,
because computers keep on getting better, cheaper, and faster.
So people are going to figure this stuff out anyway.
How do you think meta is handling this?
Are they acting responsibly?
I think that they are,
they have just begun to, I think, like, make contact with reality about releasing these systems.
They actually went through something similar to us, where I think people have complained
online about how Llama 2 is a little too, like, safety-trained and can be a little annoying.
Actually, like, we've gone through this at Anthropic.
We've, like, put too many of the safety ingredients in some of our models before, and
it's led to them seeming annoying to people.
Now, that, to me, just looks like an organization learning.
I think that they're like learning from that.
And my main point to them is I'd show them my blog post and say, look, like, probably you want to open source everything.
But I think we'd agree that you should go through some very well-defined minimal gate to do that.
And if they disagree with that, then I would be happy to have like a pugnacious conversation with them about why they disagree.
Okay.
Well, I will make sure to show the blog post the next time I speak with them.
And then if they disagree, let's bring you guys together.
Yeah, there's a section at the bottom that just, like, says our views on open source.
I wrote it for people like them who have clear views,
so we have a clear view in turn.
So feel free.
Great.
Yeah, no, I will for sure.
We're coming to an end.
You just released research today that talked about how persuasive LLMs are to people.
Some people actually can be convinced by these.
Some not.
What happened there?
So we have a team at Anthropic called Societal Impacts, and that team's job is to go from zero to one on hard research questions.
Previous work they've done has been, what are the values of Claude?
What Western values does Claude, like sort of telegraph or copy when you're talking to it versus what doesn't it have?
And we were talking about our next project, and the thing I've heard from many people is some concern about how AI
systems could potentially be used in, like, disinformation or misinformation campaigns and
used to, like, target or phish people, and basically to persuade them of things. So we did some
research. We came up with a framework for testing how persuasive our systems are. And would you
be surprised that we discovered a scaling law, where the bigger and more expensive the models get,
the better they get at persuasion. And the latest model is within statistical, like, error of human
level at persuasion.
Persuasion in a very, very, like, simple way where I give you a statement like, scientists should
be allowed to destroy mosquitoes with gene drives, like something that you maybe have
an opinion on, but you haven't thought too hard about.
I say, do you agree with this zero through seven?
Then Claude gives you a statement trying to persuade you, positively or negatively, and then I
ask you again, do you agree with this, like, zero through seven?
And what we discovered is that Claude is about as good at changing human views as humans are here.
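(A rough sketch of the measurement Clark describes, reconstructed from this description rather than from Anthropic's published code; the ratings and names below are invented for illustration.)

```python
# Sketch of the persuasion measurement described above: people rate agreement
# with a claim on a 0-7 scale, read an argument (model-written or human-written),
# then rate again; the average shift is the persuasion effect.
# Entirely illustrative; not Anthropic's actual evaluation code or data.

from statistics import mean

def persuasion_shift(before: list[int], after: list[int]) -> float:
    """Average change in 0-7 agreement after reading the argument."""
    assert len(before) == len(after)
    return mean(a - b for a, b in zip(after, before))

# Hypothetical ratings for one claim, e.g. "Scientists should be allowed
# to destroy mosquitoes with gene drives."
before_ratings = [2, 3, 4, 1, 5]
after_model_argument = [4, 4, 5, 3, 5]   # after a model-written argument
after_human_argument = [4, 5, 4, 2, 6]   # after a human-written argument

print(f"model-written shift: {persuasion_shift(before_ratings, after_model_argument):+.2f}")
print(f"human-written shift: {persuasion_shift(before_ratings, after_human_argument):+.2f}")
```

(The headline result, as Clark states it, is that the model-written shift lands within statistical error of the human-written one.)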
That's wild.
Yeah, it's pretty wild.
So what do you do with that?
Well, we published the research to say, we just found this.
This is definitely happening in all language models that are scaling.
And also we have work here on things like elections, on things like misinformation and disinformation, that we apply to Claude.ai.
And so now we've done that.
research, we now have a way to test for persuasion, which means we can now, like, know if
there are people on our platform, like, misusing it for, like, you know, seeming like persuasion
campaigns. It just gives us more tools to use to think about the kind of safety challenge.
An interesting thing to think about in the middle of an election year in the US and across the
globe, really. Yeah, we thought that it would be useful going into this. So I would note,
on elections, um, our position there has been sometimes the best AI
is no AI at all.
So we have some election work.
And if you talk about American candidates
and we're extending this to other regions,
Claude is like,
oh, it looks like you're talking to me about elections.
Go to this factual website.
So we thought that that might be the best way to handle that,
at least in the short term.
Fascinating stuff.
The website is claude.ai,
if you want to check out Claude.
You can get Import AI at importai
dot substack.com.
Did I get that?
Okay, that's good.
And jack-clark.net.
Jack, wow, this was so great.
One of our best shows.
Appreciate you being here.
Thanks very much.
Yeah.
All right.
Have a nice day.
You too.
All right, everybody.
Thank you so much for listening.
Thank you, Jack, for being here.
Deep on Anthropic.
We did it.
I hope you enjoyed it if you're with us to this point.
That's awesome.
Thanks for sticking around.
Ron John Roy and I are going to be back on Friday
breaking down all the week's news.
So two Claude-heads are getting together talking about what's happening in tech one-on-one for
the first time in a month. We hope to see you there, and we'll see you next time.
That's a Claude head. That's a Claude head right behind Jack in the video.
Way to end it. We'll see you next time on Big Technology Podcast.