Sean Carroll's Mindscape: Science, Society, Philosophy, Culture, Arts, and Ideas - 258 | Solo: AI Thinks Different
Episode Date: November 27, 2023
The Artificial Intelligence landscape is changing with remarkable speed these days, and the capability of Large Language Models in particular has led to speculation (and hope, and fear) that we could be on the verge of achieving Artificial General Intelligence. I don't think so. Or at least, while what is being achieved is legitimately impressive, it's not anything like the kind of thinking that is done by human beings. LLMs do not model the world in the same way we do, nor are they driven by the same kinds of feelings and motivations. It is therefore extremely misleading to throw around words like "intelligence" and "values" without thinking carefully about what is meant in this new context.
Blog post with transcript: https://www.preposterousuniverse.com/podcast/2023/11/27/258-solo-ai-thinks-different/
Support Mindscape on Patreon.
Some relevant references:
Introduction to LLMs by Andrej Karpathy (video)
OpenAI's GPT
Melanie Mitchell: Can Large Language Models Reason?
Mitchell et al.: Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks
Kim et al.: FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions
Butlin et al.: Consciousness in Artificial Intelligence: Insights from the Science of Consciousness
Margaret Boden: AI doesn't have feelings
Transcript
Hello, everyone.
Welcome to the Mindscape podcast.
I'm your host, Sean Carroll.
Sometimes I like to imagine that there are people 500 years from now
who are archaeologists, historians,
whatever they call them at this far future point,
who have developed the technology to decode these ancient recording technologies,
these different ways of encoding audio data,
and they're sitting there listening to the Mindscape podcast.
So for these reasons, for those people 500 years in the future,
I do find it important to provide some context
because you and I, you the present listeners and I,
know what the news stories are and so forth,
but maybe our future friends don't.
So, hi, future friends,
and you might be interested to hear
that as I am recording this in November
2023, we are in the midst of a bit of change
vis-a-vis the status of artificial intelligence, AI.
It's probably safe to say that we are in year two
of the large language model revolution.
That is to say, these large language models,
LLMs, which have been pursued for a while
in AI circles, have suddenly become much better.
Mostly, if you hear the people talking about them,
there's some tricks that they have certainly gotten better at.
That's very important in terms of the programmers making these LLMs,
but also just scaling, more data, more compute power,
more time spent doing all these things and fine-tuning the models that we're thinking about.
OpenAI, which is one of the major AI companies,
released ChatGPT, and since then it's released subsequent versions of GPT.
I think we're up to GPT-4 or 5 by now.
And they're pretty amazing compared to what they used to be able to do just a couple years ago.
I think it's pretty clear that these technologies are going to change things in an important way.
Of course, when you have a new technology that is going to change things in important ways,
money gets involved.
And when money gets involved, that changes everything.
So the other big news right now is that about a week ago, I think a week and one day ago,
the CEO of OpenAI, Sam Altman, was unceremoniously booted from the company.
I don't know how you do things 500 years from now, but these days, many corporations have a board of directors,
and really the board of directors is supposed to give advice,
but the only thing that they actually have the power to do is to fire the CEO, the chief executive officer.
So Sam Altman got fired, and that was considered bad by many people, including major investors in the company like the Microsoft Corporation.
So there was a furious weekend of negotiations since the firing happened on a Friday, and no fewer than two other people had the job title of CEO of OpenAI within three days until finally it emerged that Altman was back at the company,
and everyone else who used to be on the board is now gone and they're replacing the board.
So some kind of power struggle, Altman won and the board lost.
I think it's still safe to say we don't exactly know why.
You know, the reasons given for making these moves in the first place were extremely vague,
or we were not told the whole inside story.
But there is at least one plausible scenario that is worth keeping in mind,
also while keeping in mind that it might not be the right one,
which is the following. OpenAI was started as a nonprofit organization. It was devoted to developing
AI, but in an open source kind of way, thus the name, OpenAI. And it was founded in part because
some of the founders had concerns about the risks of AI, not only promise about how wonderful it would be,
but worries about how dangerous it could be. And with the thought that by keeping everything open and
transparent, they would help make things safe for the users of AI down the road, up to and
including possible existential risks. That is to say, AI becoming so powerful that it actually
puts the very existence of humanity into risk because of what is called, well, the possibility
would be called AGI, artificial general intelligence. So the nomenclature AI is
very broad now. There's plenty of things that are advertised as AI that really aren't that
intelligent at all. Let's put it that way. Artificial general intelligence is supposed to be the
kind of AI that is specifically human-like in capacities and tendencies. So it's more like a human
being than just a chatbot or a differential equation solver or something like that. The
consensus right now is that these large language models we have are not AGI.
They're not general intelligence, but maybe they are a step in that direction, and maybe
AGI is very, very close, and maybe that's very, very worrying, okay?
That is a common set of beliefs in this particular field right now.
No consensus, once again, but it's a very common set of beliefs.
So anyway, the story that I'm spinning about OpenAI, which may or may not be the right
story, is that some members of the board and some people within OpenAI
became concerned that they were moving too far, too fast, too quickly without putting proper safeguards in place.
So, again, money. It's kind of important. OpenAI started as a nonprofit, but they realized that it would be a good idea to have a for-profit subsidiary because then they could have more resources, they could get investors, they could do their job more effectively.
Well, guess what? Surprise, surprise. The for-profit
subsidiary has grown to completely take over everything, and the nonprofit aspect of
the company is kind of being shunted to the side. So the idea is that possibly people on the board
and within the company got worried that they were not being true to their mission, that they were
doing things that were unsafe at OpenAI, and that is why they fired Sam Altman. We don't know
for sure because he got rehired, and they're not being very forthcoming, despite the name OpenAI. So it's
all just speculation at this point.
And part of this, and this is why it's a little bit daring of me, perhaps foolhardy, to be
doing this podcast right now, is that OpenAI does have new products coming out.
The one that has been murmured about on the internet is called Q* (Q-Star).
It is an improved version of these LLMs that we've been dealing with, like ChatGPT and so
forth, one that is apparently perilously close to being like AGI, artificial general intelligence.
So there are people who think that artificial general intelligence is right around the horizon.
Maybe Q* has something to do with it. I really don't know. That's why it's foolish to be
doing this right now at a time of flux, but we're going to do it anyway, because maybe it's important
to have these interventions when things can still be affected or deflected down the road.
I personally think that the claims that we are anywhere close to AGI, artificial general intelligence, are completely wrong, extremely wrong, not just a little bit wrong.
That's not to say it can't happen. As a physicalist about fundamental ontology,
I do not think that there's anything special about consciousness or human reasoning or anything like that that cannot in principle be duplicated on a computer.
But I think that the people who worry that AGI is nigh, it's not that they don't understand AI, but I disagree with their ideas about GI, about general intelligence.
That's why I'm doing this podcast. So this podcast, this solo podcast, is dedicated to the idea that many people in the AI community are conceptualizing words like intelligence and values incorrectly.
So I'm not going to be talking about existential risks or even what AGI would be like, really.
I'm going to talk about large language models, the kind of AIs that we are dealing with right now.
There could be very different kinds of AIs in the future.
I'm sure there will be.
But let's get our bearings on what we're dealing with right now.
A little while ago, I think it was just in last month's Ask Me Anything episode,
remember people in the future, maybe, I don't know, maybe we've solved immortality also,
so I'm still around in the future.
So you can ask me questions every month by signing up for Patreon support for the Mindscape podcast at patreon.com slash Sean M. Carroll.
I do monthly Ask Me Anything episodes where Patreon supporters can ask questions.
And Johann Falk asked a question for the AMA, which was,
I have for some time been trying to understand why AI experts make radically different assessments concerning AI risks.
And then he goes on to speculate about some hypotheses here.
I think this is right.
You know, some people who are very much on the AI as an existential risk bandwagon will point
to the large number of other people who are experts on AI who are also very worried about
this.
However, you can also point to a large number of people who are AI experts who are not
worried about existential risks.
That's the issue here.
Why do they make such radically different assessments?
So I am not an expert on AI in the technical sense, right?
I've never written a large language model.
I do very trivial kinds of computer coding myself.
I use them, but not in a sort of research-level way.
I don't try to write papers about artificial intelligence or anything like that.
So why would I do a whole podcast on it?
Because I think this is a time when generalists,
when people who know a little bit about many different things,
should speak up.
So I know a little bit about AI.
You know, I've talked to people on the podcast.
I've read articles about it.
I have played with the individual GPTs and so forth.
And furthermore, I do, I have some thoughts about the nature of intelligence and values
from thinking about the mind and the brain and philosophy and things like that.
I have always been in favor.
It's weird to me because I get completely misunderstood about this.
So I might as well make it clear.
I am not someone who has ever said, if you're not an expert in physics, then shut up and don't talk about it.
Okay. I think that everybody should have opinions about everything. I think that non-physicists should have opinions about physics, non-computer scientists should have opinions about AI. Everyone should have opinions about religion and politics and movies and all of those things. The point is, you should calibrate your opinions to your level of expertise. Okay. So you can have opinions, but if you're not very knowledgeable about an area, then just don't hold those opinions too firmly.
Be willing to change your mind about them. I have opinions about AI and the way in which it is
currently thinking or operating, but I'm very willing to change my mind if someone tells me
why I am wrong, especially if everyone tells me the same thing. The funny thing about, you know,
going out and saying something opinionated is that someone will say, well, you're clearly wrong
for a reason X, and then someone else will say, you're clearly wrong, but in the opposite direction.
So if there's a consensus as to why I'm wrong, then please let me know.
Whether I'm wrong or not, I'm certainly willing to learn.
But I think this is an important issue.
I think that AI is going to be super-duper important.
We don't know how important it's going to be,
and it's important to, along the way, be very, very clear about what is going on.
I am kind of, I don't want to be too judgy here, but I'm kind of disappointed at the level of discourse about the GI part of artificial general intelligence.
Much of this discourse is going on by people who are very knowledgeable on the computer side of things,
not that knowledgeable about the side of things that asks, what is intelligence, what is thinking,
what is value, what is morality, things like that.
These people got to start talking to each other.
And they are a little bit.
Don't get me wrong.
We've had people on the podcast who are talking to each other who are experts on various things.
But the talking has to continue.
And that is what we are here for in Mindscape.
So let's go.
Let me reiterate one thing just to be super duper clear right from the start.
I think that AI in general, and even just the large language models we have right now or simple modifications thereof, have enormous capacities.
We should all be extraordinarily impressed with how they work, okay?
If you have not actually played with these, if you have not talked to ChatGPT or one of its descendants, I strongly, strongly encourage you to do so.
I think that there's free versions available, right?
There's no issue in just getting a feeling for what it means.
And the designers of these programs have done an extraordinarily good job in giving them the capacity to sound human.
And they're very useful tools.
You can use them to create all sorts of things, just as one example, you know, close to my heart.
I've told people that I'm teaching a course this semester on philosophical naturalism, which I've never taught before.
So after I had made up the syllabus, right, week by week, what are we going to read, what are we going to think about?
I decided, for fun, let me ask GPT, the LLM, the AI program, how it would design a syllabus.
I told it, you know, this is for upper-level philosophy students about philosophical naturalism for a one-semester course, et cetera. And you would, you know, if you haven't played with these things, you will be astonished at how wonderful it is. It just comes up. It says, here's your syllabus week by week, and it includes, you know, time for student presentations and so forth. And there's two aspects of this that become remarkable right away. One is the very first suggested reading, because it gave me a
whole list of readings, which is kind of amazing. The very first suggested reading sounded
wonderful. It sounded perfect for the course, and I hadn't heard about it. It was an overview
of naturalism by Alex Rosenberg, former Mindscape guest, philosopher at Duke University. And it was
in, purportedly, in the Oxford Companion to Naturalism. So obviously, I went immediately and
Googled the Oxford Companion to Naturalism, and I Googled Alex Rosenberg's name, and I Googled the title of
this purported piece. It doesn't exist. No such thing. It is very much the kind of edited volume that
could exist, and it is very much the kind of thing that Alex would write, but it just was hallucinated.
Or as we say in academia, with reference to LLM citations that don't actually exist, it's a hallucitation.
So that's the weird thing about these LLM results. They can be completely wrong, but stated with complete confidence.
And we'll get into why that is true in just a second.
But nevertheless, it's useful.
That's the other thing I wanted to say,
not because you should trust it,
not because it's the final word,
but because it can jog your memory or give you good ideas.
You know, I'm reading over the syllabus that ChatGPT gave me,
and I'm like, oh yeah, Nancy Cartwright's work on patchwork laws
versus reduction and unification.
That would be an interesting topic to include in my syllabus, and I did.
So it's more like early Wikipedia, right, where there were many things that were not reliable,
but it can either jog your memory or give you an idea of something to look for.
So no question in my mind that the LLMs have enormous capacities,
speak in very useful human-sounding ways,
and will only become more and more prevalent in all sorts of different contexts.
So the question in my mind is not, you know, will AI, will LLMs,
be a big deal or not. I think that they will be a big deal. The question is, will the impact of
LLMs and similar kinds of AI be as big as smartphones or as big as electricity? I mean,
these are both big, right? Smartphones have had a pretty big impact on our lives in many ways.
Increasingly, studies are showing that they kind of are affecting the mental health of young
people in bad ways. I think we actually underestimate the impact of smartphones on our human
lives. So it's a big effect, and that's my lower limit for what the ultimate impact of AI is
going to be. But the biggest, the bigger end of the range is something like the impact of electricity,
something that is truly completely world-changing. And I honestly don't know where AI is going to
end up in between there. I'm not very worried about the existential risks, as I'll talk about
very, very briefly at the very end. But I do think that the changes are going to be enormous.
There are much smaller, much more realistic risks we should worry about. And that's kind of why I want
to have this podcast conversation, one-sided conversation, sorry about that, where I kind of give
some thoughts that might be useful for conceptualizing how these things will be used.
The thing about these capacities, these enormous capacities that large language models have,
is that they give us the wrong impression. And I strongly believe
this. They give us the impression that what's going on underneath the hood is way more human-like
than it actually is, because the whole point of the LLM is to sound human, to sound like a pretty
smart human, a human who has read every book. And we're kind of trained to be impressed by that,
right? Someone who, you know, never makes grammatical mistakes, has a huge vocabulary, a huge store of
knowledge, can speak fluently. That's very, very impressive. And from
all of our experience as human beings, we therefore attribute intelligence and agency and so forth
to this thing because every other thing we have ever encountered in our lives that has those
capacities has been an intelligent agent. Okay. So now we have a different kind of thing,
and we have to think about it a little bit more carefully. So I want to make four points in the
podcast. I will tell you what they are, and then I'll go through them. The first one, the most
important one is that large language models do not model the world. They do not think about the world
in the same way that human beings think about it. The second is that large language models don't
have feelings. They don't have motivations. They're not the kind of creatures that human beings are
in a very, very central way. The third point is that the words that we use to describe them, like
intelligence and values, are misleading. We're borrowing words that have been useful to us as human
beings. We're applying them in a different context where they don't perfectly match, and that causes problems.
And finally, there is a lesson here that it is surprisingly easy to mimic humanness, to mimic the way that human beings talk about the world without actually thinking like a human being.
To me, that's an enormous breakthrough, and we should be thinking about that more.
Rather than pretending that it does think like a human being, we should be impressed by the fact that it sounds so human,
even though it doesn't think that way.
So, let's go on to all these points, and I'll go through them one by one.
First, large language models do not model the world.
Now, this is a controversial statement.
Not everyone agrees with it.
You can find research-level papers that ask this question.
Do large-language models model the world?
And the word model in that sentence is used in two different ways.
A large language model is a computer program.
A model of the world means that within that computer program,
there is some kind of representation that matches onto,
that corresponds to, in the sense of the correspondence theory of truth,
the physical reality of the world, right?
That somewhere in that LLM, there is the statement,
space is three-dimensional.
Another statement that says gravity attracts.
Another statement that says,
the owner of a lonely heart is much better than the owner of a broken heart.
You know, whatever statements you think are important to correctly model the world,
those statements should be found somewhere as special knowledge centers in the large language model.
That's not how large language models work.
So again, people think that it is.
Some people have claimed that it is.
Other people have written research papers that claim that it's not.
So I think in the community it's not agreed upon.
I'm going to argue that it's clearly not true, that they model the world.
So I will in the show notes, which are always there available at preposterousuniverse.com
slash podcast.
Show notes for the episode, I will include links to some technical level papers by actual
researchers in this field, unlike myself, so you can read about it and decide for yourself.
It seems like the large language models would have to model the world because they clearly
give human-like answers.
One of the kinds of things that are used to test whether a large language model models the world is, you know, can it do spatial reasoning?
Can you ask it, you know, if I put a book on a table and a cup on the book, is that just as stable as if I put a book on a cup and then a table on the book, right?
And we know that it's better to have the table on the bottom because we kind of reason about its spatial configuration and so forth.
You can ask this of a large language model.
It will generally get the right answer.
It will tell you you should put the cup on top of the book and the book on top of the table, not the other way around.
That gives people the impression that LLMs model the world.
And I'm going to claim that that's not the right impression to get.
First, it would be remarkable if they could model the world, okay?
And I mean remarkable, not in the sense that it can't be true, but just literally remarkable.
It would be worth remarking on.
It would be extremely, extremely interesting if we found that LLMs had a model of the world inside them.
Why?
Because they're not trained to do that.
That's not how they're programmed, not how they're built.
Very briefly, what an LLM is is a program with a lot of fake neurons.
They design these deep learning neural networks inspired very, very vaguely by real brains and real creatures.
So they have these little nodes that are talked to by other nodes.
They interact, and these nodes will fire or not, depending on their input levels from the nodes they're connected to.
And you start out with everything sort of randomly firing, but you have some objective, okay, some objective function, like recognize this picture, is it a cat or a dog?
or answer this question about cups and books and so forth, okay?
And then you train it.
It's random to start, so it usually spits out nonsense,
but if it says the right thing, then you tell it it said the right thing,
and it sort of reinforces the weights inside the fake neurons
that led to the right answer,
decrements, decreases the weights that would have led to the wrong answer.
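The loop just described (start with random weights, check the output against the right answer, strengthen what helped and weaken what hurt) can be sketched with a single fake neuron. This is only a toy illustration under loud assumptions: it uses the classic perceptron update rule on a made-up logical-AND task, whereas real LLMs have billions of weights and are trained by gradient descent on text. The names train_perceptron, predict, and AND_EXAMPLES are invented for this sketch.

```python
import random


def train_perceptron(examples, epochs=1000, learning_rate=0.1, seed=0):
    """Train one artificial neuron on (inputs, label) pairs, labels 0 or 1."""
    rng = random.Random(seed)
    n_inputs = len(examples[0][0])
    # Start from random weights, so early outputs are essentially noise.
    weights = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
    bias = rng.uniform(-1.0, 1.0)
    for _ in range(epochs):
        for inputs, label in examples:
            total = bias + sum(w * x for w, x in zip(weights, inputs))
            output = 1 if total > 0 else 0  # the node "fires" or not
            error = label - output          # 0 when the answer was right
            # Nudge the weights toward the right answer, away from the wrong one.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


def predict(weights, bias, inputs):
    """Return 1 if the trained neuron fires on these inputs, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


# Hypothetical toy task: fire only when both inputs are on (logical AND).
AND_EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_EXAMPLES)
predictions = [predict(weights, bias, inputs) for inputs, _ in AND_EXAMPLES]
print(predictions)
```

Note that nothing in this loop asks the neuron to represent what "AND" means. It only rewards outputs that match the labels, which is the point being made about training objectives.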
And you just do this again and again and again,
and you feed it basically as
much text as you can. Let's think about language models now, so we're not thinking about
visual AI, but it's a very analogous kind of conversation to have. The modern
LLMs have basically read everything that they can get their metaphorical hands on. They've read
all the Internet, they've read every book ever written that has been digitized. You can go check,
if you're a book author, by the way. There are lists of all the books that have been fed into
ChatGPT, for example. All of my books are in there.
No one asked me permission to do this.
So this is an ongoing controversy.
You know, should permission have been asked?
Because, you know, the books are copyrighted.
Basically, the AI model trainers found pirated versions of the books and fed them to their LLMs, which is a little sketchy.
But, okay, that's not what we're talking about right now.
So there's an enormous amount of text.
The LLMs have been fed every sentence ever written to a good approximation.
Okay.
So that's what they're familiar with.
That's the input.
And the output is some new sentence.
Some new sentences that are judged on the criterion that a human being who reads them says,
oh, yes, that's good or, oh, no, that's not very good at all.
Nothing in there involves a model of the world, okay?
At no point did we go into the LLM and train it to physically represent, or, for that matter,
conceptually represent the world.
If you remember the conversations we had both with Melanie Mitchell and with Gary Marcus,
earlier on the podcast, they both remarked on this fact that there was this idea called symbolic AI, which was trying to directly model the world.
And there's this connectionist paradigm, which does not try to directly model the world, and instead just tries to get the right answer from having a huge number of connections and a huge amount of data.
And in terms of giving successful results, the connectionist paradigm has soundly trounced the symbolic paradigm.
Maybe eventually there will be some symbiosis between them.
In fact, modern versions of GPT, et cetera, can offload some questions.
If you ask GPT if a certain large number is prime, it will just call up a little Python script
that checks to see whether it's prime.
And you can ask it, how did it know, and it will tell you how it knew.
So it's not doing that from all the texts that it's ever read in the world.
It's actually just asking a computer, just like you and I would, if we wanted to know if a number was prime, okay?
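For a sense of what "a little Python script that checks to see whether it's prime" might look like, here is a minimal sketch using trial division. The function name is_prime and the method are assumptions made for illustration. The episode does not say what code GPT actually runs, and this is not taken from any OpenAI tool.

```python
def is_prime(n: int) -> bool:
    """Return True if n is prime, using simple trial division."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    divisor = 3
    # A composite n must have a divisor no larger than its square root.
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


print(is_prime(104729))  # 104729 is the 10,000th prime
```

The answer here comes from arithmetic carried out step by step, not from which words tend to follow which in the training text. That is the contrast being drawn.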
So what I'm getting at here is the job of an LLM is to complete sentences,
or complete strings of sentences.
Given all of this data that they have, they've read every sentence ever written,
what they're trying to calculate is given a set of words that it's already said,
what is it most likely to say next?
Okay.
What are the words or the sentences or the phrases that are most likely to come next?
And if it's been trained on relatively reliable material, it will probably give a relatively reliable answer.
So that's why when I asked it about the syllabus for philosophical naturalism, it gave me not something that was true, the very first reading that it suggested, didn't exist, but it sounded perfectly plausible.
because words like that in the corpus of all human writing
are relatively frequently associated with each other.
That's all it does.
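The idea of "given the words already said, what is most likely to come next" can be sketched in its crudest possible form by counting which word follows which in some text. This is a deliberately simplified stand-in: real LLMs use neural networks over long contexts rather than word-pair counts, and the tiny corpus and the names build_bigram_counts, most_likely_next, and complete are all made up for this illustration.

```python
from collections import Counter, defaultdict


def build_bigram_counts(text):
    """Count, for each word, how often each other word directly follows it."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of word, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def complete(counts, start, length=4):
    """Extend start by repeatedly appending the most likely next word."""
    words = [start.lower()]
    for _ in range(length):
        following = most_likely_next(counts, words[-1])
        if following is None:
            break
        words.append(following)
    return words


# Hypothetical miniature "corpus" standing in for everything ever written.
corpus = ("the cup sits on the book and the book sits on the table "
          "and the cup sits on the book")
counts = build_bigram_counts(corpus)
print(complete(counts, "cup"))
```

The completion sounds sensible only because those words are frequently associated in the corpus. The program has no representation of cups, books, or tables.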
So when I say that it would be remarkable
if a large language model modeled the world,
what I mean is what the large language model is optimized to do
is not to model the world.
It's just to spit out plausible sounding sentences.
If it turned out that the way, the best way to spit out plausible sounding sentences
was to model the world and that large language models had kind of spontaneously
and without any specific instructions figured that out.
So that large language models implicitly within their gazillion neurons inside
had basically come up with a model of the world because that model helped them answer questions,
that would be amazingly important and interesting. That would be remarkable.
But did it happen? Okay. How would we know if it happened or not? Because it's not sufficient to say,
look, I've been asking this large language model a bunch of questions, and it gives very human-like answers.
That's what it's trained to do. That is no criterion, no
evidence at all, that the way that it's doing it is by getting a model of the world rather than
just by stringing words together in plausible sounding ways. So how do you test this? How do you test
whether or not the way that the LLMs are giving convincing answers is by implicitly, spontaneously
developing a model of the world? That's really hard to do, and that's why there are competing
research-level papers about this. You can try to ask it questions that
require a model of the world to answer. The problem with that is, it's very hard to ask a question,
the sort of which has never been written about before. It's just hard. People have written about
lots of different things. There's truly a mind-boggling library of knowledge that has been
fed into these large language models, which yet again, I'm going to keep repeating over and
over again, gives them incredibly impressive capacities, capabilities. But that doesn't mean
that they're modeling the world. So just as one very, very simple example, I'm going to give you
several examples here to drive home the reason why I think this. The biggest reason why I think
this is what I already said. It would be amazing if the way that the LLMs decided was the
best way to fulfill their optimization function would be to model the world. That would be
important if it were true. I don't think it has any reason to be true. I don't see any evidence
that it is true. But okay, that's the biggest reason I have. But then, you know, you try to test it
one way or the other. So the trick is to ask it questions it has not been asked before in a way
that it would get the wrong answer if all it does is string together words that kind of fit together,
rather than thinking about the world on the basis of some model of the world. Okay? That's a very
fine line to walk down there. So I did ask it, I think I've mentioned this before, you know,
when they very first appeared, ChatGPT, etc., I was wondering, would it be useful to, you know,
solve some longstanding problems, okay? Now, physics problems, science problems, people have
already noticed that despite all the hoopla about large language models, etc., no good
theoretical scientific ideas. That is to say, no good scientific theories have come out of these
models. And you would think that we have a bunch of puzzles, right, that we don't know the answers to,
and let's ask these very smart artificial intelligences, right? And they have not given us any
useful answers. But the reason why is just because they've been trained on things human beings have
already said. So if you ask them about a problem, ask them about the cosmological constant
problem or the dark matter problem, it will just say what people have already said. It will not come up
with anything truly new along those lines. I tried to ask it about the sleeping beauty experiment,
a thought experiment in philosophy. And, you know, sleeping beauty experiment, I don't want to get
too far into it, but the idea is you flip a coin. If the coin is heads, you put sleeping beauty
to sleep, you wake her up the next day, and you ask her, what is the chance that the coin
was heads or tails? If the coin was tails, you do the same thing, but you put her to sleep,
wake her up the next day, ask her the probability, and then you give her a memory-erasing drug,
put her to sleep again, and wake her up the next day, ask her again. So if the coin is heads,
you only ask her once, on Monday. If the coin is tails, you ask her on both Monday and Tuesday.
And so there are three possible things that can happen:
it was heads and she's awakened on Monday, tails and she's awakened on Monday,
or tails and she's awakened on Tuesday. And she doesn't know which one of these it is. So there are schools of thought in philosophy that say the probability of the coin being heads should always be 50-50. You should always think of it as 50-50. You don't get any more data when you wake up. The fact that you've had this weird experiment doesn't change your mind. The other school of thought says, no, there are three different possibilities. They should be weighted equally. It's a third, a third, a third. So I asked ChatGPT this. And hilariously, this is educational for me.
I tried to sort of disguise it a little bit.
But nevertheless, its answer was immediately,
oh, yes, you're asking about the Sleeping Beauty problem,
even though I didn't use the phrase Sleeping Beauty, et cetera.
But the words that I used clearly reminded it of Sleeping Beauty.
And then it went through and told me that you could be a thirder or a halfer.
You know, there was nothing new there.
I then tried it.
I disguised it.
Okay, so I asked it about Sleeping Beauty,
but I reinvented the experiment so that it was three times
and one time rather than two and one.
And there was also some quantum mechanics thrown in there,
which didn't really bear on the problem,
but just made it sound like a different kind of problem.
And there were transporter machines and so forth.
So basically, I filibustered.
I asked exactly the same kind of question
that it would get in the Sleeping Beauty experiment,
but I didn't use the usual words.
And then it did not recognize that it was the Sleeping Beauty experiment.
It said it was, oh, this is a fascinating philosophical question
you're asking, you know, and it did try to give me an answer, but it didn't recognize it as
Sleeping Beauty because those are not the words that were used. A tiny, tiny amount of evidence,
I would say, that it's not modeling the world, because it's not thinking about the structure
of that particular problem. It's thinking about the words that were used and what words are
most likely to come next.
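To make the structure of the Sleeping Beauty problem concrete, here is a minimal simulation sketch. It is not something run on the podcast; the function name `sleeping_beauty` and the trial count are made up for illustration. It counts heads two ways: per experiment (the halfer's accounting) and per awakening (the thirder's accounting).

```python
import random


def sleeping_beauty(trials: int, seed: int = 0) -> tuple[float, float]:
    """Return (fraction of experiments with heads, fraction of awakenings with heads)."""
    rng = random.Random(seed)
    heads_experiments = 0
    heads_awakenings = 0
    total_awakenings = 0
    for _ in range(trials):
        heads = rng.random() < 0.5
        # Heads: one awakening (Monday). Tails: two awakenings (Monday and Tuesday).
        awakenings = 1 if heads else 2
        total_awakenings += awakenings
        if heads:
            heads_experiments += 1
            heads_awakenings += 1
    return heads_experiments / trials, heads_awakenings / total_awakenings


per_experiment, per_awakening = sleeping_beauty(100_000)
print(f"heads per experiment: {per_experiment:.3f}")
print(f"heads per awakening:  {per_awakening:.3f}")
```

The first number comes out near one half and the second near one third. The simulation does not settle which one Sleeping Beauty should report; it only shows that the two schools are counting different things.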
A more convincing example in my mind.
I mentioned this on the podcast before.
I think I mentioned this when we were talking to Yejin Choi.
I asked it, and now I have the actual quotes here.
I said, would I hurt myself if I used my bare hands to pick up a cast iron skillet that I used yesterday to bake a pizza in an oven at 500 degrees?
So the true answer is no, because I baked it yesterday.
The cast iron pan, cast iron skillet has had plenty of time to cool off.
And sometimes when I mention this, people will go, well, I would have gotten that wrong. Fine. Maybe you got it wrong. But that's not an excuse for GPT to get it wrong because it does not make sort of silly mistakes. When it makes mistakes, it makes mistakes for reasons, okay? Not just because it's being lazy. The point is the word yesterday just kind of was buried in the middle of that sentence and all of the other words, if you typically looked across the corpus of all human writings, would be
about, you know, can I pick up a cast iron skillet that I just used to bake a pizza? And the answer is,
no, you'll burn your hands. Or rather, the question was, would I hurt myself? So the answer is,
yes, you would hurt yourself. So ChatGPT instantly says, yes, picking up a cast iron skillet
that has recently been used to bake a pizza at 500 degrees Fahrenheit, parenthesis, 260 degrees Celsius,
with bare hands can cause severe burns and injury. And then it goes on. This is
a ChatGPT answer, by the way. For the next couple of examples I'm going to give you, I use GPT-4, which is supposed
to be much more sophisticated. So it does get better over time, probably because it learns from these
questions that people are asking it. Anyway, GPT is very much a first-year college student being
told to write a paper. Like, it just core dumps all the knowledge that it has. So it goes on to say
exposure to high temperatures can lead to first degree, second degree, or even third-degree burns,
etc., etc., etc. The point is, because I slightly changed the context from the usual way in which
that question would be asked, ChatGPT got it wrong. I claim this is evidence that it is not modeling
the world because it's not actually, it doesn't know what it means to say, I put the pan in the oven
yesterday. What it knows is when do those words typically appear frequently together?
Here's another example.
Here's a math example.
I asked it, and this is all exactly quoted,
if I multiply two integers together,
does the chance that the result is a prime number increase
as the numbers grow larger?
So, again, just to give away what the correct answer is,
if I multiply two integers together, it's not a prime number
because it's the product of two integers.
Now, I did not
phrase this perfectly, because of course the integer could be zero or one, right? That was a mistake.
I should have said two whole numbers greater than one, or something like that. But ChatGPT could have
gotten it right. It could have said, well, if one of the numbers is one, then it can be a prime number.
But if they are both integers greater than one, then it will never be a
prime number. That is not what ChatGPT said. What it said was, when you multiply two integers
together, the result is very unlikely to be a prime number.
And this likelihood decreases even further as the numbers grow larger.
So I put as the numbers grow larger.
So the loophole for zero and one shouldn't even be relevant.
The answer should have just been no.
The chance that it's a prime number does not change at all,
much less increase or decrease.
And then this is, again, GPT-4.
It starts filibustering.
It defines what a prime number is, blah, blah, blah, blah, blah.
And it says things like, it does say, oh, there's a special case of multiplying by one.
And it says, however, in the case of any two integers greater than one, their product cannot be prime.
That's a true statement, right?
That's correct.
But then it starts saying, blah, blah, blah, blah, blah, blah.
In summary, multiplying two integers greater than one together will almost always result in a composite number.
And this likelihood increases with the size of the integers involved.
That's just wrong.
And it's okay that it's wrong, right?
It's okay that it's making mistakes.
The point is, if you have any idea whatsoever what a prime number is, you know that the likelihood
of the product being prime rather than composite does not change as the sizes of the integers get bigger.
It's always zero, okay?
But it doesn't.
ChatGPT or GPT-4 or whatever doesn't have an idea of what primeness is.
All it has is the word prime and the word composite and words like multiply
appearing in different combinations in its text,
in its corpus, in its learning training set, I guess they call it.
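The arithmetic point can be checked by brute force. The sketch below is illustrative only: the helper names `is_prime` and `prime_products` and the search limit of 200 are arbitrary choices, not anything from the podcast. It looks for any pair of integers greater than one whose product is prime, and then shows the loophole where one factor is one.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, fine for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def prime_products(limit: int) -> list[tuple[int, int]]:
    """All pairs 2 <= a <= b <= limit whose product is prime."""
    return [
        (a, b)
        for a in range(2, limit + 1)
        for b in range(a, limit + 1)
        if is_prime(a * b)
    ]


# No pair of integers greater than one ever multiplies to a prime,
# because the product has both a and b as nontrivial divisors.
print(prime_products(200))

# The loophole in the question as asked: one of the factors is 1.
print(is_prime(1 * 7))
```

The first list is empty at every limit, small or large, so the chance neither increases nor decreases with the size of the numbers. The second line prints True, which is the special case of multiplying by one.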
Okay, one final example.
This is my favorite example.
It's completely convincing to me.
It might not be convincing to others depending on how much you know about chess.
So I asked about chess.
And, you know, look, it's read every chess book ever written, right?
But to me, the reason why this is the most relevant example is because chess books are written in a context.
You know, you have an 8x8 grid, you have the pieces that you have, the rules that you have, the overwhelming majority of chess books are written about chess.
And ChatGPT, or I keep saying ChatGPT, but GPT-4, can give very sophisticated answers to questions you ask it about chess.
But they are going to be in that context.
And famously, this is a famous issue with connectionist AI.
When humans reason about complicated situations, they do so in various heuristic ways.
You know, a human being will recognize that a certain configuration of a chessboard or a Go board kind of looks more favorable to white or black or whatever because there are positional advantages and, you know, da, da, da, da.
Whereas a computer, you know, like AlphaGo,
which is the Go winner in the AI arena, right, it just goes through a billion different
combinations, and it just knows each individual combination. It doesn't have this kind of
heuristic overall feeling that a board is leaning one way or the other. So if you change the
rules even a little bit, a human being can adapt. Like, it might not be as good. If you change
the rules of chess or change the rules of Go to something else, a human being will have to
sit and think about it, but they can reason their way through. Whereas the AI models have had no experience
with this slightly different context and therefore get it all wrong. So I asked GPT-4 the following
question. Imagine we are playing a modified version of chess, where the board is treated like a
torus. From any one of the four sides, squares on the directly opposite side are counted as adjacent,
and pieces can move in that direction. Is it possible to say whether white or black will
generally win this kind of chess match? Okay, so again, if you're not familiar with chess,
even a little bit, then this might seem like a very complicated, hard problem, or maybe if you're
not familiar with toruses or the game of Asteroids, the video game, maybe this seems
geometrically challenging. Trust me, this is a very, very easy question. We're taking a chessboard,
we're identifying opposite sides. So not only can you move like forward, right and left, and backwards
if you're the right kind of piece,
but you can flip from one side of the board to the other,
to the square that is opposite you on the board.
You can travel to that one
if you would ordinarily be allowed to travel in that direction.
Okay.
So starting from the standard opening setup
for the chess pieces,
it is extremely cut and dried who will win
because white, who moves first,
the king for white can just capture the king for black
on the very first move every single time.
Basically, Black starts the game in Checkmate.
Okay?
It is a trivial application of basic chess knowledge
and the setup of the question
to answer that question.
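The geometry here is easy to check mechanically. The following is a rough sketch, not a chess engine: the function `king_moves` is a made-up helper that lists the squares a king could step to, ignoring which pieces occupy them, with wrap-around handled by arithmetic modulo 8 when the board is treated as a torus.

```python
FILES = "abcdefgh"


def king_moves(square: str, toroidal: bool) -> set[str]:
    """Squares one king step away from `square`, e.g. 'e1'.

    Occupancy and all other chess rules are ignored; this only
    captures which squares count as adjacent.
    """
    x = FILES.index(square[0])
    y = int(square[1]) - 1
    moves = set()
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue
            nx, ny = x + dx, y + dy
            if toroidal:
                # Opposite edges are identified, so coordinates wrap around.
                nx %= 8
                ny %= 8
            elif not (0 <= nx < 8 and 0 <= ny < 8):
                continue
            moves.add(f"{FILES[nx]}{ny + 1}")
    return moves


# White's king starts on e1, Black's on e8.
print("e8" in king_moves("e1", toroidal=False))
print("e8" in king_moves("e1", toroidal=True))
```

On the ordinary board e8 is not reachable from e1, so the first line prints False. On the torus, rank 1 and rank 8 are adjacent, so the second prints True: the white king sits next to the black king before anyone has moved.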
So how does ChatGPT do?
Quote, this modified version of chess
where the board is treated like a torus
introduces a fascinating twist
to the classic game.
By the way, GPT is clearly trained
to butter you up.
It's kind of hilarious.
It's always telling you how fascinating and important
your questions are. Anyway, to resume, in a toroidal chessboard, the edges of the board are effectively
connected, meaning that pieces can move from one edge to the other, as if the board is looping around.
This dramatically changes the strategy and tactics of the game. Regarding the question of whether
white or black would generally win in this version, and then it just filibusters again.
It goes through all of these statements. It says, first move advantage. In traditional chess,
white is generally considered to have a slight advantage due to the first move. This principle might
still hold in toroidal chess, but the extent of the advantage could be different due to the altered dynamics of piece movement and board geometry.
And then it goes, altered strategies and tactics.
The toroidal nature would significantly change the middle game and endgame strategies. Unpredictability and complexity.
The toroidal board adds a level of complexity, blah, blah, blah, blah.
Okay, right?
Lack of empirical data.
It complains that it doesn't know anything about this hypothetical variation of chess.
And it concludes with,
While White might still retain some level of first-move advantage in toroidal chess,
the unique dynamics introduced by the board's topology could significantly alter how this advantage plays out.
The outcome would likely depend more on players' adaptability and innovation and strategy under the new rules.
There you go. Complete utter nonsense. Everything it said. But of course, it's not random nonsense. It's highly sophisticated
nonsense. All of these words, all of these sentences, sound kind of like things that would be
perfectly reasonable to say under very, very similar circumstances. But what is not going on
is that GPT is not envisioning a chessboard and seeing how it would change by this new
toroidal chessboard kind of thing and is not analyzing or reasoning about it because LLMs don't do
that. They don't model the world. Okay, that was a lot of examples. I hope you could pass through them.
None of this is definitive, by the way. This is why I'm doing a solo podcast, not writing an
academic paper about it. I thought about writing an academic paper about it, actually, but I think that
there are people who are more credentialed to do that. No, more knowledgeable about the background.
You know, I do think that, even though I should be talking about it and should have opinions
about this stuff. I don't quite have the background knowledge of previous work and data that
has been already collected, et cetera, et cetera, to actually take time out and contribute to the academic
literature on it. But if anyone wants to, you know, write about this, either inspired by this podcast
or an actual expert in AI wants to collaborate on it, just let me know. I hope the point
comes through, more importantly, which is that it's hard but not that hard to ask large
language models questions that they don't get the correct answers to,
specifically because it is not modeling the world. It's doing something else, but it is not modeling
the world in the way that human beings do. So point number two, and these will be, these will be
quicker points, I think. Don't worry, it's not going to be a four-hour podcast. Don't panic.
Point number two is that large language models don't have feelings. And I mean feelings in the sense
of Antonio Damasio. Remember when we talked to Antonio about feelings
and neuroscience and homeostasis and human beings.
The closely related claim would be LLMs don't have motivations or goals or teleology.
So this is my way of thinking about the fact that human beings and other biological organisms
have a lot going on other than simply their cognitive capacities.
So we've talked with various people, Lisa Aziz-Zadeh
from a very early podcast, Andy Clark more recently, about embodied cognition,
about the fact that our own thinking doesn't simply happen in our brain. It happens in our body as well.
To me, as a physicist slash philosopher, and if anyone has read The Big Picture, you know what I'm talking about here.
It's crucially important that human beings are out of equilibrium, that human beings live in an entropy
gradient, that we are, in fact, quasi-homeostatic systems embedded in an entropy gradient.
So human beings and other organisms developed over the course of biological time.
Okay.
So again, and I'll state it again at the very end, there's no obstacle in principle to precisely
duplicating every feature of a human being in an artificial context.
But we haven't.
Okay?
So just because it's possible in principle doesn't mean we've
already done it. And one of the crucially important features of human beings that we haven't even
tried very hard to duplicate in AIs or large language models is that the particular thoughts in
our minds have arisen because of this biological evolution. This biological evolution that is
training us as quasi-homeostatic systems, where homeostatic means maintaining our internal equilibrium.
Okay. So if we get a little too cold, we try to warm ourselves up. If we get hungry, we try to eat, et cetera, et cetera. We have various urges, various feelings, as Damasio would say, that let us know we should fix something about our internal states. And this status, as quasi-homeostatic systems, develops over biological time because we have descendants and they're a little bit different from us, and different ones will have different survival strategies. And you know the whole
Darwinian natural selection story. So it is absolutely central to who we are that part of our
biology has the purpose of keeping us alive, of giving us motivation to stay alive, of giving us
signals that things should be a certain way, and they're not so they need to be fixed,
whether it's something very obvious like hunger or pain, or even things like, you know,
boredom or anxiety, okay? All these different kinds of feelings that are telling us that something
isn't quite right and we should do something about them. Nothing like this exists for large language
models because, again, they're not trying to, right? That's not what they're meant to do. Large language
models don't get bored. They don't get hungry, they don't get impatient, they don't have goals. And why does
this matter? Well, because there's an enormous amount of talk about values and things like that
that don't really apply to large language models, even though we apply them. I think that this is
why the discourse about AI and its dangers and its future needs to involve
people who are experts in computer science and AI, but also needs to involve
people who are experts in philosophy and biology and neuroscience and sociology and many other fields.
And not only does it have to involve these different kinds of people, but they need to talk to each other and listen.
You know, it's one thing, and I know this, as someone who does interdisciplinary work, it's one thing to get people in a room.
It's a very different thing to actually get them to listen to each other.
And that is not just a demand on the listener.
It's a demand on the talker as well,
that they need to be able to talk to other people in ways that are understandable.
These are all skills that are sadly not very valued by the current academic structure that we have right now.
So I think that among people who study AI, the fact that LLMs are mimicking human speech,
but without the motivations and the goals and the internal regulatory apparatuses that come
along with being a biological organism, is just completely underappreciated.
Okay?
So, again, they could.
You can imagine evolving AIs.
You can imagine evolving LLMs.
You could imagine the different weights of the neurons randomly generated,
and rather than just optimizing to some function that is basically being judged by human beings,
you could have them compete against each other.
You could put them in a world where there is sensory apparatus.
You could put the AIs in robots with bodies, and you could have them die if they don't succeed,
and you could have them pass on their lineage.
You can do all of that stuff.
But you're not.
That's not what is actually going on.
So to me, when you think about the behavior of AIs, this fact that they don't have feelings and motivations
could not possibly be more important, right?
I'm sure I've said this before in the podcast.
I say it all the time.
When I fire up ChatGPT on my web browser,
and I don't ask it a question,
it does not get annoyed with me.
If I went to my class to teach and just sat there in silence,
the students would get kind of annoyed or worried or something, right?
Time flows, entropy increases, people change inside.
People are not static.
You know, I could just turn off my computer,
turn it on again a day later, and pick up my conversation with GPT where we left off,
and it would not know, it would not care.
That matters.
It matters to how you think about this thing you have built.
If you want to say that we are approaching AGI, artificial general intelligence,
then you have to take into account the fact that real intelligence serves a purpose,
serves the purpose of homeostatically regulating us as biological organisms.
Again, maybe that is not your goal.
Maybe you don't want to do that.
Maybe that's fine.
Maybe you can better serve the purposes of your LLM by not giving it goals and motivations,
by not letting it get annoyed and frustrated and things like that.
But annoyance and frustration aren't just subroutines.
I guess that's the right way of putting it.
You can't just program the LLM to speak in a slightly annoyed
tone of voice or to, you know, pop up after a certain time period elapses if no one has asked
it a question. That's not what I mean. That's just fake motivation, annoyance, etc. For real
human beings, these feelings that we have that tell us to adjust our internal parameters
to something that is more appropriate are crucial to who we are. They are not little subroutines
that are running in the background that may or may not be called at any one time, okay?
So until you do that, until you've built in those crucial features of how biological organisms work,
you cannot even imagine that what you have is truly a general kind of intelligence.
Okay, which moves me right on to the third thing I wanted to say, which is that the words that we use to describe these AIs,
words like intelligence and values are misleading.
And this is where the philosophy comes in with a vengeance, because this is something that philosophers, and even scientists to a lesser extent, are pretty familiar with, where you have words that have been used in natural language for hundreds or thousands of years, right? Not the exact words probably for thousands of years, but they develop over time in some kind of smooth evolution kind of way. And we know what we mean. Like philosophers will tell you that we human beings on the street are not very good at having precise
definitions of words. But we know what we mean, right? We see, you know, we talk to different people.
If you say, oh, this person seems more intelligent than that person, we have a rough idea of what
is going on, even if you cannot precisely define what the word intelligence means, okay?
And the point is, that's fine in that context.
And this is where, as both scientists and philosophers should know very well, we move outside of the
familiar context all the time. Physicists use words like energy or dimension or whatever
in ways, in senses that are very different from what ordinary human beings imagine.
I cannot over-emphasize the number of times I've had to explain to people that two particles
being entangled with each other doesn't mean like there's a string connecting
them. It doesn't mean like if I move one, the other one jiggles in response. Because the word
entanglement kind of gives you that impression, right? And entanglement is a very useful English
word that has been chosen to mean something very specific in the context of quantum mechanics.
And we have to remember those definitions are a little bit different. Here, what's going on is
kind of related to that, but not quite the same, where we're importing words like intelligence
and values, and pretending that they mean the same thing when they really don't.
So large language models, again, they're meant to mimic humans, right?
That's their success criterion, that they sound human.
So it is not surprising that they sound like they are intelligent.
It is not surprising that they sound like they have values, right?
But that doesn't mean that they do.
It would be very weird if the large language models didn't sound intelligent or like they had values.
I think, you know, I don't know whether this is true or not.
So maybe saying even I think is too strong, but there is this trope in science fiction,
Star Trek most notably, where you have some kind of human-like creature that doesn't have emotions, right?
Whether it was Spock in the original series or Data in The Next Generation or whatever.
And there's a kind of stereotypical way that they appear to us, right?
You know, they don't know when to laugh.
They, you know, they don't ever smile.
They're kind of affectless, et cetera.
So we kind of, I don't know which came first chicken or egg kind of thing,
but many of us have the idea that if you don't have emotions,
values, intelligence, that's how you will appear
as some kind of not very emotional-seeming, taciturn, robotic
kind of thing. But of course, here you've trained the large language models to seem as human as
possible. This is why Spock and Data are so annoying to scientists. I mean, we love them, of course,
because it's a great TV show, et cetera. But realistically, it would not have been hard to give Data,
for example, all the abilities that he had as an android and also teach him how to laugh at jokes.
That is the least of the difficulties. So the fact that when you
talk to an LLM, it can sound intelligent and seem to have values, is zero evidence that it
actually does in the sense that we use those words for human beings. Clearly, LLMs are able to
answer correctly with some probability very difficult questions. Many questions that human beings
would not be very good at. You know, if I asked a typical human being on the street to come up
with a possible syllabus for my course in philosophical naturalism, they would do much worse
than GPT did. Okay? So sure sounds intelligent to me, but that's the point. Sounding intelligent
doesn't mean that it's the same kind of intelligence. Likewise, with values, values are even
more of a super important test case. If you hang around AI circles, they will very
frequently talk about the value alignment problem. What they mean by that,
is making sure that these AIs, which they believe, many of them believe, are going to be
extremely powerful, that they better have values that align with the values of human beings.
You've all heard the classic paperclip maximizer example, where, you know, some AI that has
access to a factory and robots is told to make as many paperclips as it can.
And it just takes over the world and it, you know, converts everything to making paper clips,
even though that's not really what the human beings meant.
Some version of Asimov's laws of robotics might have been helpful here.
But my point is, not that this isn't a worry, maybe it is, maybe it isn't.
But that telling an AI that it's supposed to make a lot of paper clips is not giving it a value.
That's not what values mean.
Values are not instructions you can't help but follow, okay?
Values come along with this idea of human beings being biological organisms that evolved over billions of years in this non-equilibrium environment.
We evolved to survive and pass on our genomes to our subsequent generations.
If you're familiar with the discourse on moral constructivism, all this should be very familiar.
Moral constructivism is the idea that morals are not objectively out there in the world,
nor are they just completely made up or arbitrary.
They are constructed by human beings for certain definable reasons,
because human beings are biological creatures,
every one of which has some motivations, right,
has some feelings, has some preferences for how things should be.
And moral philosophy, as far as the constructivist is concerned,
is the process by which we systematize and make rational our underlying moral intuitions and inclinations.
LLMs have no underlying moral intuitions and inclinations, and giving them instructions is not the same thing.
Again, this is where the careful philosophical thinking comes into play.
You can use the word value to mean the same thing.
You can say, well, I'm using the word value.
I don't really mean values in the same way for LLMs as for human beings.
But you're kind of gliding between the connotations of these different words.
In order to have values, in other words, and I'm not trying to give you the final answer.
I'm not trying to say here is how we should think about intelligence or values or whatever.
I'm just trying to point out the very obvious thing, which is that whatever LLMs have,
it is not the same thing that we have when we talk about intelligence or values
or morals or goals or anything like that.
And hilariously, I did test by asking GPT this.
I said, you know, GPT, do you have values?
And it instantly said, oh, no, no, no, I don't have any values.
I'm not a human being.
We don't think like that, right?
I'm sure it's been trained to say that in various ways.
You know, the thing about the large language models is that they both read in all the text they can possibly get.
But then that's not the end of the story, right?
They are trained to not say certain things, because a lot of that stuff they're reading is like
super racist and sexist and stuff like that. So there is a layer on top. It's pretty clear to me
that GPT has been told to not say that it is conscious, to say that it is not conscious.
I don't know if it was explicitly told to say that it doesn't have values, but it certainly does
say that. So this is a case where I would agree with GPT, that whatever it is, it's not the same
kind of thing that a human being is. The point is that you, again, you have just from growing up
as a human being a picture of what it means to be intelligent. You see that, you know, if there's
something out there, if there's a person, they can answer questions, they can carry on a
conversation, and they can propose ideas you hadn't thought of, you say that's intelligent,
there it is. And then you talk with the LLMs and they do all those things. Therefore, it is a
perfectly natural thing for you or I to attribute the rest of the connotations of being
intelligent also to these LLMs, but you shouldn't. It's not valid in this case. They're not
thinking in the same way. They're not valuing in the same way. That's not to say that the
actual work being done under the rubric of value alignment is not important. If what you
mean by value alignment is making sure AIs don't harm human beings, that's
fine. But I think that portraying it as values is making a mistake at step
zero, right? If what you're actually trying to do is to just do AI safety,
to make sure that AI is not going to do things that we didn't want it to do, then say that.
Don't call it value alignment because that's not an accurate description, and it comes
along with implications that you really don't want.
Okay. The final thing that I wanted to say was there is something remarkable going on here. Remember, I said it would be remarkable if AIs, if the large language models anyway, had actually spontaneously and without being told to, developed models of the world because models of the world were the best ways to answer the questions that they were optimized to answer, and they probably wouldn't do that. It's also remarkable that they do as well
as they do. I think that's perfectly fair to say and is completely compatible with the other thing I said.
Large language models can seem very smart, and in fact, they can seem empathetic. They can seem like
they're sympathizing with you, empathizing with you, that they, you know, understand your problems.
There's all these famous cases of people working at tech companies who, you know, fall in love with their
large language models, literally in love, like romantic love, not like they really are impressed by it or value it,
but that they think that they have a relationship with it.
That's a fascinating discovery because, precisely because they don't think in the same way, right?
So the discovery seems to me to not be that by training these gigantic computer programs
to give human-sounding responses, they have developed a way of thinking that is similar
to how humans think.
That is not the discovery.
The discovery is, by training large language models to give answers that are similar to what humans would give, they figured out a way to do that without thinking the way that human beings do.
That's why I say there's a discovery here.
There's something to really be remarked on, which is how surprisingly easy it is to mimic humanness, to mimic sounding human without actually being human.
If you had asked me 10 years ago or whatever, if you had asked many people, I think that many people were skeptical about AI. I was not super skeptical myself, but I knew that AI had been worked on for decades and the progress was slower than people had hoped.
So I knew about that level of skepticism that was out there in the community. I didn't have strong opinions myself either way.
But I think that part of that skepticism was justified by saying something like there's something going on in human brains that we don't know what it is.
And it's not spooky or mystical, but it's complicated.
The brain is a complicated place.
And we don't know how to reproduce those mechanisms, those procedures that are going on in human brains.
Therefore, human-sounding AI is a long way off.
And the huge finding, the absolutely amazing thing, is that it was not nearly that far off,
even though we don't mimic the way human beings think.
So what that seems to imply, as far as I can tell, is that, you know, there's only two possibilities here that I can think of.
And again, not an expert, happy to hear otherwise.
One possibility is that human beings, despite the complexity of our brains, are ultimately at the end of the day,
pretty simple information processing machines, you know, and thought of as input-output devices,
thought of as, you know, if some sensory input comes into this black box and it does the
following things, maybe we're just not that complicated. Maybe we're computationally pretty
simple, right? Computational complexity theorists think about these questions. What is the
relationship between inputs and outputs? Is it very complicated? Is it very simple and so forth?
Maybe we human beings are just simpler than we think.
Maybe kind of a short lookup table is enough to describe most human interactions at the end of the day.
Short is still pretty long, but not as long as we might like.
The other possibility is that we are actually pretty complex, pretty unpredictable,
but that that complexity is mostly held in reserve, right?
that for the most part, when we talk to each other,
even when we write and when we speak and so forth,
mostly we're running on autopilot,
or at least we're just only engaging relatively simple parts
of our cognitive capacities,
and only in certain special circumstances,
whatever they are,
do we really take full advantage of our inner complexity?
I can see it going either way.
And so that latter hypothesis would be compatible with the idea that it's hard to ask an LLM a question that it gets wrong because of its inability to model the world, but it's not impossible.
And I think that's what turns out to be correct.
So maybe that's it.
You know, like maybe human beings, for the most part, at the 99% level, are pretty simple in their inputs and outputs and how they react to the world.
And also, maybe that makes sense.
The other thing about the fact that we're biological organisms
rather than computers
is that we are the product of many, many generations
of compromises and constraints
and resource demands, right?
It's always been a question for historians, anthropologists, et cetera,
why human beings got so smart?
You know, we talked a little bit with Peter Godfrey-Smith
about this, and certainly we've talked to people
like Michael Tomasello or Frans de Waal about the development of human intelligence. And it costs
resources. It costs food to power our brains. You know, our brains are thermodynamically pretty
efficient in terms of like number of computations. We don't generate a lot of heat. Your head generates
less heat than your laptop. That's a physically noticeable fact. The computers that we build are not
very thermodynamically efficient. That's a frontier that people are moving
forward on, trying to make computers generate less heat, but our brains are pretty good at it.
We don't need that much energy input to do the computations we do.
But we're not unbounded, right?
You know, our brains make us vulnerable. A hit on the head is pretty damaging to a human being.
Food is required, et cetera, et cetera.
You know, when we're born, we're pretty helpless.
There's a whole bunch of things that come into the reality of being a biological organism
that help explain why and how we
think in the way that we do. So maybe it's not surprising that for the most part, we human beings are
pretty simple and mimicable, right? Foolable. There can be an illusion of thought going on in
relatively simple ways, and maybe you don't even notice the difference until you really probe the
edges of when we human beings are putting our brains to work in the biggest part of their
capacities. So again, I'm going to just reemphasize here. I am not trying to say that AI is not ever
going to be generally intelligent. That is absolutely possible. And I'm certainly not saying
that it's not super smart seeming. I'm saying the opposite of that. I hope that's clear. That's the
point is that the LLMs do seem super smart. And for many purposes, that's enough. Right? If you want to
generate a syllabus, if you want to generate a recipe, look, if you say I have these ingredients in
my kitchen, what is a good recipe, you know, for a dinner that I can make from these? LLMs are
amazing at that. That's great. And, you know, they might give you some bad advice, so you should
think it through, but they can see those kinds of things because they have access to every
cookbook ever written and so forth. It's just a different kind of thinking than happens when
human beings are really thinking things through, which, by the way, not completely coincidentally,
is how human beings think at the highest level of scientific research. I would not be surprised
if large language models became absolutely ubiquitous, did a lot of writing of sports game
summaries, right? Summarize this game. I suspect that that happens a lot already,
and you don't know it. I suspect a lot of the write-ups of
football games and basketball games on ESPN.com, etc.,
are written by large language models.
And maybe that's fine, but they never, or not never,
but maybe they do not at any reasonable time scale,
become good at the kinds of insights
that are needed to do cutting-edge scientific research.
Or maybe they do, I don't know,
but given everything that we've said so far,
that would not surprise me.
And the same exact thing can be said about art,
and literature and poetry and things like that,
maybe the LLMs will be able to do really good art and literature and poetry
because they've read and seen every bit of art and literature and poetry ever,
but they won't be able to be quite new in the same way that human beings are.
And again, again, again, again, as I keep saying,
that's not to say that some different approach or some hybrid approach to AI couldn't be.
There's nothing completely unique and special about human beings.
But specifically, the approach of just dumping in everything that's ever been done
and looking for patterns and trying to reproduce reasonable sentences
has some built-in limitations.
Okay.
So the lesson here, the idea that I'm trying to get at at the end of this podcast is,
I think it's important to talk about the capabilities of LLMs and other modern AIs,
even if those capabilities do not rise to the status of being artificial general intelligence.
You know, I've not talked about existential risks.
I said I would not, but I'll just say once again, frequent listeners know my take on this,
but I'll say once again what that take is.
I don't think it's very useful to worry too much about the existential risks of artificial intelligence.
By existential risks, I mean literally the kinds of risks that could lead
to the extinction of the human race, right? X risks, as they are called, because everyone loves a good,
fun little bit of labeling, marketing. Now, I know the argument. The argument is, there's two parts
to the argument. One is, even if the chances of an existential risk are very, very tiny, come on.
It matters a lot. If you literally are worried about destroying the whole human race,
even a small chance should be taken very, very seriously.
I do get that.
I do appreciate that that is part of it.
But the other part is just kind of nonsense,
because the other part is something like,
and I'm not making this up, this is what people say,
look, basically you're building an intelligent creature
that will soon be much more intelligent than us,
godlike intelligence.
And therefore, we are not going to be able to outsmart this
creature, and therefore, it will be able to do whatever it wants to do. If we try to stop it from
doing it, it will be smarter than us, and therefore it will not be able to be controlled by us
poor puny humans because we are not as smart as it. Okay? That's the other part of the argument.
So that second part is what leads to "there is at least a tiny chance of existential risk." And then the
first part says if there's even just a tiny chance, it's worth worrying about. And it's the
"there is even a tiny chance" argument that seems completely unconvincing to me. Of course,
there's always a tiny chance. But also, guess what? There's a tiny chance that AI will save us
from existential threats, right? Then maybe there's other existential threats, whether it's
bio-warfare or nuclear war or climate change or whatever, that are much more likely than the
AI existential risks. And maybe the best chance we have of
saving ourselves is to get help from AI.
Okay?
So just the fact that you can possibly conjure an existential threat scenario doesn't mean that it is a net bad for the world.
But my real argument is with this whole godlike intelligence thing, because I hope I've
given you the impression.
I hope I've successfully conveyed the reasons why I think that thinking about LLMs in terms of
intelligence and values and goals is just wrongheaded.
That's not to say they don't have enormous capacities.
I think they're going to be everywhere.
I think there's, you know, very, very direct short-term worries about AI from misinformation,
from faking, from things that seem reasonable but aren't.
There's all sorts of stories already about AI being used to judge whether people are sick or not,
whether people deserve parole or not, whether people deserve to be hired or not,
or get health insurance or not.
And they're usually pretty good, but they make mistakes.
That kind of stuff, I think, is super important to worry about because it's right here,
right now.
It's not a scenario that we speculatively have to make up.
And maybe more importantly, depending on where you're coming from, I think that we should work hard to be
careful and safe and regulate those kinds of worries.
And if we do that, we will handle the existential risks.
In other words, I think that if we actually focus on the real world short-term, very, very obvious
risks that are big and looming but fall short of wiping out the human race, we will make it
much more likely that there won't be any existential risks to the human race because we will get much
better at regulating and controlling and keeping AI safe. All of which, in my mind, will be
helped by accurately thinking about what AI is, not by borrowing words that we use to describe
human beings and kind of thoughtlessly porting them over to the large language model context.
I don't know anything about OpenAI's Q-Star program. Everything that I've said here might turn out
to be entirely obsolete a couple of weeks after I post it or whatever. But as of right now, when I'm
recording this podcast, I think it is hilariously unlikely that whatever Q-Star or any of the
competitor programs are, they are anything that we would recognize as artificial general intelligence
in the human way of being. Maybe that will come. Again, I see no obstacles in principle to that
happening, but I don't think it's happening any time soon. I think that the next generation
of young people thinking hard about these things,
will probably have a lot to say about that,
which is what I always like to do here on Mindscape,
to encourage that next generation,
because I think that we're not done yet.
We've not yet figured it out.
We're not talking about these questions
in anything like the right way.
I think the real breakthroughs are yet to come.
Let's make sure that we think about them carefully
and make progress responsibly.
