StarTalk Radio - The Simulation Within with Karl Friston
Episode Date: October 18, 2024

Are we all living in a simulation inside our brains? Neil deGrasse Tyson and co-hosts Chuck Nice and Gary O'Reilly learn about the root of perception, whether AI really is intelligent, and the Free Energy Principle with theoretical neuroscientist Karl Friston.

NOTE: StarTalk+ Patrons can listen to this entire episode commercial-free here: https://startalkmedia.com/show/the-simulation-within-with-karl-friston/

Thanks to our Patrons Timothy Ecker, Jason Griffith, Evan Lee, Marc, Christopher Young, ahoF3Hb9m, Steven Kraus, Dave Hartman, Diana Todd, Jeffrey Shulak MD, Susan Summers, Kurt A Goebel, Renee Harris, Damien, Adam Akre, Kyle Marston, Gabriel, Bradley Butikofer, Patrick Hill, Cory Alan, and Micheal Gomez for supporting us this week.

Subscribe to SiriusXM Podcasts+ on Apple Podcasts to listen to new episodes ad-free and a whole week early.
Transcript
So guys, I was delighted to learn all the ways that principles of physics could be borrowed by
neuroscientists to try to understand how our brain works. Because at the end of the day,
there's physics and everything else is just opinion.
If you say so yourself. I love how you get to say that.
But it's the physics of intelligence.
And the physics of neuroscience is really what that
is about. As you've educated us, the physics is in everything. Yes. You just don't think of it as
being in neuroscience. Yeah, because we've compartmentalized what people do as professional
scientists into their own textbook and their own journals and their own departments at universities.
At the end of the day, we are one. It's all physics, people.
It's all physics.
At the end, that's the lesson we learn.
Coming up, all that and more on StarTalk Special Edition.
Welcome to StarTalk, your place in the universe where science and pop culture collide.
StarTalk begins right now.
This is StarTalk Special Edition. Neil deGrasse Tyson here, your personal astrophysicist.
And if this is Special Edition, you know it means we've got not only Chuck Nice. Chuck,
how you doing, man? Hey, buddy.
Always good to have you there as my co-host. And we also have Gary O'Reilly, former soccer pro, sports commentator. Was that the crowd cheering him on?
Yeah, that's the crowd at Tottenham. Tottenham, yeah.
Crystal Palace, they're all... Anytime you mention my name in this room, there's a crowd effect.
That's right. Gary, with Special Edition, what you've helped do with this branch of StarTalk is focus on the human condition and every way that matters to it.
The mind, body, soul.
It would include AI, mechanical augmentation to who and what we are.
Robotics.
So this fits right in to that theme.
So take us where you need for this episode.
In the age of AI and machine learning,
we as a society and naturally as StarTalk
are asking all sorts of questions
about the human brain, how it works,
and how we can apply it to machines.
One of these big questions being perception.
How do you get a blob of neurons...
I think that's a technical term.
Yes, technical for sure.
Yeah.
...in your skull to understand the world outside?
Our guest, Karl Friston, is one of the world's leading neuroscientists
an authority on neuroimaging and theoretical neuroscience, and the architect of the free energy principle.
Using physics-inspired statistical methods to model
neuroimaging data, that's one of his big successes, he's also sought after by the people in the
machine learning universe. Now, just to give you a little background on Karl: a neuroscientist and theoretician at University College London, where he is a professor, he studied physics and psychology at Cambridge University in England, and he is the inventor of statistical parametric mapping, used around the world in neuroimaging, plus many other fascinating things. He is the owner of a
seriously impressive array of honors and awards, which we do not have time to get into.
And he speaks Brit.
Yes. So he's evened this out.
There's no more picking on the Brit because there's only one of them.
Okay.
Karl Friston, welcome to StarTalk.
Well, thank you very much for having me.
I should just point out I can speak American as well.
Please don't.
Karl, that takes a certain level of illiteracy that I'm sure that you don't possess.
Yeah, please don't stoop to our level.
So let's start off with something.
Is it a field or is it a principle or is it an idea that you pioneered, which is in our
notes known as the free energy principle?
I come to this as a physicist, and there's a lot of sort
of physics-y words that are floating or orbiting your work. And so in physics, they're very
precisely defined, and I need to know how you are using these terms and in what way they apply.
So let's just start off. What is the free energy principle?
Well, as it says on the tin, it is a principle.
And in the spirit of physics, it is therefore a method.
So it's just like Hamilton's principle of least action.
So it's just a prescription, a formal mathematical prescription, of the way that things behave that you can then use to either simulate or reproduce or indeed explain the behavior of things. So you might apply the principle of least action, for example, to describe the motion of a football. The free energy principle has a special domain of application: it talks about the self-organization of things, where things can be particles, they can be people, they can be populations.
So it's a method, really, of describing things that self-organize themselves into characteristic states.
Very cool.
So why give it this whole new term?
You know, we've all read about or thought about or seen the self-organization of matter. A ball rolls off a table onto the ground; it doesn't roll off the ground onto the table. So it seeks the minimum place. And my favorite of these is the box of morning breakfast cereal, which will always say, some settling of contents may have occurred.
Yeah.
And you open it up and it's like two-thirds full.
Yeah, two-thirds of powder.
You get two-thirds of crushed shards of cornflakes.
So it's finding sort of the lowest place in the Earth's gravitational potential.
So why the need for this new term?
Well, it's an old term. Again, pursuing the American theme, you can trace this kind of free energy back to Richard Feynman, probably his PhD thesis. He was trying to deal with the problem of describing the behavior of small particles and invented this kind of free energy as a proxy that enabled him to evaluate the probability that a particle would take this path or that path. So exactly the same maths has now been transplanted and applied not to the movement of particles, but to what we refer to as belief updating.
So, it's lovely you should introduce this notion of nature finding its preferred state.
That can be described as rolling downhill to those free energy minima.
This is exactly the ambition behind the free energy principle.
But the preferred states here are states of beliefs or representations about a world in which something, say you or I, exists. This is the point of contact with machine learning and artificial intelligence. The free energy here is not a thermodynamic free energy; it is a free energy that scores the probability of your explanation for the world in your head being the right kind of explanation. And you can now think about our existence, the way that we make sense of the world, and our behavior, the way that we sample that world, as effectively falling downhill, settling towards the bottom, but in an extremely itinerant way, in a wandering way, as we go through our daily lives at different temporal scales. It can all be described, effectively, as coagulating at the bottom of the cereal packet, in our preferred states.
Wow.
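For readers who want the formula behind the metaphor: in the active inference literature, the variational free energy Friston describes is standardly written in terms of beliefs q(s) about hidden states s and observations o. This is the textbook form, not a quotation from the episode:

```latex
% Variational free energy: beliefs q(s) about hidden states s, observations o
F \;=\; \mathbb{E}_{q(s)}\!\big[\ln q(s) - \ln p(o, s)\big]
  \;=\; \underbrace{D_{\mathrm{KL}}\big[q(s)\,\big\|\,p(s \mid o)\big]}_{\geq\, 0} \;-\; \ln p(o)
```

Because the KL term is never negative, F is an upper bound on surprise, the negative log evidence; rolling downhill on F therefore makes your beliefs a better explanation of your sensations.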
So again, I don't want to put words in your mouth that don't belong there; this is just my attempt to interpret and understand what you just described. You didn't yet mention neurons, which are the carriers of all of this, the transmitters of all of these thoughts and memories and interpretations of the world. So when you talk about the pathways along which an understanding of the world takes shape, do those pathways track the nearly semi-infinite connectivity of neurons in our brains? So you're finding what the neurons will naturally do in the face of one stimulus versus another.
That's absolutely right. In fact, technically, you can describe neuronal dynamics, the trajectory or the path of nerve cells firing, exactly as performing a gradient descent on this variational free energy. So that is literally true. But I think more intuitively,
the idea is, in fact, the idea you've just expressed, which is you can trace back possibly
to the early days of cybernetics in terms of the good regulator theorem. The idea here is that to be well adapted to your environment,
you have to be a model of that environment.
In other words, to interface and interact with your world
through your sensations,
you have to have a model of the causal structure in that world.
And that causal structure is thought to be literally embedded
in the connectivity among the neurons within your brain. My favorite example of this would be the distinction between where something is and what something is. In our universe, a certain object can be in different positions.
So if you told me what something is, I wouldn't know where it was.
Likewise, if you told me where something was, I wouldn't know what it was.
That statistical separation, if you like, is literally installed in our anatomy.
So there are two streams at the back of the brain,
one dealing with where things are and one stream of connectivity
dealing with what things are.
However, we are pliable enough, though.
And of course, I'm not pushing back.
I'm just trying to further understand.
We're pliable enough, though,
that if you were to say,
go get me the thing, and then you give me very specific coordinates of the thing, I would not have to know what the thing is, and I would be able to find it even if there are other things that are there.
Yep. And that speaks to something which is quite remarkable about ourselves: that we actually have a model of our lived world that has this sort of geometry that can be navigated, because that presupposes you've got a model of yourself moving in a world, and you know the way that your body works. I'm tempted here to bring in groins, but I don't know why.
Chuck injured his groin a few days ago. That's why.
That's all he's been talking about.
Karl, well, hopefully.
We've all heard about it since.
Karl, I hear the term active inference,
and then I hear the term Bayesian active inference.
Let's start with active inference.
What is it?
How does it play a part in cognitive neuroscience?
Active inference, I think most simply put, would be an application of this free energy principle we're talking about.
So it's a description, applying the maths to understand how we behave in a sentient way. The active part acknowledges that perception, read as unconscious inference in the spirit of Helmholtz, depends upon the data that we actively solicit from the environment. So what I see depends upon where I am currently looking. So this speaks to the notion of active sensing.
You went a little fast. I'm sorry, man. I'm trying to keep up here. Okay. But you went a little fast there, Karl.
You talked about perception being an inference that is somehow tied to the subconscious.
But can you just do that again, please?
And just to be clear, he's speaking slowly.
Exactly.
So it's not that he's going fast.
No.
It's that you are not keeping up.
Well, listen, I don't have a problem, okay? I have no problem not keeping up, which is why I have never been left behind, by the way.
I have no problem keeping up because I go,
wait a minute.
So anyway, could you just break that down a little bit for me?
Sure.
I was trying to speak at a New York pace.
My apologies.
I'll revert to London.
Okay, so let's start at the beginning,
sense-making, perception. How do we make sense of the world? We are locked inside, our brains are
locked inside a skull, it's dark in there, there's no, you can't see other than what information is
conveyed by your eyes or by your ears or by your skin, your sensory organs.
So you have to make sense of this unstructured data coming in from your sensory organs,
your sensory epithelia. How might you do that? The answer to that, or one answer to that,
can be traced back to the days of Plato through Kant and Helmholtz. So Helmholtz brought up this notion of unconscious inference.
Sounds very glorious, but very, very simply,
it says that if inside your head
you've got a model of how your sensations were caused,
then you can use this model to generate a prediction
of what you would sense if this was the right cause,
if you got the right hypothesis.
And if what you predict matches what you actually sense,
then you can confirm your hypothesis.
So this is where inference gets into the game.
It's very much like a scientist who has to use scientific instruments,
say microscopes or telescopes,
in order to acquire the right kind
of data to test her hypotheses about the structure of the universe, about the state of affairs out there as measured by her instruments. So this can be described as hypothesis testing: putting your fantasies, your hypotheses, your beliefs about the state of affairs outside your skull to the test by sampling data. This is just inference.
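To make the scientist analogy concrete, here is a toy sketch, with invented numbers, of the hypothesis testing Friston describes: a generative model predicts what each candidate cause would feel like, and the cause whose prediction best matches the actual sensation wins.

```python
import numpy as np

# Toy Helmholtz-style "unconscious inference" (illustrative numbers only):
# each hypothesis about the hidden cause generates a predicted sensation,
# and we keep the hypothesis whose prediction best matches what was sensed.

def predict(cause: float) -> float:
    """Generative model: what would I sense if this were the cause?"""
    return 2.0 * cause + 1.0  # an assumed, made-up sensory mapping

candidate_causes = np.linspace(-5.0, 5.0, 101)  # hypotheses to test
sensed = 4.2                                    # what actually arrived

errors = [(sensed - predict(c)) ** 2 for c in candidate_causes]
best = candidate_causes[int(np.argmin(errors))]
print(f"best explanation of the sensation: cause = {best:.2f}")
```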
These are micro steps en route to establishing an objective reality.
And there are people for whom their model does not match a prediction they might make
for the world outside of them, and they would be living in some delusional world that you cannot otherwise reconcile with what is objectively true.
And that would then be an objective measure of insanity or some other neurological disconnect.
Really, though?
I mean, is it really?
Well, if you project your own fantastical world into reality
and you know it doesn't sit, but it's what you want,
then that's a dysfunction.
You're not working with, you're working against.
But we live in a time now where that fantastical dysfunction actually has a place.
And talk to James Cameron for just a little bit, and you'll see that that fantastical dysfunction was a world-building creation that we see now as a series of movies.
So is it really so, you know, aberrant that it's a dysfunction, or is it just different?
Well,
I think he's trying to create artistically rather than impose upon.
Yeah.
So Carl,
if everyone always received the world objectively,
would there be room for art at all?
Ooh, that was a good question.
Yep, it really was.
Well done, sir.
I'm going to say I think I was the inspiration for that question.
Yes, Chuck inspired that question.
So, there's a role for each side of this perceptive reality, correct?
No, absolutely. So just to pick up on a couple of those themes, that last point was, I think, quite key. It is certainly the case that one application, one use, of active inference is to understand psychiatric disorders. So you're absolutely right: when people's model of their lived world is not quite apt for the situation in which they find themselves, say something changes, say you lose a loved one, so your world changes. So your predictions and the way that you navigate through your day, either socially or physically, are now changed. So your model is no longer fit for purpose for this world. But as Chuck was saying before, the brain is incredibly plastic and adaptive. So what you can do is use the mismatch between what you predict is going to happen and what you actually sense to update your model of the world. And before, I was saying that
this is a model that would be able to generate predictions of what you would see under a particular hypothesis or fantasy.
And just to make a link back to AI, this is generative AI.
It's intelligent forecasting, prediction under a generative model that is entailed exactly by the connectivity that we were talking about before in the brain.
And it's the free energy principle manifesting when you readjust to the changes, and it's finding the new routes. And presumably, the more accurate your understanding of your world, the lower is that free energy state? Or is it higher or lower? What is it?
Yeah, that is absolutely right.
Actually, technically, if you go into the cognitive neurosciences, you'll find a big move in the past 10 years towards this notion of predictive processing and predictive coding, which again just rests upon this meme that our brains are constructive organs, generating from the inside our predictions of the sensorium, and then the mismatch is now a prediction error. That prediction error is then used to drive the neurodynamics that allow for this revising or updating of my beliefs, such that my predictions now are more accurate and therefore the prediction error is minimized. The key thing is, to answer your question technically, the gradients of the free energy that drive you downhill just are the prediction errors. So when you've minimized your free energy, you've squashed all the prediction errors.
Absolutely.
Excellent.
You're not going to roll uphill
unless there's some other change to your environment.
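The claim that the free energy gradients just are the prediction errors can be seen in a few lines. With a toy quadratic free energy F(mu) = 0.5 * (o - g(mu))**2, the derivative with respect to the belief mu is proportional to the prediction error, so gradient descent on F is literally error-squashing. A sketch with invented numbers, not code from the episode:

```python
# Minimal predictive-coding sketch: with F(mu) = 0.5 * (o - g(mu))**2 and an
# identity mapping g, the gradient of F is just the (negative) prediction
# error, so descending the free energy gradient squashes that error.

g = lambda mu: mu   # identity generative mapping, for simplicity
o = 3.0             # observed sensation
mu = 0.0            # current belief about the hidden cause
rate = 0.1          # gradient descent step size

for _ in range(50):
    prediction_error = o - g(mu)    # the mismatch Friston describes
    mu += rate * prediction_error   # belief update = rolling downhill on F
print(f"belief after settling: {mu:.3f}, residual error: {o - g(mu):.3g}")
```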
I'm Kais from Bangladesh,
and I support StarTalk on Patreon.
This is StarTalk with Neil deGrasse Tyson.
So if we think back to early mankind and the predictability.
So I'm walking along, I see a lion in the long grass.
What do I start to predict?
If I run up a tree high enough, that lion won't get me. But if I run along the ground,
the lion's probably going to get me. Is this kind of evolutionary, that we're born for survival?
Yes.
Or have I misinterpreted this completely?
No, no, I think that's an excellent point. Well, let's just think about what it means to be able to predict exactly what you would sense in a given situation and thereby predict also what's going to happen next.
If you can do that with your environment and you've reached the bottom of the cereal packet and you've minimized your free energy, minimized your prediction errors, you now can fit the world.
You can model the world in an accurate way.
That just is adaptive fitness.
So if you look at this process now as unfolding over evolutionary time, you can now read the variational free energy, or its negative, as adaptive fitness. So that tells you immediately that evolution itself is one of these free-energy-minimizing processes. It is also, if you like, testing hypotheses about the kind of denizens of its environment, the kind of creatures that will be a good fit for this particular environment. So you can actually read natural selection as what, in statistics, would be known as Bayesian model selection. So you are in fact inheriting inferences, or learning transgenerationally, in a way that's minimizing your free energy, minimizing your prediction errors. So things that get eaten by lions don't have the ability to propagate themselves through to the next generation, so that everything ends up at the bottom of the cereal packet avoiding lions, because those are the only things that can be there. The other ones didn't minimize their free energy.
Yeah, unless, Gary, you made babies before you said, I wonder if that's a lion in the bushes, let me check.
But if they've got my genes, then there's a lion with their name on it.
That's exactly right.
I want to share with you one observation, Karl, and then I want to hand back to Gary, because I know he wants to get all in the AI
side of this. I remembered one of the books by Douglas Hofstadter. It might've been
Gödel, Escher, Bach, or he had a few more that were brilliant explorations into the mind and body.
In the end of one of his books, he had, it was an appendix, I don't remember, a conversation with Einstein's brain.
And I said to myself, this is stupid.
What does this even mean?
And then he went in and described the fact that imagine Einstein's brain could be preserved at the moment he died.
And all the neurosynaptic elements are still in place.
And it's just sitting there in a jar.
And you ask a question.
And the question goes into his ears, gets transmitted into the sounds that trigger neurosynaptic firings. It just moves
through the brain and then Einstein then speaks an answer. And the way that setup was established,
it was like, yeah, I can picture this sometime in the distant future. Now, maybe the modern version
of that is you upload your consciousness and then you're asking your brain in a jar, but it's not biological at that point. It's in silicon. But what I'm asking is: the information going into Einstein's brain in that thought experiment presumably triggers his thoughts and then his need to answer that question, because it was posed as a question.
Could you just comment on that exercise, the exercise of probing a brain that's sitting there waiting for you to ask it a question?
I mean, it's a very specific and interesting example of the kind of predictive processing that we are capable of, because we're talking about language and communication here. And just note, the way that you set up that question provides a lovely segue into large language models. But note also that it's not the kind of embodied intelligence that we were talking about in relation to active inference, because the brain is in a body. The brain is embodied.
Most of what the brain is actually in charge of
is moving the body or secreting.
In fact, those are the only two ways you can change the universe.
You can either move a muscle or secrete something.
There is no other way that you can affect the universe.
So this means that you have to deploy your body
in a way to sample the right kind of information
that makes your model as apt or as adaptive as possible.
So Chuck, did you hear what he said?
It means you cannot bend the spoon with your brain.
Right.
Unless you're Uri Geller.
Right.
Just to clarify.
Okay. So what I was trying to hint at, because I suspect it's going to come up in later conversation, is that there's, I think, a difference between a brain in a vat and a large language model that is the embodiment of lots of knowledge. So one can imagine, say, a large language model being a little bit like Einstein's brain, but Einstein plus, you know, a hundred, possibly a million other people and the history of everything that has been written, that you can probe by asking it questions. And in fact, there are people whose entire career is now prompt engineering.
AI prompts, yeah. It's funny, the people who program AI then leave that job to become prompt engineers. The people who are responsible for creating the best prompts to get the most information back out of AI. So it's a pretty fascinating industry; they've created their own feedback loop that benefits them.
And now you can start to argue, you know, where is the intelligence?
Is it in the prompt engineer?
As a scientist,
I would say that's where the intelligence is.
That's where the sort of sentient behavior is.
It's asking the questions,
not producing the answers.
That's the easy bit.
It's certainly asking,
querying the world in the right way.
And just notice,
what are we all doing?
What is your job?
Is it asking the right questions?
Karl, can I ask you this, please? Could active inference cause us to miss things that do happen? And secondly, does deja vu fit into this?
Yes, and yes. In a sense, active inference is really about missing things that are measurable or observable in the right kind of way. Another sort of key thing about natural intelligence, and being a good scientist, just to point out: noting that discovering infrared, that's an act of creation. That is art. So where did that come from? From somebody's model about the structure of electromagnetic radiation. So I think, just to pick up on a point we missed earlier on, creativity and insight is an emergent property of this kind of question answering, in an effort to improve our models of our particular world. Coming back to missing stuff, it always fascinates me that the way that we can move depends upon ignoring the fact we're not moving. So I'm talking now about a phenomenon in cognitive science called sensory attenuation.
And this is the rather paradoxical, or at least counterintuitive phenomenon,
that in order to initiate a movement, we have to ignore and switch off and suppress
any sensory evidence that we're not currently moving. And my favorite example of this is moving your eyes. So if I asked you to track my finger as I moved it across the screen, and you moved your eyes very, very quickly, while your eyes are moving you're actually not seeing the optic flow that's being produced, because you are engaging something called saccadic suppression. And this is a reflection of the brain very cleverly knowing that that particular optic flow that I have induced is fake news. So the ability to ignore fake news is absolutely essential for good navigation and movement in our world.
Is it fake, or just irrelevant to the moment?
If it's the New York Times, it's definitely fake.
Fake news.
But it's not so much fake.
It's just not relevant to the task at hand.
Isn't that a different notion?
It's a subtle one.
For the simplicity of the conversation, then, I'm reading fake as irrelevant, imprecise. So it's unusable, so your brain is just throwing it out, basically. Like, nothing to see here, so get rid of that.
So, Neil, this is in your backyard rather more than mine, but isn't this where The Matrix pretext kind of fits in, that our perception might differ from what's actually out there?
And then perception can be manipulated or recreated.
Well, I think Karl's descendants will just put us all in a jar.
The way he's talking.
Karl, what does your laboratory look like?
Full of jars.
Yes.
Well, there are several pods and we have one waiting for you.
Yeah, in the film The Matrix, of course, which came out in 1999,
about 25 years, a quarter century ago, which is hard to believe.
What?
It had this very candid sense that your brain's reality is the reality you think of and understand, and it is not receiving
external input. All that your brain is constructing is detached from what's exterior to it.
And if you've had enough lived experience, or maybe in that future that they're describing,
the brain can be implanted with memory.
It reminds me, what's that movie that Arnold Schwarzenegger is in about Mars?
Total Recall.
Total Recall, thank you.
Get your ass to Mars.
Instead of paying thousands of dollars to go on vacation,
they would just implant the memories of a vacation in you
and bypassing the sensory conduits into your brain.
Of course, these are movies and they're stories and it's science fiction.
How science fiction-y is it really?
Well, I certainly think that the philosophy behind, I think, probably both Total Recall,
but particularly The Matrix, I think that's very real and very current.
Just going back to our understanding people with psychiatric disorders
or perhaps people who have odd views, world views,
to understand that the way that you make sense of the world
can be very different from the way I make sense of the world,
dependent on my history and my predispositions and my priors, what I have learned thus far.
And also, the information that I select to attend to. So, just pursuing this theme of ignoring 99%
of all the sensations. For example, Chuck, are you thinking about your groin at the moment? I
would guarantee you're not, and yet it is generating sensory impulses from the nerve endings. But you, at this point in time, are not selecting that. So the capacity to select is, I think, a fundamental part of intelligence and agency, because to select means that you are not attending to or selecting 99% of the things that you could select.
So I think the notion of selection is a hallmark of truly intelligent behavior.
Are you analogizing that to large language models in the sense that it could give you gibberish,
it could find crap anywhere in the world that's online, but because you prompted it precisely, it is going to find only the information necessary and ignore everything else?
Yes and no, but that's a really, really good example. So the yes part is that the characteristic bit of architecture that makes large language models work, certainly those that are implemented using transformer architectures, is something called attention heads. So it is exactly the same
mechanism, the same basic mechanics that we were talking about in terms of attentional selection
that makes transformers work. So they select the recent past in order to predict the next word.
That's why they work, to selectively pick out something in the past,
ignore everything else to make them work.
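Since attention heads carry the weight of this point, here is a compact sketch of the scaled dot-product attention that transformer-based language models use. Shapes and data are toy; real models stack many such heads with masking and learned projections.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """One attention head: score the past, softmax into weights, mix values."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # relevance of each past token
    weights = softmax(scores, axis=-1)       # selection: mass on a few tokens
    return weights @ V                       # weighted summary of the context

rng = np.random.default_rng(0)
T, d = 5, 8                                  # 5 tokens, 8-dim embeddings
Q, K, V = (rng.normal(size=(T, d)) for _ in range(3))
print(attention(Q, K, V).shape)              # (5, 8)
```

The same softmax then turns the model's final-layer scores into the per-token probabilities Chuck describes next.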
When you talk about that probability in an LLM,
that probability is a mathematical equation that happens for every single letter
that's coming out of that model.
So it is literally just giving you the best probability of what is going to come
next. Okay. Whereas when we perceive things, we do so from a worldview. So for an LLM,
if you show it a picture of a ball with a red stripe that's next to a house, okay, and say, that's a
ball, and then show it a picture of a ball in the hands of a little girl who's bouncing it,
it's going to say, all right, that might be a ball, that may not be a ball. Whereas if you show
even a two-year-old child, this is a ball, and then take that ball and place it in any circumstance, the baby will look at it and go, baw, baw.
So there is a difference in the kind of intelligence that we're talking about here.
Yeah, I think that's spot on.
That's absolutely right.
And that's why I said yes and no.
Okay. That kind of fluency that you see in large language models is very compelling and it's very easy to give the illusion that these things have some understanding or some intelligence, but they don't have the right kind of generative model underneath to be able to generalize and recognize a ball in different contexts, like the way that we do. Well, it would if it was set up correctly.
And that setup is no different from you looking at reading the scene.
I mean, a police officer does that busting into a room. You know, who's the perpetrator, who's not, before you shoot.
There's an instantaneous awareness factor
that you have to draw from your exterior stimuli. And so, you know, I'm reminded here, Karl, I saw one of these New Yorker-style cartoons where there are two dolphins swimming in one of these water parks, right? So they're in captivity, but the two dolphins are swimming, and one says to the other, of the person walking along the pool's edge: those humans, they face each other and make noises, but it's not clear they're actually communicating.
And so who are we to say that the AI large language model is not actually intelligent? If you cannot otherwise tell the difference, who cares how it generates what it is? If it gets the result that you seek, you're going to say, oh well, we're intelligent and it's not. How much of that is just human ego speaking?
Well, I'm sure it is human ego speaking, but in a technical sense...
Okay, there's a loophole, you're saying,
because I'm not going to say that bees are not intelligent when they do their
waggle dance,
telling other bees where the honey is.
And I'm not going to say termites are not intelligent when they build
something a thousand times bigger than they are when they make termite mounds
and they all cooperate.
I'm fatigued by humans trying to say how special we are relative to everything else in the world that has a brain, when they do stuff we can't.
Let me ask you then.
So what's the common theme between the termite and the bee and the policeman reading the scene?
What do they all have in common?
All of those three things move, whereas a large language model doesn't.
Doesn't. So that brings us back to this action, the active part of active inference. So the nub of the question about large language models and attention was that large language models are just given everything. They're given all the data. There is no requirement upon them to select which data are going to be most useful to learn from, and therefore they don't have to build expressive, fit-for-purpose world models or generative models. Whereas your daughter, the two-year-old daughter playing with the beach ball, would have to, by moving and selectively reading the scene, by moving her eyes, by observing her body, by observing balls in different contexts, build a much deeper, appropriate world model or generative model that would enable her to recognize the ball in this context and that context, and ultimately tell her father, I'm playing with a ball.
So we had a great show with Brett Kagan, who mentioned your free energy principle
and his work creating computer chips out of neurons, what people call organoid intelligence,
what he was calling synthetic biological intelligence. And that's in our archives.
Yeah, in our recent archives, actually.
Recent archives, yeah. Do you think the answer to AGI is a biological solution, a mechanical solution, or a mixture
of both?
And remind people what AGI is.
Artificial General Intelligence.
I know that's what the words stand for, but what is it?
You're not asking me for the answer. Don't ask me either.
No, seriously, I've been told off for even using the acronym anymore because it's so ill-defined
and people have very different readings of it. So OpenAI has a very specific meaning for it. If you talk to other theoreticians, they would interpret it differently. I think what people are
searching for is natural intelligence. It's natural.
Gary, to answer your question: do we have to make a move towards biomimetic, neuromorphic,
natural kinds of instantiation of intelligent behavior? Yes, absolutely. But Chuck, just going
back to your previous theme, notice we're talking about behaving systems,
systems that act and move and can select
and do their own data mining in a smart way
as opposed to just ingesting all the data.
So what I think people mean
when they talk about superintelligence
or generalized AI or artificial intelligence,
they just mean natural intelligence.
They really mean us.
It's our brain.
Our brain, if you want to know what AGI is,
it's our brain.
If it was actually our brain,
it would be natural stupidity.
Well, that too.
Our brain without the stupidity.
That's really what it is.
So back in December '22, you dropped a white paper titled Designing Ecosystems of Intelligence from First Principles. Now, is this a roadmap for the next 10 years, or beyond, or to the Terminator ultimate destination? And then somewhere along the line you discussed the thinking behind a move from AI to IA, IA standing for intelligent agents.
Exactly. It was written as a kind of roadmap for those people who were committed to a future of artificial intelligence that was more sustainable, that was explicitly committed to a move to natural intelligence and all the biomimetic moves that you'd want to make, including implementations on neuromorphic hardware, quantum computational photonics, all those efficient approaches that would be sustainable in the sense of, say, climate change, for example. But also, speaking to Chuck's notion about efficiency: efficiency is also, if you like, baked into natural intelligence
in the sense that if you can describe intelligent behavior as this falling downhill, pursuing free energy gradients, minimizing free energy, getting to the bottom of the cereal packet, you're doing this via a path of least action. That is the most efficient way of doing it, not only informationally, but also in terms of the amount of electricity you use and the carbon footprint you leave behind. So from the point of view of sustainability, it's important we get this right. And so part of the theme of that white paper was saying there is another direction of travel, away from large models. Large is in the title. It's seductive, but it's also very dangerous. It shouldn't be large. It should be the size of a pea.
So to do it biologically, you should be able to do it much more efficiently. And, of course, the meme here is that our brains work on 20 watts, not 20 kilowatts. And we do more than any large language model. So we have low-energy intelligence. We do efficient, I guess that's a way to say it.
I've seen you quoted, Karl, as saying that we are coming out of the age of information and moving into the age of intelligence. If that's the case, what is the age of intelligence going to look like,
or have we already discussed that? Well, I think we're at its inception now,
just in virtue of all the wonderful things that are happening around us and the things that we
are talking about. We're asking some of the very big questions about what is happening and what
will happen over the next decade. I think part of the answer to that lies in your previous nod
to the switch between AI and IA.
So IA brings agency into play.
So one deep question would be,
is current generative AI agentic?
Is it an agent?
Is a large language model an agent?
And if not, then it can't be intelligent, and certainly can't have generalized intelligence. So what is definitive of being an agent? I put that out there as a question, half expecting a joke, but I've got Agent Smith in my head, if anyone can take that and run with it.
Well, yeah. You know, it's right about now where you hear people
commenting on the morality
of a decision and
whether a decision is good
for civilization or not.
And everybody's afraid
of AI achieving
consciousness and just declaring that
the world would be better off without humans.
And I think we're afraid of that because
we know it's true.
Yeah, I was going to say, we've already come to that conclusion.
That's the problem.
Okay, Karl, is consciousness the same as self-awareness?
Yeah, there are lots of people you could answer that question of and get a better answer. I would say, for the purpose of this conversation, probably not, no. I think to be conscious, certainly to be sentient and to behave in a sentient kind of way, would not necessarily imply that you knew you were a self. I'm pretty sure that a bee doesn't have self-awareness, but it still has sentience. It still has experiences, and has plans, and communicates, and behaves in an intelligent way. And you could also argue that certain humans don't have self-awareness of a fully developed sort; I'm talking about very severe psychiatric conditions. So I think self-awareness is a gift of a particular, very elaborate, very deep generative model that not only entertains the consequences of my actions, but also entertains the fantasy or hypothesis that I am an agent, that I am a self, and can be self-reflective in a sort of metacognitive sense. So I think I'd differentiate between self-aware
and simply being capable of sentient behavior.
Wow, that is great.
Let me play skeptic here for a moment, mild skeptic.
You've described, you've accounted for human decision-making and behavior
with a model that connects our sensory, the
sensory conduits between what's exterior to our brain and what we do with that information as it
enters our brain. And you've applied this free energy, a gradient that this information follows.
That sounds good. It all sounds fine. I'm not going to argue with that.
But how does it benefit us to think of things that way? Or is it just an after-the-fact pastiche on top of what we already knew was going on, but now you put fancier words behind it?
Is there predictive value to this model? Or is the predictivity
in your reach because when you assume that's true, you can actually make it happen in the AI
marketplace? Yeah, I think that that's the key thing. So, I mean, when I'm asked that question,
or indeed when I ask that question of myself, I sort of apply it to things like Hamilton's principle of least action. Why is that useful? Well, it becomes very useful when you're actually building things. It becomes very useful when you're simulating things. It becomes useful when something does not comply with Hamilton's principle of least action. So just to unpack those directions of travel in terms of applying the free energy principle,
that means that you can write down
the equations of motion
and now you can simulate
self-organization
that has this natural
kind of intelligence,
this natural kind of sentient behavior.
You can simulate it in a robot,
in an artifact,
in a terminator should you want to,
although strictly speaking
that would not be
compliant with the free energy principle. But you can also simulate it in silico and make digital twins of people and choices and decision-making and sense-making. And once you can simulate, you can now
use that as an observation model for real artifacts and start to phenotype,
say, people with addiction or, say, people who are very creative or, say, people who have
schizophrenia. So, if you can cast aberrant inference or false inference, believing things
are present when they're not or vice versa, as an inference problem.
And you know what the principles of sense-making and inference are, and you can model that in a computer.
You now get a test bed in which you can not only phenotype by adjusting the model to match somebody's observed behavior,
but now you can go and apply synthetic drugs or do brain surgery in silico. So there are lots of practical applications of knowing how things work, or when I say how things work, how things behave.
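As a cartoon of the phenotyping idea (invented data and parameter, nothing like Friston's actual tooling): assume choices follow a softmax over option values with an inverse temperature beta, then find the beta that best explains one person's observed choices.

```python
import numpy as np

# Toy computational phenotyping: fit the inverse temperature "beta" of a
# softmax choice model to observed choices by maximum likelihood.

values = np.array([1.0, 0.2])                  # assumed values of two options
choices = np.array([0, 0, 1, 0, 0, 0, 1, 0])   # invented observed picks

def log_likelihood(beta: float) -> float:
    p = np.exp(beta * values)
    p /= p.sum()                               # choice probabilities
    return float(np.log(p[choices]).sum())     # fit to the observed choices

betas = np.linspace(0.01, 10.0, 500)
fit = betas[int(np.argmax([log_likelihood(b) for b in betas]))]
print(f"phenotype estimate: beta = {fit:.2f}")
```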
That presumes that your model is correct. For example, just a few decades ago it was presumed, and I think no longer so, that our brain functioned via neural nets, neural networks, where it's a decision tree,
and you slide down the tree to make an ever more refined decision. On that assumption,
we then mirrored that in our software to invoke neural net decision-making in my field,
in astrophysics. How do we decide what galaxy is interesting to study versus others
in the millions that are in the data set? You just put it all into a neural net that has parameters
that select for features that we might, in the end of that effort, determine to be interesting.
We still invoke that, but I think that's no longer the model for how the brain works.
But it doesn't matter. It's still helpful to us.
You're right. And honestly, that is now how AI is organized
around the new way that we see the brain working.
Yeah.
And why is the brain the model of what should be emulated?
I mean, the human physiological system is rife with baggage, evolutionary baggage.
Much of it is of no utility to us today, except sitting there available to be hijacked by advertisers or others who will take advantage of some feature we had 30,000 years ago when it mattered for our survival.
And today, it's just dangling there waiting to be exploited.
So a straight answer to your question, the free energy principle is really a description
or a recipe for self-organization of things that possess a set of preferred or characteristic
states, coming right back to where we started, which is the bottom of the cereal packet.
If that's where I live, if I want to be there, that's where I'm comfortable,
then I can give you a calculus that will, for any given situation,
prescribe the dynamics and the behavior and the sense-making and the choices
to get you to that point.
It is not a prescription for what is the best place to be, or what the best embodied form of that being should be. It says that if you exist, and you want to exist in a sustainable way, where it could be, you know, a species in a niche in a given environment...
Yes, in a given environment. Yeah, it's all about the relationship. That's a really key point. So the variational free energy that we've been talking about, the prediction error, is a measure of the way that something couples to its universe, or to its world. It's not a statement about a thing in isolation. It's the fit.
Again, if you just take the notion of prediction error,
there's something that's predicting and there's something being predicted.
So it's all relational.
It's all observational.
It's a measure of adaptive fitness.
That's an important clarification, I guess.
Karl, could you give us a few sentences on Bayesian inference?
That's a new word to many people who even claim to know some statistics.
That's a way of using what you already know to be true to help you decide what's going to happen next. Are there any more subtleties to a Bayesian inference than that? I think what you just said
captures the key point. It's all about updating. So it's a way of describing inference, by which people just mean estimating the best explanation probabilistically, a process of inference that is ongoing. So sometimes this is called Bayesian belief updating: updating one's beliefs in the face of new data. And how do you do that update in a mathematically optimal way?
You simply take the new evidence, the new data,
you combine it using Bayes' rule with your prior beliefs
established before you saw those new data
to give you a belief afterwards, sometimes called a posterior belief.
Because otherwise you would just come up with a hypothesis
assuming you don't know
anything about the system, and that's not always the fastest way to get the answer.
Yeah. So you could argue it isn't just important, you can't do it any other way. It has to be a process. It has to be a path through some belief space. You're always updating, whether it's at an evolutionary scale or whether it's during this conversation. You can't start from scratch.
And you're using the word belief the way here stateside we might use the word
what's supported by evidence. So it's not that I believe something is true. Often the word belief is just, well, I believe in Jesus, or Jesus is my savior, or Muhammad. So belief is, I'll believe that no matter what you tell me, because that's my belief.
Right.
And my belief is protected constitutionally on those grounds.
Yes. When you move scientifically through data,
and more data comes to support it,
then I will ascribe confidence in the result
measured by the evidence that supports it.
So it's an evidentiary supported belief.
Yeah, yeah.
I guess if we have to say belief,
what is the strength of your belief?
It is measured by the strength of the evidence behind it.
Yeah, that's how we have to say that.
So Gary, do you have any last questions
before we've got to land this plane?
Yeah, I do.
Because if I think about us as humans,
we have, sadly, some of us have psychotic episodes, schizophrenia.
If someone has hallucinations, they have a neurological problem that's going on inside their mind.
Yet we are told that AI can have hallucinations.
I don't know.
Does AI have mental illness?
AI just learned to lie.
That's all. You know, you ask it a question, it doesn't know the answer,
and it's just like, all right, well, how about this?
That's what we do in school, right?
You don't know the answer.
You make something up, it might be right.
Right, exactly.
What's the answer?
Ah, rockets?
Okay.
Yeah, I was speaking to Gary Marcus in Davos a few months ago, and he was telling me he invented, or applied, the word hallucination in this context, and it became word of the year, I think, in some circles. And I think he regrets it now, because the spirit in which he was using it was technically very divorced from the way that people hallucinate. And I think it's
a really important question that, you know, theoreticians and neuroscientists have to think
about in terms of understanding false inference in a brain. And just to pick up on Neil's point,
when we talk about beliefs, we're talking about sub-personal non-propositional
Bayesian beliefs that you wouldn't be able to articulate. These are the way that the brain encodes, probabilistically, the causes of its sensations. And of course, if you get that inference process wrong, you're going to be subject to inferring things are there when they're not, which is basically hallucinations
and delusions, or inferring things are not there when they are.
And this also happens to some of us in terms of neglect syndromes, dissociative syndromes,
hysterical syndromes.
These can be devastating conditions where you've just got the inference wrong.
So understanding the mechanics of this failed inference, for example hallucination, is absolutely crucial, I think.
It usually tracks back to what we were talking about before
in terms of the ability to select versus ignore different parts of the data.
So if you've lost the ability to ignore stuff,
then very often you preclude an ability to make sense of it because you're
always attending to the surface structure of sensations.
Take,
for example,
severe autism.
You may not get past the bombardment of sensory input in all modalities,
all parts of the scene,
all parts of your sensorium.
It's all alive.
Right.
It's all alive.
Guys, I think we've got to call it quits there. Karl, this has been highly illuminating.
Yeah, man. Good stuff. And
what's interesting is, as much as you've accomplished thus far, we all deep down know it's only just the beginning. And who knows where the next year, much less five years, will take this. It'd be interesting to check back in with you and see what you're making in your basement.
With a Brit, Neil, it's garage. The basement is more the garage. We go out there and create lots of wonderful things.
Exactly.
Okay, Professor Karl, thanks for joining us.
Well, thank you very much for the conversation, the jokes particularly.
You're welcome.
The most beautiful thing I've ever done, the conversation.
Thanks for joining us from London.
Thank you.
Time shifted from us here stateside.
Again, we're delighted that you could share your expertise with us
in this StarTalk special edition.
All right, Chuck, always good to have you, man.
Always a pleasure.
All right, Gary.
Pleasure, Neil. Thank you.
I'm Neil deGrasse Tyson, your personal astrophysicist,
as always bidding you to keep looking up.