Sean Carroll's Mindscape: Science, Society, Philosophy, Culture, Arts, and Ideas - 87 | Karl Friston on Brains, Predictions, and Free Energy

Starting point is 00:00:59 Hello, everyone, and welcome to the Mindscape Podcast. I'm your host, Sean Carroll. And in the course of doing many podcasts, interesting phenomena arise when I look at what the comments are from various sources, whether it's email or Twitter or comments on the web page or on YouTube or Reddit. Different podcast episodes have different spirits in some sense. Sometimes we're big picture, right? We're sort of talking about various ideas from a very high-level view, and it's more inspirational than challenging, right? It's like thinking about things rather than getting a lecture in them.

Starting point is 00:01:40 Other times, we get a little deeper. We kind of get our hands dirty. We get into the weeds. We try to dig into some specific example of something. Either way, I will get complaints, that I know. And you know what? I love and cherish the complaints, because I want constructive feedback. I want to hear what people have to say, but honestly, going forward, realistically, it's going to be a mix of both kinds. So today, we're getting our hands dirty. We're going to get into the weeds. Don't be afraid, you know, don't think that, well, this is going to be a slog or anything like that. This is one of the most fascinating episodes I think that I've done here on Mindscape and perhaps the most useful in a sense I will explain to you in just a second.

Starting point is 00:02:20 Our guest is Karl Friston, who's a neuroscientist at University College London, and Karl Friston is, by many measures, the most influential neuroscientist alive today: the most citations, the highest h-index, all these different quantitative measures of scientific success. He is a practicing psychiatrist and is very interested in schizophrenia, and he serves patients, but he has also contributed to neuroscience more broadly, most obviously in developing techniques for imaging the brain, ideas like statistical parametric mapping, voxel-based morphometry. I really have no idea what these ideas are.

Starting point is 00:03:02 Sorry about that. What I'm interested in is where he's moved more recently in his career, into the theory of how the brain works, and he's been developing an idea called the Free Energy Principle. It's part of a bigger set of ideas called the Bayesian brain: the idea that what the brain is trying to do is to model the world around it and therefore develop a little picture of what's going to happen next using Bayesian inference, something you all are experts on because you've all read The Big Picture or somewhere else. Bayesian inference is getting data in and using that data to update your beliefs about the world.

Starting point is 00:03:36 The free energy principle is Friston's idea for how the brain effectively does that. It turns out that calculationally updating your beliefs about the world can be very, very hard. The free energy principle is a way to sort of simply and quickly get to an effective view of the world. And the basic idea is that the brain is constantly trying to minimize surprise, is trying to develop a model for all the stimuli, all the sensory input that it's going to get, that is least likely to be surprised by something new happening. So that sounds simple, but when you get into it, when you look at the actual way it's supposed to work, it actually turns out to be pretty darn complicated and intimidating. And famously, there's a large

Starting point is 00:04:20 number of people in a lot of different fields, not just neuroscience, but deep learning, machine learning, biologists and physicists, and a whole bunch of people who have trouble really figuring out what this is all about. So I really think that in this podcast, we present quite an understandable picture of what it's all about. There's some jargon, but we explain what the jargon is. Of course, Karl understands what it's all about, but I think that we're able to give enough examples, talk a little bit about why the brain would work this way and what it's supposed to be, why in particular he's interested in it from the point of view of addressing schizophrenia and other problems. To me, of course, I'm biased. I know a lot

Starting point is 00:05:01 about free energy and entropy and measures of information theory, et cetera, but I think we did a good job of uncovering what's going on here. You know, there's no equations in the podcast, but the ideas, I think, are out there. This is the kind of episode that really repays listening closely. I think you can learn a lot and learn about something that is really at the absolute cutting edge of modern neuroscience.

Starting point is 00:06:08 I also want to mention a tiny little announcement. You know that I have a Patreon account that you can find on the web page, preposterousuniverse.com slash podcast. There's a link to the Patreon. One of the benefits that Patreon users get is that they get a monthly Ask Me Anything episode.

Starting point is 00:06:44 So Patreon users ask me questions. I try to answer as many of them as I can. And someone on Patreon suggested that even though it makes sense that Patreon supporters get the ability to ask questions, the ability to listen to the answers might be more widely shared. So I'm working out a way to do that. Every month, there's like a two or three hour episode that I put on Patreon. And right now, it is only for Patreon listeners. But going forward, I'm going to try to figure out how to make those answers,

Starting point is 00:07:14 available to anyone. Right now, the only way to do it is going to be to go to the Patreon page every month when they appear. But hopefully I'll figure out a way to put them into the regular podcast feed. The trick is that it costs me a lot of money to put two or three hours worth of gigabytes of data onto the podcast feed because my host is really expensive. It gets paid for by the ads. It's not clear whether we can get ads to pay for the AMA or not. Let me know, you know, especially on the comments in the blog post associated with this particular episode, whether or not this is a good idea at all, whether people would be interested, would it make sense just to include like one hour's worth of answers,

Starting point is 00:07:56 then you can go to the Patreon page to get the rest, etc., etc. But I think it's a different kind of thing. It's not going to replace regular Mindscape episodes, but it might be a different way to get some ideas out there and talk about them. And with that, let's go. Karl Friston, welcome to the Mindscape podcast. Thank you. I'd like to be here.

Starting point is 00:08:31 I figured we would talk about this thing called the free energy principle, which you've been investigating and championing for a while now. But maybe just to get there, rather than just start by defining that, could you just explain what is the problem that we're trying to solve? What is the question that we're trying to answer by talking about free energy? From my perspective, it's trying to find a first-principles account of sentient behavior, and just very practically that's relevant because of my background, which is as a psychiatrist. So very simply, and as sort of Richard Feynman says,

Starting point is 00:09:04 if you want to understand something, you've got to be able to build it. If you want to understand psychiatric patients, you have to, in some minimal way, be able to build or simulate sentient behavior that goes wrong. Okay. So that's basically how I got into it. So you were actually a practitioner with patients and the whole business? Oh, yes. Transitioned out of that.

Starting point is 00:09:25 Yes. It was slowly but surely, yes. I transferred my angst from patients to students. Okay. But I did spend, you know, an early part of my life in a therapeutic community with 30 chronic schizophrenics, which was an eye-opener for several years. I can imagine. Yeah. Okay.

Starting point is 00:09:43 And so that's, so we want to understand how the mind works, how the brain works, in part to help fix it when it goes wrong. That's the ultimate agenda. But, of course, I got seduced away from clinical psychiatry into systems neuroscience and brain imaging. And the question became slightly less focused and more how does the brain work. And that became relevant when you're trying to characterize

Starting point is 00:10:07 or analyze neuroimaging time series from things like functional magnetic resonance imaging or electroencephalography. So to make sense of these data, you have to have some conceptual generative or forward model of what's actually under the hood. Sure, otherwise it's just a bunch of time series data, yeah. And neuroimaging is sort of where you made your money, as it were, right?

Starting point is 00:10:28 That was my day job for a while. And thinking about the grand theory of the brain is what we're doing now. So good. So that's enough for me to dive into free energy, except that you had this lovely story that I've heard about woodlice when you were young and seeing them scurry around. That does set the stage very nicely. If you could tell our listeners that story. Right. Well, that was my first, looking back, my first sort of scientific insight. So it was a hot summer's day and I was, must have been about between five and eight years of age, playing in the garden and just became preoccupied by watching little woodlice scurrying around, noticing that they tended to avoid sunlight, that they ended up underneath bits of rock or

Starting point is 00:11:17 wood in shady places. And just looking at this, I thought that's interesting because that looks like purposeful behavior. It looks as if they are purposefully, in a goal-directed way, avoiding the sunlight. But then there was another interpretation that came to mind. Well, yes, but you would also see exactly the same phenomenology if they just moved more quickly when they were warmed up by the sun. So there was a more deflationary account. I didn't use these words at the age of five. The essence of the insight was, well, there's a much simpler explanation of what's going on here, for this sort of very elemental form of self-organisation, ensemble dynamics. There's a simple explanation. Things just move faster when they're hotter.

Starting point is 00:12:06 That notion, that sort of deflationary, simple, almost verging on a tautology explanation for self-organising behaviour, kept sort of re-presenting itself throughout my education. So natural selection, I think, is another nice example of that. If it doesn't work, you move in some phenotypic space, right through to physics at Cambridge and sort of, you know, density dynamics of the Fokker-Planck and quantum physics. Again, if it's not good, if it has a high potential, just get out of there. Were you a physics undergraduate?

Starting point is 00:12:42 Is that what you studied? Natural sciences, yeah. So half psychology and half physics, quantum physics. I had no idea. So the audience knows, I don't seek out people who are physics undergraduates, but I find them in all sorts of fields. Apparently, yes. But in a deflationary way, just because you move away from people who are physicists.

Starting point is 00:13:00 That's right, exactly. So, but I love that kind of story because it's an example of what I talk about a lot in The Big Picture, this emergence of purportedly higher, well, purportedly purposeful, teleological, goal-directed behavior out of, you know, things just obeying the laws of physics one way or the other, right? And so that's the kind of thing that you saw in your undergraduate education, and even today, I presume, in studying the brain? Yes, well, I mean, at its heart, that is the free energy principle that we've been promoting. It is very much this sort of deflationary, getting back to the first principles,

Starting point is 00:13:40 and then rebuilding up on that and seeing what would this kind of behavior look like in a sufficiently itinerant context? How would it be fit for purpose to explain the behavior of you and me, or starting at a slightly simpler level, would it be fit for purpose in explaining a thermostat or a virus or something, an ensemble of bacteria or the like? So it's a question of how far can you get from first principles as a principled account of sentient behaviour? So we get back to the human brain.

Starting point is 00:14:13 And of course that, in terms of writing down the dynamics, the mechanics, and specifically in mathematical terms, becomes a necessity if you want to write down formal models of brain imaging time series. So there's an interesting dialogue between the rather self-indulgent theoretical neurobiology side of it, what I often refer to as the work that you do at the weekend when the pressures are off, and the day job, which is analyzing brain data. Both sides inherit from each other in a very interesting way, in that you're constantly building models of how the brain works

Starting point is 00:15:00 and then testing those models in relation to the empirical data you get from brain imaging, which forces you, puts pressure on you to actually sort of think, well, what's the simplest sort of dynamical, functional, or computational architecture that could possibly explain these data when I look at this sentient creature, usually a normal subject, human subject, when exposed to these experimental manipulations. And yet on the other side,

Starting point is 00:15:32 the very algorithms or schemes or data analytic approaches that you apply to make the inference about, is this the right model of the brain, or is that the right model of the brain, themselves now inherit from the theoretical work. Because if you can solve how a brain works, that's the best sort of data analysis machine you can possibly have. And that helps with analyzing the actual data. Has it affected what data you collect? Yes, yes.

Starting point is 00:16:02 I mean, that's true in both senses. It's true in terms of me as a sentient creature. You're an example. Looking around, I'm literally collecting visual data as I saccade around the room, interrogating your face, trying to anticipate whose turn it is to talk, whether you've understood me. So I'm selectively sampling the right kind of data to resolve uncertainty about those hypotheses that are relevant to my behaviour at the moment.

Starting point is 00:16:30 And in exactly the same way as a neuroimaging scientist, I design those experiments to solicit the right kind of data that resolve my uncertainty about my hypothesis, about the functional integration of the hippocampus with the prefrontal cortex. Same stuff. It's exactly the same principles. Good. Except now I'm self-conscious that everything that I do, you're in a different way. I mean, of course you're going to be looking at my face in a different way than you're looking at the desk in front of you, which you see every day, right? The surprise, or the new information, is much larger for something like that. So, okay, good.

Starting point is 00:18:30 What is free energy when you think about it? Because I'm a physicist and I have a definition and I think it's a little bit different than yours, but there's a mathematical relationship. So I want to clear up for everyone in the audience what we mean when we use these words. Right. So when I talk or when people, when I say I, it's a little bit self-centered, when people in things like machine learning talk about free energy, they mean variational free energy. So technically, if you were talking to somebody from machine learning, we're talking about an evidence bound. They switch the sign in machine learning. So often

Starting point is 00:19:07 called an evidence lower bound, acronym ELBO, which confused me for an enormous amount of time. I literally thought it was part of your body. That's asking for trouble, really. So it's a statistical, information-theoretic concept, which licenses your previous discussion about hypotheses and inferring this and rich information. So it scores basically a bound on, technically, the self-information or the negative log probability of some data given a model of how those data were generated. So from a pure... So sorry, that's the crucial thing.

Starting point is 00:19:51 There's a model and there's data. And we would like them to match. Yes, absolutely. So, well, let's just rehearse that. That's absolutely fundamental. So everything that we talk about, either in terms of sort of sentient behavior and the free energy principle as applied to sentient artifacts, or in terms of actually analyzing data, rests upon this notion of a generative

Starting point is 00:20:16 model. Generating what? Generating data. Generating sensations. Generating any observables. And an even more simple expression of that is you have to have some way of articulating a mechanics of causes and consequences. So the generative model causes or is a description of how causes generate consequences. And in this instance, the consequences are sensory observations or data, observables, measurables. The causes are the sort of the latent variables, features, structures, whatever you want to call them, that are responsible for generating those things.

Starting point is 00:21:01 So central to the free energy as a scalar functional is a generative model. So it's only defined in relation to a generative model. So that tells you immediately the free energy, the variational free energy, is a functional, a function of a function, and the function that it is a function of is a probability distribution or a belief. So this is your brain giving probabilities to the various things it might experience out there in the world, and the free energy is a way of measuring, I'll just say relating that prediction for what you see to what you actually see,

Starting point is 00:21:38 and then sort of what measure is it? What does it characterize really? Well, exactly as you defined it. It's a measure of the surprise that you would have, and I'm using surprise in a sort of, you know, well, you actually used it in a folk psychology sense, though, but literally the surprise or the self-information. There is a technical

Starting point is 00:22:02 version of it, which is in this case pretty close to the folk version. Yes, it is. Absolutely. So it's the surprise that you would associate with a bunch of data, given a belief or a model about how those

Starting point is 00:22:20 data were generated. So technically it's also called the marginal likelihood or the logarithm, the negative logarithm of the marginal likelihood, and that marginal likelihood is also called model evidence. And all this rhetoric becomes important because you can gracefully move from one interpretational stance to another one without making any mathematical moves whatsoever. Right.
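To make the quantities named so far concrete (surprise as the negative log marginal likelihood, and variational free energy as a bound on it), here is a minimal Python sketch. The two-state model, its numbers, and the function names are illustrative assumptions and do not come from the conversation.

```python
import math

def surprise(prior, likelihood_of_obs):
    """Negative log marginal likelihood (model evidence) of one observation."""
    evidence = sum(p * l for p, l in zip(prior, likelihood_of_obs))
    return -math.log(evidence)

def variational_free_energy(belief, prior, likelihood_of_obs):
    """F[q] = sum_s q(s) * (ln q(s) - ln p(o, s)), an upper bound on surprise."""
    total = 0.0
    for q, p, l in zip(belief, prior, likelihood_of_obs):
        if q > 0:  # terms with q(s) = 0 contribute nothing
            total += q * (math.log(q) - math.log(p * l))
    return total

def exact_posterior(prior, likelihood_of_obs):
    """Bayes' rule for a discrete hidden state."""
    joint = [p * l for p, l in zip(prior, likelihood_of_obs)]
    evidence = sum(joint)
    return [j / evidence for j in joint]

# Hypothetical model: two hidden causes, equally likely a priori,
# and the probability each assigns to the observation actually made.
prior = [0.5, 0.5]
likelihood_of_obs = [0.9, 0.2]

posterior = exact_posterior(prior, likelihood_of_obs)
s = surprise(prior, likelihood_of_obs)
f_prior = variational_free_energy(prior, prior, likelihood_of_obs)
f_post = variational_free_energy(posterior, prior, likelihood_of_obs)
```

With any belief other than the exact posterior, the free energy sits above the surprise; the gap closes only when the belief equals the posterior, which is why minimizing free energy looks like maximizing model evidence.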

Starting point is 00:22:42 The math carries over. It's just exactly the same. But depending upon how you grew up or how you appreciate these quantities or the rhetoric that you would interpret them, you get a very different look and feel to the fundamental behaviour of systems that look as if they are minimizing their variational free energy. So if this surprise is interpreted from the point of view of a statistician, so the brain as a statistical organ, a little scientist inside your head, then if it is in the game of

Starting point is 00:23:19 minimizing its variational free energy, its evidence bound, that makes it appear as if it is maximizing model evidence. What does that mean? Well, it will look as if it's gathering information in the service of seeking evidence for its own existence. And this translates nicely into a philosophical concept, self-evidencing. So another way of looking at this purely mathematical sort of behavior is in terms of self-evidencing.

Starting point is 00:23:56 You have lovely little phrases like, you know, people going around gathering evidence for their own existence, which is, you know, mathematically a truism. So that's certainly one. Or you can look at it the other way around. We're trying to minimize surprise. So what would that look like if you were in, you know, you've been taught to think about things in terms of information theory and uncertainty? Well, mathematically expected surprise, expected self-information, is entropy. Entropy is one way of describing uncertainty. So what does that mean? It means that it looks as if this creature or this artifact or this system is gathering information in the service of resolving uncertainty. Uncertainty in relation to its model of how it thinks the

Starting point is 00:24:40 world works and in particular how it is situated within that world and sampling from that world. So that brings us back to your phrase earlier on about looking around, not looking at my desk, because there's no rich information there. I'm going to be surprised, hopefully. Just to reassure you, I haven't got my glasses on, so I can't see. It doesn't look very surprising to me. So, you know, what is another way in information theory of articulating the notion of rich information? It's just minimising uncertainty.
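The statement that expected surprise is entropy can be checked in a few lines. The sketch below is illustrative only; the function names and example distributions are assumptions, and natural logarithms (nats) are used throughout.

```python
import math

def self_information(probability):
    """Surprise of a single outcome, in nats."""
    return -math.log(probability)

def entropy(distribution):
    """Expected self-information: each outcome's surprise weighted by its probability."""
    return sum(p * self_information(p) for p in distribution if p > 0)

uncertain_belief = [0.5, 0.5]    # maximally unsure between two outcomes
confident_belief = [0.95, 0.05]  # nearly made up its mind

h_uncertain = entropy(uncertain_belief)
h_confident = entropy(confident_belief)
```

A belief that has resolved its uncertainty has lower entropy, which is the sense in which gathering rich information amounts to minimizing uncertainty.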

Starting point is 00:25:15 It's maximising the relative entropy, as scored by, technically, the KL divergence, that is part of an expected free energy. But put more simply, it just means that minimizing free energy or making moves that will minimize free energy in the future simply means maximizing information gain or optimizing some divergence measure, literally making your mind up in the sense of moving a belief,

Starting point is 00:25:49 moving a probability distribution from one prior belief to a posterior belief, and the more information that you can have at hand, you will assimilate, and the KL divergence will in this instance be greater, and that's a very important part. And when you actually write down the imperatives for action, that becomes a very, very important part. And this kind of makes intuitive sense, right,

Starting point is 00:26:14 that our brain would like to carry around with it a model of the world such that it typically looks around and says, yes, that's more or less what I would have expected, right? And so the free energy is the difference between what it's sort of expecting and what it's seeing. So it wants to minimize that. And it's completely information theoretic, which I need to say out loud because, of course, the word energy appears in it. And as a physicist, we have a notion of free energy that it's a kind of energy you can use to do useful work. And in fact, in the thermodynamic system, when you maximize entropy, you're minimizing free energy and vice versa.

Starting point is 00:26:49 If you have a box of gas that is in equilibrium, it has no free energy, lots of entropy. If it's all bundled up on one side, it's the opposite. But you're looking at a context in which the brain is both minimizing a kind of entropy and minimizing a kind of free energy. That's a great paradox. Now, I shall try to unpack it. Yes. But we are right in the depths of the mechanics.

Starting point is 00:27:14 It's the weeds. Yeah. That's okay. So I apologize in advance. So let's just back up and... It'll pay off later. We'll bring you down to Earth. Don't worry.

Starting point is 00:27:23 Okay. So let's just start from... You've gone to university. You've learned that free energy is that amount of energy that is available to do work, that is not locked into the entropy. So the free energy is the expected energy

Starting point is 00:27:37 minus the entropy. So what that would suggest in terms... So the first thing to acknowledge is that the form of the variational free energy is formally identical to a sort of, you know, a Gibbs or thermodynamic equation. The same equation. The only move you make is you drop Boltzmann's constant from the Shannon entropy. That's all you do.
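For readers who want the formal parallel written out, here is a sketch in standard textbook notation. The symbols are notational assumptions not spoken in the episode: $q(s)$ is the belief over hidden causes $s$, $o$ is the observed data, and $p(o, s)$ is the generative model. The thermodynamic expression is shown in its Helmholtz form with $k_B T$ set to one.

```latex
F_{\text{thermo}} = U - T S
\qquad \text{versus} \qquad
F[q] = \underbrace{\mathbb{E}_{q(s)}\big[-\ln p(o, s)\big]}_{\text{expected energy}}
     - \underbrace{H\big[q(s)\big]}_{\text{entropy}}
     = D_{\mathrm{KL}}\big[q(s) \,\|\, p(s \mid o)\big] - \ln p(o)
     \;\ge\; -\ln p(o)
```

The last inequality holds because a KL divergence is never negative, so the variational free energy is an upper bound on the surprise $-\ln p(o)$.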

Starting point is 00:28:01 We set it equal to one anyway, so it doesn't make any difference even there. There's no difference between us. We are speaking the same language. And on that view, what does that mean? Well, that means if you now write down the energy as a potential energy, and I'm thinking now from the point of view of a statistician, for example. What's the potential energy that gets into the variational free energy? Well, it's effectively the accuracy.

Starting point is 00:28:30 So it's the negative log probability of getting these data, given the parameters of my generative model, given all the variables, the quantities, the structure of my model of what could have caused those data, my hypothesis, if you like. That is parameterized, and that's quite important. So the accuracy is basically the surprise that we were talking about before. I should say the way that you described, you know, it makes sense having a model of the world, we can make predictions, and then we can test those predictions against sensory impressions. It's absolutely spot on.

Starting point is 00:29:07 And indeed, there's a whole industry, both in terms of data compression and engineering, but also more recently in neuroscience, and particularly cognitive neuroscience, that is predicated on predictive coding, which is exactly that. So prediction errors are just the mismatch between what your generative model predicted and what you actually sample. You take the sum of squared prediction errors, you weight them by some precision or inverse variance. That is basically free energy. Yeah, it's just the, you know, under some simple assumptions about the generative model

Starting point is 00:29:39 and, you know, the nature of random fluctuations. So predictive coding is one instance of the more general notion that we're in the game of minimizing our free energy. So coming back to this, you know, the physicist's conception of free energy in systems that have attained equilibrium. What does it then mean to minimize the free energy? Well, you're trying to minimize your energy, which is maximizing your accuracy, or minimizing the surprise, averaging out your ignorance about the parameters. And then you're trying to maximize the entropy. Now that may seem paradoxical, which is why it's good. Yeah. Or why I apologize that we're going to have to resolve. Because I've just said,

Starting point is 00:30:27 if you remember, minimizing uncertainty by choosing the right moves that will get the most information, resolve the greatest uncertainty, looks as if it's trying to maximize the information gain or the relative entropy. But now I'm saying, well, minimizing free energy will require a maximisation of entropy from the physicist's point of view. And that's absolutely right. So the key distinction is basically what I do at this moment in time will always be to maximise my accuracy or my energy in terms of minimizing this sort of prediction error,

Starting point is 00:31:09 whilst at the same time maximising my entropy, keeping my options open, because the entropy that we're talking about is an attribute of a belief about the causes of my data. So this is not an entropy measure of the brain biophysically. Right. It's not the molecules in your brain that we're treating as a thermodynamic system. It's a set of beliefs. Absolutely. So these beliefs are then driven to be as broad as possible, entirely consistent with the second law of thermodynamics:

Starting point is 00:31:49 there's an imperative, a drive to disorder. The entropy will increase, but it's the entropy of our beliefs. Now what does that mean? Well, it basically means I'm trying to find a low-energy explanation for my data

Starting point is 00:32:05 whilst at the same time keeping my options open. So this is essentially Occam's razor. It's basically not committing to a very precise posterior belief: if I've seen these data, then I believe this caused it. So you don't want to commit to a very precise one. So you've got to find the simplest, the most accommodating explanation for your data.
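The two ingredients described here, an energy term built from precision-weighted squared prediction errors and an entropy term that rewards keeping options open, can be sketched for the simplest Gaussian case. Everything specific below is an illustrative assumption: a single hidden cause with a Gaussian prior, an observation equal to that cause plus Gaussian noise, a Gaussian belief with mean `mu` and variance `var`, and all function names and numbers.

```python
import math

def gaussian_free_energy(mu, var, obs, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
    """Expected energy minus entropy for a Gaussian belief N(mu, var)."""
    # Expected energy: precision-weighted squared prediction errors,
    # averaged over the belief (hence the extra `var` in each numerator).
    sensory_term = 0.5 * math.log(2 * math.pi * noise_var) + ((obs - mu) ** 2 + var) / (2 * noise_var)
    prior_term = 0.5 * math.log(2 * math.pi * prior_var) + ((mu - prior_mean) ** 2 + var) / (2 * prior_var)
    # Entropy of the belief: broader beliefs score higher.
    belief_entropy = 0.5 * math.log(2 * math.pi * math.e * var)
    return sensory_term + prior_term - belief_entropy

def exact_posterior(obs, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
    """Closed-form posterior mean and variance for this linear Gaussian model."""
    precision = 1 / prior_var + 1 / noise_var
    mean = (prior_mean / prior_var + obs / noise_var) / precision
    return mean, 1 / precision

def surprise(obs, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
    """Negative log evidence: the observation is marginally N(prior_mean, prior_var + noise_var)."""
    total_var = prior_var + noise_var
    return 0.5 * math.log(2 * math.pi * total_var) + (obs - prior_mean) ** 2 / (2 * total_var)

obs = 2.0
post_mean, post_var = exact_posterior(obs)
f_best = gaussian_free_energy(post_mean, post_var, obs)
f_overconfident = gaussian_free_energy(post_mean, 0.01, obs)  # too precise a belief
f_vague = gaussian_free_energy(post_mean, 5.0, obs)           # too broad a belief
```

In this toy model an overly precise belief is penalized through the entropy term and an overly vague one through the expected prediction errors; the free energy bottoms out at the exact posterior, where it equals the surprise.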

Starting point is 00:32:28 So that's where the paradox, if you like, in terms of whether you're trying to maximize or minimize the entropy term comes in. But I think more fundamentally it makes the point that the entropy we're talking about at the moment is a functional, or a scalar functional, of a belief about something. It's not the thing that is encoding those beliefs.

Starting point is 00:32:49 It's not the neuronal firing or the molecules or all the atoms. The twist, the sort of slightly paradoxical aspect of this, is when you move into the future and when you have beliefs about the consequences of an action, say looking over there or Googling a certain entry, or going to Wikipedia. Before you make that action, you have beliefs about how your free energy is going to change. And at that point, your entropy or your relative entropy is effectively switched around

Starting point is 00:33:26 because the outcomes now become random variables. You have to take an expectation. This is a bit technical, but it's a beautiful little bit of techy stuff, which basically flips this imperative to minimize these entropy and relative entropy terms. That is when you're applying it to the system as it is now, as it is behaving. When the system looks at itself,

Starting point is 00:33:48 say, well, how would I have to now act in order to minimize my free energy in the future? I now include basically outcomes as random variables, and they get into the expectation operators. And suddenly you're then in the service of minimizing it. So there's this sort of yin-yang, which means that as I'm currently processing my data, I'm striving to find the explanations that maximize my uncertainty,

Starting point is 00:34:14 because I don't want to commit to a particular belief. Yet at the same time, I'm going in the exactly opposite direction. I'm trying to sample those data that will shrink my uncertainty. And then when I find that balance, then we have this sort of active self-evidencing.

Starting point is 00:35:08 Let me try to put it in my words and you can tell me if I'm coming close here. I mean, we want to minimize the times that we're surprised.

Starting point is 00:35:38 You might think, well, just predict the thing you think is most likely to be true all the time. But if you put 100% probability on that, then any time you're not exactly right, you're hugely surprised and that's bad. So you might say, well, let's do the other thing. Let's have no beliefs about anything. Let's say anything could happen. But then in some sense, especially when you go through the math, you're always surprised. Everything that happens is a little bit unlikely because it could have been anything else. And so there's this compromise where you sort of try to home in on something you think

Starting point is 00:36:08 most likely, but you do give allowance for the deviations. Absolutely. Beautifully put. As a physicist, you could understand what you've just said as basically: how do we describe systems that have some itinerancy, but they still restrict themselves to a limited part of their phase space or their state space? So it's not one point. We're not little marbles or moon rocks, nor are we gases.
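
The compromise described above, between betting everything on one outcome and believing anything could happen, can be illustrated with a small numerical sketch. This is only an illustration, not something from the conversation: the three-outcome world, its frequencies, and the names `expected_surprise` and `beliefs` are all assumptions made up for the example. Surprise is measured as the negative log of the probability the belief assigned to what actually happened.

```python
import math

def expected_surprise(true_probs, belief):
    """Average surprise, -log belief(x), when outcomes actually follow true_probs."""
    total = 0.0
    for p, q in zip(true_probs, belief):
        if p == 0:
            continue  # outcomes that never occur contribute nothing
        if q == 0:
            return math.inf  # an outcome we ruled out actually happens
        total += p * -math.log(q)
    return total

# Hypothetical world: three possible outcomes with these true frequencies.
true_probs = [0.7, 0.2, 0.1]

# Hypothetical beliefs, from fully committed to fully uncommitted.
beliefs = {
    "certain": [1.0, 0.0, 0.0],               # 100% on the most likely outcome
    "near_certain": [0.98, 0.01, 0.01],       # almost fully committed
    "anything_goes": [1 / 3, 1 / 3, 1 / 3],   # no beliefs about anything
    "calibrated": [0.7, 0.2, 0.1],            # homes in, but allows deviations
}

results = {name: expected_surprise(true_probs, q) for name, q in beliefs.items()}
for name, value in results.items():
    print(f"{name:>14}: {value:.3f} nats")
```

Under these assumed numbers the fully certain belief is infinitely surprised whenever it is wrong, the uniform belief is moderately surprised by everything, and the belief that matches the true frequencies has the lowest average surprise.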

Starting point is 00:36:38 But we are, we certainly have boundaries and we have a shape and a form in the sense of, you know, those parts of the state space we could occupy or where the woodlice are running around. You know, there's a structure there. So that structure, if it is the case that these sorts of systems exist, things like you and me or anything actually exists in a non-trivial way. By non-trivial, I mean basically having an attracting set that is not just being in one state. Deterministic. Absolutely, yeah. Then there has to be this balance between this itinerancy, this sort of, you know, the exploration of a phase space in a structured way where many regions are visited, but some regions are visited more often than other regions. And when you start to think about, well, how would you articulate that mathematically? You start to get sort of random attractors

Starting point is 00:37:37 that inherit from dynamical systems. The way you're saying there's an attracting set out there. It is not, it's a probability distribution. If I measure the probability of finding me at any point in my state space over a very long period of time, it's certainly not uniform. I have a shape. Some things are more likely that. Absolutely.

Starting point is 00:38:01 And yet, I am not a fixed point. I am not just in one state. So what you are talking about effectively is a non-equilibrium steady state. So it is not the kind of equilibrium steady states that physicists know and love and were taught in the 20th century. This is the new challenge of understanding non-equilibrium steady state in open systems that exactly have this delicate balance between being a point attractor and being completely diffused at the end of the universe. different. That's funny. Yeah.

Starting point is 00:38:36 Where it's going. Just parenthetically, I did have Antonio Damasio on the podcast a little while ago, and his favorite word in the universe is homeostasis, right? The idea of keeping within this tiny little range as much as we can, but some flexibility there. Yeah. Absolutely.

Starting point is 00:38:50 And you want to, correct me if I'm wrong again, but you really want to say that this minimization of free energy and surprisal is sort of the key to unlocking what the brain does, right? It's the underlying thing for most everything. Yeah, it is. Absolutely. I mean, there's a joke in my group meetings that the answer to any question is model evidence. Usually, it's a do worse, yeah. And certainly when you're talking about colleagues and research fellows and, you know, whenever they ask the question, well, what happens if this is like that? Well, get the evidence for that hypothesis and then, you know, and then you can quantify your beliefs about whether it was like that or whether it broke like that or whether it was. that difference is in play.

Starting point is 00:39:37 So can I come back with another parenthesis because I think it nicely follows on from Antonio Damasio's focus on homeostasis. Of course, the roots of much of self-organization to non-equilibrium steady state inherit from the work of people like Ross Ashby who made apparent his ideas through the homeostat.

Starting point is 00:39:58 So it's exactly the same, or the good regulator theorem. I think that this is a, This is just my personal opinion. I think that physics has a lot of work to do and there's a lot of discoveries to be made along exactly these lines. You know, where non-equilibrium statistical mechanics is a booming growth field. I try to get my colleagues excited about it, but it's a, you know,

Starting point is 00:40:21 they're used to doing their things. It's a tricky kind of pre-paradigmatic area where we don't exactly know what the rules are. I don't know because, you know, I became a psychiatrist, and left the physics behind, but certainly surfing the web, that pre-paradigmatic thing, that's very exciting, though, isn't it? Because that's the next thing. Yes, exactly. That's 21st century physics.

Starting point is 00:40:45 It's much more comfortable when you know, like in particle physics or cosmology, where I was raised, you know what the questions are, you know what would qualify as an insight. Whereas in complex systems, dissipative systems, Ilya Prigogine was an early pioneer here also, right? Self-organization, and we just don't know what words to use. We don't know what equations to use, but I think like you, I'm a fan of these sort of statistical mechanical lenses through which to view these things. Yes, well, I'm sure that's the only way, really, too. Well, for me, it is the only way to write these things down

Starting point is 00:41:18 because at the end of the day, you actually have to have a model in code that generates predictions to analyze the brain data. So there is no other, you don't really have the, unless you're a philosopher, or you write books. You are. There is no... You need to be able to give it to a computer and ask it how well you're doing. Yeah. But let's just do sort of the reality check for this perspective here.

Starting point is 00:41:41 I certainly get that if I'm driving down the street, I would like to be surprised as little as possible. But isn't it true that also informally, I do sometimes seek out new experiences, right? How does that fit in? Well, it's exactly that sort of paradox that we're addressing technically in terms of the difference between me at this point in time, this moment,

Starting point is 00:42:01 minimizing my free energy via a maximisation of the entropy and uncertainty of my explanations or beliefs about what's going on now, and choosing those actions that will in the future minimize my expected free energy. So that basically means that when you talk about the expected free energy conditioned upon a particular action, a move, looking over there when driving the car, you have this opposite imperative. So now you become information seeking. Now you become a curious creature. Now you become sensation seeking.
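
The information-seeking imperative described here can be sketched as choosing the action whose outcome is expected to resolve the most uncertainty. The sketch below is a toy under stated assumptions: it computes only the expected information gain (the mutual information between a hidden state and the outcome of an action), which is one ingredient of expected free energy and not the full quantity. The traffic-light scenario, the probabilities, and the names `expected_info_gain` and `actions` are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(x * math.log(x) for x in p if x > 0)

def expected_info_gain(prior, likelihood):
    """Expected reduction in uncertainty about the hidden state.

    likelihood[s][o] is P(outcome o | state s) if a given action is taken.
    """
    n_states = len(prior)
    n_outcomes = len(likelihood[0])
    gain = entropy(prior)
    for o in range(n_outcomes):
        # Predicted probability of this outcome before acting.
        p_o = sum(prior[s] * likelihood[s][o] for s in range(n_states))
        if p_o == 0:
            continue
        # Belief about the state after seeing outcome o (Bayes' rule).
        posterior = [prior[s] * likelihood[s][o] / p_o for s in range(n_states)]
        gain -= p_o * entropy(posterior)
    return gain

# Hypothetical hidden state: the traffic light is green or red.
prior = [0.5, 0.5]

# Hypothetical actions and how informative their outcomes are.
actions = {
    "look_at_light": [[0.9, 0.1], [0.1, 0.9]],  # outcome tracks the state
    "look_away": [[0.5, 0.5], [0.5, 0.5]],      # outcome says nothing
}

gains = {name: expected_info_gain(prior, lik) for name, lik in actions.items()}
best_action = max(gains, key=gains.get)
print(gains, best_action)
```

In this toy setting, looking at the light is preferred simply because its outcomes are expected to shrink uncertainty, while looking away leaves beliefs exactly where they were.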

Starting point is 00:42:39 But it's a particular kind of sensation seeking. It's those sensations that would resolve uncertainty about what would happen if I did that. Because behind a lot of the ways I've just described that is a sense of me as an agent. Sure. So once you generalize the non-equilibrium physics of sentient systems to write down what might be the imperative for the way that they act upon the world,

Starting point is 00:43:05 that they evidence their agency. And you make the assumption that, or you try to prove it can be no other way, that they will act in the service of minimizing the long-term average of free energy in the future. Then you get this curiosity. It's written into the information theory. To resolve uncertainty means you're going to be sensation-seeking. Whether that's a sensation-seeking of a banal sort that you're looking at the traffic lights or the street lamps

Starting point is 00:43:35 to see whether it's go or stop or whether it's going to a disco or doing bungee jumping, it's a different, but it's the same imperative underneath. But this is the response to sort of the wise crack that if we just wanted to minimize a surprise, we would sit in a dark room and not do anything, right? But how innocent is that move to go from minimizing surprise

Starting point is 00:43:55 to minimizing the expectation value of all my future surprises? That seems like a little bit of a different minimization. Is that fair? Which one is it that we're doing? Well, in a minimal sense, you're doing both. But you're sort of touching on the issues that normally come to the end of the conversation.

Starting point is 00:44:18 What is the difference between you and a virus? So to cut to that, that distinction before revisiting the fundaments of minimizing expected free energy in the future. It may be the case that certain systems, certain, certain, yeah, biotic systems, creatures basically, have acquired the capacity to include, in store, in their generative models, the prior belief that they are free energy minimizing creatures. And if they can do that, then they will have the prior belief that the way that I will act will be to minimize my free energy.

Starting point is 00:45:07 That was how you might write it down as a physicist. I see where this is going, yes, good. If you were a psychologist, all you're saying is, I have the capacity to plan. Yeah. So that's all you're saying, really. What to imagine. Exactly, yeah. So you have sort of, yeah, perfect.

Starting point is 00:45:23 So your generative model is now equipped with the capacity to imagine a counterfactual or fictive future, to imagine the future, to roll out possible consequences of actions. And of course, the consequences of those actions in terms of the observations now are random variables because they haven't happened yet. And that's why you get this reversal. And suddenly you become a creature that seeks out sensations, literally sensation seeking. That resolve uncertainty, you become curious and you go to your discos and you do your bungee jumping at a certain age. So that's, I think, an important distinction between very simple attracting sets. Let's go right back to the moon rock. Okay.

Starting point is 00:46:10 So an appropriate description of that non-equilibrium steady state is in fact an equilibrium steady state, and that's basically got a point attractor. It could have a quasi-periodic attractor, so the orbiting of the planets. But these are very simple, non-itinerant attracting sets that would approach an equilibrium steady state. Then we move up to systems right from sort of rocks through to, yeah, you may even go as far as some, not insects, but certainly say viruses. So things that don't plan.

Starting point is 00:46:45 But they live. They're very effective occupying our universe. But they live in the moment in some sense. Absolutely. In Jerry Edelman's words, there is no remembered present, there is no imagination, there's no planning. They have all the mechanical finesse of a thermostat, but they're very good at what they do. Homeostasis, yeah. I mean, that's a nice, beautiful example.

Starting point is 00:47:09 So even in our own bodies, possibly 99% of all that actually goes on in terms of physiology, which is homeostasis, is just this reflexive in the moment, keeping yourself in the... Regulating ourselves. Regulating yourself. So those kinds of systems are distinct, I think, from systems like you and me that start to plan and have the ability to...

Starting point is 00:47:38 Their generative models do actually span the future and by implication the past. They implicitly have a dynamics where the trajectories actually go quite a long way, in their generative model, quite a long way into the future. So that would be a really interesting

Starting point is 00:47:54 sort of way to take forward the argument or a response to your question. But your question is slightly, just to return to say, well, is it minimizing free energy or is it minimizing the free energy conditioned upon a particular action?

Starting point is 00:48:12 Expected action. It is both. The degree to which you actively minimize your uncertainty depends really upon the shape of the attractor that we're talking about, the attracting set that we're talking about. So with itinerant systems, it is possible to sort of write down the density dynamics and work out the probability distribution of trajectories of action into the future. So action is just another state. You can apply fluctuation theorems or path integral formalisms to work out a distribution over

Starting point is 00:48:54 trajectories of action into the state into the future, my apologies. Which means that technically what you can do now is characterize a given system in terms of probability distributions over courses of action, policies, into the future, trajectories or paths of action under a model of what consequences that particular action would have for my sensory input, for example. When you write that down, there is a way of showing that that is essentially a description of systems that do minimize their expected free energy in the sense of just minimizing the KL divergence between where they think they are,

Starting point is 00:49:54 probabilistically, and their attracting set. So that's one part of expected free energy. There is another part which is all about ambiguity reducing, which kicks in when there are particular constraints on the shape of this probability distribution that you and I evince just by existing. And that depends really upon the itinerancy, which you can measure in terms of mutual information or relative entropy between different partitions of the states.

Starting point is 00:50:32 I've got to say this. There's probably some details here. Here comes one little detail. So all of this rests upon a Markovian partition into a Markov blanket. It all rests upon carving the universe into, internal states that are inside you that constitute your internal states, the rest of the universe, and then crucially, blanket states that separate systematically you from the rest of the universe,

Starting point is 00:50:58 that enable you to be identified, and a further bipartition on those blanket states into active and sensory states. So once you've compartmentalizing... Compartmentalizing the different kinds of states that would be necessary to describe a universe in which something exists, i.e. you, in a way that is separable from not you, then you can start to write down, you can go beyond just the, you know, sort of 20th century physics in terms of entropy of distributions of, say, an idealized gas or some sort of closed system. Now you have to deal with the entropies of a partition. And furthermore, you've got the relative entropy now.

Starting point is 00:51:41 So suddenly you're in a game of, which is pure physics. It's just now you've got to think about the relative entropy. You can't just talk about the entropy of this ensemble or this, the entropy associated with this wave function or this solution. You now have to carve it up and talk about the relative entropy, which is where the information theoretic and all the information richness and all the uncertainty, that's where it all starts to kick in. And of course, in so doing, you've actually now committed yourself to a mechanics of open systems because the whole point of having the Markov blanket

Starting point is 00:52:21 is that it enables a two-way traffic between the inside and the outside. So now by definition, you're in the game of writing down the statistical mechanics of open systems that have some non-equilibrium steady state in virtue of having an attracting set. So now the game becomes, well, what different kinds of attracting set could there be, and if they exist, what would it look as if they are doing in terms of these

Starting point is 00:52:48 mutual information or relative entropies or uncertainty resolving pressures that, you know, are one way of describing the very existence of this attracting set?

Starting point is 00:53:38 I do want to get more into the Markov blankets, but I don't want to quite let go of this transition from living in the moment to

Starting point is 00:54:09 living in the expected future, I guess. This seems to be, without me planning it, something that appears on the podcast over and over again. Recently in a conversation with the philosopher Jenann Ismael, when we were talking about free will, and what that means, that came up. And earlier with Malcolm MacIver, who is a mechanical engineer and neuroscientist, who has a theory that one of the steps on the road to consciousness was when fish climbed up onto land and could begin planning in the future because the time scales are much slower on land. You have the ability to plan. So I wonder, is it possible in this framework to

Starting point is 00:54:45 sort of pinpoint a place in the evolutionary scheme where we flip over from living in the moment to being more planning animals? My personal theory is that it's with cats. Because I have two cats. My listeners are well aware, Ariel and Caliban, and I swear that one of them Caliban just lives in the moment. Like he, his needs are being met or they're not, and that those are the only two states he has. Whereas Ariel, you could see that she's trying to figure something out about what would happen subjunctively if she did something, and like her little kitty brain is trying its best. So I'm sure the cats is an exaggeration, but is, is it as late as mammals where this becomes important, or do you want to attribute it much earlier than that? I don't know,

Starting point is 00:55:34 but I'm compelled by your saying it seems very clear it's cats. But, you know, everything you've said in terms of these other perspectives, you know, particularly from philosophy, makes entire sense to me. The ability to, you know, to plan

Starting point is 00:55:54 suddenly means you have now a space of trajectories, courses of action in the future. And it means that you have, because you can only realize one deterministic action, because action is actually a physical state of the universe.

Starting point is 00:56:15 It's not a, you know, the action in itself is not a belief. You know, we have beliefs about action, but the action is realized. So that realization means that you have to commit to one of a multitude. So there's a selection process in play, which must in some sense speak to

Starting point is 00:56:33 the free will. Or at least if it doesn't, depending upon your commitment, attitude to free will, what it does say is if there is a selection process in play in terms of selecting an action from some probability distribution or beliefs about the way that I am currently acting, then it must be the case that only applies to systems that actually have posterior beliefs about the future. Yeah, it cannot apply. capacity. So a thermostat could not, I think, be confused with something that might express free will. Whereas your question now is, at what point do we have biological thermostats that have this beautiful homeostasis, that become equipped with the capacity to imagine, to plan, to think, and as you intimated, possibly even have some minimal form of consciousness.

Starting point is 00:57:24 Right. Or even self, perhaps self-hood before consciousness. And so I normally recourse to the philosophical notion of a vague concept here. So, you know, I only recently learned about this. There's a whole philosophy of vagueness, it's true. We haven't talked about that on the podcast, but that's an interesting topic, yeah. So, you know, for those people who don't, like me, who don't know, like I was a few months ago. So at what point is a pile of sand a pile? Is it one, two, three, four, five grains?

Starting point is 00:57:58 So I quite like that as a way of, without moving, getting out of the question, at what point would you put your cat threshold? And I think it's warranted or licensed mathematically because even things like thermostats and viruses, they, say predictive coding. So predictive coding does not have this planning. It doesn't have this sense. Maybe define for the audience predictive coding.

Starting point is 00:58:26 Well, predictive coding is just, well, originally devised in the 1950s as a way of compressing sound files. So it's a very efficient way of complying with Occam's principle by providing the most, retaining the most information, but in the simplest coding that you can, which is another way of... Sort of the algorithmic compressibility. Yeah, that is in fact the free energy principle as well,

Starting point is 00:58:59 but just written in terms of algorithmic complexity and minimum message length. It's exactly the same maths, different sort of event spaces. So predictive coding, as currently applied to things like the brain, is just the notion that we're minimising our prediction error. So it doesn't talk about action.

Starting point is 00:59:18 What it does talk about is how our brains might respond, how our sort of decoders might respond to some new data. And they do it by sort of reorganising and belief updating or state estimation in a way that minimises a prediction error. So if you can predict what is currently being presented currently exactly on the basis of what you have previously seen, then you must have a perfect model of what is generating that signal. or that soundtrack or the auditory stream.

Starting point is 00:59:55 And therefore, you have minimised your free energy, your variational free energy, or you've maximised your evidence lower bound. In terms of the pure sensory, the sentient aspects of it, notice that we haven't, which is, you know, the important thing, we haven't talked about what we're going to do. I'm talking about that, yes. So this, you know, predictive coding doesn't address active inference.
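
The link mentioned here between variational free energy, surprise, and the evidence lower bound can be checked numerically on a tiny discrete model. What follows is a minimal sketch under assumed numbers, not the models used in the research being discussed: the two-state prior, the likelihood values, and the names `free_energy` and `exact_posterior` are all hypothetical. Free energy is taken as F(q) = sum over s of q(s) [log q(s) - log p(o, s)], which is the negative of the evidence lower bound.

```python
import math

# Hypothetical generative model with two hidden states and one observation o.
prior = [0.8, 0.2]        # p(s)
likelihood = [0.1, 0.7]   # p(o | s) for the observation that actually arrived

def free_energy(q):
    """Variational free energy of belief q over hidden states (negative ELBO)."""
    total = 0.0
    for s, q_s in enumerate(q):
        if q_s > 0:  # terms with q(s) = 0 contribute nothing
            joint = prior[s] * likelihood[s]  # p(o, s)
            total += q_s * (math.log(q_s) - math.log(joint))
    return total

# Exact Bayesian answer, feasible here only because the model is tiny.
evidence = sum(p * l for p, l in zip(prior, likelihood))  # p(o)
surprise = -math.log(evidence)
exact_posterior = [p * l / evidence for p, l in zip(prior, likelihood)]

print("surprise           :", surprise)
print("F(exact posterior) :", free_energy(exact_posterior))
print("F(prior)           :", free_energy(prior))
print("F(50/50 guess)     :", free_energy([0.5, 0.5]))
```

In this sketch every belief has a free energy at or above the surprise, and the gap closes only when the belief equals the exact posterior, which is the sense in which minimising free energy both approximates Bayesian belief updating and bounds surprise.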

Starting point is 01:00:18 It doesn't address active learning either. It just talks about how to make sense of data. So that would be a nice example of, you know, the sensory part of a virus or a thermostat. It can be completely described as just minimizing the discrepancy. Take a thermostat, for example, let's now put action back into the mix. So you can describe a thermostat as just minimizing its prediction error.

Starting point is 01:00:48 Prediction error between what? Well, between the temperature it's sensing and its attracting fixed point. So it's one of these fixed point creatures. And it's got its attracting set. It has its prior belief. The temperature will be like this. And all I need to do is to minimize my prediction error, minimize my free energy. And I don't know how I'm doing it, but I do seem to be turning on the heater or the air conditioning, I guess. I don't know. It doesn't know it doesn't know, but it's equipped with action that will enable it to get to its fixed point. So in what sense is that planning, if you remember, I'm trying to argue for a vagueness. So I don't have to answer your question. Well, when you formulate that kind of predictive coding in terms of Kalman filtering or Bayesian filtering, which would be the technical, or the statistician's way of

Starting point is 01:01:38 describing a predictive coding scheme, you're always working with derivatives. So you're always working in a dynamical setting with not just the prediction errors but the rate of change of prediction errors with time. So in a minimal sense you've got a notion of the future through a linear first order approximation to see it anyway.

Starting point is 01:01:59 Absolutely, yeah. So that's what I meant with in a minimal sense that everything has, every generative model has a notion of the future just in virtue of having a notion of trajectories or dynamics. But it's not quite the same as your second cat.
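
The thermostat as a "fixed point creature" can be written as a very small loop: it holds a prior belief about the temperature, senses the actual temperature, and acts in proportion to the prediction error. This is a deliberately simplified sketch, not a Kalman or Bayesian filter and without the rate-of-change terms mentioned above; the function name `run_thermostat`, the gain, and the one-off disturbance are assumptions made for illustration.

```python
def run_thermostat(set_point, start_temp, steps=100, gain=0.3,
                   disturbance_at=50, disturbance=-5.0):
    """Simulate a thermostat that acts only to cancel its prediction error.

    set_point plays the role of the prior belief ("the temperature will be
    like this"); the action is heating or cooling proportional to the error.
    """
    temp = start_temp
    errors = []
    for t in range(steps):
        if t == disturbance_at:
            temp += disturbance  # hypothetical shock, e.g. a window is opened
        prediction_error = set_point - temp  # expected minus sensed
        temp += gain * prediction_error      # action: heat if cold, cool if hot
        errors.append(abs(set_point - temp))
    return temp, errors

final_temp, errors = run_thermostat(set_point=21.0, start_temp=15.0)
print("final temperature:", round(final_temp, 4))
print("error just before disturbance:", errors[49])
print("error just after disturbance :", errors[50])
```

Nothing in the loop represents the future or alternative courses of action: the system only reacts to the current error, which is the sense in which it lives in the moment while still returning to its attracting point after being perturbed.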

Starting point is 01:02:19 Yeah, it's not thinking, well, if I sit here, I look like this. You can see it's just at the level of her capacities. But you mentioned action, and I think this is a natural place to go there, because you also want to say that free energy helps us understand how we behave. Is that safe to say? In fact, so let me sort of repeat the thing that I read, and then again, you'll fix it. One way of thinking about what happens when I'm moving my hand is that my brain sort of intentionally gets it wrong about where my hand is.

Starting point is 01:02:51 And then there seems to be a mismatch between where my hand actually is and where my brain thinks it is. And rather than fix my brain, I move my hand to bring it to where it is. That's beautiful. Yeah. I don't need to fix anything there. In fact, we should celebrate that because what you've just described is a modern-day recount of ideomotor theory, which was prevalent in the 19th century, which... Helmholtz. Yes. Well, yeah, he did everything.

Starting point is 01:03:22 So he could say yes, but he was a high entropy thinker, yeah. There were other sort of German neurologists and natural scientists, you know, sort of focused specifically on that. And that was picked up by William James, you know, on his European tours. But the idea is exactly what you just said. It's this, to move, I have to, in my mind, imagine the outcome of that movement and then just let my reflexes realize that imagined outcome, which basically was in the Victorian era,

Starting point is 01:03:57 posited as an explanation for stage hypnotism, that your arm is getting lighter and lighter and lighter and lighter. I see. And, of course, if you believe your arm is getting lighter and lighter and lighter and lighter and it's floating, then your predictions about the proprioceptive input you would get if your hand was in fact floating can now be fulfilled at a sort of pre-awareness level simply by reflexes and your hand will indeed rise. So, you know, this is as much as nearly all the free energy principle actually, particularly its sort of incarnations when applied to things

Starting point is 01:04:33 like active inference. These are very old ideas and you can actually probably trace them back to the students of Plato, but they come through Kant and Helmholtz. Okay. Very much so. Yeah. And alongside that sort of perception as inference, was this sort of action as inference, basically, actions, beliefs about the way I should be. Oh, yes, no, I am like that. And if, of course, you actually attend to the evidence that, in fact, your hand is not floating, then, of course, it won't move, which sort of takes us into the interesting scenario, that you must, in some sense, attenuate the evidence.

Starting point is 01:05:12 prevent yourself from getting that. Exactly. Yeah. Okay. Is this, I guess maybe what we have that Plato or Helmholtz or William James did not have is the ability to poke inside a brain and see what's going on there? Is this idea that action is driven by a mismatch between model and a sensory input verified, testable in the brain? Yeah, yeah, yeah, absolutely. A number of levels. But again, just coming back to this sort of notion that most of what emerges from this, from a sort of mechanical treatment of the free energy principle was well known a century ago. So what we are describing are classical reflexes. So there's this, you know, the brain sends down messages to alpha motor neurons and spinal cord that effectively are a mismatch between the intended signals, sensory signals

Starting point is 01:05:59 from the muscles, the motor plant, and what it's actually receiving. And then those cells elaborate signals to the muscles to cause them to contract until the signals match. So the way that we move, and in fact the way that we secrete our autonomic function works, the way that our heart works, in fact, all of our homeostasis works, generalized homeostasis works, is by supplying the right set points to servos or homeostatic or reflex mechanisms in the periphery. What are those set points? They're just predictions of the way I want to be. And is the role of, does free energy have a role here in kind of an optimization problem or an efficiency

Starting point is 01:06:41 mechanism? Like there's different ways you could imagine the brain or the nervous system bringing this match between expectation and reality, but is there just a sort of, there are easier ways to calculate how to do that? Using free energy? Using, uh, um, well, um, I mean, you know, the free energy formalism is, is if you like, you know, grandfather's all of these particular manifestations. Yeah. Okay. I guess what you're asking is if I wanted to now describe the biology or the wiring or the time constants,

Starting point is 01:07:20 what you'd have to do is to write down the generative model. If you remember, the free energy is a functional of a belief, and the belief is defined in relation to a generative model. So if you can write down the generative model, you can then write down the differential equations of these sort of sensory, active and internal states. And internal states you can associate with neural activity. Active states can be either secretions or physical movements of an arm, and then you will be able to simulate these kinds of phenomena. What you can then do is take empirical data and then change the parameters of the generative model until your simulation of an arm movement, for example, or a neuronal response to

Starting point is 01:08:01 perceptual synthesis matches what you observe in terms of brain signals. So that, you know, in a sense, that's another description of what we already do. That's what neuroscientists do. You know, we think about the functional anatomy in terms of, well, how does the brain model its world? How does it make these predictions? How does it generate, for example, movement? How does it generate these motor commands? But they're not motor commands.

Starting point is 01:08:29 They're just predictions of what I should feel if I was actually in this position or walking or talking. And in fact there's a whole industry of not only theories interpreting reflex arcs and the motor system under that kind of perspective, called the equilibrium point hypothesis, but there's also massive debates about the actual implementation of that and whether that's the right way to look at things. I guess I had this impression that you might imagine that what the brain is trying to do is just use Bayes' theorem. It has some beliefs.

Starting point is 01:09:06 It gets in more data, it updates its beliefs, but that is calculationally difficult, computationally intensive, and calculating the free energy is a sort of shortcut. Minimizing free energy is calculationally easier than simply conditionalizing probabilities. I see. Sorry, yes. You've moved us into a very important observation. So, I mean, we've been talking about sort of beliefs about where we should be, and prior beliefs, and beliefs about the future. And of course, "prior beliefs" implicitly rests upon a Bayesian distinction between beliefs prior to seeing any sensory states or sensory data, and posterior beliefs, the product of belief updating having observed those data. So that is the process of minimizing variational free energy, for example, or is it? Not quite.

Starting point is 01:09:56 If you, that process of belief updating, and let's just connect it back to physics. So what we are saying is that there is a gradient flow that is in place that underwrites a random attractor, and that attracting set defines

Starting point is 01:10:20 the kind of thing that we are, and there's this Markov blanket or partition in play. So that gradient flow we're going to now associate with belief updating, on the assumption that some states encode probability distributions or stand in for the parameters of beliefs. And once we've made that move, then we can actually write down the gradient flows in terms of belief updating, which is literally, we're taking a random dynamical system, say a Langevin system, and, you know, interpreting the dynamics as belief updating

Starting point is 01:10:50 and the belief updating is from prior to posterior. So that's, so gradient flow in this case is a way of saying moving from where we are a little bit closer to where we want to be. Yes, yeah, absolutely. Well, I mean, that's exactly what the free energy actually scores. So we're covering all sorts of wonderful issues here. We haven't gotten to the origin of life yet, but we're going to get there too. So, yeah, no, it was the difference between Bayes and approximate Bayesian inference. That's where we go.
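
The idea that a gradient flow can be read as belief updating from prior to posterior can be illustrated with a toy example. The sketch below is not the scheme discussed in the episode; it assumes a one-dimensional Gaussian prior and Gaussian likelihood, and all names (`free_energy_gradient`, `gradient_flow`, `pi_obs`, `pi_prior`) are hypothetical. Under those assumptions, following the free energy gradient downhill carries the belief from the prior mean to the exact posterior mean.

```python
# Toy sketch (assumed model, hypothetical names): belief updating as a
# gradient flow on a free energy for a 1-D Gaussian prior and likelihood.
# Up to constants, F(mu) = 0.5*pi_obs*(obs - mu)**2 + 0.5*pi_prior*(mu - prior_mean)**2


def free_energy_gradient(mu, obs, prior_mean, pi_obs, pi_prior):
    """Derivative of F with respect to the current belief mu."""
    return -pi_obs * (obs - mu) + pi_prior * (mu - prior_mean)


def gradient_flow(obs, prior_mean, pi_obs, pi_prior, step=0.05, n_steps=500):
    """Start at the prior mean and repeatedly step down the free energy gradient."""
    mu = prior_mean
    for _ in range(n_steps):
        mu -= step * free_energy_gradient(mu, obs, prior_mean, pi_obs, pi_prior)
    return mu


def exact_posterior_mean(obs, prior_mean, pi_obs, pi_prior):
    """Closed-form Bayesian posterior mean for the same Gaussian model."""
    return (pi_obs * obs + pi_prior * prior_mean) / (pi_obs + pi_prior)


if __name__ == "__main__":
    flowed = gradient_flow(obs=2.0, prior_mean=0.0, pi_obs=4.0, pi_prior=1.0)
    exact = exact_posterior_mean(obs=2.0, prior_mean=0.0, pi_obs=4.0, pi_prior=1.0)
    print(flowed, exact)
```

In this linear Gaussian case the flow lands on the exact posterior mean; the approximate character of variational inference only shows up in richer models, where the exact posterior is not available in closed form.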

Starting point is 01:11:20 But I think actually you're touching on something which we rehearsed previously, which is another perspective on variational free energy. So just for your interest, once you abandon equilibrium physics, as I think you referred to it, 20th century physics, and you just live in a world of non-equilibrium steady state, which implies that there is some attracting set there, and you have to explain, you know, its mechanics.

Starting point is 01:11:50 Then the free energy, the variational free energy, now starts to have the look and feel of something that's much closer to a physicist's thermodynamic free energy. And effectively, it scores the divergence between the current state of the system and the probability distribution it would have on its attracting set. Exactly on the attracting set, yeah. So it's basically how far away am I from my attracting set or my sort of non-equilibrium steady-state probability distribution. So in that sense, now it looks very much like the amount of energy available to do

Starting point is 01:12:25 work. So I look at me, I perturb me, I put me in a highly unusual, frightening, angsting situation. I've never been in it before, you know, homeostatically or conceptually. And I will work towards getting back to my comfort zone. Yes, a familiarity. My happy place. My right temperature. And in so doing, I will minimise my free energy and I'll be working literally on the environment. So now, to my mind, there becomes an almost invisible distinction between the variational free energy in the context of writing down

Starting point is 01:12:59 the dynamics or the mechanics of non-equilibrium steady states with attracting sets and thermodynamic free energy. And in fact, you can just put a Boltzmann's constant on and they are the same thing. And interestingly, the Boltzmann constant, in the purely information theoretic perspective,

Starting point is 01:13:16 it now becomes, just equips the amplitude of random fluctuations with units, and now you can start to interpret it in terms of physical measures. Absolutely. That's right. Yeah. That's right. Yeah. Anyway, that was me indulging myself because I know you're... No, no, yeah. Thank you. You like that sort of thing. Yeah. But back to the, you know, why not just Bayesian inference? And what's the difference between minimizing free energy and Bayesian inference?

Starting point is 01:13:47 Great questions. So the, the, the, um, if Bayesian model evidence is just simply the marginal likelihood of states of being, then one could trivially say just by optimizing the likelihood that I am in this state, which by definition is the thing that I do, because that's how I exist, I could describe myself as performing Bayesian inference, trivially so, because I can just call the negative log of my non-equilibrium density model evidence, and therefore everything I do is in the service of maximising model evidence. I'm the perfect Bayesian statistician. It doesn't get you anywhere, but it's a nice thing to say. So where does the variational free energy come in? Well, the variational free energy

Starting point is 01:14:33 comes in because it operationally defines what's called approximate Bayesian inference. So what's the difference between Bayesian inference that would conform to Bayes' rule and approximate Bayesian inference or variational Bayes, sometimes called ensemble learning, but probably variational Bayes is the most technically correct term? Well, the difference is that you're not maximizing model evidence itself. You are maximizing a lower bound on model evidence, which means that the thing that you can measure,

Starting point is 01:15:13 which is this variational free energy, or its negative in machine learning, is now always, in machine learning, below the actual thing you want to maximize, which is your model evidence. If I just flip the sign and bring us back to physics and the free energy principle, so the negative logarithm of model evidence, which is essentially our self-information, our surprise, can always be minimized as if you were,

Starting point is 01:15:47 a perfect Bayesian statistician by minimizing something that's always provably larger than that. And the difference is the KL divergence, or the bound approximation. So the variational free energy is an upper bound on the negative log evidence. And if you minimize that, then you are doing approximate Bayesian inference. Why approximate? Well, because the gap in the bound is not necessarily zero. It's not saturated. Yeah.
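
The bound described here can be checked numerically. The following is a minimal sketch under stated assumptions: a two-state discrete model with made-up numbers, and hypothetical helper names (`surprise`, `exact_posterior`, `free_energy`). It uses the standard definition F(q) = sum over s of q(s) [ln q(s) - ln p(o, s)], which equals the surprise -ln p(o) plus the KL divergence from q to the exact posterior, so F is never below the surprise and touches it only when q is the exact posterior.

```python
import math

# Toy sketch (assumed numbers, hypothetical names): variational free energy
# as an upper bound on surprise for a two-state discrete generative model.


def surprise(prior, likelihood_o):
    """Negative log model evidence, -ln p(o), for the observed outcome o."""
    evidence = sum(p * l for p, l in zip(prior, likelihood_o))
    return -math.log(evidence)


def exact_posterior(prior, likelihood_o):
    """Bayes' rule: p(s | o) = p(o | s) p(s) / p(o)."""
    joint = [p * l for p, l in zip(prior, likelihood_o)]
    evidence = sum(joint)
    return [j / evidence for j in joint]


def free_energy(q, prior, likelihood_o):
    """F(q) = sum_s q(s) * (ln q(s) - ln p(o, s)); terms with q(s) = 0 contribute 0."""
    total = 0.0
    for q_s, p_s, l_s in zip(q, prior, likelihood_o):
        if q_s > 0.0:
            total += q_s * (math.log(q_s) - math.log(p_s * l_s))
    return total


prior = [0.5, 0.5]         # p(s) over two hidden states (assumed)
likelihood_o = [0.9, 0.2]  # p(o | s) for the one outcome actually observed (assumed)

if __name__ == "__main__":
    print("surprise:", surprise(prior, likelihood_o))
    for q in ([0.5, 0.5], [0.99, 0.01], exact_posterior(prior, likelihood_o)):
        print(q, free_energy(q, prior, likelihood_o))
```

Every trial belief `q` gives a free energy at or above the surprise, and the gap closes only for the exact posterior. In a model this small the evidence is trivial to compute; the bound earns its keep when the sum over hidden states is too large to evaluate.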

Starting point is 01:16:11 Yeah. Right. But that might be easier to work that way. That's the only rationale for it. Yeah, so it's just that the evaluation of the actual evidence, a partition function, if you're a physicist, becomes intractable in high-dimensional systems. So how do you elude that intractability when you've actually got real physical systems that seem to be able to do this kind of thing?

Starting point is 01:16:39 Well, you just create a tractable bound. And who did that? Well, Richard Feynman. That's where it came from. On one reading of the legacy, there's another Russian reading. On one reading of the legacy of physics for things like the free energy principle, certainly machine learning, that's, you know, it was Feynman's path integral formulation, which introduced the notion of a variational bound,

Starting point is 01:17:08 a bound that, using variational calculus, was provably always greater than the thing you wanted to optimize. So you just optimize the bound instead. And you get into some nice rhetoric about sort of bounded rationality in economics. Oh, okay. Yeah, I hadn't thought of that, but I guess it makes sense. So it's not perfectly rational. It's not an exact Bayesian inference, but it's doable. It's physically realizable.

Starting point is 01:17:30 So that's the key difference. None of us is Laplace's demon. Absolutely. Yeah, yeah. But so much of this conversation, and I think for good reason, has been in the context of agents or brains, occasionally thermostats. But you do want to be even a little bit more ambitious, right, and talk about sort of a general organizing principle

Starting point is 01:17:50 of non-equilibrium systems to minimize their free energy. And maybe this has something to do not just with the nature of cells and organisms, but with their origin, is that safe to say? Is that a fair ambition to attribute to you? That minimizing free energy helps explain why life came into existence in the first place? Oh, yeah, you're taking me out of my comfort zone now. I'm surprised. Did I say that? Well, it's probably, it could be me, could be my lenses, because I want to do that. I see. Well, you should talk about that. Yeah, yeah, yeah, yeah. Well, you know, okay, let's put it this way. You mentioned Markov blankets, right? The idea that, you know, in fact we have a bunch of systems in the universe that have a pretty clear boundary between themselves and the rest of the world, right,

Starting point is 01:18:43 this boundary mediates their interactions with the world. The appearance of cell walls is clearly one of the most important steps in the origin of life. And a literal cell wall is certainly similar in spirit, if not exactly identical, to the conceptual Markov blanket between the inside and the outside. Is that safe to say? Absolutely. Yeah. In fact, that is the metaphor we always use.

Starting point is 01:19:09 The cell surface is the blanket. Yeah. But somehow life as we know it, you could imagine that if life were just defined as some complex thing that had a big chemical reaction and it evolved, it wouldn't need to have cell walls. It wouldn't need to be in a compartment. And yet it is, as a matter of fact. And I'm wondering if somehow being compartmentalized like that affords a special set of powers to certain chemical reactions to then go and adapt in a crazy world. Yes. I'm sure that's right. I'm not sure about, I'm wondering whether there's a slightly more deflationary way of expressing that.

Starting point is 01:19:52 In order to exist and to have measurable characteristics in the sense of ergodicity, you'd have to have a Markov blanket. So from a sort of weak anthropic principle point of view, then it could have been no other way. Well, does a hurricane have a Markov blanket? Interesting point. I've been asked about Gaia. No, it doesn't. That's really irritating. Nor do candle flames.

Starting point is 01:20:23 And they're not alive. So that's, but there is some. Right. But they have characteristics, some characteristics. Do they? Yes, but not in an ergodic sense, in the sense that they don't last long enough. However, you could say an eternal flame,

Starting point is 01:20:40 too. I think you're right. I think that's, you know, from my perspective, that's a really interesting, outstanding challenge. I'm wondering whether Birkhoff's notions of wandering sets, you know, resolve that, so that you can actually have a Markov blanket that renews

Starting point is 01:20:58 itself. However, that's that, you know, but in some sense, you know, one of the reasons why we don't count hurricanes or forest fires as living is that they're not rich with information, right? They're just doing their thing. And even the simplest cell, you know, in your language, carries a model of the world around with it, right? And a forest fire does not.

Starting point is 01:21:20 Ah, good. Yes. No, that makes a lot of sense. I wish I'd said that. Yeah. And so somehow, I don't know how life began, but some of my friends are working on it. You know, I do wonder whether or not we are under-emphasizing the importance... there's this debate in the origin of life between replication first, where you imagine that RNA or the information-carrying thing came first, and then it sort of wrapped a blanket around it and got energy and got going. There's another camp, metabolism first, that says that actually extracting free energy, in the physicist's sense, from the environment, was the first thing that happened, and only later did it sort of get imprinted informationally onto RNA and then put in a cell.

Starting point is 01:22:04 And then there's the easy, everyone agrees that the easy part is making a cell wall. It's just like a bilipid layer, that can just happen automatically. So I wonder if we're not, you know, underselling the importance of that thing, that without that compartment, without that blanket, the very notion of having a view of the outside world, having a model, even if it's very, very primitive, is not really tenable. If you don't even have a difference between self and other, what does it mean to have a view of the other? Well, that's the deflationary aspect. For me, it's beautiful.

Starting point is 01:22:35 For me, it's a Quinean desert landscape. I mean, there's a fundamental truth to that. I haven't really thought about this. Clearly, you have. I hear some of you. I'll speak tentatively. It does strike me. When we use the free energy principle to model morphogenesis and cellular organization,

Starting point is 01:22:55 it becomes immediately obvious. You need RNA and DNA as a generative model. That's certainly the case when you're talking about sort of multicellular organisation, that you cannot get a free energy minimum from coupled free energy minimizing Markov blankets without some shared generative model that is most naturally written down in terms of some genetic code. So it may well be a perspective that makes perfect sense. That's what DNA is good at doing, right? Storing the information. So perhaps you could argue, and perhaps you have already, or people have already, that yes, first of all, you need

Starting point is 01:23:34 a Markov blanket, otherwise there is no existence of the sort that we're talking about. But if it is the case that just, you know, the part of having a Markov blanket means there is a way of writing down the dynamics or the mechanics that makes it look as if there is a generative model, there also has to be something in the internal states that plays the role of a generative model. And it seems quite natural then for things that endure over generations or show that sort of attracting set with a sense of itinerancy that comes and looks like reproduction, then that that's what we call RNA or DNA.

Starting point is 01:24:15 So it may well be that the other side of the coin of having a Markov blanket is, namely, I must have an implicit generative model. Or it looks as if my gradient flows are on a variational gradient, that free energy gradient, and that free energy is a functional of a generative model, therefore there must be a biophysical encoding of that. That could be a statement that you need something like RNA, DNA, in order to go hand in hand with the Markov blanket.

Starting point is 01:24:48 Well, I do appreciate you indulging me there, but it may be good then to sort of wrap things up. We can bring it back close to where we started with going back to the clinic out of the research environment. How does this perspective, this point of view of free energy minimization, help us understand not just the brain when it's working but the brain when it's not working? I recall schizophrenia is something that you might be thinking about, but maybe other things as well. Yes, I mean my personal training and interest is in schizophrenia

Starting point is 01:25:20 but I have to work with, that sounds awful, doesn't it? I have the pleasure of working with lots of psychologists and psychiatrists. We'll edit that out, don't worry. Who like all sorts, who are interested in all sorts of neuropsychiatric conditions. So I think it's a lovely question because it allows me to make some simple points. So if Kant and Helmholtz and everybody since that time are right,

Starting point is 01:25:49 and effectively psychology is inference, consciousness is inference and unconsciousness is inference. Psychology is inference, and psychiatry is psychopathology. That tells you, by definition, that psychiatry is all about broken inference, false inference. And I mean that in a very literal sense, that basically inferring something is there when it is not,

Starting point is 01:26:17 like a delusion or an hallucination, or inferring something is not there when it is, like an agnosia, or a denial that something exists or part of my body exists. So just extending that notion that nearly every psychiatric and probably neurological syndrome can be cast as an instance of false inference suddenly gives you, or puts pressure on you to now derive, a calculus of how the brain works that is framed in terms of beliefs and inference.

Starting point is 01:26:56 So just a few examples. Well, we've covered, say, delusions and hallucinations in schizophrenia. Those are sort of the... The classics. Poster children of your false inference. But also, it can manifest in a slightly more subtle way. Take Parkinson's disease.

Starting point is 01:27:11 Let's come back to our ideomotor picture of why and how we move. I infer that my arm is going to be over there. I infer that in the next 600 milliseconds, I'm going to stand up and start walking. Now, if I make a false inference because I fail to attenuate the evidence that I am not moving, I will never realize that prior belief. I will never form an intention to move. While I'm moving,

Starting point is 01:27:37 I'm fine. But if I'm not moving, I cannot deny the sensory evidence that I'm stuck, immobile. Of course, that's a classic description of Parkinson's disease. And actually, if you drill down on the belief updating at the neural level, it actually implicates directly, you know, the neurotransmitters and chemicals that are implicated in Parkinson's disease. So if we just generalize this notion of false inference, when inference really underwrites your free will selection of what to do next in terms of predictions about what's going to, you know, how you're going to feel yourself move and talk

Starting point is 01:28:13 and think and feel, including your gut feelings, then you've got a way now of writing down a mechanics of psychiatry. So in a way that all other normative schemes do not, you know, because it's actually articulated in terms of probability distributions, because if you remember, it's all about functionals, it's all about things like entropy and beliefs and relative uncertainties, beliefs about something, then you've now got a calculus, which is fit for purpose, to understand false inference, hallucinations and perceptions, and the physics that gives rise to those, the physics that underwrites the belief updating that leads to this false inference. So in the past 10 years, that's hit me time and time again, you know, why it is

Starting point is 01:29:04 so useful, and if not essential, to understand sentience in its sort of philosophical sense, and the failures of sentience that we have in psychiatry, to understand those on a formal footing, you know, calls for something like the free energy principle. And has it led to any specific tactics for therapeutic intervention? Yeah. Well, one might hope soon. So it certainly led to, and I may be misinterpreting this through the way that I, the things that I'm asked to review.

Starting point is 01:29:44 So I only speak to people who commit or subscribe to the free energy principle. So I don't see alternatives. But certainly from what I see in terms of being asked to contribute to special issues and what I see in terms of being asked to review for this specialist psychiatric literature, there has been a sort of slight paradigm shift

Starting point is 01:30:04 in the past five years. It also centres on something called precision. We mentioned that very, very briefly before when we were talking about the predictive coding implementation of variational, approximate Bayesian inference by minimizing variational free energy. So I talked about the precision-weighted prediction errors. So what that basically means is that you can have a mismatch

Starting point is 01:30:30 between what you predicted and what you sensed. Does it really matter? Right. You know, if it's in the dark, if I get a mismatch between, you know, if I thought I was seeing a visual angel, what I actually get is darkness, does that really matter because there's no precise visual information around? So I actually now have to see the darkness by inferring the precision, the signal to noise ratio, effectively.

Starting point is 01:30:55 The error bars in some sense. Exactly. It is exactly the error bars. Interestingly, of course, that's 99% of the challenge for statistical analysis. It's not measuring the group mean. It's measuring your uncertainty about it. But that's interesting. People like Jakob Hohwy sort of emphasize the importance of what we've just said for psychiatry. Getting the error bars right is at the heart of good inference.

Starting point is 01:31:21 If you break the capacity to get your error bars spot on, you're going to get all sorts of false type one and type two errors, false inferences, inferring things are there when they're not, and that they're not there when they are. So that's where the precision comes in. It's literally the sort of, the more precise, the tighter the error bars. The less precise, the more dispersed or uncertain you are about something, or in particular the precision of the sensory evidence, for example, or the precision of my beliefs,

Starting point is 01:31:48 the confidence with which I hold my prior beliefs that are being used to explain these data that themselves may or may not be precise. But you have to estimate this precision. So it looks as if, and this is the mini-paradigm shift I was talking about, it looks as if in psychiatry, nearly all the phenomenology and the psychopathology and possibly a lot of the neurochemistry and psychopharmacology can be explained by failure to encode the precision, the error bars.
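
The role of precision as "error bars" can be made concrete with a small sketch. It assumes a one-dimensional Gaussian belief update, in which the prediction error is weighted by the sensory precision relative to the prior precision; the function name `update_belief` and the numbers are hypothetical illustrations, not taken from the episode. The same mismatch between prediction and sensation moves the belief very little when the senses are imprecise (the dark room) and a great deal when they are precise.

```python
# Toy sketch (assumed Gaussian model, hypothetical names): a precision-weighted
# prediction error. Precision is inverse variance, i.e. how tight the error bars are.


def update_belief(prior_mean, prior_precision, observation, sensory_precision):
    """Return the posterior mean and precision after one observation."""
    prediction_error = observation - prior_mean
    # The gain is the share of total precision carried by the senses.
    gain = sensory_precision / (sensory_precision + prior_precision)
    posterior_mean = prior_mean + gain * prediction_error
    posterior_precision = prior_precision + sensory_precision
    return posterior_mean, posterior_precision


if __name__ == "__main__":
    # Prior belief: "something is there" (1.0). Sensation: "nothing" (0.0).
    in_the_dark, _ = update_belief(1.0, 1.0, 0.0, sensory_precision=0.01)
    in_daylight, _ = update_belief(1.0, 1.0, 0.0, sensory_precision=100.0)
    print("belief after imprecise evidence:", in_the_dark)
    print("belief after precise evidence:", in_daylight)
```

On this reading, a misestimated `sensory_precision` or `prior_precision` is enough to produce a false inference even when the prediction error itself is computed correctly, for example holding on to the prior in daylight, or abandoning it on the strength of noise.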

Starting point is 01:32:21 That makes a lot of sense because nearly every treatment or certainly every pharmacological treatment in psychiatry targets those neurotransmitters that have a modulatory effect. So they don't in themselves encode shifts in the beliefs or expectations or averages. They encode changes in the sensitivity to sensory input, for example. So it looks as though those are exactly from the point of view of predictive coding,

Starting point is 01:32:48 exactly the biophysical mechanisms that would encode your beliefs about the precision. So what does that tell you? Well, it tells you, first of all, where you should target your therapy. It's likely to be in broken standard error estimators in the brain. So where are they? And then you know that, you know,

Starting point is 01:33:07 once you know about the neurochemistry and the functional anatomy and the projection system, and the domains of beliefs that are broken. If you've got anorexia nervosa, it's beliefs about your body. If you've got visual hallucinations, it might be a sort of Lewy body disease, an organic psychosis syndrome. But the same underlying mechanisms should be in play with different neurotransmitters and different possibly neurodegenerative processes.

Starting point is 01:33:28 So it does give you a mechanistic focus. So you can start to move away from what has, if you like, haunted psychiatry for centuries, which is this purely nosological, descriptive approach, to a slightly more mechanistic understanding, which is starting to... It's pre-paradigmatic. I'm going to use it all the time now.

Starting point is 01:33:52 So, yeah, so when you say, is it currently affecting treatment strategies? No, it is pre-paradigmatic. But I guess if you came back in five to ten years, I think you'd then see evidence of where that turn led in terms of possibly reinvigorating.

Starting point is 01:34:10 the pharmaceutical industry. I say that very practically. It's something you won't know, but it worries a lot of people in my game. So pharma have basically given up on psychiatry and developing psychiatric drugs. There's no money to be made because there's no progress

Starting point is 01:34:26 because there's no mechanistic underpinning. So about three years ago, basically they all just pulled out. Wow, no idea. Yeah. So there is no active research at all. I mean, this is expensive research. It's billions, not millions,

Starting point is 01:34:38 in drugs for schizophrenia, depression, you know, which is a great shame. And you could see why, because they don't know what to target. Because no one has said, well, it's got to be this system or that system or that system. So we all agree it's important

Starting point is 01:34:53 but just don't see the direction forward. Yes, absolutely, yeah. And of course, they have to keep the money coming in to fund the research to make the next drug. But maybe this will propose a direction forward in some... That would be the... Cross our fingers, yeah. That's the final hope, you know.

Starting point is 01:35:07 All right, I think this is a wonderful lesson to end on. We should all aspire to have our error bars brought into as close an alignment with reality as we possibly can. Karl Friston, thanks so much for being on the podcast. Thank you.

