That Neuroscience Guy - The Neuroscience of Exploring and Exploiting

Episode Date: November 23, 2025

In today's episode of That Neuroscience Guy, we discuss the neuroscience behind choosing an exploratory decision, where we do something new, or sticking with what we know. ...

Transcript
Starting point is 00:00:05 Hi, my name's Olav Krigolson, and I'm a neuroscientist at the University of Victoria. And in my spare time, I'm that neuroscience guy. Welcome to the podcast. And welcome to our third lesson or episode on human decision-making. One of the reasons I chose to do a deep dive into decision-making is it's kind of what I know best. My PhD work was in the neuroscience of human learning, and myself and my research groups studied that for quite a while. But then I got more and more fascinated with how people make decisions. In fact, it was the topic of my most recent TED talk. And they're related, right?
Starting point is 00:00:48 Like, one of the reasons we learn is to make better decisions. So the two things go hand in hand. In fact, I'm already kind of planning to do a deep dive on the neuroscience of human learning after this deep dive on the neuroscience of human decision-making. Now, enough of that, let's get into it. So if you remember in the first lesson, we talked about the concept of value and expected value. And I reviewed that in the second, so I don't want to spend a lot of time on it. But on the second lesson or topic, last episode, we got into this idea of a very simple decision model where you always choose the highest value or expected value.
Starting point is 00:01:29 The terms are usually synonymous, just so you know, but we should think in terms of expected values because the reality is it's the value times the probability of getting that value. It's just in a lot of simple situations, the probability is 100%. So the expected value is the same as the value because you're multiplying by one. Okay, so we have this model. Always choose the highest value. But we know that people don't do that, right?
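As a rough illustration of the expected-value idea just described (value times the probability of getting that value), here is a minimal Python sketch. The function name and the numbers are illustrative assumptions, not something given in the episode.

```python
def expected_value(value: float, probability: float) -> float:
    """Expected value = value times the probability of getting that value."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return value * probability


# When the outcome is certain (probability 1), expected value equals value.
certain = expected_value(8.0, 1.0)

# A nominally better option that only pays off half the time (made-up numbers).
risky = expected_value(10.0, 0.5)

print(certain, risky)
```

With these made-up numbers, the certain option has the higher expected value even though the risky option has the higher raw value, which is why it helps to think in terms of expected value.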
Starting point is 00:01:59 Sometimes people choose a lower value. The question is why. And what is the mechanism behind it? Well, the why part's pretty straightforward. There's two reasons why you might not choose the highest value option. Number one, what happens if you move to a new city? And you're on the quest for pizza. And yes, I'm still going off about pizza. So I moved to Portland. I'm living in a little part of Portland called Slabtown. It's really nice. And I wanted pizza. So I went out and got pizza my first night and ate that pizza. Now, all of a sudden, that's the highest value pizza because it's the only value I have. And I will do a little sidebar here. What is the value for a choice that we don't
Starting point is 00:02:53 know? It's actually a matter of some debate. Do we initially set all values to zero? And when we gain information we increase or decrease value? Or are the values random? They're randomly assigned. Now that might sound kind of weird, but you have to remember, this is the product of a bunch of neurons firing, and there might be some random pattern of firing that represents that choice, thus making the value effectively random. I won't get too much into it because it gets pretty technical. So for our purposes, let's just assume that the value of an unknown choice is zero. So I go out and I get my first pizza in Portland and it was really good. So I increased the value of that choice option. Now, there was another pizza place not far away. And the value of that pizza place is
Starting point is 00:03:49 zero because I've never been there. So if I decide to have pizza on the next night, guess what? I'm going back to the place I went the night before because it's the highest value. The problem with this is it's also the only value I know. So one of the reasons we might not choose the highest value is to check out or explore more of the unknown. And sure enough, that's what humans do. Over the course of my first couple weeks here, I tried all of the pizza places in the region. And we call that exploration. So before we choose the highest value, we're actually faced with something called the explore or exploit dilemma.
Starting point is 00:04:33 And the question is, do I exploit? Do I always choose the highest value option? Or do I explore and deliberately try a lesser value option? So as humans, we are hardwired to explore. Now, some people explore more than others. You might have a friend that always sticks to the same thing who never explores, and you might be the kind of person like me that's always exploring, always choosing something unknown to see what it's all about. But that is how we modify this simple model of decision making.
Starting point is 00:05:12 Rule number one, always choose the highest value. Rule number two, sometimes you explore. Don't always exploit, and you explore because of the unknown. Now, that's the number one reason to explore, to examine the value of choices that you don't know anything about. Now, in terms of picking pizza, I think you can get the idea, but let's put it in a very different context.
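The two rules above can be sketched in a few lines of Python. This sketch uses an epsilon-greedy rule, a standard technique from reinforcement learning that the episode does not name; it is swapped in here only as one simple way to "sometimes explore". The option names, the values, and the 20% exploration rate are assumptions for illustration.

```python
import random


def choose(values, explore_rate, rng):
    """Rule 1: usually pick the highest-value option (exploit).
    Rule 2: with probability explore_rate, pick a different option (explore)."""
    best = max(values, key=values.get)
    others = [name for name in values if name != best]
    if others and rng.random() < explore_rate:
        return rng.choice(others)  # explore: deliberately try a lesser-valued option
    return best  # exploit: stick with the highest known value


# Hypothetical values: one place already tried and liked, one unknown (set to zero).
values = {"first_pizzeria": 1.0, "second_pizzeria": 0.0}
rng = random.Random(0)  # seeded so the run is repeatable

picks = [choose(values, 0.2, rng) for _ in range(1000)]
print(picks.count("first_pizzeria"), picks.count("second_pizzeria"))
```

With an exploration rate of zero this collapses back to the simple model from the last episode, where the highest value is always chosen.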
Starting point is 00:05:40 Imagine learning to hit a tennis ball with a forehand hit. Well, your brain and your motor system is selecting a group of neurons to do this, all right? It's picking neurons in your biceps, your triceps, you know, your shoulder muscles, your pronators, your rotators. I forget that stuff. It was a long time ago. But you get the idea. You're picking this pattern of a bunch of neurons.
Starting point is 00:06:03 Now, imagine you hit the ball and it was okay. Exploration in this case might be choosing some different pattern of neurons. All right. So you're just deliberately not choosing the highest value ones, which are the ones that you used to hit the ball on the hit before, but you're trying a different pattern of neurons to see if you can do better. So that is the essence of the explore-exploit dilemma. Sometimes you choose the highest value option. Sometimes you explore. Now, I want to talk about two more things. One, there's another
Starting point is 00:06:44 reason you might explore. What if the world changes? If the world changes, you know, you might need to explore. So imagine you do find the place and you establish through exploration that it's got the highest value option. It's the best pizza. Well, guess what? What happens if the chef changes? If the chef changes, you're going to be forced to explore because you're going to want to see what other pizza is out there now that your favorite chef is gone and the pizza is not as good. So you explore when the situation is unknown, but sometimes you explore just to check values in case the world changes or because the world changes. So there's a bunch of reasons why you explore. Now, what's the neural mechanism for deciding to explore?
Starting point is 00:07:40 There's still a lot of debate about that, but we know that when people explore, there's a different pattern of brain activity than when people exploit. And in fact, my former PhD student Cameron Hassall, he's now a professor. He wrote a couple of papers on this. When he was doing his PhD with me, we were very interested in this exact question. And we were able to find some mechanisms that might be tied to exploration. And if you really want to take the deep dive into hardcore neuroscience, one of the parts of the brain that's implicated in the exploration sort of choice is the locus coeruleus. It's a neuron or a neural structure within the brain, more correctly, that releases norepinephrine.
Starting point is 00:08:26 And the idea is if there's a sudden phasic increase in norepinephrine, that might trigger your decision to explore and override the decision to exploit. So there's a neural mechanism for you. It's a mid-brain structure, uses a neurotransmitter to signal this idea of exploration. There are probably other parts of the brain, and not probably. There are most certainly other parts of the brain involved in the explore or exploit dilemma. Now, the last thing I want to talk about with this is how often should you explore? You know, should you explore all the time? Well, no.
Starting point is 00:09:05 The problem with exploring all the time is you're going to spend far too much time picking lower value options. Like, once you find the best pizza, you want to pick it most of the time. Get what I mean? Like, if you explore every time you go out, that means you might only get the best pizza one time out of 10 or whatever number, however many pizza places there are where you live. So you don't want to explore all the time,
Starting point is 00:09:32 but you don't want to explore too little, because if you explore too little, you might not actually have an accurate representation of value. So let's go back to my favorite pizza place in Portland one more time. Imagine the first time you went there, it was the chef's night off. It was another guy cooking or another person cooking, right? So they do make the best pizza most of the time,
Starting point is 00:09:57 but just randomly that first night that I went, it wasn't the best pizza because it was the chef's night off. Well, if you explore too little, you might have decided that the second best pizza was the best because the night you went to the second best place, it was the best pizza. The chef was there. So it becomes the highest value.
Starting point is 00:10:21 And if you don't explore very much, you will never learn that the other place has better pizza. I hope that makes sense, but hopefully you get the idea. If you don't explore enough, you won't learn the true values of the world. If you explore too little, then you'll, same problem.
Starting point is 00:10:40 You won't learn the true values of the world. So, how much should you explore? Well, there's no right or wrong answer. We do know that for most organisms that use this sort of decision-making framework, that when you're in a novel environment, you tend to explore more early on, but then you dial down your exploration rate. So the exploration rate isn't set.
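One hedged way to picture an exploration rate that starts high in a novel environment and is dialed down with familiarity is a simple decay with a floor, sketched below in Python. The exponential form, the starting rate, the floor, and the decay factor are all assumptions chosen for illustration; the episode does not specify how the rate changes, only that it does.

```python
def exploration_rate(visits: int, start: float = 0.5,
                     floor: float = 0.05, decay: float = 0.9) -> float:
    """Exploration rate after a given number of visits to an environment.

    Starts at `start`, shrinks by a factor of `decay` per visit, and never
    drops below `floor`, so some exploration always remains.
    """
    return max(floor, start * decay ** visits)


# New environment, somewhat familiar, very familiar (hypothetical visit counts).
rates = [exploration_rate(n) for n in (0, 10, 50)]
print(rates)
```

The floor keeps the rate above zero, matching the point that even in a very familiar environment you will still explore sometimes.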
Starting point is 00:11:07 It's changing over time. So in a new environment, you might explore a lot. In an environment you're very familiar with, you will explore less, but you will still explore sometimes. So your exploration rate is fluid in a sense, but it's biased by familiarity, memory, and a bunch of other things. Okay, that's the end of our third lesson on decision-making. So the first one, values, second one, a simple model of decision-making, always take the highest value choice, and today we added exploration versus exploitation. Sometimes we don't always take the highest value choice. All right, don't forget about our website, thatneuroscienceguy.com. There's links to Etsy.
Starting point is 00:11:54 All right. We have some merch up there. There's links to Patreon where you can support us, right? Remember, you just sign up and donate some money. All the money goes to graduate students in the Krigolson Lab. You can get us on social media, Instagram, X, and Threads, at that neurosci guy. Now, we're not going to do the rest of the podcast for all of time on decision-making. So we do want to know what you want to know about the neuroscience of daily life.
Starting point is 00:12:21 And you can also email us, thatneuroscienceguy@gmail.com. And finally, the podcast. Thank you so much for listening. You know what? It means everything to us. I won't lie. There was a point where I was thinking about
Starting point is 00:12:34 maybe we've run our course, but we got a lot of mail in. I saw the number of subscribers and downloads and I went, you know, I like doing this and I think people enjoy listening. So thank you for listening. And please subscribe if you haven't already. My name is Olav Krigolson, and I am that neuroscience guy. I'll see you soon for another full episode of the podcast.
