The Science of Everything Podcast - Episode 64: Knowledge Representation

Starting point is 00:00:34 You're listening to The Science of Everything podcast, episode 64: Knowledge Representation. I'm your host, James Fodor. In this episode, we're going to talk about the psychology behind how we understand things and think about things. Particularly, we'll look at how knowledge is represented and how we understand different concepts. We'll look at this mostly from a psychological perspective, although we'll dip into a few neuroscientific ideas at points. So, in particular, we'll talk about semantic networks, propositional representation, connectionism versus computationalism,

Starting point is 00:01:08 we'll look at family resemblance notions of categorization, prototype and exemplar theory and some evidence for and problems with each, and we'll also look at concepts as theories. No particular recommended pre-listening for this episode, although if you've listened to some of the past episodes about neurons or neuroscience, that might help for some parts of what we're going to discuss. So, let's get started. First of all, let me just outline sort of what we're trying to do in this

Starting point is 00:01:34 episode because it might seem a little bit abstract. What we're trying to do is understand how knowledge, broadly speaking, particularly semantic knowledge, so propositional knowledge (Paris is the capital of France would be an example of that), how that sort of knowledge is represented in the brain, or more generally, just in the mind. So one quite fruitful, recent approach towards understanding this is to use what are broadly called semantic networks, or sometimes they talk about networks more generally, but semantic networks more specifically. So a semantic network is a network, so you can represent it as a graph, a set of nodes with connections between them, with links between them. It's sort of like a concept map, if you've ever seen one of those, or like a mind map.

Starting point is 00:02:16 But a semantic network is a network which represents semantic relations between concepts. It's a way of representing knowledge. So again, the idea here is that each node, which you can represent as like a circle or a box or whatever, represents a particular feature or a concept, and that each node is connected to many other nodes, via links, so you can represent these as simply lines. So you can have many of these nodes connected with each other in lots of complicated different ways, and this forms what we call a semantic network, which represents a body of knowledge. So as an example, we might have a node representing a concept like bird, and then connected

Starting point is 00:02:51 to this node we might have the various different properties of birds, so has wings, can fly, builds nests, etc., etc. So each one of these properties or behaviors or functions or whatever else of birds would be connected to the bird node, and can fly, for example, which is one of the properties of a bird, might also be connected to other things like aeroplane or hot air balloon or Superman, and then that in turn would be connected to various other concepts, and so you could see how these would all be connected to each other, and it could become a very complicated, intricate network of all the various propositions and knowledge that you have.
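To make the bird example concrete, here is a minimal sketch of a semantic network stored as a graph. The node names come from the example above; the `build_network` helper and the dictionary-of-sets layout are assumptions made for illustration, not a claim about how the mind stores anything.

```python
from collections import defaultdict


def build_network(links):
    """Build an undirected graph: each node maps to the set of nodes it links to."""
    network = defaultdict(set)
    for a, b in links:
        network[a].add(b)
        network[b].add(a)
    return network


# Links taken from the episode's example (a concept and its properties).
links = [
    ("bird", "has wings"),
    ("bird", "can fly"),
    ("bird", "builds nests"),
    ("aeroplane", "can fly"),
    ("hot air balloon", "can fly"),
    ("Superman", "can fly"),
]

network = build_network(links)

# "can fly" is shared by several otherwise unrelated concepts.
print(network["can fly"])
```

Looking up `network["bird"]` gives the properties attached to bird, while `network["can fly"]` shows how a single property ties bird to aeroplane, hot air balloon, and Superman.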

Starting point is 00:03:25 Now, of course, the idea is not that, like, literally there's a concept map in your head, the idea is this is a way of representing what's happening in our minds when we think about concepts like this. Now, in order to further expand this model, we introduce a concept called spreading activation. This is the spreading activation model of understanding search through semantic networks.

Starting point is 00:03:48 They're also called associative networks because we associate one concept with another. So how does this work? Well, the idea is broadly based on neural networks where you have, if you recall, you have neurons that are connected to each other, and neurons can be excited by essentially external electrical impulses, and if that level of excitation reaches a certain threshold, if it's high enough, then the neuron will fire what we call an action potential

Starting point is 00:04:13 and therefore excite the various other neurons that it's connected to. So this is the broad idea behind spreading activation. The idea is in our semantic network, nodes can have different levels of activation. Again, don't worry too much about what this activation means. We're not talking about it as any literal thing, it's not literally electrical activity or something like that. It's an abstract concept of activation, and that's all we really need for our purposes.

Starting point is 00:04:37 We'll talk about how this could be instantiated in the brain later on. Right now, it's just a psychological model. So each node can have a level of activation, and that activation is propagated or spread through the nodes that it's connected to. So, for example, if we activated the bird node, then that activation would be propagated out through to the nodes that it's connected to.

Starting point is 00:04:56 So that would be the has wings, can fly, and builds nests nodes. The idea is that activation in this sense is additive. So suppose that we were to activate the bird node; that would spread some of its activation to the can fly node. But then suppose we were also to activate the Superman node; that would also spread some of its activation to can fly. So can fly would be sort of doubly activated. Again, we also have a concept of threshold. So if any particular node reaches a sufficiently high level of activation, we say that that node fires, and when it fires, only then does it spread its activation to all the nodes that it's connected to.

Starting point is 00:05:32 So any level of activation that it receives below the threshold does not cause that spreading of activation. So initially I spoke as if each node, upon firing, spreads its activation out through all of the nodes that it's connected to equally. But usually the idea is that we complicate it a little bit beyond that, as each connection between nodes will have a different weight. So some connections will have higher weights than others, which essentially represents how much of the activation propagates through that particular connection. And we can think of this weight as representing how closely those concepts are connected.
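The pieces described so far (additive activation, a firing threshold, and weighted connections) can be sketched in a few lines. All of the numbers here, the `THRESHOLD` value, the `weights` table, and the `spread` function itself are hypothetical choices for illustration only.

```python
THRESHOLD = 1.0  # hypothetical firing threshold

# Hypothetical connection weights: weights[source][target]
weights = {
    "bird": {"has wings": 0.9, "can fly": 0.6},
    "Superman": {"can fly": 0.6, "wears cape": 0.9},
    "can fly": {"aeroplane": 0.5},
}


def spread(initial, weights, threshold=THRESHOLD):
    """Spread activation through the network.

    A node passes activation on only once it reaches the threshold (it
    "fires"), and in this simplified sketch each node fires at most once.
    Activation arriving at a node from several sources is added together.
    """
    activation = dict(initial)
    fired = set()
    queue = [node for node, level in activation.items() if level >= threshold]
    while queue:
        node = queue.pop(0)
        if node in fired:
            continue
        fired.add(node)
        for target, weight in weights.get(node, {}).items():
            activation[target] = activation.get(target, 0.0) + activation[node] * weight
            if activation[target] >= threshold and target not in fired:
                queue.append(target)
    return activation, fired


# Activating bird alone leaves "can fly" below threshold.
alone, fired_alone = spread({"bird": 1.0}, weights)

# Activating bird and Superman together pushes "can fly" over threshold,
# so it fires and passes some activation on to "aeroplane".
both, fired_both = spread({"bird": 1.0, "Superman": 1.0}, weights)
print(fired_alone, fired_both)
```

With bird alone, can fly only receives 0.6 and stays silent. With both bird and Superman active, it receives 1.2, crosses the threshold, and fires, which is the "doubly activated" case from the example.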

Starting point is 00:06:14 So, for example, thinking about birds, birds would be pretty strongly connected to the has wings node, for example, and the has feathers node, because all birds have feathers and wings. Can fly, however, would have a slightly weaker connection, arguably, because not all birds fly, although most birds do. Then we could imagine specific bird sounds, for example, and those would have much weaker connections to the bird node, because only some birds make those sounds, and anyway, maybe we haven't heard them very often or something, so those would have weaker connections. So the idea is that the strength of this connection between

Starting point is 00:06:45 two concepts represents how closely related the concepts are, and perhaps also how much experience we have with those things being related. If they're only weakly related, or we've only seen them connected a couple of times, they'll have a weak connection, but if they pretty much always go together, they'll have a very strong connection. So the strength of the connection between nodes can represent how closely related concepts are. Another way we can represent how closely related concepts are is by how many nodes you have to jump through to get from one concept to another. So for example, how many nodes would I have to jump through to get from Paris to bird? Well, maybe I could go from Paris contains the Eiffel Tower, and the Eiffel Tower

Starting point is 00:07:21 could be connected to birds because that's somewhere where birds can roost, just as a sort of a silly example, so I have to make two jumps to get from bird to Paris. Maybe we could imagine more distantly related concepts, which require more nodes to jump through. We can also further expand this semantic network spreading activation model by incorporating inhibitory connections between nodes. So the idea here would be that output nodes would mutually inhibit one another so that only one of them sort of wins out.

Starting point is 00:07:51 These are called winner-take-all networks, because only one node in the end wins out. The reason you'd want to have this is because you can imagine, for example, if we activate the bird node, then that produces activation in 100 or whatever nodes that are connected to that. But then each of those nodes would in turn activate 100 more nodes, and before long we'd have the entire network being activated, which wouldn't be very helpful.

Starting point is 00:08:12 So one way to avoid this is by having a sort of a diminishing rate of activation as you move further and further away. So the activation depletes itself over time. But another way that you can ensure that is to have a winner-take-all network where only one output node would be activated. So, to think of how this would work, imagine that someone's telling me about some city in the world, or maybe it's some superhero or some movie, whatever, they're telling me facts about it.

Starting point is 00:08:37 Each of these facts, let's pick a city in the world, they tell me what continent it's on, and maybe they tell me how big it is, and they tell me how old it is, they tell me some of the sights to see there in broad terms. So I'm not quite sure which city it is yet, but each time they tell me a fact, the cities that are connected to that fact activate. So if they say it's in Europe, all of the cities that are in Europe that I know of will be activated a little bit. And then if they tell me it's more than 500 years old, then all of the cities that I know

Starting point is 00:09:03 that are more than 500 years old would be a bit more activated. And as they tell me more and more things about it, a few cities will continue to be substantially activated. But the idea of a winner-take-all network is that once one of those cities reaches the threshold, that is, reaches a sufficiently high level of activation, it will inhibit the other possibilities. So say that Paris was the actual city that became activated first,

Starting point is 00:09:29 then that would send inhibitory connections to, say, Berlin and London, some of the other cities that I was thinking of, so that those won't be activated, so that I only get one winner. So Paris is the winner in that sense. The other ones are inhibited, so they don't get activated. That's the idea of a winner-take-all network, so that only one possibility wins out in a sense,

Starting point is 00:09:47 and that inhibits the other ones, so that they don't reach threshold. The reason you'd want something like that to happen is, as I noted before, so you don't have the whole network activating. And there's an interesting application of this because it can explain the tip of the tongue phenomenon, which is an interesting psychological phenomenon that probably everyone here has experienced, and I may have mentioned before. The tip of the tongue phenomenon is when you have a word on the tip of your tongue, so to speak, and in fact, people usually use that expression, hence the name of it,

Starting point is 00:10:13 but you just can't quite remember what it is. You know that you have the knowledge, maybe you know this person's name or the name of the movie or this actor's name or the date or something like that. But you just can't quite get it. Maybe you can even give the letter that it starts with or you can say that it sounds like this or something like that, but the name won't come for some reason. This is called the tip of the tongue phenomenon and it's found in many different cultures and it's been studied a lot psychologically.

Starting point is 00:10:36 But this winner-take-all inhibitory network model can potentially explain it because the idea is say that you were trying to remember the actor in a particular movie and you were thinking about the different facts relevant to this actor, you know, the type of character and when the movie was released and these other things. But perhaps these facts lead to the activation of some other actor that you know is the wrong answer, but is sort of close in some way. So this other actor gets activated, the node for that actor gets activated, and that inhibits the true answer that you're really looking for.
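The winner-take-all idea can be sketched using the city-guessing example from earlier. The `facts` table, the `boost` size, and the threshold are all made-up values, and zeroing out the losers is just one simple way to model inhibition, assumed here for illustration.

```python
# Hypothetical facts and the cities each one is connected to.
facts = {
    "in Europe": ["Paris", "Berlin", "London"],
    "over 500 years old": ["Paris", "London"],
    "has the Eiffel Tower": ["Paris"],
}


def winner_take_all(heard_facts, facts, boost=0.4, threshold=1.0):
    """Add activation fact by fact; the first node to reach threshold
    wins and inhibits (here: resets to zero) every competitor."""
    activation = {}
    for fact in heard_facts:
        for city in facts[fact]:
            activation[city] = activation.get(city, 0.0) + boost
        winners = [c for c, level in activation.items() if level >= threshold]
        if winners:
            winner = max(winners, key=activation.get)
            for city in activation:
                if city != winner:
                    activation[city] = 0.0  # inhibited by the winner
            return winner, activation
    return None, activation


early_winner, early = winner_take_all(["in Europe", "over 500 years old"], facts)
final_winner, final = winner_take_all(list(facts), facts)
print(early_winner, final_winner)
```

After two facts no city has reached threshold, so nothing wins yet. After the third, Paris crosses it and the other cities are suppressed. The tip of the tongue case corresponds to the wrong node crossing the threshold first: once it wins, the node you actually wanted is held down along with the rest.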

Starting point is 00:11:10 Now, when you think about this false actor, that's the wrong answer, that's not the one you were trying to think of, but that's the only thing you can think of, because that one's been activated now, and it's inhibiting the other ones. So you try and think, oh, no, no, no, it's not this one. What's the other guy's name? But you can't get it because it's been inhibited

Starting point is 00:11:24 by the fact that the winner has already inhibited the other possibilities. So a winner-take-all semantic network can explain the tip of the tongue phenomenon, which is, I think, a particularly sort of nifty little application of this type of network. But there are many other pieces of evidence which support the idea that our memory, and particularly conceptual knowledge, works in broad terms like a semantic network. Again, we're not literally saying that there's a semantic network in the mind or in the brain or something like that. Maybe there is, but more to the point is just that this is a useful way of thinking about how the mind works, and we'll talk, again, a little bit later about how

Starting point is 00:12:00 this could be instantiated in an actual brain, but right now we're just talking about sort of psychologically how do the concepts work. So, what is some actual evidence that semantic networks actually do describe how we think about things? So there are a number of lines of evidence which support the importance of semantic networks. One is hints. We all know that when you're given hints, they're often very helpful. You know, it starts with the letter B or, you know, it's in January or it sounds like this or

Starting point is 00:12:28 something like that. Hints are helpful. But why are hints helpful? If you're just searching through a database like entries in an Excel document, hints wouldn't really help at all because you'd just have to sort of go through one item at a time. Giving hints wouldn't make any difference. But that's not how memory works.

Starting point is 00:12:42 Memory is not like just going through all of the lists in an Excel document. Memory is like a semantic network, and so giving hints will activate certain nodes. So, for example, if I say that it's in January, maybe I'm trying to think of someone's birthday, and they tell me it's in January. That will activate the January node, which will in turn spread some activation to various nodes that it's connected to,

Starting point is 00:13:03 and one of them might be that person's birthday. If that person's birthday is, say, January 4th, and that's something that I have stored in my memory, then activating the January node will probably also activate the January 4th node, and so that will help me to activate January 4th. If it's also sufficiently activated by other things, like that I'm just trying to remember this person's birthday, then that might allow the January 4th node to reach threshold. And the idea is that when a particular node in the network reaches a sufficiently high

Starting point is 00:13:30 level of activation, we become sort of consciously aware of that, so we remember it or we think about it. So the idea is maybe that the person asked me, or someone asks me, when is Bob's birthday? So that activates the Bob's birthday node or something like that. That leads to a certain amount of activation to the January 4 node, which is the correct answer. But not enough. So there's not a strong enough connection between Bob's birthday and January 4

Starting point is 00:13:51 to lead to the activation of January 4 by itself. But then they tell me, oh, hint, it's in January. Then I have two sources of activation for the January 4 node. I've got the January node and the Bob's birthday node. Together, those might produce enough activation in the January 4th node to exceed the threshold, and then that node sort of fires in a sense, or I become consciously aware, oh yeah, that's the answer. It's January 4, and I could pick that out of my memory.
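The arithmetic behind the birthday example can be written out directly. The two weights and the threshold below are invented numbers chosen so that neither cue is enough on its own; only the structure (two sources of activation adding up) comes from the episode.

```python
THRESHOLD = 1.0  # hypothetical level at which "January 4" is consciously recalled

# Hypothetical amounts of activation each cue sends to the "January 4" node.
incoming = {
    "Bob's birthday": 0.6,
    "January": 0.5,
}


def recalled(active_cues, incoming, threshold=THRESHOLD):
    """Return True if the summed activation from the active cues reaches threshold."""
    total = sum(incoming[cue] for cue in active_cues)
    return total >= threshold


without_hint = recalled(["Bob's birthday"], incoming)
with_hint = recalled(["Bob's birthday", "January"], incoming)
print(without_hint, with_hint)
```

The question alone contributes 0.6, which falls short. Adding the hint brings the total to 1.1, which is enough for the node to fire.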

Starting point is 00:14:13 So semantic networks allow us to explain why hints are helpful in retrieving information. Another piece of evidence in favor of semantic networks is context reinstatement. We talked about this in one of the memory episodes. If you recall, state-dependent learning is the phenomenon whereby people remember things better or are able to recall things better if they are asked to retrieve the information in the same context as when the memory was formed. So if you memorize things underwater, you'll be able to remember those things better if you're asked to repeat them underwater than if you're not underwater. That's called context reinstatement.

Starting point is 00:14:48 Now, the semantic networks can explain why that occurs, because when the memory was encoded, certain associations would have been made. So, for example, maybe the feeling of the scuba gear that you're wearing or the sight of the particular fish or something like that, or just the sense of being underwater, whatever. Some of those sorts of perceptions would form links to the particular items in the list that you're trying to memorize because you're doing them while you're underwater, and that would just sort of happen in a natural way. Now, when you go back underwater, those perceptions will be present again. So, you know, this fish or the feel of the pressure on your body or whatever, these sensations will be activated, or these concepts will be activated, and therefore they will be connected. Because you memorized the list underwater, the activation of

Starting point is 00:15:36 these perceptions or these concepts will lead to the activation of the items of the list that you memorized, therefore helping you to recall them, whereas that won't happen if you're not underwater because the perceptions and other thoughts that it brought to your mind are not there anymore because you're not underwater anymore. Context reinstatement can also be explained by semantic networks, as essentially producing these extra bits of activation that help you to remember things. These are what we call cues. Another piece of evidence in favor of semantic networks is called the typicality effect. This refers to the fact that experimental subjects are faster to respond to typical instances of a concept rather than exceptions. So this is manifested in a number of ways. For example,

Starting point is 00:16:15 what are called lexical decision tasks. So one way that this experiment is done is that you give people a pair of items, each of which could be a word or a non-word. So, for example, you might have a pairing of lake and blup. Blup sounds like a word, but it's not actually a word. And so what the subject is asked to do is just say, are both of these words or not? So only answer yes if they are both words. So if they have lake, blup, that would be no, because one of them isn't a word. This is a lexical decision task. You just decide if it's a word or not. Now, sometimes there will be two real words rather than a word and a non-word. However, in the case where there are two real words, sometimes the words will be semantically related, and sometimes they won't be. So a pair of

Starting point is 00:17:00 semantically related words would be like nurse and doctor, because they're related to each other. Non-related would be like Lake and Shoe, unrelated words. So what we find is that people are faster to respond in a lexical decision task to Nurse Doctor than they are to Lake Shoe. Of course, they're both words in both cases, so the answer is yes. But people are faster to say Nurse Doctor is, yes, those are both words, than they are to say Lake Shoe are both words. So why would that be the case? Well, semantic networks can explain why this would be the case. Because if you read Nurse, that has a connection to doctor, a fairly strong connection.

Starting point is 00:17:34 And so that's going to partly activate the doctor node. So that means you're sort of going to have to spend less time thinking about, oh, is doctor a word? Because it's already partly activated. Whereas if you have Lake and Shoe, those are not very strongly connected at all. And so you're going to have to spend longer sort of thinking about whether each of them is a word. That increase in speed when you have semantically related items is evidence for a semantic network type of thing happening. Coming back to the typicality effect, though, that I mentioned before,

Starting point is 00:18:00 it's a similar notion here. A typical instance of a concept might be, if I think about bird, what's a typical bird? Well, Robin, it turns out, is the sort of typical bird that we think of. Penguin would be an atypical bird because it is a bird, but it doesn't fly, and it's different in various other ways. So if you ask someone, is a robin a bird, they say yes. If you ask them, is a penguin a bird, they say yes, but they're faster when you ask them about a robin than if you ask them about a penguin. So similarly to the lexical decision task, when you ask people about these concepts, they're faster, they answer more quickly when it's a typical instance,

Starting point is 00:18:33 and this is consistent with the semantic network approach, because what we would expect is that Robin, having more properties of a bird would more quickly, more easily activate the bird node than a penguin would, which has fewer of the properties we associate with a bird, and so you'd have to think about it for longer. Okay, so that's an overview of just a few of the pieces of evidence which seem to support the semantic network model of concepts.
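One rough way to picture the typicality effect is to count how many of the properties associated with bird each instance shares. The feature lists and the `support` function below are invented for illustration, and feature overlap is only one possible stand-in for "how easily the bird node gets activated".

```python
# Hypothetical properties associated with the "bird" concept.
bird_features = {"has wings", "has feathers", "can fly", "builds nests", "sings"}

# Hypothetical properties stored for two instances.
instances = {
    "robin": {"has wings", "has feathers", "can fly", "builds nests", "sings"},
    "penguin": {"has wings", "has feathers", "swims"},
}


def support(instance):
    """Fraction of the bird properties that this instance shares."""
    shared = instances[instance] & bird_features
    return len(shared) / len(bird_features)


for name in instances:
    print(name, support(name))
```

In this toy setup robin shares all five properties and penguin only two, so robin would send more activation to the bird node, which fits the faster "yes" responses for typical instances.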

Starting point is 00:18:58 There are others as well, but I think you get the idea. There's a variety of different lines of evidence which seem to point to the fact that semantic networks are playing some sort of important role in our thinking. But you might be wondering, well, how are these semantic network relationships actually instantiated? Or in other words, what are they actually? When we talk about a node, like I said,

Starting point is 00:19:19 a node could be something like a bird or France or anything like that. They could be concepts or ideas or places or things, whatever. But how does that actually work? What actually is a node? Because bird is not one thing. Like a bird has many... There's a lot of complexity in the notion of what a bird is,

Starting point is 00:19:35 so how do we think about that? How is that actually worked out? How is that encapsulated in a node? Well, there are different ways of trying to do this. One, I think, quite promising approach is to think of each node as representing a proposition. So, rather than a node representing an object or a place or something like that, each node is a proposition.

Starting point is 00:19:55 Now, a proposition is a concept from logic. A proposition is just a statement that is either true or false. So Paris is the capital of France. That's a proposition. I am recording a podcast. That's a proposition. Even I like cheese, that's a proposition as well. Or I am related to my mother.

Starting point is 00:20:11 That's a proposition. A statement that's false would be a proposition as well. Some examples of things that are not propositions would be, hey you, that's a statement, but it's not a proposition. Or open the door. That's also not a proposition. That's an injunction. You're telling someone to do something. So propositions are any statements that have truth values, that are either true or false.

Starting point is 00:20:32 So they're statements of fact about the world. And we might not know whether the answer is true or false, but it's still a proposition. So, you know, this coin will land heads. That's a proposition. I don't know whether that's true or false because I have to toss it and see, but it's still a proposition because it has a truth value. Okay, so that's a brief introduction to propositions. So why would we think that nodes are propositions? Well, it turns out that we can represent pretty much all of our semantic knowledge by propositions.
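As a rough sketch of what "a node is a proposition" could look like, each proposition can be stored as one record that ties together a subject, a relation, and an object. The `Proposition` class, its field names, and the `about` helper are assumptions made for this illustration; the example propositions are the ones used in the episode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposition:
    """One node in a propositional network: subject, relation, object."""
    subject: str
    relation: str
    obj: str


propositions = [
    Proposition("bird", "has", "feathers"),
    Proposition("Paris", "is the capital of", "France"),
    Proposition("Bob", "is", "running"),
]


def about(concept, propositions):
    """Return every proposition node that links to the given concept."""
    return [p for p in propositions if concept in (p.subject, p.obj)]


print(about("bird", propositions))
print(about("France", propositions))
```

Each proposition gets exactly one node, and a concept such as bird is reached only through the propositions it takes part in.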

Starting point is 00:20:57 So if I say birds have feathers, that's a proposition. So I could represent that proposition as a node. I could represent it as a relationship between bird and feathers. And the relationship would be that the bird has feathers. So you could think of that node as being the conjunction of three things, a relationship between the concepts and a sort of a subject and object. So the subject is the bird. But what am I saying about the bird about the subject?

Starting point is 00:21:21 Well, I'm saying that it has feathers. So that's the other element that I put in there. And the relationship is has. bird has feathers. So there's sort of three components to that. You can imagine them as lines joining together in a dot, and that dot represents the proposition. The bird has feathers. They all link together. And that node is a proposition, or the proposition is the node. They're the same thing. So these are called propositional networks. Each node represents a single proposition, and each different proposition has its own node. So connections between nodes could take the form

Starting point is 00:21:49 of identifying relationships, objects, and various grammatical relationships. So Bob is running. That would be another proposition that you could have as a node. You can also incorporate distinctions based on timing and tense, so Bob is running versus he was running, or he will run. You could incorporate this into the propositional network. You could also talk about general categories versus tokens, so this is the type-token distinction. So a category is like a chair. Chair is a category. There's lots of different types of chairs. But a token is a specific chair. So the chair that I'm sitting on now, that's a token. That's an instantiation of the general category of chairs. So you can incorporate that into the propositional

Starting point is 00:22:23 networks as well, and there's lots of other complexities and other things you can do with the propositional network. We don't need to go into all the details there. The point is that this is one way that you could instantiate in a more detailed, concrete way, the semantic networks. You could represent each node as a proposition, and there are potentially other ways you can do it as well. At this point I want to discuss briefly the difference between connectionism and computationalism, or at least some relevant differences that are important to understanding this notion of semantic networks, because there are two sort of broad approaches to understanding how conceptual knowledge works, or how the brain thinks in a broad sense,

Starting point is 00:23:00 or how the mind thinks, I guess this isn't even necessarily a thing about the brain. Connectionism is the newer one, and so I'll talk about that, and it's maybe the one that's a little bit harder to understand. The idea of a connectionist approach is that concepts are not represented by specific nodes that represent a whole concept, like you don't have a bird node, or even a proposition node, like a bird has feathers node. Connectionism says no, there's nothing like that. There's no one place in your mind that represents a bird or represents a proposition about a bird.

Starting point is 00:23:32 Instead, each concept or each proposition is represented by a distributed pattern of activation across an entire network. So I might have 100 nodes in my network. Now, previously, I was talking about semantic networks where each node represents a concept or maybe a proposition. But in connectionism, each node no longer represents anything. They're just nodes. You just have these nodes.

Starting point is 00:23:56 They don't represent anything. So forget about what does this node mean? Wrong question. In connectionism, a node doesn't mean anything. It's just a node. So how do we get knowledge from the nodes? The way we go from nodes to knowledge, in a sense, is that each different concept, or maybe a proposition,

Starting point is 00:24:11 will be represented by a particular unique pattern of activation across the different nodes. So let's just suppose I have three nodes. If I activate nodes 1 and 2, maybe that represents bird, or maybe it's a proposition, bird has feathers. Now, instead I activate nodes 1 and 3. Maybe that represents Paris is the capital of France. You see that no particular node represents anything in particular. It's just the pattern of activation.
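The three-node example can be written down directly. The lookup table and the `decode` function are a deliberately tiny illustration of a distributed code, assumed for this sketch; a real connectionist network would use many more nodes and graded activation levels rather than simple on/off values.

```python
# Each pattern of on/off activation over nodes 1, 2, 3 stands for one proposition.
# The assignment of patterns to meanings is arbitrary, as in the episode's example.
patterns = {
    (1, 1, 0): "bird has feathers",
    (1, 0, 1): "Paris is the capital of France",
}


def decode(active_nodes, n_nodes=3):
    """Turn a set of active node numbers into a pattern and look up its meaning."""
    pattern = tuple(1 if i in active_nodes else 0 for i in range(1, n_nodes + 1))
    return patterns.get(pattern)


print(decode({1, 2}))
print(decode({1, 3}))
print(decode({1}))  # node 1 on its own carries no meaning in this scheme
```

Node 1 is active in both meaningful patterns, yet activating it alone decodes to nothing, which is the sense in which no single node represents anything in particular.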

Starting point is 00:24:34 Node 1 was active in both of those cases, but that's okay because node 1 doesn't mean anything in particular. It's just a node like any other. But it's the pattern of activation, which nodes are active, and also maybe how active they are, that matters in determining what the concept is, or what the proposition is that's being represented. Now, of course, you'd have more than three nodes. You'd have hundreds or thousands or millions of nodes in a real one of these networks, but just to illustrate the point. Now, it might seem just completely bizarre to you, if you haven't encountered this idea

Starting point is 00:25:04 before, that we can somehow represent a proposition like Paris is the capital of France just by an essentially arbitrary pattern of activation across nodes. But it turns out that this actually works pretty well, at least for a lot of things. Now this is where the debate between connectionism and computationalism comes in, because some of the computationalist researchers say, well, you know, this connectionist approach is limited. You still have to have the propositions in there somewhere. You still have to have this sort of more top-down approach where there's some particular place that represents the notion or the proposition or the concept that Paris is the capital of France. There's somewhere that you could point to and say that's the Paris node, or maybe it's more than one node, maybe it's a sort of collection of nodes, but there's somewhere that represents Paris. So that would be more the computationalist approach

Starting point is 00:25:45 that there's some sort of localized representation somewhere that you can point to and say this is the Paris part, or this is the Paris is the capital of France part, the representation of it. Whereas connectionists would say, no, no, there's no particular place. It's just the pattern of activation. And every different concept is just a different pattern of activation. And connectionists would also say that when we learn things, all we're doing is establishing the appropriate

Starting point is 00:26:04 connection weights between the different nodes so that we can get the desired pattern of activation. So you need to establish those proper connection weights. Connectionism really sort of exploded in the 1980s, particularly as we gained more computing power to simulate some of these networks and began to get really interesting behavior. So, for example, connectionist networks have been used in pattern recognition. They can be used to decide whether a particular sentence is grammatical or not, and they can be used to decide whether something is a word or not. They're very good for pattern recognition, recognizing whether certain inputs fit a pattern or not.

Starting point is 00:26:38 So they can do lots and lots of different things, and they're very, very useful. I don't have the time in this episode to go into all the details about this, and there are many sort of ins and outs and backwards and forwards in the debate between connectionists and computationalists and how important distributed versus localized representations are. We won't get into all that here. What I just wanted to make sure of is that you sort of have an idea of what connectionism is,

Starting point is 00:27:00 it's the idea that all of the knowledge is distributed across many nodes, and no particular node represents anything in particular, whereas computationalism would be more in the line of the semantic networks, or especially the propositional networks, I talked about earlier, where a particular node represents a particular concept or a particular proposition. So there's a sort of a localized, specific representation for an idea or a proposition. So connectionism, to put it in a different way, is sort of like the network concept taken to an extreme. Not only do we have networks of concepts, but actually the concepts are just the weights in the network. There are actually no concepts by themselves anymore.

Starting point is 00:27:33 We've done away with the concept, and it's now just a network. Again, that might seem just totally weird, but I guess you'll just have to take my word for it or look it up more for yourself that this actually does work. At least it works to a degree, and the debate is to what degree it works. One of the big advantages of connectionism is that it's thought to be very neurologically plausible, because these nodes in the network look a lot like neurons, but it's important to understand that we're not saying that they are the same thing as neurons. We're not literally saying that a node in our abstract connectionist network here is the same thing as a neuron. Maybe it's one neuron, maybe it's 10 neurons, maybe it's a bunch of neurons connected together.

Starting point is 00:28:10 But the idea is that they look similar, so it's plausible to imagine mapping one to the other, whereas it's harder to see how that works for something like a propositional network, because you don't have propositions in the brain, but you do have nodes and connections in the brain. So that's another point of the debate how biologically plausible these different models are. But it's important to bear in mind, again, that although ultimately all of this stuff must be instantiated in some biological substructure in the brain, in this episode we're not really talking about how that happens. So how does a propositional network actually get instantiated in the brain or how does a connectionist network actually get instantiated in the brain?

Starting point is 00:28:45 No one really knows. There are some ideas and theories, and again, it's easier to see that for the connectionist than the propositional networks or the computational approach. But really the point of these models is not to say how that happens, it's just to say how they behave at a higher level. So we're talking about the sort of psychological level rather than the neurological level, for the most part, just to understand that distinction. Okay, so now that we've talked about representing concepts, we'll change gears a little bit and talk a bit about how we categorize things, or sort of unpack this idea of what a concept is and how we understand things.

Starting point is 00:29:18 So the older idea, going back to the ancient Greeks, is that we decide whether a particular example fits into a category by going through a list of necessary and sufficient conditions. So think about a dog, for example. Well, a dog has fur and it has four legs and it, I don't know, fetches a bone when you throw it, whatever. We go through all the criteria, and if it meets all the criteria, then, yep, tick, tick, tick, it's a dog. If it misses any of those, then it's not a dog. Similarly, a chair, well, a chair has to have four legs, you have to sit on it, it has to be a piece of furniture, tick, tick, tick, well, then it's a chair. That's the old idea. Now, it doesn't really take very much thought at all to realize that this idea is hopelessly inadequate for describing almost any real concept that we have, because almost all concepts that we actually use

Starting point is 00:30:01 do not have necessary and sufficient criteria. So, for example, if I take a dog, well, what if I shaved its hair? Is it still a dog? Of course it's still a dog, but now it doesn't have hair. Well, what if it lost a leg? Now it doesn't have four legs, it only has three, but it's still a dog. How big does a dog have to be? Well, actually, dogs can be all sorts of different sizes. I mean, there are some things that dogs do have to be. They do have to be a mammal, for example, although that's a different level of categorization. But the point is, it's pretty hard to find too many things that we think are very important that a dog has to have. Even if you start talking about, say, genetic code or something like that, well, you know, dogs aren't genetically

Starting point is 00:30:32 identical, and if you want to say that it's this particular gene which defines dogness, that seems kind of odd because that's not really what we think about. When we think of a dog, it's not this particular gene. We don't even know what this gene is, or what if this dog has a mutant of that gene, then is it not a dog anymore? No. So it doesn't seem like you can find any one specific criterion or a list of specific criteria that something must meet to be a dog, and that if it meets that, it then will be a dog necessarily. The same thing if we think about a concept like a game. What is a game? Trying to define a game is actually very hard. What is it that hopscotch and tennis have in common?

Starting point is 00:31:07 Maybe football would be another example. There's not that much that they have in common. You could say they're played for fun. Well, professional sports aren't played for fun. They're played by children. Well, some games are played by children. Many games are not played by children. They're team games.

Starting point is 00:31:19 No, not all games are team games. They're competitive. Well, Solitaire is a game, but that's not competitive. It's pretty hard to find anything, really, that all games fit with. So what do we do with this? It seems that it's pretty difficult to figure out how exactly we categorize things when there are no necessary or sufficient conditions that need to be met. So how do we decide if something is a game or if something is a dog,

Starting point is 00:31:41 or if something is a person, or if something is a piece of furniture, an occupation, that's another example, what counts as an occupation, all of these different categorizations. How do we decide whether something fits that or not? Well, Wittgenstein, who is actually a philosopher, but I think this idea applies quite well to psychology as well, came up with an idea called family resemblance, and I think this largely solves the mystery, allows us to think about how we categorize things. The idea is that there's no single thing that everything has to meet in order to be a game or a dog. There is a large class of different properties that they can have. The more of these properties that you have, the more like a game it is.

Starting point is 00:32:17 And if it has sort of enough in some vague sense, then we call it a game. So what are some of the things that we think, some of the properties, overlapping similarities is another phrase that we can use? What are some of the overlapping similarities that we expect a game to have? One is that it's competitive. Another one is you play it for fun. Another one is that it involves a team. Maybe it involves exercise in some way. And it has specific rules, and on and on.

Starting point is 00:32:41 So these might all be the various overlapping similarities. And we expect games to have at least a number of these. They don't have to have all of them. But they probably have to have at least some of them. If they don't have any of them, they're probably not going to be a game. But they don't have to have all of them, nor is there any particular one that they must have. So the idea of a family resemblance is just that you literally think about a family. Often we'll say members of a family resemble each other, but there might not be any particular

Starting point is 00:33:07 feature that they all have, or there's no one particular thing. Maybe they have different hair colours and they have different nose shapes and they have different eye colour, whatever, but they still look like each other because there's this pattern of overlapping similarities which sort of they all broadly share, but they may not share in exactly the same way, and there may not be any particular characteristic that they all share. So this notion of family resemblance allows us to think about phenomena like fuzzy boundaries and graded membership. So graded membership would mean that there's no sharp question of whether something is a game or not a game.
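
The family resemblance idea, and the graded membership that comes with it, can be sketched as a simple count of overlapping features. This is only a toy illustration under stated assumptions: the feature list follows the properties mentioned in the episode, but the feature assignments for each activity, the "filing taxes" item and the `resemblance` function are invented for the example.

```python
# Family resemblance as feature overlap (hypothetical toy example).
# No single feature is required; membership is a matter of degree.

GAME_FEATURES = {
    "competitive",
    "played for fun",
    "team-based",
    "physical exercise",
    "has rules",
}

# Invented feature assignments, for illustration only.
activities = {
    "football": {"competitive", "played for fun", "team-based",
                 "physical exercise", "has rules"},
    "solitaire": {"played for fun", "has rules"},
    "filing taxes": {"has rules"},
}

def resemblance(features, family=GAME_FEATURES):
    """Fraction of the family's typical features that an item shares."""
    return len(features & family) / len(family)

for name, features in activities.items():
    print(f"{name}: {resemblance(features):.1f}")
```

In this sketch football shares every feature, Solitaire shares some, and filing taxes shares almost none, so the scores fall along a gradient rather than into a yes or no. Where to draw the line for "enough" features is deliberately left open, which is the fuzzy boundary.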

Starting point is 00:33:37 It's more or less like a game. So Solitaire, for example, might be something that's less like a game because it's not played with other people and it doesn't involve any sort of physical activity. Maybe that makes it less like a game. Or sport is actually another good example of that. What is a sport? You know, everyone would agree that football or basketball are sports, but is cheerleading a sport? This is sort of an example that people might debate about periodically, because it has many

Starting point is 00:34:03 of the characteristics we recognise in a sport, but it seems to not share some others that, well, arguably it doesn't share some others that people think of. What about chess? Is chess a sport? Probably a lot of people would say no, maybe because it doesn't involve physical activity, but, you know, does lawn bowls involve physical activity? Well, not so much. So, you know, one can argue about how well these different things

Starting point is 00:34:23 fit in. But that's exactly the point. That's what we mean by graded membership. Some things fit really well with our notion of the category, and other things fit sort of okay, and other things don't fit very well at all. And then there are other things that are just clearly outside the category, like computers are not sports. A computer is not even an activity. It just doesn't even make sense. So clearly that's not a sport. Clearly that's outside. So there are some things that are clearly inside the category, some things that are clearly outside, and some things that are sort of on the borderline. So understanding this in more detail is what leads us into prototype and exemplar theory. These are two different ways of understanding graded membership

Starting point is 00:34:55 or of how we categorize things. So the idea of prototype theory is a prototype is supposed to be a sort of an idealized version or an abstract ideal, which is thought of as an average of all past observed instances of that category. So a prototype for a dog would be like my average of all the dogs I've ever seen. In fact, when you think about a dog or if you think about a car or something like that, you're probably thinking about a prototype. It's probably some sort of average of all the cars you've seen before. It's sort of like an abstract ideal. There's no car that looks exactly like your prototype of a car. It's not the same as any particular car you've seen. It's sort of like an average. That's your prototype of a car. So here's the

Starting point is 00:35:30 idea. When someone asks you, is this a car or is this a dog or is this a sport, what you do is you compare that particular instance to your prototype and see how well they match up. If they match up very closely, then you say, yeah, that's a sport or that's a car or that's a dog. If they match up kind of okay, you say, well, yeah, maybe it is. That's kind of a borderline case. If they don't match at all, you say, no, that's not a sport or a dog. So the idea of prototype theory is that you compare the particular instance to your prototype, your average across past cases, and see how closely they match. There are a number of pieces of evidence in favor of prototype theory, that is,

Starting point is 00:36:05 evidence that seems to point to the fact that people do actually think like this. So some of them are similar to what we've seen before. Take sentence verification, for example. I give people a list of sentences, so some are real sentences, some are not grammatical, and they have to say, is this a sentence or is this not a sentence? So it's similar to the lexical decision task, but with sentences. What I will do is I'll have examples like robins have feathers, or robins are birds.

Starting point is 00:36:32 People are faster to say yes, to assent that that is a real sentence, when the item is a typical example of the category, like a robin, and they're slower when you have sentences like a penguin has feathers or a penguin is a bird. It's still a bird, but it's not quite as representative. It's further away from our prototype of a bird, and so therefore it takes us a little bit more time to think about it. This is evidence for prototype theory. It's evidence that we're comparing different attributes of what's being mentioned to our prototype, and the closer it is, the easier it is for us to say, yes, that is an instance of that class.

Starting point is 00:37:07 Typicality studies are particularly interesting. This is when you just ask people straight up to give you a rating of how typical something is. So how fruity is this particular fruit, or how birdy is it? How sporty is it? And it turns out that people give very consistent ratings. Often it's on a 1 to 7 scale. So on a scale of 1 to 7, how birdy is a robin? People tend to give high sixes, because a robin is very birdy, apparently.

Starting point is 00:37:30 How birdy is an eagle? A little bit less than a robin, but still pretty birdy. How birdy is a penguin? Much less birdy. What about a dodo? Again, much less birdy. Things that can't fly, or that swim, don't meet our criteria for birds quite so much, so they're less birdy. Similarly, fruits, again, just think right now, what's the most typical fruit? What's sort of the most fruity fruit if you had to think of one? Probably you're thinking of apple. Most people, at least most Westerners, think of that, or maybe orange. Those are high up on the list. I think

Starting point is 00:37:59 pear as well is a very fruity fruit. Watermelon is less of a fruity fruit. Fig is another one that's sort of low on the list. Now again, maybe you're sort of weird and think about things in different ways to most people, which is fine, of course, there's no right answer for this. But the point is that there is a lot of reliability in these studies in terms of the ratings that people give things, which seems to indicate that there's some real sense in which we're comparing our notions of different fruits to a common prototype. There's a real sense in which a watermelon is less fruity than an apple. It just doesn't fit the prototype quite so well. Okay, so that's some evidence for prototype theory.
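
The prototype comparison described above can be sketched in a few lines: the prototype is the average of previously seen instances, and typicality is how close a new item sits to that average. Everything specific here is an assumption for illustration: the three features, their numeric values, the sample of remembered birds, and the function names `prototype` and `distance`.

```python
import math

def prototype(instances):
    """Average each feature across all remembered instances."""
    n = len(instances)
    return [sum(column) / n for column in zip(*instances)]

def distance(a, b):
    """Euclidean distance between two feature vectors (smaller = more typical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical features: (can fly, sings, body size from 0 to 1).
seen_birds = [
    (1.0, 1.0, 0.1),  # a robin-like bird
    (1.0, 1.0, 0.2),  # a sparrow-like bird
    (1.0, 0.0, 0.6),  # an eagle-like bird
    (1.0, 1.0, 0.1),  # another robin-like bird
]
bird_prototype = prototype(seen_birds)

robin = (1.0, 1.0, 0.1)
penguin = (0.0, 0.0, 0.5)

print("robin distance:  ", round(distance(robin, bird_prototype), 2))
print("penguin distance:", round(distance(penguin, bird_prototype), 2))
```

With these made-up numbers the robin lands much closer to the prototype than the penguin, which is the pattern the typicality ratings and the slower penguin responses would suggest. Note that the prototype itself matches none of the remembered birds exactly.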

Starting point is 00:38:34 Let me now talk a bit about exemplar theory. Really, exemplar theory is very similar to prototype theory. Both of them say that we compare particular instances to some sort of ideal and see how well they match, and if they match up then we say yes, this is in the category, otherwise we say no. The difference is what do we compare them to? What is the ideal that we compare them to? In prototype theory, the idea is that we compare a particular instance to an average

Starting point is 00:38:58 of what we've seen before. So we compare a particular fruit to an average of all past fruit that we've seen, and that's our prototype, or an average of all past dogs that we've seen. In exemplar theory, we don't do that. We don't have an average. Instead, what we have is a particular instantiation. There's a particular dog or a particular fruit or a particular country, which represents a country, for example. There's one specific one that we keep as our exemplar, and we

Starting point is 00:39:20 compare everything to that. If it looks like the exemplar, then we say, yes, it's in the category. If not, then we say, no, it's not in the category. So an example here might be for works of art. Maybe people think about this in terms of works of art. Maybe when someone asks you, is this art, you compare it to the Mona Lisa. Maybe the Mona Lisa is your exemplar of art, or maybe it's Michelangelo's statue of David, for example. These might be exemplars of art. The more it looks like that, the more likely you are to say, yes, it's art.

Starting point is 00:39:46 That's an exemplar, because it's not an average across different works of art. It's a specific one that you've seen before and you know about, so you're comparing it to a specific artwork and not to sort of a general prototype, which is an average. Exemplar theory can be seen as in opposition to prototype theory, but in fact

Starting point is 00:40:04 there's evidence for both of them, evidence that we use prototypes and exemplars, just depending on the circumstance. Some evidence seems to indicate that if we have limited exposure to something, so we've only seen a few examples, we use exemplars, because we've only seen, like, a few examples, so we just pick one of them and we think that that's our exemplar. But if we've seen many, many, many instances of something like cars or dogs, we've all seen heaps of those, we can't remember specific examples very well, and so instead we have prototypes. So prototypes, if we have a lot of experience with something, exemplars, if not so much

Starting point is 00:40:32 experience. But that's an oversimplification as well, because it probably depends on the domain. Probably something like artwork, where things are more individual, in a sense. It doesn't make so much sense to have an average artwork. It's not even clear what that would mean. But it does make sense to have an average car or an average dog, or at least that seems to be more sensible than an average artwork. So maybe it depends on the domain that we're talking about, whether we think in terms of exemplars versus prototypes.
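
For contrast with averaging, here is a minimal sketch of exemplar-style categorization as described in this episode: each category keeps one or a few specific remembered instances, and a new item goes with whichever stored instance it most resembles. The two features, the stored values, and the names `similarity` and `categorize` are all hypothetical, chosen only to make the example run.

```python
def similarity(a, b):
    """Similarity falls off as the city-block distance between items grows."""
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)))

# Hypothetical features: (craftsmanship, expressive intent), each from 0 to 1.
# Each category stores specific remembered instances, never an average.
exemplars = {
    "art": [(0.9, 0.9)],                     # one famous remembered work
    "furniture": [(0.6, 0.1), (0.4, 0.0)],   # two remembered pieces
}

def categorize(item):
    """Return the category of the single most similar stored exemplar."""
    best_score, best_label = max(
        (similarity(item, exemplar), label)
        for label, stored in exemplars.items()
        for exemplar in stored
    )
    return best_label

print(categorize((0.8, 0.7)))
print(categorize((0.5, 0.1)))
```

The only structural difference from a prototype model is what the new item is compared against: remembered individuals here, rather than a single averaged summary.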

Starting point is 00:40:55 But the basic idea is still the same, that you compare particular instances to the ideal, be that an exemplar or a prototype. This approach seems quite useful, quite compelling. It allows us to understand how we think about things and explains a lot of psychological evidence, again, about how people are quicker to respond if it's a typical instance of the category, or they're more likely to produce that as an example, things like that. But there are some problems with these approaches.

Starting point is 00:41:20 One is that, and this is a really big problem, it's the problem of specifying which criteria are relevant. So if I'm asking, how similar is an apple to a banana? The answer is, of course, well, it depends what criteria you look at. If you look at, say, length or color, they might be completely different. But maybe you look at the consistency or the taste, and they're kind of similar as far as things go, or maybe where they come from, well, they both grow on plants, so that's pretty similar. So the point is it depends how you look at it. There are elements where they're the same, and there are elements where they're different.

Starting point is 00:41:48 So it doesn't seem that there's any sort of straightforward way of comparing a particular instance to our prototype or to our exemplar, because it depends how you compare them. They could be very similar in some ways and very different in other ways. Which is more important? How do you weigh these things up? What do you focus on? So we have to have some way of explaining that, of how people look at things.
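
The weighting problem can be made concrete with a small sketch in which the same two items come out as similar or dissimilar depending purely on which features are counted. The feature names, the numeric values for apple and banana, and the `weighted_similarity` function are assumptions invented for this illustration.

```python
# Hypothetical feature values on a 0-to-1 scale, for illustration only.
apple = {"length": 0.2, "color": 0.1, "sweetness": 0.7, "grows_on_plant": 1.0}
banana = {"length": 0.8, "color": 0.9, "sweetness": 0.8, "grows_on_plant": 1.0}

def weighted_similarity(a, b, weights):
    """1 minus the weighted average difference over the features being attended to."""
    total = sum(weights.values())
    difference = sum(w * abs(a[k] - b[k]) for k, w in weights.items())
    return 1.0 - difference / total

# Two different choices about which criteria matter.
appearance = {"length": 1.0, "color": 1.0}
taste_and_origin = {"sweetness": 1.0, "grows_on_plant": 1.0}

print(round(weighted_similarity(apple, banana, appearance), 2))
print(round(weighted_similarity(apple, banana, taste_and_origin), 2))
```

Nothing in the similarity calculation itself says which set of weights is the right one. That choice has to come from somewhere outside the comparison, which is exactly the gap this objection points to.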

Starting point is 00:42:06 Another problem with resemblance approaches, and this is quite interesting, is if you remember the typicality judgments that I mentioned before, how fruity is this, or how birdy is this, people are very good at giving those ratings and the ratings are quite reliable across people. They give quite consistent ratings about how representative

Starting point is 00:42:22 particular things are of a category, but it turns out that people also will give very consistent ratings for concepts that do have hard and sharp definitions. So, you know, fruit and sports don't have hard and fast definitions, so we might think graded membership is the way to go, but odd and even numbers have a very specific criterion. We can say definitely

Starting point is 00:42:42 whether something is an odd or an even number. But if you ask people, how even is this number, or how odd is this number, how sort of well does it fit into that? People give very consistent ratings for that as well. It turns out that lower numbers get higher ratings, so three is a very sort of characteristically odd number, whereas 101 is not quite so characteristically odd, and so that gets a lower rating. So people still give very good and consistent ratings for these things, but the trouble is that that doesn't indicate that there's actually some graded membership there. So there seems to be some sort of dissociation. The fact that people can rate typicality doesn't necessarily mean that it's mapping directly

Starting point is 00:43:15 into how we think about it. Because obviously people know that numbers are either even or odd. We don't actually think that they have graded membership in that category. So that casts a little bit of doubt on that line of evidence, at least, potentially. Another objection to these resemblance approaches is that there is not always a very strong resemblance between a prototype and a category member. So sometimes we have these sort of examples that we think are sort of super typical. Abraham Lincoln might be a good example.

Starting point is 00:43:39 So Abraham Lincoln is clearly an American. And in fact, arguably, many people would say he was like, he's sort of like the ideal American. He's sort of more American than most Americans are, if you sort of get what I'm saying here. But he's not a typical American. So if we just look at sort of a typicality comparison and say how close does Abraham Lincoln compare to our sort of typical or a prototype American, you'd say, well, he's not very similar at all, really. but yet that doesn't mean that we say Abraham Lincoln's less of an American. Actually, he's maybe in some sense more of an American. Or you might say the same thing about a work of art.

Starting point is 00:44:09 Like, how does the Mona Lisa compare to an average work of art? Well, it's actually a lot... It's more of a work of art than most works of art. But that doesn't mean that we think it's less of a work of art because it's further away from the typical. One way of making sense of that might be thinking about it in terms of exemplars rather than prototypes.

Starting point is 00:44:25 So perhaps our idea of an American is... I think that the exemplar American would be maybe Washington, or Lincoln or something like that. They're the exemplars. They sort of embody everything that an American sort of is and should be. And the closer you are to that, the more American you are in some sense, rather than a prototype of the actual typical American, which would be an average of all the Americans you've met before. So that might be one way of explaining how you could have these sort of, yeah, these cases where the object is very clearly a member of the category, but at the same time is different from most of the

Starting point is 00:44:56 typical members as well. There's also another problem with the resemblance-based approaches in that we think about different things differently. So, for example, children will generally say that even if a skunk has its fur dyed and was trained to climb trees and had its smell disguised and all these other things, it would still not become a raccoon. So you ask children this and they say, no, it's still not a raccoon. It doesn't matter how much it behaves like a raccoon or it looks like a raccoon. It's still not a raccoon.

Starting point is 00:45:21 So it's clear that children are not just making judgments about superficial characteristics. There's something sort of deeper that they're making a decision about, about how to categorize it. And people seem to reason like that in regards to natural objects a lot more than artifacts or man-made objects. So if you make a toaster look like a hairdryer and act like a hairdryer, then many people, children in particular, will happily say, yes, it is a hairdryer. But if you make a skunk look and behave like a raccoon, then people say, no, it's still not a raccoon. So people treat things differently depending on what sort of category you put it into. And there's actually some interesting neurological evidence about how we actually think differently

Starting point is 00:45:57 or using different parts of the brain about natural objects or entities versus man-made ones, but that takes us a bit far afield. So these various problems with the resemblance theory lead us to our final concept that we're going to talk about, and this is the notion of concepts as theories. Now, a theory is what we might call a model or a story, that's a loose way of thinking about it. It's a way of understanding something. So the theory theory of concepts states that we think about concepts based on our sort of lay theories about them, how we understand them, and not just as a sort of

Starting point is 00:46:33 list of facts or of properties about them. And so this background causal knowledge allows room for exceptions and also allows us to decide what's important and what's not important. So this allows us to meet the objection that we noted before about, well, how do you decide what, you know, there's lots of ways that you can compare an apple and a banana, how do you decide what's important? Well, it's this background causal knowledge about what matters for fruits that allows us to make that determination. So it's our theory of fruit that allows us to think about what's important.

Starting point is 00:47:00 So our theory of fruit tells us that it doesn't matter what color the fruit is. I could paint it whatever color I like. What matters is where it came from and whether you can eat it. And what other types of fruit that it goes with. That also matters whether it's a fruit because we don't normally think of a tomato as being a fruit, even though that sort of is a fruit.

Starting point is 00:47:15 So it's our causal background knowledge, our theory of fruits that allows us to make the categorization. Similarly, if we consider the children thinking about the skunk versus the raccoon. Even a child has some sort of basic background understanding of animals, like that a skunk comes from skunk parents, and that it has some sort of skunk essence, sort of something that makes it a skunk.

Starting point is 00:47:39 This might be some sort of a notion like DNA. Of course, children don't know about DNA, but I think that there's some sort of essentialist notion that there's something about it that makes it a skunk, maybe its parents or how it's raised or something like that. And it doesn't matter if you dye its fur or train it in some way, it's still a skunk. There's still some skunkness to it. So again, that comes out of our theory, our causal theory, of how things work,

Starting point is 00:48:00 or of where they come from, or what they're for. So there's some interesting evidence in favor of this theory theory of concepts. By the way, it is literally called theory-dash-theory. It's the theory theory of concepts. Interesting name, but anyway. One piece of evidence comes from how physicians diagnose various diseases. So, again, when you ask a physician, well, what disease is this? There are many criteria that they could potentially appeal to. So according to prototype or exemplar theory,

Starting point is 00:48:27 they'd be comparing this particular disease and the symptoms and whatever else, the case history. They'd be comparing that to some ideal type or some exemplar that they have in their memory. But then we have to ask the question, on what basis do they make this comparison? What properties do they emphasize and what do they de-emphasize and what's more important? How they make that decision depends on their theory of the disease.

Starting point is 00:48:49 So if they think that the disease is caused by, I don't know, a particular behavior, for example, or maybe it's a particular bacterial infection or whatever, they have a certain causal knowledge of the disease. They'll say, well, that means that this symptom, for example, is really important. Because if you don't have this symptom, then it can't possibly be the disease because the bacteria directly causes that symptom. And you might have other secondary symptoms, but this particular symptom is very important because it's directly related to the cause of the disease. But if I think the cause is different, if I think the cause is behavioral, for example, if I think this is caused by alcohol, and the person doesn't drink, then I'm much less likely to say that this particular condition fits this disease, even if it looks like this disease in many other ways, because I don't

Starting point is 00:49:29 think the key causal element's there, and so I'm going to consider my comparison differently. And indeed, psychologists have studied this, and it does seem that physicians are using this background causal knowledge about how the disease works and how diseases work in general, in order to make these categorizations. And as I said before, it's also consistent with the fact that people think differently about, say, natural objects than they do about artifacts, because they have different sort of background beliefs about where they come from and how they work. We can change artifacts to make them new artifacts, but you can't change natural things to make them different natural things in quite the same way. So we categorize them differently. Right, so that's actually all I wanted to talk about in this episode, although we have gone a little bit over time.

Starting point is 00:50:10 So I'll end it off here. Hopefully that wasn't too confusing because some of these concepts were a bit abstract. If you enjoyed this episode, send me an email. My address is Fods12 at gmail.com. That's F-O-D-S-1-2 at gmail.com. Also, if you could jump onto iTunes and give the podcast a favourable review or even just a star rating, it'd be very much appreciated. That really helps to raise the visibility of the podcast.

Starting point is 00:50:34 I haven't got a rating in quite a while now, so I'd much appreciate one. Just a few minutes of your time to do that. It's much appreciated. You can also jump on Facebook and search for the Science of Everything podcast and give the page a like. That's another way of raising the visibility of the podcast. I also post news about upcoming episodes, and especially I post visual material to accompany each episode there, so you can check that out if some of the things I'm saying are a bit confusing without a visual aid. So I do that as well. Thanks for listening and I'll talk to you next time.
