Lex Fridman Podcast - #222 – Jay McClelland: Neural Networks and the Emergence of Cognition

Episode Date: September 20, 2021

Jay McClelland is a cognitive scientist at Stanford. Please support this podcast by checking out our sponsors: - Paperspace: https://gradient.run/lex to get $15 credit - Skiff: https://skiff.org/lex to get early access - Uprising Food: https://uprisingfood.com/lex to get $10 off 1st starter bundle - Four Sigmatic: https://foursigmatic.com/lex and use code LexPod to get up to 60% off - Onnit: https://lexfridman.com/onnit to get up to 10% off EPISODE LINKS: Jay's Website: https://stanford.edu/~jlmcc/ PODCAST INFO: Podcast website: https://lexfridman.com/podcast Apple Podcasts: https://apple.co/2lwqZIr Spotify: https://spoti.fi/2nEwCF8 RSS: https://lexfridman.com/feed/podcast/ YouTube Full Episodes: https://youtube.com/lexfridman YouTube Clips: https://youtube.com/lexclips SUPPORT & CONNECT: - Check out the sponsors above, it's the best way to support this podcast - Support on Patreon: https://www.patreon.com/lexfridman - Twitter: https://twitter.com/lexfridman - Instagram: https://www.instagram.com/lexfridman - LinkedIn: https://www.linkedin.com/in/lexfridman - Facebook: https://www.facebook.com/lexfridman - Medium: https://medium.com/@lexfridman OUTLINE: Here's the timestamps for the episode. On some podcast players you should be able to click the timestamp to jump to that time. (00:00) - Introduction (07:12) - Beauty in neural networks (11:31) - Darwin and evolution (17:16) - The origin of intelligence (23:58) - Explorations in cognition (30:02) - Learning representations by back-propagating errors (36:27) - Dave Rumelhart and cognitive modeling (49:30) - Connectionism (1:12:23) - Geoffrey Hinton (1:14:19) - Learning in a neural network (1:31:11) - Mathematics & reality (1:38:19) - Modeling intelligence (1:48:57) - Noam Chomsky and linguistic cognition (2:03:18) - Advice for young people (2:14:26) - Psychiatry and exploring the mind (2:27:04) - Legacy (2:32:53) - Meaning of life

Transcript
Starting point is 00:00:00 The following is a conversation with Jay McClelland, a cognitive scientist at Stanford and one of the seminal figures in the history of artificial intelligence and specifically neural networks. He wrote the Parallel Distributed Processing book with David Rumelhart, who co-authored the backpropagation paper with Geoff Hinton. In their collaborations, they've paved the way for many of the ideas at the center of the neural network-based machine learning revolution of the past 15 years. To support this podcast, please check out our sponsors in the description.
Starting point is 00:00:37 As usual, I'll do a few minutes of ads now, no ads in the middle, I try to make these interesting, so hopefully you don't skip, but if you do, please still check out the sponsor links in the description, it's the best way to support this podcast. I use their stuff, I enjoy it, maybe you will too. This show is brought to you by Paperspace Gradient, which is a platform that lets you build, train, and deploy machine learning models of any size and complexity. I love how powerful and intuitive it is. I'm likely going to use Paperspace for a couple of machine learning experiments I'm doing as part of an upcoming video. Fast AI, maybe you've heard of them.
Starting point is 00:01:15 Jeremy Howard runs it. It's a course I highly recommend. And Fast AI uses Paperspace Gradient. You can host notebooks on there, you can swap out compute instances at any time, start on a small-scale GPU instance or even a CPU instance,
Starting point is 00:01:30 and then swap out once your compute needs increase. I'm really excited about what they're calling Workflows, which provides a way to automate machine learning pipelines on top of Gradient compute infrastructure. It makes it really easy to build a production app because all the orchestration is reduced to a simple configuration file, a YAML file. To give Gradient a try, visit gradient.run slash lex and use the signup link there.
Starting point is 00:01:57 You'll get $15 in free credits, which you can use to power your next machine learning application, that's gradient.run slash lex. This show is also brought to you by Skiff, an end-to-end encrypted and decentralized collaboration platform built for privacy from the ground up. What Signal is to messaging, Skiff is to document writing and collaboration. It's like Google Docs, but with a lot more security features,
Starting point is 00:02:23 and actually it has a bunch more usability features I speak from experience and from a place of respect and love. I'm a big user of Google Docs Probably have over a thousand documents on there. I also use ever-note notion Google keep for Various kinds of note-taking, but I really love the interface that Skip has and obviously the security features are just unparalleled. On Skip only you can decrypt your data. No one, not even Skip can never see it. If you like using Signal, which I do, you will love using Skip.
Starting point is 00:02:56 They're offering listeners of this podcast early access to the platform. You get to Skip, they're over 60,000 person waiting lists. Sign up at Skipipsbeta. The URL is skiff.org slash Lex. Again, go to skiff.org slash Lex to sign up for the early access. This show is also brought to you by a new sponsor and an amazing one. It just blew my mind.
Starting point is 00:03:19 It's called Uprising Food, the maker of low carb, keto-friendly bread, bread and recently they're also making keto friendly chips. The bread is only two net carbs preserving, six grams of protein and nine grams of fiber. When they first sent me the bread as a pitch to see if they want to sponsor the podcast, I thought looking at the nutrition, I thought there's no way this could taste good. I'm somebody who loves bread, but because I really love the way I feel on low carb diets, I stick to the low carb diet. So I open this bread, it's shaped like a cube, which already feels like the future. And I cut a few slices and just ate it for the pure ingredient,
Starting point is 00:04:04 just to see what it tastes like and it was incredible. From my perspective, we need to know it's keto friendly and it tastes delicious. I highly, highly recommend it if you want to eat bread but want to eat it in a healthy way. Get $10 off the starter bundle that includes two superfood cubes. I guess they're called super food cubes. And four packs of the freedom chips.
Starting point is 00:04:29 Again, the chips are called freedom chips. Good marketing. Go to uprisingfood.com slash Lex and the discount will be automatically applied at checkout. That's uprisingfood.com slash Lex. This show is also brought to you by Forsegmatic, the maker of delicious mushroom coffee and plant-based protein. Does the coffee taste like mushrooms you ask? No, it does not.
Starting point is 00:04:55 But it is a big part of my morning ritual. I make the coffee. I'm listening to Brown Noise now as I think about what I'm going to do in the first deep work session of the day, calmly walk around, make the coffee, get glass of water, drink that, and then go to the computer and start taking on the day with the aroma of force agmatic coffee in the air. I feel like Pavlov's dog with a warm cup and a robot that just switches my brain on, that it's not time to get to work.
Starting point is 00:05:30 Plus the brown noise is really focusing and I'm ready to take on the day. Those first few hours are in terms of depth, focus and clarity of thinking are unparalleled for me. Get up to 40% off and free shipping on mushroom coffee bundles if you go to 4sigmatic.com slash Lex, that's 4sigmatic.com slash Lex. Speaking of productivity, this episode is also brought to you by Onit, Nutrition Supplement and Fitness Company. They make Alpha Brain, which is a new tropic that helps support memory, mental
Starting point is 00:06:05 speed and focus. I use it when I need a boost, when I have a difficult, deep work sessions coming up, and I really want to make sure that I go deep, stay there for a while with clarity and focus. I'll take an Alpha Brain. So it's not like part of my daily ritual. I use it as like a jet pack for the mind. It's not necessarily when my mind is feeling tired because that a good nap can usually fix, but when it's feeling pretty good, but I just anticipate a really difficult session. That's what I'll take off. Alpha brain. Anyway, go to lexfreedman.com slash on it to get up to 10% off alpha brain. That's LexFreedman.com slash on it. This is the LexFreedman Podcast and here is my conversation with Jay McClelland. You are one of the seminal figures in the history of neural networks.
Starting point is 00:07:16 At the intersection of Cognus Psychology and Computer Science, what to you has over the decades emerged as the most beautiful aspect of on your own networks, both artificial and biological? The fundamental thing I think about with their own networks is how they allow us to link biology with the mysteries of thought. And you know, when I was first entering the field myself in the late 60s, early 70s, cognitive psychology had just become a field. There was a book published in 67 called cognitive psychology. And the author said that, you know, the study of the nervous system was only of peripheral interest.
Starting point is 00:08:13 It wasn't gonna tell us anything about the mind. And I didn't agree with that. I always felt, oh, look, I'm a physical being. I from dust to dust, you know, ashes to ashes and somehow I emerged from that. So that's really interesting. So there was a sense with cognitive psychology that in understanding the sort of neuronal structure of things, you're not going to be able to understand the mind. And then your sense is if we study these networks,
Starting point is 00:08:52 we might be able to get at least very close to understanding the fundamentals of the human mind. Yeah. I used to think, where I used to talk about the idea of awakening from the Cartesian dream. So Descartes, youcartes thought about these things, right? He was walking in the gardens of Versailles one day and he stepped on a stone and a statue moved. And he walked a little further, he stepped on another stone and another statue moved.
Starting point is 00:09:25 And he, like, why did the statue move when I stepped on the stone? And he went and talked to the gardeners, and he found out that they had a hydraulic system that allowed the physical contact with the stone to cause water to flow in various directions, which caused water to flow into the statue and move the statue. And he used this as the beginnings of a theory about how animals act. And he had this notion that these little fibers that people had identified that weren't carrying the blood, you know, were these little hydraulic tubes that if you touched something that would be pressure and it would send a signal of pressure to the other parts of the system and that would cause action.
Starting point is 00:10:17 So, he had a mechanistic theory of animal behavior. And he thought that the human had this animal body, but that some divine something else had to have come down and been placed in him to give him the ability to think. Right? So the physical world includes the body in action, but it doesn't include thought according to Descartes, right? And so the study of physiology at that time was the study of sensory systems and motor systems and things that you could directly measure when you stimulated neurons and stuff like that. And the study of cognition was something that was tied in with abstract
Starting point is 00:11:08 computer algorithms and things like that. But when I was in undergraduate, I learned about the physiological mechanisms. And so when I'm studying cognitive psychology as a first year PhD student, I'm saying, wait a minute, the whole thing is biological. You had that intuition right away, that was seemed obvious to you. Yeah, yeah. It's not magical though, that from just the little bit of biology can emerge the full beauty of the human experience, why is that so obvious to you? Well, obvious and not obvious at the same time. And I think about Darwin in this context too,
Starting point is 00:11:49 because Darwin knew very early on that none of the ideas that anybody had ever offered gave him a sense of understanding how evolution could have worked. But he wanted to figure out how it could have worked. But he wanted to figure out how it could have worked. That was his goal. And he spent a lot of time working on this idea and reading about things that gave him hints and thinking they were interesting, but not knowing why and drawing more and more pictures of different birds that differ slightly from each other
Starting point is 00:12:28 and so on, you know, and then he figured it out. But after he figured it out, he had nightmares about it. He would dream about the complexity of the eye and the arguments that people had given about how ridiculous it was to imagine that that could have ever emerged from some sort of, you know, unguided process, right, that it hadn't been the product of design. And so he didn't publish for a long time, in part because he was scared of his own ideas. He didn't think they could probably possibly be true. But then, you know, by the time the 20th century rolls around,
Starting point is 00:13:15 we all, you know, we understand that, or many people understand or believe that evolution produced, you know, the entire range of animals that there are. And, you know, so what? Oh, you know somebody comes. Oh, there's a certain part of the brain that's still different. They don't you know, there's no hippocampus in the monkey brain. It's only in the human brain and Huxley had to do a surgery in front of many, many people in the late 19th century to show to them there's actually a hippocampus in the chimpanzee's brain, you know. So their continuity of the species is another element that contributes to this sort of idea that we are ourselves total product of nature. And that, to me, is the magic in the mystery how nature could actually give rise to organisms that have the capabilities that we have.
Starting point is 00:14:49 So it's interesting because even the idea of evolution is hard for me to keep all together in my mind. So because we think of a human timescale, it's hard to imagine that the development of the human eye will give me nightmares too. Because you have to think across many, many, many generations. And it's very tempting to think about a growth of a complicated object and it's like, how is it possible for that such a thing to be built? Because also, me from a robotics engineering perspective, it's very hard to build these systems. How can, through an undirected process, can a complex thing be designed? It seems not, it seems wrong.
Starting point is 00:15:32 Yeah. So that's absolutely right. And you know, a slightly different career path that would have been equally interesting to me would have been to actually study the process of embryological development, flowing on into brain development and the exquisite sort of laying down of pathways and so on that occurs in the brain. And I know the slightest bit about that, it's not my field, but there are fascinating aspects to this process that eventually result in the you know, the complexity of various brains. At least, you know, one thing where in the field, I think people have felt for a long time. In the study of vision, the continuity between humans and non-human animals has been has been second nature for a lot longer. I was having, I had this conversation with somebody who's a vision scientist
Starting point is 00:16:46 and you say, oh, we don't have any problem with this. You know, the monkeys visual system, the human visual system, extremely similar up to certain levels, of course, they diverge after a while. But the first, the visual pathway from the eye to the brain and the first few layers of cortex or cortical areas, I guess, when we'd say are extremely similar. Yeah, so on the cognition side is where the leap seems to happen with humans. That it does seem like we're kind of special. And that's a really interesting question, when thinking about alien life,
Starting point is 00:17:29 or if there's other intelligent alien civilizations out there, is how special is this leap? So one special thing seems to be the origin of life itself. However you define that, there's a gray area. And the other leap, this is very biased, perspective of a human, is the origin of intelligence. And again, from an engineering perspective, it's a difficult question to ask. An important one is how difficult is that leap? How special were humans? Did a monolith come down to the Ali's book down a monolith and some apes had to touch a monolith
Starting point is 00:18:05 but to get it. It's a lot like dark day cards, you know, idea, right? Exactly. But it just seems that it seems one heck of a fluke occurred a hundred thousand years ago. And you know, just happened that some human, some hominin predecessor of current humans had this one genetic tweak that resulted in language. And language then provided this special thing that separates us from all other animals. I'm, I think there's a lot of truth to the value and importance of language, but I think it comes along with the evolution of a lot of other related things related to sociality and mutual engagement with others and establishment of, I don't know, rich mechanisms for organizing an understanding of the world, which language then plugs into. Right, so it's a language is a tool that allows you to do this kind of collective intelligence
Starting point is 00:19:48 and whatever is at the core of the thing that allows for this collective intelligence is the main thing. And it's interesting to think about that one fluke, one mutation can lead to the first crack opening of the door to human intelligence. Like all it takes is one. Like evolution just kind of opens the door a little bit and then the time and selection takes care of the rest. You know, there's so many fascinating aspects
Starting point is 00:20:17 to these kinds of things. So we think of evolution as continuous, right? We think, oh yes, okay, over 500 million years, there could have been this, you know, relatively continuous changes. And, but that's not what anthropologists, evolutionary biologists found from the fossil record. They found, of years of stasis.
Starting point is 00:20:52 And then suddenly, it changed, it occurs. Well, suddenly, on that scale is a million years or something. But we're even 10 million years. But the concept of punctuated equilibrium was a very important concept in evolutionary biology, and that also feels somehow right about, you know, the stages of our mental abilities. We, we seem to have a certain kind of mindset at a certain age. And then at another age, we like look at that four-year-old and say, oh my God, how could
Starting point is 00:21:34 they have thought that way? So Piaget was known for this kind of stage theory of child development, right? And you look at it closely and suddenly those stages are so discrete and transitions, but the difference between the four-year-old and the seven-year-old is profound. And that's another thing that's always interested me is how we... Something happens over the course of several years of experience where at some point we reach the point where something like an insight or a transition or a new stage of development occurs. These kinds of things can be understood in complex systems research.
Starting point is 00:22:17 Evolutionary biology, developmental biology, cognitive development are all things that have been approached in this kind of way. Yeah, just like you said, I find both fascinating those early years of human life, but also the early minutes, days of the embryonic development to like how, from embryos you get like the brain, that development again from the engineering perspective is fascinating. So it's not so the early when you
Starting point is 00:22:54 deploy the brain to the human world and it gets to explore that world and learn that's fascinating but just like the assembly of the mechanism that is capable of learning. That's like amazing. The stuff they're doing with like brain organoids where you can build many brains and study that self assembly of a mechanism from like the DNA material. That's like, what the heck? You have literally like biological programs that just generate a system, this mushy thing that's able to be robust and learn in a very unpredictable world and learn seemingly arbitrary things or like a very large number of things that I'll enable survival. Yeah. Ultimately, that is a very important part of the whole process of, you know, understanding this sort of emergence of mind from brain kind of thing.
Starting point is 00:23:57 And the whole thing seems to be pretty continuous. So let me step back to neural networks for another brief minute. You wrote parallel distributed processing books that explored ideas on neural networks in the 1980s, together with a few folks, but the books you wrote with David Rahmohart, who is the first author on the back propagation paper, which I've hinted in. So these are just some figures at the time that we're thinking about these big ideas. What are some memorable moments of discovery and beautiful ideas from those early days?
Starting point is 00:24:33 I'm gonna start, sort of with my own process in the mid-70s and then into the late 70s when I met Jeff Hinson and he came to San Diego and we were all together. In my time in graduate schools, I've already described to you. I had this sort of feeling of, okay, I'm really interested in human cognition, but this disembodied sort of way of thinking about it that I'm getting from the current mode of thought about it is isn't working fully for me. And when I got my assistant professor's ship, I went to UCSD and that was in 1974. Something amazing had just happened.
Starting point is 00:25:30 Dave Rommelhart had written a book together with another man named Don Norman and the book was called Explorations in Cognition. And it was a series of chapters exploring interesting questions about cognition, but in a completely sort of abstract, you know, non-biological kind of way. And I think, gee, this is amazing. I'm coming to this community where people can get together and feel like they've collectively exploring, you know, ideas. It was a book that had a lot of lightness to it. The Don Norman, who was the more senior figure at a rumble heart at that time, who led that project, always created created this spirit of playful exploration of ideas. And so I'm like, wow, this is great. But I was also still trying to get from the neurons to the cognition.
Starting point is 00:26:39 And I realized at one point, I got this opportunity to go to a conference where I heard a talk by a man named James Anderson who was an engineer, but by then a professor in a psychology department who had used linear algebra to create neural network models of perception and categorization and memory. And I just blew me out of the water that one could create a model that was simulating neurons, not just kind of engaged in a stepwise engaged in a stepwise algorithmic process that was construed abstractly. But it was simulating, remembering, and recalling, and recognizing the prior occurrence of a stimulus or something like that. So for me, this was a bridge between the mind and the brain. And I just like, and I remember I was walking
Starting point is 00:27:47 cross campus one day in 1977. And I almost felt like St. Paul on the road to Damascus. I said to myself, you know, if I think about the mind, in terms of a neural network, it will help me answer the questions about the mind that I'm trying to answer. And that really excited me. So I think that a lot of people were becoming excited about that. And one of those people was Jim Anderson, who I had mentioned.
Starting point is 00:28:19 Another one was Steve Grossberg, who had been writing about neural networks since the 60s, and Jeff Hinton was yet another. And his PhD dissertation showed up in an applicant pool to a postdoctoral training program that Dave and Don, the two men I mentioned before, Rommelhart and Norman were administering, and Rommelhart got really excited about Hinton's PhD dissertation. And so, Hinton was one of the first people who came and joined this group of postdoctoral scholars
Starting point is 00:29:03 that was funded by this wonderful grant that they got. Another one who's also well-known in Neural Network Circles is Paul Smolensky. He was another one of that group. Anyway, Jeff and Jim Anderson organized a conference at UCSD where we were. And it was called parallel models of associative memory, and it brought all the people together who had been thinking about these kinds of ideas in 1979 or 1980. And this began to kind of really resonate with some of Rommel Hart's own thinking, some of his reasons for wanting something other than the kinds of computation he'd been
Starting point is 00:29:58 doing so far. So let me talk about Rommel Hart now for a minute. Okay, with that context. Well, let me also just pause because it's so many interesting things before we go to Ramahart. So first of all, for people who are not familiar, Neal Networks are at the core of the machine learning, deep learning revolution of today. Jeffrey Hidden that we mentioned is one of the figures that were important in the history, like yourself and the development of these Neal Networks, artificial neural networks that are then used for the machine learning application.
Starting point is 00:30:26 Like I mentioned, the back propagation paper is one of the optimization mechanisms by which these networks can learn. And the word parallel is really interesting. So it's almost like synonymous from a computational perspective of how you thought at the time about neural networks. There's parallel computation. synonymous from a computational perspective, what, how you thought at the time about, knowing that works, there's parallel computation. Is that, would that be fair to say? Well, yeah, the parallel, the word parallel in this,
Starting point is 00:30:54 comes from the idea that each neuron is an independent computational unit, right? It, it gathers data from other neurons. It integrates it in a certain way, and then it produces a result. And it's a very simple little computational unit, but it's autonomous in the sense that, you know, it does its thing, right?
Starting point is 00:31:20 It's in a biological medium where it's getting nutrients and various chemicals from that medium. But it's, you know, you can think of it as almost like a little little computer in and of itself. So the idea is that each, you know, our brains have, oh, look, you know, a hundred or hundreds, almost a billion of these little neurons, right? And they're all capable of doing their work at the same time. So it's like, instead of just a single central processor that's engaged in, you know, chug, chug one step after another,
Starting point is 00:32:09 And you know, Chuck one step after another, we have a billion of these little computational units working at the same time. So at the time that's, I don't know, maybe you can comment, it seems to me, even still to me, quite a revolutionary way to think about computation relative to the development of theoretical computer science, alongside of that where it's very much sequential computer, you're analyzing algorithms that are running on a single computer. That's right. You're saying, wait a minute, why don't we take a really dumb, very simple computer and
Starting point is 00:32:40 just have a lot of them interconnected together. And they're all operating in their own little world and they're communicating with each other, and thinking of computation in that way, and from that kind of computation, trying to understand how things like certain characteristics of the human mind can emerge. That's quite a revolutionary way of thinking, I would say.
Starting point is 00:33:04 Well, yes, I would say. Well, yes, I agree with you. And there's still this sort of sense of not sort of knowing how we kind of get all the way there, I think. And this very much remains at the core of the questions that everybody's asking about the capabilities of deep learning and all these kinds of things. But if I could just play this out a little bit, a convolutional neural network or a CNN, which, you know, many people may have heard of, is a set of, you could think of it biologically as a set of collections of neurons. Each one had, each collection has maybe 10,000 neurons in it. But there's many layers, right? Some of these things are hundreds or even a thousand layers deep,
Starting point is 00:34:08 but others are closer to the biological brain and maybe they're like 20 layers deep or something like that. So we have within each layer we have thousands of neurons or tens of thousands maybe. Well, in the brain, we probably have millions in each layer, but we're getting sort of similar in a certain way, right? And then we think, okay, at the bottom level, there's an array of things that are like the photoreceptors. And in the eye, they respond to the amount of light of a certain wavelength
Starting point is 00:34:45 that is certain location on the pixel array. So that's like the biological eye. And then there's several further stages going up, layers of these neuron-like units. And you go from that raw input array of pixels to a classification. You've actually built a system that could do the same kind of thing that you and I do when we open our eyes and we look around and we see there's a cup, there's a cell phone, there's a water bottle. And these systems are doing that now, right? So they are in terms of the parallel idea that we were talking about before. They are doing this massively parallel computation
Starting point is 00:35:34 in the sense that each of the neurons in each of those layers is thought of as computing its little bit of something about the input simultaneously with all the other ones in the same layer. We get to the point of abstracting that away and thinking, oh, it's just one whole vector that's being computed. One activation pattern is computed in a single step and that abstraction is useful, but it's still that parallel and distributed processing, right?
Starting point is 00:36:10 Each one of these guys is just contributing a tiny bit to that whole thing. And that's the excitement that you felt that from these simple things, you can emerge when you add these level of abstractions on it, you can start getting all the beautiful things that we think about as cognition. you add these level of abstractions on it, you can start getting all the beautiful things that we think about as cognition. Right. Okay, so you have this conference, I forgot the name already, but it's parallel and something associated with memory and so on.
Starting point is 00:36:35 Very exciting, technical and exciting title, and you started talking about Dave Roma-Hardt, so who is this person that was so you spoken very highly of him. Can you tell me about him, his ideas, his mind, who he was as a human being as a scientist? So Dave came from a little tiny town in western South Dakota and his mother was the librarian and his father was the editor of the newspaper. And I know one of his brothers pretty well. They grew up, there were four brothers and they grew up together and their father encouraged them to compete with each other a lot. They competed in sports and they competed in mind games.
Starting point is 00:37:33 I don't know, things like Sudoku and Chess and not various things like that. And Dave was a standout undergraduate. He went as at a younger age than most people do to college at the University of South Dakota and majored in mathematics. And I don't know how he got interested in psychology, but he applied to the mathematical psychology program at Stanford and was accepted as a PhD student to study mathematical psychology at Stanford. So mathematical psychology is the use of mathematics to model mental processes.
Starting point is 00:38:19 So something that I think these days might be called cognitive modeling, that whole space. Yeah, it's mathematical in the sense that you say, if this is true and that is true, then I can derive that this should follow. Okay, and so you say, these are my stipulations about the fundamental principles, and this is my prediction about behavior, and it's all done with equations. It's not done with the computer simulation. You solve the equation, and that tells you what probability that the subject will be correct on the seventh trial of the experiment is or something like that. It's a use of mathematics to descriptively characterize aspects of behavior. And Stanford at that time was the place where there were several really,
Starting point is 00:39:16 really strong mathematical thinkers who were also connected with three or four others around the country, who brought a lot of really exciting ideas onto the table. And it was a very, very prestigious part of the field of psychology at that time. So, Rommelhart comes into this. He was a very strong student within that program. And he got this job at this brand new university in San Diego in 1967, where he's one of the first assistant professors in the department of psychology at UCSD. So I got there in 74, seven years later, and Runghart at that time was still doing mathematical modeling, but he had gotten interested in cognition. He'd gotten interested in understanding and you know understanding I think remains. You know, what does it mean to understand anyway, you know, it's interesting sort of curious, you know, like how would we know if we really understood something. But he was interested in building machines that would, you know, hear a couple of sentences and have an insight about what was going on. So,
Starting point is 00:40:52 for example, one of his favorite things at that time was, Marky was sitting on the front step when she heard the familiar jingle of the Good Human Man. She remembered her birthday money and ran into the house. What is Margie doing? Why? Well, there's a couple of ideas you could have, but the most natural one is that the Good Human Man brings ice cream, she likes ice cream. She's,
Starting point is 00:41:26 she knows she needs money to buy ice cream so she's going to run into the house and get her money so she can buy herself an ice cream. It's a huge amount of inference that has to happen to get those things to link up with each other. And he was interested in how the hell that could happen. in how the hell that could happen. And he was trying to build good old fashioned AI style models of representation of language and content of, things like has money. So like formal logic and acknowledge basis, like that kind of stuff. So he was integrating or like formal logic and like knowledge basis like that kind of yeah,
Starting point is 00:42:05 so he was integrating that with his thinking about cognition. Yes, the mechanisms cognition. How can they like mechanistically be applied to build these knowledge like to actually build something that looks like a web of knowledge and their bias from from their emergence, something like understanding. Yeah. That is. Yeah. And he was grappling that this was something that they grappled with at the end of that
Starting point is 00:42:32 book that I was describing explorations and cognition. But he was realizing that the paradigm of good old fashioned AI wasn't giving him the answers to these questions. And by the way, that's called good old fashion AI. Now it was the time it was beginning to be called that. Because it was from the 60s. Yeah. Yeah. By the late 70s, it was kind of old fashion. And it hadn't really panned out, you know, and people were beginning to recognize that. But and and Rumbleheart was, you know, like, yeah,
Starting point is 00:43:05 he was part of the recognition that this wasn't all working. Anyway, so he started thinking in terms of the idea that we needed systems that allowed us to integrate multiple simultaneous constraints in a way that would be mutually influencing each other. So he wrote a paper that just really, first time I read it, I said, oh, well, yeah, but is this important? But after a while, it just got under my skin.
Starting point is 00:43:44 And it was called an interactive model of reading. And in this paper, he laid out the idea that every aspect of our interpretation of what's coming off the page when we read. At every level of analysis you can think of actually depends on all the other levels of analysis. So what are the actual pixels making up each letter? And what did those pixels signify about which letters they are? And what
Starting point is 00:44:29 of those letters tell us about what words are there? And what of those words tell us about what ideas the author is trying to convey? And so he had this model where, you know, we have these little tiny elements that represent each of the pixels of each of the letters and then other ones that represent the line segments in them and other ones that represent the letters and other ones that represent the words. And at that time his idea was there's this set of experts. There's an expert about how to construct a line out of pixels and another expert about how which sets of lines go together to make which letters and another one about which letters go together to make bench words. Another one about what the meanings of the words are and another one about how the meanings fit together and things like that.
Starting point is 00:45:33 And all these experts are looking at this data and they're updating hypotheses at other levels. So the word expert can tell the letter expert, oh, I think there should be a T there because I think there should be a word the here. And the bottom up sort of featured to letter expert can say, I think there should be a T there too. And if they agree, then you see a T, right? And so there's a top down bottom up interactive process, but it's going on at all layers simultaneously. So everything can filter all the way down from the top as well as all the way up from the bottom.
Starting point is 00:46:08 And it's a completely interactive, bidirectional, parallel, distributed process. That is somehow because of the abstractions, it's hierarchical. So there's different layers of responsibilities, different levels of responsibilities. First of all, it's fascinating to think about it in this kind of mechanistic way. So not thinking purely from the structure of a neural network or something like a neural network, but thinking about these little guys that work on letters and then the letters come words and words become sentences. And that's a very interesting hypothesis that from that kind of hierarchical structure can emerge understanding.
Starting point is 00:46:50 Yeah, so but the thing is though I want to just sort of relate this to earlier part of the conversation. When Rommelhart was first thinking about it, there were these experts on the side, one for the features and one for the letters and one for how the letters make the words and so on. And they would each be working sort of evaluating various propositions about, you know, is this combination of features here going to be one that looks like the letter T and so on. And what he realized kind of after reading Hinton's dissertation and hearing about Jim Anderson's linear algebra-based neural network models that I was telling you about before was that he could replace those experts with neuron-like processing units, which just would have their
Starting point is 00:47:42 connection weights that would do this job. So what ended up happening was that Rommelhart and I got together and we created a model called the interactive activation model of letter perception, which is takes these little pixel level inputs, these little pixel level inputs constructs line segment features, letters and words, but now we built it out of a set of neuron like processing units that are just connected to each other with connection weights. So the unit for the word time has a connection to the unit for the letter T in the first position and the
Starting point is 00:48:25 letter i in the second position, so on. And because these connections are bidirectional, if you have prior knowledge that it might be the word time that starts to prime the feature, the letters and the features, and if you don't, then it has to start bottom up. But the directionality just depends on where the information comes in first. And if you have context together with features at the same time, they can convergently result in an emergent perception. And that was the piece of work that we did together that sort of got us both completely convinced that, you know, this neural network way of thinking was going to be able to actually address the
Starting point is 00:49:16 questions that we were interested in as cognitive second. So the algorithm makes the optimization side, those are all details. Like when you first start the idea that you can get far with this kind of way of thinking, that in itself is a profound idea. So do you like the term connectionism to describe this kind of set of ideas? I think it's useful. It highlights the notion that the knowledge that the system exploits is in the connections between the units, right?
Starting point is 00:49:50 There isn't a separate dictionary. There's just the connections between the units. So I already sort of laid that on the table with the connections from the letter units to the unit for the word time, right? The unit for the word time isn't a unit for the word time, for any other reason than it's got the connections to the letters that make up the word time. Those are the units on the input that excite it when it's excited that in a sense represents in the system that there's support for the hypothesis that the word time
Starting point is 00:50:27 is present in the input. But it's not, the word time isn't written anywhere inside the model. It's only written there in the picture we drew of the model to say that's the unit for the word time, right? And if you if somebody wants to tell me, well, how do you spell that word? You have to use the connections from that out to then get those letters, for example. That's such a, that's a counterintuitive idea. We humans want to think in this logic way. This idea of connectionism, it doesn't, it's weird. It's weird that this is how it all works. Yeah, but let's go back to that CNN, right?
Starting point is 00:51:15 That CNN with all those layers of neuron-like processing units that we were talking about before, it's going to come out and say, this is a cat, that's a dog. But it has no idea why it said that. It's just got all these connections between all these layers of neurons, like from the very first layer to the, you know, the, like whatever these layers are, they just get numbered after a while because they, you know, they, they, they somehow further in you go the more, the more abstract the features are, but it's a graded, a continuous process of abstraction anyway. It goes from very local, very specific to much more
Starting point is 00:51:56 sort of global, but it's still another pattern of activation over an array of units, and then at the output side, it says it's a cat or it's a dog. And when we when I open my eyes and say, oh that's Lex or oh you know there's my own dog and I recognize my dog which is a member of the same species as many other dogs but I know this one because of some slightly unique characteristics. I don't know how to describe what it is that makes me know that I'm looking at Lex or at my particular dog, right?
Starting point is 00:52:33 Or even that I'm looking at a particular brand of car. Like I could say a few words about it, but I wrote you a paragraph about the car. You would have trouble figuring out which cars he talked about. So the idea that we have propositional knowledge of what it is that allows us to recognize that this is an actual instance of this particular natural kind, has always been something that it never worked, right? You couldn't ever write down who said a proposition for, you know, visual recognition. And so in that space, it's sort of always seen very natural that something more implicit, you know, you don't have access to what
Starting point is 00:53:22 the details of the computation were in between, you just get the result. So that's the other part of connectionism. You cannot, you don't read the contents of the connections. The connections only cause outputs to occur based on inputs. Yeah. And for us, that like final layer or some particular layer is very important. No one that tells us that it's our dog or like it's a cat or a dog.
Starting point is 00:53:51 But you know, each layer is probably equally as important in the grand scheme of things. Like there's no reason why the cat versus dog is more important than the lower level activations. It doesn't really matter. I mean, all of it is just this beautiful stacking on top of each other. And we humans live in this particular layers for us. For us, it's useful to survive, to use those cat versus dog, predator versus prey,
Starting point is 00:54:16 all those kinds of things. It's fascinating that it's all continues. But then you then ask, you know, the history of artificial intelligence, you ask, are we able to introspect and convert the very things that allow us to tell the difference to cat and dog into logic, into formal logic. That's been the dream. I would say that's still part, Leonard, who created psych.
Starting point is 00:54:48 And that's a project that lasts for many decades and still carries a sort of dream in it. Right? But we still don't know the answer, right? It seems like connectionism is really powerful. But it also seems like there's this building of knowledge. And so how do you square those two? Do you think the connections can contain the depth of human knowledge and the depth of what Dave
Starting point is 00:55:17 Rahmohart was thinking about of understanding? Well, that remains a $64 question, and with inflation that then maybe it's a $64 billion question from the emergent aside, which, you know, I placed myself on. So I used to sometimes tell people I was a radical, eliminative connectionist because I didn't want them to think that I wanted to build like anything into the machine, but I don't like the word Eliminative anymore because it makes it seem like it's wrong to think that there is this emergent level of understanding. And I disagree with that. So I think, you know, I would call myself an a radical emergentist connectionist rather than a
Starting point is 00:56:36 eliminative connectionist, right? Because I want to acknowledge that these higher-level kinds of aspects of our cognition are real, but they're not, they don't exist as such. And there was an example that Doug Hofstetter used to use that I thought was helpful in this respect. Just the idea that we can think about sand dunes as entities and talk about how many there are even, but we also know that a sand dune is a very fluid thing. It's a pile of sand that is capable of moving around under the wind and reforming itself in somewhat different ways.
Starting point is 00:57:39 And if we think about our thoughts as like sand dunes as being things that emerge from just the way all the lower level elements sort of work together and are constrained by external forces, then we can say yes, they exist as such, but they also, you know, we shouldn't treat them as completely monolithic entities that we can understand without understanding sort of all of the stuff that allows them to change in the ways that they do. And that's where I think the connectionist feeds into the cognitive. It's like, okay, so if the under, if the substrate is parallel distributed
Starting point is 00:58:26 connectionist, then it doesn't mean that the contents of thought isn't, you know, like abstract and symbolic, but it's more fluid maybe than it's easier to capture with a set of logical expressions. Yeah, that's a heck of a sort of thing to put at the top of a resume, radical, emergentist, connectionist. So there is, just like you said, a beautiful dance between that, between the machinery of intelligence, like the neural network side of it, and the stuff that emerges. I mean, the stuff that emerges seems to be, I don't know what that is. It seems like maybe all of reality is emergent.
Starting point is 00:59:19 What I think about, this is made most distinctly rich to me when I look at cellular automata, look at game of life, that from very, very simple things, very rich, complex things emerge that start looking very quickly like organisms, that you forget how the actual thing operates. They start looking like they're moving around, they're eating each other, some of them are generating, offspring, you forget very quickly. And it seems like maybe it's something about the human mind that wants to operate in some layer of the emergent
Starting point is 00:59:57 and forget about the mechanism of how that emergent happens. So it just like you are in your radicalness, that emerges happens. So it just like you are in your radicalness, I'm also seems like unfair to eliminate the magic of that emergent, like eliminate the fact that that emergent is real. Yeah, no, I agree. I'm not. That's why I got rid of eliminative. I don't know ifiminative, yeah. Yeah, because it seemed like that was trying to say that it's all completely like...
Starting point is 01:00:30 An illusion of some kind, it's not. Well, it, you know, who knows whether there isn't, there aren't some illusory characteristics there. And I think that philosophically, many people have confronted that possibility over time, but it's still important to accept it as magic. So I think of Follini in this context,
Starting point is 01:00:59 I think of others who have appreciated the role of magic, of actual trickery in creating illusions that move us. You know, and Plato was on to this too. It's like somehow or other these shadows, you know, give rise to something much deeper than that. And that's, so we won't try to figure out what it is. We'll just accept it as given that that occurs. And, but he was still onto the magic of it.
Starting point is 01:01:37 Yeah, yeah. We won't try to really, really, really deep understand how it works, we just enjoy the fact that it's kind of fun. Okay, but you worked close to Dave, over all my heart, he passed away as a human being. What do you remember about him? Do you miss the guy? Absolutely.
Starting point is 01:02:02 You know, he passed away 15 years ago now, and his demise was actually one of the most poignant You know, like relevant tragedies relevant to our conversation. He started to undergo a progressive neurological condition that isn't fully understood. That is to say, his particular course isn't fully understood because brain scans weren't done at certain stages and no autopsy was done or anything like that. The wishes of the family. So we don't know as much about the underlying pathology as we might. But I had begun to get interested in this neurological condition
Starting point is 01:03:17 that might have been the very one that he was succumbing to as my own efforts to understand another aspect of this mystery that we've been discussing while he was beginning to get progressively more and more affected. So I'm going to talk about the disorder and not about Rommelhart for a second. The disorder is something my colleagues and collaborators have chosen to call semantic dementia. So it's a specific form of loss of mind related to meaning, semantic dementia, and it's progressive in the that the patient loses the ability to appreciate the meaning of the experiences that they have. Either from touch, from sight, from sound, from language. They, I hear sounds, but I don't know what they mean kind of thing.
Starting point is 01:04:35 So as this illness progresses, it starts with the patient being unable to differentiate like similar breeds of dog or remember, you know, the lower frequency unfamiliar categories that they used to be able to remember. But as it progresses, it becomes more and more striking and, you know, the patient loses the ability to recognize, you know, things like pigs and goats and sheep and calls all middle-sized animals dogs and all can't recognize rabbits and rodents anymore. They call all the little ones cats and they can't recognize hippopotamuses and cowsing where they call them all horses. hippopotamuses and in cowsing where they call them all horses, you know. So there was this one patient who went through this progression where at a certain point any four-legged animal he would call it either a horse or a dog or a cat.
Starting point is 01:05:43 And if it was big he would tend to call it a horse. If it was small he'd tend to call it a cat, middle-sized ones he called dogs. This is just a part of the syndrome, though. The patient loses the ability to relate concepts to each other. So my collaborator in this work, Carolyn Patterson developed a test called the pyramids and palm trees test. So you give the patient a picture of pyramids and they have a choice. Which goes with the pyramids? Palm trees or pine trees? And you know, she showed that this
Starting point is 01:06:17 wasn't just a matter of language because the patients' loss of this ability shows up whether you present the material with words or with pictures. The pictures, they can't put the pictures together with each other properly anymore. They can't relate the pictures to the words either. They can't do word picture matching, but they've lost the conceptual grounding from either modality of input. And so that's why it's called semantic dimension. The very semantics is disintegrating. And we understand this in terms of our idea that distributed representation, a pattern of activation represents the concepts, really similar ones.
Starting point is 01:07:03 As you degrade them, they start being, you lose the differences and then, so the difference between the dog and the goat is no longer a part of the pattern anymore. Since dog is really familiar, that's the thing that remains. We understand that in the way the models work and learn. But Rommelhart underwent this condition. So on the one hand, it's a fascinating aspect of parallel distributed processing to be. And it reveals this sort of texture of distributed representation in a very nice way, I've always felt, but at the same time, it was extremely poignant because this is exactly the condition
Starting point is 01:07:45 that Rommel Hart was undergoing. And there was a period of time when he was this man who had been the most focused, goal competitive, thoughtful person who was willing to work for years to solve a hard problem. You know, he, he, he, he starts to disappear. And there was a period of time when it was like hard for any of us to really appreciate that he was sort of in some sense not fully there anymore. Do you know if he was able to introspect this dissolution of the understanding mind? Was he, I mean, this is one of the big scientists that thinks about this. Yeah.
Starting point is 01:08:49 Was he able to look at himself and understand the fading mind? You know, we can contrast hawking and rumble heart in this way. And I like to do that to honor rumbleelhart because I think Rommelhart is sort of like the Hawking of cognitive science to me in some ways. Both of them suffered from a degenerative condition. In Hawking's case, it affected the motor system. In Rommelhart's case, it's affecting the semantics and not just the pure object semantics, but maybe the self semantics as well. And we don't understand that. Cosups broadly.
Starting point is 01:09:40 So I would say he didn't, and this was part of what from the outside was a profound tragedy, but on the other hand, at a some level, he sort of did, because there was a period of time when he finally was realized that he had really become profoundly impaired. This was clearly a biological condition, he wasn't, you know, it wasn't just like he was distracted that day or something like that. So he retired, you know, from his professorship at Stanford and he became, he lived with his brother for a couple years and then he moved into a facility for people with cognitive impairments, one that many elderly people end up in when they have cognitive impairments.
Starting point is 01:10:39 I would spend time with him during that period. This was in the late 90s, around 2000 even. And you know, we would go bowling, and he could still bowl. And after bowling, I took him to lunch and I said, where would you like to go? Do you want to go to Wendy's? And he said, nah. And I said, okay, well, where do you want to go? And he just pointed — turn here, you know. So he still had a certain amount of spatial cognition, and he could get me to the restaurant.
Starting point is 01:11:16 And then when we got to the restaurant, I said, what do you want to order? And he couldn't come up with any of the words, but he knew where on the menu the thing was that he wanted. So, you know, he couldn't say what it was, but he knew that that's what he wanted to eat. And so, you know, it's like — it isn't monolithic at all.
Starting point is 01:11:44 Our cognition is, first of all, graded in certain kinds of ways, but also multipartite. There's many elements to it. And certain sorts of partial competencies still exist in the absence of other aspects of these competencies. So this is what always fascinated me about what used to be called cognitive neuropsychology — the effects of brain damage on cognition.
Starting point is 01:12:19 But in particular, this gradual disintegration part. You know, I'm a big believer that the loss of a human being that you value is as powerful as, you know, first falling in love with that human being. I think it's all a celebration of the human being. So the disintegration itself, too, is a celebration in a way. Yeah. But just to say something more about the scientist and the backpropagation idea that you mentioned. So, in 1982,
Starting point is 01:13:00 Hinton had been there as a postdoc and organized that conference. He'd actually gone away and gotten an assistant professorship, and then there was this opportunity to bring him back. So Jeff Hinton was back on a sabbatical in San Diego. And Rommelhart and I had decided we wanted to do this — you know, we thought it was really exciting — and the papers on the interactive activation model that I was telling you about had just been published. And we both sort of saw a huge potential for this work.
Starting point is 01:13:35 And Jeff was there. And so the three of us started a research group, which we called the PDP research group. And several other people came. Francis Crick, who was at the Salk Institute, heard about it from Jeff — because Jeff was known among Brits to be brilliant, and Francis was well connected with his British friends. So Francis Crick came, and a heck of a group of people. Wow. Okay. Paul Smolensky was one of the other postdocs — he was still there as a postdoc — and a few other people. But anyway, Jeff talked to us about learning and how we should think about how learning occurs in a neural network. And he said, the problem with the way you guys have been approaching this is that you've been looking for inspiration from biology
Starting point is 01:14:46 to tell you what the rules should be for how the synapses should change the strengths of their connections, how the connections should form. That's the wrong way to go about it. What you should do is think in terms of how you can adjust connection weights to solve a problem. So you define your problem, and then you figure out how the adjustment of the connection weights will solve the problem. And Rommelhart heard that and said to himself, okay, so I'm going to start thinking about it that way. I'm going to essentially imagine that I have some objective function, some goal of the computation. I want my machine to correctly classify all of these images. And I can score that — I can measure how well it's doing on each image. And I get some
Starting point is 01:15:54 measure of error — or loss, as it's typically called in deep learning. And I'm going to figure out how to adjust the connection weights so as to minimize my loss or reduce the error. And that's called gradient descent. And engineers were already familiar with the concept of gradient descent. In fact, there was an algorithm called the delta rule that had been invented by a professor in the electrical engineering department at Stanford, Bernie Widrow, and a collaborator named Hoff. I never met him. Anyway, so gradient descent in continuous neural networks with multiple neuron-like processing units
Starting point is 01:16:51 was already understood for a single layer of connection weights. We have some inputs over a set of neurons. We want the output to produce a certain pattern. We can define the difference between our target and what the network is producing, and we can figure out how to change the connection weights to reduce that error.
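To make the single-layer case concrete, here is a minimal sketch of the delta rule being described — a linear unit trained by gradient descent on squared error. The variable names and the tiny copy-the-input example are illustrative assumptions, not anything from Widrow, Hoff, or Rumelhart's actual work.

```python
import numpy as np

# Minimal sketch of the delta rule for a single layer of connection weights:
# nudge each weight down the gradient of the squared difference between the
# target pattern and the output the network currently produces.
def delta_rule(inputs, targets, lr=0.1, epochs=200):
    W = np.zeros((inputs.shape[1], targets.shape[1]))   # one layer of weights
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            y = x @ W                                    # linear output units
            error = t - y                                # target minus output
            W += lr * np.outer(x, error)                 # gradient descent step
    return W

# Tiny usage example: learn to reproduce a 2-bit input pattern at the output.
X = np.array([[0., 1.], [1., 0.], [1., 1.]])
W = delta_rule(X, X.copy())
print(np.round(X @ W, 2))   # should be close to the input patterns
```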
Starting point is 01:17:26 and the output. And so he first called the algorithm, the generalized delta rule, because it's just an extension of the gradient descent idea. And interestingly enough, Hinton was thinking that this wasn't going to work very well. So Hinton had his own alternative algorithm at the time, based on the concept of the Bolson machine that he was pursuing.
Starting point is 01:17:54 So the paper on learning in Boltzmann machines came out in 1985. But it turned out that backprop worked better than the Boltzmann machine learning algorithm. So this generalized delta rule ended up being called backpropagation — backprop, as you say. Yeah. And, you know, probably that name is opaque to me.
Starting point is 01:18:20 Maybe what does that mean? What it meant was that in order to figure out what the changes you needed to make to the connections from the input to the hidden layer, you had to back propagate the error signals from the output layer through the connections from the hidden layer to the output through the connections from the hidden layer to the output to get the signals that would be the error signals for the hidden layer. And that's how Rommelhart formulated it. It was like, well, we know what the error signals are
Starting point is 01:18:54 at the output layer. Let's see if we can get a signal at the hidden layer that tells each hidden unit what its error signal is essentially. So it's back propagropagating through the connections from the hidden to the output to get the signals to tell the hidden units how to change their weights from the input. And that's why it's called backprop. Yeah, but so it came from Hinton having introduced the concept of, you know, define your objective function, figure out how to take the derivatives so that you can adjust the connection so that they make
Starting point is 01:19:32 progress towards your goal. So stop thinking about biology for a second and let's start to think about optimization and computation a little bit more. So what about Jeff Hinton? What you've gotten a chance to work with him and that little the set of people involved there is quite incredible. The small set of people under the PDP flag is just given the amount of impact those ideas have had over the years. It's kind of incredible to think about, but you know, just like you said, like yourself, Jeffrey Hinton has seen as one of the, not just like a seminal figure in AI, but just a brilliant person, just like the horsepower
Starting point is 01:20:16 of the mind is pretty high up there for him because he's just a great thinker. So what kind of ideas have you learned from him? Have you influenced each other on? Have you debated over what stands out to you in the full space of ideas here at the intersection of computation and cognition? Well, so Jeff has said many things to me that had a profound impact on my thinking. And he's written several articles which were way ahead of their time. He had two papers in 1981 just to give one example. One of which was essentially the idea of Transformers and another of which was an early
Starting point is 01:21:17 paper on semantic cognition which inspired him and Rommel Hart and me throughout the 80s and still I think sort of grounds my own thinking about the semantic aspects of cognition. He also, in a small paper that was never published that he wrote in 1977, before he actually arrived at UCSD, or maybe a couple of years even before that, I don't know, when he was a PhD student, he described how a neural network could do recursive computation. And it was a very clever idea that he's continued to explore over time, which was sort of the idea that when you call a subroutine, you need to save the state that you had when you called it so you can get back to where you were when you're finished with the subroutine. And the idea was that you would save the state of the calling routine by making fast changes
Starting point is 01:22:41 to connection weights. And then when you finished with the subroutine call, those fast changes in the connection weights would allow you to go back to where you had been before and reinstate the previous context so that you could continue on with the top level of the computation. Anyway, that was part of the idea. And I always thought, okay, that's really, you know, you just, he had extremely creative ideas that were quite a lot ahead of his time,
Starting point is 01:23:14 and many of them in the 1970s and early, early 1980s. So, another thing about Jeff Hinton's way of thinking, which has profoundly influenced my effort to understand human mathematical cognition, is that he doesn't write too many equations. And people tell stories like oh in the hints and lab meetings you don't get up at the board and write equations like you do in everybody else's machine learning lab. What you do is you draw a picture. And you know he explains aspects of the way deep learning works by putting his hands together and showing you the shape of a ravine. And using that as a geometrical metaphor for what's happening as this gradient descent process, you're coming down the wall of a ravine. If you take too big a jump, you're going to jump to the other side. And so that's why we have to turn down the learning rate, for example.
Starting point is 01:24:29 And it speaks to me of the fundamentally intuitive character of deep insight, together with a commitment to really understanding in a way that's absolutely ultimately explicit and clear, but also intuitive. Yeah, there's certain people like that. Here's an example, some kind of weird mix of visual and intuitive and all those kinds of things. Five minutes, another example, different style of thinking, but very unique. And when you're around those people, for me in the engineering realm, there's a guy named Jim Keller, who is a chip designer engineer.
Starting point is 01:25:23 Every time I talk to him, it doesn't matter what we're talking about, just having experience that unique way of thinking transforms you and makes your work much better. And that's the magic you look at Daniel Coneman, you look at the great collaborations throughout the history of science. That's the magic of that. It's not always the exact ideas that you talk about, but it's the process of generating those ideas, being around that, spending time with that human being, you can come up with some brilliant work, especially when it's across the supplementaries. It was a little bit in your case, Jeff. Yeah. Jeff is a descendant of the logician bull.
Starting point is 01:26:07 He comes from a long line of English academics. And together with the deeply intuitive thinking ability that he has, he also has, you know, it's been clear, he's described this to me, and I think he's mentioned it from time to time in other interviews that he's had with people, you know, he's wanted to be able to sort of think of himself as contributing to the understanding of reasoning itself, not just human reasoning, like, bull-like is about logic, right? It's about what can we conclude from what else and how do we formalize that? And as a computer scientist, logician, philosopher, you know, the goal is to understand how we derive
Starting point is 01:27:13 truths from other, from givens and things like this. And the work that Jeff was doing in the early to mid-80s on something called the Bolson machine was his way of connecting with that Boolean tradition and bringing it into the more continuous probabilistic graded constraint satisfaction realm. And it was, it was beautiful, a set of ideas linked with theoretical physics, and as well as with logic. And it's always been, I mean, I've always been inspired by the Bolsa machine too. It's like, well, if the neurons are probabilistic rather than, you know, deterministic in their computations, then, you know, that maybe this somehow is part of the
Starting point is 01:28:17 serendipity or, you know, adventitiousness of the moment of insight, right? It might not have occurred at that particular instant. It might be sort of partially the result of this theastic process. And that too is part of the magic of the emergence of some of these things. Well, you're right with the Boolean lineage
Starting point is 01:28:39 and the dream of computer science is somehow, I mean, I certainly think of humans this way, that humans are one particular manifestation of intelligence, that there's something bigger going on, and you're trying to, you're hoping to figure that out. The mechanisms of intelligence, the mechanisms of cognition are much bigger than just humans. Yeah. So I think of, I've, I started using the phrase computational intelligence at some point as to characterize the, the field that I thought, you know, people like Jeff Hinton and many of the, of the people I know at Deep Mind are, are working in and where I, I feel like I'm, are working in and where I feel like I'm a human- amount of the excitement of deep learning actually lies is in the idea that what we can achieve with our own nervous systems when we build computational intelligence
Starting point is 01:30:08 that are not limited in the ways that we are by our own biology. Perhaps allowing us to scale the very mechanisms of human intelligence just increase its power through scale. Yes. And I think that that, you know, obviously that's the, that's being played out massively at Google Brain, at OpenAI, and to some extent a deep mind as well. I guess I shouldn't say to some extent, the massive scale of the computations that are used to succeed at games like Go or to solve the protein folding problems that they've been solving and so on.
Starting point is 01:30:57 Still not as many synapses and neurons as the human brain. So we still got, we're still beating them on that. We humans are beating the AIs, but they're catching up pretty quickly. You write about modeling of mathematical cognition. So let me first ask about mathematics in general. There's a paper titled Parallel Distributed Processing Approach to Mathematical Cognition, where in the introduction there's some beautiful discussion of mathematics. And you reference there Tristan Needham, who criticizes a narrow form of your mathematics by liking the studying of mathematics as simple manipulation to studying music without
Starting point is 01:31:44 ever hearing a note. So from that perspective, what do you think is mathematics? What is this world of mathematics like? Well, I think of mathematics as a set of tools for exploring idealized worlds that often turn out to be extremely relevant to the real world, but need not. Objects exist with idealized properties. And in which the relationships among them can be characterized with precision, so as to to allow the implications of certain facts to then allow you to derive other facts with certainty. So, you know, if you have two triangles and you know that there is an angle in the first one that has the same measure as an angle in the second one.
Starting point is 01:33:12 And you know that the length of the sides adjacent to that angle in each of the two triangles, the corresponding sides adjacent to that angle are also have the same measure. Then you can then conclude that the triangles are congruent, that is to say they have all of their properties in common. And that is something about triangles. It's not a matter of formulas. These are idealist objects. In fact, you know, we build bridges out of triangles and we understand how to measure the height of something we by extending these ideas about triangles a little further. And all of the ability to get a tiny spec of matter
Starting point is 01:34:18 launched from the planet Earth to intersect with some tiny, tiny little body way out and way beyond Pluto somewhere. And exactly a predicted time and date is something that depends on these ideas, right? But it's actually happening in the real physical world that these ideas make contact with it in those kinds of instances. And so, but you know, there are these idealized objects, these triangles, or these distances or these points, whatever they are, that allow for this set of tools to be created, that then gives human beings the, it's this incredible leverage that they didn't have without these concepts. And I think this is actually already true
Starting point is 01:35:33 And I think this is actually already true when we think about just, you know, the natural numbers. I always like to include zero. So I'm going to say a non-negative integer, but that's a place where some people prefer not to include zero, but we like zero here. So that's number zero, one, two, three, four, five, six, seven, and so on. Yeah. And, and you know, because they give you the ability to be exact about Like how many sheep you have like, you know, I sent you out this morning. There were 23 sheep You came back with only 22 what happened? Yeah, the fundamental problem of physics. How many sheep you have?
Starting point is 01:36:17 It's a fundamental problem of life of human Society that you damn out better break back the same number of chiefs as you started with. And you know, it allows commerce, it allows contracts, it allows the establishment of records and so on to have systems that allow these things to be notated. But they have an inherent aboutness to them. That's at the one at's one in the same time sort of abstract
Starting point is 01:36:50 and idealized and generalizable while, at the other hand, potentially very, very grounded and concrete. And one of the things that makes for the incredible achievements of the human mind is the fact that humans invented these idealized systems that leverage the power of human thought in such a way as to allow all this kind of thing to happen. And so that's what mathematics to me is the development of systems for thinking about the properties and relations among sets of idealized objects and, you know, the mathematical notation system that we unfortunately focus way too much on is just our way of expressing propositions about these properties. Right. It's just just like we're talking with
Starting point is 01:38:08 Chomsky and language, it's the thing we've invented for the communication of those ideas. They're not necessarily the deep representation of those ideas. Yeah. So what's the good way to model such powerful mathematical reasoning, would you say? What are some ideas you have for capturing this in a model? The insights that human mathematicians have had is a combination of the kind of connectionist like knowledge that makes it so that something is just
Starting point is 01:38:53 like obviously true so that you don't have to think about why it's true, that then makes it possible to then take the next step and ponder and reason and figure out something that you previously didn't have that intuition about. It then ultimately becomes a part of the intuition that the next generation of mathematical thinkers have to ground their own thinking on so that they can extend the ideas even further. I came across this quotation from Henri Poincare while I was walking in the woods with my wife in a state park in northern California, late last summer. And what it said on the bench was it is by logic that we prove, but by intuition that we discover. And so what for me, the essence of the project is to understand how to bring the intuitive connectionist resources to bear on letting the intuitive discovery rise from engagement in thinking with this formal system. So I think of the ability of somebody like Hinton or Newton or Einstein or Rommel Hart or Poirot-Carré to our comedies, this is another example, right?
Starting point is 01:40:51 So suddenly a flash of insight occurs, it's like the constellation of all of these simultaneous constraints that somehow or other causes the mind to settle into a novel state that it never did before and and give rise to a new idea that you know, then you could say, okay, well now how can I prove this? You know? How do I write down the steps of that theorem that allow me to make it rigorous and certain? And so I feel like the kinds of things that we're beginning to see deep learning systems do of their own accord kind of gives me this feeling of, I don't it'll all happen. So, in particular, as many people now have become really interested in thinking about, you know, neural networks that have been trained with massive amounts of text can be given a prompt and they can then sort of generate some really interesting, fanciful creative story from that prompt. And there's kind of like a sense that they've somehow synthesized something like novel out of the, you know, all of the
Starting point is 01:42:48 particulars of all of the billions and billions of experiences that went into the training aid. That gives rise to something like this sort of intuitive sense of what would be a fun and interesting little story to tell or something like that. It just sort of wells up out of the letting the thing play out its own imagining of what somebody might say given this prompt as an input to get it to start to generate its own thoughts. And to me, that sort of represents the potential of capturing the intuitive side of this. And there's other examples. I don't know if you find them as captivating
Starting point is 01:43:34 is on the deep mind side with alpha zero. If you study chess, the kind of solutions that has come up in terms of chess, it is, there's novel ideas there. It feels very, like there's brilliant moments of insight. And the mechanism they use, if you think of search as maybe more towards good old fashioned AI, and then there's the connectionist network that has the intuition of looking at a board, looking at a set of patterns, and saying, how good is this set of positions, and the next few positions, how good are those. And that's it. That's just an intuition.
Starting point is 01:44:17 Yeah. understanding positionally, tactically, how good the situation is, how can it be improved without doing this full, like deep search. And then maybe doing a little bit of what human chest players call calculation, which is the search, and taking a particular set of steps down the line to see how they enrol. But there is moments of genius in those systems too. So that's another hopeful illustration that from neural networks can emerge this novel creation of an idea. Yes, and I think that, you know, I think Demis Sabis is, you know, he's spoken about those things. I heard him describe a move that was made in one of the go matches against Lisa Dahl in
Starting point is 01:45:13 this very similar way. It caused me to become really excited to collaborate with some of those guys at DeepMind. So I think, though, that what I like to really emphasize here is one part of what I like to emphasize about mathematical cognition, at least, is that philosophers and logicians going back three or even a little more than 3000 years ago And gradually the whole idea about thinking formally got constructed. And you know, it's preceded Euclid, certainly present in the work of Thales and others. And I'm not a the world's leading expert in all the details of that history, but Euclid's elements were the kind of the touch point of a coherent document that sort of laid out this idea of an actual formal system within which these objects were characterized and the system of
Starting point is 01:46:51 inference that allowed new truths to be derived from others was sort of like established as a paradigm. And what I find interesting is the idea that the ability to become a person who is capable of thinking in this abstract formal way is a result of the same kind of immersion in experience thinking in that way that we now begin to think of our understanding of language as being, right? So we immerse ourselves in a particular language, in a particular world of objects and their relationships, and we learn to talk about that, and we develop intuitive understanding of the real world. In a similar way, we can think that what academia has created for us, what those early philosophers and their academies
Starting point is 01:48:10 and Athens and Alexandria and other places, allowed was the development of these schools of thought, modes of thought that then become deeply ingrained and you know, it becomes what it is that makes it so that somebody like Jerry Foder would think that systematic thought is the essential characteristic of the human mind as opposed to a derived and an acquired characteristic that results from a culture in a certain mode that's been invented by humans. Would you say it's more fundamental than like language. If we start dancing,
Starting point is 01:49:02 if we bring Chomsky back into the conversation. First of all, is it unfair to draw a line between mathematical cognition and language, linguistic cognition? I think that's a very interesting question. I think it's one of the ones that I'm actually very interested in right now. But I think the answer is, in important ways, it is important to draw that line. But then to come back and look at it again and see some of the subtleties and interesting aspects of the difference. So,
Starting point is 01:49:54 if we think about Chomsky himself, he was born into an academic family. His father was a professor of rabbinical studies at a small rabbinical college in Philadelphia. And he was deeply inculturated in, you know, a culture of thought and reason and brought to the effort to understand natural language, this profound engagement with these formal systems. And, you know, I think that there was tremendous power in that and that Chomsky had some amazing insights into the structure of natural language. But that, I'm going to use the word but there.
Starting point is 01:51:01 The actual intuitive knowledge of these things only goes so far and does not go as far as it does in people like Chomsky himself. And this was something that was discovered in the PhD dissertation of Lila Glytman, who was actually trained in the same linguistics department with Chomsky. So what Lila discovered was that the intuitions that linguists had about even the meaning of a phrase, not just about its grammar, but about what they thought a phrase must mean were very different from the intuitions of an ordinary person who wasn't a formally trained thinker. And well, it recently has become much more salient. I happen to have learned about this when I myself was a PhD student at the University of Pennsylvania, but I never knew how to put it together with all of my other thinking about these things. So I actually currently have the hypothesis that formally trained linguists and other
Starting point is 01:52:16 formally trained academics, whether it be linguistics, philosophy, cognitive science, computer science, machine learning, mathematics, have a motive engagement with experience that is intuitively deeply structured to be more organized around the systematicity and ability to be conformant with the principles of a system, then is actually true of the natural human mind without that immersion. This fascinating. It's the different fields and approaches with which you start to study the mind, actually take you away from the natural operation of the mind. So it makes you very difficult for you to be somebody who introspects. Yes. And, you know, this is where things about human belief and so-called knowledge that we
Starting point is 01:53:47 consider private not our business to manipulate in others. We are not entitled to tell somebody else what to believe about certain kinds of things What are those beliefs? Well, they are the product of this sort of immersion and inculturation That is what I believe So and that's limiting it's
Starting point is 01:54:23 It's something to be aware of. Does that let me hear from having a good model of some of cognition? It can. So, when you look at mathematical or linguistics, so I mean, what is that line then? What, so, so, is Chomsky unable to sneak up to the full picture of cognition? Are you, when you're focusing on mathematical thinking, are you also unable to do so? I think you're, you're right. I think that's a great way of characterizing it. And, um, I also think that, um, it's related to, um, the concept of beginner's mind and another concept called
Starting point is 01:55:12 the expert blind spot. So the expert blind spot is much more prosaic seeming than this point that you were just making. But it's something that plagues experts when they try to communicate their understanding to non-experts. And that is that things are self-evident to them that that they can't begin to even think about how they could explain it to somebody else, because it's like, well, it's said, God made the natural numbers, all else is the work of man, he was expressing that intuition that somehow or other, you know, the basic fundamentals of discrete quantities being countable and innumerable and, you know, indefinite in number was not something that had had to be discovered.
Starting point is 01:56:45 But he was wrong. It turns out that many cognitive scientists agreed with him for a time. There was a long period of time where the natural numbers were considered to be a part of the innate endowment of core knowledge or to use the kind of phrases that Spellkey and Kerry use to talk about what they believe are the innate primitives of the human mind. And they no longer believe that. It's actually been more or less accepted by almost everyone
Starting point is 01:57:22 that the natural numbers are actually a cultural construction. And it's so interesting to go back and sort of like study those few people who still exist who don't have those systems. So this is just an example to me and where a certain mode of thinking about language itself or a certain mode of thinking about language itself or a certain mode of thinking about geometry and those kinds of relations. So it becomes so second nature that you don't know what it is that you need to teach.
Starting point is 01:57:56 And in fact, we don't really teach it all that explicitly anyway. And it's, you know, you take a math class, the professor sort of teaches it to you the way they understand it. Some of the students in the class sort of like, you know, they get it. They start to get the way of thinking and they can actually do the problems that get put on the homework that the professor thinks are interesting and challenging ones, but most of the students who don't kind of engage as deeply don't ever get, you know. And we think, oh, that man must be brilliant. He must have this special insight, but I, you know, he must have some, you know, biological sort of bit that's different, right?
Starting point is 01:58:46 That makes him so that he or she could have that insight. But I am, although I don't want to dismiss biological individual differences completely, I, I find it much more interesting to think about the possibility that, I find it much more interesting to think about the possibility that it was that difference in the dinner table conversation at the Chomsky House when he was growing up that made it so that he had that cast of mind. Yeah, and there's a few topics we talked about that kind of interconnect because I wonder the better I get at certain things, we humans, the deeper we understand something, what are you starting to then miss about the rest of the world? We talked about David and his degenerative mind and and his degenerative mind. And, you know, when you look in the mirror
Starting point is 01:59:47 and wonder, how different am I cognitively from the man I was a month ago, from the man I was a year ago? Like, what, you know, if I can have a thought about language if I'm Chomsky for 10, 20 years, what am I no longer able to see? What is in my blind spot and how big is that? And then to somehow be able to leap back out of your
Starting point is 02:00:13 deep structure that you're foreign for yourself about thinking about the world, leap back and look at the big picture again or jump out of your current way of thinking. and look at the big picture again or jump out of your current way of thinking. And be able to introspect like what are the limitations of your mind? How is your mind less powerful than you used to be or more powerful or different powerful in different ways. So that seems to be a difficult thing to do because we're living, we're looking at the world through the lens of our mind, right? To step outside and introspect is difficult, but it seems necessary if you want to make progress. You know, one of the threads of psychological research that's always been very, I don't
Starting point is 02:01:00 know, important to me to be aware of is the idea that our explanations of our own behavior aren't necessarily actually part of the causal process that caused that behavior to occur, or even valid observations of the set of constraints that led to the outcome. But they are post-hoc rationalizations that we can give based on information at our disposal about what might have contributed to the result that we came to when asked. And so this is an idea that was introduced in a very important paper by Nisbit and Wilson about the limits on our ability to be aware of the factors that cause us to make the choices that we make. And, you know, I think it's something that we really ought to be much more cognizant
Starting point is 02:02:22 of in general as human beings is that our own insight into exactly why we hold the beliefs that we do and we hold the attitudes and make the choices and and feel the feelings that we do is not something that we we totally control or totally observe. And it's subject to our culturally transmitted understanding of what it is, that is the mode that we give to explain these things when asked to do so, as much as it is about anything else. And so even our ability to interrespect and think we have access to our own thoughts is a product of culture and belief, you know, practice. big question of advice. So you've lived an incredible life in terms of the ideas you've put out into the world in terms of the trajectory you've taken through your career, through your life. What advice would you give to young people today?
Starting point is 02:03:35 In high school and college, about how to have a career or how to have a life that can be proud of. Finding the thing that you are intrinsically motivated to engage with and then celebrating that discovery is what it's all about. what it's all about. When I was in college, I struggled with that. I had thought I wanted to be a psychiatrist because I think I was interested in human psychology and high school. And at that time, the only sort of information I had that had anything to do
Starting point is 02:04:26 with the psyche was, you know, Freud and Eric Frome and sort of popular psychiatry kinds of things. And so, well, they were psychiatrists, right? So I had to be a psychiatrist. And that meant I had to go to medical school. And I got to college and I find myself taking, school. And I got to college and I find myself taking, you know, the first semester of a three-quarter physics class and it was mechanics. And this was so far for what it was I was interested in, but it was also two early in the morning in the winter court semester. So I never made it to the physics class. But I wanted about the rest of my freshman year and most of my sophomore year until I found myself in the midst of this situation where around me, there was this big revolution happening. I was at Columbia University in 1968 and the Vietnam War is going on. Columbia is building a gym in Morningside Heights, which is part of Harlem and people are
Starting point is 02:05:29 thinking, oh, the big, bad rich guys are stealing the parkland that belongs to the people of Harlem. And, you know, they're part of the military industrial complex, which is enslaving us and sending us all off to war in Vietnam. And so there was a big revolution that involved a confluence of black activism and, you know, SDS and social justice and the whole university blew up and got shut down.
Starting point is 02:05:59 And I got a chance to sort of think about why people were behaving the way they were in this context. And I, you know, I happen to have taken mathematical statistics. I happen to have been taking psychology that quarter at just psych one and somehow things in that space all ran together in my mind and got me really excited about, about asking questions about why people, what made certain people go into the buildings and not others and things like that. And so suddenly I had a path forward that, and I had just been wandering around aimlessly.
Starting point is 02:06:38 And at the different points in my career, you know, and I think, okay, well, should I take this class or should I just read that book about some idea that I want to understand better, you know, or should I pursue the thing that excites me and interests me or should I meet some requirement? That's, I always did the latter. So I ended up, my professors in psychology were, thought I was great. They wanted me to go to graduate school. They nominated me for Phi Beta Kappa, and I went to the Phi Beta Kappa in the ceremony, and this guy came up
Starting point is 02:07:25 and I said, oh, are you Magna or Soma? I wasn't even getting honors based on my grades. They just happened to have thought I was interested enough in ideas to belong to Phi Beta Capa. So, I mean, would it be fair to say you kind of stumbled around a little bit through accidents of too early morning of classes and physics and so on until you discovered intrinsic motivation as you mentioned and then that's it. It hooked you and then you celebrate the fact that this happens to human beings. And what is it that made what I did intrinsically motivating to me?
Starting point is 02:08:03 And what is it that made what I did intrinsically motivating to me? Well, that's interesting and I don't know all the answers to it. And I don't think I wanna, I want anybody to think that you should be sort of in any way, I don't know, sanctimonious or anything about it. You know, it's like, I don't know, sanctimonious or anything about it. You know, it's like, I really enjoyed doing statistical analysis of data. I really enjoyed running my own experiment, which was what I got a chance to do in the
Starting point is 02:08:37 psychology department that chemistry and physics had never. I never imagined that mere mortals would ever do an experiment in those sciences, except one that was in the textbook that you were told to do in lab class. But in psychology, we were already like, even when I was taking psych one, it turned out, we had our own rat and we got to, after two set experiments, we got to,
Starting point is 02:09:00 okay, do something you think of, you know, with your rat, you know, so, it's the opportunity to do it myself. Yeah. And to bring together a certain set of things that engage me intrinsically. And I think it has something to do with why certain people turn out to be profoundly amazing musical geniuses, right?
Starting point is 02:09:30 They get immersed in it at an early enough point. And it just sort of gets into the fabric. So my little brother had intrinsic motivation for music as we witnessed when he discovered how to put records on the phonograph when he was like 13 months old and recognize which one he wanted to play, not because he could read the labels, because he could sort of see which ones had which scratches, which were the different, you know, oh that's rapid ES Beniole and that's, you know, and enjoyed that. That connected with them somehow. Yeah, and there was something that it fed into
Starting point is 02:10:05 and you're extremely lucky if you have that and if you can nurture it and can let it grow and let it be an important part of your life. Yeah, those are the two things is like be attentive enough to feel it when it comes. Like this is something special. I mean, I don't know. For example, I really like tabular data,
Starting point is 02:10:33 like Excel sheets, like it brings me a deep joy. I don't know how useful that is for anything. But there's part of what I've talked to you. Exactly. So there's like a million, not a million, but there's a lot of things like that for me. You have to hear that for yourself. Like, be like realize this is really joyful.
Starting point is 02:10:53 But then the other part that you're mentioning, which is the nurture, is take time and stay with it, stay with it a while, and see where that takes you in life. Yeah, and I think the motivational engagement results in the immersion that then creates the opportunity to obtain the expertise. So, you know, we could call it the Mozart effect, right?
Starting point is 02:11:18 I mean, when I think about Mozart, I think about, you know, the person who was born as the fourth member of the family string quartet, right? And they handed him the violin when he was six weeks old. All right, start playing, you know, it's like, and so the level of immersion there was amazingly profound, but hopefully he also had something, maybe this is where the more, sort of the genetic part comes in sometimes, I think, something in him resonated to the music so that the synergy of the combination of that
Starting point is 02:12:06 was so powerful. So that's what I really consider to be the most out of fact. It's sort of the synergy of something with experience that then results in the unique flowering of a particular, you know, mind. So I know my siblings and I are all very different from each other. We've all gone in our own different directions. And, you know, I mentioned my younger brother who was very musical. I had my other younger brother was like this amazing, like intuitive engineer. engineer and one of my sisters was passionate about water conservation well before it was such a huge, important issue that it is today.
Starting point is 02:12:58 So we all sort of somehow find a different thing't I don't mean to say it isn't tied in with something about about us biologically, but but it's also when that happens where you can find that, then, you know, you can do your thing and you can be excited about it. So people can be excited about fitting people on bicycles as well as excited about making neural networks, achieve insights into human cognition, right? Yeah, like for me personally, I've always been excited about love and friendship between humans and just like the actual experience of it, since I was a child just observing people around me and also been excited about robots and there's something in me that things I really would love to explore how those two things combine and it doesn't make any sense a lot of it is also timing
Starting point is 02:13:54 Just to think of your own career and your own life You found yourself in certain pieces places that Happened to evolve some of the greatest thinkers of our time And so it just worked out that like you guys developed those ideas and there may be a lot of other people similar to you and they were brilliant and they never found that right connection in place to where they ideas could flourish. So it's timing, it's place, it's people and ultimately the whole ride, you know, it's undirected.
Starting point is 02:14:25 I'm gonna ask you about something you mentioned in terms of psychiatry when you were younger, because I had a similar experience of reading Freud and call young and just, you know, those kind of popular psychiatry ideas. And that was a dream for me early on in high school too. Like I hope to understand the human mind by, somehow psychiatry felt like the right discipline for that. Does that make you sad that psychiatry is not the mechanism by which you want to, are able to explore the human mind.
Starting point is 02:15:05 So for me, I was a little bit dissolutioned because of how much prescription medication and biochemistry is involved in the discipline of psychiatry as opposed to the dream of the Freud like, use the mechanisms of language to explore the human mind. So that was a little disappointing. And that's why I kind of went to computer science and thinking like maybe you can explore the human mind by trying to build the thing. Yes, I wasn't exposed to the,
Starting point is 02:15:39 sort of the biomedical slash pharmacological aspects of psychiatry at that point because I didn't I dropped out of that whole idea the physics that I never even found out about that until much later but you're absolutely right that's so I was actually a member of the National Advisory Mental Health Council, that is to say, the Board of Scientists who advised the Director of the National Institute of Mental Health. And that was around the year 2000. And in fact, at that time, the man who came in as the new director. I had been on this board for a year when he came in.
Starting point is 02:16:28 Said, okay, schizophrenia is a biological illness. It's a lot like cancer. We've made huge strides in curing cancer and that's what we're going to do with schizophrenia. We're going to find the medications that are going to cure this disease. And we're not going to listen to anybody's grandmother anymore. And, you know, good old behavioral psychology is not something we're going to support any further. And, you know, he, he completely alienated me from the Institute and from all of its prior policies, which had been much more holistic, I think, really at some level. And basically, and the other people on the board were like psychiatrists, right?
Starting point is 02:17:24 Very biological psychiatrists, right? And very biological psychiatrists didn't pan out, right? That nothing has changed in our ability to help people with mental illness. And so 20 years later, that particular path was at that end, as far as I can tell. that particular path was at that end as far as I can tell. Well, there's some aspect to, and sorry, to romanticize the whole philosophical conversation about the human mind. But to me, psychiatrists, for time, held the flag of where the deep thinkers, in the same way that physicists or the deep thinkers about the nature of reality, psychiatrists or the deep thinkers about the nature of the human mind And I think that flag has been taken from them and carried by people like you
Starting point is 02:18:11 It's like it's more in the cognitive psychology Especially when you have a foot in the computational view of the world because you can both build it you can like Intuit about the functioning of the mind by building little models and be able to save mathematical things and then deploying those models, especially in computers, to say, does this actually work? They do a lot of experiments and then some combination of neuroscience, we are starting to actually be able to observe, you know, do certain experience on human beings and observe how the brain is actually functioning. And there, using intuition, you can start being the philosopher, like which your Feynman is the philosopher, a cognitive psychologist can become the philosopher, and psychiatrists
Starting point is 02:18:58 become much more like doctors. They're like very medical. They help people with medication, biochemistry and so on, but they are no longer the book writers and the philosophers, which of course I admire. I admire the Richard Feynman ability to do great low level mathematics and physics and the high level philosophy. Yeah, I think it was from and young more than Freud that was sort of initially kind of like made me feel like,
Starting point is 02:19:31 oh, this is really amazing and interesting and I want to explore it further. I actually, when I got to college and I lost that thread, I found more of it in sociology and literature than I did in any place else. So I took quite a lot of both of those disciplines as an undergraduate. And, you know, I was actually deeply ambivalent about the psychology because I was doing experiments after the initial flurry of interest in why people would occupy buildings during insurrection and consider,
Starting point is 02:20:14 you know, be sort of like so over committed to their beliefs. But I ended up in the psychology laboratory running experiments on pigeons. And so I had these profound sort of like dissonance between, okay, the kinds of issues that would be explored when I was thinking about what I read about in modern British literature, versus what I could study with my pigeons in the laboratory, that got resolved when I went to graduate school
Starting point is 02:20:48 and I discovered cognitive psychology. And so for me, that was the path out of this sort of, like extremely sort of ambivalent divergence between the interest in the human condition and the desire to do actual you know, actual mechanistically oriented thinking about it. And I think we've come a long way in that regard and that you're absolutely right that nowadays this is something that's accessible to people through something that's accessible to people through the pathway in, through computer science, or the pathway in through neuroscience.
Starting point is 02:21:33 You can get derailed in neuroscience down to the bottom of the system where you might find the cures of various conditions, but you don't get a chance to think about the higher level stuff. So it's in the systems and cognitive neuroscience and computational intelligence, my asthma up there at the top that I think these opportunities are most, are richest right now. And so yes, I am indeed blessed by having had the opportunity to fall into that space. So you mentioned the human condition. Speaking of which, you happen to be a human being who is unfortunately not immortal.
Starting point is 02:22:18 That seems to be a fundamental part of the human condition that this right ends. Do you think about the fact that you're going to die one day? Are you afraid of death? I would say that I am not as much afraid of death as I am of degeneration. and I say that in part for reasons of having seen some tragic degenerative situations unfold, it's exciting when you can continue to participate and Feel like you're you're near the the place where the
Starting point is 02:23:12 The wave is breaking on the shore. I feel like you know and and I I think about You know my own future potential If if I were to undergo a begin to suffer from dementia Alzheimer's disease or some other dementia condition, I would gradually lose the thread of that ability. So one can live on for several, for a decade after, you know, sort of having to retire because one no longer has these kinds of abilities to engage. And I think that's the thing that I fear the most. The losing of that, like the breaking of the way, the flourishing of the mind, where you have these ideas and they're swimming around, you're able to play with them.
Starting point is 02:24:15 Yeah, and collaborate with other people who, you know, are themselves really helping to push these ideas forward. So yeah, what about the edge of the cliff? The end, I mean, the mystery of it. The migrated conception of mind and sort of continuous way of thinking about most things makes it so that to me, the discreteness of that transition is less apparent than it seems to be to most people. I see. I see.
Starting point is 02:24:58 Yeah. I wonder, so I don't know if you know the work of Ernest Becker and so on. I wonder what role mortality and our ability to be cognizant of it and anticipate it and perhaps be afraid of it. What role that plays in our setting of the world? I think that it can be motivating to people to think they have a limited period left. I think in my own case, you know, it's like seven or eight years ago now that I was sitting around doing experiments on decision-making that were satisfying in a certain way because I could really get closure on what whether the model fit the data perfectly or not.
Starting point is 02:25:51 And I could see how one could test the predictions in monkeys as well as humans and really see what the neurons were doing. But I just realized, hey, wait a minute, I may only have about 10 or 15 years left here. And I don't feel like I'm getting towards the answers to the really interesting questions while I'm doing this particular level of work. And that's when I said to myself, okay, let's pick something.
Starting point is 02:26:22 That's hard. Yeah. So that's when I started working on mathematical cognition. And I think it was more in terms of, well, I got 15 more years possibly of useful life left. Let's imagine that it's only 10. I'm actually getting close to the end of that now, maybe three or four more years. But I'm beginning to feel like, well, I probably have another five after that.
Starting point is 02:26:47 So, okay, I'll give myself another, another six or eight. But a deadline is looming in there for. But I'm still going to go on forever. Yeah. And so, um, so, yeah, I got to keep thinking about the questions that I think are the interesting and important ones for sure. What do you hope your legacy is? You've done some incredible work in your life as a man, as a scientist. When the aliens and the human civilization is long gone and the aliens are
Starting point is 02:27:17 reading the encyclopedia about the human species, what do you hope is the paragraph written about you? I would want it to sort of highlight a couple of things that I was, you know, able to see one path that was more exciting to me than the one that seemed already to be there for a cognitive psychologist, you know, but not for any super special reason other than that I'd had the right context prior to that, but that I had gone ahead and followed that lead, you know, and then I forget the exact wording, but I said in this preface that the joy of science is the moment in which is the moment in which, you know, a partially formed thought in the mind of one person gets crystallized a little better in the discourse and becomes the foundation of some exciting concrete piece of actual scientific progress.
Starting point is 02:28:45 And I feel like that moment happened when Rommelhart and I were doing the interactive activation model. And when Rommelhart heard Hinton talk about gradient descent and having the objective function to guide the learning process. And it happened a lot in that period, and I sort of seek that kind of thing in my collaborations with my students, right? So, you know, the idea that this is a person
Starting point is 02:29:17 who contributed to science by finding exciting, collaborative opportunities to engage with other people through is something that I certainly hope is part of the paragraph. And like you said, taking a step maybe in directions that are non-obvious. So the old Robert Frost road less taken. So maybe because you said like this incomplete initial idea, that step you take is a little bit off the beaten path. If I could just say one more thing here. This was something that really contributed to energizing me in a way that I feel it would be useful to share. My PhD dissertation project was completely
Starting point is 02:30:10 an empirical experimental project. And I wrote a paper based on the two main experiments that were the core of my dissertation. And I submitted it to a journal. And at the end of the paper, I had a little section where I laid out my, the beginnings of my theory about what I thought was going on that would explain the data that I had collected. And I had submitted the paper to the Journal of Experimental Psychology. So I got back a letter from the editor saying, thank you very much.
Starting point is 02:30:50 These are great experiments. We'd love to publish them in the journal. But what we'd like you to do is to leave the theorizing to the theorists and take that part out of the paper. And so I did. I took that part out of the paper. And so I did, I took that part out of the paper. But, you know, I almost found myself labeled as a non-thearest right by this. And I could have succumbed to that, and said, okay, well, I guess my job is to just go on and do experiments, right? But that's not what I wanted to do. And
Starting point is 02:31:29 so when I got to my assistant professorship, although I continued to do experiments because I knew I had to get some papers out, I also, at the end of my first year, submitted my first article to psychological review, which was the theoretical journal, where I took that section and elaborated it and wrote it up and submitted it to them. And they didn't accept that either, but they said, oh, this is interesting. You should keep thinking about it this time. And then that was what got me going to think, okay, you know, so it's not a superhuman thing to contribute to the development of theory. You know, you don't have to be, you can do it as a mere mortal. And the broader, I think, lesson is don't succumb to the labels of a regular viewer in a drawer. Or anybody
Starting point is 02:32:23 labeling you, right? Exactly. I mean, yeah, exactly. And especially as you become successful, you'll label labels get assigned to you for that you're successful for that thing. I'm a connectionist or a cognitive scientist and not a neuroscientist. And then you can completely, that's just, that's the stories of the past. You're today a new person
Starting point is 02:32:45 that can completely revolutionize in totally new areas so don't let those labels hold you back well let me ask the big question when you look at into the you say it started with Columbia trying to observe these humans and they're doing weird stuff and you want to know why are they doing this stuff. So I zoom out even bigger at the hundred plus billion people who have ever lived on earth. Why do you think we're all doing what we're doing? What do you think is the meaning of it all, the big why question? We seem to be very busy doing a bunch of stuff and we seem to be kind of directed towards
Starting point is 02:33:26 somewhere. But why? Well, I myself think that we make meaning for ourselves and that we find inspiration in the meaning that other people have made in the past. You know, and the great religious thinkers of the first millennium BC and, you know, few that came in the early part of the second millennium laid down some important foundations for us. But I do believe that we are an emergent result of a process that happened naturally without guidance. And that meaning is what we make of it, and that the creation of efforts to reify meaning in like religious traditions and so on, it's just a part of the expression of that goal that we have to, you know,
Starting point is 02:34:51 not find out what the meaning is, but to make it ourselves. And so to me, it's something that's very personal, it's very individual, it's like meaning will come for you through the particular combination of synergistic elements that are your fabric and your experience and your context in your, and you know, you should, it's, it's all made in a, in a certain kind of a local context though, right? Because what, here I am at UCSD with this brilliant man, Rommel Hart, who's having these doubts about symbolic artificial intelligence that resonate
Starting point is 02:35:54 with my desire to see it grounded in the biology. And let's make the most of that. Yeah. And so from that, you know, yeah. And so, and so from that, like, little pocket, there's a some kind of, uh, peculiar, little, emergent process that then, uh, which is basically each one of us, each one of us humans is a kind of, you know, you think cells and they come together and it's an emergent process that then tells fancy stories about itself. And then gets, just like you said, just enjoys the beauty of the stories we tell about ourselves.
Starting point is 02:36:33 It's an emergent process that lives for a time, is defined by its local pocket and context in time and space. And then tells pretty stories. And we write those stories down and then we celebrate how nice the stories are and then it continues because we build stories on top of each other and eventually we'll colonize hopefully other planets, other solar systems, other galaxies and we'll tell even better stories but all starts here on Earth. J. Year, speaking of peculiar emergent processes that lived one heck of a story, you're one of the great
Starting point is 02:37:17 scientists of cognitive science, of psychology, of computation. It's a huge honor. You would talk to me today. They you spend your very valuable time I really enjoy talking with you and thank you for all the work you've done. I can't wait to see what you do next Well, thank you so much, and I you know, this has been an amazing opportunity for me to let ideas that I've never fully expressed before come out because you asked such a wide range of, you know, the deeper questions that we're all we've all been thinking about for so long. So thank you very much for that. Thank you. Thanks for listening to this conversation with Jay McClelland. To support this podcast, please check out our sponsors in the description. And now, let me leave you with some words from Jeffrey Hinton. In the long run, curiosity-driven research works best.
Starting point is 02:38:09 Real breakthroughs come from people focusing on what they're excited about. Thanks for listening and hope to see you next time. Thank you.
