Sean Carroll's Mindscape: Science, Society, Philosophy, Culture, Arts, and Ideas - 272 | Leslie Valiant on Learning and Educability in Computers and People

Starting point is 00:00:44 Hello, everyone. Welcome to the Mindscape Podcast. I'm your host, Sean Carroll. One of the problems with reality, as I see it, is that there's a bunch of puzzles, questions, problems, if you like, that are hard to solve. And I'm not even thinking about, you know, moral or political or social problems. I mean just mathematical problems, or at least problems that you can state rigorously and quantitatively. There are problems that, in principle, you can find an algorithm for providing a solution,

Starting point is 00:01:17 but that algorithm is so inefficient it would take forever or a very, very long time to actually do it. This is, of course, a whole area of knowledge, right? Within theoretical computer science, we have computational complexity theory, asking the question: if you have some kind of question, like what is the shortest distance that a traveling salesman would have to go through, given these stops that they have to make along the route, how many steps would your computer program need in order to solve it? And one of the nice things about those problems is that even though it might take a lot of steps to solve it, at least there's a solution, right? At least there is one right answer. The difficulties become enormously

Starting point is 00:01:59 larger when you imagine being a scientist, let's say being a theorist within science, right, where you have some data collected by your experimentalist friends, and your job as a theorist is to come up with a best-fit model to the data, to come up with a theory that best fits the data in the sense that you can extrapolate it beyond the data you have and have a good chance of continuing to fit it. Well, that's going to be a difficult problem to even conceptualize, because maybe you don't have enough data to uniquely fix the thing. Maybe you have lots of data, but you don't have any good ideas about how to fit it. That's why we theoretical physicists make the big bucks, right?

Starting point is 00:02:42 Anyway, today's guest, Leslie Valiant, has done incredibly important work within theoretical computer science on exactly understanding this kind of puzzle, the idea of getting better at fitting some data in a way that can be extrapolated, the best learning that an automated system can do. I have to read you a little excerpt, in case you don't know who Leslie Valiant is, from his Wikipedia page. Valiant was awarded the Turing Award in 2010, having been described by the Association for Computing Machinery as a heroic figure in theoretical computer science and a role model for his courage and creativity in addressing some of the deepest unsolved problems in science, in particular for his

Starting point is 00:03:28 striking combination of depth and breadth. So he's written a book that is named after an idea he had, this is a while ago he wrote this book, called Probably Approximately Correct: Nature's Algorithms for Learning and Prospering in a Complex World. And as I say in the podcast, I just love this phrase, probably approximately correct. The idea being that you have some guess as to how to extrapolate from your experience to what's going to happen next. There is literally no way that you can guarantee ahead of time that you're going to be right in that

Starting point is 00:04:04 extrapolation. You can't even guarantee that you will be approximately right. You're going for that. You're trying to be approximately correct, but you only have a chance of being approximately correct. And what you can show, in certain very rigorous circumstances, is that you can probably be approximately correct, that you have a good chance of doing that. And that's what all of us scientists actually aim for. So one of the great things about Les Valiant is that he thinks about these very deep ideas in rigorous, quantitative, theoretical computer science, and then does try to apply them more broadly. Thus, the subtitle, Nature's Algorithms for Learning and Prospering in a Complex World. So he has a new book that is just about to come out called The Importance of Being Educable: A New Theory of Human Uniqueness. So

Starting point is 00:04:55 what he wants to do is encourage everyone to shift their perspective from thinking about human beings on a scale of intelligence, how smart you are, how many things you know, to a scale of educability. How good are you at learning new things? He tries to make the argument that knowing things, being good at solving puzzles or whatever it is, is much less important than being able to learn how to solve puzzles, how to understand things. And that's also something that you can quantify and you can actually think about it for machines as well as for human beings. So he wants to argue that what really makes us different than other animals is that we can be educated, which is a little bit different than learning, right? A cat or a dog,

Starting point is 00:05:41 well, maybe not cats, but dogs certainly can learn to do things, but they generally learn to do things by being shown what they are wanted to do, and then, you know, trial and error, right? You can't read a book if you're a dog and learn to do things that way, whereas human beings can transfer new information, can generalize it, can string it together in ways that, according to Valiant, are not there elsewhere in the animal kingdom. So that may or may not be right. I think that's a good empirical question. He is very, very honest about when he puts forward a conjecture and it still hasn't been tested yet. But it's a nice way of thinking slightly differently about our capacities and capabilities.

Starting point is 00:06:22 So with that, let's go. Leslie Valiant, welcome to the Mindscape Podcast. Thank you for having me. So if I look over your works, your research over the years and writings, the word learning is the one that appears over and over again. So you're a computer scientist. Why is it that the word learning

Starting point is 00:07:00 became so important to how you think about things? Yes, so I started in computer science looking at computational complexity, which is about the inherent difficulty of doing computation. And I did various things in that field for about 10 years. But I regard it as a very fundamental field of computer science, maybe the most fundamental. And the most basic idea which occurred to me is that if it's no good at solving the basic problems of the mind and AI, then it's no good. So it was a real challenge for the field. And so then I looked at, you know, at AI. And I saw that there were topics discussed in AI conferences.

Starting point is 00:07:44 And I quickly tried to figure out, well, which one is the most fundamental? And asked that way, it seemed to me that it must be learning. So I zeroed in very fast on learning and tried to understand it from the point of view of it being a computational process, which wasn't so easy. There were limitations, and some of the limitations were statistical, but I thought the main ones were computational. It just needs a lot of computation to do it well. And, to skip ahead a little bit, were you right? Well, I think I was.

Starting point is 00:08:23 These big, big LLMs do spend millions of dollars in being trained. And so that's exactly what the model was, that you have to have a model of learning, which somehow promised you that the amount of computation needed may be much, but it was still polynomial time. It was still doable. And there's, this is an honest question. I don't know the answer. There's machine learning, which is a specific approach, you know, a specific set of algorithms.

Starting point is 00:08:53 But then there's the more broad category of machines learning. Are those two separate things? Is machine learning a general term or a more specific one? Well, these things are used in many senses. For me, they're the same, I think. Machine learning is the name of an academic field which includes various things, but clearly it tries to capture what happens when machines learn.

Starting point is 00:09:21 It should be the same. So it goes beyond neural networks. Sure, sure. So neural networks are just one of many algorithms for doing learning. And from the 80s onwards and maybe even before, people were exploring a whole array of different algorithms, and different algorithms are useful in different contexts. So in some contexts, where we've got enormous datasets,

Starting point is 00:09:45 neural networks have been shown to be very good. But still, there are many other algorithms, which are widely used for smaller datasets. Could you kind of put us in the mindset of how people were thinking in the 1980s? I have this vague feeling there was super excitement about AI in maybe the 60s and then it cools off a bit in the 70s and it begins to return again in the 80s. Yes, I can only speak for myself, of course. So I came into learning from a theoretical perspective. So from what I was doing, this theoretical field called computational learning

Starting point is 00:10:25 theory grew up, which focused very much on formulating learning from examples. So maybe one of the big services done by that community. So I formulated this PAC model, the probably approximately correct model of learning. And basically what it captures is how you generalize, that is, when you learn from examples, when do you declare yourself successful? So clearly you have to predict future examples accurately, as accurately as you can. But also you want to show that the effort made to predict future examples, to be in the position to be able to predict future examples, should be doable, moderate in terms of polynomial time and things like that. And so in particular, the more effort you make, the more accurate your prediction should be.

Starting point is 00:11:24 But furthermore, the more effort you make, the more that effort should be reasonably rewarded. So the rewards for more effort should go up, not infinitesimally slowly. And this boils down technically to algebraic curves with a constant exponent, the error being the effort raised to a constant exponent. I see. Anyway, so this was a theoretical model. It was widely explored, and I think it focused attention on just this very specialized problem of learning from examples efficiently. And then soon afterwards, for reasons partly inspired by this, partly for other reasons, people started to have data sets which they shared, and they systematically compared the efficacy of different algorithms, how well they did on different data sets. This started from the mid-80s, and there grew this very large experimental machine learning community.
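
The PAC criterion described above can be made concrete with a standard textbook bound that is not stated in this conversation: for a finite class of candidate rules H, a learner that outputs any rule consistent with m >= (1/epsilon) * (ln|H| + ln(1/delta)) random examples is, with probability at least 1 - delta, wrong on at most an epsilon fraction of future examples. The sketch below simply evaluates that formula. The function name pac_sample_bound and the choice of conjunctions over 10 Boolean features (a class of size 3^10) are illustrative assumptions.

```python
import math


def pac_sample_bound(num_hypotheses: int, epsilon: float, delta: float) -> int:
    """Number of examples sufficient for a consistent learner over a finite class.

    Textbook bound (an assumption of this sketch, not quoted from the episode):
        m >= (1 / epsilon) * (ln|H| + ln(1 / delta))
    epsilon is the tolerated error ("approximately"),
    delta is the tolerated failure probability ("probably").
    """
    if num_hypotheses < 1:
        raise ValueError("num_hypotheses must be at least 1")
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil((math.log(num_hypotheses) + math.log(1.0 / delta)) / epsilon)


if __name__ == "__main__":
    # Hypothetical class: conjunctions over 10 Boolean features, so |H| = 3**10
    # (each feature appears positively, negatively, or not at all).
    class_size = 3 ** 10
    for eps in (0.2, 0.1, 0.05, 0.01):
        m = pac_sample_bound(class_size, eps, delta=0.05)
        print(f"epsilon={eps:<5} examples needed: {m}")
```

Halving epsilon roughly doubles the number of examples, which is the sense in which extra effort is rewarded at a polynomial rather than an infinitesimally slow rate.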

Starting point is 00:12:30 So for a long time, the neural nets weren't competitive because they just performed badly on small datasets. But, as you know, when the data sets became bigger, they became more and more competitive. So in the machine learning field, there was lots of excitement because theoretically it kind of worked. And experimentally, it worked on many datasets. And anyway, to me, it was a plausible entry into what the mind did. It was something which was computationally feasible. Because from a theoretical perspective, the previous things people tried to formalize, like logic, reasoning, you know, everything turned out to be computationally intractable.

Starting point is 00:13:13 So from a theoretical point of view, this was exciting for the community which did this. But there grew this large experimental community from which the current successes all derive, I think. I do want to mention the phrase probably approximately correct, or PAC learning, as you're calling it. And there's also a book that you wrote with that title. That is one of the best titles I've ever seen for a book. And I presume that, I mean, is it allowed to take lessons that are implicit in that phrase and go beyond computers and specific learning algorithms to sort of a more general epistemological goal we should all have, to probably be approximately correct? Sure, sure. You're given the license to do exactly that.

Starting point is 00:14:11 But I think the phrase, although it's intended as a technical one, indeed, I mean, it should be more widely used. So in fact, the sense in which I think it should be broadly understood is that when people try to understand AI now, where the successes of AI are basically just exactly this learning phenomenon, I think the important point is that, you know, the phenomenon is that there's some promise that averaged over many cases, your predictions would be good.

Starting point is 00:14:42 But over any single case, the promise is very weak. You know, in every single case, you know, the promise is weak. So certainly just knowing this phrase, you should immediately realize that when you try to use AI for some safety-critical application, you should be very careful, because in any one critical case, the promises are very weak. And it seems, though, almost like a simple philosophy of science, right? That science is not logic. It's not something where we deduce with absolute certainty a result.

Starting point is 00:15:16 We have some examples, like you said. We have hypotheses based on those. And our aspiration should be to be approximately correct as often as we can. Yes, exactly. So actually in this earlier book, I do indulge in philosophical speculation. And I think the distinction I make there is that some tasks we do are theoryful and some are theoryless. So by theoryful, I mean something where we have a good theory, for example, like physics, where we have a theory, we believe we know what's going on and our predictions follow this theory. But most of what we do in everyday life is kind of theoryless.

Starting point is 00:15:58 We don't know the rules. When we make up nice flowing sentences, we don't know exactly how to do it. But the point is that nevertheless, we may have some high skill in being able to do this. And we get this skill from this learning process of many examples, exactly as with ChatGPT. It can make continuous sentences, really flowing prose, but we can't characterize what that means. So exactly, so the take I have is that much of what we do is theoryless, but it doesn't mean that it's not effective or predictable, or predictable on the average, or useful, because these learning processes are in some sense robust and useful.

Starting point is 00:18:13 It makes me think, speaking of philosophy, of David Hume and his worries about induction, you know, saying that you can't know anything with certainty even if every single example you've seen is that way. And in some sense, it seems like in at least this computer science context, you can formalize the fact that despite the fact that anything could happen, you have reason to believe that probably you have a model that is going to give you a pretty high probability of getting it right.

Starting point is 00:18:48 Well, exactly. So, in fact, when you were asking about the 1980s, so when I was getting into this field, I was reading the philosophy and what the philosophers called the problem of induction, a problem which goes back to the Greeks. And I think the philosophers' efforts kind of faded out because they didn't quite know what to do with it. And I do think that computer science has solved it. I mean, solved it in the sense that it's given it one meaning in which it is understandable. Of course, philosophers sometimes use the word very generally, so it's more general.

Starting point is 00:19:26 But I think computer science has given a meaning to this, and in the domain in which it's meant, it's kind of solved. Yes, I think philosophy is important here. And you said this already, but I do want to highlight it, the question of the efficiency of the calculation and the computational complexity, because philosophers, again, and I sometimes masquerade as a philosopher myself, but sometimes they imagine that you're Laplace's demon, and you have infinite calculational capacity. But in the real world, it matters if something takes n steps or n squared or e to the n steps.

Starting point is 00:20:04 And as I understand it, one of the nice things about PAC learning, probably approximately correct, is that you can show that it is doable. It's efficient in some quantifiable sense. Sure. So I think it's a description of some real-world phenomenon. And it does explain that some things we can learn effortlessly, like children learn the meaning of words in their language,

Starting point is 00:20:30 and fairly reliably. That is something easy to learn somehow, although we don't know why. But there are also things which are hard to learn. So figuring out the physical laws of the universe, just by looking at the stars, isn't obvious somehow. People have to work very hard at working that out. So not everything is easy to learn. But obviously, we live off things which are easy to learn.

Starting point is 00:20:55 And so there are some problems that are sort of intractable. You know, I mean, you're the computer scientist here. Do we have a clean division into problems that are efficiently solvable and problems that look pretty intractable to us? Well, in computation, that's the field of complexity. But even if you're restricted to learning, so what's easy to learn and hard to learn, the obvious thing to say is that cryptography is really the flip side of learning. So in cryptography, you encode messages, and someone listening in on many of your encoded messages,

Starting point is 00:21:34 you don't want them to be able to decode your future messages. So cryptographic functions have to be things which are hard to learn. That's their design. And so those are examples of hard-to-learn functions. Things we believe are hard to learn, and we actually use them every day. And it's important that they be hard to learn. So, yeah, so the spectrum of easy to learn and hard to learn,

Starting point is 00:21:59 at least the extremes are pretty clear. We use the easy ones. We use the hard ones for cryptography. And then there's something in the middle, which we don't understand so well. And I guess that's actually very useful because it highlights the sense in which you're using the word learning. It's not just memorizing information, getting more knowledge. That is not what you mean by learning. Yes.

Starting point is 00:22:23 So in this sense of learning, you need many examples, and you generalize from many examples. You abstract something from many examples so that in the future you can act, you can classify a new example. So you get labeled examples, pictures of elephants or not elephants, labelled as elephants and not elephants. In the future someone gives you a picture and you have to label it as an elephant or not. So you are trying to invent a rule that you will then test maybe against future data. In practice, how constrained is that generalizing process? Do real computer scientists imagine that there is a predetermined set of possible rules that we're sifting through? Or are the computers really using their imagination somehow?

Starting point is 00:23:09 Well, they're using a learning algorithm. So it depends. You choose what learning algorithm you want to use. So if you use a deep neural net, then the rule will be a deep neural net, which you learn. Or if you want to do something simpler, like a decision tree, then it will be a decision tree. So it's up to you: the learning algorithm produces some rule from the class of rules it is able to produce. And so this is stuff that you started thinking about and writing about in the 80s, I guess. And my impression, correct me if I'm wrong, is it's now kind of background knowledge for everyone who is doing AI.
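
As a rough illustration of a learning algorithm that returns a rule from a fixed class, here is a minimal sketch of about the simplest class imaginable: one-feature threshold rules of the form "x >= t", which amount to a depth-one decision tree. The function names learn_threshold and predict, and the single numeric feature standing in for an "elephant or not" picture, are hypothetical choices made for this example and do not come from the conversation.

```python
from typing import List, Tuple


def learn_threshold(examples: List[Tuple[float, bool]]) -> float:
    """Return the threshold t whose rule 'x >= t' makes the fewest training mistakes.

    The rule class is fixed in advance (all one-feature threshold rules);
    the learner only chooses which member of that class fits the labels best.
    """
    if not examples:
        raise ValueError("need at least one labeled example")
    values = sorted({x for x, _ in examples})
    # Candidate thresholds: each observed value, plus one above the maximum
    # (the rule that labels everything False).
    candidates = values + [values[-1] + 1.0]

    def mistakes(t: float) -> int:
        return sum((x >= t) != label for x, label in examples)

    return min(candidates, key=mistakes)


def predict(threshold: float, x: float) -> bool:
    """Apply the learned rule to a new, unlabeled example."""
    return x >= threshold


if __name__ == "__main__":
    # Hypothetical labeled data: (feature value, is_elephant).
    training = [(1.0, False), (2.0, False), (3.0, True), (4.0, True)]
    t = learn_threshold(training)
    print("learned threshold:", t)
    print("new example 3.5 labeled as elephant?", predict(t, 3.5))
```

Swapping in a richer rule class, such as full decision trees or neural networks, changes what can be represented but not the overall shape: labeled examples go in, one rule from the chosen class comes out.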

Starting point is 00:23:50 This PAC learning is a good way of thinking about what is going on right now. I believe so. I think it does capture the main phenomenon which happens when you train a big learning algorithm. So, for example, the fact that it gets better and better with more data, and it gets better and better along some algebraic curve. That's what the experiments show. I know. I think there seems to be debate these days on

Starting point is 00:24:35 scaling and how much it will affect the future intelligence abilities of AI models, right? I mean, there clearly have been enormous advances very recently with large language models, but they also still make mistakes. And am I right to think that there's like a camp that says all of those mistakes are going to go away as we get more and more data? And another one that says, no, some mistakes are kind of systematic. Well, I'm not sure. They should all say the same thing: that the mistakes will maybe go down, but the effort made to make them go down more and more will be bigger and bigger. So just following this approach, having one big learning box,

Starting point is 00:25:14 we understand what it does and we also understand its limitations. So in fact, from the 90s onwards, I was looking for ways in which we can build on this and do different architectures than just a single box. Okay. So I think the future is still learning, still using the same phenomenon of learning boxes, but probably in a system you'd have many boxes and you'd have some sort of reasoning capability, you'd have some way of chaining together the conclusions they've reached to simulate something more like our reasoning capability. So when you say you have many boxes, you don't just have many copies of the same kind of box.

Starting point is 00:25:55 You have different kinds of learning approaches that are going to collaborate with each other. Well, not necessarily. Maybe they're trained on different data. Ah, okay. They're trained on different data, like different boxes may recognize different concepts or maybe recognize different words in the English language. And does this bear on the question of whether or not, let's stick with large language models because they're so in the news right now,

Starting point is 00:26:24 whether these models understand things, whether they know things, or is it a different kind of concept we should be using? Well, okay, so my answer is that these large language models are clearly trained for one thing, which is predicting the next syllable, the next token, as they call it. And that's what they're trained for.

Starting point is 00:26:48 They're very good at that. They're hard to beat. And any other attribute you give them is your intuition. And people seem to be very generous to large language models in intuiting all the things they intuit. But I think if you intuit other capabilities, like reasoning, I think it's very hard to prove those capabilities. And I don't think they've been proved. Certainly,

Starting point is 00:27:19 since such a large effort has been put into these large language models, people should explore whether maybe by luck they're doing something else in addition. But we don't know. And I think it's quite likely that they're not. They're very good at very smooth prose. They're experts at figuring out the next thing. And obviously, the quality of any of these learning systems depends

Starting point is 00:27:51 crucially on the data they're given. Everything depends on the data set. And certainly people who produce these large language models put an enormous effort into having a very good data set. You know, human trainers, this and that. Okay, so it wouldn't work well otherwise.

Starting point is 00:28:07 So besides, you know, having the learning algorithms, the other secret is, you know, the data it's trained on. Well, I'm pleased to hear you say that. It's very similar to things that I said in a recent solo podcast myself where I tried to make the point that what is impressive to me about large language models is not that they are thinking like human beings

Starting point is 00:28:29 are. That's not what they're trained to do and what we should expect them to do, but they can sound like they're thinking like human beings are, and that is a fascinating fact that maybe we should work to understand better. Yes, yes. I mean, that's a small, tangential comment, that maybe our standards are low. We're impressed by anyone who speaks smoothly and can chain together five sentences coherently; that impresses us usually. But maybe it's easier to do that without thinking if we're given a billion examples. So just in this AI world, just for a while, before we're going to generalize in a bit,

Starting point is 00:29:09 but are there clear paths, besides getting more and more data, to pushing the algorithms in the direction of cognition or reasoning or thought or something like that? Can we do the same kind of thing but with a different twist to make it more like what we recognize as thinking? Well, okay, so now you're going into an area where opinions differ. Good. So clearly in AI, initially there was this big effort in just basing it on reasoning. So people tried to use classical logic to do the reasoning. So that wasn't robust enough.

Starting point is 00:29:53 It was hard to put in knowledge and be robust. So the problem with using classical logic and somehow marrying it with machine learning is that the assumptions are rather different. Classical logic is very deterministic, unforgiving to errors, unforgiving to inconsistency, whereas machine learning is very forgiving to errors, inconsistencies and everything else. So the approach I've been trying to pursue is to marry learning and reasoning, but distort the reasoning process to be compatible with learning.

Starting point is 00:30:35 So you need some reasoning which forgives errors, where things are correct only to a certain probability. You know, if A implies B and B implies C, then A implies C would only be true with high probability if the assumptions are true, that kind of thing. So anyway, so I think that's the right way in which AI could go if we care more about reasoning. So, for example, at the moment, even if you are impressed with large language models, you probably wouldn't wake up in the morning and decide what to do depending on what

Starting point is 00:31:11 your large language model recommends to you, you wouldn't do that. But if you wanted to make, you know, if you wanted to make it a bit more reliable, a bit more authoritative, then I think that's the kind of thing you'd have to do.

Starting point is 00:31:22 You'd have to make systems which conform to what we conventionally think of as reasoning. And it is possible to combine, to do certain kinds of reasoning on knowledge which is learned. So I think it's a possible avenue. It's likely that it'll be done for more limited domains of knowledge initially.
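
One hedged way to make the chaining remark above precise, offered as an illustration under stated assumptions and not necessarily the formalism Valiant has in mind: measure the error of a learned rule "A implies B" as the probability of encountering A without B. The symbols \epsilon_1 and \epsilon_2 are assumed error rates for the two learned rules. Because any case with A but not C must either have A without B or have B without C, the errors of chained rules at most add:

```latex
\Pr[A \wedge \neg B] \le \epsilon_1, \qquad \Pr[B \wedge \neg C] \le \epsilon_2
\;\Longrightarrow\;
\Pr[A \wedge \neg C] \;\le\; \Pr[A \wedge \neg B] + \Pr[B \wedge \neg C] \;\le\; \epsilon_1 + \epsilon_2 .
```

So a conclusion reached by chaining two approximately correct rules is still approximately correct, but the guarantee weakens with every link in the chain, which is one reason long chains of reasoning over learned knowledge are harder to make reliable than a single learned prediction.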

Starting point is 00:31:50 So the fact that large language models can, you know, talk about everything is impressive. But if you really wanted to reason, then probably computationally it would become totally prohibitive to impose on that. But there are ways for AI to progress where some reasoning capabilities are integrated with learning. I was recently, I found myself on a panel discussion on AI in physics. And there are clearly very obvious applications when you have these huge data sets that physics gets from particle physics or cosmology or whatever, and you want to use a model to find

Starting point is 00:32:32 features in those data sets. That's an obvious application. But as a theoretical physicist who wants to invent new concepts to describe the world, that's harder for us. And a friend of mine, fellow physicist Jesse Thaler from MIT, suggested that what we need is large language models, but not in the space of tokens, but in the space of concepts, that they can sort of string together concepts in new ways

Starting point is 00:33:01 and be creative theoretical physicists that way. Is that a pipe dream, or is that something like a hot research project these days? Well, I think, you know, so I'm not quite sure why he's talked about large language models. I mean, if you say machine learning, machine learning on concepts in theoretical physics, I understand. I mean, it doesn't have to be this string-like thing of large language models. I don't think it has to be presented as text and sentences and paragraphs. Sure. So certainly using machine learning to predict

Starting point is 00:33:37 some new law of physics from lots of data, in principle, that's something one can pursue. I don't know how successful it's going to be, but it depends very much on how you represent the knowledge. And, you know, so, yeah, yes, so I think it's something to pursue for sure.

Starting point is 00:34:43 Okay, okay, we've gotten the AI maybe out of our system. We'll come back to it, I'm sure. But I do love the fact that in your books, you are willing to go beyond computer science and draw larger lessons and think about wider problems.

Starting point is 00:35:18 You know, and in a very sort of legitimate way in the sense that like maybe this works, here's a hypothesis, you know, let's go test it kind of thing. So I guess one inroad here is, does nature use probably approximately correct learning? Or is that a way of thinking about information processing in the natural world, not just in computers or brains? So you mean nature beyond computers and beyond brains as well? I'm thinking of biology and evolution and life and things like that, early life. Well, I think so. So I pursue that. So certainly in Darwinian evolution, and I've worked on that as well.

Starting point is 00:36:04 So in fact, the book you referred to earlier, Probably Approximately Correct, I do have a chapter on that. So one can formulate Darwinian evolution as a learning process where the, basically, the labeling, there's no one who labels elephants and not elephants, but the labeling is survival. Okay. So different things happen and some survive, some not, and that's the feedback from the world.

Starting point is 00:36:31 So, yeah, so in my opinion, this is, yeah, I mean, this is a phenomenon of evolution, and I think it's a basic phenomenon of our cognition. So in fact, I think the basic question you're asking is that going from computer science to the natural world. Okay, what am I doing here? Yeah. So I think I'm really going the other way. Okay. So as a general way of looking at it,

Starting point is 00:37:04 so the idea that what humans can do can also be done by machines is something which Alan Turing already discussed. And for good scientific reasons, Turing's thesis, he said, yes, everything humans do, machines can do also. But I think what held back AI for a long time is that we couldn't identify what we actually did, what was it that we have to simulate by machine. Okay, so it's a lack of understanding of ourselves.

Starting point is 00:37:35 And then we reasoned, we tried logic, it didn't work so well, but it's learning from examples which humans do. So that's worked out better. There's a theory there, and also in practice it works. That's what everyone uses. So I think in general, in computer science, I think everything we ask computers to do, we've got the idea from humans. Okay. Computers do what, all the ideas come from humans having done it already in some form.

Starting point is 00:38:05 And so I think in pushing AI forward, we really have to understand ourselves better. So what is it that we do? Okay. So we learn from examples. What else do we do? Trying to put that down on paper, I think, will help us understand ourselves and give us something to simulate, something as goals for computers. Yeah.

Starting point is 00:38:31 So I think the two go very naturally together, I think, humans. Yes, that does make sense. But, you know, I like this connection to Darwin and natural selection, because I don't really think I've thought of it this way. Like, when you say the word learning, I think about going to school, listening to lectures, doing homework, you know, things that happen in our brains. But you're saying if you conceptualize learning as coming up with the model, with the algorithm that, you know, generalizes from the inputs it's had, and your goal is reproductive fitness,

Starting point is 00:39:06 right, then, you know, what nature is doing, whether you like it or not, is tuning the genome to solve this problem by this approximately correct kind of paradigm. Yeah, so the learning algorithm there is the mutation algorithm and whatever goes into that. And the feedback it learns from is survival. And it's trying to fit the world, to be a good fit to the world. And does it fit into our understanding of the efficiency? I mean, is Darwinian evolution a good algorithm for solving that problem or is it kind of sloppy? Well, so the problem is that Darwin didn't tell us the details.

Starting point is 00:39:50 And we still don't know the details. So one theory is that the mutations in the genome are a uniform random noise. But we know it's not exactly that, but we don't know whether it's somehow a clever version, clever variant of that which helps evolution. So, yeah, so the answer must be yes, that, you know, this must be what nature is doing. But even if it's kind of a pencil and paper explanation of what's going on that, you know, if you do a mutation, how much does it change the expression level of your genes

Starting point is 00:40:39 and, you know, to mutate from this to that, you know, what's needed, how many examples do you need, that kind of thing. So that hasn't been, more work needs to be done there. Yeah. But I think there's scope, even for a pencil-and-paper plausibility analysis, that evolution can work as fast as it has, because I don't think there's any scientific explanation of why evolution has succeeded as fast as it has.
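
As an editorial aside, the picture described above, where mutation plays the role of the learning algorithm and survival is the only feedback, can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not Valiant's formal model of evolvability: the bit-string genome, the hidden `environment`, the `fitness` count, and the single-bit `mutate` step are all hypothetical stand-ins chosen for illustration, and the mutation noise is taken to be uniform, which the conversation itself notes is not exactly what nature does.

```python
import random

def fitness(genome, environment):
    """Survival feedback: number of positions where the genome suits the environment."""
    return sum(g == e for g, e in zip(genome, environment))

def mutate(genome, rng):
    """Uniform random noise (an assumption): flip one randomly chosen bit."""
    i = rng.randrange(len(genome))
    child = list(genome)
    child[i] = 1 - child[i]
    return child

def evolve(environment, generations, rng):
    """Keep a mutant only if it survives at least as well as its parent."""
    genome = [rng.randint(0, 1) for _ in environment]
    history = [fitness(genome, environment)]
    for _ in range(generations):
        child = mutate(genome, rng)
        if fitness(child, environment) >= fitness(genome, environment):
            genome = child
        history.append(fitness(genome, environment))
    return genome, history

rng = random.Random(0)
environment = [rng.randint(0, 1) for _ in range(20)]  # hidden from the learner
genome, history = evolve(environment, 500, rng)
print("fitness at start:", history[0], "fitness at end:", history[-1])
```

Nothing in the loop ever reads `environment` directly except through the survival score, which is the sense in which the labeling is survival. How many generations such a process needs as the genome grows is the kind of efficiency question the conversation describes as unsolved.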

Starting point is 00:41:06 Yeah. Unsolved science, I think. Right, good. There's lots of unsolved problems. But, okay, we're going to fearlessly leap ahead to human beings. Human beings, I think our audience is all going to agree, can learn. We have that capacity. And in your new book, you have a new book out called The Importance of Being Educable: A New Theory of Human Uniqueness. And you're kind of shifting from learning,

Starting point is 00:41:33 which is a fact, to educability, which may be, there's a gradient or a spectrum of abilities there. Well, yes. So, yeah, so I build on learning. So I assume we can learn. I mean, that's something which, a theory which has worked well. So, and as I said, I regard learning as a capability, which you can define fairly precisely. And I think humans do it, machines do it well. And so the basic question I ask is, you know, what else do we do, which is fundamental? And I think the way the question is asked in the book, which I think yields the answer is, so in the course of evolution, at some point humans emerged. At some point, we became capable of doing various things, like our civilization, and it happened fairly suddenly, so maybe the changes were kind of sudden, weren't that great, and it may even have been some composite capability,

Starting point is 00:42:42 which suddenly came together, but what capability do we have, which can account for our civilization? And of course, similar questions have been asked ever since Darwin by any number of people, and the usual problems are that if you pick one feature, there's no feature which is totally unique to humans. So everything you define, you know, some animal can do.

Starting point is 00:43:10 So what the book is about, so this educability I define, so it's composed of three basic things. One is learning from experience, which is exactly the same as learning from examples, PAC learning. So once you can do that, it's international even for low animals. You can do that, you know, low animals 100 million years ago could have already adapted, and if food was towards the light they would go to the food. But once you do that, then you should be able to also chain together what you've learned in different contexts. And this is what I alluded to, what I hope computer systems would do more. There's some sort of chaining of different things you've learned. And that's a very basic kind of

Starting point is 00:43:55 reasoning. And so here the question is, how can you put that also on some principled basis, that if you chain together two pieces of knowledge, why are you so sure that you don't get nonsense? Okay, so maybe they're inconsistent in some hidden way. You've learned in different contexts. So once you go away from classical logic, the guarantees of chaining things together get lost. So you need some basis for chaining.

Starting point is 00:44:26 And then the third aspect of educability is a very human one of not having to learn from experience or chain from experience, but being able to just take from someone else their experience. So they tell you how they do things, they tell you their theory of physics, they tell you the theory of politics, and you just put it into your brain, and then you can apply it the next minute.

Starting point is 00:44:53 So this is taking theories explicitly given. Right. So the difference is, you know, we can train a dog to do tricks, but basically it's learning from examples, right? Like, you get rewarded when you do this, you don't get rewarded when you do that. As far as I know, there are no cases where I can just speak English to tell the dog to do something that it has never experienced before and it goes and does it. So this is your candidate for something that really makes human beings different. Well, I think even that's difficult to make. I think the candidate is the

Starting point is 00:45:30 integration of these three things. Okay, fair enough, yeah. On top of each other. I mean, there are cases that you can, for apes, you can sometimes by physical demonstration, give them a complicated task to do, to retrieve food from some tube, et cetera, et cetera. So it's almost like giving instructions

Starting point is 00:45:51 which they can remember and repeat. Okay. So again, being able to give lengthy instructions once and have an animal repeat it, again, isn't totally unique to humans. But we're certainly much better at this. We sit in lecture rooms and listen to podcasts and take stuff in which animals don't.

Starting point is 00:46:12 Right, okay. And so it's the integration of these three things. And the two things are easy to say. One is you're learning from experience, the other is you're being taught. The conjoining or combining different models is a little bit more slippery in my brain, maybe you can help flesh it out a bit.

Starting point is 00:46:32 I mean, do we need a symbolic representation of two different theories in order to ask if they're consistent, or is it more implicit somehow? No, no, this is kind of simple in some sense. So it's like if you think of your mind's eye, so when you plan something, you plan how to go to, I don't know, dinner this evening. So you imagine a situation, use some knowledge to predict what will happen if you do X, and then once X has happened, then you get a new situation and then you use some new knowledge to tell you what you should do next. So planning is an example where in your mind you chain together pieces of knowledge and you run the world forward.

Starting point is 00:47:14 And also with reasoning, very simple reasoning that something happens, you have some knowledge, you know what's going to happen next. Then some other knowledge tells you what's going to happen next. So this chaining is something we do all the time. It's for planning, reasoning, and this is what we need. This seems an essential part of being able to operate. It's not just like having a single neural net and applying a single neural net and saying, yes, X.
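
As an editorial aside, the chaining described above, where one piece of knowledge predicts a new situation and another piece of knowledge is then applied to that situation, can be sketched in a few lines. This is only an illustration of the idea, assuming each piece of knowledge is a function from a situation to a predicted next situation. The dinner scenario, the state fields, and the rule names `leave_house` and `take_bus` are hypothetical and not taken from the book.

```python
def leave_house(state):
    # Piece of knowledge 1: if you are at home and leave, you end up on the street.
    if state["location"] == "home":
        return {**state, "location": "street"}
    return state

def take_bus(state):
    # Piece of knowledge 2, possibly learned in a different context:
    # from the street, a bus fare gets you to the restaurant.
    if state["location"] == "street" and state["has_fare"]:
        return {**state, "location": "restaurant", "has_fare": False}
    return state

def run_forward(state, plan):
    """Chain the pieces of knowledge: each one is applied to the predicted situation."""
    trajectory = [state]
    for step in plan:
        state = step(state)
        trajectory.append(state)
    return trajectory

start = {"location": "home", "has_fare": True}
trajectory = run_forward(start, [leave_house, take_bus])
for situation in trajectory:
    print(situation)
```

Each rule here is trivially reliable, so the chain is too. The harder question raised in the conversation is what guarantee survives when each rule was learned from examples and is only approximately correct, which this sketch does not address.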

Starting point is 00:47:41 But it does, it's related to a hypothesis that Adam Bulley talked about on my podcast earlier. And he and others have been, they're psychologists, thinking about mental time travel, the ability to put yourself into a future situation hypothetically, right? And I just think of it as imagination, but it's very close to what you're saying. It's sort of working out a set of things that haven't actually happened, but you're able to have the capacity to say it would happen, given my theories.

Starting point is 00:48:14 Yes, because we can take in other people's theories, and we almost don't care whether they're true or not. We don't know whether they're true or not, but we can do this mental processing with it. You can watch a movie. Whether it's fact or fiction, you almost don't care, but you can process it. We've got this great facility to understand what's going on,

Starting point is 00:48:39 to draw implications of what's going on, and whether it's fact or some fantasy or some future thing which hasn't happened, no, we don't even care. Right, right, good. And so this net capacity to do these three things, this is educability? Yes, yes, in my view.

Starting point is 00:49:00 That's my definition of educability, yes.

Starting point is 00:50:23 And one of your mottos or slogans is that educability is perhaps more important than intelligence, which we tend to talk about all the time. Sure. So I think the main downside of intelligence is that no one can define it. You know, of course, people have complained that we give importance to intelligence. We test people to do intelligence tests and things,

Starting point is 00:51:03 this has consequences, and we don't even know what we're testing for, where the questions come from. So I think it's very unfortunate that the notion of intelligence has become so important because it's not explicitly defined. So I've explicitly defined educability. Intelligence has no such definition. In fact, the early

Starting point is 00:51:27 Sir Charles Spearman, I think, who first tried to deal with IQ, notions by IQ using statistics. I mean, he defined this as an implicit statistical notion. So basically, I think for him, general intelligence came from a study of how children in schools did in the different subjects.

Starting point is 00:51:54 And it turned out that the children who were good at one thing were likely to be good at something else as well, that performance in different subjects in school was correlated. And then he almost defined intelligence to be the core of a correlation. But it's very implicit, and we still don't know what the definition is. So the main downside of intelligence is that we

Starting point is 00:52:17 don't know what it is. And when people try to define it, they, you know, say it comes in 10 different varieties and people disagree. So I think it's, I think we should go and look for more useful notions. Well, I guess that's the next question. It's nice to be able to define educability, but that falls short of convincing us that it is the important feature that has, you know, enabled civilization, as you talk about in the book. Well, it's got some of the important components. So certainly, you know, people discuss how humans are unique as far as our culture. We've got synonymous culture.

Starting point is 00:53:00 We can hand culture on so easily. You know, one person in a lecture room or someone can write a book and a million people, if they read the book, get it the next day, and animals have access to no such phenomenon. So certainly the spread of human culture is certainly based on being able to transfer explicit theories. So that part is essential. And it's part of this competition process. Sure. So I mean, for any model, its value depends on how useful it turns out to be.

Starting point is 00:53:38 So that we don't know yet. But it's a candidate. Well, when would you, or could we pinpoint a moment of historical time when we say, oh, okay, human beings have figured out how to educate themselves, or is it more gradual? You mean in the past? Yeah. Yeah. Historically, like, was it hunter-gatherers? Did it come along with agriculture?

Starting point is 00:54:05 Well, okay, so when we got this capability, okay, so I don't need to speculate on this, but I can, okay? But I think this could have come even before humans. Okay. So I think it's possible that, you know, we had them 300,000 years ago, some predecessor species already evolved this, and it just took a very long time for it to become useful to us. It's like a snowballing effect that you need more and more knowledge

Starting point is 00:54:34 to share among humans before, you know, it becomes useful. So there's, I mean, there's no evidence of any mutation having spread to the whole human population in the last tens of thousands of years. So there's no biological explanation of a new capability in the recent times of agriculture and things like that. So my guess is that the capability is much, much earlier. But I guess, I mean, you're a college professor, as am I. You've had students. The capacity to be educated differs from person to person, right? I mean, it's not just that humans have it. It's a skill or a capacity that we can improve on. We can make better. We can make bigger.

Starting point is 00:55:26 Well, yeah, so in almost any aspect of life, individuals differ in performance, if you give them some test. So in the book, I do raise this question of whether educability can be enhanced by some process, and I don't know the answer to that. So, I mean, I concentrated on the fact that this educability, I think, is something common to all humans, but, you know, we may have different levels of it, and whether it can be enhanced, I don't know. Certainly, I mean, if the concept has any meaning, then it should be measurable. There should be some way of saying, yes, we have so much of it. And I suggest that research could be done along these lines where people try to test

Starting point is 00:56:22 educability. And so the main feature of that, in one sense, is very simple, is that when you practice your educability, you're gaining new information. So if you have a one-hour educability test, you should not be testing for anything which you knew beforehand. Sure. You should avoid any kind of skill. You should be testing for kind of skills, which native skills.

Starting point is 00:56:48 You should be testing for information you've got in that one hour. So it is slightly different from what people do currently. Of course. Yes. It's the opposite of what people do currently. Yes. Yes. Yes.

Starting point is 00:57:00 Exactly. But also, I mean, it's very hard because maybe you're giving somebody some new information and asking how quickly they can learn it or new information and asking how quickly they can generalize it. But it suffers from the same worries that large language models do. How do you know that you're not tainted?

Starting point is 00:57:21 How do you know that this person hasn't thought about similar things before? Yes, for sure. It's difficult to do. I suppose you just have to, the questions would have to be so designed that there's a likelihood that so maybe there's some artificiality

Starting point is 00:57:35 in the questions or the subject matter is so obscure that you wouldn't expect the person to know about it. Yes, that's the difficulty of the design. But, I mean, that's what education is, right? You're trying to impart new things. So if you want to measure that, then somehow you have to solve that problem. And you at least ask the rhetorical question.

Starting point is 00:57:57 Should we care about this property of educability? For example, in the leaders that we choose. So do you think that we should undergo a shift from valorizing intelligence to valorizing educability more? I suppose if only because I still don't know what people mean by intelligence.

Starting point is 00:58:19 Okay, so I think that's the question. But I mean, I think in leaders, I think this makes a lot of sense because as we see, you know, when people become

Starting point is 00:58:33 leaders, they come in with so much knowledge, but, you know, new things happen. And we really want them to be able to use the new things. It's not enough that when they come into their leadership position, they know everything that is worth knowing. I mean, new things happen, and they want to be able to pick it up and use it.

Starting point is 00:58:55 I guess there is a current worry about things like misinformation, right? Or just the fact that we get so much input from the world, from the Internet, and sifting through it and paying attention to things becomes a crucial skill. Is that something that we get new insights on by thinking about educability as a central concept? Well, the insight I got from thinking about this, which I hadn't appreciated fully before, is that, you know, when I described this

Starting point is 00:59:27 educability model, it seems a very powerful way in which we can, you know, learn information and process it. But somehow there's nothing in it which is good at testing the information we've received, especially in the third mode, if someone tells us a theory. There's nothing in my model which gives us a capability to check out

Starting point is 00:59:52 whether this new theory is correct. So we're very easy prey to false theories, to conspiracy theories. If someone tells us their political theory, how are we to know whether it's good for us or not, you know, we're not sure. Maybe we relate it to our previous knowledge, but so I think we're very bad at evaluating theories

Starting point is 01:00:15 other people give us. Because I think it's just inherently difficult to do, I think there's no way it can be done. So I think, so now that there's this deluge of information, I think this human weakness may become more dangerous for us, although I think the important point for me is that this weakness is in us as humans, and we have to deal with it at the human level. So it's not just a question of the deep fakes and the new tricks for fooling us,

Starting point is 01:00:46 we could have been fooled by older methods as well. And somehow we have to recognize that we're easily fooled and be educated to understand it. And it's not just a question of the new technology. I think we have to deal with the inherent weakness. I did have a couple of recent conversations that made the point about the social aspects of how human beings learn and pass knowledge down. We are more trusting of other human beings than other species are of their fellow beings. And that's enabled us to sort of learn faster because we have teachers that we trust and believe and things like that. And maybe there's a dark side of that, which makes us a little bit too willing to accept what certain favored people say.

Starting point is 01:01:39 Yes, so certainly, you know, public education is based entirely on trusting the teacher. If you go to a physics course, you have to trust. There's no way the student can verify everything they've learned, so you have to trust. But still, I think we're not totally trusting. So psychologists do experiments where they compare, you know, if children are given information by two different people. And children do have preferences for who to believe. Do they believe their parents or someone who seems to know more about the subject?

Starting point is 01:02:12 So we are born with some strategies for dealing with this complicated world. But obviously the strategies aren't that powerful. But the idea is who to trust is critical. Yes. Does, again, this lens of educability suggest better ways to educate people or to have a school system, to focus learning? I mean, maybe the answer is no, but have we been led astray by thinking too much about intelligence

Starting point is 01:02:45 and not enough about educability? Well, I think more research could be done, so I don't know the answer to your question, but I think, you know, I think this formulation does certainly ask new questions. I mean, the most obvious thing to say, I think, is that somehow people have talked about the science of education, but still, I don't think that education is pursued from a very scientific point of view. It's very much best practices kind of thing. And, you know, there's some background science which you could build an education system on top of,

Starting point is 01:03:25 but it's not used that much. So I think there's a lot of scope for putting more of a scientific basis for developing more of a scientific basis of education by further research. Whereas I think this notion of intelligence, I don't think has provided us with very much. By scientific basis, are you thinking of an empirical testing of what works and what doesn't, or a theoretical superstructure to explain why certain things work? Well, I think it's kind of both that in the end you do have to test whether something works, but to get some assurance that this thing, this working transfers to other situations, often experiments are done in one situation.

Starting point is 01:04:12 Some new education technique works, but it doesn't really transfer. So I think some, as you could say, some superstructure would be useful to help us understand why there's some chance of some approach working. Okay. So I think it's a combination. I mean, maybe an easier question, not very easy, but an easier one than reforming the school system is just reforming our individual selves, right? Is it possible for a person to improve their own educability, to learn how to be educated better? Is that just a matter of becoming a better thinker, better scientist, or is there more to it than that?

Starting point is 01:04:50 Well, I don't know. I think that that should be a subject of research. But, you know, if one can measure educability, then you can ask the question more rigorously. That if you could do some educability test and see whether something you do for yourself improves your educability. So if you don't measure it, then, you know,

Starting point is 01:05:16 not get what it means again. So again, so I don't know. But I think it does raise questions, which I think seem to be new. Well, you have a, as usual, in the podcast preparation, I can skim through the book of the person who's coming on, but I can't read every word. You have a chapter entitled education as a model, educability, I think, as a model of computation. Could you explain what that means? That sounds very interesting and important. Yeah, okay.

Starting point is 01:05:46 So part of the book is a justification of this mixing of computation and human behavior. So I think this is kind of a philosophical question of where is the science in computer science? So, like, Turing had this Turing machine. So what's the science? Is it just a mathematical definition and that's the end of it? Or is there something more? And what I try to explain is that with, you know, computational models, the Turing machine is clearly the best example.

Starting point is 01:06:25 There is a kind of a new kind of way of trying to get to knowledge where you have a definition which has some properties of robustness. For example, if you make a variant of it, then it's very good if it captures the same notion still, that if you define some notion of computability, you don't want the definition, you don't want the meaning to change if you change some little part of the definition.

Starting point is 01:06:54 You want it to be robust. So there's some such characterizations. So in this chapter, I try to suggest that educability satisfies some of these notions, that one should have some confidence that this model has some robustness, that if you try to express similar things tomorrow, you'll probably get a model

Starting point is 01:07:19 which is kind of the same or similar. Okay, so it's trying to justify why this is a scientific approach. Okay. So some people would say the only scientific approach is to do experiments. So if I'm talking about anything about humans, in the title I should do experiments on humans. But I'm not doing experiments on humans. So what am I doing? So that's what I'm trying to answer.

Starting point is 01:07:41 Good. I mean, and maybe that sort of brings us full circle to the AI goings on these days. You know, there's been a lot of excitement over the last year or so. Does, you know, we've asked, is it important for leaders to be educable? Can we improve our own educability? What does it mean for a computer to be educable? Is it exactly the same meaning? Is it, you know, these three things that you listed? And if so, can we or should we or are we aimed at giving computers those capacities? Yeah, well, I suppose the point of my book is that the aims should be about the same. But I think what, again, having written this down, what one realizes is that the difficulty of being educable is that someone has to decide the content of the education. Okay, so just being educable doesn't produce useful human beings if what they're educated with is just bad stuff. Okay, so, and the same with computers. So the difficulty is with the current pure learning systems

Starting point is 01:08:56 obviously what the training set is, but if you make them educable, then there are even more decisions you have to make about, you know, what knowledge you give them, because depending on what knowledge you give them, the results will be different. So being educable has its great dangers as well. You can educate machines very easily and also humans very easily to do things you don't want done. Well, I always, you know, near the end of the podcast, I always encourage the guests to let our hair down a little bit. You know, you just touched on these looming issues that a lot of people are super concerned about, about AI risks, whether literally existential risks to the planet and the species or, you know, smaller scale political, social risks.

Starting point is 01:09:44 I mean, you're someone who has made enormous contributions to helping make machines seem intelligent. Are you worried that it's going to get out of control, or is that something that we'll just keep monitoring and tweak along the way? Well, I think the one thing I would say is that the most extreme fears of this singularity, you know, I don't see or support because these arguments for singularity are usually based on some sort of mysticism, that machines will become super intelligent in a way we don't understand and they'll take over. So my view is more that, you know, what machines will do, they will be more capable along lines we understand, you know, learning, reasoning, taking our theories. So at least they'll do things, they'll do processes we understand.

Starting point is 01:10:40 And these are the same processes we do, I believe. Okay. So we've got some scientific basis of understanding what they do. So then AI is just like any other kind of dangerous science, like, you know, chemistry. Yeah. Where, you know, bad things can happen. But, you know, if we understand, and we know what we know and what we don't know,

Starting point is 01:11:03 we can kind of have some control of what goes wrong. So something like, you know, in the pharmaceutical industry, if there are some new drugs that are released, there are very thorough tests. One can't predict ahead of time, you know, whether they'll succeed or not. One can't predict whether the tests will really be totally foolproof; maybe mistakes are made. But AI is kind of a similar kind of thing, in that we understand enough, we'll be cautious, we'll take common-sense precautions.

Starting point is 01:11:37 But the idea that somehow they'll take over without us letting them take over, I think, is a misplaced fear. Good. So, I mean, putting aside the misplaced fear, what is your, and this will be the final question. What do you think will be the biggest impact on human lives from the fact that AI is making such advances? Well, I think we'll just, you know, well, it's going to be more like a mixed economy where some things are done by machines, some by humans, and we'll just have to get used to that. And how it'll evolve in detail, of course, no one knows. Computers will get into all aspects of our lives. But we'll just have to get used to the idea that many of the things we do, computers will do. And we shouldn't be upset by that. We shouldn't be upset by that.

Starting point is 01:12:34 That's always good advice. To control what we can control and live with what we can't. So Leslie Valiant, thanks so much for being on the Mindscape podcast. Thank you.

