Sean Carroll's Mindscape: Science, Society, Philosophy, Culture, Arts, and Ideas - 184 | Gary Marcus on Artificial Intelligence and Common Sense

Starting point is 00:00:32 Hello, everyone, and welcome to the Mindscape podcast. I'm your host, Sean Carroll. If you've been paying attention to advances in technology, science, or just news in the world, it's hard not to be impressed with recent progress in artificial intelligence, mostly driven by neural networks and deep learning, machine learning kinds of techniques. We've really been able to do things with AI that when I was your age, we just couldn't do.

Starting point is 00:01:10 For one thing, artificial intelligence programs are easily able to kick the butts of human beings when it comes to games like Go and chess, which was considered very far away, not too long ago. Another example is GPT-3, which you may have heard of, which is one of these language processing things where you can ask it a question or you can give it a prompt in some sense and it will respond or will continue on in the vein of the words that you gave it

Starting point is 00:01:38 on the basis of the fact that it has read a lot of things and it sort of looks for correlations between them. And finally, and maybe, you know, most importantly for our everyday lives, AI is everywhere around us in vision recognition, you know, recognizing the images that are in front of us, maybe even self-driving cars or something like that someday. But certainly recommendations, what music to listen to, what movies to watch, etc. AI is at work.

Starting point is 00:02:06 So on the one hand, it's very impressive. On the other hand, none of these are going to be confused for a human being. None of these versions of AI are going to pass the Turing test in some very advanced way. You can sort of jigger up versions of the Turing test that are passable by modern AI, but it's not a full-blown general intelligence, right? AGI, artificial general intelligence, the kind of AI that would really fool you into thinking, it might as well be human. So today's guest, Gary Marcus, thinks that he knows why we're not able to do that. More importantly, he thinks that we're moving in the wrong direction or focusing on the wrong things to make progress

Starting point is 00:02:45 in this particular direction. The idea is that there are certain kinds of things that neural networks, deep learning is good at looking for correlations in gigantic data sets. There's other things that it's not good at. It's not good at understanding in some vague sense that we would like to define. It's not good at common sense, at understanding how the world works at a basic level so that in individual circumstances we can apply our knowledge in a kind of reliable way. That's why self-driving cars turn out to be harder than we thought they would be. The world out there is a messy place, and you need a picture of the fundamental way the world works, not just a set of correlations in your computer, or at least that's what Gary would say.

Starting point is 00:03:28 And he even has advice for how we can make progress in the right direction, because there's been a shift in how artificial intelligence research has been done. In the early days, it was symbolic. You would try to define symbols, variables in your AI program that represented different things, and then look for relationships, or try to define relationships between the different variables. Whereas it's almost more mindless today, the deep learning algorithms just take in a whole bunch of data and then spit out correlations between them. As Gary points out, the best deep learning algorithms actually are hybrids. They actually make use of the symbolic approach as well, but he still thinks we should be going a lot further in that direction.

Starting point is 00:04:12 We really need that kind of understanding-based approach to make artificial intelligence of the kind you would recognize as human-like in some sense. And that might not just be something we want to do because it would be cool. It might be important technologically going forward. So we're going to dig into that. It's a lot of fun. Gary's a very opinionated guy. He actually started out in neuroscience and psychology before moving into AI.

Starting point is 00:04:36 So he really cares about how real human beings think. He wants to make computers think better than they do today. So let's go. Gary Marcus, welcome to the Mindscape podcast. Glad to be here. So we're going to talk about artificial intelligence. Let me give my impression of the history very, very briefly so you can tell me whether I'm correct or not. Like there were go-go days of artificial intelligence, maybe in the 60s and 70s where we were first getting computers that were up to the task of even thinking about it.

Starting point is 00:05:21 And people thought very soon we'd be talking to them and having deep philosophical conversations. And then it didn't pan out. It turned out to be harder than we thought. These days there's a bit of a resurgence, right? I mean, AI is kind of everywhere, neural networks, people are thinking about self-driving cars, and you're on the side of being a little bit of a gadfly, right? You're saying like, okay, yes, we've had some successes here, but this is not going to be just smooth sailing until we get real human-level intelligence. Is that a fair overview? Yeah, I mean, I could pick some details. So I think the first bit of enthusiasm was in the 50s

Starting point is 00:05:58 rather than the 60s. And the first winter, the first AI winter, which I think is important, was in the early 70s when there was something called the Lighthill Report, I think 1973, that said, hey, this stuff isn't really going anywhere. We're putting all this money into it, but what are we getting for it?

Starting point is 00:06:15 And research really slowed down then. And I think that the field lives in permanent worry that that might happen again. And maybe it should live in even more worry than it does. So, you know, that's a little edit there. I would like to clarify, for the record, that I love AI. I'm not someone who thinks AI is impossible.

Starting point is 00:06:38 You used the word gadfly. I would probably use the word skeptic. And to that point of that skepticism, there's both a specific and a general. The general thing is one of my favorite graphs I ever saw was the prediction for how far away, let's say, artificial general intelligence might be, although the term is new, but the idea has been around for a long time, that how many years until we have AI that's actually, let's say, as smart as people, and we can talk about whether that's even the right criterion. And you look, and it's basically always people say it's 20 years away.

Starting point is 00:07:12 Classic. They always say, you know, 20 years from now. So, you know, that itself is a little bit of an object lesson. And then there's a question about what is it that we actually have now? What have we made progress on and whatnot? And I know as a businessman, you have respect for the data, and you have respect for the fact that there's different kinds of data and different kinds of measurements. And there are some measurements on which the kind of, I'll call it the orthodox Kurzweilian notion of exponential growth seems bang on.

Starting point is 00:07:41 And one example for that is chess playing. Another example is Go playing. On these board games, there has really been exponential growth or even super exponential growth. You know, you look at Go now as compared to when I was a kid and computers couldn't play it at all. I could beat a Go program then. There's no way I could beat AlphaGo. So on those kinds of things, there's been exponential growth. On some other things, growth has been slower than the popular media would have it. So, you know, if you had read the popular media over the last few years, you'd probably think that vision was solved, that we now know how to do computer vision.

Starting point is 00:08:20 How to recognize that. The reality is we have actually made progress there, but we have not solved the problem. So I'm looking at you right now on a, you know, a Zoom type call. And I guess your audience isn't looking at the image that I am, but I can instantly parse what's going on there. And not just label the objects, but I can actually answer questions. Like, there are things on the wall and I can make guesses about how they might be mounted there. And I would be surprised if they started floating around the room. Like, I have an understanding not just of the entities, but how they relate to one another. I see like stacks of books and there's paper on top of them and I understand, you know, why the paper is not floating and not falling.

Starting point is 00:09:00 So I have this kind of integrated with physics understanding of the scene. And AI does not have that. And that's part of what perception is. So, you know, every other month there's another study now showing these so-called adversarial attacks and so forth, showing that these vision systems can be fooled. In fact, mostly they rely on texture and things like that. So one of my favorite examples from recently was a, what was that? I think a fire truck overturned on a snowy road, and the system said with great confidence that it sees a snowplow.

Starting point is 00:09:32 And it did that because on a snowy road, if there's a vehicle, it's likely to be a snowplow. And these systems are very much driven by textures that they see and by kind of probabilities of what things are generally likely. They don't have an overall understanding of the scene. Another recent example was a – I'm trying to remember what it was. It was something or other, I think it was an apple with the word iPad on a piece of paper in front of it. Right. iPod. And so, you know, it thought that that was an iPod because the word was written on the page.

Starting point is 00:10:09 So, you know, perception is not actually solved, but there has been actual progress. So that's the intermediate case. The first case was true exponential progress. Then there's perception where there are pieces of it where there is exponential progress and pieces of it not so much. And then there's natural language understanding and reasoning. And I would say we have not really made progress at all. You know, GPT-3, which we may want to talk about, gives the illusion of having natural language understanding, but I don't really think that it does.

Starting point is 00:10:36 And we are nowhere near, for example, an all-purpose general assistant. We're nowhere near to having the kind of language you would want if you had a domestic robot. I have a cartoon in my book where somebody says, put everything in the living room away, and the robot ends up cutting up with a saw the couch, right? Because it doesn't understand what it is that we would mean by put everything in the living room. We have no candidate solution for that problem. It's not just that we made no progress. We don't even know how to make progress on that.

Starting point is 00:11:06 Now you're making me sad that there's no artificial intelligence podcast editor. That would make my life a lot quicker. That would be great. I mean, you know, even there's an example of like how AI does actually help in some ways. So now there are tools if you want to put together PowerPoint slides for an online talk in this crazy era in which we are living that will automatically transcribe and do a pretty decent job and then go find where the breaks are in the words. So like there are a lot of places maybe that you wouldn't even expect where AI is actually

Starting point is 00:11:37 helping now. There are also some places where it's hurting now. We should talk about those too. But, you know, AI is real now and it wasn't before. And that's both a blessing and a curse because some of it's reliable and some of it's not, and there are all kinds of problems with it. But just on that first question, you know, of history and where we are now, what I would say to wrap up a long-winded answer is we've made a lot of progress on a lot of things, but there's some core problems, which are mostly about understanding the world and what people are talking about. We really haven't made that much progress.

Starting point is 00:12:12 And it could be now we really are for the first time 20 years away, where all those other times we, you know, we weren't, or it could be we're still like 50 years away. Yeah. The core question of how you use common sense knowledge of the world in order to interpret things. So again, back to the scene in your room. Like, also there's a place where the lighting values are higher. And I can guess that that's outdoors.

Starting point is 00:12:37 Like making all those inferences about what I see and what's likely to be going on, we just don't know how to do that yet. And maybe it's good to, we're able to get into the details a little bit. The audience likes the details. So let's try to understand why there has been this progress. And as far as I can tell, the overwhelming majority of recent progress in AI has been driven by neural networks and deep learning algorithms. Is that fair? And what does that mean?

Starting point is 00:13:04 It's true, but with some caveats. So, first of all, there are older techniques that everybody takes for granted, but are real and are already out there. Second of all, there are things like AlphaGo that are actually hybrid models that use classical tree search techniques enhanced with Monte Carlo techniques in order to do what they're doing. So they're not just a straight multi-layer perceptron, which is a kind of stereotype that people have of neural networks. You have some inputs. They feed into a hidden layer that does some summation and an activation function, and that goes to an output. They're not just that, right? They actually borrow some important ideas about search, for example, and symbols from classical

Starting point is 00:13:49 AI. And so they're actually hybrid systems, and people don't acknowledge that. So there's a second caveat I would give you. The third caveat I would give you, we can come back to the second, but the third caveat I would give you is, yeah, most of the progress has been with deep learning lately, but most of the money has been there, too. And it was really interesting to see, I mean, and I don't just mean like 60% versus 40. I mean like 99.9% of the investment right now, literally, is in deep learning. And classic symbol manipulation AI is really out of favor. And people like Geoff Hinton say don't

Starting point is 00:14:25 spend any money on it at all. And so it was really interesting. There was this competition presented at the NeurIPS conference, which is the biggest conference these days in the AI field, just a month or so ago on a game called NetHack that has various complications in it. And a symbolic system actually won in an upset victory over all this deep learning stuff. And so, you know, if you look back at the history of AI and the history of science more generally, sometimes things get counted out too soon. It is true that deep learning has made a bunch of progress. But the question is, you know, what follows from there?

Starting point is 00:14:59 No, I'm not actually trying to make any value judgments. I just would like to explain to our audience what the options are. Like, what do you mean by deep learning? What is that? And what is that in comparison to symbolic manipulation? Deep learning is fundamentally a way of doing statistical analysis on large quantities of data. At least that's, you know, its forte. You can actually use it in a bunch of different ways.

Starting point is 00:15:23 But most of the progress has come from that. And what's impressive about the recent work is it allows us to learn from very large quantities of data. The classical AI systems really didn't do a lot of learning at all. They're mostly hand-coded. And sometimes that's the right thing to do. So we don't need to learn how to do navigation. We need to learn some details, but we don't need to learn how to do navigation for the purpose of one of the most useful AI things out there, which is route planning, telling you

Starting point is 00:15:56 how to get home from whatever crazy place you wound up in. Right. That's not a deep learning driven system. But there are other systems where if you can glom on to all the data that's out there, you can solve certain problems very effectively, and that's what deep learning has been good for. So an example of that is labeling your photos in, let's say, the Apple Photos app or Google Photos or something like that. There, what you really want to do is to get user data, like measured in the billions or trillions of examples, and have a system that can extract from all of that data,

Starting point is 00:16:31 what is the most likely label for this image, given the other images that are in my data, that have been labeled. So that's a kind of typical use that deep learning is very good at. And speech recognition is similar. So, you know, I hear this word. Lots of people have said it in lots of different ways. And, you know, I hear this particular sound. Is it like that collection of things that I've heard before or this other collection?

Starting point is 00:16:55 It turns out deep learning is just far and away the best and in some ways the simplest way to solve a whole bunch of problems like that. Sometimes it's only a little bit better than the other solutions and it gets more press maybe than it deserves, but it usually is the best for these problems. When we have billions and billions of training examples, it's usually the right way to go. But what is it? How does it work? It's statistics. It's correlations. But how does it find these correlations in ways that we couldn't do a few decades ago? So, I mean, the basic idea is not actually new. That's something I should clarify first. So the mathematics around this has actually been around

Starting point is 00:17:30 for decades. People had the idea to do it before. Basically, you're just trying to figure out, I have an error, how can I reduce the error that I've made before, adjusting weights between a bunch of things that we call nodes that are supposed to make us think of neurons. I mean, we can have a whole separate discussion about whether they have anything really to do with neurons, but they're at least loosely inspired by neurons, and you're adjusting the weights between them, sort of how loudly they talk to one another, and if one of them talks too loudly to the other one, you find out over time, well, I should make it talk

Starting point is 00:18:05 a little bit more softly and this one should talk more loudly and you're basically doing that on a mass scale and it turns out to work really well. The math was rediscovered a bunch of different times. There's actually a debate in this mailing list called Connectionists right now about that history and people periodically have these debates. There's no question that it's been around for a long time. What really happened is that people developed GPUs for video games that allow you to do a lot of the relevant mathematics in parallel at the same time. And that allowed people to do this deep learning thing at a scale that they didn't really even dream of 20 years ago.
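
The weight-adjustment idea described above can be sketched in a few lines. This is a deliberately tiny illustration, not a deep network: a single "node" with one weight and one bias, trained by plain gradient descent on made-up data. The data points, the learning rate, and the number of steps are all assumptions chosen for the example.

```python
# Minimal sketch: reduce the error by nudging weights a little at a time.
# A single node (weight w, bias b) stands in for a whole network here.

def predict(w, b, x):
    return w * x + b

def mean_squared_error(w, b, data):
    return sum((predict(w, b, x) - y) ** 2 for x, y in data) / len(data)

def train(data, learning_rate=0.05, steps=2000):
    w, b = 0.0, 0.0  # start knowing nothing
    for _ in range(steps):
        # How the error changes if w or b is increased slightly.
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in data) / len(data)
        # "Talk more softly" or "more loudly": move against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Illustrative data generated from y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train(data)
print(round(w, 3), round(b, 3))
```

A real deep learning system repeats the same kind of update across millions or billions of weights and many layers, which is where the parallel arithmetic of GPUs pays off.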

Starting point is 00:18:48 So that was a major thing. There's this paper called the Hardware Lottery. I'm trying to think of it. I think her name is Sara Hooker. She wrote this really interesting piece about how the, and I confess I've only read the summary of it. I haven't read the piece yet. But her thesis is basically you can have these accidents of history where a particular architecture or something like that is available at a particular moment and people just run with it.

Starting point is 00:19:14 And there's a little bit of that going on here. And I don't know if she makes this observation or not, but it connects with what she says, where it's kind of partly an accident of what we figured out how to parallelize first that has made deep learning as popular as it is. So, you know, it was clever to try to use these chips that were built for something else for the purposes of deep learning, and it really changed deep learning. I've made a remark earlier about, you know, unfairly dismissing things too soon. Deep learning was unfairly dismissed too soon.

Starting point is 00:19:45 So I have dismissed its ability to do sort of deep cognition, but that's a separate question. Its ability to do basic pattern recognition was actually in doubt. So in the early 2000s, Geoff Hinton, who's this big star now, was kind of like, he gave a poster at this conference, nobody came. Yeah. You're like, you know, this stuff doesn't really work. Classic story. We understand the math.

Starting point is 00:20:08 It's kind of cool. It has its own kind of elegance. But you're not really getting it to work. So like forget about it. And to his credit, he stuck with it. And once people like him could use it at scale, it turned out that this technique is actually lousy with small amounts of data, but it's brilliant with large amounts of data.

Starting point is 00:20:27 And so it was actually like a perfect storm. So one aspect of it was getting these chips, which made a huge difference. Another was we didn't have databases with trillions of examples, you know, in, let's say, the year 2000. You know, the internet is the other major technology that has driven deep learning. The internet means that you have large amounts of data. Yeah.

Starting point is 00:22:15 And in some sense from your description, it sounds like, I mean, it's artificial intelligence, but it's kind of dumb and kind of straightforward, right? I mean, you have all these perceptrons, these nodes, and they have weights, and they just float to whatever is the best at fitting the data on the training set without any deep understanding of what has happened, as you were talking about, you know, papers do not float in the air or anything like that.

Starting point is 00:22:49 So what's the, how would you characterize the alternative of the symbolic approach, I guess, is what you're calling it. Yeah, well, let me, before I do that, Let me say that I think that we need elements of the symbolic approach. I think we need elements of the deep learning approach or something like it, but that neither by itself is sufficient. So I'm a big fan of what I call hybrid systems that bring together in ways that we haven't really even figured out yet, the best of both worlds. But with that preface, because people often in the field like to misrepresent me as that symbolic guy. And I'm more like the guy who said, don't forget about the symbolic stuff.

Starting point is 00:23:27 We need it to be part of the answer. Okay. So the symbolic stuff is basically the essence of computer programming or algebra or something like that. I mean, what it's really about is having functions where you have variables that you bind to particular instances and calculate the values out. So the simplest example would be an equation in algebra. Y equals X plus two.

Starting point is 00:23:49 I tell you what X is. You can figure out what Y is. And there, it doesn't matter which X's you have seen before. You have this thing that is defined universally, is the way a logician might put it. Universally for everything in some domain, right? Any physicist would grasp that immediately or any programmer or any logician.

Starting point is 00:24:10 And that is the essence of what allows programs to work. So we're using a tool called Zencastr, and it's putting the bits together of your image such that I can see you and vice versa, and it's doing this in real time, because there are functions that can do that across any image. Then we might do some image processing if we're on Zoom to do segmentation. We could talk about that.

Starting point is 00:24:31 But the basic thing there is I have a function that says, you know, for any set of bits, I will do this function. And I don't care if this image is one that I saw before. And similarly, if I type something in the chat box, it doesn't matter if I come up with a novel sentence or a familiar sentence, whereas the deep learning stuff is all about similarity to the things that you have seen before. And so it's really a different almost thesis about what cognition should be. And I think the right thesis is actually our brains anyway can do both.
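
The contrast drawn here, between a function defined for every input and a system that answers by similarity to what it has seen, can be made concrete with the Y equals X plus two example. The sketch below is only an illustration: the nearest-neighbour lookup is an assumed stand-in for "similarity to the things you have seen before", not a description of how deep learning works internally, and the three training examples are made up.

```python
# Symbolic rule: bind x to any value, familiar or novel, and compute y.
def rule(x):
    return x + 2

# Similarity-based stand-in (hypothetical): reuse the answer of the
# closest example that was seen during "training".
seen = {1: 3, 2: 4, 3: 5}

def nearest_neighbour(x):
    closest = min(seen, key=lambda s: abs(s - x))
    return seen[closest]

# On a familiar input the two approaches agree.
print(rule(2), nearest_neighbour(2))
# Far from the examples, only the rule still gives x + 2.
print(rule(100), nearest_neighbour(100))
```

The rule never needed to see 100 before; the similarity-based version can only fall back on the nearest thing it remembers.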

Starting point is 00:25:00 We can do the logical abstract stuff. And I actually did experiments all the way back in the late 90s on human infants showing they could do the abstraction even at seven months old. So there's this ability for us to do abstraction, which allows us to be computer programmers or to do logic. And there's also this like heavy statistical analysis that we humans do. We're not quite as good as the machines at it, but we can do a lot of it. So we know that the word inextricably is often followed by linked or bound, but never by water.
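
The kind of word-pair statistics mentioned here can be sketched as simple counting. The four-sentence corpus below is invented for the example, and the function name is hypothetical; a real system would gather counts over vastly more text and use a far richer model than adjacent-word pairs.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus, split into words.
corpus = (
    "the two ideas are inextricably linked . "
    "their fates are inextricably bound together . "
    "art and politics are inextricably linked . "
    "the glass is full of water ."
).split()

# For each word, count which words come directly after it.
following = defaultdict(Counter)
for word, next_word in zip(corpus, corpus[1:]):
    following[word][next_word] += 1

def most_likely_next(word):
    # Returns None for a word that never appeared with a successor.
    counts = following.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(following["inextricably"])
print(most_likely_next("inextricably"))
```

In this toy corpus "inextricably" is followed by "linked" or "bound" and never by "water", so a predictor built on these counts would never propose "water" there.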

Starting point is 00:25:32 And we know a lot of statistical things too. Not at the same scale, but we're good at it. And we use it. So we use it in parsing sentences, for example. We make predictions about what the other person is going to say. But then again,

Starting point is 00:25:45 if the other person surprises us, we can usually figure it out. So, like, you know, a lot of comedy is based on saying something that isn't expected. Right. And having the listener, you know,

Starting point is 00:25:54 figure out that thing. And if you had a system that only kind of makes predictions, like you can think of GPT-3, which is the most famous language system right now, as a really amazing version of autocomplete. And autocomplete is, you know, pretty useful. And we autocomplete each other's sentences. But we also have to deal with the unexpected. And for that, symbols are actually really good. And this is why we need to bring both of these traditions together. Well, you have the example in one of your papers that I really liked of children learning how to make the past tense in English, where there's a rule, right? You add -ed. I podcast.

Starting point is 00:26:32 I podcasted. But then there's all these irregular ones where it doesn't follow the rule. And so it's kind of like for the regular verbs, it's symbolic kind of manipulation. And for the exceptions, it's more like a deep learning kind of thing. Exactly. And that's actually what my dissertation in 1993 was about, exactly that. It was about these split systems. In fact, I wandered off from AI for a long time because I just found it kind of really not very inspiring and came back around the time of Watson because I was surprised that Watson actually won at Jeopardy. I can tell you why I think it won. But I was surprised, and I'm not often surprised. And as a scientist, when I'm surprised, that really wakes me up. And so I was reawoken to AI in 2012, I guess or so by Watson. And then at around the same time, deep learning was popular.
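
The split between a general rule for regular verbs and item-by-item memory for irregular ones can be sketched as a small hybrid. This is an illustration only, not a model of how children actually learn: the irregular table is a short made-up sample standing in for the memory-like side, and the spelling rule ignores details such as consonant doubling and y becoming ied.

```python
# Exceptions are stored one by one; this table is illustrative, not complete.
IRREGULAR_PAST = {"go": "went", "sing": "sang", "eat": "ate", "be": "was"}

def past_tense(verb):
    # Memory first: a stored irregular form blocks the rule.
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    # Rule second: applies to any verb, including ones never seen before.
    if verb.endswith("e"):
        return verb + "d"
    return verb + "ed"

for verb in ["podcast", "go", "bake", "blick"]:
    print(verb, past_tense(verb))
```

The made-up verb "blick" still gets a past tense because the rule is defined over the variable "verb", not over a list of familiar words.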

Starting point is 00:27:24 And I was like, oh, man, I've seen this movie before because the stuff that I was working on for my dissertation, which included those regular and irregular verbs, which Steve Pinker called the fruit fly of cognitive psychology or something like that. All the same issues that went into my thesis looking at how children were doing things come up again now when we try to figure out, well, what can deep learning do for us and not? And it's like it can do the irregular verbs, but it's not so great with the regulars. Right. But it's interesting because, I mean,

Starting point is 00:27:57 so deep learning is very, very good at some things. Obviously, we've had tremendous success playing chess, playing Go. Protein folding is a new success. Can we pause there, though? So, like, sure.

Starting point is 00:28:09 The success in the protein folding and the success in the games actually depend on hybrid systems. And the media coverage of that, and even the internal understanding in the field, doesn't, I think, realize how much the hybrid stuff is important. So, for example, if you just took the deep learning part of AlphaGo, and it did not have all of the search stuff, the Monte Carlo search there, it wouldn't be that good. And similarly, AlphaFold has a whole lot of very careful structured representations around the nature of the three-dimensional geometry that it's trying to solve. And it's not just a simple multi-layer perceptron.

Starting point is 00:28:52 I'll put in arbitrary data, get arbitrary data out, and I'm good to go. So oftentimes the field kind of pumps up the deep learning and doesn't really talk about the other piece of it. I'll give you one other example, which is OpenAI had this example of a system, quote, solving the Rubik's Cube with deep learning. But if you actually read the paper, the part of it that I would think of as solving, which is like knowing which face to turn when, was done entirely by a symbolic algorithm, and they didn't mention that. The deep learning was doing the motor control.

Starting point is 00:29:23 And it was, you know, it was a nice contribution to motor control, not as nice as they made it out to be, but it was, you know, a real thing to be able to get the system to turn in one hand, the Rubik's cube at the right time. But the cognitive part of what should I turn in the Rubik's cube, which is kind of the part that makes it interesting to the average person. They pick it up and they can do the motor control, but they don't know how to do the other part. That was done by a symbolic system, and none of the media accounts talked about that. And, you know, there is a mystique associated with deep learning right now. But often it's actually just part of the picture.

Starting point is 00:29:58 And that just gets completely lost. Sure. But what I'm trying to get at is the fact that these successes don't easily generalize. You know, we think about chess and Go as quintessences of intelligent thought, right? But in some sense, they're really, really simple. The rule set is very, very simple, and it's mostly a matter of having enough capacity and computing power to think about it. And, you know, obviously there's a tremendous amount of cleverness that goes into designing the algorithms. And I take your point that it's a hybrid kind of algorithm.

Starting point is 00:30:29 But my impression is that if you change the rules of the game by a little bit, right, you change what you're allowed to do with the stones in Go or how the pieces move in chess, the algorithm that was the world's champion at the regular rules wouldn't be able to adapt very easily to the new rules, whereas a human being could adapt pretty quickly because it has more heuristic understandings of what positions are strong and things like that. Yeah, I pretty much agree with that. I mean, it could start over and learn a new game. I think that the things that DeepMind has built are pretty good at games where there's a closed world and you can gather an infinite amount of data for free. So, related to your point in trying to scope out what is the generalization and generalizability. So these systems are good at closed worlds where the rules are, you know, ideally haven't

Starting point is 00:31:26 changed in 2,000 years and you can play yourself and you get as much data as you want. They don't generalize as much to the real world because you usually don't have the same kind of fixed set of rules and it actually is costly to get data. So, you know, if you're trying to figure out what I should do today with my life, you can't get infinite data and you can't solve, you know, well, I'll give you an example from an article that I wrote with Ernie Davis in the ACM journal, and the illustrators came up with a great picture. We were talking about common sense reasoning and the importance of it. They had the picture of a robot on a tree cutting the limb from the wrong

Starting point is 00:32:06 side, such that if it succeeded in the cut, the tree limb was going to fall down and so was the robot. So that's an example of something you can't get by infinite self-play, right? You don't want to fall out of a lot of trees. You want to have some other way of getting to that source of knowledge. If you can work in a hermetically sealed problem where there's kind of no influence from the external world, and it's always the same thing, you can use this kind of brute force approximation. But if you have to deal with things you can't expect, it's problematic. Now, it's problematic for other approaches, too. Nobody has a great AI solution to dealing with the unknown.

Starting point is 00:32:46 So you might remember when Long-Term Capital failed, a billion-dollar epic mess-up, you know, a bunch of Nobel Prize winners had a model of what they thought would work, and they didn't realize that, you know, you could have problems with the Russian bond market that would influence this other stuff, right? That wasn't a deep learning failure. That was a failure of models, though.

Starting point is 00:33:07 And we don't know how to make models in general versatile enough to deal with the unknown. I mean, I was not a fan of Rumsfeld, but his point about unknown unknowns is actually a good one. Human beings are better at dealing with unknown unknowns, at least in some cases, than any technology that we've currently developed. So if you imagine trying to make a domestic robot right now,

Starting point is 00:33:32 Amazon's got something that they're talking about shipping, there's just a lot of stuff that comes up that nobody's anticipated. And if all you're doing is kind of looking stuff up in a database of what you've seen before, at some point that breaks down. So to your point about generalization, you know, nothing really unforeseen happens in Go if you've played yourself 20 million times. But in the real world, you know, it's snowing in Vancouver and that's not really happening.

Starting point is 00:33:56 I need to cross the street and I can't even see the street and now what do I do? Right. And, you know, systems aren't really built around that. Well, we can see that there's going to be some trade-off between letting the algorithm learn by itself versus giving it some structure, right? Like you mentioned for the protein folding, where it's not just consider every configuration in space. There are some preexisting ideas that are built in there. My impression is that for chess and Go, the lesson was don't spoil the algorithm by teaching it human tricks, because it'll learn faster just by playing against itself.

Starting point is 00:34:35 Yeah, I mean, that's true at this moment in the history of AI. I don't know that it's always true. So another weakness in current AI is we just don't know how to leverage existing knowledge. We don't know how to specify it and we don't know how to use it. There are some domains where it's actually fine. So we can do like taxonomy. So if I tell you that a penguin is a bird, you can make a bunch of inferences about that and realize that it's going to breathe and reproduce. There are some things where we can take a bit of knowledge and extend it further. But we don't know how, first of all, in the case of Go, we often don't know how to represent the expert knowledge. In some cases, we do. We don't really know how to use it. And it turns out in that domain

Starting point is 00:35:19 right now, as you say, but with emphasis on the right now, it is easier just to do brute force and just start over, basically, than to have a bunch of expert Go players tell you stuff. Although, you know, there was actually an expert Go player who was an author on one of these papers and probably did say some things. And there are some things that are built in because we know how to do them. Like we know that it's rotation invariant. Again, as a physicist, you know what I'm talking about. I can rotate the board.

Starting point is 00:35:46 I can flip the board. And basically I wind up in the same conceptual space. So there are some things we know how to build in. But here's another example. A general intelligence ought to be able to read Wikipedia and use all that information in order to make all kinds of decisions, like to help us with materials science or medicine or whatever. And we don't have systems that can do that,

Starting point is 00:36:14 that can sort of like take the results of hard-won human knowledge, and it could be about almost any domain, and put them in. So right now, the systems that we have are mostly kind of blank slates. They get whatever they know by having all of those nodes line up and balance out in the right kinds of ways without any, or without much, influence from the knowledge of the world.

Starting point is 00:36:46 And it's cool when it works. And in some domains it works well, others it doesn't. But like here's another case, driving. You would like to be able to just put in the rules of the California driver's code and stick it in with your deep learning system. But we don't know how to do that. You just don't. Well, language is a great example, and you know, you alluded to it several times already.

Starting point is 00:37:10 GPT-3 is the system that everyone talks about these days. I mean, maybe you can tell us just how GPT-3 works, what's so interesting about it. To be honest, I'm less impressed with its results than many people seem to be. Well, I may be even less impressed than you are, but many people are. It's true. In fact, to me, it's a kind of parlor trick that's actually a mistake in the evolution of AI. What it does is another of these systems, it's more complex than the one that we talked about before, but it's still basically about setting weights for connections.

Starting point is 00:37:45 It has some prior structure around attention to help it know about relations between words over certain space and time. There are things called positional encodings, and we don't have to go into all the technical details. But the basic framework, if you will, is you get a prompt, it sees some set of words, and then it predicts what might follow. And in this way, it's kind of like the mother of autocomplete. So you can type in anything, and it will continue in that same style. In some ways, it's astonishing. So, you know, you type in something that looks like a movie script, and it'll, like, continue, you know, often with the same characters and the same format and all of this.

Starting point is 00:38:33 As a kind of surrealist generator, it's fantastic. You type in part of a story and it will continue the story. So why am I not enamored of it when it's capable of doing some really cool things? I would not dispute that it can do really cool things. Oh, and also, like it's capable of being really grammatical, which earlier systems were not. And it's kind of astonishing in how it does that. Nonetheless, I think that it's misguided. And I think it's misguided because there's no real semantics there.

Starting point is 00:39:06 There's no underlying understanding of what it is talking about. And this manifests in different ways. So it will give you fluent speech. I wrote an article that was supposed to be called "GPT-3, Bullshit Artist." The editor wouldn't let me call it that. So I think it's called "GPT-3, Bloviator." But they did allow us to have our conclusion, which is that it's a fluent spouter of bullshit. And we had examples like this.

Starting point is 00:39:36 You're thirsty. You have some grape juice, but not enough. So you look around, you find some cranberry juice, you sniff it, you pour it into your glass, and then you, and GPT, it automates. And so it says, then you drink it, which is plausible, statistically speaking, as a continuation. And then it says, you die. And, you know, most people don't die by having cran-grape juice. It's usually pretty harmless stuff. So the system doesn't actually understand anything about toxicology or why you might die.

Starting point is 00:40:04 It's just, statistically speaking, the probability of the word die after you sniffing, you're thirsty, in some corpus that it has learned from happens to be high. And I think that illustrates what's really going on there: it's just looking through the corpus for correlations. It doesn't understand what these correlations are about. And that leads it to a weird position in terms of what it does. You can't type in an idea and have it formulate that idea in words, which is what classic computational linguistics tries to do. It can only do this game, and then people work around this game of, I'll feed my thing in and hope that it continues. And what they wind up with, for example, is a lot of toxic speech.
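
As a rough illustration of continuation driven by word statistics alone, here is a minimal Python sketch of a bigram autocomplete. It is far simpler than GPT-3, which is a large neural network with attention rather than a table of counts. The three-sentence corpus, the prompt, and the names `TOY_CORPUS`, `build_bigram_counts`, and `autocomplete` are all invented for this example.

```python
from collections import Counter, defaultdict

# Invented toy corpus (an assumption for illustration, not real training data).
TOY_CORPUS = [
    "she sniffed the poison then she drank it and died",
    "he sniffed the potion then he drank it and died",
    "they drank it and smiled",
]


def build_bigram_counts(sentences):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def autocomplete(counts, prompt, max_new_words=2):
    """Extend the prompt by repeatedly appending the most frequent follower
    of the last word. Only the last word is ever consulted."""
    words = prompt.split()
    for _ in range(max_new_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # the last word was never seen with a follower
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


counts = build_bigram_counts(TOY_CORPUS)
print(autocomplete(counts, "you poured the cranberry juice and drank it"))
# prints: you poured the cranberry juice and drank it and died
```

Nothing in the counts represents juice, poison, or dying. The continuation falls out of which words happened to follow which in the invented corpus, and there is nothing to ask about why it was chosen.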

Starting point is 00:40:43 And DeepMind just had like, you know, 10 people working on the problem. And actually, there are hundreds in the field trying to make these things not be as toxic. And there's no solution there because you just have the correlations. You don't have an underlying system where you can, like, query it the way you could query a database. You can query a database and say, you know, how many people of this age group are here or whatever. You can't query GPT and say, are you making a toxic remark or are you, you know, singling out a group? It doesn't know. It's just statistical correlations between words.

Starting point is 00:41:17 And so people are trying to put all these band-aids on top of it to make it less toxic. But it's not going to happen. The technology does not really afford that. And then it has a truthiness problem. So it's very fluent. And so it's easy for it to make stuff up and you not notice. And so, you know, it will make up whatever, you know, anti-vaccine stuff if that happens to be in the database. It has no idea what it is that it's spouting.

Starting point is 00:41:42 And there's, again, no band-aid that will solve it.

Starting point is 00:42:39 You had an analogy, which I thought was very illuminating, with the guy who won a Scrabble tournament in French, even though he spoke no French, because he just sort of memorized a list of French words that would be really useful in Scrabble. Yeah, I went back to that book by Fatsis to try to find, I think those people called them word tools or something like that. They don't know what the words are, so they're just using word tools, and that's exactly what's going on. That's why I don't even like playing Scrabble against other people, because if they're good, then they've memorized all these little words that, you know, fit very well. Two-letter words, man. That's where it's at.

Starting point is 00:43:20 Ah, terrible. But I mean, the thing that got me, I was actually, before I understood what it really was about, and I had seen some of the hype about GPT-3, I thought maybe it would be fun to do a podcast where I interviewed GPT-3 and had it, you know, voice synthesized. But then I realized the very basic fact that it has no memory. So it doesn't remember what you just talked about one question before. And so there's really just no, after five minutes, it becomes highly unamusing. I'm doing an art project around that notion. And I did some interviewing of GPT. And I would ask questions like, are you a computer?

Starting point is 00:43:56 And it would say, yes. I would say, are you a person? It would say yes, like the very next sentence. Like, it doesn't remember. So why not? Why can't you just add some memory in there? What is the conceptual leap that makes it hard? Or is it just that this?

Starting point is 00:44:11 That's a really good question. That's an A question there, for sure. What is it about the nature of the system that makes it non-trivial to just add memory? And some people have tried certain kinds of things. It's just built from the foundation in a different way. It's built from the foundation to correlate little bits of information, like the probabilities of these words following those words.

Starting point is 00:44:36 And it's not built to have a representational scheme. It does not contact a representational scheme about these are the entities in the world and these are their properties. It's just built on a completely different path. And maybe there's some way of merging them. And I do think that ultimately the answer to AI is going to come from merging at least some of the insights from the GPT tradition with some of the insights from the more classical AI tradition.

Starting point is 00:45:04 But I don't think it's going to come literally from merging GPT with these other systems because GPT does not have the internal representations that you need. It'd be like saying I've written this big computer program, but I'm not going to let anybody else see what's inside of it. And now I just want you to hum and I hope that they match together. They're not, you know. Yeah, there needs to be some planning to make them work together. There needs to be some planning around what are going to be what we call technically the interface conditions.

Starting point is 00:45:35 And it doesn't have an API, to use, you know, computer geek terms, an API where you can say, hey, what are the people that you're talking about right now? What are the assertions that you made about them? What are you presupposing? And you can't build the API because it isn't there. I mean, it seems to me like that's... It sometimes looks like it's there because it looks like it's coherent,

Starting point is 00:45:57 but it's a superficial illusion of the fact that it's drawing on this vast database of things that people have said. You can't build the API to do it. It seems like this is a pretty strong argument just by itself for deep learning or that kind of statistical correlation to be a tool used by a symbolic manipulator. You need some view of the world that is represented symbolically, but then by all means, have some deep learning help you with what the correlations are to predict what's going to come next. Well, there's a narrow version of that and a broader version, I guess.

Starting point is 00:46:32 The narrow version I think is actually wrong, and the broad version I think is right. So the broad version is, yeah, we need to have symbol systems rely on learning systems to do some of their grounding about what those symbols are about. I think that's the broader argument that you're making, and I think it's just right. The narrower version, I don't think GPT itself is actually the right tool for doing the grounding,

Starting point is 00:46:54 because it doesn't have those interface conditions. It hasn't been built from the ground to land in the right place. So, like, you're trying to have two sides of the bridge or the tunnel, I mean, meet up, and it just wasn't built that way. But I think the idea of building that tunnel, is right of like let's figure out what these systems are good for.

Starting point is 00:47:18 Like there are lots of opportunities in the world to be tracking correlations, but you need to have respect, I think, for where you're trying to wind up. And as a cultural matter, as a sociological matter, the deep learning people for about 45 years have been, or no, actually like 60 years, have been aligning themselves against the symbol-manipulation tradition. Okay, well, you know, this is why we're on the podcast. We're going to change that. Sorry, say again.

Starting point is 00:47:44 That's why we're having this podcast. We're going to change it. You're doing your. Well, I was about to say, it might be changed a little bit. So Geoff Hinton, who's the best known person in deep learning, has been really, really hostile to symbols. It wasn't always the case. And in the late 80s, he wrote a book about bringing them together. And then he at some point went off completely on the deep learning side.

Starting point is 00:48:05 Now he goes around saying deep learning can do everything. And he, like, told the EU, don't spend any money on symbols and stuff like that. But Yann LeCun, one of his disciples, actually said, in a Twitter reply to me yesterday, you know, you can have your symbols if I can have my gradients, which actually sounds like, you know, compromise. Right. So I was kind of excited to see that. That does sound good.

Starting point is 00:48:23 Sometimes people who say they're on opposite sides can really be pretty close to each other. There's one example I want to get on the table because it really made me think. And I think this is the time to do it, which is the identity function. You talk about this in your paper. So let's imagine you have some numbers. They go through a process that spits out an output from the input, and every single time the output is just equal to the input. So you put in one-zero-one-zero, a binary number,

Starting point is 00:48:52 and it puts out the same number. And you make the point that every human being sees the training set. You know, here's five examples and goes, oh, it's just the identity function. I can do that and extrapolates perfectly well to what is meant. But computers don't, or deep learning doesn't. Yeah, deep learning doesn't. I don't think it means that computers can't, but it means that what you need to learn in some cases is essentially an algebraic function or a computer program.
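
A toy version of this experiment can be sketched in Python. The sketch below is an assumption-laden stand-in, not the setup from the paper being discussed: it uses four-bit numbers, one independent logistic unit per output bit, no hidden layer, and plain gradient descent from zero weights, and it trains only on even numbers (final bit 0). The names `to_bits`, `train_identity_units`, and `predict` are made up for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def to_bits(n, width=4):
    """Most-significant-bit-first list of 0/1 values."""
    return [(n >> (width - 1 - i)) & 1 for i in range(width)]


def train_identity_units(train_inputs, width=4, steps=2000, lr=1.0):
    """Train one independent logistic unit per output bit to copy its input bit."""
    weights = [[0.0] * width for _ in range(width)]
    biases = [0.0] * width
    n = len(train_inputs)
    for _ in range(steps):
        for j in range(width):
            grad_w = [0.0] * width
            grad_b = 0.0
            for x in train_inputs:
                z = sum(w * xi for w, xi in zip(weights[j], x)) + biases[j]
                err = sigmoid(z) - x[j]  # the target is the input bit itself
                for i in range(width):
                    grad_w[i] += err * x[i]
                grad_b += err
            for i in range(width):
                weights[j][i] -= lr * grad_w[i] / n
            biases[j] -= lr * grad_b / n
    return weights, biases


def predict(weights, biases, x):
    """Threshold each unit's logit at zero to get an output bit."""
    return [
        1 if sum(w * xi for w, xi in zip(wj, x)) + bj > 0 else 0
        for wj, bj in zip(weights, biases)
    ]


# Train only on even numbers, so the final bit is 0 in every training example.
train = [to_bits(n) for n in range(0, 16, 2)]
weights, biases = train_identity_units(train)

print(predict(weights, biases, to_bits(10)))  # seen in training:  [1, 0, 1, 0]
print(predict(weights, biases, to_bits(11)))  # never seen (1011): [1, 0, 1, 0]
```

In this sketch the final output unit never sees a 1 as its target, and the final input bit is always 0 during training, so its weight never receives a gradient. The model reproduces every training example and still answers 1010 for the unseen input 1011: a lawful rule ("the output ends in zero"), but not the identity rule a person would infer.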

Starting point is 00:49:20 Part of what humans do in the world, I think, is we essentially synthesize little computer programs in our heads. We don't necessarily think of it, but the identity function is a good example. My function is I'm going to say the same thing as you. Or, you know, we can play like Simon says, and then, you know, I'm going to add the words Simon says to the ones that go through and not the ones that don't go through. Very simple function that, you know, five-year-olds learn all the time. And it's done as a function that applies to a whole bunch of different inputs. So you can say, Simon says, touch your finger to your nose, or Simon says put your phone in front of your nose, or Simon says, put your wrist strap on your head or whatever. Your viewers can't see me

Starting point is 00:49:58 doing these ridiculous things, but I'm glad you're laughing. And so you can do this on a, you know, infinite set of things. And that's really what functions are about and what programming is about, doing these things with an infinite range. Identity, this is the same as that. You learn the notion of a pair in cards, and you can do it with the twos and the threes and the fours, and now I make a new deck. I don't have twos and threes and fours.

Starting point is 00:50:21 I have, I don't know, stars and guitars, and you can tell me that a pair of guitars means two guitars. You've taken that new function, put it in a new domain. That's what deep learning does not do well. It does not go over to these new domains. There are some caveats around that,

Starting point is 00:50:34 but in general, that's the weakness of this system. And people have finally realized that. Nowadays people talk about extrapolating beyond the training set. But the paper that you read, I don't know which version, but I first was writing about this in 1998, is really capturing that point. It took a long time for the field to realize that there are actually different kinds of generalization. It also goes back to the past tense stuff. So people said, there's no problem. Our systems generalize.

Starting point is 00:51:00 And I said, no, there's these special cases. And finally now they're saying, oh, there are these special cases when you have to go beyond the data that you've seen before. And really, that's the essence of everything where things are failing right now. So, take driving. You know, these systems extrapolate, or sorry, interpolate very well in known cases. And so they can, you know, change lanes in the environments they've seen. And then you get to Vancouver on this crazy snowy day that nobody predicted and you don't want your driverless car out here. Because you now have to extrapolate beyond the data and you really want to rely on like your cognitive understanding of where the road might be because you can't see the lane markers anymore.

Starting point is 00:51:35 And that's the kind of reasoning they can't do. Your identity function example, it raises an interesting philosophical question about what the right rule is. Because it's not like the deep learning algorithms just made something up. You gave an example where the training set all were a bunch of numbers that all ended in a zero. And the other ones were, you know, random. And so we figured it out. But the deep learning just thought the rule was your output number always ends in a zero. And the thing is that that is a valid rule. You know, it didn't just completely make it up,

Starting point is 00:52:06 but it's clearly not what a human would want the conclusion to be. I've been talking about this for 30 years. I've made that point in my own papers. You're the first person to ever ask me about it. Well, how do we formalize the idea? It's really a deep and interesting point, right? It's not that even when these systems make an error, it's not that they're doing something mathematically, you know, random or something like that.

Starting point is 00:52:33 They're doing something systematic and lawful, but it's not the way that we see the universe. And in certain cases, it's not the sort of functional thing that you want to do. And that's very hard for people to grasp. So for a long time, people used to talk about deep learning and rule systems. It's not part of the conversation now as much as it used to be. But they would say, oh, well, the deep learning system learns the rule that's there. And what you as a physicist understand, or what a philosopher would understand, is that rules are underdetermined by data. You need something, you know, there are multiple rules. The easy example is if I say two, four, six, eight, what comes next?

Starting point is 00:53:09 You know, it could be 10, but it could be something else, and you really want some more background there. So it turns out that deep learning is mostly driven by the output nodes, the nodes at the end that give the answer. And they each learn things independently of one another. And that leads to a particular style of computation that is good for interpolation and not so good at extrapolation. And people make a different bet. And I did these experiments with babies to show that even very young people make this different bet, which is we're looking for tendencies that hold across a class of items. We're looking for the rule.
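
The two, four, six, eight example can be made concrete with a short Python sketch. Both functions below fit the four observed values exactly and disagree on the fifth; `rule_even` and `rule_other` are hypothetical names, and the second rule is just one of infinitely many alternatives that could be written down.

```python
def rule_even(n):
    """Rule A: the n-th even number."""
    return 2 * n


def rule_other(n):
    """Rule B: identical to Rule A for n = 1..4, because the product term
    is zero there, and different afterwards."""
    return 2 * n + (n - 1) * (n - 2) * (n - 3) * (n - 4)


observed = [2, 4, 6, 8]

# Both rules reproduce the data that was actually seen.
assert [rule_even(n) for n in range(1, 5)] == observed
assert [rule_other(n) for n in range(1, 5)] == observed

# They part ways on the first unseen case.
print(rule_even(5), rule_other(5))  # prints: 10 34
```

The data alone cannot choose between the two rules. Picking 10 reflects a prior preference for the simpler rule, which is the kind of background the speaker says is needed.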

Starting point is 00:53:45 That's just how we're built. Sometimes that gets us in trouble. There's a word, apophenia, which is like looking for patterns that aren't even really there. And so sometimes it doesn't serve us well. But it very often does. And language is a great example where it serves us really well. You learn a grammar, and then you can apply it to any words that you can throw in that grammar, even novel words. So if I told you that this thing was called a blicket, then we can start talking about blickets right away.

Starting point is 00:54:11 I can say, how much did that blicket cost you? Would you sell me your blicket? Would you recommend the blicket? Is there an alternative to the blicket? Like, you know, you're off to the races with one training example because you put it in the context of something that is rule governed, where you have a grammar that tells you not only the syntax and the morphology, that the plural of blicket is going to be blickets, and it's going to tell me how I can use it with verb and noun, but also semantics. You can know that I probably mean an individuatable object, a single object that I can go and

Starting point is 00:54:41 count and you know all this kind of stuff, like right away because you have a world model that you map your language onto. That's what it's really about. Well, and this is exactly where I was going to go with this because what is the way that clearly, you know, with all this setup, we need to give our world model to our computer friends, to our artificially intelligent friends. And how do we do that? Is it that we human beings need to formalize our sort of manifest image of the world, our picture of common sense, and then turn it into a bunch of symbols? Or is it, and I'm sure that I think I know what the answer to this

Starting point is 00:55:14 is, could we deep learn our way into common sense? You know, could we just, is there a way of letting computers figure out the same kind of common sense that we have? I take a view that I think is a little like what Kant was trying to say in the Critique of Pure Reason, although I'm never sure I've completely understood that book. But he talks about having basically prior knowledge of space and time. And I'm sorry, I haven't read your book, which would be super relevant at this point. And so I'd like to hear your take on it, but my view is you can learn a lot, but that you need a framework to learn, to embed that knowledge. And so minimally, I think you need to know that there is space, that there is time, that there is causality, that there are enduring objects in the world, and some other stuff, but like stuff like that.

Starting point is 00:56:07 And I believe that there's some reasonable evidence from the animal literature and the human infant literature to think that these things are in humans innate. I think you need to start with that or else you just wind up with GPT. In fact, I think GPT is a brilliant experiment, unintentional but brilliant. on the idea of could you just learn everything like, let's say, from words or from, you know, people haven't really done it from pixels, but I think you'd wind up in the same place. And I think the answer is no. You don't wind up with the API I'm talking about. If you don't have prior notions about enduring objects that you're talking about, then you're just in correlation soup.

Starting point is 00:56:46 And it's, you know, made the best job of correlation soup that I've ever seen, but it's still correlation soup. and it doesn't really connect to those things, which means that it can't know that it's ridiculous to say I'm a computer and a person in one breath or two breaths. It doesn't have the framework to know that things don't tend to change too much over time because it doesn't know what time is. So I don't fully have an answer to the question that you posed a minute ago, but I think it starts by saying we're going to learn some stuff,

Starting point is 00:57:15 but it's going to be relative to a framework where we have some basic knowledge about the world to start with, that there are these enduring objects, etc.

Starting point is 00:58:39 I mean, just to emphasize how tricky all this is, you know, I think it maybe undersells the difficulty if we just think about there's space and there's time and there's objects and they have solidity. Because there's also, you know, number one, there are relationships between these objects. There are functions that they have. You already mentioned causality.

Starting point is 00:59:21 You mentioned the fact that there are values. You don't chop up the sofa to put it away because that's something that is already away in some sense. And so it's going to be quite a trick. And I guess what you're saying is, I think I agree, you're saying that the computers are not just going to learn all that stuff by looking at correlations. But there's still a tremendous program out there in front of us of figuring out what it is we want them to know ahead of time. Yeah. So the most impressive shot on goal to use a kind of cliche I've heard a bunch lately. You're in Canada.

Starting point is 00:59:54 Right. No, but I heard it a lot around the vaccines. And rightly, I think, people said this is going to work. There's so many shots on goal. And they did. We underestimated the human capacity to ignore data, but that's another issue. So Doug Lenat built this thing called Cyc, which you may or may not know about. CYC.

Starting point is 01:00:19 He's done it for the last 35 years. And it was an attempt to put all of common sense knowledge or a large fraction of common sense knowledge in machine interpretable form. And it hasn't been the home run that he thought it would be. And I think people have drawn the wrong lesson from his lack of a huge obvious success. So, like, he didn't build Google with it, right? And, you know, he may have hoped that he would have. But I still think what he was trying to do was right. I think maybe it failed because it started too soon, with a different set of tools than we would use to do the project that he was trying to do.

Starting point is 01:00:56 But I think the project is right, that we're not going to solve this AI problem or general intelligence problem without having a lot of knowledge in formats that the machine can leverage. So, you know, you need to know if you're predicting about grape juice and cranberry juice, that they're both juices and that other things being equal, you can mix juices together and you won't die or whatever. And there's a question about like what level of specificity you want all of that stuff to be in. Do you want to derive everything from like, you know, quantum mechanics? Do you want to have intermediate representations at the level of juice, which is what people do? But you need some kind of knowledge that machines can reason over.

Starting point is 01:01:39 And he built something like 1,100 micro-reasoners that reason over things like economics and I don't know if beverages are in there or not, but lots of little domains and desires. The most impressive thing he has, and I write about this in an article called The Next Decade in AI and give a reference to his article, which might be in Forbes or Fortune, I can't remember which. He goes through this example with Romeo and Juliet, where the system is actually able to reason about something complicated, like what Juliet thinks is going to happen when she drinks this potion that's going to fake her death. That's really sophisticated stuff. And he shows that his system that has this common sense knowledge can make good inferences around that. And nothing in the deep learning tradition can do anything like that.

Starting point is 01:02:25 And it's a conceptual proof, a proof of concept, that if you have the right knowledge, you can actually get machines to do really rich inference. But there's also an asterisk around it, which is like it doesn't just read the Shakespeare and make this inference. Rather, he has converted the Shakespeare into a set of logical propositions and then the system is able to reason over those logical propositions. And so I guess the skeptic would say, well, he's left out the whole problem. I'm a little bit more optimistic. I think he has left out a huge problem but also showed that another part would be solvable if we do one piece of it. But there's a whole interesting set of issues that don't even get talked about that much,

Starting point is 01:03:10 which is like, can we really do this without knowledge sort of like what he was doing? My answer would be, no, we need something like what he is doing, but also that he did it in the 80s and we know a lot about, for example, statistical representation of information that he didn't have the tools to use then. So, like, you want a lot of distributional information. You don't want to just discretize things into logical bins. You also want to know, like, what's typical. He doesn't have a lot of that kind of stuff represented.
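[Editor's illustration] The idea described above, a system that reasons over hand-encoded logical propositions rather than raw text, can be sketched in a few lines. This is only a minimal forward-chaining toy, not how Cyc itself works; the fact tuples, rule encodings, and the `forward_chain` helper are all hypothetical names invented for illustration.

```python
from typing import FrozenSet, List, Set, Tuple

Fact = Tuple[str, ...]                # e.g. ("is_a", "grape_juice", "juice")
Rule = Tuple[FrozenSet[Fact], Fact]   # (premises, conclusion)


def forward_chain(facts: Set[Fact], rules: List[Rule]) -> Set[Fact]:
    """Repeatedly apply rules until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires when all of its premises are already known.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical hand-encoded knowledge (an assumption, not Cyc's format).
FACTS: Set[Fact] = {
    ("is_a", "grape_juice", "juice"),
    ("is_a", "cranberry_juice", "juice"),
    ("drinks", "juliet", "potion"),
    ("causes", "potion", "apparent_death"),
    ("knows", "juliet", "potion_is_temporary"),
}

RULES: List[Rule] = [
    # Listed before the rule it depends on, so a second pass is needed.
    (frozenset({("appears_dead", "juliet"),
                ("knows", "juliet", "potion_is_temporary")}),
     ("expects", "juliet", "to_wake_up")),
    (frozenset({("drinks", "juliet", "potion"),
                ("causes", "potion", "apparent_death")}),
     ("appears_dead", "juliet")),
    (frozenset({("is_a", "grape_juice", "juice"),
                ("is_a", "cranberry_juice", "juice")}),
     ("safe_to_mix", "grape_juice", "cranberry_juice")),
]

if __name__ == "__main__":
    for fact in sorted(forward_chain(FACTS, RULES) - FACTS):
        print(fact)
```

Note what the sketch leaves out, which is exactly the asterisk raised in the conversation: someone still has to turn the play into those tuples by hand, and nothing here captures typicality or distributional information.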

Starting point is 01:03:37 So you'd do it differently if you did it now. But I think what he was trying to do is still of the essence. I'm reminded of several years ago, Chris Anderson, who was the editor of Wired at the time, wrote some little piece saying that theory is dead in science. My least favorite article of Chris Anderson's in Wired of all time. And his logic was, look, if we have enough data, we can just figure out what all the correlations are, who needs a theory? And I wrote one of the responses, and I said, look, Tycho Brahe, the famous astronomer, collected a lot of data. And people like Kepler,

Starting point is 01:04:12 his protege, found some correlations in the data and constructed some very useful rules. And that was good. But it was when Isaac Newton came along and invented a theory to explain why Kepler's rules were there. That was when we really understood something, because then we could talk about, you know, extrapolating beyond the data sets, et cetera. So in some sense, maybe the worry about or the problem with deep learning is that we're too good these days at being Tycho and Kepler because we're able to manipulate these huge data sets, but true understanding won't come until we are able to abstract a simple set of rules, which will be a little bit more robust than the original data sets.

Starting point is 01:04:55 Well, I mean, I think even getting to Kepler would be progress, in that most of the work in the sort of AI scientific discovery stuff builds in the answer in some way or another. And like, you know, what Kepler did that was awesome was to kind of come up with his own answer. He wasn't like, you know, choosing from three templates or something like that. Like there's a really cool paper by Josh Tenenbaum and Charles Kemp where they see data and they infer like, does this follow a ladder or a circle or whatever, these kind of conceptual relationships. But all the choices are built in in the beginning. And it's still a cool paper, but the really cool paper, which nobody knows how to write, I don't know either, would actually induce that these are even the logical forms that you should think about. And that's, you know, maybe that's closer to what Newton did.

Starting point is 01:05:45 But I would give Kepler some cred for that, too. You know, there's a problem. Some people sometimes talk about extracting, like, what the variables are that you even want to talk about. And I think that's often the critical thing, where sometimes there's a billion different choices and you need to know this is the one that I care about. It's actually impressive even that children ever learn what integers are, for example. This is the kind of thing that, you know, if I were still a professor, I retired as a professor young, but if I were still a professor, I would be telling everybody work on this problem.

Starting point is 01:06:20 How is it that, and Susan Carey is actually telling people this, how is it that kids actually figure out what integers are, right? That's an example of a kind of conceptual apparatus that's incredibly valuable. And yet, you know, it's not obvious that it's innate. Like, it's pretty obvious that number is innate. So, you know, so many animals have some conception of approximate number, you know, that 12 is more than seven. Like any animal can, most animals can figure that out. But knowing what a discrete, countable system is where you can have infinity and all,

Starting point is 01:06:56 that's a pretty cool intellectual accomplishment and kids do it. They do the same thing when they learn to read. Some kids don't, but most of them do. A harder example is fractions. You know, the median split on SATs, SAT math, is apparently do you really get what fractions are or not.

Starting point is 01:07:13 Do other primates understand integers? Say again. Do other primates understand integers? You can get them to count. You know, the extent to which they understand integers is not totally agreed on. Okay. I don't know. There's some controversy in the literature.

Starting point is 01:07:29 They can at least do things like remember a sequence of small integers. You know, whether they get to the point of realizing, hey, I could just keep going with this forever if you just teach me the right words for it. I don't know. Well, it goes right into what I wanted to ask next, which is the extent to which being inspired by biology and evolution and actual human reasoning is useful, right? Like evolution is not goal directed. It was not set up to try to build a perfect computer, and the human brain is really good at driving and talking and not so good at playing chess or multiplying big numbers together.

Starting point is 01:08:06 Do you think that we can take how evolution got us to where we are as inspiration for this program of hybrid systems? I think the right word is inspiration, right? So, you know, there's this field of biomimicry. And I think that the moral of that story is there's often useful stuff and then there's stuff you don't want to copy. So like you don't

Starting point is 01:08:28 want to build your theory about how to support objects around the human spine. It's a terrible solution to supporting, you know, this heavy thing on the top of the stalk. And that happens to be there because we were quadrupeds and it was kind of evolutionarily cheap in the sense of being likely

Starting point is 01:08:44 to rotate the quadruped, you know, 90 degrees and then you're vertical and you're a biped and it's great. But really like a tripod would have been a lot better. You don't want to copy everything about our design. In fact, I wrote a book called Kluge, which was about all the things I think are lousy about human cognition.

Starting point is 01:09:02 Starting with, you know, or focusing on things like confirmation bias, where we notice evidence for our own theories. You know, our political system right now is an epic morality tale in confirmation bias and how bad that is. So you don't want your AI system to be subject to confirmation bias where it comes up with theories, notices evidence for those theories. It pats itself on the back and ignores the counter theories. Like the last thing in the world you would want an AI system to do.

Starting point is 01:09:28 So we don't want to copy biology, but we do want to learn from it. So we want, you know, there are things that people still do way better than machines, even though there are things we do really poorly. So we wouldn't want to copy the memory systems of people because they're not that great. But on the other hand, they're cue driven in a way that's kind of cool. And maybe we can kind of do that in AI now. but the way that people can understand semantics in relation to a syntax,

Starting point is 01:09:57 that's really interesting. We don't know how to do that with machines. Maybe we'll figure out a better way to do it than people do, but right now the only game in town is people. So let's see if we can learn from them. Can you explain that issue while defining the words semantics and syntax as you're using them? I mean, that issue is how do you relate the meanings of words

Starting point is 01:10:18 to the ways in which you assemble them and derive the meaning of a sentence in terms of its parts. And it turns out that GPT can actually replicate the assembly of the parts into a grammatical sentence, but it can't relate that

Starting point is 01:10:34 to a situation in the world that is being described by its sentence. And it certainly can't go back. It can't actually go in either direction. You can't give it a situation in entities and expect a sentence that will validly describe them, nor go the other way and get the sentence and figure out.

Starting point is 01:10:53 Whereas you and I, that's what we're doing, sometimes imperfectly, but we're trying to grasp each other's meaning. So you're building a model. What is Gary actually saying there, right? And, you know, we have a limited bandwidth and whatever, but, you know, we get there. And the machines don't really have that capability right now in a general way. I guess I'm just wondering how much you went back to Kant a little while ago,

Starting point is 01:11:16 but how much innate knowledge in the human brain is crucial to this kind of reasoning that we do in extrapolating. And is that something that would help us figure out what to build in to a good hybrid AI system? I mean, first thing I'll say is it's controversial. Nobody knows. I spent, you know, the first two-thirds of my career as a developmental psychologist slash cognitive psychologist, thinking very deeply about this. I wrote a book called The Birth of the Mind, which was about how you might get innate structure given the tools of developmental neuroscience or molecular biology and what we know about developmental neuroscience and so forth.

Starting point is 01:11:52 And, you know, so I thought about these things a lot. And the honest answer is we don't know exactly what's innate. The best work, I think, is done by Elizabeth Spelke, a developmental psychologist at Harvard. But, you know, there's a lot of work out there. My best guess is that we have at least about a dozen things that are innate and it could be a lot more. So a dozen things include things like the ability to represent these abstractions that we were talking about, the ability to distinguish between types and tokens. So, like, know that this water bottle as opposed to water bottles in general.

Starting point is 01:12:27 Space, time, causality, all those kinds of things are like form a bare minimum. And I've written about this occasionally. And then you could have a lot more. I often point to the last chapter of Pinker's first popular book, The Language Instinct, where he runs off a list of like 15 things that includes like, I think I'm quoting verbatim, mental rolodex. And maybe that is innate. You know, some things you might be able to derive if you had others.

Starting point is 01:12:54 Like if you had a cost-benefit system, which I think is innate, and you had abstract variables, and you had a few other things, maybe you could acquire some of the others. And then there's a tension in the developmental psychology literature, or actually I won't just call it a tension, a mistake, a foundational mistake, which is that people think that if something could be learned, it is not innate. But that's wrong, right? So there may be many things that could be learned, but maybe biology has chosen, so to speak, and you know all the ways in which I'm being anthropomorphic, but has, you know, alighted upon solutions that make those innate because it's a whole lot safer or faster or whatever. So you think about a baby Ibex scrambling down a

Starting point is 01:13:35 mountain. It is not working out online the physics of objects and slopes and stuff like that. That is there in that baby. Or a honeybee can calculate the solar azimuth function and extend it to lighting that it's never seen before if you do the right experiment. So there's clearly innate stuff there about physics and observation and so forth. So there may be a lot more than just the 10 or so things that I'm talking about. But even those 10 have made me kind of like public enemy number one in the machine learning world, where they want to learn everything from scratch. They're like, why is Gary always on about this innateness stuff? You know, how do you know?

Starting point is 01:14:14 I think that one of the biggest problems with the field of AI is that right now it is dominated by a group of people who do machine learning. And there's the old saying that to a man who has a hammer, everything is a nail. And so the people in machine learning have made astonishing progress in some ways in the last decade. And so they think that the tool that they have is the tool. Whereas I think the right thing is to say, congratulations, that is an awesome tool. We thank you for it.

Starting point is 01:14:42 Now let's see how we could use it in combination with other tools to do even more awesome things. But it's been a bitter battle getting people to even think about that. Well, one of the possible ways of thinking about this is, I kind of don't want to think of AlphaGo, or whatever, as intelligence almost at all. Like, it's very good at playing Go. But it doesn't remind me of a human being in very many other ways.

Starting point is 01:15:08 Well, it's like an idiot savant. I mean, I would say that intelligence is multi-dimensional. And some of the things that I think are fairly counted as intelligence are like doing that kind of computation. It's fine to call it that, as long as you realize that it is multidimensional. And there are other dimensions where, you know, it's not even showing up to bat. So, you know, one definition of intelligence is like adaptively solving unknown problems, and it doesn't have that at all.

Starting point is 01:15:37 Yeah. Well, and is the general goal of trying to make AI equal the capacities of human intelligence the right goal or should we just be saying? Oh, we shouldn't do equal. We should go for exceed. Danny Kahneman and I had this conversation. We sort of came up with this phrase together in a way. It was a panel or whatever, which is humans are a low bar.

Starting point is 01:15:57 I think that was his conversion. And I said, and yet we still can't exceed it yet. Right. We, you know, we surely should want our machines not to be human level intelligence, but to be way smarter than us. So, you know, we want, if we're going to trust AI as much as we seem to want to, it's got to be good, right? You know, if we're going to put it in charge of stuff, whatever that stuff is, you know, it better be able to not be subject to confirmation bias. It better not just perpetuate racist stereotypes from the past, but actually be able to put values so that it's not just interpolating, but extrapolating to the world that we want to have, right? And that means it's going to be better than most people or better than all people. That should be what we're aspiring to. What we're settling for now is we've got these cool tools and they can do some stuff. And sometimes they actually, you know, tell people to commit suicide or say racist things or whatever.

Starting point is 01:16:58 And we're like, but, you know, but I get really good recommendations from Amazon. So it's okay. And like, that's where we are now. I'm not super thrilled with that. But I guess what I'm getting at is, I mean, as you said, I completely agree that there's very many different kinds of intelligence. And computers are going to be, it's going to be easier to make computers good at some kinds of intelligence than it is at other kinds of human intelligence. And I'm sure both are important, but, you know, how do we balance just putting computers to work at the things they're good at versus trying to nudge them to become good at other things that we human beings know and love? I think it starts with what you just said. I often make your point a slightly

Starting point is 01:17:43 different way, which is that people talk about artificial intelligence as if it was one thing, but it's actually many things. It's actually a whole family of algorithms and also databases and so forth that have different properties. They're good at some things. They're not good at other things. That's going to change over time. So the AI of 2022 is different from the AI of 2019. And I sure as hell hope that the AI of 2025 is better than what we've got right now,

Starting point is 01:18:14 because it's problematic now. And you can't just talk about it like it's a magic wand. It's not. It's a set of tools that are more or less appropriate to certain problems. And so it's totally fine to use current AI for photo tagging. 2025 AI will be even better at it, but the cost of getting a mislabeled photograph is generally not that high unless then again you're using it for surveillance, in which case maybe it's really high. So, you know, I just saw another one of these examples of somebody who like went to jail because an AI system misread something.

Starting point is 01:18:48 I think in the book we gave an example of something in China that gave somebody a speeding ticket because their face, they were an actress, their face was on a bus that went fast. And so the wrong person got convicted of the crime. The tools are really appropriate and then not appropriate depending on how they get used. So you can tolerate the error in photo tagging if you're not using it to identify criminals.

Starting point is 01:19:16 If you're identifying criminals, then you probably need at least 2025 AI or whatever, you know, 2030 AI because the stakes are so high. Same thing with suicide prevention. You can write a little chatbot that'll make people feel better some of the time. But when the stakes are high, I don't think the tools we have right now are up to it. Driving is another example. It's easy to build a car that can follow a lane.

Starting point is 01:19:41 You can have like 70 hours of training data, and Nvidia showed this, and you can follow a lane. And that's great. But it doesn't mean that you know what to do on a snowy day. And so, you know, we have to be very careful about the laws around driverless cars. And, you know, like right now I think Elon Musk is beta testing on public roads. I don't think that's cool. There have been some accidents. And so understanding that AI is actually a heterogeneous thing rather than a single magic wand

Starting point is 01:20:07 is important. Now, that makes it hard, right? Because people want a policy that's sort of about AI writ large. And that doesn't match the reality of we are incrementally developing science and engineering to make things better. And we understand some of it and not others. And without asking you to make any predictions about time scales or anything like that, do you see any obstacles to AI being just as good as human beings are at,

Starting point is 01:20:35 you know, writing poems or symphonies and so forth? In principle, no. I mean, they're not going to have the emotions, you know, that might drive some of that stuff. It's actually not that hard to write, you know, knockoffs of Bach without the, already, without the emotional resonance. So, you know, one piece of your question is specifically about creativity, and then there's a larger question. So many particular things that we would define as creative, we can already build machines to do without that connection to the underlying emotional impulse that might lead to something.

Starting point is 01:21:14 And so, like, vocals are hard because vocals are really about emotion. You know, synthesizing a drumbeat, you know, Logic 12 or whatever is the latest edition can do that pretty well, right? Yeah. You know, there's a humanize function to add a little random variation and make it sound like a person. You know, there are certain things that we can do, you know, very, very well.

Starting point is 01:21:35 And in some cases, have been able to actually for 30 years. And like people are reinventing them with deep learning, but people already knew how to do some of those things. The larger question, well, sorry, one more thing. There are some artistic endeavors that I think are way beyond current computers, though, like a movie, right, where you have to have coherence over a long period of time. So, you know, GPT can actually make advertising jingles that are like two liners pretty well. But it can't keep the coherence that you would need for, you know, for an ordinary film. You could make something, you know, from the late 60s,

Starting point is 01:22:13 with pharmaceuticals involved, it seemed interesting. But long form is not the strong point of what we have right now, and it won't be for a while. But I don't think anything in the realm of cognition is impossible, right? You know, we are just meat computers, and, you know, we don't quite understand how those meat computers work. But they're information processors that, you know, take in information and manipulate it and come up with outputs. That's what our brains do, and computers, you know, get better at that. And I don't see any, like, principled argument that says 500 years from now,

Starting point is 01:22:51 people will still be smarter than machines. I just don't see it. 500 years is much safer than 20 years. I think you've chosen wisely about your time horizon there. But I guess, you know, there are people who are worried about existential risks from AI, taking over and have different values than us. Do you share those worries? I mean, clearly giving AIs value

Starting point is 01:23:13 systems at all, recognizable to us, is a tricky situation. So I wrote this piece in 2012 called Moral Machines for the New Yorker. I was one of the first people to talk about trolley problems in AI, where in my particular case, in the New Yorker article, it was a school bus is out of control, you're on a highway, you know, should you sacrifice yourself? A lot of people picked up on this later, Obama talked about it. I now have some regret around that. It's all your fault.

Starting point is 01:23:46 We've narrowed it down. Okay. I mean, there were a couple of us who wrote about it around the same time, but I was one of the first. But the thing is that the real challenge right now is much lower to the ground than that. And it's not often that you actually come up with the school bus. But, you know, Asimov's basic laws, you know, don't do harm. Just think about that one. The model that we have now is around images.

Starting point is 01:24:12 I show you a bunch of images. You learn from those images to recognize another. It doesn't work. That model doesn't work for harm. I can't show you a bunch of pictures of harm and really get you to grasp the concept of what harm is. We just don't know how to program what harm is. We don't know how to program really any human values

Starting point is 01:24:32 into current technology. And it's actually related to stuff we've been talking about throughout the whole conversation, which is it's kind of an interface thing. We don't know how to specify these things in terms that learning systems would understand. And we can't really do it entirely with an innate set of rules either. There has to be some learning. We have to give some examples. There was a film called Chappie a few years ago where a robot learned its values.

Starting point is 01:24:59 And, you know, one of the lines in the film is the robot is trying to figure this stuff out, all out and the robot's been captured by a bad guy. And the robot's master has said, you can't kill people. And the bad guy is a bit disappointed to discover that the robot knows this. Because the bad guy would actually like the robot to kill people, but the bad guy is clever. And he works around and he says, yeah, you can't kill people, but it's okay to harm them.

Starting point is 01:25:31 And the robot is sort of, you know, left to construct its own ethical values based on the input that it's getting. And the reality is that we will get to a point, where we maybe already have gotten to a point, where it would be really nice to have AI systems with values. And we have not gotten to the point where we know how to program that in. So, you know, one of the problems with GPT-3 is all the toxic language that it produces because it's like trained on like the worst of Reddit and stuff like that. And we just don't have that. Already it would be nice to constrain these systems such

Starting point is 01:26:05 that they would follow some set of values, and you can argue about what that set of values would be, but we don't know how to do it. DeepMind just had like 10 people, 20 people working on this problem and came up dry. Nobody knows how to actually constrain these systems to values. They don't have the APIs to plug into, and it's a problem. So when you come to the existential stuff,

Starting point is 01:26:29 I'm not worried about it now. I'm not worried about in the short-term robots taking over the world, they don't care about us, they have no motivation to do so. And they're frankly dumb right now. The ability to win at Go doesn't count for anything. I mean, Go is actually a great example because Go is about territory, right? And they don't actually understand anything about territory in the real world. Getting better at Go has not made them any more desirous of human territory nor taught them anything about, you know, that would be useful in an actual battle. I'm not too worried about those kinds of things in the near term. The near term, I'm worried about misapplication of the AI that we have

Starting point is 01:27:10 now. But in a hundred-year time frame, it could be an issue. I think it's fine that we have a few people around thinking about these issues now. Maybe they never come to pass, but it's good to have some thought into it. It's not an urgent need. You know, I'd like to end on an optimistic note. So maybe, you know, what if everyone listened to you? What if you were not the bad boy of AI? And everyone said, you know what? More symbols, more variables, more hybrid approaches to join up to our deep learning. How would you see AI going in the next few years?

Starting point is 01:27:47 I wrote a piece for the Times where I argued for a CERN for AI. If people really listen to me, that's what they would do. And I would get people to gather around a particular problem, in part because otherwise if you have a large sum of money, then people just do their own thing and don't actually coordinate. And the problem that I would coordinate them around is having an AI that could read and understand the medical literature.

Starting point is 01:28:13 I think there would be enormous value in that. You could also maybe think about doing the same thing around climate change, read and understand materials science and so forth. Either of those would be fine in my view. But have people coordinate around machine reading. I'm not talking about like keyword matching, which we can do very well right now. But having a system read the scientific literature, come up with experiments based on, you know,

Starting point is 01:28:39 what it reads, come up with novel solutions and so forth. Like I think that could change the world. And it would certainly push AI forward. So, you know, if I were king for the day, that's what we would do. All right. Now people know it. They can spread the word. Gary Marcus, thanks so much for being on the Mindscape podcast.

Starting point is 01:28:54 Thanks. This is really fun.
