Lex Fridman Podcast - Gary Marcus: Toward a Hybrid of Deep Learning and Symbolic AI
Episode Date: October 3, 2019
Gary Marcus is a professor emeritus at NYU and founder of Robust.AI and Geometric Intelligence; the latter is a machine learning company acquired by Uber in 2016. He is the author of several books on natural and artificial intelligence, including his new book Rebooting AI: Building Machines We Can Trust. Gary has been a critical voice highlighting the limits of deep learning and discussing the challenges before the AI community that must be solved in order to achieve artificial general intelligence. This conversation is part of the Artificial Intelligence podcast. If you would like to get more information about this podcast go to https://lexfridman.com/ai or connect with @lexfridman on Twitter, LinkedIn, Facebook, Medium, or YouTube where you can watch the video versions of these conversations. If you enjoy the podcast, please rate it 5 stars on iTunes or support it on Patreon.
Here's the outline with timestamps for this episode (on some players you can click on the timestamp to jump to that point in the episode):
00:00 - Introduction
01:37 - Singularity
05:48 - Physical and psychological knowledge
10:52 - Chess
14:32 - Language vs physical world
17:37 - What does AI look like 100 years from now
21:28 - Flaws of the human mind
25:27 - General intelligence
28:25 - Limits of deep learning
44:41 - Expert systems and symbol manipulation
48:37 - Knowledge representation
52:52 - Increasing compute power
56:27 - How human children learn
57:23 - Innate knowledge and learned knowledge
1:06:43 - Good test of intelligence
1:12:32 - Deep learning and symbol manipulation
1:23:35 - Guitar
Transcript
The following is a conversation with Gary Marcus.
He's a professor emeritus at NYU, founder of Robust.AI
and Geometric Intelligence.
The latter is a machine learning company
that was acquired by Uber in 2016.
He's the author of several books
on natural and artificial intelligence,
including his new book, Rebooting AI:
Building Machines We Can Trust.
Gary has been a critical voice highlighting
the limits of deep learning and AI in general and discussing the challenges before our AI
community that must be solved in order to achieve artificial general intelligence. As I'm
having these conversations, I try to find paths towards insight, towards new ideas. I try
to have no ego in the process; it gets in the way.
I'll often continuously try on several hats, several roles.
One for example is the role of a three-year-old who understands very little about anything,
and asks big what and why questions.
The other might be a role of a devil's advocate, who presents counter ideas with a goal
of arriving at greater understanding through debate.
Hopefully both are useful, interesting, and even entertaining at times.
I ask for your patience as I learn to have better conversations.
This is the Artificial Intelligence Podcast.
If you enjoy it, subscribe on YouTube, give it 5 stars on iTunes, support it on Patreon,
or simply connect with me on Twitter
at @lexfridman, spelled F-R-I-D-M-A-N.
And now here's my conversation with Gary Marcus.
Do you think human civilization will one day have to face an AI-driven technological
singularity that will, in a societal way, modify our place in the food chain of intelligent
living beings on this planet?
I think our place in the food chain has already changed.
So there are lots of things people used to do by hand that they do with machines.
If you think of a singularity as like one single moment, which is I guess what it suggests,
I don't know if it'll be like that, but I think that there's a lot of gradual change, and
AI is getting better and better.
I mean, I'm here to tell you why I think it's not nearly as good as people think, but
the overall trend is clear.
Maybe Ray Kurzweil thinks it's an exponential, and I think it's linear. In some cases, it's
close to zero right now, but it's all going to happen.
We are going to get to human-level intelligence or, whatever you will,
artificial general intelligence at some point.
And that's certainly going to change our place in the food chain
because a lot of the tedious things that we do now,
we're going to have machines do it.
A lot of the dangerous things that we do now,
we're going to have machines do.
And I think our whole lives are going to change
from people finding their meaning through their work
to people finding their meaning
through creative expression.
So the singularity will be a very gradual,
in fact, removing the meaning of the word singularity,
it'll be a very gradual transformation in your view.
I think that it will be somewhere in between,
and I guess it depends what you mean by gradual and so on.
I don't think it's gonna be one day.
I think it's important to realize that intelligence is a multi-dimensional variable.
So people sort of write this stuff as if IQ was one number and the day that you hit 262
or whatever you displace the human beings.
And really, there's lots of facets to intelligence.
So there's verbal intelligence and there's motor intelligence and there's mathematical
intelligence and so forth.
In their mathematical intelligence, machines far exceed most people already, and in their
ability to play games,
they far exceed most people already.
In their ability to understand language, they lag behind my five-year-old, far behind
my five-year-old.
So there are some facets of intelligence that machines have grasped, and some that they haven't. And, you know, we have a lot of work left to do to get them to, say,
understand natural language, or to understand how to flexibly approach some, you know, kind of novel,
MacGyver problem-solving kind of situation. And I don't know that all of these things will come
at once. I think there are certain vital prerequisites that we're missing now.
So for example, machines don't really have common sense now.
So they don't understand that bottles contain water and that people drink water to quench
their thirst and that they don't want to dehydrate.
They don't know these basic facts about human beings.
And I think that that's a rate limiting step for many things.
It's a rate limiting step for reading, for example, because stories depend on things like,
oh my God, that person's running out of water.
That's why they did this thing.
Or, if they only had water, they could put out the fire.
So you watch a movie and your knowledge about
how things work matters.
And so a computer can't understand that movie
if it doesn't have that background knowledge.
Same thing if you read a book.
And so there are lots of places where if we had a good
machine interpretable set of common sense, many things would accelerate relatively quickly,
but I don't think even that is like a single point. There's many different aspects of knowledge.
And we might, for example, find that we make a lot of progress on physical reasoning,
getting machines to understand, for example,
how keys fit into locks or that kind of stuff or how this gadget here works and so forth.
And so machines might do that long before they do really good psychological reasoning,
because it's easier to get kind of labeled data or to do direct experimentation on a microphone stand than it is to do direct
experimentation on human beings to understand the levers that guide them.
That's a really interesting point, actually:
whether it's easier to gain common sense knowledge or psychological knowledge.
I would say the common sense knowledge includes both physical knowledge and psychological
knowledge.
And the argument I was making... Physical versus psychological?
Yeah, physical versus psychological.
The argument I was making is physical knowledge
might be more accessible because you could have a robot,
for example, lift a bottle, try putting a bottle cap on it,
see that it falls off if it does this,
and see that it could turn it upside down.
And so the robot could do some experimentation.
We do some of our psychological reasoning
by looking at our own minds.
So I can sort of guess how you might react to something based on how I think I would react
to it.
Robots don't have that intuition, and they also can't do experiments on people in the same
way or we'll probably shut them down.
So, if we wanted to have robots figure out how I respond to pain by pinching me in different
ways, that's probably, it's not going to make it past the human subjects board and companies
are going to get sued or whatever.
There's certain kinds of practical experience that are limited or off limits to robots.
That's the interesting point.
What is more difficult to gain a grounding in? Because to play devil's advocate, I would say
that human behavior is more easily expressed in data, in digital form. And so when
you look at Facebook's algorithms, they get to observe human behavior. So you get
to study and manipulate even human behavior in a way that you perhaps cannot study and manipulate the physical world. So it's true what you said:
pain is, like, physical pain, but that's again the physical world. Emotional pain might be much easier to experiment with,
perhaps unethically, but nevertheless, some would argue it's already going on.
I think that you're right, for example, that Facebook does a lot of experimentation in psychological reasoning. In fact, Zuckerberg talked about AI at a talk that he gave at NIPS. I wasn't
there, but the conference has been renamed NeurIPS; it was still called NIPS when he gave the talk.
And he talked about Facebook, basically, having a gigantic theory of mind. So I think it is certainly possible.
I mean, Facebook does some of that.
I think they have a really good idea of how to addict people to things.
They understand what draws people back to things.
And I think they exploit it in ways that I'm not very comfortable with.
But even so, I think that there are only some slices of human experience that they can
access through the kind of interface they have.
And of course, they're doing all kinds of VR stuff and maybe that'll change and they'll
expand their data.
And, you know, I'm sure that that's part of their goal.
So, yeah, it is an interesting question.
I think love, fear, insecurity, all of the things that I would say are some of the deepest
things about human nature and the human mind, could be explored
in digital form.
You're actually the first person just now that brought that up.
I wonder what is more difficult, because I think folks, and we'll talk
a lot about deep learning, but the people who are thinking beyond deep learning are thinking
about the physical world.
They're starting to think about robotics, in-home robotics. How do we make robots manipulate objects, which requires an understanding of the
physical world and it requires common sense reasoning. And that has felt like the next step for
common sense reasoning, but you've now brought up the idea that there's also the emotional part.
And it's interesting whether that's hard or easy.
I think some parts of it are and some aren't. So my company that I recently founded with Rod Brooks, you know, from MIT
for many years and Roomba and so forth, we're interested in both. We're interested in physical
reasoning and psychological reasoning among many other things. And, you know, there are
pieces of each of these that are accessible. If you want a robot to figure out whether it can fit under a table, that's a relatively
accessible piece of physical reasoning.
If you know the height of the table and you know the height of the robot, it's not that
hard.
If you wanted to do physical reasoning about Jenga, it gets a little bit more complicated,
and you have to have higher resolution data in order to do it.
With psychological reasoning, it's not that hard to know, for example,
that people have goals and they like to act on those goals, but it's really hard to know exactly
what those goals are. But ideas of frustration, I mean, you could argue it's extremely difficult
to understand the sources of human frustration as they're playing Jenga with you or not.
Yeah, I mean, it's very accessible. There's some things that are going to be obvious
and some not, so like, I don't think anybody
really can do this well yet, but I think
it's not inconceivable to imagine machines
in the not so distant future, being able
to understand that if people lose in a game
that they don't like that, right?
That's not such a hard thing to program
and it's pretty consistent across people.
Most people don't enjoy losing,
and so that makes it relatively easy to code.
On the other hand, if you wanted to capture everything
about frustration, well, people get frustrated
for a lot of different reasons.
They might get sexually frustrated,
they might get frustrated
that they can't get their promotion at work,
all kinds of different things.
And the more you expand the scope,
the harder it is for anything
like the existing techniques to really do that.
So I'm talking to Garry Kasparov next week,
and he seemed pretty frustrated with his game
against Deep Blue.
So, yeah, well, I'm frustrated with my game against him
last year, because I played him.
I had two excuses, I'll give you my excuses upfront.
They won't mitigate the outcome.
I was jet-lagged and I hadn't played in
25 or 30 years, but the outcome is he completely destroyed me and it wasn't even close.
Have you ever been beaten in any board game by a machine?
I have. I actually played the predecessor to Deep Blue,
Deep Thought, I believe it was called, and that too crushed me.
And after that, you realized it's over for us.
There was no point in my playing Deep Blue. I mean, it's a waste of
Deep Blue's computation. I played Kasparov because we both gave lectures at the same event, and he was
playing 30 people. I forgot to mention that not only did he crush me, but he crushed 29 other people at
the same time.
I mean, but the actual philosophical and emotional experience of being beaten by a machine,
I imagine, is, I mean, to you who thinks about these things may be a profound experience.
Or no, it was a simple, I mean, I think.
Mathematical experience.
Yeah, I think a game like chess, where, you know, you have perfect information,
it's two-player, closed-end, and there's more computation for the computer, it's no surprise the machine wins.
I mean, I'm not sad when a computer calculates a cube root faster than me. Like, I know I can't win that game. I'm not gonna try.
Well, with a system like AlphaGo or AlphaZero,
do you see a little bit more magic in a system like that,
even though it's simply playing a board game,
but because there's a strong learning component?
You know, I should mention that
in the context of this conversation,
because Kasparov and I are working on an article
that's gonna be called AI Is Not Magic.
And neither one of us thinks that it's magic, and part of the point of this article is that
AI is actually a grab bag of different techniques, and some of them have, or they each have, their
own unique strengths and weaknesses.
So, you read media accounts and it's like, oh, AI, it must be magical or it can solve
any problem.
Well, no, some problems are really accessible
like chess and Go. And other problems like reading are completely outside the current technology.
It's not like you can take the technology that drives AlphaGo and apply it to reading and get
anywhere. DeepMind has tried that a bit. They have all kinds of resources. They built AlphaGo.
I wrote a piece recently that they
lost and you can argue about the word loss. But they spent $530 million more than they
made last year. So they're making huge investments. They have a large budget and they have applied
the same kinds of techniques to reading or to language. And it's just much less productive there
because it's a fundamentally different kind of problem. Chess and Go and so forth are closed-end problems.
The rules haven't changed in 2,500 years.
There's only so many moves you can make.
You can talk about the exponential as you look at the combinations of moves, but fundamentally
the Go board has 361 squares.
That's the only, you know, those intersections are the only places that you can place your
stone.
Whereas when you're reading, the next sentence could be anything.
You know, it's completely up to the writer what they're going to do next.
That's fascinating.
You think this way.
You're clearly a brilliant mind who points out the emperor has no clothes, but so I'll
play the role of a person who says, you know, put clothes on the emperor, good luck with
it.
Who romanticizes the notion of the emperor, period, suggesting
that clothes don't even matter. Okay, so that's really interesting that you're talking
about language. So there's the physical world of being able to move about the world, making
an omelet and coffee and so on. There's language, where you first understand what's being
written and then, maybe even more complicated than that,
having a natural dialogue. And then there is the game of Go and chess. I would argue that
language is much closer to Go than it is to the physical world. Like it is still very constrained.
When you say the possibility of the number of sentences that could come, it is huge, but it
nevertheless is much more
constrained, it feels maybe I'm wrong, than the possibilities that the physical world
brings us.
There's something to what you say, and some ways in which I disagree.
So one interesting thing about language is that it abstracts away.
This bottle, I don't know if it would be in the field of view, is on this table.
And I use the word on here, and I can use the word on here, maybe not here.
But that one word encompasses, you know, in analog space, a sort of infinite number of
possibilities.
So there is a way in which language filters down the variation of the world.
And there's other ways.
So, you know, we have a grammar.
And more or less, you have to follow the rules of that grammar. You can break them a little bit, but by and large, we follow the rules of grammar. And so that's a constraint on language.
So there are ways in which language is a constrained system. On the other hand, there are many arguments that say there's an infinite number of possible sentences, and you can establish that by just, you know, stacking them up. So I think there's water on the table.
You think that I think there's water on the table.
Your mother thinks that you think that I think the water is on the table.
Your brother thinks that maybe your mom is wrong to think that you think that I think, right?
We can make sentences of infinite length.
We can stack up adjectives.
This is a very silly example, a very, very silly example, a very, very, very, very silly example,
and so forth.
So, there are good arguments that there's an infinite range of sentences.
In any case, it's vast by any reasonable measure.
And for example, almost anything in the physical world we can talk about in the language world.
And interestingly, many of the sentences that we understand, we can only understand if
we have a very rich model of the physical world.
So I don't ultimately want to adjudicate the debate that I think you just set up, but I find
it interesting.
You know, maybe the physical world is even more complicated than language.
I think that's fair, but you think that language is really, really complex.
It's hard.
It's really, really hard.
Well, it's really, really hard for machines, for linguists, people trying to understand
it.
It's not that hard for children, and that's part of what's driven my whole career.
I was a student of Steven Pinker's, and we were trying to figure out why kids could
learn language when machines couldn't.
I think we're going to get into language.
We're going to get into communication, intelligence, and, you know, neural networks, and so on.
But let me return to the high level, the futuristic, for a brief
moment. So you've written in your book, your new book, that it would be arrogant to suppose that
we could forecast where AI will be, or the impact it will have, in a thousand years
or even 500 years. So let me ask you to be arrogant. What do
AI systems with or without physical bodies look like a hundred years from now?
If you would just, you can't predict, but if you were to
philosophize and imagine, do.
Can I first justify the arrogance before you try to
push me beyond it? Sure.
I mean, there are examples like, you know, people figured out how electricity
worked. They had no idea that that was going to lead to cell phones, right? I mean, things
can move awfully fast once new technologies are perfected. Even when they made transistors,
they weren't really thinking that cell phones would lead to social networking.
There are nevertheless predictions of the future, which are statistically unlikely to come
to be, but nevertheless are the best... You're asking me to be wrong. I'm asking you to be...
Which way would I like to be wrong? Pick the least unlikely to be wrong thing, even though it's
still very likely to be wrong. I mean, here's some things that we can safely predict, I suppose.
We can predict that AI will be faster than it is now. It will be cheaper than it is now. It will be
better in the sense of being more general and applicable in more places. It will be pervasive.
I mean, these are easy predictions. I'm sort of modeling them in my head on Jeff Bezos'
famous predictions. He says, I can't predict the future, not in every way, I'm paraphrasing,
but I can predict that people will never want to pay more
money for their stuff.
They're never going to want it to take longer to get there.
So you can't predict everything,
but you can predict something.
Sure, of course, it's going to be faster and better.
What we can't really predict is the full scope
of where AI will be in a certain period. I think it's
safe to say that although I'm very skeptical about current AI, it's possible to do much
better. There's no in principle argument that says AI is an insolvable problem that
there's magic inside our brains that will never be captured. I mean, I've heard people
make those kind of arguments. I don't think they're very good.
So, AI is gonna come,
and probably 500 years is plenty to get there. And then once it's here,
it really will change everything.
So, when you say AI is gonna come,
are you talking about human level intelligence?
So, maybe I like the term general intelligence.
So, I don't think that the ultimate AI, if
there is such a thing, is going to look just like humans. I think it's going to do some
things that humans do better than current machines, like reason flexibly and understand
language and so forth. But that doesn't mean they have to be identical to humans. So,
for example, humans have terrible memory and they suffer from what some people call motivated reasoning.
So they like arguments that seem to support them
and they dismiss arguments that they don't like.
There's no reason that a machine should ever do that.
So you see that those limitations of memory
as a bug, not a feature?
Absolutely.
I'll say two things about that.
One is I was on a panel with Danny Kahneman, the Nobel Prize winner, last night and we were talking about this stuff. I think what
we converged on is that humans are a low bar to exceed. They may be outside of our
skill right now as AI programmers, but eventually AI will exceed it. So we're
not talking about human-level AI. We're talking about general intelligence that can do all kinds of different things and do it without
some of the flaws that human beings have. The other thing I'll say is I wrote a
whole book actually about the flaws of humans. It's actually a nice bookend
and counterpoint to the current book. So I wrote a book called Kluge, which
was about the limits of the human mind. The current book is kind of about those few
things that humans do a lot better than machines.
Do you think it's possible that the flaws of the human mind, the limits of memory, our
mortality, our biases, are a strength, not a weakness, that that is the thing that enables, from which
motivation springs and meaning springs?
I've heard a lot of arguments like this. I've never found them that convincing.
I think that there's a lot of making lemonade out of lemons.
So we, for example, do a lot of free association where one idea just leads to the next and they're not really that well connected.
And we enjoy that and we make poetry out of it and we make kind of movies with free associations and it's fun and whatever.
I don't think that's really a virtue of the system.
I think that the limitations in human reasoning actually get us in a lot of trouble.
For example, politically we can't see eye to eye because we have the motivated reasoning
I was talking about and something related called confirmation bias.
We have all of these problems that actually make for a rougher society because
we can't get along because we can't interpret the data in shared ways. And then we do some
nice stuff with that. So my free associations are different from yours and you're kind of
amused by them and that's great. And hence poetry. So there are lots of ways in which we take
a lousy situation and make it good. Another example would be our memories are terrible.
So we play games like Concentration, where you flip over two cards and try to find a pair.
You imagine a computer playing that; the computer's like, this is the dullest game in the world.
I know where all the cards are. I see it once, I know where it is.
What are you even talking about? So we make a fun game out of having this terrible memory.
So we are imperfect in discovering and optimizing
some kind of utility function, but you think in general there is a utility function. There's
an objective function that's better than others.
I didn't say that.
But see, the presumption, when you say... I think you could design a better memory system.
You could argue about utility functions and how you want to think about that.
But objectively, it would be really nice to do some of the following things, to get rid
of memories that are no longer useful.
Like objectively, that would just be good.
And we're not that good at it.
So when you park in the same lot every day, you confuse where you parked today with where
you parked yesterday,
with where you parked the day before and so forth. So you blur together a series of memories. There's just no way that that's optimal.
I mean, I've heard all kinds of wacky arguments from people trying to defend that. But at the end of the day, I don't think any of them hold water.
Or just memories of traumatic events; it would possibly be a very nice feature to be able to get rid of those.
It'd be great if you could just be like,
I'm gonna wipe this sector.
You know, I'm done with that.
I didn't have fun last night.
I don't wanna think about it anymore.
Whoop, bye bye.
I'm gone, but we can't.
Do you think it's possible to build a system?
So you said human level intelligence
is a weird concept, but.
Well, I'm saying I prefer general intelligence.
I mean, human level intelligence is a real thing,
and you could try to make a machine that matches people
or something like that.
I'm saying that per se shouldn't be the objective.
But rather, that we should learn from humans
the things they do well and incorporate that into our AI,
just as we incorporate the things that machines do well
that people do terribly.
So I mean, it's great that AI systems
can do all this brute force computation that people can't.
And one of the reasons I work on this stuff is because I would like to see machines solve problems
that people can't, that, in order to be solved, would combine
the strengths of machines to do all this computation with the ability, let's say, of people to read.
So, you know, I'd like machines that can read the entire medical literature in a day.
You know, 7,000 new papers, or whatever the number is, come out every day. There's
no way for, you know, any doctor or whatever to read them all. The machine that could read
would be a brilliant thing. And that would be strengths of, you know, brute force computation
combined with the kind of subtlety and understanding of medicine that, you know, a good doctor or scientist
has.
So if we can linger a little bit on the idea of general intelligence: Yann LeCun believes
that human intelligence isn't general at all, it's very narrow.
How do you think...
I don't think that makes sense.
We have lots of narrow intelligences for specific problems, but the fact is, like anybody
can walk into, let's say, a Hollywood movie, and reason about the content of almost anything that goes on there.
So you can reason about what happens in a bank robbery or what happens when someone is infertile and wants to, you know,
go to IVF to try to have a child or you can, you know, the list is essentially endless and
you know, not everybody understands every
scene in the movie, but there's a huge range of things that pretty much any ordinary adult
can understand.
His argument is that actually the set of things seems large to us humans because we're
very limited in considering the kind of possibilities of experiences that are possible.
But in fact, the number of experiences that are possible is infinitely larger.
Well, I mean, if you want to make an argument that humans are constrained in what they can understand,
I have no issue with that.
I think that's right, but it's still not the same thing at all as saying,
here's a system that can play Go.
It's been trained on 5 million
games. And then I say, can it play on a rectangular board rather than a square board? And you say,
well, if I retrain it from scratch on another 5 million games, I can. That's really, really narrow,
and that's where we are. We don't have even a system that could play Go and then without further
retraining play on a rectangular board, which any good human could do, you know, with very little problem. So that's what
I mean by narrow. And so it's just wordplay to say... That's just words. Then yeah, you mean
general in the sense that you can do all kinds of Go board shapes flexibly? Well, that would
be like a first step in the right direction. Obviously,
that's not what it really means; you're kidding. What I mean by general is that you could transfer
the knowledge you learn in one domain to another. So if you learn about bank robberies in movies and
there's chase scenes, then you can understand that amazing scene in Breaking Bad when Walter
White has a car chase scene with only one person, he's the only one in it, and you can reflect
on how that car chase scene is like all the other car chase scenes you've ever seen, and
totally different and why that's cool.
And the fact that the number of domains you can do that with is finite doesn't make it
less general.
So the idea of general is you can just do it on a lot of, transfer it across a lot of domains.
Yeah, I mean, I'm not saying humans are infinitely
general or that humans are perfect. I just said, you know, a minute ago, it's a low bar, but it's
just, it's a low bar, but, you know, right now, like the bar is here and we're there and eventually
we'll get way past it. So speaking of low bars, you've highlighted in your new book as well,
but a couple years ago wrote a paper titled Deep Learning a Critical Appraisal that lists 10 challenges faced by current
deep learning systems. So let me summarize them as data efficiency transfer learning,
hierarchical knowledge, open-ended inference, explainability, integrating prior knowledge, causal reasoning, modeling an unstable world,
robustness, adversarial examples, and so on.
And then my favorite problem is reliability and
engineering of real world systems.
So whatever, people can read the paper,
they should definitely read the paper,
should definitely read your book.
But which of these challenges if
solved in your view has the biggest impact on the AI community?
It's a very good question.
And I'm going to be evasive because I think that they go together a lot.
So, you know, some of them might be solved independently of others.
But I think a good solution to AI starts by having real, what I would call cognitive models of what's going on.
So right now we have an approach that's dominant where you take statistical approximations
of things, but you don't really understand them.
So you know that bottles are correlated in your data with bottle caps, but you don't understand
that there's a thread on the bottle cap that fits with the thread on the bottle and that
that's what tightens it.
If I tighten it enough, there's a seal and the water won't come out.
Like there's no machine that understands that.
And having a good cognitive model of that kind of everyday phenomena is what we call common sense.
And if you had that, then a lot of these other things start to fall into at least a little bit better place.
Right now, you're like learning correlations between pixels when you play a video game or something like that.
And it doesn't work very well. It works when the video game is just the way that you
studied it, and then you alter the video game in small ways, like you move the paddle in Breakout a few pixels,
and the system falls apart because it doesn't understand. It doesn't have a representation of a paddle, a ball, a wall,
a set of bricks, and so forth. And so it's reasoning at the wrong level.
So the idea of common sense is full of mystery.
You've worked on it, but it's nevertheless full of mystery,
full of promise.
What does common sense mean?
What does knowledge mean?
So the way you've been discussing it now
is very intuitive.
It makes a lot of sense that that is something we should have
and that's something deep learning systems don't have.
But the argument could be that we're oversimplifying
it because we're oversimplifying the notion of common sense because that's how we, it feels
like we as humans at the cognitive level approach problems.
So a lot of people aren't actually going to read my book, but if they did read the book,
one of the things that might come as a surprise to them is that we actually say common sense is really hard and really complicated.
So they would probably, you know, my critics know that I like common sense, but that chapter actually starts by us beating up not on deep learning, but kind of on our own home team, as it were.
So Ernie and I are first and foremost people that believe in at least some of what good old-fashioned AI tried to do.
So we believe in symbols and logic and programming, things like that are important.
And we go through why even those tools that we hold fairly dear aren't really enough.
So we talk about why common sense is actually many things.
And some of them fit really well with those classical sets of tools.
So things like taxonomy. So I know that a bottle is an object or it's a vessel,
let's say, and I know a vessel is an object
and objects are material things in the physical world.
So I can make some inferences.
If I know that vessels need to not have holes in them
in order to carry their contents, then
I can infer that a bottle shouldn't have a hole in it in order to carry its contents.
So you can do hierarchical inference and so forth.
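As a rough illustration of the hierarchical inference described here, the following Python sketch walks an is-a chain and inherits properties along it. The dictionaries `IS_A` and `PROPERTIES` and both function names are hypothetical, made up for this example; they are not taken from the book or from any real knowledge base.

```python
# Hypothetical mini-taxonomy: each concept points to its parent category.
IS_A = {
    "bottle": "vessel",
    "vessel": "object",
    "object": "material thing",
}

# Hypothetical properties attached at the level where they are stated.
PROPERTIES = {
    "vessel": {"must not have holes in order to carry its contents"},
    "material thing": {"exists in the physical world"},
}


def ancestors(concept):
    """Return the chain of parent categories, nearest first."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain


def inherited_properties(concept):
    """Collect the concept's own properties plus everything its ancestors have."""
    props = set(PROPERTIES.get(concept, set()))
    for parent in ancestors(concept):
        props |= PROPERTIES.get(parent, set())
    return props


if __name__ == "__main__":
    print(ancestors("bottle"))
    print(inherited_properties("bottle"))
```

A lookup like this covers the bottle case, but a concept that is missing from the taxonomy simply yields nothing, which fits the point that taxonomy is only a small piece of common sense.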
And we say that's great, but it's only a tiny piece of what you need for common sense.
We give lots of examples that don't fit into that.
So another one that we talk about is the cheese grater.
You've got holes in a cheese grater.
You've got a handle on top.
You can build a model in the game engine sense of a model, so that you could have a little cartoon character
flying around through the holes of the grater. But we don't have a system yet
(taxonomy doesn't help us that much) that really understands why the handle is on top and what you do with the handle,
or why all of those circles are sharp, or how you'd hold cheese with respect to the grater in order to make it actually work.
Do you think these ideas are just abstractions that could emerge on a system like a very
large deep neural network?
I'm a skeptic that that kind of emergence per se can work.
So I think that deep learning might play a role in the systems that do what I want systems
to do, but it won't do it by itself.
I've never seen a deep learning system really extract an abstract concept.
There are principled reasons for that, stemming from how backpropagation works, how
the architectures are set up.
One example is deep learning people actually all build in something called convolution, which Yann LeCun is famous for,
which is an abstraction. They don't have their systems learn this. So the
abstraction is an object looks the same if it appears in different places, and
what LeCun figured out, and essentially why he was a co-winner
of the Turing Award, was that if you program this in innately, then your system would be a whole
lot more efficient. In principle, this should be learnable, but people don't have systems
that kind of reify things, that make them more abstract. And so what you really wind up
with if you don't program that in advance is a system that kind of realizes that this
is the same thing as this, but then I take your little clock there and I move it over
and it doesn't realize that the same thing applies to the clock.
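To make the convolution point concrete, here is a minimal pure-Python sketch of a single shared detector slid across a one-dimensional signal. The names `correlate`, `place`, and `best_match`, and the toy pattern `[1, 2, 1]`, are assumptions invented for this illustration; real vision systems use learned two-dimensional kernels, so treat this only as a cartoon of the built-in prior.

```python
def correlate(signal, kernel):
    """Slide one shared kernel across the signal (valid positions only)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


PATTERN = [1, 2, 1]  # the "object" (hypothetical toy pattern)
KERNEL = [1, 2, 1]   # a detector tuned to that pattern


def place(pattern, position, length=10):
    """Return an all-zero signal with the pattern written at `position`."""
    signal = [0] * length
    signal[position:position + len(pattern)] = pattern
    return signal


def best_match(signal):
    """Return (position, strength) of the strongest detector response."""
    response = correlate(signal, KERNEL)
    peak = max(response)
    return response.index(peak), peak


if __name__ == "__main__":
    # The same weights respond equally strongly wherever the pattern appears;
    # only the reported position changes.
    print(best_match(place(PATTERN, 1)))
    print(best_match(place(PATTERN, 6)))
```

Because the weights are shared across positions by construction, nothing about "same object, different place" has to be learned from data, which is the sense in which the abstraction is programmed in rather than acquired.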
So the really nice thing, you're right, that convolution is just one of the things that's
like, it's an innate feature that's programmed by the human expert, but we need more of those
not less.
Yes, so the, but the nice feature is, it feels like that requires coming up with that brilliant idea, it can get you a Turing Award,
but it requires less effort than encoding something we'll talk about the expert system,
so encoding a lot of knowledge by hand.
So it feels like there's a huge amount of limitations,
which you clearly outline with deep learning,
but the nice feature of deep learning,
whatever it is able to accomplish,
it does a lot of stuff automatically
without human intervention.
Well, and that's part of why people love it, right?
But I always think of this quote from Bertrand Russell,
which is that it has all the advantages of theft over honest toil.
It's really hard to program into a machine,
a notion of causality, or
you know, even how a bottle works, or what containers are. Ernie Davis and I wrote, I
don't know, a 45-page academic paper trying just to understand what a container is, which
I don't think anybody ever read. But it's a very detailed analysis of all the
things, well not even all, some of the things you need to do in order to understand a container. It would be a whole lot nicer, and you know,
I'm a co-author on the paper and made it a little bit better, but Ernie did the hard work for that particular paper.
It took him like three months to get the logical statements correct and maybe that's not the right way to do it.
It's a way to do it but on that way of doing it, it's really hard work to do something as simple as
understanding containers.
And nobody wants to do that hard work.
Even Ernie didn't want to do that hard work.
Everybody would rather just feed their system in with a bunch of videos with a bunch of containers
and have the systems infer how containers work.
It would be like so much less effort, let the machine do the work.
And so I understand the impulse, I understand why people want to do that.
I just don't think that it works.
I've never seen anybody build a system that in a robust way can actually watch videos
and predict exactly which containers would leak and which ones wouldn't or something.
Like, I know someone's going to go out and do that since I said it and I look forward
to seeing it.
But getting these things to work robustly is really, really hard.
So, Yann LeCun, who was my colleague at NYU for many years, thinks that the hard work
should go into defining an unsupervised learning algorithm
that will watch videos, use the next frame basically in order to tell it what's going on.
And he thinks that's the royal road, and he's willing to put in the work in devising
that algorithm.
Then he wants the machine to do the rest.
And again, I understand the impulse.
My intuition, based on years of watching this stuff and making predictions 20 years ago
that still hold, even though there's a lot more computation and so forth, is that we
actually have to do a different kind of hard work, which is more like building a design specification
for what we want the system to do, doing the hard engineering work to figure out how we
do things like what Yann did for convolution in order to figure out how to encode complex
knowledge into the system. Current systems don't have that much knowledge other than convolution,
which is, again, this object being in different places
and having the same perception, I guess I'll say.
Same appearance.
People don't want to do that work.
They don't see how to naturally fit one with the other.
I think that's, yes, absolutely.
But also on the expert system side,
there's a temptation to go too far the other way,
so which is having an expert sort of sit down
and encode the description, the framework
for what a container is, and then having the system reason
to the rest.
From my view, one really exciting possibility
is active learning, where there's continuous interaction
between a human and a machine, as the machine
does a kind of deep learning type
of extraction of information from data, patterns and so on, but humans are also guiding the learning procedures, guiding both the
process and the framework of how the machine learns, whatever the task is. I was with you with almost
everything except the phrase deep learning. What I think you really want there is a new form of machine learning.
So let's remember deep learning is a particular way
of doing machine learning.
Most often it's done with supervised data
for perceptual categories.
There are other things you can do with deep learning,
some of them quite technical,
but the standard use of deep learning
is I have a lot of examples and I have labels for them.
So here are pictures.
This one's the Eiffel Tower.
This one's the Sears Tower.
This one's the Empire State Building.
This one's a cat.
This one's a pig and so forth.
You just get millions of examples, millions of labels.
And deep learning is extremely good at that.
It's better than any other solution that anybody has devised.
But it is not good at representing abstract knowledge.
It's not good at representing
things like bottles contain liquid and, you know, have tops to them in some way. It's not
very good at learning or representing that kind of knowledge. It is an example of having
a machine learn something, but it's a machine that learns a particular kind of thing, which
is object classification. It's not a particularly good algorithm for learning about the abstractions that govern our world. There may be such a thing, part
of what we counsel in the book is maybe people should be working on devising such things.
So one possibility, just I wonder what you think about it is, deep neural networks do form
abstractions, but they're not accessible to us humans in terms of, we can't...
There's some truth in that. So is it possible that either current or future neural networks form
very high level abstractions, which are as powerful as our human abstractions of common
sense, and we just can't get a hold of them? And so the problem is essentially that we need
to make them explainable.
This is an astute question, but I think the answer is at least partly no. One of the kinds of classical
neural network architectures is what we call an auto-associator. It just tries to take an input,
goes through a set of hidden layers, and comes out with an output. And it's supposed to learn
essentially the identity function that your input is the same as your output. So you think of
those binary numbers, you've got like the one, the two, the four, the eight,
the 16, and so forth.
And so if you want to input 24, you turn on the 16, you turn on the eight.
It's like binary one, one, and bunch of zeros.
So I did some experiments in 1998 with the sort of the precursors of contemporary deep
learning.
And what I showed was you could train these networks on all the even numbers and they would
never generalize to the odd numbers.
A lot of people thought that I was, I don't know, an idiot or faking the experiment or
wasn't true or whatever, but it is true that with this class of networks that we had in
that day, that they would never ever make this generalization.
And it's not that the networks were stupid, it's that they see the world in a different
way than we do.
They were basically concerned, what is the probability that the rightmost output node
is going to be one?
And as far as they were concerned, and everything they'd ever been trained on, it was a zero.
That node had never been turned on, and so they figured, well, why turn it on now?
Whereas a person would look at the same problem and say, well, it's obvious. We're just doing
the thing that corresponds. The Latin for it is mutatis mutandis: we'll change what needs
to be changed. And we do this. This is what algebra is. So I can do f of x equals y plus
two. And I can do it for a couple of values. I can tell you if y is three, then x is five,
and if y is four, x is six. And now I can do it with some totally different number, like a million,
and you can say, well, obviously it's a million and two, because you have an algebraic operation
that you're applying to a variable. And deep learning systems kind of emulate that, but they don't
actually do it. The particular example, you can fudge a solution to that particular problem,
but the general form
of that problem remains that what they learn is really correlations between different input
and output nodes.
They're complex correlations with multiple nodes involved and so forth, but ultimately
they're correlative.
They're not structured over these operations over variables.
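The even-number example can be reproduced in miniature. The sketch below is not the 1998 experimental setup: it assumes a much simpler single-layer network with sigmoid outputs, 5-bit numbers, zero-initialised weights, and plain gradient descent, all chosen for illustration, and every name in it is hypothetical. It is contrasted with an operation over a variable written as ordinary code.

```python
import math

N_BITS = 5  # assumption: toy width covering the numbers 0..31


def to_bits(n):
    """Least-significant bit first, e.g. 24 -> [0, 0, 0, 1, 1]."""
    return [(n >> i) & 1 for i in range(N_BITS)]


def from_bits(bits):
    return sum(b << i for i, b in enumerate(bits))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=500, lr=0.5):
    """Train a one-layer auto-associator: the target is the input itself."""
    w = [[0.0] * N_BITS for _ in range(N_BITS)]  # w[output_bit][input_bit]
    b = [0.0] * N_BITS
    for _ in range(epochs):
        for n in examples:
            x = to_bits(n)
            for j in range(N_BITS):
                out = sigmoid(b[j] + sum(w[j][k] * x[k] for k in range(N_BITS)))
                err = out - x[j]
                b[j] -= lr * err
                for k in range(N_BITS):
                    # An input bit that is always 0 never changes its weights.
                    w[j][k] -= lr * err * x[k]
    return w, b


def predict(model, n):
    w, b = model
    x = to_bits(n)
    bits = [
        round(sigmoid(b[j] + sum(w[j][k] * x[k] for k in range(N_BITS))))
        for j in range(N_BITS)
    ]
    return from_bits(bits)


def add_two(y):
    """An operation over a variable: applies to any value of y."""
    return y + 2


EVENS = list(range(0, 2 ** N_BITS, 2))
model = train(EVENS)

if __name__ == "__main__":
    for n in (24, 7, 13):
        print(n, "->", predict(model, n))
    print(add_two(1_000_000))
```

In this toy version the rightmost output unit only ever sees a target of zero, so training can only push it further toward off, and the weights coming from the rightmost input bit are never updated because that input is never on. The rule `add_two`, by contrast, is stated over a variable and so applies to a million as readily as to three.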
Now someday people may do a new form of deep learning that incorporates that stuff, and
I think it will help a lot.
And there's some tentative work on things like differentiable programming right now that fall into that category.
But this sort of classic stuff, like people use for ImageNet,
doesn't have it. And you have people like Hinton going around saying symbol manipulation, like what Marcus, what I, advocate, is like the gasoline engine.
It's obsolete. We should just use this cool electric power that we've got with deep learning. And that's really destructive
because we really do need to have the gasoline engine stuff that represents, I
mean, I don't think it's a good analogy, but we really do need to have the stuff
that represents symbols. Yeah. And Hinton as well would say that we do need to
throw out everything and start over.
So, I mean, there's a lot... Yeah, you know, Hinton said that to Axios, and I had a friend who
interviewed him and tried to pin him down on what exactly we need to throw out, and he was very evasive.
Well, of course, because we can't. If he knew that, he'd throw it out himself.
But yeah, but I mean, you can't have it both ways.
You can't be like, I don't know what to throw out,
but I am gonna throw out the symbols.
I mean, and not just the symbols,
but the variables and the operations over variables.
And don't forget, the operations over variables,
the stuff that I'm endorsing,
and which, you know, John McCarthy did when he founded AI,
that stuff is the stuff that we build most computers out of.
There are people now who say,
we don't need computer programmers anymore,
not quite looking at the statistics of how much computer programmers actually get paid
right now. We need lots of computer programmers. And most of them, they do a little bit of
machine learning, but they still do a lot of code, right? Code where it's like, you know,
if the value of X is greater than the value of Y, then do this kind of thing, like conditionals
and comparing operations over variables. Like, there's this fantasy
you can machine learn anything.
There's some things you would never want to machine learn.
I would not use a phone operating system
that was machine learned.
Like, you made a bunch of phone calls
and you recorded which packets were transmitted
and you just machine learned it, it'd be insane.
Or to build a web browser by taking logs of keystrokes
and images, screenshots, and then trying to learn
the relation between the, no way,
we'd ever, no rational person would ever
try to build a browser that way.
They would use symbol manipulation,
the stuff that I think AI needs to avail itself
of in addition to deep learning.
Can you describe your view of symbol manipulation
in its early days?
Can you describe expert systems?
And where do you think they hit a wall or a set of challenges?
Sure. So I mean, first I just want to clarify, I'm not endorsing expert systems per se.
You've been kind of contrasting them. There is a contrast, but that's not the thing that I'm endorsing.
Yes. So expert systems try to capture things like medical knowledge with a large set of
rules. So, if the patient has this symptom and this other symptom, then it is likely that
they have this disease. So, there are logical rules and they were symbol-manipulating rules
of just the sort that I'm talking about. And the problem? They encode a set of knowledge
that the experts have put in. And very explicitly so. So you'd have somebody interview an expert and then try to turn that stuff into rules.
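For readers who have not seen one, a rule-based system of the kind described here can be sketched in a few lines of Python. The rules, symptom names, and the function `forward_chain` below are invented placeholders with no medical validity, and the loop is a generic forward-chaining scheme rather than the design of any particular historical expert system.

```python
# Each rule: if all conditions are known facts, add the conclusion.
# Hypothetical, hand-written rules standing in for interviewed expert knowledge.
RULES = [
    ({"fever", "cough"}, "possible flu"),
    ({"possible flu", "shortness of breath"}, "refer to doctor"),
]


def forward_chain(facts, rules):
    """Apply rules repeatedly until no new conclusions can be drawn."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts


if __name__ == "__main__":
    print(forward_chain({"fever", "cough", "shortness of breath"}, RULES))
```

Everything such a system knows is typed in by hand and every rule is all-or-nothing, which is the limitation that handwritten rules with no learning and no statistics run into.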
And at some level, I'm arguing for rules.
But the difference is what those guys did in the 80s was almost entirely rules, almost
entirely handwritten with no machine learning.
What a lot of people are doing now is almost entirely one species of machine learning with
no rules.
And what I'm counseling is actually a hybrid.
I'm saying that both of these things have their advantage.
So if you're talking about perceptual classification, how do I recognize a bottle?
Deep learning is the best tool we've got right now.
If you're talking about making inferences about what a bottle does, something closer to
the expert systems is probably still the best available alternative. And probably we want
something that is better able to handle quantitative and statistical information than those classical
systems typically were. So we need new technologies that are going to draw some of the strengths
of both the expert systems and the deep learning but are going to find new ways to synthesize
them. How hard do you think it is to add knowledge at the low level? So mine human intellects to add extra information to
symbol manipulating systems.
In some domains, it's not that hard, but it's often really hard.
Partly because a lot of the
things that are important, people wouldn't bother to tell you.
So if you pay someone on Amazon Mechanical
Turk to tell you stuff about bottles, they probably won't even bother to tell you some
of the basic level stuff that's just so obvious to a human being and yet so hard to capture
in machines. You know, they're going to tell you more exotic things and like they're all well and good
but they're not getting to the root of the problem.
So untutored humans aren't very good at knowing and why should they be, what kind of knowledge
the computer system developers actually need.
I don't think that that's an irremediable problem.
I think it's historically been a problem.
People have had crowdsourcing efforts and they don't work that well.
There's one at MIT, we're recording this at MIT, called Virtual Home,
where, and we talk about this in the book, find the exact example there,
but people were asked to do things like describe an exercise routine.
And the things that the people describe are very low level and don't
really capture what's going on. So like, go to the room with the television and the weights,
turn on the television, press the remote to turn on the television, lift weight, put
weight down, it's like very micro level. And it's not telling you what an exercise routine
is really about, which is like, I want to fit a certain number of exercises in a certain time period, I want to emphasize these
muscles.
I mean, you want some kind of abstract description.
The fact that you happen to press the remote control in this room when you watch this
television isn't really the essence of the exercise routine.
But if you just ask people like, what did they do?
Then they give you this fine grain. And so it takes a level of expertise about how the AI works in order to craft the right kind of knowledge.
So there's this ocean of knowledge that we all operate on. Some of them may not even
be conscious, or at least we're not able to communicate it effectively. Yeah, most of
it we would recognize if somebody said it if it was true or not, but we wouldn't think to say that it's true or not.
It's a really interesting mathematical property.
This ocean has the property that every piece of knowledge in it, we would recognize it as
true if we were told, but we're unlikely to retrieve it in the reverse.
So that, that interesting property, I would say there's a huge ocean of that knowledge.
What's your intuition? Is it accessible to AI systems somehow? Can we? So you said, I mean,
most of it is not, I'll give you an asterisk on this in a second, but most of it is not
ever been encoded in machine interpretable form. And so, I mean, if you say accessible,
there's two meanings of that.
One is like, could you build it into a machine?
Yes.
The other is like, is there some database
that we could go, you know, download
and stick into our machine?
No, the first thing.
Could we?
What's your intuition?
I think we could.
I think it hasn't been done right.
You know, the closest, and this is the asterisk,
is the Cyc system,
which tried to do this. A lot of logicians worked for Doug Lenat for 30 years on this project.
I think they stuck too close to logic, didn't represent enough about probabilities, tried to
hand code it. There are various issues. It hasn't been that successful. That is the closest
existing system to trying to encode this. Why do you think there's not more excitement slash money behind this idea
currently? People view that project as a failure. I think that they confused
the failure of a specific instance that was conceived 30 years ago for the failure of
an approach, which they don't do for deep learning. So, you know, in 2010 people had the same attitude towards deep learning.
They're like, this stuff doesn't really work.
And, you know, all these other algorithms work better and so forth.
And then certain key technical advances were made.
But mostly, it was the advent of graphics processing units that changed that.
It wasn't even anything foundational in the techniques.
And there was some new tricks, but mostly it was just more compute.
And more data, things like ImageNet that didn't exist before, that allowed deep learning
to work. And it could be that, you know,
Cyc just needs a few more things, or something like Cyc.
But the widespread view is that that just doesn't work.
And people are reasoning from a single example.
They don't do that with deep learning.
They don't say nothing that existed in 2010, and there were many, many efforts in deep
learning was really worth anything, right?
I mean, really, there's no model from 2010 in deep learning, or the predecessors of deep
learning, that has any commercial value whatsoever at this point.
Right? They're all failures.
But that doesn't mean that there wasn't anything there.
I have a friend, I was getting to know him,
and he said, I had a company too,
I was talking about a new company.
And he said, I had a company too, and it failed.
And I said, well, what did you do?
And he said, deep learning.
And the problem was he did it in 1986 or something like that,
or 1990.
We didn't have the tools then, not the algorithms.
His algorithms weren't that different from modern ones.
But he didn't have the GPUs to run it fast enough.
He didn't have the data.
And so it failed.
It could be that, you know, symbol manipulation
per se with modern amounts of data and compute and maybe
some advance in compute for that kind of compute might be great.
My perspective on it is not that we want to resuscitate that stuff per se, but we want
to borrow lessons from it, bring together with other things that we've learned.
And it might have an image net moment where it will spark the world's imagination and
it will be an explosion of symbol manipulation efforts.
Yeah, I think that people at AI2, the Paul Allen's AI Institute, are trying to build data
sets that, well, they're not doing it for quite the reason that you say, but they're trying
to build data sets that at least spark interest in common sense reasoning.
So create benchmarks that people...
Benchmarks for common sense.
That's a large part of what the AI2.org is working on right now.
So speaking of compute,
Rich Sutton wrote a blog post titled The Bitter Lesson,
I don't know if you've read it,
but he said that the biggest lesson that can be read
from 70 years of AI research is that general methods,
that leverage computation are ultimately the most effective.
But most effective of what?
So they have been most effective for perceptual classification
problems and for some reinforcement learning problems.
And he works on reinforcement learning.
Well, no, let me push back on that.
You're actually absolutely right.
But I would also say they have
been most effective generally because everything we've done up to...
Would you argue against that? To me, deep learning is the first thing that has been successful
at anything in AI. And you're pointing out that this success is very limited, folks,
but has there been something truly successful
before deep learning?
Sure, I mean, I want to make a larger point,
but on the narrower point, the classical AI
is used, for example, in doing navigation instructions.
Sure, sure.
It's very successful.
Everybody on the planet uses it now,
like multiple times a day.
That's a measure of success, right?
So, I mean, I don't think classical AI was wildly successful,
but there are cases like that that are just used all the time.
Nobody even notices them because they're so pervasive.
So, there are some successes for classical AI.
I think deep learning has been more successful,
but my usual line about this, and I didn't invent it,
but I like it a lot, it's just because you can build
a better ladder, it doesn't mean you can build
a ladder to the moon.
So the bitter lesson is if you have a perceptual
classification problem, throwing a lot of data at it
is better than anything else.
But that has not given us any material progress in natural
language understanding, common sense reasoning like a robot would need to navigate a home,
problems like that, there's no actual progress there.
So, flip side of that, if we remove data from the picture, another bitter lesson is that
you just have a very simple algorithm
and you wait for compute to scale.
This doesn't have to be learning,
it doesn't have to be deep learning,
it doesn't have to be data driven,
but just wait for the compute.
So my question for you,
do you think compute can unlock some of the things
with either deep learning or symbol manipulation that...
Sure, but I'll put a, you know, proviso on that.
More compute's always better.
Like, nobody's gonna argue with more compute.
It's like having more money.
I mean, there's diminishing returns on more money.
Exactly, there's diminishing returns on more money,
but nobody's gonna argue if you wanna give them more money, right?
I mean, except maybe the people who signed the Giving Pledge,
and some of them have a problem:
they've promised to give away more money
than they're able to.
But the rest of us, you know, if you want to give me more money, fine.
Say more money, more problems, but okay.
That's true too.
What I would say to you is your brain uses like 20 watts,
and it does a lot of things that deep learning doesn't do,
or that symbol manipulation doesn't do, that AI just hasn't figured out how to do.
So it's an existence proof that you don't need
server resources that are Google scale
in order to have an intelligence.
I built, with a lot of help from my wife,
two intelligences that are 20 watts each
and far exceed anything that anybody else
has built out of silicon.
Speaking of those two robots,
how, what have you learned about AI
from having, well, they're not robots,
but, sorry, intelligent agents.
Those two intelligent agents.
I've learned a lot by watching my two intelligent agents.
I think that what's fundamentally interesting,
well, one of the many things
that's fundamentally interesting about them is the way that they set their own problems to solve.
So my two kids are a year and a half apart, they're about five and six and a half, they play together all the time, and they're constantly creating new challenges.
Like that's what they do is they make up games and they're like, well, what if this or what if that or what if I had this superpower or what if, you know, you could walk through this wall. So they're doing these, what if scenarios
all the time. And that's how they learn something about the world and grow their minds and
machines don't really do that. So that's interesting. And you've talked about this, you've written about
it, you thought about it, nature versus nurture.
So what innate knowledge do you think we're born with?
And what do we learn along the way in those early months and years?
Can I just say how much I like that question?
You phrased it just right, and almost nobody ever does.
Which is what is the innate knowledge and what's learned along the way.
So many people dichotomize it, and they think it's nature versus nurture,
when it obviously has to be nature and nurture.
They have to work together.
You can't learn the stuff along the way unless you have some innate stuff, but just because
you have the innate stuff doesn't mean you don't learn anything.
And so many people get that wrong, including in the field, like people think,
if I work in machine learning, the learning side,
I must not be allowed to work on the innate side
because that would be cheating.
Exactly, people have said that to me.
And it's just absurd.
So thank you.
But you know, you could break that apart more.
I've talked to folks who studied the development
of the brain,
the growth of the brain in the first few days, in the first few months,
in the womb, all of that, is that innate? So that process of development from a stem cell to the growth of the central nervous system and so on, to the information that's encoded through the long
arc of evolution. So all of that comes into play and it's unclear. It's not just whether
it's a dichotomy or not. It's where most or where the knowledge is encoded. So what's
your intuition about the innate knowledge, the power of it, what's contained in it, what can we learn from it?
One of my earlier books was actually trying to understand the biology of this.
The book was called The Birth of the Mind.
How is it that genes even build innate knowledge?
And from the perspective of the conversation we're having today, there's actually two questions.
One is what innate knowledge or mechanisms, or what have you,
people or other animals might be endowed with.
I always like showing this video of a baby ibex
climbing down a mountain. That baby ibex,
you know, a few hours after its birth,
knows how to climb down a mountain.
That means that it knows, not consciously,
something about its own body and physics
and 3D geometry and all of this kind of stuff.
So there's one question about it,
like what does biology give its creatures?
What has evolved in our brains?
How is that represented in our brains?
That's the question I thought about in the book,
The Birth of the Mind.
And then there's a question of what AI should have
and they don't have to be the same.
But I would say that it's a pretty interesting set
of things that we are equipped with that allows us to do a lot of interesting things
So I would argue, or guess, based on my reading of the developmental psychology literature, which I've also participated in,
that children are born with a notion of space, time, other agents, places,
and also this kind of mental algebra that I was describing before,
and a notion of causation, if I didn't just say that. So at least those kinds of things.
They're like frameworks for learning the other things.
So are they disjoint in your view, or is it just somehow all connected? You've talked a lot about
language. Is it all kind of connected in some mesh that's language-like, of understanding
concepts all together?
I don't think we know for people how they're represented, and machines just don't really
do this yet.
So I think it's an interesting open question, both for science and for engineering.
Some of it has to be at least interrelated in the way that the interfaces of a software
package have to be able to talk to one another.
The systems that represent space and time can't be totally disjoint because a lot of the
things that we reason about are the relations between space and time and cause.
I put this on and I have expectations about what's going to happen with the bottle cap on
top of the bottle and those span space and time.
If the cap is over here, I get a different outcome.
If the timing is different, if I put this here, after I move that, then I get a different outcome that relates to causality.
So obviously these mechanisms, whatever they are, can certainly communicate with each other.
So, I think evolution had a significant role to play in the development of this whole
kluge, right?
How efficient do you think is evolution?
Oh, it's terribly inefficient, except that...
Well, can we do better?
Let's come to that.
It's inefficient, except that once it gets a good idea, it runs with it.
So it took, I guess, a billion years, roughly a billion years, to
evolve to a vertebrate brain plan.
And once that vertebrate brain plan evolved, it spread everywhere.
So fish have it and dogs have it and we have it.
We have adaptations of it and specializations of it.
And it's the same thing with a primate brain plan.
So monkeys have it and apes have it and we have it.
So, you know, there are additional innovations like color vision, and those spread really rapidly.
So it takes evolution a long time to get a good idea,
and I'm being anthropomorphic and not literal here. But once it has that idea, so to speak,
which cashes out into one set of genes in the genome, those genes spread very rapidly.
And they're like subroutines or libraries, I guess the word people might use nowadays or be
more familiar with. They're libraries that get used over and over again.
So once you have the library for building something
with multiple digits, you can use it for a hand,
but you can also use it for a foot.
You just kind of reuse the library
with slightly different parameters.
Evolution does a lot of that, which means
that the speed over time picks up.
So evolution can happen faster
because you have bigger and bigger libraries. And what I think has happened in attempts at evolutionary computation is that people start with
libraries that are very, very minimal, like almost nothing, and then progress is slow.
And it's hard for someone to get a good PhD thesis out of it, and they give up.
If we had richer libraries to begin with, if you were evolving from systems that had an
innate structure to begin with, then things might speed up.
Or more PhD students. If the evolutionary process is indeed, in a meta way, running away with good ideas, you need to have
a lot of ideas, a pool of ideas, in order for it to discover one that you can run away with.
And PhD students representing individual ideas as well.
Yeah, I mean, you could throw a billion PhD students at it.
Yeah.
The monkeys at typewriters with Shakespeare.
Yeah.
Well, I mean, those aren't cumulative, right?
That's just random.
And part of the point that I'm making is that evolution is cumulative.
So if you have a billion monkeys independently, you don't really get anywhere.
But if you have a billion monkeys, I think Dawkins made this point originally,
or probably other people did, but Dawkins made it very nicely in either The Selfish Gene
or The Blind Watchmaker.
If there is some sort of fitness function, it can drive you towards something.
I guess that's Dawkins's point.
And my point, which is a variation on that, is that if the evolution is cumulative, and these are related points, then you can start going faster.
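The contrast between independent random monkeys and cumulative selection can be sketched in a few lines of Python. This is a minimal toy in the spirit of the "weasel" demonstration from The Blind Watchmaker, not anything run or described in the conversation; the target phrase, the mutation rate, the number of offspring, and every function name below are assumptions chosen for illustration.

```python
import random
import string

ALPHABET = string.ascii_uppercase + " "
TARGET = "METHINKS IT IS LIKE A WEASEL"  # assumed target phrase for the toy


def score(candidate, target=TARGET):
    """Fitness function: number of positions that already match the target."""
    return sum(a == b for a, b in zip(candidate, target))


def random_string(rng, length):
    """One monkey typing `length` characters at random."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def independent_monkeys(rng, tries, target=TARGET):
    """Non-cumulative search: each try starts from scratch. Returns best score seen."""
    return max(score(random_string(rng, len(target)), target) for _ in range(tries))


def mutate(rng, parent, rate):
    """Copy the parent, replacing each character with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else ch for ch in parent)


def cumulative_selection(rng, target=TARGET, offspring=100, rate=0.05,
                         max_generations=5000):
    """Cumulative search: each generation builds on the best string so far."""
    parent = random_string(rng, len(target))
    for generation in range(max_generations + 1):
        if parent == target:
            return parent, generation
        children = [mutate(rng, parent, rate) for _ in range(offspring)]
        # Keep the parent in the pool so fitness never goes backwards.
        parent = max(children + [parent], key=lambda c: score(c, target))
    return parent, max_generations


if __name__ == "__main__":
    rng = random.Random(0)
    print("best of 1000 independent tries:", independent_monkeys(rng, 1000), "of", len(TARGET))
    print("cumulative selection:", cumulative_selection(rng))
```

The only difference between the two searches is whether a generation gets to start from the previous best string, which is the sense of "cumulative" in the point above. Starting from a richer initial string would correspond to the richer libraries mentioned earlier.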
Do you think something like the process of evolution is required to build intelligent systems?
Not logically. So all the stuff that evolution did, a good engineer might be able to do. So for example, evolution made quadrupeds, which distribute the load across a horizontal
surface.
A good engineer could come up with that idea.
I mean, sometimes good engineers come up with ideas by looking at biology.
There's lots of ways to get your ideas.
Part of what I'm suggesting is we should look at biology a lot more.
We should look at the biology of thought and understanding
and the biology by which creatures intuitively reason about physics or other agents or like how
do dogs reason about people? Like they're actually pretty good at it. If we could understand, at my
college we joked "dognition," if we could understand dognition well, and how it was implemented, that might help us with our AI.
So do you think it's possible that the kind of time scale
that evolution took, is the kind of time scale
that would be needed to build intelligent systems?
Or can we significantly accelerate that process
inside a computer?
I mean, I think the way that we accelerate that process
is we borrow from biology.
Not slavishly, but I think we look at how biology has solved problems and we say, does
that inspire any engineering solutions here?
Try to mimic biological systems and then therefore have a shortcut.
Yeah, I mean, there's a field called biomimicry, and people do that for, like, material science all the time.
We should be doing the analog of that for AI, and the analog for that for AI is to look
at cognitive science or the cognitive sciences, which is psychology, maybe neuroscience, linguistics,
and so forth, and look to those for insight.
What do you think is a good test of intelligence in your view?
So I don't think there's one good test. In fact, I tried to organize a movement toward something called a Turing Olympics, and
my hope is that Francois Chollet is actually going to take over this. I think
he's interested, and I just don't have a place in my busy life at this moment, but the notion is that
there'd be many tests and not just one because intelligence is multifaceted.
There can't really be a single measure of it because it isn't a single thing.
Just at the crudest level, the SAT has a verbal component and a math component because they're not
identical. And Howard Gardner has talked about multiple intelligences, like kinesthetic intelligence
and verbal intelligence and so forth. There are a lot of things that go into intelligence and people can get good at one or the other.
I mean, in some sense, every expert has developed a very specific kind of intelligence.
And then there are people that are generalists.
And I think of myself as a generalist with respect to cognitive science,
which doesn't mean I know anything about quantum mechanics,
but I know a lot about the different facets of the mind.
And, you know, there's a kind of intelligence to thinking about intelligence. I like to think that I have some of that, but social intelligence, I'm just okay.
I'm, you know, there are people that are much better at that than I am.
Sure, but what would be really impressive to you?
You know, I think the idea of a Turing Olympics is really interesting,
especially if somebody like Francois is running it, but
to you in general,
Not as a benchmark, but if you saw an AI system being able to accomplish something that would impress the heck out of you
What would that thing be? Would it be natural English conversation? For me personally, I
Would like to see a kind of comprehension that
relates to what you just said. So I wrote a piece in the New Yorker in, I think, 2015, right
after Eugene Goostman, which was a software package, won a version of the Turing test. And
the way that it did this is, well, the way you win the Turing test, so-called win it, you know, the
Turing test is you fool a person into thinking that
the machine is a person, is you're evasive, you pretend to
have limitations, so you don't have to answer certain
questions, and so forth. So this particular system
pretended to be a 13 year old boy from Odessa, who didn't
understand English and was kind of sarcastic and wouldn't
answer your questions and so forth.
And so judges got fooled into thinking, you know, briefly, with very little exposure,
that it was a 13-year-old boy.
And it ducked all the questions that Turing was actually interested in, which is like,
how do you make the machine actually intelligent?
So, that test itself is not that good.
And so, in the New Yorker, I proposed an alternative, I guess.
And the one that I proposed there was a comprehension test.
And I must like Breaking Bad, because I've already given you one
Breaking Bad example.
And in that article, I have one as well, which was something like:
you should be able to watch an episode of
Breaking Bad, or maybe you have to watch the whole series to
be able to answer the question, and say, you know, if Walter
White took a hit out on Jesse, why did he do that? Right.
So if you could answer kind of arbitrary questions
about characters' motivations, I would be really impressed
with that.
I mean, if you built software to do that,
that could watch a film, or there are different versions.
And so ultimately, I wrote this up with Praveen
Paritosh in a special issue of AI Magazine
that basically was about the Turing Olympics.
There were like 14 tests proposed.
And the one that I was pushing was a comprehension challenge
and Praveen, who's at Google, was trying to figure out
like how we would actually run it.
And so we wrote a paper together.
And you could have a text version too,
or you could have an auditory podcast version,
you'd have a written version.
But the point is that you win at this test
if you can do, let's say, human level
or better than humans, at answering kind
of arbitrary
questions.
Why did this person pick up the stone?
What were they thinking when they picked up the stone?
Were they trying to knock that glass?
And I mean, ideally, these wouldn't be multiple choice,
either, because multiple choice is pretty easily
gamed.
So if you could have relatively open-ended questions,
and you can answer why people are doing this stuff,
I would be very impressed.
And of course, humans can do this, right?
If you watch a well-constructed movie
and somebody picks up a rock,
everybody watching the movie knows
why they picked up the rock, right?
They all know, oh my gosh,
he's gonna hit this character or whatever.
We have an example in the book about
when a whole bunch of people say,
I am Spartacus,
you know, this famous scene.
You know, the viewers understand, first of all, that everybody or everybody minus one has
to be lying.
They can't all be Spartacus.
We have enough common sense knowledge to know they couldn't all have the same name.
We know that they're lying and we can infer why they're lying, right?
They're lying to protect someone and to protect things they believe in.
You get a machine that can do that,
that can say, this is why these guys all got up and said, I am Spartacus,
and I will sit down and say, AI has really achieved a lot.
Thank you.
Without cheating any part of the system.
Yeah, if you do it, there are lots of ways you can cheat.
You could build a Spartacus machine that works on that film.
Like that's not what I'm talking about.
I'm talking about you can do this with essentially arbitrary films, sort of, you know, from
a large set.
Even beyond films, because it's possible such a system would discover that the number of
narrative arcs in film is limited, like the famous thing about the classic seven
plots or whatever.
I don't care if you want to build in the system, you know, boy meets girl, boy loses girl, boy finds girl.
That's fine. I don't mind having some head start knowledge.
Okay. Good.
I mean, you could build it in innately, or you could have your system watch
just a lot of films again. If you can do this at all, but with a wide range of films,
not just one film in one genre, but even if you could do it for all Westerns,
I'd be reasonably impressed.
Yeah.
So, in terms of being impressed, just for the fun of it, because you've put so many interesting
ideas out there in your book, challenging the community for further steps, is it possible
on the deep learning front that you're wrong about its limitations?
That deep learning will unlock it, that Yann LeCun next year will publish a paper that achieves this comprehension?
So do you think that way often as a scientist? Do you consider that your intuition could be wrong, that
deep learning could actually run away with it?
I'm more worried about rebranding as a kind of political thing.
So I mean, what's going to happen, I think, is that deep learning is going to start to encompass
symbol manipulation.
So I think Hinton's just wrong.
You know, Hinton says we don't want hybrids.
I think people work towards hybrids and they will relabel their hybrids as deep learning.
We've already seen some of that.
So AlphaGo is often described as a
deep learning system, but it's more correctly described as a system that has deep learning, but also
Monte Carlo tree search, which is a classical AI technique. And people will start to blur the lines
in the way that IBM blurred Watson. First Watson meant this particular system, and then it was just
anything that IBM built in their cognitive division.
But purely, let me ask. For sure, that's a
branding question, and that's a giant mess. I mean, purely a single neural network being able to accomplish reasonable comprehension.
I don't stay up at night worrying that that's going to happen. And I'll just
give you two examples. One is a guy at DeepMind who thought he had finally outfoxed me,
Zergylord, I think, is his Twitter handle. And he specifically made an example.
Marcus said that such and such. He fed it into GPT-2,
which is the AI system that is so smart that OpenAI couldn't release it
because it would destroy the world, right?
You remember that from a few months ago.
So he feeds it into GPT-2, and my example was something like: a rose is a rose, a tulip is a tulip, a lily
is a blank. And he got it to actually do that, which was a little bit impressive, and I wrote
back and I said, that's impressive, but can I ask you a few questions? I said, can it, you know,
was that just one example? Can it do it generally, and can it do it with novel words, which is part of
what I was talking about in 1998 when I first raised the example. So a dax is a dax, right?
And he sheepishly wrote back about 20 minutes later
and the answer was, well, it had some problems with those.
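The difference between getting "a lily is a lily" right and getting "a dax is a dax" right can be made concrete with a toy sketch. This is not GPT-2 and not the experiment from the Twitter exchange; it is a hypothetical illustration, with made-up function names, of why a system that only stores what it has seen differs from one that applies a rule over a variable ("an X is an X") to a word it has never encountered.

```python
TRAINING = ["a rose is a rose", "a tulip is a tulip", "a lily is a lily"]
FRAME = " is a"  # the prompt ends with this frame, e.g. "a dax is a"


def complete_by_lookup(training, prompt):
    """Memorizer: can only complete subjects that appeared in training."""
    table = {}
    for sentence in training:
        subject, completion = sentence.split(" is a ")
        table[subject] = completion
    subject = prompt[: -len(FRAME)]
    return table.get(subject)  # None for anything unseen


def complete_by_rule(prompt):
    """Rule over a variable: whatever noun fills X, the answer is X again."""
    subject = prompt[: -len(FRAME)]
    return subject.split()[-1]


if __name__ == "__main__":
    for prompt in ["a rose is a", "a dax is a"]:
        print(prompt, "->", complete_by_lookup(TRAINING, prompt), "|", complete_by_rule(prompt))
```

The lookup version handles the familiar flower and returns nothing for the novel word, while the rule version treats both the same way. Real neural networks sit somewhere between these two extremes, which is why the novel-word test is the informative one.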
So, you know, I made some predictions 21 years ago
that still hold in the world of computer science.
That's amazing, right?
Because, you know, there's a thousand or a million times
more memory, and computation does a million times more
operations per second, spread across a cluster.
There's been advances in replacing sigmoids with other functions and so forth.
These are all kinds of advances, but the fundamental architecture hasn't changed and the fundamental
limit hasn't changed.
What I said then is kind of still true.
Here's a second example.
I recently had a piece in Wired that's adapted from the book.
The book went to press before GPT-2 came out, but we described this children's story and
all the inferences that you make in this story about a boy finding a lost wallet.
And for fun, in the Wired piece, we ran it through GPT-2,
at something called TalkToTransformer.com,
and your viewers can try this experiment themselves.
Go to the Wired piece that has the link and it has the story.
And the system made perfectly fluent text
that was totally inconsistent with the conceptual
underpinnings of the story.
This is what, again, I predicted in 1998, and for that matter, Chomsky and Miller made the
same prediction in 1963.
I was just updating their claim for a slightly new text.
Those particular architectures that don't have any built-in knowledge, they're basically
just a bunch of layers doing correlational stuff.
They're not going to solve these problems.
So 20 years ago, you said the emperor has no clothes.
Today, the emperor still has no clothes.
The lighting is better, though.
The lighting is better.
And I think you yourself are also, I mean...
And we found out some things to do with naked
emperors. I mean, it's not like this stuff is worthless. I mean, they're not really naked. It's more like they're in their briefs and everybody thinks that they're...
I mean, they are great at speech recognition. But the problems that I said were hard,
because I didn't literally say the emperor has no clothes. I said, this is a set of problems
that humans are really good at. And it wasn't couched as AI. It was couched as cognitive
science. But I said, if you want to build a neural model of how humans
do a certain class of things, you're going to have to change the architecture, and I stand
by those claims.
So, and I think people should understand you're quite entertaining in your cynicism, but
you're also very optimistic and a dreamer about the future of AI, too.
So you're both, you're just...
There's a famous saying about people overselling technology in the short run and
underselling it in the long run.
And so I actually end the book, or Ernie Davis and I end our book with an optimistic chapter,
which kind of killed Ernie because he's even more pessimistic than I am.
He describes me as a contrarian and him as a pessimist.
But I persuaded him that we should
end the book with a look at what would happen if AI really did incorporate, for example,
the common sense reasoning and the nativism and so forth, the things that we counseled for.
And we wrote it, and it's an optimistic chapter that AI suitably reconstructed so that we could
trust it, which we can't do now, could really be world-changing.
So, on that point, if you look at the future trajectories of AI, people have worries about negative effects of AI,
whether it's at the large existential scale or smaller, short-term scale of negative impact on society.
So, you write about trustworthy AI.
How can we build AI systems
that align with our values that make for a better world that we can interact with, that
we can trust?
The first thing we have to do is to replace deep learning with deep understanding. So
you can't have alignment with a system that traffics only in correlations and doesn't
understand concepts like bottles or harm. So Asimov talked about these famous laws, and the first one was, first, do no harm.
And you can quibble about the details of Asimov's laws, but we have to, if we're going to build
real robots in the real world, have something like that.
That means we have to program in a notion that's at least something like harm.
That means we have to have these more abstract ideas that deep learning is not particularly
good at. They have to be in the mix somewhere. You could do statistical analysis
about probabilities of given harms, but you have to know what a harm is in the same way
that you have to understand that a bottle isn't just a collection of pixels.
And also, you're implying that you need to also be able to communicate that to humans. So the AI systems would be able to prove to humans that they understand that they know
what harm means.
I might run it in the reverse direction, but roughly speaking, I agree with you.
So we probably need to have committees of wise people, ethicists and so forth.
Think about what these rules ought to be.
We shouldn't just leave it to software engineers. It shouldn't just be software engineers and it shouldn't just be
people who own large mega corporations that are good at technology.
Ethicists and so forth should be involved. But there should be some assembly of wise people as
I was putting it that tries to figure out what the rules ought to be, and those have to get translated into code. You can argue about whether it's code or neural networks
or something. They have to be translated into something that machines can work with.
And that means there has to be a way of working the translation. And right now we don't.
We don't have a way. So let's say you and I were the committee and we decide that
Asimov's first law is actually right. And let's say it's not just two white guys, which would
be kind of unfortunate, and we have a broad, representative sample of the world, or however
we want to do this. And the committee decides eventually, okay, Asimov's first law is actually pretty
good. There are these exceptions to it. We want to program it in these exceptions. But let's start
with just the first one and then we'll get to the exceptions. First one is first do no harm. Well, somebody
has to now actually turn that into a computer program or a neural network or something. And
one way of taking the whole book, the whole argument that I'm making, is that we just don't
know how to do that yet. And we're fooling ourselves if we think that we can build trustworthy AI
if we can't even specify, in any kind of,
we can't do it in Python, and we can't do it in TensorFlow.
We're fooling ourselves in thinking
that we can make trustworthy AI,
if we can't translate harm into something that we can execute.
And if we can't, then we should be thinking really hard,
how could we ever do such a thing?
Because if we're going to use AI in the ways
that we want to use it to make job interviews or to do surveillance, not that I personally want to do that or
whatever. I mean, if we're going to use AI in ways that have practical impact on people's
lives or medicine, it's got to be able to understand stuff like that.
So, one of the things your book highlights is that, you know, a lot of people in the deep
learning community, but also the general public, politicians,
just people in all general groups and walks of life have different levels of misunderstanding
of AI.
So, when you talk about committees, what's your advice to our society?
How do we grow?
How do we learn about AI such that such committees could emerge?
Or large groups of people could have a productive discourse about how to build successful AI systems.
Part of the reason we wrote the book was to try to inform those committees. So part of the reason we
wrote the book was to inspire a future generation of students to solve what we think are the important
problems.
So a lot of the book is trying to pinpoint what we think are the hard problems where we
think effort would most be rewarded.
And part of it is to try to train people who talk about AI, but aren't experts in the
field to understand what's realistic and what's not.
One of my favorite parts in the book is the six questions you should ask
anytime you read a media account. So, like, number one is, if somebody talks about something,
look for the demo. If there's no demo, don't believe it. Like a demo that you can try. If you can't
try it at home, maybe it doesn't really work that well yet. So we don't have this example in the
book, but if Sundar Pichai says, we have this thing that allows it to sound like human beings
in conversation, you should ask, can I try it?
And you should ask how general it is.
And it turns out at that time,
I'm alluding to Google Duplex when it was announced,
it only worked on calling hairdressers,
restaurants, and finding opening hours.
That's not very general.
That's narrow AI.
And I'm not gonna ask your thoughts about Sophia.
But yeah, I understand that's a really good question
to ask of any kind of
hyped up idea.
So Sophia has very good material written for her, but she doesn't understand
the things that she's saying.
So a while ago, you've written a book on the science of learning,
which I think is fascinating, with the learning case study of playing guitar, called Guitar
Zero. I love guitar myself. I've been playing my whole life. So let me ask a very important question. What is your favorite rock song to listen to or try to play?
Well, those would be different, but I'll say that my favorite rock song to listen to is probably All Along the Watchtower, the Jimi Hendrix version.
The Jimi Hendrix version feels magic to me.
I've actually recently learned it. I love that song. I've been trying to put it on YouTube,
myself singing. Singing is the scary part.
If you could party with a rock star for a weekend,
living or dead, who would you choose?
And pick their mind, it's not necessarily about the party.
Thanks for the clarification.
I guess John Lennon's such an intriguing person,
and I mean, I think a troubled person, but
an intriguing one.
So, beautiful.
Well, Imagine is one of my favorite songs.
So, also one of my favorite songs.
That's a beautiful way to end it.
Gary, thank you so much for talking to me.
Thanks so much for having me. Thank you.