That Neuroscience Guy - The Neuroscience of Artificial Intelligence
Episode Date: July 1, 2022. Whether you know it or not, artificial intelligence is probably a part of your daily life. It may be the face recognition in your phone, or a vacuum that learns the layout of your house on its own. But how does an artificial brain compare to a human brain? In today's episode of That Neuroscience Guy, we discuss the neuroscience behind creating artificial intelligence.
Transcript
Hi, my name's Olaf Krigolson, and I'm a neuroscientist at the University of Victoria.
And in my spare time, I'm that neuroscience guy.
Welcome to the podcast.
Kia ora, and greetings from New Zealand again.
Sorry this episode's a little bit late, but I'm on vacation and I've just been taking a bit of time off.
But this is the episode for Sunday last week. We'll get it up and we'll get a bite up and we'll try
to keep it on track until I'm back in Canada. You know, since I've been here, I've been thinking a
lot about, you know, topics that might be interesting to people. And I wanted to delve
into the idea of the neuroscience of artificial intelligence.
Now, you've all used artificial intelligence, whether you know it or not.
For instance, if your phone recognizes your face, that's a form of artificial intelligence.
And if you've played video games, in those video games, whether it's something as simple as tic-tac-toe or something complex like Halo,
there is AI there. The computer is able to control
things and it makes decisions. And this even spills out into the real world with robots.
For instance, you might own one of those vacuum cleaners that vacuums your house for you.
How does it do that? And how does this relate to neuroscience? Well, the reason AI is cool from a neuroscience perspective is that a lot of AI is based on how humans learn and how humans make decisions. And there's actually
a field called computational neuroscience, where the whole point of the field is to sort of
try to figure out the mathematical patterns that your brain is computing to learn and make
decisions. So on today's podcast, the neuroscience of artificial intelligence,
and specifically how AI learns and how AI makes decisions,
and how that relates to the human brain.
So the origins of AI can kind of be tied to the computer.
The first mechanical computer was designed by Charles Babbage in 1822.
It used gears and similar mechanisms to perform simple computations.
And this was accelerated, you know, as we headed into World War II specifically.
And you might have heard stories about the code breaking that went on at Bletchley Park in the UK, where code-breaking machines were used to break German U-boat codes, allowing the Allies to avoid having ships sunk.
But World War II also generated the first programmable electronic computer: ENIAC, built between 1943 and 1945. But really, the history of AI starts a little bit later than that, because these computers
were essentially just calculators in a sense. But machines that think sort of evolved in the 50s,
and particularly credit for the term artificial intelligence is given to Marvin Minsky and John
McCarthy at Dartmouth College, who were sort of the fathers of the field of artificial intelligence.
And they were trying to come up with ideas about how machines could behave in ways like animals and humans do.
So an example of this emerged in the late 90s.
Now we're fast-forwarding quite a bit.
And this was the Deep Blue computer built by IBM that actually played Garry Kasparov,
the world's reigning chess champion, and played him twice.
The first time in 1996, Kasparov beat Deep Blue 4-2, but a rematch was scheduled the
following year in 1997, and Deep Blue beat Garry Kasparov three and a half games to two
and a half games.
It was the first defeat of a reigning world champion by a computer under tournament
conditions. So how did AI work in this case? And what was Deep Blue actually doing? Well,
Deep Blue relied on what are called lookup tables. And a lookup table is fairly straightforward.
The concept of a lookup table is essentially given the state of the board. And let's just
use chess for this example. And if you don't know
chess, you could think of tic-tac-toe, but given the state of the board, so the pieces on the board
and where they are, what is the move the computer should make? All right, and a very simple lookup
table would be, you know, if you think of tic-tac-toe, if x takes a top left corner,
o could take the bottom right corner. So the lookup table would say if X goes there first,
this is where I go.
And lookup tables can be more sophisticated: instead of a single response, the table might say that if X goes in this location, O randomly chooses one of two locations, say the middle square or the bottom right square.
And this is true of chess as well.
So what Deep Blue was doing was effectively it was looking for the state of the board that Kasparov had put it in.
And so the pieces on the board were all in positions.
And then Deep Blue was designed to make a move based on that.
So it would say, given Kasparov doing this, I'll do this.
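To make the lookup-table idea concrete, here's a minimal sketch in Python for the tic-tac-toe case. The board layout, the specific entries, and the `choose_move` helper are all invented for illustration; real chess engines like Deep Blue used far larger tables plus deep search.

```python
# A toy lookup-table "AI" for tic-tac-toe: board state in, move out.
# Squares are numbered 0-8, left to right, top to bottom.
import random

# Each key is a board state (a tuple of 9 cells); each value is the
# list of moves the computer may respond with.
lookup_table = {
    # If X opens in the top-left corner (square 0), O takes bottom-right (8).
    ("X", "", "", "", "", "", "", "", ""): [8],
    # A more sophisticated entry: randomly pick one of two replies,
    # here the centre (4) or the top-left corner (0).
    ("", "", "", "", "", "", "", "", "X"): [4, 0],
}

def choose_move(board):
    """Return the computer's move for a known board state."""
    options = lookup_table[tuple(board)]
    return random.choice(options)

print(choose_move(["X", "", "", "", "", "", "", "", ""]))  # -> 8
```

The key property, and the key limitation, is that the table has to contain an entry for every board state the program might face.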
The modern version of AI, if you stick with this kind
of thing, is Google's DeepMind project. And Google's DeepMind doesn't rely on lookup tables.
It uses a combination of what's called deep learning and neural networks to basically learn
how to solve problems. And in the gaming world, there's a couple of great examples of that. Basically, DeepMind was asked to beat the entire library of Atari video games, and it didn't know anything about the games. It was just told, you know, this is a win, this is a loss. And by losing a whole bunch of times, it figured out
the moves to make or how to move the joystick, if you will, to be able to play the Atari games.
And after a period of training, the AI, in other words, Google DeepMind, could actually play the games better and more
efficiently than humans. And in a case that was in the news more recently, in 2017, a specialized form of DeepMind called AlphaGo bested the number one player in the world in the game Go, which is an ancient Chinese board game. And the reason Go is such an important challenge is that the number of possible board positions in Go is greater than the number of atoms in the observable universe. So solving
Go was considered to be an almost impossible problem for AI, but AlphaGo was able to do that using the aforementioned
learning algorithms and neural networks. So for the second half of this, that's a bit of a history
lesson. For the second half of this, I want to sort of talk briefly about how these things work.
And these are concepts that we've sort of covered before, but I want to frame this in the artificial
intelligence world. So if you think of your vacuum cleaner, that little robot vacuum cleaner that learns how
to vacuum your house, well how does it actually work?
How does it learn what to do?
The way it learns is actually pretty straightforward and it relies on what's called reinforcement
learning.
And we've talked about reinforcement learning before and how humans learn back in season
one, but basically if you want to think of the reinforcement learning example with the vacuum cleaner,
basically it just starts driving. And if it moves forward and it doesn't hit anything,
it says, well, this is a good little bit of space here. And it assigns that a value. And
reinforcement learning relies on what are called prediction errors, if you remember.
You take what actually happens and what you expect to happen, and you compute a prediction error of the difference.
So if the robot doesn't expect to hit something and it doesn't hit something, there's no prediction error.
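That computation is small enough to sketch directly. This is a minimal illustration, assuming the simple values used in this example (0 for open space, minus 1 for hitting something) and an arbitrary learning rate of 0.5:

```python
# A prediction error is just what happened minus what was expected.
def prediction_error(outcome, expectation):
    return outcome - expectation

# The robot expected open space (0) and found open space (0):
print(prediction_error(0, 0))    # -> 0: no surprise, nothing to learn

# The robot expected open space (0) but hit a wall (-1):
delta = prediction_error(-1, 0)  # -> -1: a negative prediction error
value = 0.0                      # current value of moving forward here
value = value + 0.5 * delta      # revise the value down by a fraction
print(value)                     # -> -0.5
```

A prediction error of zero means nothing to learn; a nonzero one nudges the stored value up or down.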
So when the robot moves forward and it doesn't hit anything, it actually goes, well, this is good, I haven't hit anything.
So it assigns a value to moving forward from that particular location. So the way you can visualize this is imagine if you took your living room and you drew it on a piece of graph paper, all right?
And you could outline where the walls are, you could outline where the furniture is, and then
you're going to have a whole bunch of empty squares, which is the space that needs to be
vacuumed. So when the robot moves into a square where there's nothing, it vacuums away happily, and it says, well, this is an okay place to be. Now the robot's also keeping track of the squares that
it's visited because it doesn't want to sort of vacuum the same space over and over again.
Now, what happens when the robot hits something? Well, it basically treats that as a punishment. So again, in terms of prediction errors, basically what happens is the robot says, well, moving forward from the space I was just in is bad, because I've hit a wall, and it basically has a negative prediction error. So it didn't expect to hit a wall, but it hit a wall, so it assigns a
negative value to moving forward in the square
where the wall is. And that's what the robot does, is it just keeps driving around, bumping into
things, and anytime it hits something, it assigns that specific location a negative value. And if
you imagine doing this on a piece of graph paper, what you would come up with, if we just kept the
numbers really simple, is anywhere where there was a wall would be a minus one or a piece of furniture, and anywhere there's open space there would be a one.
And what the robot's learning to do then is just to move across all of the ones.
It's going to cover all of the ones, and the really smart robots will do a bit of math
to figure out what's the best pattern to do this.
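The graph-paper picture can be sketched as a small grid of numbers. The room layout, the values, and the `vacuum` helper below are all invented for illustration; a real robot vacuum's map is much finer-grained than this:

```python
# A living room as a grid: -1 for walls and furniture, 1 for open space
# that still needs vacuuming, and 10 for the charging station.
room = [
    [-1, -1, -1, -1, -1],
    [-1, 10,  1,  1, -1],
    [-1,  1, -1,  1, -1],   # a piece of furniture in the middle
    [-1,  1,  1,  1, -1],
    [-1, -1, -1, -1, -1],
]

def vacuum(room, row, col):
    """Vacuum one square: a 1 gets pushed down to 0 once it's covered."""
    if room[row][col] == 1:
        room[row][col] = 0

vacuum(room, 1, 2)
print(room[1][2])  # -> 0: that square is now done
print(sum(cell == 1 for row_vals in room for cell in row_vals))  # -> 6 squares left
```

The robot's whole job then reduces to covering the remaining ones while steering around the minus ones.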
The cheaper ones will just drive around randomly until they've literally covered all the ones, and they might have to retrace their steps a few times, but they're just going to keep looking for ones. And when the robot covers a one, you can imagine it sort of pushes it to zero, but at the same time it's avoiding those minus ones; it doesn't want to hit walls or furniture. And this is why, if you move your furniture around, your robot vacuum will get confused, because the new furniture location might have been where a bunch of ones were, and now all of a sudden it hits them, so it assigns it a minus one. And this is how the robots adapt: they're always computing these prediction errors every time they vacuum. Now, for a final note on this: how does it get back to the starting point? Well, the way it gets back to the starting point is it assigns a very high value, say a 10, to the charging station. So once it's
done vacuuming, all right, it's covered all the ones, or it's running low on battery power, because it's keeping track of its own charge level, well, then it just navigates through that
space looking for that 10. And again, a bit of math would allow it to compute the shortest path
to the 10 that avoids all those minus ones and gets it back to
the starting location. Now, this is also true of how computers and artificial intelligence learns
to play games. So how does AI learn to play a game? Well, quite simply, it uses this same sort
of logic. So imagine a computer is learning to play tic-tac-toe.
Well, the computer will just move randomly, all right?
So it's just going to pick random things in the early stages because it doesn't know anything about whether moves are good or bad,
and it doesn't have those lookup tables that we talked about.
But what it's going to do is when it wins a game,
it's basically going to say,
well, hey, what were the moves I made that game that got me a win?
And it's going to assign positive values or ones to the moves that helped it win.
And when it loses a game, it's going to look at the moves that cost it the game,
and it's going to assign minus ones to those moves.
So it's actually creating its own lookup tables.
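A minimal sketch of what building that table might look like: after each game, every move that was played gets nudged toward plus one after a win and minus one after a loss. The `update_after_game` helper, the learning rate of 0.1, and the example game are all assumptions made for illustration:

```python
from collections import defaultdict

# (board_state, move) -> learned value, starting at 0 for unseen pairs.
move_values = defaultdict(float)

def update_after_game(moves_played, won, learning_rate=0.1):
    """Reinforce or punish every move made in the game."""
    outcome = 1.0 if won else -1.0
    for state, move in moves_played:
        # prediction error: outcome vs. the value we currently expect
        delta = outcome - move_values[(state, move)]
        move_values[(state, move)] += learning_rate * delta

# One imaginary game: the learner opened in the centre (square 4) and won.
game = [(("", "", "", "", "", "", "", "", ""), 4)]
update_after_game(game, won=True)
print(move_values[(("", "", "", "", "", "", "", "", ""), 4)])  # -> 0.1
```

Run over thousands of self-play games, values like these converge toward which moves tend to win and which tend to lose, which is exactly a self-taught lookup table.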
So as opposed to being told the lookup table, by playing itself a ton of times it will learn which moves led to
wins and which moves led to losses using these prediction errors. So just the same way that
humans learn. This is what we think happens in the human brain. You know, and I used this example back in season one, but imagine you're a student and you wrote an essay and, you know, you thought you got an 80 on your essay, but you got a 60.
This is a negative prediction error because the outcome is 60 and the expectation was 80.
60 minus 80 is minus 20.
And that prompts a change in behavior.
This is how reinforcement learning works.
And this is how artificial intelligence learns to do things
And I'll finish with one more example of this, and that is TD Gammon. I've always loved the TD Gammon story. TD Gammon was a computer program written by Gerald Tesauro, and basically he wanted a computer to learn to play backgammon. And if you're not familiar with backgammon, you could quickly Google it, but basically the idea is you have pieces on a board and your opponent has pieces, and you roll dice and move your pieces around the board to get to the other side. And your opponent's trying to do the same thing, and if your opponent lands on you, it knocks your piece off the board and you have to get back on, and vice versa. But it's a fairly complex game. Anyway, what Tesauro did is he just used
reinforcement learning and he had the computer play itself millions of times over and over and
over again. But every time it won a game, it would say, well, what moves did I make that allowed me
to win the game? And it would assign positive values to those moves. And every time it lost
a game, it would assign negative values to the moves that it made. Now, I'm sort of shortcutting this a bit for simplicity's sake, because what it actually does is, after any given move, it's going to compute a prediction error. And if it's getting to a state or a board position that's closer to winning the game, then it's going to compute a prediction error saying, hey, things are better than expected. And if it moves into a state that eventually leads to a loss, it's going to know that as well and compute negative prediction errors.
So it doesn't actually do it all right from the end.
It's doing it step by step as it goes through.
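That step-by-step idea is the classic temporal-difference (TD) update, which is where TD Gammon gets its name. Here's a minimal sketch; the state names, values, learning rate, and discount factor are all made up for illustration:

```python
# After each move, compare the value of the new position with the value
# of the old one, and treat the difference as the prediction error.
def td_update(values, state, next_state, reward, alpha=0.1, gamma=1.0):
    """Nudge the old state's value toward reward + value of the next state."""
    delta = reward + gamma * values[next_state] - values[state]  # prediction error
    values[state] += alpha * delta
    return delta

# Two board positions: moving from a middling position into one we
# already believe is strong produces a positive prediction error.
values = {"mid_game": 0.2, "strong_position": 0.8}
delta = td_update(values, "mid_game", "strong_position", reward=0.0)
print(round(delta, 2))               # -> 0.6: things are better than expected
print(round(values["mid_game"], 2))  # -> 0.26: that position now looks better
```

So the program doesn't wait for the final win or loss; every single move teaches it something about the position it just left.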
But hopefully you get the basic concept that when these artificial intelligence agents make moves,
they compute prediction errors and they use that to change the values for the choices they're making.
In the case of TD Gammon, you know, how it should move its pieces given a given board
position and a dice roll. But the cool thing about TD Gammon, and the reason I love this story,
is when they were testing it, it was playing World Champions at backgammon, and at one stage
it made a move that a human wouldn't have made in that situation. So basically,
given a certain board position and a certain dice roll, it made a move that humans wouldn't have
done. The experts were saying, well, it's made a mistake. That's not the move to do.
Well, it turns out, after much play, humans realized that the move that TD Gammon had come
up with was actually better than what humans thought they should do in that situation.
And it had found that just by trial and error learning or reinforcement learning.
So the AI had created something new.
And that's a key point I want to finish on,
is that artificial intelligence does have the ability to create new things
because it's just going to try random combinations of things until it works.
And you could take that lesson from TD Gammon and put that to music where computers can now
generate music, because they know what good music is, because we tell them that. And if it generates a random piece of music that people don't like, it punishes the notes that it's chosen, those kinds of things. And if it generates something we do like, it reinforces them. And now you've got computer algorithms that can make unique music. And this
is true of almost anything, but the key concept here that underlies this is prediction
errors or reinforcement learning. Now that's the end of part one about the neuroscience of
artificial intelligence. And I've told you a little bit about how AI learns,
and it follows those reinforcement learning principles that humans use as well.
On the next episode, I'm going to come back and talk about how artificial intelligence
makes decisions, and that will involve what are called neural networks.
Thank you so much for listening. Please subscribe to the podcast.
If you have ideas, you can follow me on Twitter, at ThatNeuroscienceGuy.
Just DM me. Say, hey, what about an episode on this?
We're planning the last couple episodes of Season 3,
and then we're going to take a bit of time off and plan Season 4.
But please send us your ideas.
And of course, there's our website, ThatNeuroscienceGuy.com.
Thank you so much to those of you that are supporting us on Patreon and buying t-shirts.
All of that money is going to graduate students in my lab to help them with their studies.
My name is Olaf Krigolson, and I'm that neuroscience guy.
Thank you so much for listening, and I'll be back to you shortly with another Neuroscience Bite and another episode of the podcast.