Radiolab - The Alien in the Room
Episode Date: December 12, 2025

It's faster than a speeding bullet. It's smarter than a polymath genius. It's everywhere, but it's invisible. It's artificial intelligence. But what actually is it? Today we ask this simple question and explore why it's so damn hard to answer.

Special thanks to Stephanie Yin and the New York Institute of Go for teaching us the game; to Mark, Daria, and Levon Hoover Brauner for helping bring NETtalk to life; and a huge thank you to Grant Sanderson for his unending patience explaining the math of neural nets to us. To learn more about how these "thinking machines" actually think, we highly recommend his wonderful YouTube channel 3Blue1Brown (https://www.youtube.com/watch?v=aircAruvnKk).

EPISODE CREDITS: Reported by - Simon Adler. Produced by - Simon Adler. Original music from - Simon Adler. Sound design contributed by - Simon Adler. Fact-checking by - Anna Pujol-Mazzini.

Sign up for our newsletter! It includes short essays, recommendations, and details about other ways to interact with the show. Sign up (https://radiolab.org/newsletter)!

Radiolab is supported by listeners like you. Support Radiolab by becoming a member of The Lab (https://members.radiolab.org/) today. Follow our show on Instagram, Twitter and Facebook @radiolab, and share your thoughts with us by emailing radiolab@wnyc.org.

Leadership support for Radiolab's science programming is provided by the Gordon and Betty Moore Foundation.
Transcript
Hey, it's Latif.
One of the reasons I love working at this show so much
is that at the end of the year,
I can look back through the episodes
that we made in the last month
and I'll find surprising patterns.
I was like, oh, I guess we were really obsessed
with the human voice this year?
We did at least three episodes about it,
one about the biological evolution of the voice,
one about the technological history of text to speech.
And then, probably the one we're proudest of: in September, we did an episode
featuring the genius-grant-winning writer and activist Alice Wong. And she specifically was
talking about this valve attached to her tracheotomy tube that allowed her to speak. Sadly, Alice passed
away two months later in November. Monumental loss. We feel so honored that one of the last
things she did with her voice was to do a story with us. One of the other things we were not so
secretly thinking about all year were the cuts to public media. For our station, WNYC, that meant a
nearly $3 million a year hole. If we want to keep going, we got to fill that hole. So if you've
heard and been moved by what we put out in the last year, maybe it was the story of the quantum
physicist Kossum trying to survive in Gaza, or the Dr. David in Philly trying to use AI to repurpose
generic medicines, or the astronomer Charity, studying how galaxies die as she's mourning her own
family. We have laughed and cried and had our minds blown together. If you want us to keep doing
that and to help keep it free for everyone, please chip in. If you make a year-end donation now,
we have three new thank you gifts you can choose from, all of which were designed by the great
Jared Bartman. A laptop slash bumper sticker, which we'll send you if you contribute any amount.
A t-shirt, I got to say, it's like the most audio nerd t-shirt.
People will think you cut Radio Lab episodes in Pro Tools.
Lastly, a jigsaw puzzle featuring the art of one of my favorite stories this year,
The Age of Aquaticus.
Check them all out and make a year-end donation at radiolab.org slash donate.
Or if you want to provide ongoing support for Radio Lab, you can join the lab.
This month we'll send you that new t-shirt, as well as all the membership perks,
bonus content and more.
You can find more information
about joining the lab
also at RadioLab.org
slash donate.
So one more time.
Thank you, thank you.
Thank you.
And now on to the episode.
Wait, you're listening.
Okay.
All right.
Okay.
All right.
You're listening to Radio Lab.
Radio Lab.
From W. N. Y.
C.
See?
Yeah.
Okay, after all of that, it is time to finally discuss.
Latif, the question, the topic, the theme of the moment, perhaps.
Climate change.
No, nobody cares about climate change, man.
Come on.
Simon!
Hey, I'm Latif Nasser.
This is Radio Lab, where despite what reporter-producer Simon Adler just said, we here at the show, including Simon, do care about climate change.
But we're here today to talk about a different huge overwhelming thing that we're all in the middle of.
I mean, I don't want to put words in your mouth, but what I have been feeling is a general sense of frustration.
Yeah, yeah.
Something that everybody's talking about, but nobody seems to actually understand.
You and I have even done interviews together with people on this stuff.
That's right.
Which is, of course, artificial intelligence.
So much of the coverage about this stuff right now is like this running debate, right?
where you've got people on one side saying,
these AI, you know, they think they are intelligent
and eventually they'll outsmart and destroy us all.
Right.
And then on the other side, you've got people being like,
no, they aren't actually intelligent.
They're just mimicking us.
And it's not as big a deal as everyone says.
Right.
And I don't actually know who to believe.
Yeah.
And I think it's because, like, I don't know what AI is.
Like, I don't know how it does what it does under the hood.
Yeah, because we don't know, right?
This is one of the most extraordinary things
about machine learning AI
is that we don't really know what they are.
But after reading countless articles,
talking to tech people and scientists,
I finally felt like I was getting at that question
when I talked to this guy.
Stephen Cave, I'm the director of the Leverhulme Centre for the Future of Intelligence.
He leads this sort of think tank
at the University of Cambridge.
And there's about 50 of us now
trying to understand these systems
using a really wide range of methods, including tests taken from animal psychology.
Tests designed to measure how well a mouse can problem solve.
And applying them to AI agents in order to understand, well, where are we in the kind of evolutionary, cognitive tree of life of AI?
And they've actually turned these tests into a sort of competition that they call the Animal AI Olympics.
Yes, indeed.
Okay.
Okay. Well, that just sounds fun.
Right. Yeah, exactly.
Yeah.
So to do this, they've created a slightly lower-resolution, Toy Story-looking digital world.
Okay.
Or maybe even more accurately, like if you know the game, Minecraft.
Oh, yeah, yeah, yeah, sure, sure.
It looks like that.
It's this three-dimensional space filled with all these different bright, primary colored objects.
Okay.
And then they take these AIs, which are running on basically the same kind of engine that powers ChatGPT.
And they give these things a little avatar, like a hedgehog or a pig or a panda.
And then they just sort of place them in this 3D world and say, there is food in here.
Find it.
So it has to, like, navigate the digital world to find...
I mean, like I assume it's not really food.
It's this green orb that they're looking for.
And I mean there are walls that they have to like figure out how to get around.
And there are transparent walls.
But it's like physical world problem solving.
Absolutely.
And I mean, well, this is the sort of task that mice or pigeons can pull off pretty easily.
For these AI agents, things like manipulating objects and understanding gravity, it's a real challenge.
Like they struggle to press a lever or perceive an edge.
Which any animal can do, or at least, you know, any mammal, say.
And so effectively, these systems don't have.
the common sense of a mouse, whereas higher reasoning, maths and so on, they can do a hell
of a lot better than humans can.
That's Moravec's paradox, right?
Like, easy things are hard and hard things are easy.
Exactly.
Yeah.
And like, we've known this for a long time and it's pretty obvious at this point.
But after running all of these AIs through this thing, dozens, hundreds of times, what Stephen
has seen over and over is that they have a completely different profile of capabilities and skills than any animal.
They are not like us.
No.
I mean, one of its capabilities might be convincing us it's human-like, but it isn't.
Well, okay, so then what is it like?
I mean, is the AI little tadpoles, or what is it?
Well, there is one metaphor that some people like to use, and that's the octopus.
You know, what's wonderful about the octopus is they are phenomenally smart.
They can use tools, for example, without being taught. They develop sophisticated tactics of all kinds; there are lots of wonderful octopus escape stories.
Well, wait, because that doesn't sound like AI at all.
No. Then why this metaphor? Well, it's helpful not because AI is like them, but because, in a way, it really shows how different intelligence can be. Okay. I mean, octopuses, their intelligence is distributed through their tentacles. He says, you know, we and all mammals have this one central brain.
But octopuses, they have nine little brains, one in the center, and then one in each limb.
So their tentacles can function much more independently, which is how they manage to have eight of them, all doing, like, clever things all at once.
And, you know, this kind of intelligence is fundamentally alien to us.
Hmm.
And that's a good way of looking at AI.
Alien.
Profoundly alien.
Which, on the one hand, makes this thing feel sort of unknowable, impossible to understand.
But then, on the other, well, this is an alien that did not evolve in some far-off galaxy or even the depths of the ocean.
Right.
Like, this is an alien we created year by year, transistor by transistor.
And so this is what we're doing today.
We are going to trace the evolution of this alien in our midst,
this alien that we designed, in the hopes, at least,
of like coming to some deeper understanding of what it actually is today.
And then maybe, if we're lucky,
that will give us some insight into this thing we are all,
almost certainly, going to have to face off with at some point or another.
So...
This is great.
Like, I feel like we all need this, we all need this explainer.
Great.
Uh, fill your glass, because here we go.
Hey, you guys can hear me.
Yes, I can hear you, Simon.
Hello, Terry.
How are you?
Very good.
Thank you.
Uh, sorry for the slight delayed start here.
Some classic technical difficulties, you know.
So there are a lot of different first contacts we could point to with this alien species.
But the most fun place to start that I found is with this guy, Terry Sejnowski.
Professor at the Salk Institute for Biological Studies.
Who, yeah, sort of like the midwife of AI.
Is that a helpful way to think of you or no?
Yes, yes, actually.
Well, it's obviously more complicated than that.
But that's not a bad analogy.
Terry trained as a neurobiologist.
He came up poking probes in monkeys' heads to try to understand how the brain works.
But then, in the mid-80s, he teamed up with some computer scientists, trying to make computers do animal brain-like things, like hear and recognize sounds or visuals.
But it was going nowhere because everything was based on rules at the time.
Like all computer programming at this point, it was this incredibly complicated set of, like,
if-this-then-that statements.
So if you see this and you see that, but you don't see that, then that means this.
This sort of web of logic, which when it comes to recognizing sounds or pictures was a problem.
Because for each rule, there are tens of thousands, 100,000 exceptions.
Just too many nuances in the rules to hard code in.
And so it was clear that this approach, this way of doing it through rules, was really hopeless.
And so, together with my friend and collaborator, Jeffrey Hinton,
he started to wonder if there was a different way to tackle this.
Learning.
And so, with a small group, with computers that were puny by today's standards, they set out to build a machine that could learn.
And one of the first things they tried to teach it was how to pronounce English.
You know, text to speech in computer science.
And amazingly,
Demonstration of Network Learning
by Terry Sejnowski and Charles Rosenberg.
They have recordings from these early training sessions.
Now, if you want to learn from experience,
you have to have lots of data.
And so, you ready?
Ooh, ah, blah.
Okay, sometimes, can you want, look at me.
They took a transcript of a kid talking,
a transcript I had my friend and neighbor Levon reenact.
When we walk from school, I go to my grandmother's house because he just has candy.
Nice. That's perfect.
You ready for the next one?
And then what Terry did was give the computer this text and then also gave it the exact phonemes,
like the symbols for the proper pronunciation for those words.
No rules, just actual pronunciations.
And then said to the computer, quiz yourself.
Like, go ahead and try and then compare what you tried to the correct pronunciation.
First recording, de novo learning.
And here it is.
Wow.
Right, so it has no idea what it's doing.
Yeah, not even close.
It doesn't sound like a baby either.
Like, that just sounds like glitched out.
It's chaos, right?
It's like noise effectively.
Yeah.
But then, as it continued quizzing itself,
comparing its output to what it should have said.
When I go to my cousins, I play badmits and all that.
Slowly, we could actually hear the learning.
You could hear it figuring out the difference between vowels and consonants.
And then it would start pronouncing small words, you know, oh.
We got to go to my kukon to get to one mile to yumbia by neuronum come down away.
We sleep.
And, you know, it only took a couple of days.
When we walk home from school, I like to go to my grandma's boss.
Well, because she gave us candy.
And it was acing it.
And we eat there sometimes.
And we eat there sometimes.
Sometimes we sleep over the night there.
Sometimes we sleep over a night there.
Sometimes.
When I go to go to my cousins, I get a play.
That led to all that.
But the really astonishing thing is that when they gave the program new words and new sentences that it had never seen before.
He won't stop Zumpinoran in the bathtub.
It pronounced those, too.
He repeats Zumpunerun gets tired when he goes to.
That when he finally gets us to see.
It was phenomenal.
Sometimes I get to go to bed at 1230.
Sometimes, but most of our times I don't.
What we didn't appreciate back then was that NETtalk was a little bit of 21st-century AI in the 20th century.
That this process of learning was the future.
Are we done?
We're done.
Thank you so much.
Well, okay, but like what actually happened there?
What is it doing?
How do you get a machine to learn that?
Well, take a baby human.
You know, it's born with this clump of gray stuff in its head,
which is really a bunch of neurons that are all connected in like a random messy way.
Oh, they are connected?
I just imagined a baby brain was like nothing was connected.
It was a blank slate.
No, when the baby emerges, the neurons are all connected.
They're just not connected in ways that make sense in terms of the world they've just popped into.
But then, when it gets some input, like it touches something hot, gets yelled at, gets cuddled, it starts to strengthen some of these connections and prune others back.
Okay.
Until you have this just unbelievably complicated network of connections that can recognize patterns in the world around it and, you know, know that this is a square or if you poke a cat, you get scratched.
That's right.
In the brain, you can adapt to your world that you happen to be in by changing the strengths of connections between neurons.
And so basically, Terry and others wanted to create some version of that in a machine.
Yeah, you hit it.
The models we were developing, these neural network models, were based on very simplified versions of brain circuits.
Okay, but how did you do that?
Like, what is going on under the hood here that allows it to do this?
Well, we understand mathematically how they work. And we're making progress now
at trying to translate the mathematics into something that humans understand. And so,
Latif, here is my best attempt to translate this for us humans. Okay. I mean, so just setting
aside all of the technical setup on like, how is it even interpreting the data or what are you
inputting? With the help of this guy, Grant Sanderson. Yeah. I run a YouTube channel that's named
3Blue1Brown. I often talk about math, but math-adjacent things as well.
Great. We're just going to draw like a mental image of what one of these networks looks like.
Okay. Let's go. Now, as we all know, these neural nets can do crazily complex things. But
for now, we are going to give one a very simple problem.
I'm going to draw a couple shapes. What shape is this? A circle.
Can we get a computer to see a circle? How about this?
A circle.
A very childlike task.
Yeah, sure.
Now, first things, first, to get an image into the computer,
we're going to chop it up into a bunch of pixels.
It's like a 10 by 10 grid of them.
Okay.
And we're going to imagine those pixels as 100 light bulbs.
One light bulb for every pixel and light bulbs that will be on
if their corresponding pixel is filled in with ink
and off if their pixel is empty.
Okay.
So you've got this circle of illuminated bulbs in this grid of bulbs that are off.
Okay, I can see it.
From there, for reasons that'll make sense in a minute, below that,
we're going to add a smaller grid of 10 light bulbs,
and then below that, just one bottom bulb.
At the top, 100 light bulbs, and then another layer, 10 light bulbs, another layer one bulb.
Exactly, and that final bulb, that is just the answer, the output that, when it turns on, says yes.
Circles?
Circles, that's right.
There's a circle here.
Okay.
But this last bulb, it's a little bit special.
It's not like the other bulbs in that it's actually on a dimmer.
So it can also answer like maybe a circle because it could be a square if it's kind of bright.
Or I'm pretty sure if it's pretty bright.
Or if it's all the way on, that means this is definitely a circle.
As a side note, yeah, this feels like quite the challenge where we're torturing the poor audience members here,
probably like on their drive and not able to allocate their visual cortex to like try to visualize all this.
But setting aside all of the technical terminology.
There's one last thing to do.
We have to wire all of these bulbs together so that electricity can like flow from that top grid through that middle grid down to that last bulb,
which will hopefully turn it on.
So we call up an electrician, we tell him go and connect every bulb in the top 100 to every bulb in the middle 10, and then go and connect every bulb in the middle 10 to that final bulb.
So literally every bulb is connected to every other bulb, basically.
Exactly.
So that electricity can flow down from any bulb that's lit up and kind of cascade through all of them.
Got it.
And so, the electrician starts pulling the wires, soldering, and they say, I'm done.
But the thing about this electrician is they're shit.
Like, they just do a terrible job.
Some of the wires that they put in are like a strong copper.
Others are just twine, so they can't even carry electricity.
And so when this is all said and done, this network we get is kind of like a fresh baby brain
with just random neurons clumped together.
Got it.
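For anyone who wants to see that wiring as code, here is a minimal sketch in Python with NumPy: 100 input bulbs, 10 middle bulbs, one answer bulb on a dimmer, and every layer randomly wired to the next. The names, sizes, and squashing function are illustrative choices, not anything specified in the episode.

```python
import numpy as np

rng = np.random.default_rng(0)

# The shoddy electrician: wire all 100 input bulbs to each of the 10 middle
# bulbs, and each middle bulb to the final bulb, with random strengths,
# some strong copper, some effectively dead twine.
w_hidden = rng.normal(size=(10, 100))  # wires: input grid -> middle bulbs
w_out = rng.normal(size=(1, 10))       # wires: middle bulbs -> final bulb

def dimmer(x):
    """Squash any amount of current into the 0-to-1 range of a dimmer bulb."""
    return 1 / (1 + np.exp(-x))

def network(image):
    """image: 100 zeros and ones, the 10x10 grid of bulbs flattened out."""
    middle = dimmer(w_hidden @ image)  # current cascades into the middle bulbs
    final = dimmer(w_out @ middle)     # ...and down to the one answer bulb
    return final[0]                    # brightness: 0 = no circle, 1 = circle

drawing = rng.integers(0, 2, size=100).astype(float)     # stand-in scribble
print(f"{network(drawing):.1%} likelihood of a circle")  # random, so likely wrong
```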
And so when we do send an image of a circle into it,
into the machine...
Hey, why are you using a microphone again?
To record your voice.
Lighting up some of the bulbs in that top grid.
What shape is this?
Uh...
The electricity passes down through these random connections
from the top to the middle, down to the bottom,
and in all likelihood...
Iron rectangle.
It's completely wrong on this.
That final bulb might be a little lit up or half lit up or just completely off.
Okay.
Now, when a child gets something wrong...
No, what is that?
And like a parent scolds them, that is altering the connections between the neurons in the brain.
Strengthening some, pruning others back, right?
Right.
And that is what we want to do with this machine.
We want to mess with those wires, the strengths of those connections between the bulbs.
Right, right, right.
Now, we could just go in there and rewire this thing by hand.
Yeah.
We could pick out the important bulbs because we know which ones are lit up for a circle
and direct their current through the middle bulbs to that final bulb.
But, you know, that would take just as long as hard coding it.
Right.
And so instead, we're going to give this thing the chance to learn all this,
to learn what the connections should be.
So, when it gives us that first random wrong answer,
There is a 12.2% likelihood of a circle in this image.
We're going to say bad robot.
There is absolutely a circle in this image.
Try again.
Okay, I will try again.
But then, after that first try, instead of us standing there saying yes or no,
we are going to set it up to learn all on its own.
We're going to step away and let math be its babysitter, be its teacher.
And so, this is the moment where we have to dive into the math a bit.
Uh, okay.
It's not that complicated.
It's mostly multiplication.
All right, okay.
Let's go.
First of all, these bulbs in the computer, they're really just numbers.
One, two, three, four, five.
And the wires, you can really just think of them as variables that multiply these numbers.
X times two.
As they pass through them.
Y times point three.
A good wire multiplies.
the electricity by five or whatever.
A bad one divides it in half or even zeros it out.
And that means we can just take this entire array of bulbs and wires
and turn it into a giant equation.
You know, A times B plus C times D plus E times F.
There's some other math strewn in there very artfully and deliberately.
But the key here is with a bit of mathematical trickery,
this equation can represent the difference between the output it is giving...
There is a 12.2% likelihood of a circle.
And the output we want it to give.
There is a 100% likelihood of a circle.
And if we think, hey, I've got this function and I want to find a minimum of that.
Like minimize the difference between your output and the output we want.
There's a whole field of math that is just built, ready to do exactly this kind of thing.
This is what calculus is all about.
Like, Newton, if he was rising from the grave, would just be, like, shooting fireworks right now, saying, hey, I got this.
I know how to do this one.
So somehow the calculus tells you, in math equation form, if you're getting closer to the right answer?
Yeah.
And now, don't worry.
We're not going to go into the calculus other than to say, we walk away and the calculus becomes the teacher.
Okay.
So.
12% likelihood.
After the first wrong answer, the equation says, no.
Machine tries again.
25% likelihood.
And the equation says closer, and the machine tries again.
77% likelihood.
And each time it tries, it messes with the wiring, the weight of the connections between the bulbs.
Getting it closer and closer to right.
Exactly.
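Continuing that sketch, here is roughly what the calculus-as-babysitter step can look like: plain gradient descent on the squared difference between the answer bulb and the answer we wanted. The learning rate and step count are arbitrary choices, and real systems use fancier variants of this same idea.

```python
def train_step(image, target, lr=0.5):
    """One round of quiz-yourself: flow current forward, then nudge every
    wire downhill on the squared error, the derivatives doing the teaching."""
    global w_hidden, w_out
    middle = dimmer(w_hidden @ image)
    out = dimmer(w_out @ middle)
    error = out - target  # e.g. 0.12 - 1.0 when it misses a circle
    # Derivatives of the "giant equation" with respect to each set of wires:
    grad_out = 2 * error * out * (1 - out)
    grad_mid = (w_out.T @ grad_out) * middle * (1 - middle)
    w_out -= lr * np.outer(grad_out, middle)    # tweak middle -> final wires
    w_hidden -= lr * np.outer(grad_mid, image)  # tweak input -> middle wires
    return out[0]

for _ in range(200):  # "try again... closer... try again"
    likelihood = train_step(drawing, target=1.0)
print(f"after training: {likelihood:.1%} likelihood of a circle")
```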
And what happens over time is that that middle grid of 10 bulbs, their connections back to that top grid are getting tweaked in such a way that it's like they're starting to pick up clues.
Like, maybe it's getting stronger signals from bulbs that are part of a curve.
Or maybe it figures out that the corner bulb can't be on for it to be a circle.
And, like, the thing is, we actually don't know.
I mean, when people talk about these things being a black box, this is what they mean.
It's this middle grid.
It's all automated by math.
It's picking up something.
And we might think it...
You don't know what the clues are, we just know that they're right, that the clues, like, work.
Yeah, it's finding some signal that tells it there is a circle-ish thing here. And as it
keeps giving answers and the equation
keeps telling it whether it's right or not
or closer or further away,
eventually
each of those middle bulbs
is receiving the right electricity
from the right top bulbs to know
if these characteristics of a circle
are there.
And if they are, they pass that along to the final light bulb, which will light up if enough of those characteristics are present.
And at that point, yeah, our little network here has learned to recognize this circle, which...
That's actually kind of astonishing.
That's pretty amazing.
It is, but it's only this one circle.
And so the important thing is that if you do this process not with just this one circle, but with tens, hundreds, thousands of examples.
You know, big circles, little circles, messy circles,
circles drawn by you and me,
and you have the machine tweak all those different wires
for all those different examples.
You can then take all of that and do one final,
actually very simple bit of math.
Just average it all together. All of the wire strengths you got from all the examples for wire one get averaged down to one value. All the wire strengths that you got for wire two get averaged down to one value.
And if you've done this right, you can then send in any of the drawings it's seen
before or new drawings it's never seen.
Circles drawn by a two-year-old or a picture of an orange.
And it will say, yes, there is a circle there.
Holy cow.
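And a sketch of that many-examples loop. The episode describes combining examples by averaging the wire strengths; the more common approach, same in spirit, is to keep nudging one shared set of wires a little for each example in turn, which is what this does. The drawing helpers are hypothetical stand-ins for real labeled pictures.

```python
def draw_circle(radius):
    """Light the bulbs lying on a circle of this radius on the 10x10 grid."""
    ys, xs = np.mgrid[0:10, 0:10]
    ring = np.abs(np.hypot(xs - 4.5, ys - 4.5) - radius) < 0.8
    return ring.astype(float).ravel()

def draw_square(half_side):
    """Light the bulbs on a square outline, our 'not a circle' examples."""
    ys, xs = np.mgrid[0:10, 0:10]
    edge = np.maximum(np.abs(xs - 4.5), np.abs(ys - 4.5))
    return (np.abs(edge - half_side) < 0.5).astype(float).ravel()

# Big circles, little circles, squares, each paired with the right answer.
examples = [(draw_circle(r), 1.0) for r in (2.0, 2.5, 3.0, 3.5, 4.0)]
examples += [(draw_square(s), 0.0) for s in (2.0, 2.5, 3.0, 3.5, 4.0)]

for _ in range(500):  # many passes, accumulating all those little tweaks
    for image, target in examples:
        train_step(image, target)

print(f"unseen circle: {network(draw_circle(2.2)):.0%}")  # hopefully high
print(f"unseen square: {network(draw_square(2.2)):.0%}")  # hopefully low
```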
Now, that process we just went through can recognize way more sophisticated things than just a shape, like cats or dogs.
And I mean, the only real difference in the model is instead of these three grids we just used, these three layers, you know, an input, a middle, and an output, you just add more layers of bulbs in the middle.
These multiple middle layers allow the computer to recognize progressively more complicated components of the picture.
So, like, the first layer might just find the edges.
The second might find textures.
The third, forms.
The fourth, maybe eyeballs.
Because it's like everything is made up of building blocks of the layer before it?
Yes.
Without, crucially, without anyone labeling any of those intermediate, like, it's figuring that out itself.
Exactly. And then using the same mathematical reinforcement, it can tune and tweak
to get shit right.
Okay, wow. Well, I need a drink after all of this, to sort of let it all settle in.
Great.
Okay, like, this is... I wish my kids could learn like this. Like, the way they learn is so physical, so emotional. It matters who's saying it. It matters how they're saying it. It matters the tone. It matters all these different things. Like, this is so clean, and, like, crazy fast.
I mean, what just took us 10, 15 minutes to explain, that all happens in seconds.
So it can learn the circle thing at basically lightning speed. But, like, recognizing a circle is one thing, sure. And now we're talking, like, actually making, you know, a sonnet as if Shakespeare wrote it. That seems like a very wide gulf. It seems
like there's still a lot of places to go.
For sure.
And our little alien is going to have to evolve here.
Yeah.
But in terms of its architecture, of how it does this, it's basically the exact same.
Huh.
The only real difference is we're shifting its focus from recognizing to a slightly different skill.
And we're going to get to that.
You want to predict what I'm going to say next?
Right after a quick break.
Exactly. Right after a quick break.
So, you asked this question to me before the break.
Like, how did this thing evolve from being able to recognize shapes to generate stuff?
Yeah.
And I posed that very question to Grant Sanderson.
Okay.
Yeah.
Okay.
So I would say there's many different ideas at play here.
Who, again, is the YouTuber who's thought a hell of a lot about this stuff.
And he says, the important next step is to realize that, yes, you could think of what we did with those circles as having the machine
recognize them, or you could say we were asking the machine to predict the answer we wanted.
Like with the circle example, there's two things that it could predict. Circle or not.
Okay. So it's not anything meaningfully different. It's just like, let's just call everything a
prediction. Right. But it becomes important when we're talking about generative stuff.
Okay. Like in the case of language, predict what word comes next. So to explain, going all the way back
to the 80s, IBM began playing around with these chatbots that you could type to, and it would
respond.
Hello there.
How are you today?
And the way it would do what it was doing was it would take every word that you typed
in as your question, turn those words into numbers.
We're not going to go into how, because that would take an hour in and of itself, but turn those
words into numbers, send it through this multi-layered set of bulbs.
But in this case, those bulbs, those layers it's passing through, they haven't been trained to categorize a sentence.
Like, we don't want it to say that was a question.
Instead, it has been trained to spit out the word that is most likely to come next, to predict the most likely next word.
Just one word.
It's not even a word.
Also, there's a nuance here between the notion of words and tokens, but that's excessive nuance.
Yeah, but it's like, what is it even basing that on? Like, how is it predicting that? With a circle, you know it's a circle. We know the right answer. We're giving it the right answer. It's calculating back to that right answer.
Right.
But, like, uh, a sentence could go any million number of ways. How can it ever have a right answer to train back to?
Well, so what IBM was doing was giving it a bunch of text: books, transcripts, conversations, feeding that into this machine. And so then the right answer was the most likely word to follow the preceding words.
So it's like, it's just like
here's a giant stack
of human talking and
in this giant stack
what's the most likely
thing that would have been said
next in this
exact scenario? Exactly.
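A toy version of that training target: count, in a tiny made-up corpus, which word most often follows which, and call the most frequent follower the right answer. A real model learns these statistics through the bulb-and-wire machinery rather than raw counts, but the target is the same in spirit.

```python
from collections import Counter, defaultdict

# A tiny stand-in for that giant stack of human talking.
corpus = ("the dog barks at the door . the dog sleeps by the door . "
          "the cat sleeps by the window .").split()

# For every word, count which words tend to come right after it.
follows = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    follows[word][nxt] += 1

def predict_next(word):
    """The 'right answer': the likeliest word to follow this one."""
    return follows[word].most_common(1)[0][0]

print(predict_next("dog"))  # whichever of "barks"/"sleeps" the corpus favors
```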
That's right. And just one brief aside
because it's sort of fun. I think
I have this right that
a word
is a big long list of like
13,000 numbers.
What? A computer has to turn a word, one word, just like a one word into 13,000 numbers?
Yeah, and so like in the way that a pixel value, in the circle example, is like basically a zero or a one.
It's like every word is this list of 13,000 numbers.
It's so weird that it, like, that that's the simpler version for it.
I know.
Let me turn it into this, like, phone book of numbers. Which is, again, like, which points to how these things are so not us. Yeah, they're
really not us. Not at all. Wow. But they're using us, though, right? Like, it's our talk
that's getting turned into numbers. And it literally does it, one word at a time. So after it's
written the first word of its response, it just does the whole process over again. It takes
all the words in your question, plus the first word it predicted, sends all that through the network
again. And then it just predicts the next word after that. Sends that through those bulbs again.
And then the next word after that. Does the whole thing again. And plays the same game over and over and over.
And one of the words in its vocabulary is the like end conversation token. So it like, it has some
notion of when to stop. But the act of stopping is itself just one more prediction. It's it's one
more probability in that big list of things that should happen next. And as I said, this is how they
were doing it all the way back in the 80s. And I mean, if you interacted with a chatbot, even in
like the 20 teens, this is the way they were doing it as well.
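And generation is just that prediction run in a loop, as a sketch: predict a word, tack it on, feed the whole thing back through, stop at an end token. With only one word of memory, this toy quickly starts going in circles, a cartoon of exactly the weakness that comes up next.

```python
def generate(prompt, max_words=12):
    """Predict a word, append it, feed it all back in, repeat."""
    words = prompt.split()
    for _ in range(max_words):
        nxt = predict_next(words[-1])  # a real model would use all the words
        if nxt == ".":                 # stand-in for the end-of-text token
            break
        words.append(nxt)
    return " ".join(words)

print(generate("the dog"))  # soon loops: one word of context is not enough
```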
Really?
Do you have any recollection of when you first came in contact with one?
Oh, God.
I feel like it must have been one of those like customer service bots on a website
kind of thing.
And I'm sure not just because it's a customer service experience, but because it was an early
chatbot experience, it wasn't very good.
No, no, no, no, terrible, no, terrible.
And a big part of why they were bad was they had difficulty dealing with longer
stretches of text.
This is Stephen Levy.
Editor at large, at Wired.
He's been covering this stuff for a long time.
I published a book in 1992 called Artificial Life.
I was two years old, by the way.
Thanks for that.
Sorry.
Yeah, yeah, thanks.
And he says, because it predicted words one at a time and one after the other, the longer the
question or the longer the answer, the more likely it was to miss or lose the larger
meaning. And so eventually, it would predict a word that just doesn't make sense or is out of place.
Exactly.
Huh.
And so just to give one very concrete example to illustrate it, like the sentence, what sound does my dog make when I slam the door?
It's like...
That's so... I can see why that would be confusing.
Right. Like, you have to somehow know that in that sentence, dog is really the operative term here.
Right. Right.
The important noun, it's not I or door.
Right, right, right.
And so in 2017, this guy, Jakob Uszkoreit, who worked at Google, set out to solve this dog-door problem.
He thought that the thing should be able to figure out, oh, this is the most important part of the sentence.
This is what I should pay attention to.
And now the question becomes like, how the heck does one go about doing that?
And what they figured out was, the problem here is we're giving it one word at a time, and we're having it predict one word at a time.
And what we need to be able to do instead is have it somehow process the sentence as a whole so that, you know, something at the end of the sentence can sort of feed back on the weight or meaning it gives to something at the beginning of the sentence.
And one way that you can just imagine it doing this is that instead of just making a prediction and giving an answer, you need to take in all the information, make a prediction, but then just, like, set that aside, because you're going to take in all that information again.
And then we're going to send it through again and again.
And again, each time focusing on a different word in the sentence, generating a different possible prediction before landing on some final prediction, which God willing would be bark.
It's like the computer simultaneously lives in the multiverse of that sentence where each word in that sentence is the most important.
Yeah, and like I, I've looked at this stuff for months, and I still don't totally understand exactly how a machine does this.
But, well, I mean, something like that.
And also, you know, you can say no.
You can tell me.
Well, I mean, I mean, in the raw sense, yeah, that's the idea.
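For the curious, the mechanism that came out of this is usually written as scaled dot-product attention, from the "Attention Is All You Need" paper mentioned later in the episode. Here it is in miniature; real transformers add learned projections, many attention heads, and far bigger vectors, so treat this as a loose sketch of the mechanism, not the full recipe.

```python
import numpy as np

def attention(Q, K, V):
    """Every word scores every other word for relevance, and those scores
    decide whose meaning gets blended into each position's output."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # how relevant is each word to each?
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: rows sum to 1
    return weights @ V  # each output is a relevance-weighted blend of the words

# Four hypothetical 8-dimensional word vectors, say "my dog ... the door".
words = np.random.default_rng(1).normal(size=(4, 8))
blended = attention(words, words, words)  # the sentence attends to itself
print(blended.shape)  # (4, 8): one context-aware vector per word
```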
Like, the complexity here, you can see it's going through the roof, like where you're like, oh God, this is so much more computing you need to do.
Totally.
And this was a big barrier for a long time.
I mean, that's why these chatbots were almost as bad in the early 2000s as they were in the 1980s.
And this is where we get to the next step in the evolution of our little alien friend here,
which, as many evolutionary leaps are, was mostly a hardware upgrade.
I mean, if you have been following the news about AI at all, you've probably heard this term GPU.
GPUs, components that go into data centers.
Or the company...
Computer chipmaker, Nvidia.
Nvidia,
the most valuable company in history
that makes these things.
Its story, of course,
wrapped up in the frenzy
around the future of artificial intelligence.
These things, and this company
have been at the center of the conflict
between China and the U.S. when it comes to export controls.
The idea here is for the U.S. to kind of limit the ability
for China to catch up when it comes to AI.
And interestingly, what these GPUs,
these graphics processing units,
were originally designed for,
was computer games.
Video games, things like that.
And what they're really good at is just doing a bunch of different math problems all at once.
Exactly.
It's just all about multiplying and adding numbers as fast as you can.
There's some other things, but like, by and large, like, just do those two things and we're off to the races.
And doing these math problems all at once, which is called parallel processing,
that's exactly what these learning machines needed to do some version of that super-complicated multiverse prediction thing we discussed.
Sure, sure, sure.
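A tiny illustration of the kind of work being parallelized, with NumPy on a CPU standing in for the idea: a whole layer's worth of multiply-and-adds can be phrased as one matrix operation, which is exactly the shape of job a GPU fans out across thousands of cores at once.

```python
import numpy as np

rng = np.random.default_rng(2)
wires = rng.normal(size=(1000, 1000))  # a layer's worth of wire strengths
bulbs = rng.normal(size=1000)          # a layer's worth of inputs

# One output at a time: a thousand separate multiply-and-add problems...
one_by_one = np.array([wires[i] @ bulbs for i in range(1000)])

# ...or the very same arithmetic phrased as a single matrix multiplication,
# the kind of job graphics hardware is built to do all at once.
all_at_once = wires @ bulbs
assert np.allclose(one_by_one, all_at_once)
```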
And so, with these GPUs and this new...
parallelized architecture that Google named the transformer, all of a sudden, they could get a
machine to parse those longer sentences and give at least reasonable answers to more complicated
questions.
All right.
But what really sent these AI chatbots into the stratosphere was a kind of knock-on effect
of this parallel processing.
Because when you can process everything at the same time in parallel, you can actually
train on a lot more material in the same amount of time.
And so eventually, they just gave it basically the entire internet, almost everything we humans have ever said on the internet,
as its training material and started sending that through this network of light bulbs and wires
that was just unimaginably big. Like, to get a sense:
In our smaller example with the circle, there's something like a thousand and some odd
parameters, a thousand or so of those wires.
GPT-3, which was kind of dumb by today's standards when it came out, had 175 billion parameters.
175 billion things that could be tweaked.
Yeah, and many of the ones that we have now,
they're trillions of parameters.
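For anyone keeping score, the wire-count arithmetic behind those numbers:

```python
toy_wires = 100 * 10 + 10 * 1  # input-to-middle wires plus middle-to-final wires
print(toy_wires)               # 1010: the "thousand or so" in the circle example
print(175_000_000_000 // toy_wires)  # GPT-3 had roughly 173 million times as many
```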
And as they fed, basically all the things we humans
have ever said on the internet into this thing.
Throwing way more training examples
and way more compute than anyone would reasonably think to do.
Slowly, they started to notice.
That with a sufficiently large amount of data
on a sufficiently large model, run with sufficiently many cycles of training,
these new computers do seemingly intelligent things.
Now, a lot of what I just described was written up in a paper called
Attention is All You Need.
And these findings are really what unlocked these large language models like ChatGPT.
And that's all it was really intended for.
But...
There was a passage in there saying we think this can work for
images and video. And indeed, that turns out to work.
That same basic model of massive parallel processing with tons of input. That could predict
the next part of an image or sound. The moment civilization was transformed. And that moment,
that realization, is really what triggered the explosive proliferation of artificial intelligence.
Different kinds, practically different species, of AIs that we are living amongst today.
New artificial intelligence systems.
Machines that can teach themselves superhuman skills.
ChatGPT. GPT-3, GPT-4.
Do you think Apple Intelligence.
Dolly.
An app called Lensa.
Bard.
It's called Midjourney.
Text to video art generated by it.
It's crazy.
Look at this.
So I don't know what AI it is they're using.
Yes, it feels like an episode of Black Mirror.
So it's like all of these different apps doing all of these different things in all these different mediums.
They're taking in a huge amount of examples and then they're using fancy math to basically predict the next word, the next pixel, the next note.
And from that, it's like generating this whole huge diversity of new stuff.
Yeah, basically.
And I mean, it's also, just as we described, doing something that I don't totally understand, that's more holistic than just looking at the thing that happens next.
But it is drawing on the examples it's been given to decide what should happen next,
which suddenly sounds not so simple.
No, it does send you into a spiral.
Because it's like, is what I do any different from that just spewing out, you know, some iteration of everything else I've seen before this?
Yeah, but first of all, you're not pulling from the whole internet, right?
Like, you have to depend on just the limited things you've experienced or can even maybe remember.
That's fair.
And your like math is also just way sloppier.
It's not as accurate.
Yeah.
And to that point, and maybe we shouldn't even go here.
But there's this one other thing that you can control in these models, which is called the temperature, which is like this final knob you get to tweak on the thing.
And so if you have, I think it's if you have the temperature all the way down, it will give you the most likely thing to come next.
If you turn the temperature up a little bit, though, it then is going to pick like the second or third most likely thing to come next.
Oh, so you can control, like, how precise you want the math.
Like, you can say I want it a little stanky.
Yeah, like there's a little bit of randomness in it then that it's then acting upon in what it does next.
So maybe you just want the temperature turned up on like every third word.
So that there's this almost spontaneous feeling serendipitous creation, active creation that comes out of this rigid math.
Like it's like something startlingly creative might just be a less right answer.
A less right answer. Wow.
Yep.
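A sketch of that temperature knob, with made-up numbers: divide the model's log-probabilities by a temperature before sampling. Low temperature sharpens the distribution toward the single likeliest word; high temperature flattens it so the runners-up start getting picked.

```python
import numpy as np

def sample_next(probs, temperature=1.0, rng=np.random.default_rng()):
    """Reshape the model's next-word probabilities, then pick one."""
    logits = np.log(np.asarray(probs, dtype=float)) / temperature
    p = np.exp(logits - logits.max())  # subtract max for numerical safety
    p /= p.sum()
    return rng.choice(len(p), p=p)

# Say the model thinks the next word is 70% "barks", 20% "sleeps", 10% "sings".
probs = [0.7, 0.2, 0.1]
print(sample_next(probs, temperature=0.1))  # almost always 0, the likeliest word
print(sample_next(probs, temperature=2.0))  # "stankier": 1 and 2 come up often
```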
And just by doing that, it's going to keep doing stuff that we are going to get increasingly uncomfortable with.
Yeah.
Like right now there is an AI generated song on the Billboard country charts.
Really? I didn't hear about that.
But like if that's the case, I see no way that eventually a fully AI generated film won't
hit the box office.
Like that's just going to happen.
But when it happens, it will be only because of all of this math.
To me, I think the thing this makes me realize is, when you see under the hood,
what you see is less like something spooky and ethereal.
Yeah, like there are times when it gets spooky.
When like there will be a time, like I'll be listening to like an AI generated podcast
and then one of the hosts breathes.
And I'm like, wait, that's so weird.
Like it doesn't even need oxygen.
Why is it breathing?
And now it's like, oh, because, you know, that's just the next statistical thing that would come in that sentence: a breath.
That to me, that to me is like, it's much less eerie, because you can see where it got it from.
Right.
But, well, okay, I do have one bit more for you because, I don't know, I still found myself
wondering how it will feel as these things get better and better.
And in particular, what it'll feel like in the moments we sit across from it.
And it is better than us at something we have spent our lives working on.
That it is better than us at something we truly love.
Yeah, many, many people.
Also, all my friends tell me, like, wow, you are the first professional Go player to be famous because you lost the game.
No.
So, yeah, it's me.
And so I got in touch with this guy.
Van Huay, I'm a professional goal player.
a three-time European champion.
So, real quick, go.
It is an ancient Chinese game,
considered probably the most complicated board game in the world
to teach a computer to play
because of just how open-ended it is.
All you really need to know is you are trying to control
as much of the board as possible.
You go back and forth with your opponent,
placing one stone at a time,
and you control portions of the board or territory
by either, like, cordoning off sections of it
or encircling your opponent's stones.
It's very simple idea, but it's difficult.
Because with such simple rules,
there are just this crazy number of ways the game can play out.
In fact, folks like to say that there are more possible ways
for a Go game to go
than there are atoms in the known universe.
Yes.
Anyhow, back to Fan.
I remember I discovered Go, age like six, in my school, in Xi'an, and I feel something: oh, this game, I can play.
And I progress very quick.
One year after I learned Go, for my school, I'm number one.
Three years after that, I'm in the best team in the province.
And not long after that, I stopped my school.
I only learn Go game.
I mean, for years.
Every day, only thing you do is just to play Go game.
Twelve hours.
Twelve hours of playing.
Yeah, it's no joke.
Around age 15, he went pro.
And somewhere along the way, he says,
he noticed this almost magical quality of the game.
Go for me, it's like a mirror.
A mirror?
Mirror, yeah.
Because when you play, you can see your mind on the board.
He says all the choices you make,
whether you're aggressive in attack
or are patient and waiting, you know, in a sense, how you think, stares back up at you.
It's like print, mind print.
And your opponent's mind, he says, it's printed there too.
So I play with someone, I don't know him, and never talk with him.
I play one game, I know him.
This is magical.
But this mirror of his, well, it was about to get shattered.
2015, Demis Hassabis,
researcher at Google,
send me an email, like, we have some very exciting Go project.
Can you go to our office, visit?
We'll show you our project.
I tell, yes, okay, why not?
And what they showed him was this thing called AlphaGo.
It was a computer that had learned how to play the game, and they asked him, like,
will you play against this?
So I tell, yes, okay, we can play this game, because I will win.
It's just the program.
That's the program.
What can you do?
You can win with me?
never. It's like 0% chance to win this. Zero percent. And why were you so confident?
Because I know the best program this moment, I can give six stone handicap. Handicap?
Handicap, handicap. Got it, got it, got it. So how you can possible make the technique, make the huge difference, in just months? It's so impossible.
And so a month later, in this windowless office room, Fan faced off with this computer and its human stone-placing helper in a best-of-five-game match.
That first game, all the game, I feel good. I think I will win. But, end of a game.
With just a few stones left to play.
Oh, how I am stupid! I make some mistake. And I lost my first game. Okay.
But, you know, he's thinking I was sort of arrogant going into this. I was overly confident.
So next game, I will be careful, I will play more seriously, I will win the game.
So the next day, next game, he sits down at the board, starts carefully placing his stones,
and it's looking good on the board, but inside his head...
I feel something really difficult, very difficult, very difficult.
Because...
I like fight, but AlphaGo don't fight with me.
And if I want to take something, AlphaGo give me very easily.
Looking down at the board, he was not able to see his opponent's mind in the way he always had.
No.
There was no bravery.
There was no subterfuge that he could sense.
I see Africa won't do this.
Africa won't do that.
But why he won't do this?
You cannot find it.
You can't.
And so he didn't know how to respond to it.
And his mind started to race.
Good move, bad move.
Good move, what mean?
Bad move, what means?
Good move, what's think my teacher?
The good move, what's think my student.
Everybody.
All my friends.
And he realized that with all these emotional pushes and pulls that eventually...
I will make mistake.
But AlphaGo...
No.
Never.
When you think about this,
the confidence is crash.
It's crash.
All crash.
And I lost again.
Very, very badly.
And I lost again for third,
fourth, and the last one.
Yeah. Damn.
But you know, this experiment is really good for me.
This experiment is really good for me.
I really see myself.
Really?
You think AlphaGo taught you to be more...
Myself, yeah.
I think this is AlphaGo teaching me about that.
And why?
Because I see myself.
So it's like AlphaGo teach me that our life, we will always lose, lose, lose, lose.
Sorry, it's real life.
It's our life.
I think this is a human.
This is important for us.
I think what he saw in that game as he was losing was kind of what you were saying about seeing under the hood making AI less spooky. Like, he could see it wasn't magic, it was math with no mistakes.
Right.
And when he saw himself, you know, like, not being the perfect Go player in any given moment, or in every given moment, like, that's what makes him a person. A person who could love something but still lose at it, maybe feel bad about that, and then use that feeling to figure out what to do next.
Today, I'm teaching Go in China, with students.
Right, why are you teaching Go?
The computer will always win.
Yes, yes, but be careful.
Because I think all you experimented to learn is still useful.
So don't worry, it will be coming.
You can do nothing about it, and just learn.
Yeah, I get that.
Before we wrap this thing up, I wanted to put all of this in front of someone, and not
an AI person, but somebody with a really wide scope on technology and history.
And so I went to this guy.
Tom Malaney, I'm professor of modern Chinese history at Stanford University.
I worked with him years back on a story about typing in Chinese, and he's just one of the
most thoughtful and informed people I know.
Yeah, that means a lot. That means a lot to me.
So how would you respond?
Well, I mean, everyday life is at its core a study of this awful, amazing, horrifying, never-ending surprise of what it means to be born and live and die as a human.
And even if, at the end of the day, an AI is orders of magnitude smarter,
AI, just by definition, cannot suffer and rejoice and live and die
in quite the same way that humans can,
in the same way that we cannot live and die and suffer and comprehend and feel
the way an octopus can.
I mean, the only thing an AGI will be able to do is contemplate,
my goodness, what does it mean to be an AI?
And so I am not worried at all about what AI means
with regard to meaning, human identity,
what it means to be human, or any of that.
Well, that was very beautiful.
And while I love that, I'm still like,
but this is going to mess everything up so badly.
I don't know if this...
I agree.
It's going to...
No, it's...
I mean, this is going to get weird down to the fabric.
But fast forward this, you know, 20, 30 years, if we're still around at a sort of climate change level,
when another future human is sitting in this fabric altered world, it will still be a group of humans
rejoicing, suffering.
Like, it will still...
be that condition.
And so it's kind of a, for me, it's like a little bit of a liberatory time.
It's, uh, great.
Maybe we get to free up a little bit more space to get back to work thinking about how to be human.
Because we have not, we have not even come close to solving that issue.
I don't know.
Special thanks to Stephanie Yin and the New York Institute of Go for teaching us the game,
to Mark, Daria, and Levon, to Barbara Svenich, and of course, thank you to Grant Sanderson for his unending patience explaining the math of neural nets to us.
Grant is kind of like your favorite math nerd's favorite math nerd.
His YouTube channel is 3Blue1Brown. Check it out.
This story was reported and produced by Simon Adler with original scoring and sound design by
Simon Adler, which brings me to the last unsavory thing I have to say, which is goodbye to
Simon Adler, who happens to be one of our best reporter producers here at the show and also a friend.
He's going off to, among other things, pursue his music career, and this was his last episode on staff with us.
Chances are if you list out your favorite episodes from the last 11 years at the show, more than a couple will be his.
Could be some of the tech stories he did.
He did stories about drones in Ukraine, about content moderation on Facebook.
could be some of the international stories he did.
He reported about the hunt for an endangered rhino in Namibia.
He did a story about a species of raccoon in the Caribbean islands of Guadeloupe.
He did a lot of stories about democracy as well.
Covered a town, Seneca, Nebraska, that voted itself out of existence.
He did a story back in 2017 about a New York City Council race
where the campaign manager was a little-known guy named Zoran Mamdani.
Besides being a killer reporter, not to mention composer and interviewer,
Simon has also spent so many hours coaching an entire generation of staffers and interns.
He's so generous with his expertise and his time,
really someone who makes everyone around him better.
Anyway, we have been so lucky to have him as part of our nerdy band for 11 years.
Check out his band, WinStar Enterprises, on Instagram.
That's Simon and another fellow former Radiolabber, Alex Overington.
We just, we already miss you, Simon. And good luck out there.
Oh, you want me to say this? Oh, that's fun. Hi, I'm Cordelia and I'm from New York City. And here are the staff credits. Radio Lab is hosted by Lulu Miller
and Latif Nasser.
Soren Wheeler is our executive editor.
Sarah Sandback is our executive director.
Our managing editor is Pat Walters.
Dylan Keefe is our director of sound design.
Our staff includes Simon Adler,
Jeremy Bloom, W. Harry Fortuna,
David Gable, Maria Paz Gutierrez,
Sindhu Gnanasambandan,
Matt Kilty,
Mona Madgavkar,
Annie McEwan, Alex Neeson,
Sarah Qari, Anisa Vizza, Arianne Wack, Molly Webster, and Jessica Young.
With help from Rebecca Rand. Our fact-checkers are Diane Kelly, Emily Krieger, Anna Pujol-Mazzini, and Natalie Middleton.
Leadership support for Radio Lab science programming is provided by the Simons Foundation and the John Templeton Foundation.
Foundational support for Radio Lab was provided by the Alfred P. Sloan Foundation.
Thank you.
