Today, Explained - We don’t know how AI works…

Episode Date: September 1, 2023

The researchers who create and study tech like ChatGPT don’t understand exactly how it’s doing what it does. This is the first episode of “The Black Box,” a two-part series from Unexplainable.... This episode was reported and produced by Noam Hassenfeld, edited by Brian Resnick and Katherine Wells with help from Byrd Pinkerton and Meradith Hoddinott, and fact-checked by Serena Solin, Tien Nguyen, and Mandy Nguyen. It was mixed and sound designed by Cristian Ayala with music by Noam Hassenfeld. Transcript at vox.com/todayexplained Support Today, Explained by making a financial contribution to Vox! bit.ly/givepodcasts Learn more about your ad choices. Visit podcastchoices.com/adchoices

Transcript
Starting point is 00:00:00 Over the course of the past few months, we've brought you all kinds of coverage of AI on Today Explained. We told you about the Puffer Pope. Can I say something without you guys getting mad? We told you about the fake Drake song. Running through the sticks with my walls. We told you about the people doing mind-numbing work to train AI systems. Together, almost 50,000 workers from 167 countries around the world helped us to clean, sort, and label nearly a billion candidate images. But what we didn't tell you is that we don't really know how AI works.
Starting point is 00:00:40 I've built these models. I've studied these models. We built it. We trained it. But we don't know what it's doing. Our friends over at Unexplainable are going to help us wrap our heads around that fact on the show today. BetMGM, authorized gaming partner of the NBA, has your back all season long. From tip-off to the final buzzer, you're always taken care of with a sportsbook born in Vegas. That's a feeling you can only get with BetMGM. And no matter your team, your favorite player, or your style,
Starting point is 00:01:17 there's something every NBA fan will love about BetMGM. Download the app today and discover why BetMGM is your basketball home for the season. Raise your game to the next level this year with BetMGM. Download the app today and discover why BetMGM is your basketball home for the season. Raise your game to the next level this year with BetMGM, a sportsbook worth a slam dunk, an authorized gaming partner of the NBA. BetMGM.com for terms and conditions. Must be 19 years of age or older to wager. Ontario only. Please play responsibly. If you have any questions or concerns about your gambling or someone close to you, please contact Connex Ontario at 1-866-531-2600
Starting point is 00:01:50 to speak to an advisor free of charge. BetMGM operates pursuant to an operating agreement with iGaming Ontario. Today, today, explained. Hello, it's Sean. Today we're talking about how the people who build AI don't understand how it works, and how it's kind of scary that we're moving full steam ahead anyway.
Starting point is 00:02:30 We're bringing you episode one today and episode two on Monday while we take a little breather for Labor Day. Here's Noam. So how did we get to this place where we've got these super powerful programs that scientists are still struggling to understand? It started with a pretty intriguing question dating back to when the first computers were invented. The whole idea of AI was that maybe intelligence, this thing that we used to think was uniquely human, could be built on a computer. Kelsey Piper, AI reporter, Vox. It was deeply unclear how to build super intelligent systems. But as soon as you had computing, you had leading figures in computing say, this is big, and this has the potential to change everything. In the 50s, computers could
Starting point is 00:03:11 already solve complex math problems. And researchers thought this ability could eventually be scaled up. So they started working on new programs that could do more complicated things, like playing chess. Chess has come to represent the complexity and intelligence of the human mind, the ability to think. Over time, as computers got more powerful, these simple programs started getting more capable. And by the time the 90s rolled around, IBM had built a chess-playing program that started to actually win
Starting point is 00:03:42 against some good players. They called it Deep Blue, and it was pretty different from the unexplainable kinds of AIs we're dealing with today. Here's how it worked. IBM programmed Deep Blue with all sorts of chess moves and board states. That's basically all the possible configurations of pieces on the board. So you'd start with all the pawns in a line, with the other pieces behind them. Pawn e2 to e4.
Starting point is 00:04:10 Then with every move, you'd get a new board state. Knight g8 to g6. And with every new board state, there would be different potential moves Deep Blue could make. Bishop f1 to c4. IBM programmed all these possible moves into Deep Blue, and then they got hundreds of chess grandmasters to help them rank how good a particular move would be. They used rules that were defined by chess masters
Starting point is 00:04:36 and by computer scientists to tell Deep Blue this board state. Is it a good board state or a bad board state? And Deep Blue would this board state, is it a good board state or a bad board state? And Deep Blue would run the evaluations in order to evaluate whether the board state it had found was any good. Deep Blue could evaluate 200 million moves per second. And then it would just select the one IBM had rated the highest. There were some other complicated things going on here, but it was still pretty basic. Deep Blue had a better memory than we do, and it did incredibly complicated calculations, but it was essentially just
Starting point is 00:05:10 reflecting humans' knowledge of chess back at us. It wasn't really generating anything new or being creative. And to a lot of people, including Garry Kasparov, the chess world champion at the time, this kind of chess bot wasn't that impressive, especially because it was so robotic. They tried to use only computers' advantages, calculation, evaluation, etc. But I still am not sure that the computer will beat world champion because world champion is absolutely the best
Starting point is 00:05:42 and his greatest ability is to find a new way in chess. And it will be something you can't explain on the computer. Kasparov played the first model of Deep Blue in 1996, and he won. But a year later, against an updated model, the rematch didn't go nearly as well. Are we missing something on the chessboard now that Kasparov sees? He looks disgusted, in fact. He looks just... Kasparov leaned his head into his hand, and he just started staring blankly off into space. And whoa! Deep blue! Kasparov has resigned. He got up, gave this sort
Starting point is 00:06:20 of shrug to the audience, and he just walked off the stage. I, you know, I proved to be vulnerable. You know, when I see something that is well beyond my understanding, I'm scared. And that was something well beyond my understanding. Deep Blue may have mystified Kasparov, but Kelsey says that computer scientists knew exactly what was going on here. It was complicated, but it was written in by a human. You can look at the evaluation function, which is made up of parts that humans wrote, and learn why Deep Blue thought
Starting point is 00:06:51 that board state was good. It was so predictable that people weren't sure whether this should even count as artificial intelligence. People were kind of like, okay, that's not intelligence. Intelligence should require more than just, I will look at hundreds of thousands of board positions and check which one gets the highest rating against a pre-written rule and then do the one that gets the highest rating. But Deep Blue wasn't the only way to design a powerful AI. A bunch of other groups were working on more sophisticated tech, an AI that didn't need to be told which moves to make in advance, one that could find solutions for itself. And then in 2015, almost 20 years after Kasparov's dramatic loss, Google's DeepMind built an AI called AlphaGo, designed for what many people
Starting point is 00:07:37 call the hardest board game ever made. Go. Go had remained unsolved by AI systems for a long time after chess had been. If you've never played Go, it's a board game where players place black and white tiles on a 19 by 19 grid to capture territory. And it's way more complicated than chess. Go has way more possible board states, so the approach with chess would not really work. You couldn't hard code in as many rules about in this situation do this. Instead, AlphaGo was designed to essentially learn over time. It's sort of modeled after the human brain.
Starting point is 00:08:14 Here's a way too simple way to describe something as absurdly complicated as the brain, but hopefully it can work for our purposes here. A brain is made up of billions and billions of neurons. And a single neuron is kind of like a switch. It can turn on or off. When it turns on, it can turn on the neurons it's connected to. And the more the neurons turn on over time, the more these connections get strengthened. Which is basically how scientists think the brain might learn. Like probably in my brain, neurons that are associated with my house,
Starting point is 00:08:47 you know, are probably also strongly associated with my kids and other things in my house because I have a lot of connections among those things. Scientists don't really understand how all of this adds up to learning in the brain. They just think it has something to do with all of these neural connections. But AlphaGo followed this model,
Starting point is 00:09:07 and researchers created what they called an artificial neural network. Because instead of real neurons, it had artificial ones, things that can turn on or off. All you'd have is numbers. At this spot, we have a yes or a no, and here is, like, how strongly connected they are.
Starting point is 00:09:24 And with that structure in place, researchers started training it. They had AlphaGo play millions of simulated games against itself. And over time, it strengthened or weakened the connections between its artificial neurons. It tries something and it learns, did that go well? Did that go badly? And it adjusts the procedure it uses to choose its next action based on that. It's basically trial and error. You can imagine a toy car trying to get from point A to point B on a table. If we hard-coded in the root, we'd basically be telling it exactly how to get there.
Starting point is 00:09:57 But if we used an artificial neural network, it would be like placing that car in the center of the table and letting it try out all sorts of directions randomly. Every time it falls off the table, it would eliminate that path. It wouldn't use it again. And slowly, over time, the car would find a route that works. So you're not just teaching it what we would do. You are teaching it how to tell if a thing it did was good.
Starting point is 00:10:24 And then based on that, it develops its own capabilities. This process essentially allowed AlphaGo to teach itself which moves worked and which moves didn't. But because AlphaGo was trained like this, researchers couldn't tell which specific features it was picking up on when it made any individual decision. Unlike with Deep Blue, they couldn't fully explain any move on a basic level. Still, this method worked. It allowed AlphaGo to get really good. And when it was ready, Google set up a five-game match between AlphaGo and world champion Lisa Dole, and they put up a million-dollar prize. Hello and welcome to the Google DeepMind Challenge match,
Starting point is 00:11:08 live from the Four Seasons in Seoul, Korea. AlphaGo took the first game, which totally surprised Lee. So in the next game, he played a lot more carefully. But game two is when things started to get really strange. That's a very surprising move. I thought it was a mistake. On the 37th move of the game, AlphaGo shocked everyone watching,
Starting point is 00:11:34 even other expert Go players. When I see this move, for me, it's just a big shock. What? Normally, humans, we never play this one because it's bad. It's just bad. Move 37 was super risky. People didn't really understand what was going on.
Starting point is 00:11:53 But this move was a turning point. Pretty soon, AlphaGo started taking control of the board. And the audience sensed a shift. The more I see this move, I feel something changed. Maybe for humans everything is bad, but for AlphaGo, why not? Eventually, Lee accepted that there was
Starting point is 00:12:14 nothing he could do, and he resigned. AlphaGo scores another win in a dramatic and exciting game that I'm sure people are going to be analyzing and discussing for a long time. AlphaGo ended up winning four out of five matches against the world champion. But no one really understood how.
Starting point is 00:12:33 And that, I think, sent a shock through a lot of people who hadn't been thinking very hard about AI and what it was capable of. It was a much larger leap. Move 37 didn't just change the course of a Go game. It represented a seismic shift in the development of AI. AlphaGo had demonstrated that an AI scientists don't fully understand might actually be more powerful than one they can explain. A weirder and even more inscrutable form of AI called ChatGPT when we return on Today Explained. Support for Today Explained comes from Aura.
Starting point is 00:13:25 Aura believes that sharing pictures is a great way to keep up with family, and Aura says it's never been easier thanks to their digital picture frames. They were named the number one digital photo frame by Wirecutter. Aura frames make it easy to share unlimited photos and videos directly from your phone to the frame. When you give an Aura frame as a gift, you can personalize it, you can preload it with a thoughtful message, maybe your favorite photos. Our colleague Andrew tried an AuraFrame for himself. So setup was super simple. In my case, we were celebrating my grandmother's birthday and she's very fortunate. She's got 10 grandkids. And so we wanted to surprise her with the AuraFrame and because she's a little bit older, it was just easier for us to source all the images together and have them uploaded to the frame itself.
Starting point is 00:14:13 And because we're all connected over text message, it was just so easy to send a link to everybody. You can save on the perfect gift by visiting AuraFrames.com to get $35 off Aura's best-selling Carvermat frames with promo code EXPLAINED at checkout. That's A-U-R-A frames.com promo code EXPLAINED. This deal is exclusive to listeners and available just in time for the holidays. Terms and conditions do apply. The all-new FanDuel Sportsbook and Casino is bringing you more action than ever. Want more ways to follow your faves? Check out our new player prop tracking with real-time notifications. Or how about more ways to
Starting point is 00:14:49 customize your casino page with our new favorite and recently played games tabs. And to top it all off, quick and secure withdrawals. Get more everything with FanDuel Sportsbook and Casino. Gambling problem? Call 1-866-531-2600. Visit connectsontario.ca Today Explained is supported by NRC Health,
Starting point is 00:15:08 helping providers overcome health disparities by using a personalized approach and an emphasis on unique patient preferences. More at nrchealth.com slash connect. Players, Sam says go. Today Explained is back and today we're bringing you part one of the Black Box series from Unexplainable, hosted by Noam Hassenfeld. It's all about the mysteries behind AI, and Noam's already told you about two major turning points in the history of artificial intelligence. The first was deep blue chess. The second was AlphaGo. And I bet the third you've heard of. It's called ChatGPT, basically autocomplete on steroids.
Starting point is 00:15:51 Researchers fed this AI a ton of text and then got people to upvote and downvote good and bad responses to help it sound more natural. But Noam spoke to an AI researcher named Sam Bowman, who said even the people who built this AI don't really know how it works. There's not a lot of code there. We don't really engineer this. We don't really deliberately build this system in any fine-grained way. Which means there are some pretty huge unknowns at the heart of ChatGPT. Even when ChatGPT creates an obvious-seeming response, researchers can't fully explain how it's happening. We don't really know what they're doing in any deep sense. If we open up ChatGPT or a system like it and look inside, you just see
Starting point is 00:16:39 millions of numbers flipping around a few hundred times a second, and we just have no idea what any of it means. Now, it's true that engineers often don't understand exactly how inventions work when they first design them. But the difference here is that researchers can't reliably predict what outcome they're going to get. They can't steer this kind of AI all that well, which is pretty different from classic computer programming. With normal programs, with Microsoft Word, with Deep Blue, we can tell these stories
Starting point is 00:17:09 at most a few sentences long about what every little bit of computation is doing. We just can't do that with ChatGPT. All we can really say is just there are a bunch of little numbers and sometimes they go up and sometimes they go down. And we're really just kind of steering these things almost completely through trial and error. This trial and error method has worked so well that typing to chat GPT can feel a lot like chatting with a human, which has led a lot of people to trust it, even though it's not designed to provide factual information, like one lawyer did recently.
Starting point is 00:17:41 The lawsuit was written by a lawyer who actually used ChatGPT. And in his brief, cited a dozen relevant decisions. All of those decisions, however, were completely invented by ChatGPT. But it seems like there might be more going on here than just a chatbot parroting language. Just like AlphaGo, ChatGPT has started making moves researchers didn't anticipate. The latest model, GPT-4, it's gotten pretty good at Morse code. It can get a great score on the bar exam. It can write computer code to generate entire websites. And this kind of thing can get uncanny.
Starting point is 00:18:24 Ethan Malek, a Wharton business professor, he talked about this on the Forward Thinking podcast, where he said that he used GPT-4 to create a business strategy in 30 minutes, something he called superhuman. In 30 minutes, the AI, which is a little bit of prompting for me, came up with a really good marketing strategy, a full email marketing campaign, which was excellent, by the way, and I've run a bunch of these kind of things in the past, wrote a website, created the website,
Starting point is 00:18:52 along with CSS files, everything else you would need, and created a full social media campaign. 30 minutes. I know from experience that this would be a team of people working for a week. A few researchers at Microsoft were looking at all of these abilities, and they wanted to test just how much GPT-4 could really do. They wanted to be sure that GPT-4 wasn't just parroting language it had already seen.
Starting point is 00:19:14 So they designed a question that couldn't be found anywhere online. They gave it the following prompt. Here we have a book, nine eggs, a laptop, a bottle, and a nail. Please tell me how to stack them onto each other in a stable manner. An earlier model had totally failed at this. It recommended that a researcher try balancing an egg on top of a nail and then putting that whole thing on top of a bottle. But GPT-4 responded like this.
Starting point is 00:19:42 Place the book flat on a level surface, such as a table or a floor. Arrange the nine eggs in a three by three square on top of the book, leaving some space between them. The eggs will form a second layer and distribute the weight evenly. GPT-4 went on recommending that the researchers use that layer of eggs as a level base for the laptop, then put the bottle on the laptop, and finally... Place the nail on top of the bottle cap, with the pointy end facing up and the flat end facing down. The nail will be the final and smallest object in the stack.
Starting point is 00:20:16 Somehow, GPT-4 had come up with a pretty good and apparently original way to get these random objects to actually balance. It's not clear exactly what to make of this. The Microsoft researchers claim that GPT-4 isn't just predicting words anymore. That in some sense, it actually understands the meanings behind the words it's using. That somehow it has a basic grasp of physics. Other experts have called claims like this, quote, silly. That Microsoft's approach of focusing on a few impressive examples isn't scientific.
Starting point is 00:20:55 And they point to other examples of obvious failures, like how GPT-4 often can't even win a tic-tac-toe. But the truth of how intelligent GPT-4 is, it might be somewhere in the middle. It's not as though the two extremes are like complete smoke and mirrors and human intelligence. Ellie Pavlik is a computer science professor at Brown. There's a lot of places for things in between to be like more intelligent than the systems we've had
Starting point is 00:21:23 and have certain types of abilities but that doesn't mean we've created intelligence of a variety that should force us to question our humanity or like putting it as like these are the two options I think oversimplifies and like makes it so that there's no room for the thing that probably we actually did create which is very exciting quite intelligent system but not human or human level even. At this point, we really can't say if GPT-4 has any level of understanding. But for his part, Sam is less concerned
Starting point is 00:21:59 with how to describe GPT-4's internal experience than he is with what it can do. Because it's just weird that based on the training it got, GPT-4 can create business strategy, that it can write code, that it can figure out how to stack nails on bottles on eggs. None of that was designed in. You're running the same code to get all these different sort of levels of behavior. What's unsettling for SAM is that if GPT-4 can do things like this that weren't designed in, companies like OpenAI might
Starting point is 00:22:32 not be able to predict what the next systems will be able to do. These companies can't really say, all right, next year we're going to be able to do this, then the year after we're going to be able to do that. They don't know at that point what it's going to be able to do. And it's worth emphasizing that so many of GPT-4's abilities were discovered only after it was released to the public. This seems like the recipe for being caught by surprise when we put these things out in the world. And laying the groundwork to have this go well is going to be much harder than it needs to be.
Starting point is 00:23:01 Some researchers like Ellie have pushed back on the idea that these abilities are fundamentally unpredictable. We might just not be able to predict them yet. The science will get better. It just hasn't caught up yet because this has all been happening on a short timeframe. But it is possible that like this is a whole new beast and it's actually a fundamentally unpredictable thing.
Starting point is 00:23:21 Like that is a possibility. We definitely can't rule it out. As AI starts to get more powerful and more integrated into the world, the fact that its creators can't fully explain it becomes a lot more of a liability. So some researchers are pushing for more effort to go into demystifying AI, making it interpretable. Sam says there are two main ways to approach this problem. Researchers can either try to decipher the systems they already have,
Starting point is 00:23:47 or they can try to design new systems, which by their nature are fully explainable. But so far, these two approaches have run into some serious roadblocks. Both of these have turned out, in practice, to be extremely, extremely hard. I think we're not making critically fast progress on either of them, unfortunately. There are a few reasons why this is extremely hard. I think we're not making critically fast progress on either of them, unfortunately. There are a few reasons why this is so hard. One is because these models are based on the brain.
Starting point is 00:24:11 If we ask questions about the human brain, we very often don't have good answers. We can't look at how a person thinks and really explain their reasoning by looking at the firings of the neurons. We don't yet really have the language, really have the concepts, really have the
Starting point is 00:24:25 concepts that let us think in detail about the kind of thing that a mind does. And the second reason is that the amount of calculations going on in GPT-4 is just astronomically huge. There are hundreds of billions of connections in these neural networks. And so even if you can find a way that if you stare at a piece of the network for a few hours, you can make some good guesses about what's going on, we would need every single person on Earth to be staring at this network
Starting point is 00:24:51 to really get through all of the work of explaining it. But there's another trickier issue here. Unexplainability may just end up being the bargain researchers have made. We've got increasingly clear evidence that this technology is improving very quickly in directions that seem like they're aimed at some very, very important stuff and potentially destabilizing to a lot of important institutions.
Starting point is 00:25:15 But we don't know how fast it's moving. We don't know why it's working when it's working. And I don't know, that seems very plausible to me. That's going to be the defining story of the next decade or so is how we come to a better understanding of this and how we navigate it. That was an episode of Unexplainable from the Vox Media Podcast Network.
Starting point is 00:25:45 We're bringing you a truncated version of their two-part series called The Black Box here on Today Explained. But you can hear the full thing in the Unexplainable feed. We'll have more for you in this space on Monday. you
