Factually! with Adam Conover - The Real Problem with A.I. with Emily Bender

Episode Date: June 22, 2022

Is artificial intelligence a problem, or is the real problem how we’re using the term in the first place? Linguistics professor Emily Bender joins Adam to discuss why we should resist the urge to be impressed when it comes to big tech’s AI promises, and how our belief in the fantasy of A.I. could be worse than the reality. You can follow Emily on Twitter at @emilymbender.

Transcript
Starting point is 00:00:00 You know, I got to confess, I have always been a sucker for Japanese treats. I love going down a little Tokyo, heading to a convenience store, and grabbing all those brightly colored, fun-packaged boxes off of the shelf. But you know what? I don't get the chance to go down there as often as I would like to. And that is why I am so thrilled that Bokksu, a Japanese snack subscription box, chose to sponsor this episode. What's gotten me so excited about Bokksu is that these aren't just your run-of-the-mill grocery store finds. Each box comes packed with 20 unique snacks that you can only find in Japan itself.
Starting point is 00:00:29 Plus, they throw in a handy guide filled with info about each snack and about Japanese culture. And let me tell you something, you are going to need that guide because this box comes with a lot of snacks. I just got this one today, direct from Bokksu, and look at all of these things. We got some sort of seaweed snack here. We've got a buttercream cookie. We've got a dolce. I don't, I'm going to have to read the guide to figure out what this one is. It looks like some sort of sponge cake. Oh my gosh. This one is, I think it's some kind of maybe fried banana chip. Let's try it out and see. Is that what it is? Nope, it's not banana. Maybe it's a cassava potato chip. I should have read the guide. Ah, here they are. Iburigako smoky chips. Potato
Starting point is 00:01:15 chips made with rice flour, providing a lighter texture and satisfying crunch. Oh my gosh, this is so much fun. You got to get one of these for themselves and get this for the month of March. Bokksu has a limited edition cherry blossom box and 12 month subscribers get a free kimono style robe and get this while you're wearing your new duds, learning fascinating things about your tasty snacks. You can also rest assured that you have helped to support small family run businesses in Japan because Bokksu works with 200 plus small makers to get their snacks delivered straight to your door.
Starting point is 00:01:45 So if all of that sounds good, if you want a big box of delicious snacks like this for yourself, use the code factually for $15 off your first order at Bokksu.com. That's code factually for $15 off your first order on Bokksu.com. I don't know the way. I don't know what to think. I don't know what to say. Yeah, but that's alright. Yeah, that's okay. I don't know anything. Hello and welcome to Factually. I'm Adam Conover. Thank you for joining me once again. I'm recording this intro for you from my hotel room in Phoenix, where I am doing a big weekend of stand-up comedy. I'm so excited to be back on the road again. So if you live in Boston, Arlington, Virginia, Washington, D.C., Nashville, New York City, Spokane, Washington, or Tacoma, Washington, head to adamconover.net slash tourdates to get tickets. And of course, please keep watching The G Word on Netflix. And if you like the show,
Starting point is 00:02:50 you can support us on Patreon at patreon.com slash adamconover. And I thank you for doing so. Now, this week, we're talking about artificial intelligence. Now, you might have seen in the news from the past week that a Google engineer named Blake Lemoine was put on paid leave after claiming that Google's AI language model was, in fact, sentient. This is a language model called Lambda, aka Google's language model for dialogue applications. And what happened was Blake asked some questions of it and then from its answers came to the conclusion that it was conscious and had the intelligence of a seven or eight year old and needed to be treated as such. And he kept pushing his Google colleagues to recognize his supposed discovery. Now, there's a couple of weird things about this story. First of all, he wasn't put on paid leave
Starting point is 00:03:37 just because of his claims about the AI. He also sent a bunch of confidential information to lots of different people, including a senator and claimed that Google had engaged in religious discrimination. And also critically, the the transcript that he put together that he claimed proved the thing was sentient was, in fact, edited from lots of different conversations that he combined in order to make it look like the system was much more fluent with language than it actually was. make it look like the system was much more fluent with language than it actually was. Now, Google and pretty much every actual AI researcher has rejected Lemoyne's claims, and with very good reason. Not just because he edited the transcripts to make them look more compelling than they were, but because anyone who knows how these language models actually work knows that there is no way this thing is sentient. All that these language models do is that they are fed in a huge amount of data, huge amount of text from the internet, from novels,
Starting point is 00:04:31 things like that. And then they mash up that text, rearranging it into a different output. That's literally all that they do. It's basically a sophisticated form of magnetic poetry. And saying that a system like this is actually sentient is a lot like saying your fridge wrote a poem, all right? It did not. It is not intelligent. This is a human tool that humans use to produce language in a very cool, somewhat novel way. But the fact that Lemoyne was actually convinced he really did believe the AI was sentient really does illuminate something that's actually very frightening about AI. See, we've lived with this fantasy for years that one day AI would become sentient and either take over the world or we would need to learn how to behave ethically towards
Starting point is 00:05:15 it, right? We need to learn how to treat data, the intelligent android from Star Trek, just like we would a fellow human being. That is what our fantasies have led us to believe in reality. But the truth of what's dangerous about AI is far different. See, the real problem is that we as a species are so gullible and so good at creating systems that seem to be intelligent to us even when they are not, that we believe in AI even when nothing of the sort exists. The perfect example of this, of course, is self-driving cars.
Starting point is 00:05:48 The industry has convinced people that self-driving cars are either just around the corner or already here, with the result that people are buying cars that have much more rudimentary automated driving features and treating them as though they can actually take their hands off the wheel and their eyes off the road with deadly results. But an even deeper risk is that it distracts us from paying attention to who is actually making the decisions. See, if we treat AI as some inevitability, some enormous force of the future that is just going to come and envelop us and change our world regardless of what we do, well, that makes us less likely to question the people who are actually building technology, questioning people who work at Google, who work at Tesla, who work at Netflix when they tell us that their algorithm can decide what we want to
Starting point is 00:06:35 watch and what we don't. In reality, humans are the ones creating these algorithms, creating these systems. And when they try to deflect blame and say, oh, no, the AI did it. The super intelligent AI is coming for us. It's changing everything. Well, it might make us less likely to ask the people who are actually creating these tools what they are doing and why. Artificial intelligence, in other words, is in a lot of ways like the Wizard of Oz. It's false. It's a ruse. It's a fake power that we are all investing all of our fears and concerns into when in reality, we should be asking who's making the decisions behind the curtain. So to cut through the bullshit around this issue and talk about what is really at stake
Starting point is 00:07:18 and what we should really be concerned about when it comes to, quote, artificial intelligence, we have an actual intelligence on the show, a real life human expert. Her name is Emily Bender, and she's a computational linguist at the University of Washington. And she has written about the risks of massive language models and what we currently call artificial intelligence. Please welcome Emily Bender. Emily, thank you so much for being on the show. I am super excited to join you. This is going to be fun. It's going to be, we're going to have a great time today. I want to jump right into it. We're constantly bombarded lately by articles, pieces in the news about how amazing AI is,
Starting point is 00:08:08 how great it is at things, or how terrifying it is, how we should all be worried about what it can do. You wrote a piece recently with the wonderful title on AI that we should resist the urge to be impressed. And I love that, very contrarian, which is up my alley. And I think you have some expertise in this because you work in computational linguistics. Why should we resist the urge to be impressed by AI? Why should we resist the urge? Not where's the urge coming from, but why should we resist? Well, start wherever you like. So you can start with where it comes from if you want. Let's start with why to resist. There is a lot of power being concentrated these days
Starting point is 00:08:37 in the name of AI. And that power comes in the form of money, and it comes in the form of collections of data, and it comes in the form of influence on our systems of governance. And that is real, right? The claims about what the technology can do, some of them are well-founded and some of them are not, but the power is real. And the more we are in awe of what this technology can supposedly do, the less well-position positioned we are to counteract that power. And that's the why. So it's that if we maintain that sense of awe, oh my gosh, AI is so amazing, then we are what? Less likely to question what it does, less likely to talk about why it actually is doing the things that it's doing. Instead, we'll just see it as some supernatural force or something.
Starting point is 00:09:23 Right. In fact, we're less likely to question what people are doing with it. Right. As we get impressed with, oh, it's AI. It's this autonomous thing. It's so smart. It can do all these things. Why don't we just let those objective machines decide? And that's what the people selling it are telling us that they're providing.
Starting point is 00:09:36 And none of that is true. And we need to sort of hold on to human agency, both our own agency, but also accountability for the people behind the so-called AI. Right. You talk about a piece by Stephen Johnson in the New York Times, who's a wonderful writer, but he writes about AI and how smart it is and is it getting too smart and do we need to put in safeguards against how intelligent and super smart AI is, and you really take issue with that framing from him. Could you tell me why? Yeah. So that was a piece about this organization called OpenAI, and it was the journalist sort of just speaking from the point of view of OpenAI. And you're right, he's a fantastic writer. So my
Starting point is 00:10:18 Medium post responding to it, I made an audio paper version of it where I recorded it. And so there's a bunch of quotes from him and then there's my writing and I could really feel the difference in level of editing between those as I was reading them. It's like, okay, yeah, but I turned mine around in a weekend so that it would be out there and I won't go back about that. And he's a journalist. You're actually a subject matter expert who works in computational linguistics. So, you know, his job is pretty pros and your job is telling us exactly what's going on. So, you know, not to denigrate journalists at all, but please, what was your perspective? idea of AI and sort of imagined a world in which we have autonomous AIs and the AIs aren't doing what we want them to do as people. And so they said, okay, what we need to do is solve this
Starting point is 00:11:11 problem by figuring out how to build AIs that take on human values and do that as fast as possible. And all of that sort of presupposes that AI is a thing, that it's going to happen no matter what, that this group of people in San Francisco are well positioned to know what human values are, and that there aren't other problems that are under this rubric of AI. And so this whole 10,000-word piece focusing on what they're trying to solve and sort of saying, is this the right way to solve it, just was such a narrow view. and sort of saying, is this the right way to solve it? Just was such a narrow view. And, you know, I knew the piece was coming out because I got interviewed for it and had this back and forth that you'll see
Starting point is 00:11:49 in my Medium post about, he's saying, but, you know, is it a good idea to teach them human values? And I'm like, that's a complete misframing. And he sort of like kept pushing back with that thought. And I was like, okay, this is not going to be a good piece when it comes out. Just from the interview, you were like, you're on the wrong track, dude. Yeah. And I'm trying to course correct and you're not taking the course correction.
Starting point is 00:12:16 So, you know, out it comes and I'm like, yeah, I need to say something. And partially I felt personally responsible because I was in the piece with a couple other people, Meredith Whitaker and Gary Marcus, as skeptics. But the way we were presented, it's like, hey, you're in the skeptics box over there. Here's the big claims from OpenAI. And then there's the people who disagree. And it's like, but that gives the framing to OpenAI. And the question isn't what's the best way to build AI from my perspective? The question is, what is happening right now as people are using these collections of data and collections of money and collections
Starting point is 00:12:51 of computers and working in the space that's under-regulated? How do we educate the public as to what the public needs to know to resist that? And that article was not that. Yeah. And Meredith Whitaker, who you mentioned, by the way, previous guest on the show, was not that. Yeah. And Meredith Whitaker, who you mentioned, by the way, previous guest on the show, we talked about many of these issues with her. But let's pull apart some of the assumptions that you say that this piece and most media on AI sort of takes for granted that we need to question. You open by saying it presupposes that AI is a thing, even. That asking the question, okay, should we, it's like a science fiction question, should we teach AI to have human values presupposes a fact about AI, that AI is,
Starting point is 00:13:33 I don't know, like the Steven Spielberg movie, like some little child with super intelligence who we're going to teach. According to you, I guess it is not the case. What is the truth about whether, is AI a thing? Is AI a thing? Yeah. So I went to school in California in the 90s. Undergrad at Berkeley, grad student at Stanford, studying linguistics all the way through. But I was sort of around the computer scientists because there's this notion of cognitive sciences and we all fit in.
Starting point is 00:14:00 And the standing understanding then as a kind of a joke was as soon as you knew how to build something in a computer, that problem was no longer an AI problem. As soon as you could build a computer that could play chess, well, chess is no longer AI because we've understood how to do it. So we no longer even call it artificial intelligence. Is that what you mean? Yeah, exactly. So these things that were like AI problems, once they got solved, they sort of left the AI part of computer science and became just their own thing that was more specific. And I think part of what was going on there is that here's like, how could we possibly make a computer that could, you know, play chess? Oh, here's how we do it. Oh, okay. Well, it's just this program that works that way. That's not artificial intelligence. And I think what's happened in the last few years with
Starting point is 00:14:50 the large neural model, so it's called deep learning. You hear about large language models and large computer vision models is we've learned how to scale up the computation, right? So the hardware and the software that allow us to just deal with enormous amounts of text and image to the point that we can no longer look at and say, ah, I understand how it's doing what it's doing. And so now that we have computers that can generate coherent sounding text, that doesn't fall out of AI because we can't look at it and go, ah, but I see how it's doing it. So therefore it's not artificial intelligence. And so we just assume this mind in there. Yeah. Okay. I think I, this is making a lot of sense to me because it's true. I grew up in the nineties and we had deep
Starting point is 00:15:34 blue versus Gary Kasparov when the computer beat Gary Kasparov at chess. And it was like on the cover of news week every week, like our computers smarter than man, like is artificial intelligence here. Now no one talks about chess playing as like is artificial intelligence here now no one talks about chess playing as part of artificial intelligence like yeah it's just a video game like a video game can also beat you at super mario brothers you know like we can program a computer to play super mario brothers better than a human we're not like worried or transfixed about this it's just like yeah there's just a fucking algorithm that you write like no big deal and also critically we know that those algorithms are written by people.
Starting point is 00:16:06 Like the whole framing of Deep Blue at the time was an intentional sort of creating mystique around the computer. But like, I can buy a chess program that'll beat Garry Kasparov for 9.99 from the Apple store. It's not, and I don't think it's some super intelligence that I'm worried is gonna take over the world. I know it's just like, yeah, the people at chessgaming.com or whatever, like coded a good chess engine. And we know this. So to that degree, it's, it makes
Starting point is 00:16:35 it appear that as you're saying, we no longer call it AI. So AI, the way we've used it is always science fiction. It's always off in the future. We framed it as something mystical and magical. And once we understand something, we no longer call it AI, except that now these new models are weird enough that they stay in the mysterious zone, in the magical zone, in our minds. Like when you use one of these things
Starting point is 00:17:00 that generates massive amounts of text or images. Its output is random enough or unpredictable enough and taps into our understanding of enough. It looks like text to us. We're like, oh, there's a mind under there. But there still isn't. It's still just made by people. It's still just made by people.
Starting point is 00:17:22 And because it's landed in this magical zone and stayed there, then we fill in all these other magical things it might be doing. So I have something, I think you're going to enjoy this. There's a researcher in Italy named Stefano Quintarelli, and he's proposed a different name for AI because part of the problem is we call it AI. And so then we make a bunch of assumptions based on that, right? But he says we should instead call it systematic approaches to learning algorithms and machine inferences, abbreviated SELAMI. And then you ask questions like, will SELAMI develop some form of consciousness?
Starting point is 00:17:58 Will SELAMI have emotions? Can SELAMI acquire a personality similar to humans? So he's making the point that it's the word that we choose because we are calling this artificial intelligence as opposed to a big algorithm or something like that. We are ascribing it human qualities like, oh, should we teach it to love? Should we teach it human ethics? When in reality, we should be asking, hey, who the fuck is designing this thing? Which is what we would do if we had a different name for it. Yeah. And what's it built for? And how well does it work in that context that we're trying to use it? Right? Yeah. And
Starting point is 00:18:32 one of these that, so he's got this long list of questions. And the last one is, can you possibly fall in love with a salami? And I'm like, I know people who are really into cured meat, so maybe. And I'm like, I know people who are really into cured meat, so maybe. Yeah, but it makes a point about the absurdity of the way that we talk about artificial intelligence. And it's something that I've really noticed in the tech industry in the last 10 years that it has become even more enamored with science fiction than it was in the past. Like, I think what was remarkable about the tech industry from the 80s, 90s was that it actually was not mirroring science fiction. That I used to think it was remarkable that like Isaac Asimov did not predict the internet, right? Isaac Asimov predicted that there'd be one giant mainframe that you would ask questions and it would say, hello, Adam, I have the answer, right? But that's not what computers turned out to be.
Starting point is 00:19:26 But lately, we've had all these technologists who are trying to create specifically what was in the Isaac Asimov novels they read when they were kids, which means they're trying to create some spooky mystical brain that they can say, whoa, we're not in control of this thing about, even though they are in control of it.
Starting point is 00:19:43 The thing that surprises me about this is that, yes, I definitely feel, because I was a big science fiction nerd as a kid too. I loved reading it. But I feel like these technologists missed the point of a lot of those stories. That, you know, the stories were, like the heart of science fiction is not, gee, cool tech.
Starting point is 00:19:58 It's what happens to humans if we change this thing about the universe, right? And that's going to be true of all kinds of speculative fiction. And, you know, what can go wrong and what ways are humans resilient? And the sort of sci-fi inspired tech these days just seems to be, I want the gadgets from Star Trek and not like, I want to think about humans and how they interact with this stuff. Because so many stories from science fiction, including Star Trek are like, okay, we have these gadgets. Why might it be bad to have those gadgets?
Starting point is 00:20:26 What might happen? What consequences might arise from those gadgets? That's the fun part of science fiction, not just like, wow, there's a cool gun. What happens when you have a gun that that's that cool? And that's what these technologists do not seem interested in asking. But so, OK, you we've established how the way we talk about artificial intelligence is is essentially based on a fantasy. And, you know, folks like writers of The New York Times are not adequately diving into what is the reality of these systems and how they work.
Starting point is 00:20:55 So just talking about text generation, which is one of the big new frontiers of AI. There's this model GPT-3, which is able to produce extremely convincing text. And even before this was released, you were seeing articles that were like, this is going to destroy writing because it's just going to replace writers. I know people, you know, in the Writers Guild, of which I'm a member, that we had members who were worried, what if AI comes and takes our screenwriting jobs? Which is a huge leap to think that, but that was literally what the articles were telling them. So what is the reality of what, you know,
Starting point is 00:21:30 these text generation systems actually are that is more true than the fantasy we're being sold? Yeah. So I'm going to give you a little linguistics lesson here. Bear with me. No, are you kidding? Bear with you? That is what this podcast is all about. That is what we're excited for.
Starting point is 00:21:43 Give me the linguistics lesson. Okay, here's, this is not all of linguistics. This is focused on sort of syntax, semantics, pragmatics, which is part of linguistics. But a key thing about language is that it's what the Saussure called a system of signs. So there's the signifier, the actual form of what you say, and then there's the meaning, what you convey by saying it. So I'm going to give you an example. So the form could be the words I'm saying to you. It could be a bunch of letters arranged so that they make English words because you and I both speak English, right? If we were fluent users of American sign language, it could be gestures with our hands and face and posture, right? All of that is form.
Starting point is 00:22:20 If I put some linguistic form somewhere where somebody who shares that linguistic system can perceive it, then because they share that system, they can pick out what the sort of standing or standard or literal meaning of that phrase is. So for example, I'm going to tell you about a tweet that I wrote that had the words, I'd like more people to know about at sign images of AI, more people to know about, at sign, images of AI, just saying, and then a URL. All right. And you're an English speaker too. So you get that and you're like, okay, somebody said this and they are a person who has this desire about other people and et cetera. You were able to unpack that. Yes. If you have a little bit of shared context with me, you can figure out what I was trying to do with it by using that meaning. Right. Oh, she's probably trying to get people to click on this link and see what this better images of AI dot org thing is. By the way, it's super cool. It's a bunch of artists who have come up with better ways to represent what's actually happening with AI than those like electronic brain pictures we're constantly seeing. All right.
Starting point is 00:23:26 So the illustrations in news articles can tend to be a little bit first thought, I will admit. Yes. Please go on. So just taking you through the form and then the sort of literal meaning and then the speaker's intent, which is also the public commitments, right? When I've gone on and said that thing in a tweet, then I am publicly committed to that content that people can figure out based on the context. The next step is called prolocutionary consequences, all right? And that's what happens in the world because I said that.
Starting point is 00:23:54 Did I change someone's belief? Did I get somebody to click on that image? And so on. Right. But that's sort of beyond my public commitments. Right. If you shared a little bit more context with me, like my colleague who DM'd me did, well, let me back up. So there's that just saying there, and we share the context of this is Twitter. You might've guessed that that was a subtweet because of the just saying, right? Uh-huh. Uh-huh. Not publicly committed to that being a subtweet, right? It's sort of a
Starting point is 00:24:19 subtle, like further reasoning thing, but- Right. share- It's a dog whistle to people who know, who share even more context with you. Exactly. Who might go, oh, I know what Emily's thinking about this. Yeah. So if you share that context and you know that I've also just been tweeting about the Seattle Times op-ed that I put out where there was some of the awful AI art associated with it,
Starting point is 00:24:40 and you might've been able to pick up, oh, okay, she's actually subtweeting the Seattle Times as my colleague did. And he sent me a DM and said, did you just gently subtweet the Seattle Times? Like, yes, I did. But anyways, that's how people use language, right? We have this shared code that consists of form and its literal meaning. And when I speak, I am not handing you literal meanings. I am working to convey some communicative intent, and I'm giving you a big clue to it by the words that I chose to say. GPT-3 is doing none of that. GPT-3 only has the form part, and it's got lots and lots and lots of it. So it knows sort of which forms are plausible combinations, but no connection to any ideas about the world, no connection to the
Starting point is 00:25:27 literal meaning, and absolutely no speaker intent. It's not trying to communicate anything. Yeah. My understanding of the way that an algorithm like GPT-3 works is that it just has a huge body of text that it has hoovered up from the internet, from, I don't know, fan fiction websites are one that like these things seem to get a lot from, right? Just like, okay, we've got tens of thousands of pages and pages of texts. And so then it just has basically just a little internal table of like, when this word appears, this word is more likely to follow it. And then when that word appears, that word is more likely to follow it, but more complicated than that, but basically that. And then when it starts with one word, it like says, okay, what word follows that?
Starting point is 00:26:09 What word follows that? What word follows that? What word follows that? And it's able to, by doing a much more sophisticated version of what I just said, output text that to us sounds sensical. But apart from that, it's just sort of trying to like, hey, I can make something that resembles my input text statistically. And that's what it's doing. Am I generally right? You're generally right, yeah.
Starting point is 00:26:29 I mean, when you say it's more complicated than that, yes, it's more complicated because it's able to take into account lots and lots and lots of preceding words. So it's not just like given the last two words, what's the most plausible next one? But you can get it to be like stylistically coherent because it's dealing with a much larger window, but it's basically that. And it's just like, which string of letters is a likely one to come next after the 50 previous strings of letters. And yeah, there's no, there's no there, there. It is literally, it's not even making up ideas. It's making up strings of letters. Yeah. And it only becomes an idea when somebody reads it and tries to make sense of it. Yeah. And so the question is, why would we even
Starting point is 00:27:13 call that artificial intelligence? I mean, there's nothing intelligent about that. There's no understanding. There's no purposeful construction of a sentence. There's no idea, as you say, that's trying to be conveyed. It's just like, hey, it's randomness that has been shaped to resemble human writing. But why would we even label that as intelligence when you talk about it that way? Yeah. I think it's because we are so easily taken in because we are so good at interpreting language. Like as soon as we come across language, you know, something we hear, something we see in a language we speak,
Starting point is 00:27:54 without like, we can't help but imagine a mind behind it. Yeah. And so it is super easy to get taken in. So this is the like, resist the urge to be impressed. This is where the urge is coming from. You know what this is? Okay. Tell me if you like this metaphor. Say, you know, everyone knows the
Starting point is 00:28:08 infinite room with monkeys on typewriters, right? There's an infinite room. You got monkeys. They're all hitting typewriters randomly. And some of the monkeys through statistics are going to like write a Shakespearean sonnet or whatever, right? Now, what if you could create an algorithm in the infinite room that anytime a monkey typed a character that was not part of Shakespeare, a bullet goes through the monkey's brain and the monkey drops dead and they stop typing, right? And then you walk in and you pick out, okay, all these monkeys are dead.
Starting point is 00:28:34 Here's, I'm sorry that it's so violent. This is just where, this is on the top of my head and this is where my mind is going. I'm very sorry. I don't endorse violence against monkeys. But you, so you go in and you're like, all right, I'm gonna find the alive monkeys. Oh, this one, you know, this one happens to resemble Shakespeare a lot. And
Starting point is 00:28:48 you hand that to someone, it would be ridiculous to say, oh, well, that monkey is intelligent, right? Because you have algorithmically sorted the randomness in such a way that you've produced an output that we find sensical. We, we would understand that that is just the output of a, of a dumb algorithm, but because you're right, because we have this ability to, it triggers in our brains what makes us recognize humanity in someone else, right? Which is the production of like sensical language. We are ascribing intelligence
Starting point is 00:29:19 to what is not fundamentally intelligent. Am I way off base? I'm sorry again for the violence against the monkeys. I love monkeys. You have a pained expression. I'm really sorry. So the difference between the monkeys and something like GPT-3,
Starting point is 00:29:33 aside from the fact that the monkeys are something that we should care for the well-being of and GPT-3 doesn't matter, right? Is that the monkeys in that scenario are just sort of randomly, sometimes touching the keyboard, right? Like if you actually picture the monkeys in the keyboard, it's like most of the time, they of randomly sometimes touching the keyboard, right? Like if you actually picture the monkeys in the keyboard, it's like most of the time they're not going to be typing, right? Yeah.
Starting point is 00:29:50 Yeah. What's motivating them to type? Yeah. Maybe they get bananas when they've hit a certain number of keys, right? Yeah. So mostly they're not typing. Sometimes they do. And if you have enough monkeys and enough time, then eventually some Shakespearean sonnet will come out. The difference there is there's no
Starting point is 00:30:08 selection, right? The monkeys are typing randomly. In fact, chances are you're going to get more hits at the center of the keyboard. Like there's going to be something ergonomic that's going to determine what comes up. With GPT-3, the system for developing it is it is fed lots and lots and lots of text. And at each step, it says, OK, what's my guess as to what the next word is? Yeah. OK, what actually was the next word? OK, I'm going to go back and I'm going to adjust some of the things in my internal representation so that next time I'm more likely to get that one right. Yeah. No gunshots required, but sort of through that process, GPT-3 has been shaped into something that comes up with coherent stuff most of the time, as opposed to the monkeys.
Starting point is 00:30:53 Well, that was my idea for an algorithm that selected them by shooting them in the head. But you know what? It's a bad metaphor. So we don't need to pick apart how stupid what I'm saying is, because I'm, let's be honest, not intelligent either. So we don't need to pick apart how stupid what I'm saying is, because I'm, let's be honest, not intelligent either. But, you know, the point being that, like, this is not fundamentally an intelligent system in the way that we think it is. Yet we are describing it as though it is and we need to teach it how to be ethical and all this whatnot when we should be asking, wait, who created this system? Who is the human in charge of it? And what is it for? What this reminds me of the most is, and I might've ranted about this on the show before,
Starting point is 00:31:32 but a couple of years ago, there was a fad for these things where it was like, I taught an AI to write a Seinfeld episode. And then there would be like this text that was like semi-nonsensical. And I would look at this as someone who knows a little bit about computers and a lot about comedy writing. And I would look at this and go, a human fucking wrote this.
Starting point is 00:31:52 Like there was probably an AI somewhere that was generating text, but a human chose which bits of output were funniest to them and put them in an order. There's no possible way that this was just wholly generated by an AI, but everybody retweeted these things like crazy. They published books of them because everyone loved the fantasy that there was an AI out there that was like kind of smart, but kind of stupid trying to write, you know, Seinfeld. That was a better story
Starting point is 00:32:13 than the fact that there was a human who was behind this, who figured out, hey, I can sell my book proposal if I say an AI did it, right? Now that doesn't hurt anybody for us to maintain that fantasy of the funny AI that can write Seinfeld. But what about when you're talking about,
Starting point is 00:32:29 you know, I don't know, you're talking about the military industrial complex and what it is going to do with AIs, or when you're talking about the media industry and what it is trying to do with AIs. Does that sound more like of a correct take to you than my monkey son typewriter's dog shit. Yes. So there's two things I want to say. One is that with the, a human wrote that, and with respect to the Seinfeld episode, there's two places the humans came in, right? Yes. It's clearly a human was going through and cherry picking and getting, you know, what are the stretches that I'm going to use and what's in order to put them in and so on. But also, humans wrote the original Seinfeld scripts that the thing was trained on. Right. Right. So it's basically just like a, let's, let's throw that stuff in a blender,
Starting point is 00:33:11 give the human something to play with. And yeah, it'd probably be a pretty crappy episode of Seinfeld. Like it wouldn't, it wouldn't actually work, but it's fun to imagine. It sounds like Seinfeld because guess what? It's remixed Seinfeld. Yeah. Right. This is like, this is like making magnetic poetry on your fridge and then saying, wow, my fridge wrote a poem. Like, no, you did. And the people who made magnetic poetry wrote a poem together. Right. Yeah, exactly. But there was something else I was going to go with that about the, oh, so you said, yeah, military industrial complex and so on, but also all of these places that people want to use it where there are problems. So there is a lack of healthcare professionals in this country, especially around mental healthcare, and more people who need access to counseling and mental health services.
Starting point is 00:33:58 That's a problem. That's a problem we should be throwing resources at. We should be making sure that the people doing that work are fully supported and that it's accessible and that it's culturally appropriate and on and on and on. But people look at that lack and they say, gosh, wouldn't it be nice if we could somehow automate this process and reduce the workload on these people and give other people access to the healthcare that they need? And there's this tripping point that I call the machine learning tech solutionism danger zone.
Starting point is 00:34:29 All right. And the idea here is that it would be great if we could start from inputs X and get outputs Y. And the problem with machine learning is that you can make something that looks like it's doing that. So my example here that I sort of originally came to the thought on was these mental health apps that supposedly diagnose disorders based on the sound of your voice.
Starting point is 00:34:53 And it's like, okay, I could imagine that that would be useful. There are contexts where it would help the healthcare system. Things that purport to do that yes things that actually do that no right and and the problem is that you can certainly write a program that takes as input okay i'm going to record a bit of your voice and gives us output you know um no issues or ptsd or you know schizophrenia or depression right and like those are the possible outputs and it'll be right some of the time there'll be times where we don't have a way to verify whether or not it's right. And we have some reason we want to believe it. And it's, this is why we have to be skeptical, right? We have to say when, when someone says,
Starting point is 00:35:34 I built a machine that does this, it's like, how do you know, how does it work on the inside? How did you evaluate it? How is it going to fit into some use case where people are going to use it? Yeah. How would, how would such a thing even begin to work? I mean, is it just taking voice samples of people who have schizophrenia, people who do not have schizophrenia and saying, well, we're going to compare all their voices and, you know, and based on how much the voice input matches, you know, the statistical breakdown of voices with schizophrenia versus voices who don't, we're going to say your voice either falls in one category or the other?
Starting point is 00:36:09 Yeah, basically. That's so stupid. Why would that, why would anyone think that that would have anything to do with? So there is one little kernel in there, which is, I guess there's some research showing that people who are diagnosed with, let's say, depression, it affects the way they speak. If you think about it, someone can sound depressed, right? So yeah, there might be some signal there, but is it robust enough to do what these apps are claiming? No, right? And even if it were, how do we know, right? I'm reminded also, in the really early days of the pandemic, everyone was scrambling to figure out how do we tell cheaply who's got COVID? Like the real need there. Right. Yeah.
Starting point is 00:36:51 And so there's a bunch of sort of things that popped up out of machine learning labs where it's like, okay, record your voice for science. And we're going to develop a system that can classify you as having COVID or not based on the sound of your coughing or the sound of your breathing or whatever. Yeah. Right? And I'm like, where is the training and evaluation data coming from for this? Because the whole point is we don't know who has it. Yeah. And it can be transmitted when people are asymptomatic. So you've got someone calling, recording their voice or their cough as, you know, non-COVID when they actually,, like it doesn't make any sense at all. But the way these systems work, it's, okay, data in,
Starting point is 00:37:27 magic inside the black box, labels out. Yeah. And see, it fits into that need. So therefore we're going to say that it works and not do the thing we need to do, which is figure out how to allocate our societal resources to these issues so that we can actually work on real solutions. Yeah.
Starting point is 00:37:50 I mean, and at the worst case, this like, say, you know, I can imagine someone downloading this app, right? Talking into their thing and talking to their phone. And then the app is like, you have depression, right? And people taking that way too seriously, because like, it's one thing to say, hey, I trained an algorithm to like detect when people are speaking a little bit more slowly than other people, which is like how any of us would listen to someone's voice. That person sounds a little depressed, sounds a little low energy, you know. But like if I were to hear you speak and I would say, you sound a little depressed because as a human, I recognize the differences in voices that are depressed versus are not. You'd go like, oh, okay, thank you, Adam. Like, sure, you're just some dude, right? You're some guy who's like applying a very rough heuristic.
Starting point is 00:38:32 But if you say, I have developed an AI that can detect the difference between depressed voices and not depressed voices, people are going to download that and say, the superhuman AI has told me I have this. And I've seen this happen in my life. Like I literally have had friends say, oh my God, TikTok says I have ADD. Like literally people say this to me. I mean, the algorithm knows me so well. It must know I have ADD.
Starting point is 00:38:57 I gotta go to a doctor. Literally I've had this conversation with friends. And like that is where, I mean, TikTok is one, let's not get into TikTok, whole other topic, but like when you're leveraging our faith in AI and misrepresenting the power of what it can do, because you know, people will believe in it more than they should in order to get them to behave differently. That is what's unethical. We don't need to teach the AIs to be more ethical. We need to teach humans to stop doing that shit. Yeah. And you said worst case.
Starting point is 00:39:25 I've got a worse case for you than that. Oh, please. Wait, wait. Hold on. We got to go to break. That's a perfect tease to get us over the break. We'll be right back to hear the even worse case with Emily Bender. I don't know anything Okay, we're back with Emily Bender, where you gave us a delicious tease.
Starting point is 00:39:53 You have an even worse case. Lay it on me. I want to hear what it is. All right, so your worst case was someone picks up the phone and they get a misconception of what's going on in their own mind. It is really easy to get recordings of other people, right? My worst case is who is going to take these supposedly superhuman AI objective diagnoses and use them to make decisions about child custody, to make decisions about
Starting point is 00:40:16 employment, to make decisions about parole, all of these things that are really impactful to people's lives based on technology that they don't understand. And if they did understand, they would know it was BS and doesn't work. Yeah. And technology that the humans who have created the technology are decrying responsibility, denying their own responsibility in the system and saying, oh, no, the AI did it. Like, I have a very good example of this from my own life. I have a new show out on Netflix. Congrats. And thank you very much. Very happy about it. life. I have a new show out on Netflix. Congrats.
Starting point is 00:40:45 And thank you very much. Very happy about it. It's called The G Word. Please go watch. But one of my complaints about Netflix, and it's not exclusive to Netflix, it's happening all across the entertainment industry now, is that the shows are delivered to you via algorithm, right?
Starting point is 00:40:59 Like it recommends shows to you based on the algorithm. And when you talk to people at Netflix or these other companies, and you'll say, why didn't people watch this show or that show? They'll say, oh, based on the algorithm. And when you talk to people at Netflix or these other companies, and you'll say, why didn't people watch this show or that show? They'll say, oh, well, the algorithm figured out that people didn't like the show. And so it failed because the algorithm didn't give it to them because the algorithm knew that other people didn't like it.
Starting point is 00:41:18 Therefore, the algorithm, you know, like it's, it's the algorithm knows something about the show that like you, a stupid human don't know. And I'm sitting there going, look, all the algorithm does is do what does what fucking Amazon's algorithm does. It says people who watch this also watch this. If you like this, you might like that. And it shows it to them. It's extremely simple. It's very dumb. And if you have any sort of critical thinking towards it, you know, it doesn't work very well, but the people who created it will tell you, oh, we have nothing to do with it. We don't decide the winners and losers.
Starting point is 00:41:49 We don't choose what is the biggest image that's at the top. When you open your Netflix and you see that giant banner at the top, we don't pick that image. The algorithm does. And that's transparently bullshit. People designed the algorithm.
Starting point is 00:42:00 People are making choices. They have a whole building full of people who are paid very well, who are very clearly making decisions there. but they will claim that they do not. And now that's, again, that's just Netflix. Who gives a shit? No big deal. But when you're talking about someone using a system like that to determine who gets parole, that's a real problem. Yeah. Yeah. And I think it's arguably a problem on Netflix as well, because if we basically say we are going to design an
Starting point is 00:42:25 algorithm and just let it go, right? So it's absolutely decisions going into the algorithm, but then that influences the societal conversation, right? Who's watching what, like which thing's getting watched? Let's say there's this fantastic documentary. Oh, one of your previous guests, Glenn Weil, I think was talking up the digital minister of Taiwan and how everyone should watch a documentary about her life. Let's say he got his wish and that documentary exists, but it's on Netflix and it didn't get promoted and the things near to it didn't get promoted. So it ended up sort of in this, like you really had to look for it whole and it didn't get seen, right? Versus a world in which someone said, hey, I want to make sure people see this. I want to make sure it's discoverable.
Starting point is 00:43:04 And then we're talking about it more. So yeah, I mean, in terms of like the immediate consequences to the direct people involved, probably the parole decision is a much more impactful one. But what, you know, when we have these recommender systems that are pushing content to people and making it harder, making you have to go out of your way to find other content, that absolutely affects our societal milieu. Yeah. Yeah, because people are, and the fact that humans
Starting point is 00:43:31 are not taking responsibility for that, they're not taking responsibility for determining the input of what is actually being put into the system or taking responsibility for the fact that they set up the algorithm, that they determined what it is. And by the way,
Starting point is 00:43:44 Netflix doesn't even call that AI, nor does TikTok call their algorithm AI. But people ascribe those algorithms magical properties when in reality they seem the TikTok algorithm is not that good. It just shows me videos that are popular. It figures out when people watch the first 10 seconds and don't click away till the end. And then it shows those videos to other people. That's it. But people believe that it has some deep insight into their character.
Starting point is 00:44:07 So, okay. Well, let's see, you recently wrote a paper. I want to make sure I ask about specific work rather than just my own beefs about AI, about stochastic parrots. I want to understand what that means. Can you explain it to me? Yeah, so stochastic parrots is this phrase
Starting point is 00:44:22 that we've made up to try to capture what GPT-3 and the like are doing. So, stochastic means random or probabilistically guided. And the idea with the parrot metaphor, parrots are actually very intelligent. I'm reliably informed they're super smart birds. And it's also possible that some of them, when they imitate human speech, are doing that because they've learned that they can get a particular effect with a particular sequence of sounds. But I'm pretty sure that they're not actually using language the way humans use language. Correct. The phrase stochastic parrots is, that's all these things are. These things that you want to call AI, they are just randomly saying things that sound good to us.
Starting point is 00:45:02 I think part of the reason parrot is so great is that that Parrots are fantastic on the sound part of it. Parrots do an amazing job coming up with sounds that sound like human spoken languages. Yes. And GPT-3 does an amazing job coming up with pretty large swaths of coherent seeming text. Like that's, it's a nice facsimile. And so that's what that phrase was about.
Starting point is 00:45:25 But that paper was written together with Dr. Timnit Gebru and Dr. Margaret Mitchell and some of their other colleagues at Google and my PhD student, Angelina McMillan Major. And it started because, so this is 2020, and all of the big tech companies are sort of pushing to larger and larger language models. That hasn't stopped, right? There's new ones coming out all the time. And Dr. Gebru was saying, you know, maybe we should think about what the consequences are here, right? Together with Dr. Mitchell, they were the leads of the ethical AI team at Google. So, like, their job was to sort of help guide the company to do good things. And so they sort of said, hey, let's think about this.
Starting point is 00:46:06 And so she DMed me, Dr. Gabriel, on Twitter and said, has anyone written any papers about this? And, you know, I said, no, not that I can think of. Why? She said, well, I want to be able to point to something to sort of like help make the case here internally. And I've been pointing people at your tweet threads. I'm like, well, that's kind of cool. And then the next day I said, well, okay, I don't know any papers, but here's these like six things that I can see of as problems. Hang on, that feels like the outline of a paper. Do you want to write this paper together? And she said, wow, yeah. Well,
Starting point is 00:46:40 I'm pretty busy and you're pretty busy, but maybe we can make this happen. This was early September of 2020. And she pulled in four other people from her team and I pulled in my PhD student. And a month later we had submitted to the conference. And yeah, it was a whirlwind. None of us had planned on doing that in the month of September. So it was definitely like a squish other work out of the way, get this done kind of a thing. But because we had seven co-authors who had all been reading kind of broadly in this area, but different parts of it, we were able to pull together this argument about like, okay, what are the environmental considerations? What about the way that it's learning, you know, really awful biases from the text that it's trained on? And then the stochastic parrots part of it is, okay, so what happens if people believe
Starting point is 00:47:26 that that's a person talking? Yeah. Right? And so, you know, we're pretty solid paper, happy to submit it. And then while we were waiting for the conference to review it, someone at Google got very upset about it and ended up firing two of my co-authors.
Starting point is 00:47:44 Really? Firing Timnit Gebru, That was a famous firing. It made headlines that she was fired. Why would it have angered Google? I mean, she was employed by Google as an AI researcher? Yeah. AI ethics specifically. Writing papers like this was literally her job. Yeah. And so why would that anger Google? So I don't know. And I sort of only have the outsider knowledge of this. And the thing that surprised me is that we thought we would be ruffling feathers at OpenAI because we were talking about GPT-3 as sort of like the main example of this. Yeah.
Starting point is 00:48:22 Google already had something called BERT that's a bit older. They were working towards Lambda, though I don't know that any of us on the team knew that. I certainly didn't. And I guess it was just a little bit too directly to the heart of, hey, this stuff that you're betting everything on might not be a good idea. But the funny thing is, the initial ask from Google was retract the paper or take the Google co-authors off of it. So there was this moment where my PhD student and I were saying, OK, this would be really strange to put a paper out there that's really the work of seven people with only our two names on it. But let's ask them what they want to do. And everyone on the Google side said, our co-authors, we want this out in the world.
Starting point is 00:49:03 Go for it. Like, publish the paper and we'll figure out how to put something in the acknowledgements that says it's not just our work. Yeah. And then Dr. Gabriel was like, you know, this isn't right. They haven't told me why they don't want this paper published. So she pushed back and in that interaction ended up getting fired. But one of the things about it was you'd think if Google's goal was to either have this paper not come out or be unknown or not be associated with Google,
Starting point is 00:49:34 they massively failed. Yeah. No, I mean, it made headlines that an AI ethics researcher was fired for writing a paper about AI ethics. I remember when it happened. And it certainly drew more attention to this issue. This is what people call the Streisand effect. I'm not a fan of that phrase. I find it a little glib, but it is what happened. Yeah.
Starting point is 00:49:56 Yeah, absolutely. I mean, I have not ever had a piece of my writing be so widely read as that paper. And so then there was, you know, that whole firing went down while we were still waiting to hear back from the conference if it was accepted or not. Fortunately, the reviews had been done. So it was actually reviewed anonymously, which is good. That's how it's supposed to work. Peer review, in our field at least, you're supposed to review the paper not knowing who wrote it so that you're not making a decision based on like, okay, do I think so-and-so has good ideas? But do I think these are good ideas?
Starting point is 00:50:25 Yeah. Right? So the paper was reviewed anonymously, but then the sort of – and it had positive enough reviews that it would have gotten in no matter what. But then this whole thing blew up. And then we are like doing the revisions before the version that actually gets published. It's like, wow, we're going to have to put really fine polish on this paper because everybody's going to read it. Well, I mean, it's not a bad, it's not a bad end result. I mean, for the people who lost their jobs, that's bad. But, you know, to have so much attention given to this work, that is good. I'm really interested though, and apologies if I'm being too reductive about the point of your work,
Starting point is 00:51:04 because I have not read this particular paper, although I'm excited to go read it now. But the comparison to parrots is really interesting to me because the fact is that I agree with you that I think parrots are, you know, they're not using language, they're reproducing sounds, and they're smart enough to associate a particular set of sounds with perhaps a certain input or output that like, oh, if when I see X, I make sound Y, thing Z that I like happens. I get a cola nut or whatever it is that, you know, Alex, the famous gray parrot, would like. But language is just so deeply hardwired inside of us as this thing that is connected to intelligence that when we see the parrot do that, we're like, oh, the parrot is intelligent. It's almost impossible for us to not treat the parrot as, you know, an equal mind when it is speaking to us this way.
Starting point is 00:51:58 I've seen TikToks. I'm sorry, TikTok has come up so much. But I've seen TikToks of people like, you know, with their TikTok, with their parrot training videos of like, look, my parrot is, my parrot said, I love you. And the parrot really loves me. And I'm like, I don't think the parrot loves you in the way that you are saying it does, but it's almost impossible for you to not take it that way because language is so deeply encoded in you. And so that seems to me to be a great comparison to what we're doing with GPT-3. I mean, our own innate desire to ascribe intelligence to language is like bottomless. Yeah. Yeah. And it's problematic the other way too. So I don't know if you've had the experience of traveling to another country or culture where you don't speak the language, and it is really
Starting point is 00:52:44 hard to get yourself taken seriously as an intelligent being if you can't speak the language, right? And it's also really important on the other side of that, like as we are talking with someone in the, you know, English is my first language, sounds like it's your first language. It is a second language for many, many people, right? English is weird in many ways, and that's one of them. And to remember,
Starting point is 00:53:05 to like push back against that notion when someone's looking for their words and they have an accent that makes it clear that English is not their first language, to remember, okay, that means this person has a whole other language that they're not using. And I need to remember that as I'm thinking about it. And we have deep prejudices about language both ways, language being associated with intelligence, that it's really important to keep an eye on. So the parrots are intelligent in a way that GPT-3 is not, right? Yes. And dogs, too.
Starting point is 00:53:34 So I have this friend who's got these little buttons that you can use to train dogs. Oh, I've seen these. Yeah. And the dog hits the button and the dog goes like, walk now or whatever. It's like, oh, the dog is speaking to me. Yeah, because the buttons have a little talk box on them. Yeah, and you record your own voice in it. So my friend recorded, I think, walk and meal and treat and put them there for her dog.
Starting point is 00:53:56 And then after a while, the dog's just like, treat, treat, treat. Yes, and that's amazing. And that, by the way, is why my girlfriend will not let me get those buttons for our dog, even though I really want to get them. But I have also seen people who put out like 40 buttons, and they say water, now, tomorrow, yesterday. And then the people will sort of cherry pick little inputs of the dog going like, um, water now. And they'll go, no, we're not going to go on a walk to the beach today. We just did that yesterday. And then the dog is like, water yesterday. And they're like, yes. Oh,
Starting point is 00:54:31 you have a sense of past and future. I'm like, no, this dog doesn't. The dog is hitting stuff and it looks very intentional because the dog is smart enough to know that pressing the button causes a response. But the dog doesn't have these concepts. The dog is, you are, your brain is doing a wonderful thing, which is creating meaning out of semi-random inputs, right? Yeah, yeah, exactly. And noticing those ones and getting excited about it, and then the rest of it's just noise that we don't care about, right? So the cherry picking, when you're publishing work in this area,
Starting point is 00:55:02 if you want to not cherry pick, you have to be very intentional about it from the beginning. I am going to run 10 instances and I'm going to print all 10. If you don't make that rule for yourself, it is really easy to like, okay, I'm playing with it. Oh, I'll put this one in the paper. Not even thinking I'm cherry picking. It's like, oh, that one's fun. Yeah.
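To make that "run 10, print all 10" protocol concrete, here is a minimal Python sketch of the rule Bender describes: decide the number of samples up front, fix the random seed, and report every output, interesting or boring. The generator below is a made-up stand-in, not GPT-3 or any real model; only the reporting discipline is the point.

```python
# A minimal sketch of a "no cherry-picking" protocol: decide N up front,
# fix the random seed, and report every sample -- not just the ones we like.
import random

def fake_generate(prompt: str, rng: random.Random) -> str:
    """Stand-in for a real text generator (hypothetical; replace with a model call)."""
    fillers = ["the castle gate creaks open", "a parrot repeats your words",
               "nothing happens", "you find a button labeled 'treat'"]
    return f"{prompt} ... {rng.choice(fillers)}"

def run_protocol(prompt: str, n: int = 10, seed: int = 0) -> list[str]:
    rng = random.Random(seed)           # fixed seed so the run is reproducible
    samples = [fake_generate(prompt, rng) for _ in range(n)]
    for i, s in enumerate(samples, 1):  # print *all* n outputs, good and bad
        print(f"[{i}/{n}] {s}")
    return samples

run_protocol("You are a wizard standing before a castle.")
```

Swapping in a real model call would not change the protocol: the commitment made before generating anything is what guards against cherry-picking.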
Starting point is 00:55:48 Yeah. Yeah. But this impulse that we have, or this ability that we have to see meaning, and other minds, in places, it also strikes me as like a very beautiful thing about humans that we do that. And like it seems like it could be positive regarding AI. Like, I'm sure you're very familiar with it. There's like an AI game called AI Dungeon, which takes text from all different pieces of fan fiction and et cetera, et cetera. And it creates a text game where it says like, you are a wizard. What do you want to do? You're in front of a castle and you say, go into the castle. And it's like, okay,
Starting point is 00:55:55 you go into the castle. You see X, Y, Z there. You can play it like a text adventure, or you can write, I want to fly away in a flying saucer and go meet Donald Trump. And it'll be like, okay, you get in the flying saucer and you fly to the moon and Donald Trump is there.
Starting point is 00:56:08 And he says, you know, hello. And you're like, and it's just doing this. It's just generating text. But when you do it, your own brain's ability to turn the output of the pretty simple GPT-3 algorithm into a story that you find fun is what is making it interesting, right? It's like your own process, your own imagination, your own meaning ascribing thing. And that strikes me as, and people play this game for hours. They pay for it. You know, they enjoy it. They use it to help them, you know, run role-playing sessions. It's like just a very cool tool and game. And that to me sounds like a wonderful use of this kind of technology because
Starting point is 00:56:45 it embraces what we bring to the table rather than pretending like what we bring to the table is actually in the AI somewhere. Yeah. Yeah. No, and I really love that because it's a scenario where it's beneficial to have something that is sending you on topic, coherent seeming, but ungrounded text for you to react to and play with. So it's a really good use case for that technology. Yeah. As opposed to something like, we're going to try to use this to replace search engines. Where it's like, no, no, no. I want my search engine, first of all, to be grounded in actual sources that I can go click through and look at. Yeah. Instead of just making stuff up. And secondly, I also don't want like that, that sort of dialogue setup sounds like it's
Starting point is 00:57:25 really useful in that game in a way that it's not so useful in a search engine, where it would sort of narrow down what I can do if I have to like ask questions one at a time, as opposed to sort of frame my query and then get back a bunch of links that I can poke around and explore. Yeah. Wait, so hold on a second. Is replacing search engines with this kind of technology, is that something like Google wants to do? They're talking about it. Yeah. Wow. So instead of, because I like the fact when I search the internet, the algorithm is pretty simple. I know how it works. Google used to be very public about here's how the page rank algorithm works.
Starting point is 00:58:00 And I get the sense that I'm searching a database and I can adjust my search query if I want to find something in particular. I put Adam Conover in quotation marks. So I don't see all those appearances where it's Adam and something else. Um, but if I'm not able to do that anymore, if instead I have to ask an AI questions and it's just going to give, like when I say, when I Google Adam Conover age, I want to see a bunch of sources come up that I can pick and choose between. I don't just want the AI to go, Adam Conover is 32. And like, God knows where it got that information. And how do I check whether or not it's reliable? That seems like a real issue. Yes, absolutely. Absolutely. And so there's this paper by some researchers at Google,
Starting point is 00:58:40 the first author's name is Metzler, that was sort of laying out a research program for how to do this. And they laid out a bunch of problems that needed solving. And one of them was the sort of source problem. So we're giving you some information. It's not actually information, it's text strings that we turn into information, as you say, but they're calling it information. But because it's synthesized out of lots and lots of web pages, it's no longer sort of a first class piece of information, what web page that came from. And so their proposed solution to this was, okay, well, we will also train it to generate URLs. And I'm like, that's backwards. You want to avoid separating the information from its source
Starting point is 00:59:18 rather than trying to stick sources back onto information, right? Yeah. But like, like, okay. Currently, if I Google, um, this is actually a pretty good example, um, where, uh, there, there is a, uh, problem that I have on Wikipedia, which is that it has my, uh, birth date on there. Um, but there is no source for the birth date. Right. Um, but if you go, if you want to know how old I am, you can go look at Wikipedia and say, oh, there's no source listed there. There's just like a random date there.
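As a toy illustration of the point Bender just made about keeping text and its provenance together, here is a small Python sketch contrasting retrieval, where each snippet arrives with the page it came from, against generation, where both the answer and any URL attached to it are just produced strings. The index, URLs, and answer below are invented placeholders, not anything from a real system.

```python
# A toy contrast between retrieval (the source travels with the text) and
# generation (text first, a source bolted on afterward). All data here is made up.
from dataclasses import dataclass

@dataclass
class Passage:
    text: str
    url: str   # provenance is a first-class field that travels with the snippet

TOY_INDEX = [
    Passage("Example article about a comedian's TV show.", "https://example.org/article-1"),
    Passage("Example fan wiki page with an unsourced birth date.", "https://example.org/wiki-page"),
]

def retrieve(query: str) -> list[Passage]:
    # naive keyword match -- stands in for a ranking algorithm like PageRank
    return [p for p in TOY_INDEX if any(w in p.text.lower() for w in query.lower().split())]

def generate_answer(query: str) -> tuple[str, str]:
    # stands in for a language model: fluent text, then a URL generated the same way
    return ("The answer is 32.", "https://example.org/looks-plausible")  # neither is grounded

print(retrieve("unsourced birth date"))   # snippets come back attached to their pages
print(generate_answer("how old is ..."))  # a string plus a URL-shaped string
```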
Starting point is 00:59:48 And I don't edit it because I don't edit my Wikipedia page because that's against the rules. But like, you can go look at that. And if you go look at how old is Adam Conover on Google, it'll lead you to Wikipedia and you can go there and you can see that. But if it is now just chewing up the information and giving
Starting point is 01:00:05 you that output and then generating some URL to some other page, that's like, oh, well, here's a place you can go to verify that. You no longer have that like chain of custody, that chain of evidence that lets you know, hey, this is where the information came from. That lets you go find that, oh, citation needed on Wikipedia or whatever else it is. Because I understand as a researcher how Wikipedia works. I don't understand how this algorithm works. Right. Right. And, and, or what it was trained on, what were those source things, right? Yes. And, you know, when you were using this example of, you know, how old is Adam Conover? Well, how old is name is the kind of question that gets an answer between, you know, values between like zero and 120, right? Or, you know, what's the
Starting point is 01:00:47 phone number I can use to reach whatever? Pretty sure GPT-3 could come up with a phone number looking string of digits, right? No, literally you could probably open up GPT-3 and say, today I called Adam Conover on the phone at the number. And then it would just fill in some numbers because it would know that that's what a phone number looks like. But that wouldn't be the real phone number, but it would look like it would be. Yeah. Yeah. Sometimes. So there's some people actually at Google, a researcher named Carlini and colleagues, who figured out that they could like poke at these language models and get them to disgorge training data, including things like phone numbers. But it was like, they had to be sort of specifically trying to do it.
Starting point is 01:01:25 Yeah. So you can sort of, it's bad in the sense that you can like collect personally identifying information and then like have the language model spit it out. And it's also bad in the sense that it'll make up phone numbers, right? Yeah. You know? It is funny how you can even, you can see once you start looking into these things, what the sources are, but you have to dig for them.
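A minimal sketch of that failure mode, assuming a recent version of the Hugging Face transformers library and using GPT-2 as a freely available stand-in for GPT-3: prompted for a phone number, the model will happily continue with digit strings shaped like phone numbers, whether or not they correspond to anything real.

```python
# A minimal sketch: GPT-2 (via the transformers library) as a stand-in for GPT-3.
# The model continues the prompt with text that *looks* like a phone number;
# nothing in the output marks whether any digits are memorized or invented.
import re
from transformers import pipeline, set_seed

generator = pipeline("text-generation", model="gpt2")
set_seed(42)  # fixed seed so the run is repeatable

prompt = "You can reach our office by phone at"
outputs = generator(prompt, max_new_tokens=20, num_return_sequences=3, do_sample=True)

phone_like = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")
for out in outputs:
    text = out["generated_text"]
    print(text)
    print("  looks like a phone number:", bool(phone_like.search(text)))
```

The regex only checks the shape of the output; the model itself draws no distinction between a real number it memorized from scraped data and one it made up, which is exactly the double problem described above.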
Starting point is 01:01:44 I was playing with, this is going to make me sound even more conceited because I'm still talking about myself, but I was playing with an AI art generation app, you know, where you put in a prompt and it generates some art for you. And, you know, you, you put like a giraffe eating a bagel and it shows you something that sort of pleasingly weird and looks almost like what you're talking about. And I was like, I wonder what its data set is. Where is it getting the original images from? So I tried getting it to generate me. I wrote Adam Conover,
Starting point is 01:02:08 because that's not just like a random phrase. And it generated a picture that I was like, that looks like my hair. That looks like my glasses, you know? So I'm like, it must be getting this shit off of Google Images or some other sort of image search. It must be searching the web
Starting point is 01:02:21 because there's a bunch of images of me online. They wouldn't be in just some random piece of training data. And so I was able to even sort of like look between the cracks and like figure out which Getty Images photo it was building it off of. But none of that was disclosed anywhere in the app that like that's where the training data is. And that was sort of a weird peek behind the curtain that I was able to do on it. Yeah. So one of the things is that actually, before we worked together, Dr. Gebru and Dr. Mitchell in a couple of projects, and me in another project, and others, were working on this notion of data set documentation. That if we're going to be using systems that are trained
Starting point is 01:02:54 on large amounts of training data and trying to figure out when it's appropriate to use them and when it would not be appropriate, we need to have them travel with very clear documentation of what's in the data. And part of the problem with these very large models is that people generally don't have the budget to document the data sets at the size that they're working. And there's this recent paper by Jesse Dodge and co-authors looking at sort of post hoc what's in this large data set called the common crawl, which is like, let's just crawl a bunch of web data. And it turns out that a very large amount of it is patent applications. So this is like training data for language models, patent applications, including a whole bunch that have been automatically translated from Chinese into English. Whoa. Okay. So that's not very good language training data because it's been like Google Translate has been run on these patent applications, and now it's being used to train a language learning model.
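As a rough sketch of what even minimal data set documentation might look like, here is a toy Python example: a small card recording where a corpus came from, plus a crude audit pass of the kind the Dodge et al. paper did at a much larger scale. The corpus, the patent heuristic, and the field names are all invented for illustration; they are not the actual documentation scheme or the paper's method.

```python
# A toy sketch of dataset documentation: before training on a scraped corpus,
# record where it came from and run even a crude audit of what's actually in it.
from dataclasses import dataclass, field

@dataclass
class DatasetCard:
    name: str
    source: str
    collection_date: str
    known_contents: dict = field(default_factory=dict)
    known_gaps: list = field(default_factory=list)

toy_corpus = [
    "A claim directed to a method of manufacturing, wherein said apparatus ...",
    "Hi everyone, sharing my experience from the meetup last night ...",
    "The present invention relates to an improved widget, wherein the widget ...",
]

def looks_like_patent(doc: str) -> bool:
    # crude keyword heuristic, for illustration only
    return any(marker in doc.lower() for marker in ("wherein", "present invention", "a claim directed"))

card = DatasetCard(
    name="toy-web-scrape",
    source="hypothetical crawl of example.org",
    collection_date="2022-06",
)
card.known_contents["patent_like_fraction"] = sum(map(looks_like_patent, toy_corpus)) / len(toy_corpus)
card.known_gaps.append("no per-document provenance retained")
print(card)
```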
Starting point is 01:03:50 Yeah. And probably not a very good match for the use cases that you want to use a language model for. So you've got these opaque training data sets that they're using to train the AIs. Even the people training the AIs don't know what's in all the data sets, because they're just taking some gigantic store of data: here's a bunch of fan fiction, here's all of Twitter, here's whatever, here's all of Google Image Search. They just jam it all in there and then they're trying to get an output from it. And they maybe have no idea what mistakes, what biases, what discriminatory ideas, what private data is in those data sets. And those could just be spat out the other end, giving us a result that's racist or a result that is violating someone's privacy. Like maybe it actually, you type in Adam Conover's phone number and it gives you my real phone number because that was
Starting point is 01:04:47 somewhere in the data. Don't, nobody go do this, please. So is that, that's the issue, is it? Yeah. Yeah. I've got a couple of really nice examples of that for you. So one thing is they basically do scrape at a scale that can't be vetted. And then there is some work towards saying like, maybe we don't want lots of pornography in our language model training data. Maybe we don't want the sort of white supremacist hate sites in our training data. So let's think if we can like get rid of these. And so there's this practice, I'm sorry, I don't have the guy's name, but there's this list of 400 some odd very bad words that were up there on GitHub because this
Starting point is 01:05:21 person who was working in a music search site and was working on autocomplete. So you start typing in the name of a song and he's like, these are the 400 some odd words that I don't want automatically popping up. Ah, right. You don't want your program to say, you know, yeah, some horrible slur or something. Like, did you mean autocomplete? Yeah, exactly. So list of words, sensible thing to do. Like, I would be embarrassed if these words came up, I don't want that. So then what people have done is they said, okay, any website that includes these words is a website we don't want to include in our scraped data. And part of the problem is that, maybe partially
Starting point is 01:06:01 because of its original context, the words are heavily skewed towards words about sex and sexual identity. And so missing from this training data are actually web fora where LGBTQ people talk about their own lived experience and sort of inhabit their own identities in positive ways, all missing from the training data. Right. So that's one example. Another one, there's a researcher named Robyn Speer, who was looking at the biases in these learned systems. And she built a sentiment analysis system. So this is, imagine for some reason you have the problem of reading a Yelp restaurant review and predicting the stars. This person wrote some text. How many stars did they give the restaurant, right?
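Circling back to the word-list filtering described a moment ago, here is a toy Python sketch of that mechanism: drop any page that contains a listed word. The blocklist and the pages below are invented placeholders, not the real list or real sites, but they show how first-person, positive discussion of identity gets discarded right alongside the pages the filter was actually aimed at.

```python
# A toy sketch of blocklist-based filtering of scraped web data. The "rule" is
# simple -- drop any page containing a listed word -- and so is the failure mode.
TOY_BLOCKLIST = {"slur_a", "slur_b", "sexuality_term"}   # invented placeholder words

toy_pages = {
    "forum_post": "a first-person post where someone discusses their sexuality_term identity positively",
    "hate_site": "a page using slur_a and slur_b to attack people",
    "recipe_blog": "a page about soup",
}

def keep(page_text: str) -> bool:
    words = set(page_text.lower().split())
    return not (words & TOY_BLOCKLIST)   # drop the page if any listed word appears

kept = {name for name, text in toy_pages.items() if keep(text)}
print(kept)  # the forum post is dropped right along with the hate site
```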
Starting point is 01:06:47 Artificial tasks. Like nobody ever needs to do that. Like you have the stars, but anyway, it's a- Also the idea of trying to divine any sense in what Yelp reviewers write is ludicrous. Yelp reviewers are the stupidest people on the entire internet. They'll say the food was delicious,
Starting point is 01:07:05 but it was raining, one star. Yeah, exactly. Like what the, okay, so I'm sorry, go on. So Robyn Speer creates this system where she's basically saying, okay, training data is Yelp reviews in English as input, stars as output. And the test is gonna be,
Starting point is 01:07:20 okay, I'm gonna give it some reviews we haven't seen and see if it can get to the stars. But instead of just using that training data, she used something called word embeddings, which is basically taking the large language model and just picking out its representation of a single word. So when she's using that training data, instead of looking at the words as they are, she's looking at the words with this enriched context of how they get used across a much larger corpus. And that much larger corpus is general web garbage. And what she found, and she was looking for this, she wasn't like, uh-oh, I did it. She's
Starting point is 01:07:53 like, let me see what happens. The system was systematically under-predicting the stars for Mexican restaurants. How did that happen? Well, we've got a whole bunch of terrible discourse in English on the internet about immigration from and through Mexico. Right. Such that the word Mexican becomes associated with negative sentiment words. You called the restaurant Mexican, you obviously didn't like it. Yeah. Look, I used to do a whole joke on stage about this. I used to do like five minutes making fun of Yelp. And part of the whole point was that if you go to Chinatown, the best restaurants have the lowest star reviews
Starting point is 01:08:28 because the people using the version of Yelp you're looking at are English speakers. And English speakers often feel like upset and worried in Chinatown restaurants because for the first time in their lives, they're like in a place where they're not the majority population, right? Where they feel foreign, right?
Starting point is 01:08:48 And so you go look at reviews for a Chinese restaurant, most of them are racist, right? It's people saying like bizarre shit. And so like, there's just that very general bias in that data. So of course that would happen. Right, right. And this is happening even outside of Yelp, right? Which is, I like the bit that you're describing because you're urging people to say, let's think about in crowdsourcing, who is the crowd and do I care about their opinion? Yeah. Right? Yeah. Where is that data coming from? Does it match what I'm trying to do here?
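A toy sketch of the word-embedding mechanism behind the Speer example: if the vectors a review scorer leans on were learned from web text where a cuisine word keeps company with negative discourse, that word drags the score down even in an otherwise identical review. The two-dimensional vectors below are invented for illustration, not taken from any real embedding model, and the scoring function is a deliberately simplified stand-in for a trained sentiment system.

```python
# A toy sketch of embedding-borne bias. Dimension 0 of these invented vectors
# loosely tracks "sentiment"; the cuisine word "mexican" has been pushed negative
# by (pretend) web co-occurrence, the way a real corpus can push a real embedding.
import math

EMB = {
    "great":    ( 1.0, 0.2),
    "terrible": (-1.0, 0.1),
    "food":     ( 0.1, 0.9),
    "mexican":  (-0.6, 0.8),   # dragged negative by toy web co-occurrence
    "italian":  ( 0.5, 0.8),
}

def cos(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def sentiment(review: str) -> float:
    words = [w for w in review.lower().split() if w in EMB]
    # score each word by how much closer it sits to "great" than to "terrible"
    return sum(cos(EMB[w], EMB["great"]) - cos(EMB[w], EMB["terrible"]) for w in words) / len(words)

print(sentiment("great mexican food"))  # lower than the line below, despite identical wording otherwise
print(sentiment("great italian food"))
```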
Starting point is 01:09:17 Yeah. If I'm the kind of tourist who decides to go to Chinatown but then feels uncomfortable because I'm in Chinatown, then yeah, maybe I want to see what other people like me think on Yelp. Sure. But if you're the type of person who like actually wants the best cuisine in Chinatown, who wants the, who wants the real shit, who wants the, who wants the food that the non-English speakers like, that the people who live there like, Yelp might not be your best source. Right. Exactly. And Yelp is, I think it's transparent enough that we can look at that and go, okay, the way this person is talking makes me think they're not taking the same point of view as me. Right. Right. Because I'm a human. I can understand. I can read a review. I can understand, oh, this review was written by an asshole. I'm going to scroll past it. Yeah.
Starting point is 01:10:02 Right. Right. But when you're looking at the sort of like average stars, you've already lost that information. And it takes someone like you to sort of go through and scroll through the reviews and say, hang on. Yeah. There's a pattern here in Chinatown specifically because of who's using Yelp and then going to Chinatown. Yeah. And I bet the problem you would say is that the people who are creating our AI algorithms are not doing that analysis. They're just taking a huge data set and putting it in. They're not asking themselves, hey, hold on a second, what biases are embedded in this data set? I mean, there are people looking at it now, but there's not enough. And they're also not
Starting point is 01:10:37 making it possible for other people to do it, right? The people building it haven't looked, and then what they give you is not the data, not the description of the data. But here's my AI. Yeah. And I'd love to talk, because I could talk to you for hours. I'm having a blast. But we do have to sort of like start to circle the airport and come in for a landing. So I want to return to a point that you made earlier because I don't want to breeze past it. You talked about how we're starting to hear about AI systems being used as a way, AI being used instead of actually reshaping the systems in our society that are causing the problems to begin with.
Starting point is 01:11:35 I wish you could talk a little bit more about that, if you wouldn't mind. Yeah, I think it's just whenever someone steps in with, I can solve this problem with AI, that's a really critical moment. OK, you've identified a problem. Do I agree that that's a problem? Sometimes no, right? Sometimes the problem being solved is we aren't doing enough surveillance. Let's do more surveillance. Maybe we don't need more surveillance, right? But so you've identified a problem. Do I agree with the problem? And then what are the other possible solutions? And to get to those other possible solutions, you often have to sort of widen the frame and step back because the framing of the problem as a task that AI can handle sort of usually blocks out a lot of other
Starting point is 01:12:15 information. So if we return to those, I'm not sure I can do this quickly off the mental health example, because it's clearly a problem that the need for mental health services in this country outstrips the available people who are trained to do that job and are being paid to do that job. And coming off or still living through, what are we, year three of the pandemic now, or are we? I've lost track. Yes. No, yes, yes. That's about right. That has taken an enormous toll on people's mental health, and there is a big need. But then, well, maybe I can get there. So then the AI tech solutionists say,
Starting point is 01:12:54 okay, in that big need, let's get diagnoses fast. I'm like, is that really going to help? Yes, diagnosis is part of the issue. So then you've got a bunch of people who have diagnoses of dubious quality who still don't have the healthcare professionals to interact with. Yeah. Right? Yeah. So you often have to back up and say, okay, how did we get to where we are? And what are some choices we could make to ameliorate the system?
Starting point is 01:13:19 Yeah. And people will often sell AI as cheaper, but that doesn't necessarily take into account the cost of building the system, of maintaining it over time, of evaluating it and then continually evaluating, is this still working in our use case and so on. So when someone says I'm solving a problem with AI, that's a great moment to start asking some questions. Yeah. I mean, if you're trying to use AI to help underserved people like who don't have enough medical care, I mean, look, in my new show, we did a whole episode on what, you know, why the U.S. failed on COVID-19. And a big part of it was we have underfunded, frankly, defunded public health departments all across the country for decades, especially in poor areas. We visited
Starting point is 01:14:04 a county in Alabama where there's one doctor in the entire county who splits his time between there and Montgomery. And there's a little public health clinic that the public health department, it's like understaffed. They have a couple nurses and they're tasked with overseeing, safeguarding the health of an entire county of people. This is a predominantly black county, one of the poorest spots in the country. And like the idea that, hey, we have some people who are underserved medically, how do we help them? The idea that you would do it with AI is at the very least being willfully ignorant of the reason that the place is underserved to begin with. It's not underserved because of some like gigantic problem, you know, some big boulder that we have to move,
Starting point is 01:14:49 some like, you know, some natural force that, you know, we need to overcome. It's because we as people have not invested in the particular place. And so that is a problem that is within our power to solve. And we don't need technology to solve it. We can do it just by like, you know, passing some bills, like just regular human shit can solve it. We don't need some, some magical technology that by the way, probably
Starting point is 01:15:18 can't do what you're claiming it can do. Yes, exactly. Exactly. So someone says they're solving a problem with AI. What's the problem? How else might we solve it? And how do we back up a little bit and see the larger context that that problem sits in? I'm thinking of the New York Times and your criticisms of the framing and your criticism of whether it even should take for granted that AI even exists. How would you like to see AI covered differently in journalism? What would you like to see the public and the people who inform the public do when they're talking about it? And by the way, should we just abandon this term AI at all? I mean, we in this conversation are continuing to use AI when we're talking about things that are frankly, as we've established, not intelligent. Yeah. Yeah. So I would love to recommend Salami as a possible answer. I also, my son is very good at acronyms and I put this
Starting point is 01:16:18 question to him at one point before I learned the Salami one, and he came up with Pseudo-Sci, which stands for pattern matching by syndicate entities of uncurated data objects through superfluous energy consumption and incentives. I love it. And I do sometimes say AI in scare quotes, and your audience is audio, so they're not going to see all the times that we did the scare quotation marks. But maybe you can sort of hear it in the pause. But yeah, I think we should abandon the term. And I think that we should ask questions like, if someone is proposing an AI solution for something, who's going to use it? Who is going to be affected by its use? What happens if it gets the right answer? Who might get hurt by that? What happens if it gets the wrong answer? Who might get hurt by that? Is the person using it sufficiently supported to be able to contextualize what's coming out of
Starting point is 01:17:14 there? And how is the person selling this system making money off of it? And what are we giving up to them so that they can make that money? Yeah. What about their framing of AI is benefiting them without you even realizing it, that concedes the point to them? When they say, AI is coming and we have to be ready, it's like, well, why do you want us to believe that? Yeah. I mean, climate change is coming and maybe we can do something about that. In fact, it's here and it's continuing and there's something we can do about that. AI is only coming if A, we keep trying to build it and B, we actually manage to. And A is a decision we can make and B is not. How are systems being used that involve collecting lots and lots of data? What kind of regulation should we be making so that people can't just willy-nilly collect and sell other people's data and put systems in place that cause these problems?
Starting point is 01:18:18 Amazing. Thank you so much for being on the show, Emily. It was wonderful talking to you. Where can people follow up and find your work? Where can they find this stochastic parrots paper and where else can they keep up with what you're doing? Yeah. So probably the best for keeping up is Twitter. I'm at Emily M. Bender. And all of my publications are posted on my webpage at the University of Washington. I think if you Google Emily Bender, University of Washington, or if you remember to put in my M, either way, I think it turns up. And yeah, so I keep that up to date.
Starting point is 01:18:50 Back in the day, I had a few publications that weren't open access, but everything else since then, you should be able to just click through and see a PDF. Amazing. Thank you so much for coming on, Emily. It's been a pleasure. Well, thank you once again to Emily Bender for coming on the show. I don't know anything our engineer, Ryan Connor, and everyone who supports this show at the $15 a month level on Patreon. That's Adrian, Alexey Batalov, Alison Lipparato, Alan Liska, Anne Slagle, Antonio LB, Aurelio Jimenez, Beth Brevik, Brayden, Brandon Sisko, Camus and Lego, Charles Anderson, Cress Staley, Courtney Henderson, David Condry, David Conover, Drill Bill, Dude With Games, M, Hillary Wolkin, Jim Shelton, Julia Russell, Kelly Casey, Kelly Lucas, Lacey Tigenoff, Mark Long, Michael Warnicke, Miles
Starting point is 01:19:53 Gillingsrud, Mom Named Gwen, Mrs. King Coke, Nicholas Morris, Nikki Battelli, Nuyagik, Ippoluk, Paul Mauck, Rachel Nieto, Richard Watkins, Robin Madison, Samantha Schultz, Sam Ogden, Shannon Thank you all so much. And if you want to join their ranks, head to patreon.com. Of course, I want to thank the fine folks at Falcon Northwest for building me the incredible custom gaming PC that I often record these episodes on, and Andrew WK for our theme song. You can find me online at adamconover.net or at Adam Conover, wherever you get your social media. Thank you so much for listening, and we'll see you next time on Factually.
Starting point is 01:20:37 That was a HeadGum Podcast.
