99% Invisible - 274 - The Age of the Algorithm

Episode Date: September 6, 2017

Computer algorithms now shape our world in profound and mostly invisible ways. They predict if we'll be valuable customers and whether we're likely to repay a loan. They filter what we see on social media, sort through resumes, and evaluate job performance. They inform prison sentences and monitor our health. Most of these algorithms have been created with good intentions. The goal is to replace subjective judgments with objective measurements. But it doesn't always work out like that. "I don't think mathematical models are inherently evil — I think it's the way they're used that are evil," says mathematician Cathy O'Neil, author of the book Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. She has studied number theory, worked as a data scientist at start-ups, and built predictive algorithms for various private enterprises. Through her work, she's become critical of the influence of poorly designed algorithms.

Transcript
Starting point is 00:00:00 This is 99% Invisible. I'm Roman Mars. On April 9th, 2017, United Airlines Flight 3411 was about to fly from Chicago to Louisville, when flight attendants discovered the plane was overbooked. They tried to get volunteers to give up their seats with promises of travel vouchers and hotel accommodations, but not enough people were willing to get off. United ended up calling some airport security officers. They boarded the plane and forcibly removed a passenger named Dr. David Dao. The officers ripped Dao out of his seat
Starting point is 00:00:37 and carried him down the aisle of the airplane, nose bleeding, while horrified onlookers shot video with their phones. Oh, this is so much fun. Oh my God, look at what you're taking there. You probably remember this incident and the outrage it generated. The international uproar continued over the forced removal of a passenger from a United Airlines flight.
Starting point is 00:00:59 Today the airline CEO Oscar Munoz issued an apology saying, quote, no one should ever be mistreated this way. I want you to know that we take full responsibility. But why Dr. Dao? How did he end up being the unlucky passenger that United decided to remove? Immediately following the incident, some people thought racial discrimination may have played a part, and it's possible that this played a role in how he was treated. But the answer to how he was chosen was actually an algorithm, a computer program. It crunched through a bunch of data, looking at stuff like how
Starting point is 00:01:34 much each passenger had paid for their ticket, what time they checked in, how often they flew on United, and whether they were part of a rewards program. The algorithm likely determined that Dr. Dao was one of the least valuable customers on the flight at the time. Algorithms shape our world in profound and mostly invisible ways. They predict if we'll be valuable customers or whether we're likely to repay a loan. They filter what we see on social media, sort through resumes, and evaluate job performance. They inform prison sentences and monitor our health. Most of these algorithms have been created with good intentions.
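[Editor's note: United has never published how its system works. A minimal, purely hypothetical sketch of the kind of passenger-value ranking described above might look like the following; every field name, weight, and threshold here is an assumption for illustration only.]

```python
# Hypothetical sketch only: a toy "least valuable passenger" ranking.
# None of these field names or weights come from United; they just illustrate
# turning booking data into a single score that decides who gets bumped first.

def passenger_value(ticket_price, checkin_hours_before_departure,
                    flights_last_year, in_rewards_program):
    """Return a rough 'value' score; higher means less likely to be bumped."""
    score = 0.0
    score += ticket_price * 0.5                              # paid more -> more valuable
    score += min(flights_last_year, 50) * 2.0                # frequent flyer -> more valuable
    score += 25.0 if in_rewards_program else 0.0             # loyalty program member
    score += min(checkin_hours_before_departure, 24) * 0.5   # earlier check-in -> slightly safer
    return score

passengers = [
    {"name": "A", "price": 450, "checkin": 20, "flights": 30, "rewards": True},
    {"name": "B", "price": 120, "checkin": 2,  "flights": 1,  "rewards": False},
]

# Sort ascending: the lowest-scoring passengers are the ones chosen for removal first.
bumped_first = sorted(passengers, key=lambda p: passenger_value(
    p["price"], p["checkin"], p["flights"], p["rewards"]))
print([p["name"] for p in bumped_first])  # ['B', 'A']
```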
Starting point is 00:02:11 The goal is to replace subjective judgments with objective measurements, but it doesn't always work out like that. This subject is huge. I think algorithm design may be the big design problem of the 21st century, and that's why I wanted to interview Cathy O'Neil. Okay, well, thank you so much. So before we start, can you give me one of those, sort of, NPR-style introductions and just say your name and what you do?
Starting point is 00:02:35 Sure. I'm Cathy O'Neil. I'm a mathematician, data scientist, activist, and author. I wrote the book Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. O'Neil studied number theory and then left academia to build predictive algorithms for a hedge fund. But she got really disillusioned by the use of mathematical models in the financial industry. I wanted to have more impact in the world, but I didn't really know that that impact could be really terrible. I was very naive.
Starting point is 00:03:03 After that, O'Neil worked as a data scientist at a couple of startups. And through these experiences, she started to get worried about the influence of poorly designed algorithms. So we'll start with the most obvious question. What is an algorithm? At its most basic, an algorithm is a step-by-step guide to solving a problem.
Starting point is 00:03:21 It's a set of instructions like a recipe. The example I like to give is like cooking dinner for my family. So in this case, the problem is how to make a successful dinner. O'Neil starts with a set of ingredients. And as she's creating the meal, she's constantly making choices about what ingredients are healthy enough to include in her dinner algorithm.
Starting point is 00:03:40 I curate that data, because those ramen noodle packages that my kids like so much? I don't think of those as ingredients, right? So I exclude them, I'm curating, and therefore imposing my agenda on this algorithm. In addition to curating the ingredients, O'Neil, as the cook, also defines what a successful outcome looks like. I'm also defining success, right? I'm in charge of success. I define success to be if my kids eat vegetables at that meal. And you know, a different cook might define success differently. You know, my eight-year-old would define success to be like whether he got to eat Nutella. So that's another way where we the builders impose our agenda on the algorithm.
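[Editor's note: a minimal sketch of O'Neil's dinner analogy, just to make the two design choices concrete: the builder curates the input data and defines what counts as success. All names here are illustrative.]

```python
# Toy version of the "cooking dinner" algorithm from the analogy above.
# The builder's agenda enters in two places: which data gets curated in,
# and how success is defined.

PANTRY = ["spinach", "chicken", "rice", "ramen packets", "carrots", "nutella"]

def curate(ingredients):
    # Curation step: the cook decides ramen and Nutella don't count as ingredients.
    excluded = {"ramen packets", "nutella"}
    return [item for item in ingredients if item not in excluded]

def success(kids_ate_vegetables):
    # Success is defined by the builder: did the kids eat vegetables?
    # The eight-year-old's definition ("did I get Nutella?") would be a different algorithm.
    return kids_ate_vegetables

meal = curate(PANTRY)
print(meal)                                # ['spinach', 'chicken', 'rice', 'carrots']
print(success(kids_ate_vegetables=True))   # True, under this cook's definition
```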
Starting point is 00:04:22 O'Neil's main point here is that algorithms aren't really objective, even when they're carried out by computers. This is relevant because the companies that build them like to market them as objective, claiming they remove human error and fallibility from complex decision-making. But every algorithm reflects the priorities and judgments of its human designer. Of course, that doesn't necessarily make algorithms bad. Right. So, I mean, it's very important to me that I don't get the reputation of hating all algorithms. I actually like algorithms, and I think algorithms could really help.
Starting point is 00:04:53 But O'Neil does single out a particular kind of algorithm for scrutiny. These are the ones we should worry about. And they're characterized by three properties: that they're very widespread and important, so like they make important decisions about a lot of people. Number two, that they're secret, that the people don't understand how they're being scored. And number three, that they're destructive. Like one bad mistake in the design, if you will, of these algorithms,
Starting point is 00:05:18 will actually not only make it unfair for individuals, but categorically unfair for enormous populations as it gets scaled up. O'Neil has a shorthand for these algorithms, the widespread, mysterious, and destructive ones. She calls them Weapons of Math Destruction. To show how one of these destructive algorithms works, O'Neil points to the criminal justice system. For hundreds of years, key decisions in the legal process, like the amount of bail, length of sentence,
Starting point is 00:05:48 and likelihood of parole, have been in the hands of fallible human beings guided by their instincts, and sometimes their personal biases. The judges are sort of famously racist, some of them more than others. And that racism can produce very different outcomes for defendants.
Starting point is 00:06:03 For example, the ACLU has found that sentences imposed on black men in the federal system are nearly 20% longer than those for white men convicted of similar crimes. And studies have shown that prosecutors are more likely to seek the death penalty for African Americans than for whites convicted of the same charges. So you might think that computerized models fed by data would contribute to more even-handed treatment. The criminal justice system thinks so too. It has increasingly tried to minimize human bias by turning to risk assessment algorithms.
Starting point is 00:06:32 Like crime risk, like what is the chance of someone coming back to prison after leaving it? Many of these risk algorithms look at a person's record of arrests and convictions. The problem is, that data is already skewed by some social realities. Take for example the fact that white people and black people use marijuana at roughly equal rates. And yet, there's five times as many blacks getting arrested for smoking pot as whites. Five times as many. This may be because black neighborhoods tend to be more heavily policed than white neighborhoods, which means black people get arrested for certain crimes
Starting point is 00:07:03 more often than white people. Risk algorithms detect these patterns and apply them to the future. So if the past is shaped in part by racism, the future will be too. The larger point is we have terrible data here. But the statisticians involved, the data scientists, are like blindly going forward and pretending that our data is good, and then we're using it to actually make important decisions. Risk assessment algorithms also look at defendants' answers to a questionnaire that's supposed to tease out certain risk factors.
Starting point is 00:07:33 They have questions like, you know, did you grow up in a high-crime neighborhood? Are you on welfare? Do you have a mental health problem? Do you have addiction problems? Did your father go to prison? You know, they're basically proxies for race and class, but it's embedded in this scoring system, the judge is given the score, and it's called objective. What does the judge take away from it, or, you know, how is it used? If you have a high risk score, it's used to send you to prison for longer in sentencing. It's also used in bail hearings and parole hearings. If you have a high recidivism risk score, you don't get parole.
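[Editor's note: a minimal, purely illustrative sketch of how a questionnaire-based risk score ends up encoding proxies for race and class. These questions echo the ones described above, but the weights and scoring are invented and do not come from any real risk-assessment product.]

```python
# Hypothetical recidivism-risk scorer. The point is not the arithmetic but that
# the inputs (neighborhood, welfare, family incarceration) are proxies for
# poverty and race, yet the output is handed to a judge as an objective score.

QUESTION_WEIGHTS = {
    "grew_up_in_high_crime_neighborhood": 2,
    "on_welfare": 1,
    "mental_health_problem": 1,
    "addiction_problem": 2,
    "father_went_to_prison": 2,
    "prior_arrests": 3,   # arrests, not convictions: already skewed by uneven policing
}

def risk_score(answers):
    """answers: dict of question -> bool or count. Higher score = 'higher risk'."""
    return sum(QUESTION_WEIGHTS[q] * int(v) for q, v in answers.items())

defendant = {
    "grew_up_in_high_crime_neighborhood": True,
    "on_welfare": True,
    "mental_health_problem": False,
    "addiction_problem": False,
    "father_went_to_prison": True,
    "prior_arrests": 2,
}
print(risk_score(defendant))  # 11 -- a number that looks neutral but isn't
```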
Starting point is 00:08:12 And presumably, you could take all that biased input data and say this high chance of recidivism means that we should rehabilitate more. I mean, you could take all the same stuff and choose to do a completely different thing with the result of the algorithm. That's exactly my point. Exactly my point. We could say, oh, I wonder why people who have this characteristic have so much worse recidivism.
Starting point is 00:08:29 Well, let's try to help them find a job. Maybe that'll help. We could use those algorithms, those risk scores, to try to account for our society. Instead, O'Neil says, in many cases, we're effectively penalizing people for societal and structural issues that they have little control over, and we're doing it at a massive scale, using these new technological tools.
Starting point is 00:08:52 We're shifting the blame, if you will, from the society, which is the one that should own these problems, to the individual, and punishing them for it. It should be said that, in some cases, algorithms are helping to change elements of the criminal justice system for the better. For example, New Jersey recently did away with their cash bail system, which disadvantaged low-income defendants. They now rely on predictive algorithms instead. Data shows that the state's pre-trial county jail populations are down by about 20%.
Starting point is 00:09:22 But still, algorithms like that one remain unaudited and unregulated, and it's a problem when algorithms are basically black boxes. In many cases, they're designed by private companies who sell them to other companies, and the exact details of how they work are kept secret. Not only is the public in the dark, even the companies using these things might not understand exactly how the data is being processed. This is true of many of the problematic algorithms that O'Neil has looked at, whether they're used for sorting loan applications or assessing teacher performance.
Starting point is 00:09:52 There's some kind of weird thing that happens to people when mathematical scores are trotted out. They just start closing their eyes and believing it because it's math. And they do, I feel like, oh, I'm not an expert in math, so I can't push back. And that's something you just see time and time again. You're like, why didn't you question this? This doesn't make sense.
Starting point is 00:10:13 Oh, well, it's math and I don't understand it. Right now it seems like, because of algorithms and math, it's just a new place to place blame, so that you do not have to think about your decisions as an actual company. Because these things are just so powerful and so, you know, mesmerizing to us, especially right now, they can be used in all kinds of nefarious ways. They're almost magical. Is that... Yeah.
Starting point is 00:10:42 That's, um, scary. It's scary, and I think I would go one step further than that. I feel like, just by observation, that these algorithms, they don't show up randomly. They show up when there's a really difficult conversation that people want to avoid.
Starting point is 00:10:58 They're like, oh, we don't know what makes a good teacher, and different people have different opinions about that. So let's just bypass this conversation by having an algorithm score teachers. Or we don't know what prison is really for. Let's have a way of deciding how long to sentence somebody. We introduce these silver-bullet mathematical algorithms because we don't want to have a conversation.
Starting point is 00:11:31 In O'Neil's book, she writes about this young guy named Kyle Behm, who takes some time off college to get treated for bipolar disorder. Once he's better and ready to go back to school, he applies for a part-time job at Kroger, which is a big grocery store chain. He has a friend who works there who offers to vouch for him. Kyle was such a good student that he figured the application would be just a formality,
Starting point is 00:11:51 but he didn't get called back for an interview. His application was red-lighted by the personality test he'd taken when he applied for the job. The test was part of an employee selection algorithm developed by a private workforce company called Kronos. 70% of job applicants in this country take personality tests before they get an interview, so this is a very common practice.
Starting point is 00:12:11 Kyle had that screening, and he found out, because his friend worked at Kroger, that he had failed the test. So most people never find that out, they just don't hear back. And the other thing that was unusual about Kyle is that his dad is a lawyer. So his dad was like, what were the questions like on this test?
Starting point is 00:12:27 And he said, well, some of them were a lot like the questions I got at the hospital, the mental health assessment. The test Kyle got at the hospital was called the five-factor model test, and it grades people on extroversion, agreeableness, conscientiousness, neuroticism, and openness to new ideas. It's used in mental health evaluations. The potential employee's answers to the test are then plugged into an algorithm that decides whether the person should be hired. So his father was like, whoa, that's illegal under the Americans with Disabilities Act.
Starting point is 00:13:01 So his father and he sort of figured out together that something very fishy had been going on, and his father has actually filed a class-action lawsuit against Kroger for that. The suit is still pending, but arguments are likely to focus on whether the personality test can be considered a medical exam. If it is, it'd be illegal under the ADA. O'Neil gets that different jobs require people with different personality types, but she says a hiring algorithm is a blunt and unregulated tool that ends up disqualifying big categories of people, which makes it a classic weapon of math destruction. In certain jobs, you wouldn't want neurotic people or
Starting point is 00:13:31 introverted people. Like, if you're at a call center where a lot of really irate customers call you up, that might be a problem. In which case, it is actually legal if you get an exception for your company. The problem is that these personality tests are not carefully designed for each business, but rather what happens is that these companies just sell the same personality test to all the businesses that will buy them.
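[Editor's note: a minimal sketch of what a one-size-fits-all screening filter amounts to. The five trait names come from the five-factor model mentioned above, but the cutoffs and pass/fail logic are assumptions for illustration, not any vendor's actual product.]

```python
# Toy applicant screen built on five-factor ("Big Five") trait scores.
# The same generic cutoffs applied to every job is exactly the bluntness
# O'Neil objects to: whole categories of people get filtered out unseen.

GENERIC_CUTOFFS = {          # one-size-fits-all thresholds on a 0-100 scale
    "extroversion": 40,
    "agreeableness": 50,
    "conscientiousness": 60,
    "neuroticism": 55,       # applicant must score BELOW this one
    "openness": 30,
}

def passes_screen(traits):
    """Return True only if the applicant clears every generic cutoff."""
    for trait, cutoff in GENERIC_CUTOFFS.items():
        if trait == "neuroticism":
            if traits[trait] >= cutoff:
                return False
        elif traits[trait] < cutoff:
            return False
    return True

applicant = {"extroversion": 35, "agreeableness": 70,
             "conscientiousness": 80, "neuroticism": 40, "openness": 60}
print(passes_screen(applicant))  # False: rejected on extroversion alone,
                                 # and the applicant usually never learns why
```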
Starting point is 00:14:02 A lot of the algorithms that O'Neil explores in her book are largely hidden. They don't get a lot of attention. We as consumers and job applicants and employees may not even be aware that they're humming along in the background of our lives, sorting us into piles and categories. But there is one kind of algorithm that's gotten a lot of attention in the news lately. Is it a good or bad thing that social media has been able to infiltrate politics? Social media is a technology, and as we know, technologies have their good sides and their dark sides, their not-so-good sides.
Starting point is 00:14:33 So it all depends on... Towards the end of our conversation, O'Neil and I started talking about the recent election and the complex ways that social media algorithms shaped the news that we receive. Facebook shows us stories based on what they think we want, and of course what they think we want is based on algorithms. These algorithms look at what we clicked on before and then feed us more content we like. The result is that we've ended up in these information silos, increasingly polarized and oblivious to what people of different political persuasions might be seeing. I do think this is a major
Starting point is 00:15:04 problem. The sort of the sky's the limit. We have built the internet and the internet is a propaganda machine. It's a propaganda delivery device, if you will. And that's not, I don't see how that's going to stop. Yeah, especially if every moment is being optimized by an algorithm that's meant to manipulate your emotions.
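[Editor's note: a minimal, invented sketch of the kind of engagement-optimized ranking being discussed here. This is not Facebook's actual algorithm; it just shows an objective function that rewards predicted clicks rather than accuracy.]

```python
# Toy feed ranker that optimizes for predicted engagement, not accuracy.
# A false but click-grabbing story can easily outrank a true, dull one.

stories = [
    {"headline": "Pope endorses candidate", "predicted_clicks": 0.12, "verified": False},
    {"headline": "City council passes budget", "predicted_clicks": 0.01, "verified": True},
]

def rank_feed(stories):
    # The objective is engagement; 'verified' never appears in the sort key.
    return sorted(stories, key=lambda s: s["predicted_clicks"], reverse=True)

for story in rank_feed(stories):
    print(story["headline"])  # the unverified story prints first
```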
Starting point is 00:15:26 Right, that's exactly... going back to Facebook's optimizer algorithm. That's not optimizing for truth, right? It's optimizing for profit. And they claim to be neutral, but of course, nothing's neutral. And we have seen the results. We've seen what it's actually optimized for, and it's not pretty. This kind of data-driven political micro-targeting means conspiracies and misinformation can gain surprising traction online. Stories claiming that Pope Francis endorsed Donald Trump and that Hillary Clinton sold weapons to ISIS gained millions of views on Facebook.
Starting point is 00:15:55 Neither of those stories was true. Fixing the problem of these destructive algorithms is not going to be easy, especially when they're insinuating themselves into more and more parts of our lives. But O'Neil thinks that measurement and transparency are one place to start, like with that Facebook algorithm and the political ads that it serves to its users. If you were to talk to Facebook about how to inject some ethics into their optimization,
Starting point is 00:16:31 what would you do? Would you sort of make a case for the bottom line of truth being a longer tail way to make more money? Or would you just say, this is about ethics and you should be thinking about ethics? To be honest, if I really had their attention, I would ask them to voluntarily find a space on the web to just put every political ad, and actually every ad, just have a way for journalists and people interested in the concept of the informed citizenry, to go through all the ads that they have on Facebook at a given time.
Starting point is 00:17:05 Because even if that article about Hillary Clinton and ISIS was shared thousands of times, lots of people never saw it at all. Just show us what you're showing other people. Because I think one of the most pernicious issues is the fact that we don't know what other people are seeing. I'm not waiting for Facebook to actually go against their interests and change their profit goal, but I do think this kind of transparency can be demanded and given. O'Neil also says it's important to measure the broad effects of these algorithms and
Starting point is 00:17:34 to understand who they most impact. Everyone should start measuring it. What I mean by that is relatively simple. This might not be a complete start, but it's a pretty good first step, which is measure for whom this fails. Meaning, which populations are most negatively impacted by the results of these algorithms? And what is the harm that befalls those people, for whom it fails? And how are the failures distributed across the population?
Starting point is 00:18:03 So if you see a hiring algorithm fail much more often for women than for men, that's a problem. Especially if the failure is they don't get hired when they should get hired. I really do think that a study, a close examination of the distribution of failures and the harms of those failures, would really be a good start. If you're not mad enough about how algorithms influence your life, I've got a doozy for you after these messages. We are currently experiencing higher call volumes than normal. Please stay on the line and an agent will be with you shortly.
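[Editor's note: a minimal sketch of the per-group failure audit O'Neil describes above, counting how often an algorithm's mistakes land on each group. The records and group labels are fabricated purely for illustration.]

```python
# Toy audit: for each group, what fraction of qualified applicants did the
# hiring algorithm wrongly reject? Uneven failure rates are the red flag
# O'Neil says we should be measuring.

from collections import defaultdict

# (group, qualified, hired) -- fabricated records purely for illustration
records = [
    ("women", True, False), ("women", True, True), ("women", True, False),
    ("men",   True, True),  ("men",   True, True), ("men",   True, False),
]

def failure_rate_by_group(records):
    failures, totals = defaultdict(int), defaultdict(int)
    for group, qualified, hired in records:
        if qualified:                 # only look at people who should have been hired
            totals[group] += 1
            if not hired:
                failures[group] += 1  # a failure: qualified but rejected
    return {g: failures[g] / totals[g] for g in totals}

print(failure_rate_by_group(records))  # roughly {'women': 0.67, 'men': 0.33} -> worth investigating
```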
Starting point is 00:19:13 Here's one that I think is kind of fun because it's annoying and secret, but you would never know it. So if you call up a customer service line, I'm not saying this will always happen, but it will sometimes happen that your phone number will be used to track back who you are, to assess whether you're a high-value customer or a low-value customer. And if you're a high-value customer, you'll talk to a customer service representative much sooner than if you're a low-value customer. You'll be put on hold longer. That's how businesses make decisions nowadays. You are caller number ninety-nine. Your call is important to us. Please stay on the line.
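[Editor's note: a minimal sketch of the phone-queue prioritization described in that last example. The lookup table and scoring are assumptions, meant only to show how a caller's estimated value can silently set their hold time.]

```python
import heapq

# Hypothetical customer-value lookup keyed by phone number (fabricated data).
CUSTOMER_VALUE = {"555-0101": 92, "555-0102": 17}

def enqueue(queue, phone_number):
    value = CUSTOMER_VALUE.get(phone_number, 0)
    # heapq pops the smallest item first, so store negative value:
    # high-value customers come off the queue (reach an agent) sooner.
    heapq.heappush(queue, (-value, phone_number))

queue = []
enqueue(queue, "555-0102")   # the low-value customer calls first...
enqueue(queue, "555-0101")   # ...but the high-value customer is answered first
print(heapq.heappop(queue))  # (-92, '555-0101')
```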
Starting point is 00:19:51 99% Invisible was produced this week by Delaney Hall; tech and mix production by Emmett FitzGerald. Katie Mingle is the senior producer. Kurt Kohlstedt is the digital director. Sean Real composed all the music. The rest of the staff includes Avery Trufelman, Sharif Youssef, Taryn Mazza, and me, Roman Mars.
Starting point is 00:20:10 Special thanks to Ryan Keesler and Courtney Riddle. We are a project of 91.7 KALW in San Francisco and produced on Radio Row in beautiful downtown Oakland, California.
Starting point is 00:20:31 99% Invisible is part of Radiotopia from PRX, a collective of the best, most innovative shows in all of podcasting. We are supported by the Knight Foundation and coin-carrying listeners just like you. You can find 99% Invisible and join discussions about the show on Facebook. You can tweet at me at Roman Mars and the show at 99PI org. We're on Instagram, Tumblr, and Reddit too. But our lovely home on the internet, with more design stories than we can ever tell you here on the radio or podcast, I guess this is a podcast, is our website at 99pi.org. Radiotopia. From PRX.
