99% Invisible - 274 - The Age of the Algorithm
Episode Date: September 6, 2017
Computer algorithms now shape our world in profound and mostly invisible ways. They predict if we'll be valuable customers and whether we're likely to repay a loan. They filter what we see on social media, sort through resumes, and evaluate job performance. They inform prison sentences and monitor our health. Most of these algorithms have been created with good intentions. The goal is to replace subjective judgments with objective measurements. But it doesn't always work out like that. "I don't think mathematical models are inherently evil — I think it's the way they're used that are evil," says mathematician Cathy O'Neil, author of the book Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. She has studied number theory, worked as a data scientist at start-ups, and built predictive algorithms for various private enterprises. Through her work, she's become critical of the influence of poorly designed algorithms.
Transcript
This is 99% Invisible. I'm Roman Mars.
On April 9th, 2017, United Airlines Flight 3411 was about to fly from Chicago to Louisville,
when flight attendants discovered the plane was overbooked.
They tried to get volunteers to give up their seats with promises of travel vouchers and hotel accommodations,
but not enough people were willing to get off.
United ended up calling some airport security officers. They boarded the plane,
and forcibly removed a passenger named Dr. David Dao.
The officers ripped Dao out of his seat
and carried him down the aisle of the airplane,
nose bleeding,
while horrified onlookers shot video with their phones.
Oh, this is so much fun.
Oh my God, look at what you're taking there.
You probably remember this incident and the outrage it generated.
The international uproar continued over the forced removal of a passenger from a United
Airlines flight.
Today the airline CEO Oscar Munoz issued an apology saying, quote, no one should ever be mistreated
this way.
I want you to know that we take full responsibility.
But why Dr. Dao?
How did he end up being the unlucky passenger that United decided to remove?
Immediately following the incident, some people thought racial discrimination may have played
a part, and it's possible that this played a role in how he was treated. But the answer to how he was chosen was actually
an algorithm, a computer program. It crunched through a bunch of data, looking at stuff like how
much each passenger had paid for their ticket, what time they checked in, how often they flew
on United, and whether they were part of a rewards program. The algorithm likely determined that Dr. Dao was one of the least valuable customers on
the flight at the time.
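To make that concrete, here is a minimal sketch in Python of how a customer-value score like the one described above could work. The features, weights, and numbers are hypothetical, not United's actual system; it only illustrates the idea of ranking passengers by value and bumping the lowest.

```python
# Hypothetical sketch of a "customer value" score like the one described above.
# The features and weights are invented for illustration; they are not United's
# actual model.

def customer_value(fare_paid, checkin_hours_before, flights_last_year, rewards_member):
    """Return a score; lower scores mark passengers the airline values least."""
    score = 0.0
    score += 0.5 * fare_paid                   # higher fares weigh heavily
    score += 2.0 * checkin_hours_before        # early check-in counts for a little
    score += 10.0 * flights_last_year          # frequent flyers are protected
    score += 100.0 if rewards_member else 0.0  # loyalty program membership
    return score

passengers = {
    "A": customer_value(fare_paid=89, checkin_hours_before=1, flights_last_year=0, rewards_member=False),
    "B": customer_value(fare_paid=450, checkin_hours_before=24, flights_last_year=12, rewards_member=True),
}
# The passenger with the lowest score is the one the algorithm would bump first.
print(min(passengers, key=passengers.get))
```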
Algorithms shape our world in profound and mostly invisible ways.
They predict if we'll be valuable customers or whether we're likely to repay a loan.
They filter what we see on social media, sort through resumes, and evaluate job performance.
They inform prison sentences and monitor our health.
Most of these algorithms have been created with good intentions.
The goal is to replace subjective judgments with objective measurements,
but it doesn't always work out like that.
This subject is huge.
I think algorithm design may be the big design problem of the 21st century,
and that's why I wanted
to interview Cathy O'Neil.
Okay, well, thank you so much. So, before we start, can you give me one of those,
sort of, NPR-style introductions and just say your name and what you do?
Sure. I'm Cathy O'Neil. I'm a mathematician, data scientist, activist, and author. I wrote
the book Weapons of Math Destruction: How Big Data Increases Inequality
and Threatens Democracy.
O'Neil studied number theory and then left academia to build predictive algorithms for
a hedge fund. But she got really disillusioned by the use of mathematical models in the financial
industry.
I wanted to have more impact in the world, but I didn't really know that that impact
could be really terrible. I was very naive.
After that, O'Neil worked as a data scientist
at a couple of startups.
And through these experiences, she started to get worried
about the influence of poorly designed algorithms.
So we'll start with the most obvious question.
What is an algorithm?
At its most basic, an algorithm is a step-by-step guide
to solving a problem.
It's a set of instructions like a recipe.
The example I like to give is like cooking dinner
for my family.
So in this case, the problem is how to make a successful dinner.
O'Neil starts with a set of ingredients.
And as she's creating the meal, she's constantly
making choices about what ingredients are healthy enough
to include in her dinner algorithm.
I curate that data, because those ramen noodle packages
that my kids like so much, I don't think of those as ingredients, right?
So I exclude them, I'm curating, and therefore imposing my agenda on this algorithm.
In addition to curating the ingredients, O'Neil as the cook also defines what a successful outcome looks like.
I'm also defining success, right? I'm in charge of success.
I define success to be
if my kids eat vegetables at that meal. And you know, a different cook might define success differently. You know, my eight-year-old would define success to be like whether he got to eat
Nutella. So that's another way where we the builders impose our agenda on the algorithm.
O'Neil's main point here is that algorithms aren't really objective, even when they're carried
out by computers.
This is relevant because the companies that build them like to market them as objective,
claiming they remove human error and fallibility from complex decision-making.
But every algorithm reflects the priorities and judgments of its human designer.
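As a toy illustration of where those judgments enter, here is the dinner example sketched in Python. The pantry, the curation rule, and both success metrics are made up, but they mark the two places O'Neil points to: the designer chooses the data, and the designer defines success.

```python
# Toy version of the "dinner algorithm": the curated ingredient list and the
# success metric are both choices made by the designer, not objective facts.

PANTRY = ["spinach", "carrots", "rice", "chicken", "ramen noodles", "nutella"]

# Curation step: the cook decides ramen and Nutella don't count as ingredients.
ALLOWED = [item for item in PANTRY if item not in ("ramen noodles", "nutella")]

def cook_success(meal):
    """The cook's definition of success: kids ate vegetables."""
    return any(item in meal for item in ("spinach", "carrots"))

def kid_success(meal):
    """The eight-year-old's definition of success: Nutella was served."""
    return "nutella" in meal

meal = ALLOWED[:3]            # a meal built only from curated ingredients
print(cook_success(meal))     # True  -- success by the cook's metric
print(kid_success(meal))      # False -- failure by the kid's metric
```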
Of course, that doesn't necessarily make algorithms bad.
Right. So, I mean, it's very important to me that I don't get the reputation of hating
all algorithms. I actually like algorithms, and I think algorithms could really help.
But O'Neil does single out a particular kind of algorithm for scrutiny. These are the
ones we should worry about. And they're characterized by three properties: that they're very widespread
and important, so like they make important decisions about a lot of people.
Number two, that they're secret,
that the people don't understand how they're being scored,
and number three, that they're destructive.
Like one bad mistake in the design,
if you will, of these algorithms,
will actually not only make it unfair for individuals,
but categorically unfair for enormous populations
as it gets scaled up.
O'Neil has a shorthand for these algorithms. The widespread, mysterious, and destructive ones.
She calls them Weapons of Math Destruction.
To show how one of these destructive algorithms works, O'Neil points to the criminal justice system.
For hundreds of years, key decisions in the legal process,
like the amount of bail, length of sentence,
and likelihood of parole,
have been in the hands of fallible human beings
guided by their instincts,
and sometimes their personal biases.
The judges are sort of famously racist,
some of them more than others.
And that racism can produce very different outcomes
for defendants.
For example, the ACLU has found that sentences imposed on black men in the federal system
are nearly 20% longer than those for white men convicted of similar crimes.
And studies have shown that prosecutors are more likely to seek the death penalty for
African Americans than for whites convicted of the same charges.
So you might think that computerized models fed by data would contribute to more even-handed treatment.
The criminal justice system thinks so too.
It has increasingly tried to minimize human bias by turning to risk assessment algorithms.
Like crime risk, like what is the chance of someone coming back to prison after leaving
it?
Many of these risk algorithms look at a person's record of arrests and convictions.
The problem is, that data is already skewed by some social realities. Take
for example the fact that white people and black people use marijuana at roughly equal
rates. And yet, there's five times as many blacks getting arrested for smoking pot as
whites. Five times as many. This may be because black neighborhoods tend to be more heavily
policed than white neighborhoods, which means black people get arrested for certain crimes
more often than white people.
Risk algorithms detect these patterns and apply them to the future.
So if the past is shaped in part by racism, the future will be too.
The larger point is we have terrible data here.
But the statisticians involved, the data scientists, are like blindly going forward and pretending that our data is good,
and then we're using it to
actually make important decisions. Risk assessment algorithms also look at
defendants' answers to a questionnaire that's supposed to tease out certain risk factors.
They have questions like, you know, did you grow up in a high-crime neighborhood?
Are you on welfare? Do you have a mental health problem? Do you have addiction problems?
Did your father go to prison? You know, they're basically proxies for race and class,
but it's embedded in this scoring system, the judge is given the score and it's called objective.
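To show the mechanics she's describing, here is a deliberately simplified, hypothetical questionnaire score in Python. The questions echo the ones quoted above, but the weights are invented and this is not any real risk instrument; the point is how proxy questions get collapsed into one number that reaches the judge looking objective.

```python
# A deliberately simplified, hypothetical risk questionnaire. The questions and
# weights are invented; the point is that proxies for race and class get folded
# into one "objective-looking" number.

QUESTIONS = {
    "grew_up_in_high_crime_neighborhood": 2.0,
    "on_welfare": 1.5,
    "has_mental_health_problem": 1.0,
    "has_addiction_problem": 1.5,
    "father_went_to_prison": 2.0,
}

def risk_score(answers):
    """answers: dict of question -> True/False. Returns a single score."""
    return sum(weight for q, weight in QUESTIONS.items() if answers.get(q))

defendant = {
    "grew_up_in_high_crime_neighborhood": True,
    "on_welfare": True,
    "father_went_to_prison": False,
}
print(risk_score(defendant))  # 3.5 -- this is the number the judge sees
```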
What does the judge take away from it, or, you know, how is it used?
If you have a high risk score, it's used to send you to prison for longer in sentencing.
It's also used in bail hearings and parole hearings. If you have a high recidivism risk, you don't get parole.
And presumably, you could take all that biased input data and say this high chance
of recidivism means that we should rehabilitate more.
I mean, you could take that all the same stuff and choose to do a completely different
thing with the result of the algorithm.
That's exactly my point.
Exactly my point.
We could say,
oh, I wonder why people who have this characteristic
have so much worse recidivism.
Well, let's try to help them find a job.
Maybe that'll help.
We could use those algorithms, those risk scores
to try to account for our society.
Instead, O'Neil says, in many cases,
we're effectively penalizing people for societal and
structural issues that they have little control over, and we're doing it at a massive scale,
using these new technological tools.
We're shifting the blame, if you will, from the society, which is the one that should
own these problems to the individual and punishing them for it.
It should be said that, in some cases, algorithms are helping to change elements of the criminal
justice system for the better.
For example, New Jersey recently did away with their cash bail system, which disadvantaged
low-income defendants.
They now rely on predictive algorithms instead.
Data shows that the state's pre-trial county jail populations are down by about 20%.
But still, algorithms like that one remain unaudited and unregulated, and it's a problem
when algorithms are basically black boxes.
In many cases, they're designed by private companies who sell them to other companies,
and the exact details of how they work are kept secret.
Not only is the public in the dark, even the companies using these things might not understand
exactly how the data is being processed.
This is true of many of the problematic algorithms that O'Neil has looked at, whether they're used
for sorting loan applications or assessing teacher performance.
There's some kind of weird thing that happens to people when mathematical scores are
trotted out.
They just start closing their eyes and believing it because it's math.
And they do, I feel like, oh, I'm not an expert of math,
so I can't push back.
And that's something you just see time and time again.
You're like, why didn't you question this?
This doesn't make sense.
Oh, well, it's math and I don't understand it.
Right now it seems like, because of algorithms and math,
it's just a new place to place blame,
so that you do not have to think about your decisions as an actual company.
Because these things are just so powerful and so, you know, mesmerizing to us, especially right now.
They can be used in all kinds of
nefarious ways. They're almost magical.
Yeah.
That's, um, scary.
It's scary, and I think I would go one step further than that.
I feel like just by observation that these algorithms,
they don't show up randomly.
They show up when there's a really difficult conversation
that people want to avoid.
They're like, oh, we don't know what it,
what makes a good teacher.
And different people have different opinions about that.
So let's just bypass this conversation by having an algorithm score teachers.
Or we don't know what prison is really for.
Let's have a way of deciding how long to sentence somebody.
We introduce these silver bullet mathematical algorithms
because we don't want to have a conversation.
In O'Neil's book, she writes about this young guy
named Kyle Behm, who takes some time off college
to get treated for bipolar disorder.
Once he's better and ready to go back to school,
he applies for a part-time job at Kroger,
which is a big grocery store chain.
He has a friend who works there who offers to vouch for him.
Kyle was such a good student that he figured the application would be just a formality,
but he didn't get called back for an interview.
His application was red-lighted by the personality test he'd taken when he applied for the job.
The test was part of an employee selection algorithm developed by a private workforce company
called Kronos.
70% of job applicants in this country
take personality tests before they get an interview.
So this is a very common practice.
Kyle had that screening and he found out
because his friend worked at Kroger
that he had failed the test.
So most people never find that out,
they just don't hear back.
And the other thing that was unusual about Kyle
is that his dad is a lawyer.
So his dad was like, what were the questions like on this test?
And he said, well, some of them were a lot like the questions I got at the hospital, in the mental health assessment.
The test Kyle got at the hospital was called the five factor model test, and it grades people on
extroversion, agreeableness, conscientiousness, neuroticism, and openness to new ideas.
It's used in mental health evaluations.
The potential employees' answers to the test are then plugged into an algorithm that decides
whether the person should be hired.
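Here is a rough sketch in Python of how a screen like this could work. The trait weights and cutoff are invented, not Kronos's actual model; it only shows how five-factor scores can be collapsed into a silent pass/fail decision before any human reads the application.

```python
# Hypothetical sketch of a personality-test screen. The traits follow the
# five-factor model mentioned above, but the weights and cutoff are invented;
# this is not Kronos's actual algorithm.

WEIGHTS = {
    "extroversion": 1.0,
    "agreeableness": 1.0,
    "conscientiousness": 1.5,
    "neuroticism": -2.0,   # high neuroticism scores count heavily against you
    "openness": 0.5,
}
CUTOFF = 12.0

def passes_screen(scores):
    """scores: trait -> 0..10 rating from the test. Returns a hire/no-hire flag."""
    total = sum(WEIGHTS[trait] * value for trait, value in scores.items())
    return total >= CUTOFF

applicant = {"extroversion": 4, "agreeableness": 7, "conscientiousness": 8,
             "neuroticism": 8, "openness": 6}
print(passes_screen(applicant))  # False -- red-lighted with no explanation
```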
So his father was like, whoa, that's illegal under the Americans with Disabilities Act.
So his father and he sort of figured out together that something very fishy had been going
on, and his father actually filed a class action lawsuit against Kroger for that.
The suit is still pending, but arguments are likely to focus on whether the personality
test can be considered a medical exam.
If it is, it'd be illegal under the ADA.
O'Neil gets that different jobs require people with different personality types, but she
says a hiring algorithm is a blunt and unregulated tool that ends up disqualifying big categories
of people, which makes it a classic
weapon of math destruction. In certain jobs, you wouldn't want neurotic people or
introverted people. Like, if you're at a call center where a lot of really irate customers
call you up. That might be a problem. In which case, it is actually legal if you get an exception for your company.
The problem is that these personality tests
are not carefully designed for each business,
but rather what happens is that these companies
just sell the same personality test
to all the businesses that will buy them.
A lot of the algorithms that O'Neil explores
in her book are largely hidden.
They don't get a lot of attention.
We as consumers and job applicants and employees may not even be aware that they're humming
along in the background of our lives, sorting us into piles and categories.
But there is one kind of algorithm that's gotten a lot of attention in the news lately.
Is this a good or bad thing, that social media has been able to infiltrate politics?
Social media is a technology, and as we know, technologies have their good side and their
dark side, their not-so-good side.
So it all depends on...
Towards the end of our conversation, O'Neil and I started talking about the recent
election and the complex ways that social media algorithms shaped the news that we receive.
Facebook shows us stories based on what they think we want, and of course what they think we want is based
on algorithms. These algorithms look at what we clicked on before and then
feed us more content we like. The result is that we've ended up in these
information silos, increasingly polarized and oblivious to what people of
different political persuasions might be seeing. I do think this is a major
problem.
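A bare-bones sketch of that feedback loop, in Python: the click history, stories, and scoring rule are invented and far simpler than any real news feed, but they show why ranking by predicted engagement keeps serving you more of what you already clicked.

```python
# Bare-bones sketch of engagement-based feed ranking. The scoring is invented
# and vastly simpler than any real news feed; it just shows the feedback loop:
# rank by predicted clicks, and past clicks dominate the prediction.

from collections import Counter

click_history = Counter({"politics_left": 40, "sports": 5, "politics_right": 1})

stories = [
    {"id": 1, "topic": "politics_left"},
    {"id": 2, "topic": "politics_right"},
    {"id": 3, "topic": "sports"},
]

def predicted_engagement(story):
    """Predict clicks from how often this user clicked the topic before."""
    return click_history[story["topic"]]

feed = sorted(stories, key=predicted_engagement, reverse=True)
print([s["topic"] for s in feed])
# ['politics_left', 'sports', 'politics_right'] -- the silo reinforces itself
```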
The sort of the sky's the limit.
We have built the internet and the internet
is a propaganda machine.
It's a propaganda delivery device, if you will.
And that's not, I don't see how that's going to stop.
Yeah, especially if every moment is being optimized
by an algorithm that's meant to manipulate your emotions.
Right, exactly. Going back to Facebook's optimizer algorithm, that's not optimizing
for truth, right? It's optimizing for profit. And they claim to be neutral, but of course,
nothing's neutral. And we have seen the results. We've seen what it's actually optimized for,
and it's not pretty.
This kind of data-driven political micro-targeting
means conspiracies and misinformation
can gain surprising traction online.
Stories claiming that Pope Francis endorsed Donald Trump
and that Hillary Clinton sold weapons to ISIS
gained millions of viewers on Facebook.
Neither of those stories was true.
Fixing the problem of these destructive algorithms is not going to be easy, especially when they're
insinuating themselves into more and more parts of our lives.
But O'Neil thinks that measurement and transparency are one place to start, like with that Facebook
algorithm and the political ads that it serves to its users.
If you were to talk to Facebook about how to inject some ethics into their optimization,
what would you do?
Would you sort of make a case for the bottom line of truth being a longer tail way to make
more money?
Or would you just say, this is about ethics and you should be thinking about ethics?
To be honest, if I really had their attention, I would ask them to voluntarily find a space
on the web to just put every political ad, and actually every ad, just have a way for
journalists and people interested in the concept of the informed citizenry, to go through
all the ads that they have on Facebook at a given time.
Because even if that article about Hillary Clinton and ISIS was shared thousands of times,
lots of people never saw it at all.
Just show us what you're showing other people.
Because I think one of the most pernicious issues is the fact that we don't know what other
people are seeing.
I'm not waiting for Facebook to actually go against their interests and change their profit
goal, but I do think this kind of transparency can be demanded and given.
O'Neil also says it's important to measure the broad effects of these algorithms and
to understand who they most impact.
Everyone should start measuring it.
What I mean by that is relatively simple.
This might not be a complete solution, but it's a pretty good
first step, which is measure for whom this fails.
Meaning, which populations are most negatively impacted by the results of these algorithms?
And what is the harm that befalls those people, for whom it fails?
And how are the failures distributed across the population?
So if you see a hiring algorithm fail much
more often for women than for men, that's a problem. Especially if the failure is they
don't get hired when they should get hired. I really do think that a study, a close
examination of the distribution of failures and the harms of those failures would really
be a good start.
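O'Neil's suggestion maps fairly directly onto a simple audit: compute failure rates broken out by group. The records below are hypothetical; the structure of the calculation is the point.

```python
# Sketch of the audit O'Neil describes: measure for whom the algorithm fails.
# The records below are hypothetical; "wrongly_rejected" means a qualified
# applicant the algorithm screened out.

records = [
    {"group": "women", "wrongly_rejected": True},
    {"group": "women", "wrongly_rejected": True},
    {"group": "women", "wrongly_rejected": False},
    {"group": "men",   "wrongly_rejected": False},
    {"group": "men",   "wrongly_rejected": True},
    {"group": "men",   "wrongly_rejected": False},
]

def failure_rate_by_group(records):
    """Return {group: share of that group the algorithm failed}."""
    totals, failures = {}, {}
    for r in records:
        totals[r["group"]] = totals.get(r["group"], 0) + 1
        failures[r["group"]] = failures.get(r["group"], 0) + r["wrongly_rejected"]
    return {g: failures[g] / totals[g] for g in totals}

print(failure_rate_by_group(records))
# women fail about twice as often as men here -- the gap O'Neil wants measured
```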
If you're not mad enough about how algorithms influence your life, I've got a doozy for
you after these messages. We are currently experiencing higher call volumes than normal.
Please stay on the line and an agent will be with you shortly.
Here's one that I think is kind of fun because it's annoying and secret, but you would never
know it.
So if you call up a customer service line, I'm not saying this will always happen, but it will sometimes happen, that your phone number will be used to backtrack
who you are. You will be assessed, like, are you a high-value customer or a low-value
customer, and if you're a high-value customer you'll talk to a customer service
representative much sooner than if you're a low-value customer. You'll be
put on hold longer. That's how businesses make decisions nowadays.
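Here is a minimal sketch of that kind of routing in Python. The phone-number lookup and value scores are invented; real systems are more elaborate, but the idea of ordering the queue by customer value is the same.

```python
# Minimal sketch of value-based call routing. The phone-number lookup and
# scores are invented; the point is that hold time is a function of how much
# the company thinks you're worth.

import heapq

CUSTOMER_VALUE = {           # hypothetical CRM lookup keyed by caller ID
    "+1-555-0101": 9200.0,   # high-value customer
    "+1-555-0102": 37.5,     # low-value customer
}

queue = []  # heap ordered by negative value -> highest value answered first

def incoming_call(phone_number):
    value = CUSTOMER_VALUE.get(phone_number, 0.0)
    heapq.heappush(queue, (-value, phone_number))

incoming_call("+1-555-0102")
incoming_call("+1-555-0101")

while queue:
    _, caller = heapq.heappop(queue)
    print("next agent takes", caller)  # the high-value caller jumps the line
```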
You are caller number ninety-nine.
Your call is important to us.
Please stay on the line.
99% Invisible was produced this week by Delaney Hall, with tech production and mixing by Emmett FitzGerald.
Katie Mingle is the senior producer.
Kurt Kohlstedt is the digital director.
Sean Real composed all the music.
The rest of the staff includes Avery Trufelman, Sharif Youssef,
Taryn Mazza, and me, Roman Mars.
Special thanks to Ryan Keesler and Courtney Riddle.
We are a project of 91.7 KALW in San Francisco
and produced on Radio Row in beautiful downtown Oakland,
California.
99% Invisible is part of Radiotopia from PRX,
a collective of the best, most innovative shows
in all of podcasting.
We are supported by the Knight Foundation
and coin-carrying listeners just like you.
You can find 99% Invisible and join discussions
about the show on Facebook.
You can tweet at me @romanmars and the show
@99piorg, or on Instagram, Tumblr, and Reddit too.
But our lovely home on the internet, with more design stories
than we can ever tell you here on the radio or podcast,
I guess this is a podcast,
is our website at 99pi.org.
Radiotopia.
From PRX.