Today, Explained - Inside the AI factory
Episode Date: July 25, 2023
We are used to thinking of artificial intelligence as knowledge generated by machines. The Verge's Josh Dzieza pulls back the curtain on the vast network of human labor that powers AI. This episode was produced by Amanda Lewellyn, edited by Amina Al-Sadi, fact-checked by Laura Bullard, engineered by Patrick Boyd, and hosted by Sean Rameswaram. Transcript at vox.com/todayexplained. Support Today, Explained by making a financial contribution to Vox! bit.ly/givepodcasts
Learn more about your ad choices. Visit podcastchoices.com/adchoices
Transcript
We're used to thinking of artificial intelligence as knowledge generated by machines.
You can get ChatGPT to write an email for you.
I hope this email finds you well. Blah, blah, blah, blah, blah.
You can ask Midjourney what Pope Francis would look like in a puffer jacket.
Can I say something without you guys getting mad?
But it turns out there's a vast network of human labor powering AI.
There are people training AI every day, sometimes all day, just clicking away on images, on
pixels, so that the AI can get better at identifying things the way we humans do.
We're going inside the AI factory on Today Explained.
You are listening to Today Explained.
I'm Sean Rameswaram, and I'm joined by Josh Dzieza from The Verge,
who just wrote a big piece about the people behind artificial intelligence.
It is about the human labor behind artificial intelligence. You know, it's often said that AI
learns from data, finds patterns in data, but that data has to be curated, sorted,
labeled, sometimes made by humans.
So I wrote about those humans.
It's often called data annotation, sometimes data labeling.
The work is pretty weird, and there's a huge range in what you might be doing. Like, let's say you log onto your platform and you
might be labeling clothes in social media photos. You might be sorting TikTok videos
based on whether they're fast paced or slow paced or something, or you might be like labeling
food and saying like, yes, that's Diet Coke. Or you might be looking at chatbot responses
and saying, you know, this is incorrect,
or this is profane, or, you know, too long,
or totally off the wall.
So there's a huge range in the types of jobs you might be doing.
What they have in common is they tend to be sort of small.
Like, there's one thing you're doing over and over and over,
and they also have extremely high quality standards.
Like let's say you are outlining vehicles
or something like that.
You have to outline it to the pixel.
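To make that concrete, here's a minimal sketch of what one completed outlining task might boil down to. The schema is hypothetical (each platform defines its own format), but the shape of the work is the same: one image, one tag, one pixel-coordinate outline.

```python
# Hypothetical record for a single outlining task; the field names
# are invented for illustration, not any platform's actual schema.

ALLOWED_LABELS = {"vehicle", "pedestrian", "traffic_cone"}

annotation = {
    "image_id": "frame_000123.jpg",
    "label": "vehicle",
    # The outline traced vertex by vertex, in pixel coordinates.
    "polygon": [(412, 215), (587, 214), (590, 330), (408, 333)],
}

def passes_basic_checks(ann):
    """Toy stand-in for the platform's far stricter quality review."""
    return ann["label"] in ALLOWED_LABELS and len(ann["polygon"]) >= 3

print(passes_basic_checks(annotation))  # -> True
```

A real review would also check that every vertex sits within a pixel or two of the object's true edge, which is where that to-the-pixel standard bites.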
Is this like the kind of thing that I do
when I'm trying to log into a website and it's like...
How many of these pictures have cars in them?
Exactly.
It's a lot like CAPTCHA.
That was actually a method and still is a method of kind of getting this work for free.
By definition, it's something AI can't do yet.
So when I do a CAPTCHA, I'm helping the back end of some website train AI?
Exactly. And you may have noticed over the years that CAPTCHAs have gotten harder.
That's because the AI has gotten better.
So you need blurrier, weirder images to sort of raise the bar and also to improve AI.
So in a way, you and I and all of us are AI annotators.
Yes. Yeah. And annotators are just people who do it, you know, full time for pay.
Did you see people doing this kind of work? So I did this kind of work.
You did it yourself? Yes, I did it myself as a way to meet people who are doing this kind of work.
It's all online for the most part. So did you like apply for a job? Did you cheat on The Verge?
I made all of $1.50, I think. But yes, it was my second job for a couple months. But
the application is very easy. You just have to speak English and have an email address.
You fill out some basic information and then you get a welcome email and you're invited into a Slack
channel. And then you have to start training to actually work. You have to learn what data annotation is and then do kind of a training module for each task.
It's sort of like a video game.
So these courses, they're like instructions.
So you have to read them carefully
and understand each and every bit of it.
And those instructions can come with scenarios
and they can come with some questions or quizzes. A project has, let's say,
three or four courses. You have to start the first one, you finish it, go to the next one,
and so forth. Now you can start working on this thing for money. Give me like a day. What was
your shift like? It was extremely difficult you know i thought i
was going to kind of log in and see what kind of jobs were out there and you know get invited to
these channels and move past this fairly quickly but i i kept flunking the training for the first
task i would try to do huh i can give you an example. Like one of the early ones I was doing was just labeling clothing.
And the instructions were something like,
label the items of clothing that are real clothes
that can be worn by real people or something like that.
It was just like seemingly quite self-explanatory.
So I just sort of clicked proceed past the instructions
and got started and failed immediately.
One of the things that tripped me up at first
was like there was a magazine that had some photos of clothes in it.
I was like, well, that's not, you know, you can't wear a magazine.
But like to an AI, these systems are really literal.
They're not very smart.
And so it's all just pixels.
It doesn't understand what a magazine is or what a reflection is.
And so you need to label images of clothes
and reflections of clothes in mirrors and things like that.
And so that was sort of the first curveball.
But then it just goes on from there.
It's like label costumes, but not suits of armor.
And where you draw that line is the difference between having a job and getting fired.
There are these sorts of weird distinctions that get drawn.
The full instructions were over 40 pages.
And you have to kind of keep referring back to those
as you do your work.
You talk about failing.
Do you still get paid if you fail?
No, I mean, you'll get paid for the tasks that you completed,
but then, you know, you just get booted out.
It says your low quality has, you know, gotten you suspended from this task, and you have to go back and start training again on some new thing and try to qualify.
Wow. So it's really in your interest to read the instructions, it sounds like.
Yeah. And workers, I found, because the instructions are not well-written, they're just inhumanly complex, end up teaching each other, doing a lot of free labor, honestly, doing YouTube tutorials or Google Meets where they try to teach each other what these instructions actually mean.
Is it steady work, Josh? Do you get as many tasks as you want? Is it like dependable income?
No. So this is one of the things that surprised me.
I mean, it's obviously unsteady at the level of, like, if you don't read
the instructions really carefully
and you do something wrong,
you're going to get banned.
And so that is very precarious.
But it's also just unsteady,
even if you're the best annotator
in the world,
there's like a really spiky demand
for this sort of work.
You know, there'll be a period
where there's a bunch
of well-paying tasks on there
and you can work as much as you want and then they'll disappear and you don't know why.
And you have no work or you can only do tasks for a penny or something like that. And then
they'll come back. So I spoke to a lot of people and people were frustrated at the low pay.
Even more than that, people were frustrated that it's steady enough that you can almost depend on it, but not steady enough that you're ever safe from suddenly having no work.
And so, you know, I talked to people who developed these habits of waking up every three hours in case something well-paying appeared, and then, if there was, staying up for 36 hours straight, just sort of labeling.
I talked to one guy who was just labeling elbows and knees.
He didn't know why, but it was paying well.
And he just wanted to do it while it lasted
because then you might be out of work for a week.
Elbows and knees?
Yeah, there's a lot of stuff on there
that you just have no idea what it's for.
And that was one of them
where it was just like photos of crowds
and it was like, label all the elbows and knees.
So, okay, so you're just sitting there
labeling elbows and knees for 36 hours straight for how much money?
It's super variable.
Each task pays some amount of money.
But the workers I talked to, they were getting paid for something like that, like a couple bucks an hour, as low as $1 an hour.
You cannot pay all the bills.
It's a side hustle.
Maybe just one bill, maybe it's for the internet bill, and then that's it.
Wow. And do they have any idea why they are labeling elbows and knees for a dollar an hour,
potentially 36 hours straight?
No. Well, they know they're training AI, and they know it's for some company, but they don't know whose AI it is or what they're training it to do, unless they can kind of guess that it's a self-driving car or something.
But the elbows and knees, no, they don't know because there's just layers and layers of
anonymity in the system.
So all they know about the platform is that it's called Remotasks. And then each project is named something totally cryptic, like Pillbox Bratwurst, just non sequitur code names. And so you never really know who all this labeling is for.
Why did the elbow cross the road? Let me tell you, it's quite humorous.
Today Explained, we are back with Josh Dzieza. Josh, you just told us that these people who are
slaving away, training AI, 36 hours straight, a dollar an hour, whatever it is, they don't know exactly what they are doing it for.
Do you know what they're doing it for?
So AI needs tons of examples to learn from.
And so autonomous vehicles are a great example of something where this thing is out in the world steering around a multi-ton piece of metal.
The stakes are really high.
You can't have it get confused.
It's super dangerous.
There was a case a couple years ago where an Uber self-driving car killed a woman in Arizona.
It could recognize pedestrians.
It could recognize bikes.
But it struggled to figure out what was happening with a person walking a bike along a street, not near a crosswalk. It didn't have enough
data on it. And so the demand for data for self-driving cars is super high. If you think
about how many times you're driving and you go past construction or just something unexpected
happens, you need to have data on it. So there's thousands and thousands of people
whose job it is to get data from these cars
and go through and say,
here's a pedestrian, here's a traffic cone,
here's a pothole.
That's basically how it works with any machine learning system. Whether it's language or image recognition, you need training data, and you need someone to make sure it's the right training data, to put tags on it, and provide that human input.
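To put that in code terms, here's a toy sketch, with invented features rather than anything from a real perception stack, of how those human tags become the targets a model is fit to:

```python
# Toy sketch: annotator-supplied tags are the supervision signal.
# The features are invented for illustration (height in pixels,
# width in pixels, whether the object is moving); this is not a
# real self-driving perception pipeline.
from sklearn.tree import DecisionTreeClassifier

X = [[180, 60, 1], [40, 45, 0], [170, 55, 1], [35, 40, 0]]        # feature vectors
y = ["pedestrian", "traffic_cone", "pedestrian", "traffic_cone"]  # human labels

model = DecisionTreeClassifier().fit(X, y)
print(model.predict([[175, 58, 1]]))  # -> ['pedestrian']
```

The model is only ever as good as those labels, which is the point: someone has to supply them, example by example.

And where are these data annotators based typically?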
They're all over the world because you need so much of this data. The pay tends to be fairly low.
And so you have a lot of people in India, the Philippines, Kenya is a big hub, Venezuela, because you often get paid in U.S. dollars.
And so if there's a place where the currency is crashing and people can do the work and there's fast internet, the work tends to go there.
Since I'm in Kenya, Africa, we get paid, I think, $1 to $2 an hour, which is pretty low.
You can see it's just a side hustle because you cannot cater for your basic needs,
whether it's a phone bill or the rent, yeah.
How long have we been outsourcing our data training?
It's been at least a decade, you know, probably more.
One of the turning points happened in the late 2000s.
You've always needed some form of data curation,
but before that it was often done by a researcher
and their grad students or something.
But with increasing computational power,
it became possible to train on more data.
And so in the late 2000s, you have people start to use
labeled data sets of millions of images instead of, you know, a couple of thousands.
We downloaded nearly a billion images and used crowdsourcing technology like the Amazon Mechanical Turk platform to help us to label these images.
When you reach that scale, people start going overseas
because you need people who will work for less.
Together, almost 50,000 workers
from 167 countries around the world
helped us to clean, sort, and label
nearly a billion candidate images.
Will the need for these data annotators eventually dry up?
Is this job sort of a finite experiment?
There are different views on that.
There's certainly people in the AI industry who think
we're going to reach a breakthrough where the AI is going to be so smart
that it doesn't need human input anymore.
It's going to become super intelligent.
There's a lot of other people
who disagree with that.
And certainly historically,
what has happened is
annotation is always kind of getting automated.
Like if you look at those early
image recognition systems,
like that's automated.
AI can tell the difference
between an image of a cat and a dog. But it enables new technologies like self-driving cars,
and now you need even more people doing even more and more complicated forms of annotation.
And that has been the way it's gone. And you can certainly see a world where
these language models are out in the world, and all the things they're supposed to be doing,
like giving health advice or legal advice,
are complex, changing, high-stakes fields, and you're going to need even more human annotation there.
So is this future of, like, you know,
perpetual human collaboration with AI
going to lead us to some ideal
where the cars will drive themselves perfectly?
Or the, I don't know, the robot
doctors will know my knee from my elbow. I guess I should talk about sort of how
brittle these systems are. That's the word that's used to describe their knowledge,
the state of their knowledge. When you're training something to be accurate, for example,
you have people who are rating it for
accuracy, but one, maybe they're not rating it correctly because it's very time consuming and
often impossible to fact check every written response. Often responses are open to interpretation
or just too complicated. And two, you don't know that it's learning the right patterns,
as opposed to learning to talk like whatever text people have labeled as accurate sounds like.
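As a rough sketch of what one of those rating tasks might produce (the fields here are invented, not any lab's actual format), the annotator's judgment gets boiled down to a few flags that later drive training:

```python
# Hypothetical record from a chatbot-response rating task; the field
# names are invented for illustration.

rating = {
    "prompt": "Is it safe to mix bleach and ammonia?",
    "response": "No. Mixing them releases toxic chloramine gas.",
    "accurate": True,    # the annotator's judgment, which may be hard to verify
    "profane": False,
    "too_long": False,
}

# The trap described above: training optimizes for text that GETS
# LABELED accurate, which is not the same as text that IS accurate.
reward = 1.0 if rating["accurate"] and not rating["profane"] else 0.0
print(reward)  # -> 1.0
```

In effect, the model is pushed toward whatever reliably earns that 1.0.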
So one of the risks that I think we're seeing now is that these language models particularly have become extremely good bullshitters.
Huh.
Like, you may have seen the case of the ChatGPT lawyer, who submitted some legal filings citing cases that he asked ChatGPT for. The lawyer cited more than half a dozen relevant court decisions to make his case
for why the lawsuit had precedent. The only problem, none of those decisions were real.
The program even reportedly told him yes when he asked it to verify that the cases were legitimate.
Sounds like a trash lawyer, though, honestly.
Yes, I would certainly not consult ChatGPT for legal advice.
And the question is, will it ever get there if you just throw enough annotation at it, enough data at it?
Is there going to be a point where it learns what is true or false, or what sound legal reasoning is, or something like that?
Or is it going to continue to just sort of be a better and better mimic and you're always going to have that possibility that it's going to make some catastrophic error?
That's an open question.
And also, it's an open question of how you're going to have people who can continue to oversee these models as they get
so good at mimicking people. Yeah. Right. Like you need a very good lawyer all of a sudden who can
critique an AI model that is good at, you know, making up legal advice. And what about the other
side of this? Just like the treatment of workers. I mean, you mentioned people working 36 hours
straight. If Google might be behind the contract job that someone in Kenya has that's paying them a dollar an hour to annotate elbows,
are they cool with working people like that, 36 hours straight, for like a dollar an hour?
That is a question for Google.
But I can say that some of their annotators in the U.S., the people who are reading search results and YouTube results through the platform Appen have been protesting their conditions, saying that they're underpaid, that they don't have health benefits.
Raters are why Google search results are so good.
They make sure that people like you and me get the information we need every single time.
And no one working for Google should be struggling to pay their rent.
Google's defense has been that they are paid fairly.
But there tends to be in the industry not a lot of attention on this kind of work.
Part of it, I think, stems from the sense that it won't be needed for long.
That you, you know, the AI will get good enough that you don't need annotators anymore.
And so it's not really a job so much as just some temporary work that you're calling on someone to do.
And what happens after that is not really your concern.
And so I think there's a sense where companies just don't even really think of it as a labor issue
that they're just kind of buying a bunch of data. That may be changing. I've seen, in papers, people say, you know, these annotators were paid the median wage wherever they're based, or things like that. I think there is, when attention is brought to this situation, often a push to do better, but it's pretty uneven. And there's just not a lot of transparency in the data pipeline. And so even if you want to do better, it's hard.
You know what it sounds like, Josh? It sounds like it might just be easier to pay
people to do jobs. Did that occur to you at any point while you were clicking through whatever data that you were
annotating? That did occur to me many times while I was annotating. There's one where it was quite
acute where I was tracing pallets in like a warehouse for some kind of self-driving forklift.
And just the amount of really kind of excruciatingly detailed labor that was going into figuring out how to drive a forklift around to automate, you know, one job, a forklift driver.
It was pretty staggering.
I mean, there must have been hundreds, if not thousands, of people working on this thing around the world, just tracing, pixel by pixel, each pallet and each pallet hole in these dark warehouses.
I guess that the hope of these companies is that once you've done all that work,
you have this thing that can do it forever.
But I don't know that that's true because, you know,
the world keeps changing and throwing up new edge cases.
And somewhere in this world, that used to be a good union job.
Right, exactly.
What were you hoping people would take away from your piece? What were you hoping people would learn by going inside this AI factory? I think there are a couple of different
things and a couple of different reasons why it's important to look at this work. I mean,
the first is just kind of the labor issues that it raises. You have these potentially extremely profitable technologies that rely on often
low-paid labor around the world that is often not discussed.
And kind of the second thing that I wasn't expecting to find but found is that the work
is structurally precarious in a way that stands out even for gig work, and gig work is notoriously precarious. But the way AI development works,
where you need a ton of data to train your model,
and then you need like a bit of more specific data
to fill in some edge case,
and then nothing for a while,
and then a ton more data,
means that if this is going to be a fixture in an AI economy,
there's going to be a lot of time when people are not working, and there's going to be times when lots and lots of people need to work. And the way it's set up right now, the workers pay the
cost of that. They're the ones who are unemployed whenever they're not needed, and then they're
expected to be kind of on demand when they are needed. I think also just a better understanding
of the way these systems work. I think it's easy to, especially with something like ChatGPT, when it can tell you
that it's an AI trained by OpenAI using reinforcement learning and all about itself,
that it acts in these very human-like ways, that there's a tendency to think it can reason like a
human. But it's important to think about the fact that a lot of that stuff was written there
manually by humans and then reinforced by humans.
And there's a sense in which seeing the humans in the system kind of makes you realize how
inhuman these machines are and that they have some pretty glaring weaknesses.
I don't trust him, Josh.
I think that's wise for the time being.
Josh Dzieza does investigations at The Verge.
You can read his work at theverge.com.
His piece that inspired our episode today was titled Inside the AI Factory, and it also ran on the cover of a recent issue of New York Magazine.
The show today was produced by Amanda Lewellyn.
It was edited by Amina Al-Sadi and fact-checked by Laura Bullard.
We were engineered by Patrick Boyd.
I'm Sean Rameswaram, and this is Today Explained.