Utilizing Tech - Season 7: AI Data Infrastructure Presented by Solidigm - 2x09: Building Transparency and Fighting Bias in AI with Ayodele Odubela
Episode Date: March 2, 2021. When it comes to AI, it's garbage in, garbage out: A model is only as good as the data used. In this episode of Utilizing AI, Ayodele Odubela joins Chris Grundemann and Stephen Foskett to discuss practical ways companies can eliminate bias in AI. Data scientists have to focus on building statistical parity to ensure that their data sets are representative of the data to be used in applications. We consider the sociological implications for data modeling, using lending and policing as examples of biased data sets that can lead to errors in modeling. Rather than just believing the answers, we must consider whether the data and the model are unbiased. Guests and Hosts: Ayodele Odubela of @CometML is an ML instructor, founder, and author. Connect with Ayodele on Twitter at @DataSciBae. Chris Grundemann is a Gigaom Analyst and VP of Client Success at Myriad360. Connect with Chris at ChrisGrundemann.com and on Twitter at @ChrisGrundemann. Stephen Foskett, Publisher of Gestalt IT and Organizer of Tech Field Day. Find Stephen’s writing at GestaltIT.com and on Twitter at @SFoskett. Date: 3/2/2021 Tags: @SFoskett, @ChrisGrundemann, @DataSciBae
Transcript
Welcome to Utilizing AI, the podcast about enterprise applications for machine learning,
deep learning, and other artificial intelligence topics.
Each episode brings together experts in IT infrastructure to discuss AI in today's data
center.
Today, we're continuing our ongoing discussion of ethics and bias in AI, since this is such a crucial topic.
First, let's meet our guest, Ayodele Odubela.
Thank you so much.
I'm Ayodele Odubela.
Like you mentioned, I'm a data science advocate at Comet, where we work to help data science
teams build models faster and increase their reproducibility and be able to collaborate
with each other.
And I'm Chris Grundemann. In addition to being the co-host here with Stephen, I am also an independent consultant, content creator, coach and mentor.
And as you mentioned, I'm Stephen Foskett, organizer of Tech Field Day and publisher of Gestalt IT.
You can find me on Twitter at @SFoskett. So those of you who've listened to previous episodes of
Utilizing AI over these last two seasons have heard us return again and again to the topics
of ethics, bias, the implicit challenges of dealing with data, and the fact that basically
once you have an AI model trained on a biased data set, it's inevitably going to give us biased outputs. Now, Ayodele, this is
something that you have focused on both professionally and personally. So I wonder if
maybe you can give us your perspective on bias, both in data as a data scientist, as well as in
machine learning. I think you hit the nail on the head by saying it's very much garbage in, garbage out. Models that are based on biased data tend to be biased. And I would even go further to suggest that if we are not taking steps to investigate bias, we are more likely to perpetuate it, regardless of if we think a data set is more unbiased than another one. So I think it's important to mention that while awareness is really great,
organizations can start to focus on this transparency and accountability
so that we do get to equitable ML and don't deal with issues of disparate outcomes
with our machine learning models.
Now, how specifically can companies dive into that
level of transparency, right? I mean, are we talking about ensuring that the AI models themselves are
explainable or is it the data sets that should be open? And obviously there's some proprietary
things here, there's some intellectual property issues that come into play when you're talking
about companies that are going for profit. Are there specific steps or how-tos that companies
can take really quickly to provide that level of transparency that we really need? Absolutely. I think part of
it is adopting new frameworks for communicating with users. So when we're talking about transparency,
we should be informing users of what data we collect about them. I think we'll see something similar to the GDPR in the US
at the bare minimum really soon. But going beyond that to informing people what kinds of decisions
we make about them. Are we predicting what kind of user segment they're going to be in? And are
we telling them so? We don't necessarily have to go into all of the specifics because of all of the
issues, but do users know that we're making predictions about them in general? That makes
it so much easier to open the door for this algorithmic appeals process or an FDA for algorithms,
ideas that have been tossed around a lot. Does that tie into the idea of data observability as well,
or is that something separate? I think it absolutely does, especially monitoring how
we collect data and if it's been processed before it gets to our data teams. And understanding those
gaps is really, really difficult. It's so hard to investigate what kinds of data changes and transforms an API might be making before you receive the data that you use for training models.
So just having, A, more observability and knowing the kinds of changes that happen in this process.
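As a rough illustration of the data observability Ayodele describes, here is a minimal sketch of flagging drift between the data a model was trained on and the data now flowing in. The column names, the KS test, and the 0.05 threshold are illustrative assumptions, not anything prescribed in the conversation.

```python
# Minimal sketch: flag distribution drift between training data and incoming
# data with a two-sample Kolmogorov-Smirnov test. Column names and the 0.05
# threshold are hypothetical placeholders.
import pandas as pd
from scipy.stats import ks_2samp

def check_feature_drift(train_df: pd.DataFrame,
                        live_df: pd.DataFrame,
                        numeric_cols: list,
                        alpha: float = 0.05) -> dict:
    """Return a per-feature report of KS statistics and drift flags."""
    report = {}
    for col in numeric_cols:
        result = ks_2samp(train_df[col].dropna(), live_df[col].dropna())
        report[col] = {
            "ks_stat": result.statistic,
            "p_value": result.pvalue,
            "drifted": result.pvalue < alpha,
        }
    return report

# Hypothetical usage:
# report = check_feature_drift(train_df, live_df, ["income", "age"])
```

In practice a dedicated monitoring tool would track this continuously, but even a check this small can surface the silent transforms being described here.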
And then being clear and transparent with users about how we're manipulating data. So as someone who makes ML models, I know how
important feature selection and feature engineering is to our outcomes, but it's really rare that we're
transparent about the important features that we are using and communicating to users: "It's X, Y,
and Z that made us make this prediction about you." I think that sometimes when companies think of bias,
they immediately jump to sort of, I don't know,
a sociological political perspective
instead of approaching it really as a data problem.
I mean, it's essentially that these things
are only as smart as they are,
as the information that they're fed.
And bias isn't as much a moral question from an ML
perspective as it is a functional question. Essentially, if we build our systems wrong,
then they're going to be wrong. And the approach to fixing that, I think, is similarly,
it's not that we're throwing stones at people. It's that we're
saying, look, you know, we have to be cognizant of this and we have to build things differently.
Is that right? Absolutely. I think, especially when I'm talking to organizations, I try to focus
on, I don't even say bias at times. Sometimes I just talk about statistical parity and understanding if we are
collecting data that's representative of the actual populations, or if we have skewed samples of
data. Surveys, I think, are a popular example: you are going to have
incredibly biased survey results from the people who tend to fill out surveys more often. It's easier
to get information from them, and they are more
willing to respond for things like an Amazon gift card or something like that. We understand in the
survey perspective that our results are almost always biased by those who are willing to
participate. But when we are talking about our actual data sets, one, it's really uncommon that we document them in as much detail as we should.
I think that's one of the biggest issues and almost every tech org has seen it where
down the line you have a new data science department and they're unable to decipher
what's really going on in the data and how it was collected and those nitty gritty issues that people like myself are often trying to face.
So it is partly having better documentation, but it is also genuinely about statistical parity. Regardless of what bias we're looking at, whether it's race data, gender data, age, or nationality, it's about having
representative data and about being able to create statistical parity amongst those groups when we're
creating models. Yeah, that's a really interesting way of phrasing it, right? And I really like that
idea of talking about it in terms of statistical parity, because obviously that's not a new problem.
Trying to get representative groups for your study, whatever that study might have been, whether it's surveys or medical studies or anything along those lines, has always been an issue in coming up with good results, right? Whether you were doing the counting by hand or now with AI. So that almost takes this out of the realm of AI; it's really just a human problem, or a human organization problem. But obviously, it seems that AI can accelerate it.
So are there other methods or other ways of approaching this
that are maybe, you know, pre-date AI that we should be applying to AI?
Or is that the same thing you've already been talking about?
Or is there other things that we could learn from, you know,
past attempts at statistical parity?
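Before the answer, here is a minimal sketch of what the statistical parity Ayodele keeps returning to can look like as a concrete check, assuming a pandas DataFrame with a protected-attribute column and a binary outcome column; the column names and numbers are made up for illustration.

```python
# Minimal sketch: statistical parity difference for a binary outcome,
# i.e., the gap in positive-outcome rates across groups.
# "group" and "approved" are hypothetical column names.
import pandas as pd

def statistical_parity_difference(df, group_col, outcome_col):
    """Largest gap in positive-outcome rates between any two groups."""
    rates = df.groupby(group_col)[outcome_col].mean()
    return float(rates.max() - rates.min())

df = pd.DataFrame({
    "group":    ["a", "a", "a", "b", "b", "b"],
    "approved": [1,   1,   0,   1,   0,   0],
})
print(statistical_parity_difference(df, "group", "approved"))  # ~0.33
```

The same check works on model predictions as well as raw data, which is one way to watch fairness alongside accuracy once a model is in production.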
Yeah, I think it's kind of one and the same, but
we can rely on some precedents that other industries have set. So, starting with
finance and the way they operate: I have always encouraged organizations to pretend they were consumer
reporting agencies. And so if you are not dealing with a lot of regulation in exactly
how you have to communicate with users, think about yourselves as if you were a credit card
company. If you deny someone for a credit card, you have to tell them why, because the Federal
Fair Credit Reporting Act requires you to tell people why. So operate on the assumption that even if there is no regulatory body now,
there may be one in the very near future. So having an understanding there, I think,
sets groups up for success in that they are, A, better documenting their work, and then also focusing on models that are easier to explain
and easier to interpret. So for the vast majority of organizations, you can take the trade-off of,
you know, giving up a couple of percentage points in accuracy to have an interpretable model,
and that will save you a lot of headaches over time. So especially if you're predicting
things on your users, it's okay to be a little bit less accurate about what segment they might be in
if you have ways to explain your model and you can clarify exactly why it made a specific decision
about a user and then allow them to appeal it. So there's a lot of work to be done, I think, internally with
teams, working with stakeholders to understand we don't necessarily need a neural network.
We are probably going to be better off using like a linear method or a decision tree.
It's one of the conversations I've had many times, and it takes a little bit of convincing, to be honest,
in some of these organizations to move away from pure accuracy or just trying to beat the state of
the art, and toward creating a culture of accountability by using interpretable frameworks.
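To make the trade-off Ayodele describes concrete, here is a minimal sketch of an interpretable model whose reasoning can be shown to a user. It uses scikit-learn with synthetic data, and the feature names are hypothetical; it is an illustration of the idea, not her recommended pipeline.

```python
# Minimal sketch: a simple, interpretable model whose coefficients can be
# communicated to users, in the spirit of trading a little accuracy for
# explainability. Data is synthetic; feature names are made up.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=4, random_state=0)
feature_names = ["tenure", "monthly_spend", "support_tickets", "logins"]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

print("accuracy:", model.score(X_test, y_test))
# Each coefficient gives a direct, explainable account of a feature's pull.
for name, coef in zip(feature_names, model.coef_[0]):
    print(f"{name}: {coef:+.2f}")
```

A decision tree would serve the same purpose; the point is that the decision about a user can be stated in terms the user could understand and appeal.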
It's interesting that you bring that up because that's
one of those things that I keep kind of coming back to in my mind. Whenever anybody says AI
these days, they immediately jump to machine learning and specifically like deep learning.
And yet, as you said, like a decision tree or an expert system might be able to give you better
results if it's properly constructed. But of course, one of the challenges
with expert systems in the past has been that they also had bias because of the experts used
and the paths that those experts assumed people would follow. So essentially, you've got a decision
tree. The questions you're asking and the paths available can actually
introduce a tremendous source of problems when it's actually applied.
That, I think, might even be harder to deal with than a machine learning situation where
essentially you just have to find a
representative data set. Absolutely. I think we've basically said that it's okay to develop technology
without really working closely with subject matter experts. And I think that is one of
the biggest issues specifically in machine learning. Obviously with expert systems, who these experts are and their perspectives may
not necessarily match up with the population that this model is going to be making predictions on.
But when we're talking about ML, it's even more difficult because we tend to have engineering
teams that are purely engineering teams. Regardless of our industry,
we should have, I would say, more professionally diverse groups. First of all, because social
scientists are able to help us combine these historical bits and pieces that we don't
already understand. So I think a great example of this is PredPol or like the predictive policing algorithms. So because a lot of people who are creating these products are typically in dominant groups and in dominant communities, it's easy to say it would either be unbiased or purely based on ground truth because they're only using information like the crime type, the location,
as well as the date and time. But the perspective that social scientists can offer is an understanding
of different neighborhoods and knowing that some areas are already over-policed or knowing that
some areas, because of a lack of access to specific resources, end up over-policed and then see larger amounts of crime.
Those are the perspectives that it's really hard for on the ground engineers who are buried in the
actual code to be able to recognize. So that's where I would say we leverage perspectives and
even for myself, I am a data scientist, but I'm not a historian. I'm not an anthropologist. I can speak because of lived experience. But even then, I have had to do a lot of that work myself. And unfortunately, so many data folks are tasked like that.
Regardless of your organization, you're probably going to have a moment where you're given
a data set you've never seen before and don't have any context about.
And then being asked to create a model that works in a short period of time is really,
really difficult.
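One lightweight response to the documentation gap described above is to store a small datasheet next to every data set so the next team inherits its context. The fields and values below are illustrative assumptions in the spirit of datasheets for datasets, not a standard mentioned in the episode.

```python
# Minimal sketch: a dataset "card" saved alongside the data itself.
# Every field and value here is a hypothetical example.
import json

dataset_card = {
    "name": "loan_applications_2020",
    "collected_by": "growth team web form",
    "collection_period": "2020-01 to 2020-12",
    "known_gaps": [
        "only customers who completed the online form",
        "no paper applications included",
    ],
    "preprocessing": ["deduplicated by email", "income imputed with median"],
    "protected_attributes_present": ["age"],  # race and gender not collected
    "intended_use": "exploratory analysis only, not automated decisions",
}

with open("loan_applications_2020.datasheet.json", "w") as f:
    json.dump(dataset_card, f, indent=2)
```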
Yeah, so as somebody who has a bachelor's degree
in sociology, I endorse your endorsement of sociologists.
No, I am so used to having people accuse sociology
and social science of not being a real science,
but frankly, it is a real science
in that it attempts to objectively look at issues that are very hard
to look at objectively. And I think that the, you know, the policing example, the, you know,
the mortgage and finance example, you know, I think that, you know, a nerd might approach that
and say, look, data is data, like these people were arrested at this place in this time,
or these mortgages were defaulted in these
neighborhoods, and therefore, this is hard facts. Whereas a sociologist might look at that and say,
wait a second, like, you know, there are outliers here. How come there's so much data for this
location versus this other location? And, you know, it's that scientific approach as opposed
to sort of like a nerd approach that I think that we need when we're looking at data analytics and machine learning.
And frankly, this is one of the biggest problems I see with a lot of the applications of AI today: they're essentially trying to build a model that's too big and too broad and answers too many questions instead of focusing
on a model that does a few things well. Do you guys, am I off base here? I would say not at all.
I think that's spot on for the various issues. The narrower the actual scope
of outputs we want, the easier it is to create a model that's representative
and then be able to check it for bias. I think one of the biggest critiques I have for most orgs,
especially non-credit reporting agencies and lending companies, is that we don't ask for
data on race and gender and nationality. And the problem is that that makes it nearly impossible to test if a model is fair.
And that is one of the hardest parts because if you're a lending company, there are regulations
that say you can't just ask your users what their race and gender are.
But for tech companies, it's hard.
I mean, we have model monitoring.
So when models are in production, we're trying to see if there is an accuracy drift. Are they getting worse over time? But it's really rare that we do the same for fairness. And that's because we don't have our data on protected classes. So we're trying to kind of boil the ocean, solve everything at once, but we don't have reliable methods for even checking to see if
we were doing good in the first place. That's really interesting. And I think,
it reminds me, there was an open letter published last year about this paper. I think the paper was
called "A Deep Neural Network Model to Predict Criminality Using Image Processing." And
essentially what they were doing was they had scanned a bunch of pictures of criminals and were then using that data set
to predict whether or not a new picture was a criminal. And of course, this is where we see
this idea of like, you know, essentially algorithmic discrimination and oppression,
because it just so happens that, you know, en masse, more people of color are arrested than white people,
not necessarily because they commit more crimes, but because that's where policing is focused.
And so you've got this kind of real world problem of endemic and systemic racism that is now perpetuated into the algorithmic realm
because you've trained this model on biased data.
And I think it's interesting that perhaps having more understanding of that, or
more data, could actually help solve that, right? We're almost using this idea of "oh,
it's unbiased, we just use the data" as a shield, where if we had gone a little bit deeper and had
more data and then looked at the actual results, we could have come to a better result. Yeah, I think a huge part of it is really that
we are just perpetuating what has happened in history. And I think because machine learning
and AI, out of every area of tech, have been represented by the media as the most
mystical, like we're magicians. And because there is this air of mystery around it,
it's easy to just believe the answer we get from a computer.
Like you can look at studies.
Most people are not going to argue with a calculator
if it gives them the wrong answer
because we assume a level of objectivity
and that's not the case.
Especially when we're looking at data
like criminal justice data.
I have been asked this several times: there's almost no de-biasing.
There's nothing we can do to change historical policing patterns.
There's almost nothing we can do to find data about the crimes that weren't reported. So we have to understand, and I think this is the biggest struggle for the
vast majority of technologists, is not always assuming that because data exists, it's ground
truth and representative and tells you the whole story. You can look at criminal records and assume
because of how certain areas are policed that people of color are more likely to be criminals.
But I think at the core of this issue is that we make assumptions that certain things can be predicted. I recently did a talk about being able to predict gender, and Genderify and all of these other tools
that try to predict gender based off of your name or based off of an image. We're operating
under an assumption that because of an image, we can predict someone's gender. And I think that's
a flawed assumption to begin with. The same way for predicting criminality. I think it's a flawed assumption that there are
details about your face that for the most part are immutable data, like unless you go get surgery
or drastically change how you look, the structure of your face is immutable. And we are assuming
that that somehow predicts criminality. So I think we tend not to make the right assumptions about what can be
predicted, as well as when we're using this data to
build models. We also bring our assumptions along with us during the modeling process. And I
think that's one of the areas that's harder to fight or deal with.
Yeah.
And before people start saying, oh, no, it's all about race and gender and things like
that, this kind of bias can actually creep in anywhere in AI and ML.
So for example, imagine you were creating a network intrusion detection system.
And it learned that network intrusions generally come
from higher-numbered switch ports, because that's how the network administrator plugs things in, right? They
plug in the important systems down here, and then they kind of work their way up the switch until
they get to the less important systems. And that's where the network intrusions come from. Or let's
say your firewall learns that you can generally trust connections from
America and you generally can't trust connections from Ukraine. Well, that's all well and good
until you've got your Ukrainian CEO who's trying to connect. So this is a real example of bias,
but there's biases that creep in everywhere in data and everywhere in our
assumptions. And we have to make sure that we're asking these questions continually
in the development of these models. 1000%. Yeah, I wonder, I mean, we've talked a lot about bias,
and I think that was a good kind of summary of the bias question, Stephen. There's some other ethical issues around AI that I'm interested in.
One of them is this concept of ghost work.
And I don't know if this is something, Ayodele, that you've ever encountered or had to work
with, but there's a lot of, I'd call them almost white collar sweatshops, I guess, perhaps,
where we've got folks in some developing countries where labor happens to be cheaper, who are doing some of the categorization to train AI. And potentially, you know, just like
maybe you shouldn't be paying someone five cents an hour to sew shoes, you probably shouldn't be
paying someone five cents an hour to look at, you know, medical images or something like that.
And I wonder how that's part of the ethical issues you've looked at as a data scientist. Absolutely. I think that's one of the more pervasive issues because everyone is excited
about neural networks and deep learning, but they rely heavily on vast amounts of labeled data.
And to get this labeled data, it's exactly like you mentioned. We are paying not even close to like minimum wage for people to wade through images, wade through
other types of just tabular data, and then label it. But one of the largest issues that comes out
of this is based on where you are in the world, based on, unfortunately, especially with human
language, there are so many different names for the same thing. I love that there's a great map
that shows you what people in different parts of the United States call certain things. So, like,
do you call it a Coke, a soda, a pop? Those things are important when we're talking about data
labeling. And especially for people who are, like you said, I love the way you said it, like
white collar sweatshops, people who are doing this
don't have the time, nor are they really paid enough, to think critically
about their labeling. And then we don't really assess how their labeling might be impacted by
regional differences or human biases alone. So I think there's some really great examples of how, especially with
language data, it's incredibly easy to be accidentally biased. So oftentimes we're not
looking at these geographical differences and how they play into our language and how
we call certain things different names depending on where we live.
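One way to surface the labeling inconsistencies Ayodele describes, before they reach a training set, is to measure how often annotators agree with each other. This is a minimal sketch using Cohen's kappa from scikit-learn; the labels are made up to echo the coke/soda/pop example.

```python
# Minimal sketch: inter-annotator agreement as a quick label-quality check.
# Low kappa suggests unclear guidelines or regional wording differences.
# The labels below are invented for illustration.
from sklearn.metrics import cohen_kappa_score

labels_annotator_a = ["soda", "soda", "pop", "coke", "soda", "pop"]
labels_annotator_b = ["soda", "pop",  "pop", "coke", "soda", "soda"]

kappa = cohen_kappa_score(labels_annotator_a, labels_annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")  # well below 1.0 signals disagreement
```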
And of course, that can result in just the kind of problems that we're discussing. So I mean,
if there are different words that are culturally tied to people of different races or different genders, then that can certainly lead to discrimination by the applications that
are using those systems, even from something that, you know, like you said,
is not necessarily directly tied to those situations. So we do have to wrap up here,
but I wonder if you all want to sort of finish it up. You know, what should be the takeaway here?
What should enterprises be looking for when they're trying to build an AI model without bias? I mean,
how should they approach this? I would say my biggest tip is to work with subject matter
experts and social scientists, and don't be afraid to leverage contractors and people outside of your
organization. I think myself and so many others, we are a little bit too close.
We are so tied to KPIs and goals and all of these other things that get in the way of
actually creating products that prioritize the safety of marginalized people and vulnerable
groups. So work with social scientists. Don't devalue the level of science and the level
of work that they do. And try and find ways to integrate them into your teams or work with them
while you're in the developing stages of ML, as well as when you are in production and you're monitoring these models.
I would say my biggest tip is definitely to think a little bit outside of the technological box and
not assume we're only fixing technical problems, because we are trying to fix a lot of social
issues with technology. Chris, what do you think?
Yeah, I mean, I'm really interested in the idea of transparency in a lot of ways, right, of where you got the data, how it was labeled, and then also the algorithm and what it is spitting out. Because I think, you know, one of the things that we talked about earlier was this idea of, you know, looking at AI or ML as magic.
And I think we have a tendency to do that.
And so we just accept these answers
and understanding how that answer came about is really important. And so really having transparency
into the system or algorithm that's creating that answer is super important and would definitely
make me feel better about the answers I'm getting. Yeah, absolutely. It's not magic. And I think that
nerds and data people tend to think of things as, you know, kind of hard, soft, on, off, black, white. But there's a million shades of gray, and we need to build systems that are able to approach, you know, confusing inputs and give, you know, reasonable outputs instead of just assuming that things are one way or the other.
So let's finish up, as we often do, with a few fun questions here on the podcast for our guest.
I warned you ahead of time that there would be questions, but not what the questions would be.
So let's hope you have some fun with these. So we'll start with an easy one. And I say easy because no one has come to any remote agreement on this yet in any episode. So we've gotten answers all over the map.
So here we go. Ready? When will we see a full self-driving car that can drive anywhere at any time? Ooh, I would say probably maybe 30 years, 30 years out.
Okay. So you're on the "not right now" side. I mean, we've recently gotten "we already have that,"
and we've also gotten "never." So yeah, you're kind of in the middle of that.
Great. How about this? Can you think of any jobs that will be completely eliminated
by AI in the next five years? Oh. Like specific jobs that AI can take over? Yeah. Honestly, no.
I can see several jobs that AI will augment drastically, but I don't think complete elimination is going to
happen. Yeah, again, we've gotten such a wide range of answers to that one. It's so much fun to
hear the different ideas. All right. And finally, and this is sort of on point for the topic of
today, is it even possible to create a truly unbiased AI or will there always be biases?
I will say no, there will always be biases. I think we can mitigate the amount of harm that
is actually caused. And I think we have to kind of ask, with respect to whom. So are we fair to individuals or are we fair to groups?
Because we can't satisfy both of those things.
So it's what levels of bias are we able to accept?
And even if we have a biased model or somewhat biased data, are we able to reduce
the amount of harm
that that causes on people?
Excellent answer.
Honestly, right there, that's the, you know,
I think we could have talked about that
for quite a bit longer.
I really appreciate it.
So thank you so much for joining us, Ayodele.
Where can people connect with you
to follow your thoughts on enterprise AI
and machine learning and data science and other topics? Yes, I'm most active on Twitter, so that's @DataSciBae.
You can also add me on LinkedIn, and my website is ayodeleodubela.com. So I know there's lots
of vowels in there, but if my name's spelled right, you'll find it.
And you can find me at @ChrisGrundemann on Twitter
or online at ChrisGrundemann.com.
Thanks a lot.
You can find me at @SFoskett on Twitter.
You can also find me every Wednesday
on the Gestalt IT Rundown,
where we talk about the enterprise tech news of the week.
So thank you for listening to Utilizing AI.
If you enjoyed this podcast,
please do subscribe, rate, and review it. I know everybody says that, but it really does help. And please do share the show. Go to utilizing-ai.com or connect with us on Twitter at @utilizing_ai.
Thanks, and we'll see you next week.