Everyday AI Podcast – An AI and ChatGPT Podcast - EP 141: How To Understand and Fix Biased AI
Episode Date: November 9, 2023

Why are AI models so biased? Whether it's ChatGPT or an AI image generator, LLMs often have certain biases and tendencies. Nick Schmidt, Founder & CTO of SolasAI & BLDS, LLC, joins us to discuss how to understand and fix biased AI.

Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion: Ask Nick and Jordan questions about AI
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn

Timestamps:
[00:01:20] Daily AI news
[00:04:00] About Nick and Solas AI
[00:07:14] Algorithm misuse can lead to discrimination
[00:11:54] 3-step burden shifting process to address discrimination
[00:14:18] Internet usage leads to biased data collection
[00:17:30] AI bias, accessibility, and user control insights
[00:22:59] Algorithm fairness through regulations
[00:26:16] Algorithmic decisioning and human biases
[00:27:32] How to address biases in AI models?

Topics Covered in This Episode:
1. Prevalence of Bias in AI Models
2. Detection and Mitigation of Bias in Algorithms
3. Practical Solutions for Addressing Bias in AI

Keywords:
AI bias, discrimination, image generators, language models, input data, burden shifting process, biased information, societal biases, fairness, exclusion, collective punishment, biased AI, practical advice, best practices, everyday users, legal framework, AI news, smart devices, NVIDIA, animated films, detection, mitigation, discriminatory outcomes, generative AI, model development, algorithmic decision-making, dynamic models, reinforcement, algorithmic fairness, Solas AI, newsletter, daily AI

Send Everyday AI and Jordan a text message. (We can't reply back unless you leave contact info)

Start Here ▶️
Not sure where to start when it comes to AI? Start with our Start Here Series.
You can listen to the first drop -- Episode 691 -- or get free access to our Inner Circle community and all episodes: StartHereSeries.com Also, here's a link to the entire series on a Spotify playlist.
Transcript
This is the Everyday AI Show, the Everyday Podcast where we simplify AI and bring its power to your fingertips.
Listen daily for practical advice to boost your career, business, and everyday life.
Meet Firefly AI Assistant, now live in Adobe Firefly, the all-in-one creative AI studio.
Just describe what you want to create and the assistant handles the rest,
orchestrating multi-step workflows across Photoshop, Premiere, Express, and more in one conversational interface.
You direct the outcome.
The assistant accelerates execution.
Why are AI models so biased?
It's pretty bad, right?
Like whether we're talking about ChatGPT or AI image generators
or just about any large language model,
we see so many biases and prejudices come out in these models.
And today we're going to talk about maybe how to understand and fix biased AI.
I'm really excited for today's conversation. It's going to be a good one. It's actually a first for
Everyday AI, which I'm excited about. So, hey, welcome to Everyday AI. My name is Jordan Wilson. I'm the
host. And this is your daily live stream podcast, free daily newsletter, helping everyday people
like you and me learn and leverage generative AI. So let's today learn about why AI models are biased.
But before we do, a quick reminder, if you're joining this live, thank you. If you're listening on the
podcast, make sure, as always, check your show notes.
We're always going to leave links to some other great resources where you can read more
about today's episode and listen to related episodes as well.
So before we get into today's show, let's go over, as we do every day, the AI news.
All right.
So is AI coming to your hand?
Yeah, kind of.
So startup company Humane is set to debut
the highly anticipated Humane AI Pin.
So it's a new AI powered smart device that is set to launch today for a price of $699
and a monthly subscription fee of $24.
So it is a screenless device that aims to go kind of beyond the typical smartphone
by providing a lot of AI features such as translation services and music streaming.
So the way this works is you kind of wear it on your shirt and it literally beams the
information to the palm of your hand as the display. I'm not sure if I want it because I have kind of
like fat weird hands. So I'm not sure if I want to stare at them all day. That's just me.
All right, we have chip news. Yay. We love chip news on Everyday AI. So Nvidia may be providing
chips to China according to a recent report from the Financial Times. This is newsworthy because
there were some recent restrictions on certain chip exports to China. But some leaked documents that
the Financial Times is reporting on are showing that Nvidia has developed three new chips aimed
at growing demand for the AI technology in China, but also at the same time complying with these
new U.S. export controls. So kind of making both parties happy. All right, last but not least,
in our AI news for today, animated films are about to get a lot cheaper thanks to AI.
So recently, the DreamWorks co-founder, Jeffrey Katzenberg, said that AI will help cut the cost of animated films by 90%.
So be prepared for a slew of hopefully higher quality and even more animated films coming your way,
especially as there's these ongoing disputes and feuds with the actors unions and the screenwriting guilds and all these things.
So probably a lot more animated movies coming our way.
And hey, I'm a grown-ish adult, and I'd say bring it on.
I mean, like, have you guys seen Coco?
You know, thanks, AI.
Hopefully we'll see a lot more of that.
All right, we didn't come here to talk about animated films.
Actually, probably the exact opposite.
We came here to talk about how to understand and fix biased AI.
And I'm extremely excited to bring on our guest for today.
But before I do, if you're tuning in, why do you think AI is biased?
Get your questions in now.
It's not often we have someone that can talk about biases in AI.
So I want you all who are tuning in live to get your questions answered.
So make sure you get them in.
And also at the same time now, help me welcome to the show.
Let's bring him in.
There we go.
We have Nick Schmidt, who is the founder and CTO of Solas AI.
Nick, thank you for joining the show.
Thank you very much, Jordan.
I'm looking forward to the conversation.
Oh, it's going to be a good one.
I'm excited.
So real quick, tell us just a little bit about what Solas AI does.
So what Solas AI does, the idea is that algorithms can be biased.
They can cause discrimination and unfairness.
And we want to find out if that's happening and then fix it.
And so what the software does is that it goes into an algorithm and it says, is there evidence of discrimination?
And that can be defined in a number of ways, but is there evidence of a problem?
If there's not, that's great.
But if there is, it goes on to the next step of trying to break open the black box of the algorithm
and understand what's driving the model's predictive quality, as well as what's driving
discrimination.
And from there, people can start to make decisions about whether to include certain things
in the model or how to mitigate
the problem. Once you have that idea, then you can start building alternative models that are
actually less discriminatory. And that's the big part of it, is that we ultimately have software
that's designed to search for and find fairer AI. Yeah. And I'll say this. There's probably no
shortage of work for you all because AI models are, in my personal experience, extremely biased. So
I mean, let's start with maybe the why.
Why are these models that we use?
And without getting too technical, right?
So most generative AI systems that we use,
whether they're ChatGPT or Google Bard or Midjourney,
they're all trained, right?
So they're all trained.
So why are these models even biased in the first place?
So unfortunately, there are so many reasons for this.
And it can really happen at any point in the modeling process.
It can happen with the data that you're putting into it.
That can reflect historical or present discrimination.
It can be around the modeling process, where you are building a model that looks at whether or not you're likely to default on a loan.
And the data set you've got has discrimination built into the outcomes.
People who got loans were discriminated against.
And so once you
take that algorithm and actually go out and put it into production, it can anticipate that someone
is going to be discriminated against and discriminate against them in anticipation of that.
And then finally, there's actually using the algorithms. There's ways that you can use them
that are discriminatory, that using, building an algorithm for one purpose and then using it for
another is potentially highly discriminatory.
And there was one example that I think is really good, of an algorithm that predicted
healthcare costs.
And it's totally reasonable for an insurance company or someone else to want to understand
what someone is likely to spend over the next year or two years or whatever it is.
That way they can have their financial models be appropriate.
But some brilliant people
figured out that, hey, we could use this algorithm that predicts future costs to predict health
outcomes. And the problem is, is that after the Tuskegee experiment that was up to the 1970s,
where African-American men were intentionally injected with syphilis by doctors in the United
States, when that came out in the public, African-American visits to primary care physicians
dropped by 26%.
What that means is that,
and that trend has continued, by the way,
and what that means is that healthcare spending
for African Americans is much lower than that
of whites or non-Blacks.
And so if you're using prior spending
to predict future healthcare outcomes,
you're ultimately going to be really underestimating
how sick black people are.
You're going to say that they're not nearly as sick as they are and not give them the treatment they need.
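The cost-as-proxy problem Nick describes can be illustrated with a small simulation. This is a minimal sketch on made-up data, not the algorithm from the example: the group labels, the `access_factor` of 0.6, the spending formula, and the 20% flagging cutoff are all assumptions chosen for illustration. Both groups are given the same distribution of true health need, but one group turns need into spending at a lower rate.

```python
import random


def simulate_patients(n_per_group=5000, access_factor=0.6, seed=0):
    """Hypothetical data: both groups have the same distribution of true need,
    but group B converts need into spending at a lower rate (fewer visits)."""
    rng = random.Random(seed)
    patients = []
    for group, factor in (("A", 1.0), ("B", access_factor)):
        for _ in range(n_per_group):
            need = rng.uniform(0, 10)  # true health need, not seen by the model
            spending = need * factor * 1000 + rng.gauss(0, 500)  # observed cost
            patients.append((group, need, spending))
    return patients


def flagged_share(patients, top_fraction=0.2):
    """Flag the highest spenders for extra care; return the flagged share per group."""
    n_flagged = int(len(patients) * top_fraction)
    flagged = sorted(patients, key=lambda p: p[2], reverse=True)[:n_flagged]
    shares = {}
    for group in ("A", "B"):
        group_size = sum(1 for p in patients if p[0] == group)
        shares[group] = sum(1 for p in flagged if p[0] == group) / group_size
    return shares


shares = flagged_share(simulate_patients())
print(shares)  # group B is flagged far less often despite identical need
```

Because spending stands in for sickness, the group that spends less per unit of need is flagged much less often, even though the simulated need is identical across groups.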
You know, it's, that's a great example that you bring up because it does seem like, you know,
and I've even kind of seen this firsthand on more, you know, widely available models, but it seems
like there is always a discrimination against certain populations of people. Like as an example,
if you ask for, you know, every single week we put out a recap of AI news and we use AI image
generators. So if we're asking for a tech CEO and it seems like it always, almost always gives you
a, you know, mid-40s white guy, right? So maybe why is there always this, this maybe bias or trends that
keep showing up in different models against certain populations?
It's, you know, in that example, it's a lot of the problem is the input data.
I mean, if you look at tech CEOs in the U.S. right now, there's a lot of white men in their 40s, most likely.
And so when you train the data or train the model on that data, that's what you're going to get out.
And what that means is if you want to change those results, and the important thing about that is, is that by changing
those results, you potentially change the future. You can make a more equitable future.
If you want to change those, you have to really understand what the algorithm is doing and make
interventions in it that will make a fairer output.
Yeah, it's a good, it's a good point. I mean, how then can AI models find that right
balance, right? Of, you know, kind of maybe showcasing cultural norms, but at the same time,
being inclusive, equitable, and truly showing diversity that exists. You know, and maybe not just
thinking of image generators, right? I know that's a very small use case. But even in, you know,
I think if you're using a large language model to write content and you have it write you a story,
I think you'll probably see a lot of those same trends in the storyline. So how do you find,
or how can models find that balance?
So there's actually a legal framework already that I think is really good for this.
In the conversation, particularly the academic conversation, around fair AI, people are very binary about it.
It's either you shouldn't have to do anything, the data is the data, the model's the model, just deal with it.
Or it's you have to make everything completely equitable and fair.
And while that would be nice, it's oftentimes not realistic.
And there's a legal framework within the US that's actually quite good for defining the boundaries.
And the idea of it, the background of it, is that, and it's called the three-step burden-shifting process.
And the idea of it is you start and you say, is the algorithm causing discrimination?
And if it's intentional discrimination, then you have to do
something about it. But if it's unintentional discrimination, then you move on to a next step,
which is, does the model have a valid business justification? If you are an artist using generative
AI, you probably have, you may have a valid justification. If I'm trying to write a memo, I'm not a
very good writer, so I have a valid justification for using generative AI to make my memos actually
look good.
In a credit model, for example, you know, which is perhaps a more realistic one,
it's like a bank has a justification for building a good credit model.
And if they're just giving out loans randomly, they're going to lose money or go bankrupt.
So that's the second stage.
And then the third stage, and this really gets into what you were talking about,
is the company has the responsibility to see if it can generate a fairer model
that still meets their business needs.
And so this breaks away from that dichotomy of we shouldn't have to do anything to we have to
completely throw out the algorithm or make everything entirely equitable to just can we do better
and how much can we do better?
And I think that that framework is really what people should try to apply.
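The third step, looking for a fairer model that still meets the business need, can be sketched as a simple selection over candidate models. This is an illustration under stated assumptions, not the legal test and not Solas AI's software: the `Candidate` class, the use of an approval-rate ratio as the fairness measure, accuracy as the stand-in for business need, and the `max_accuracy_loss` tolerance are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A hypothetical candidate model summarized by a few numbers."""
    name: str
    accuracy: float                 # stand-in for "meets the business need"
    approval_rate_protected: float  # approval rate for the protected group
    approval_rate_reference: float  # approval rate for the reference group

    @property
    def impact_ratio(self) -> float:
        # Ratio of approval rates; 1.0 would mean parity between groups.
        return self.approval_rate_protected / self.approval_rate_reference


def less_discriminatory_alternative(baseline, candidates, max_accuracy_loss=0.01):
    """Among candidates that stay within the accuracy tolerance, return the one
    with the highest impact ratio, or the baseline if none improves on it."""
    viable = [c for c in candidates
              if c.accuracy >= baseline.accuracy - max_accuracy_loss]
    return max([baseline] + viable, key=lambda c: c.impact_ratio)


baseline = Candidate("baseline", 0.80, 0.30, 0.50)
alternatives = [
    Candidate("alt_a", 0.795, 0.40, 0.50),  # slightly less accurate, fairer
    Candidate("alt_b", 0.70, 0.49, 0.50),   # fairest, but fails the business need
]
chosen = less_discriminatory_alternative(baseline, alternatives)
print(chosen.name, round(chosen.impact_ratio, 2))
```

The point of the sketch is the shape of the question: not "is the model perfectly fair" but "is there a viable model that does better than the one in use".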
Yeah, that's some great tactical advice,
for sure. So quick question here. And I love this, Maybrit. Thank you for your question. And if you do
have other questions for Nick, make sure to get them in. So Maybrit asking, I wonder if it only has
bias because AI uses essentially everything from the internet, right? And those sources have bias.
And AI has a bias by making assumptions. Super curious how it works. So is that maybe why it is,
because these models are trained on essentially the entire existence of the internet and more,
right? Is that why these biases play out in the end, because the internet is an ugly biased place?
It is definitely part of it and is probably the main driver of it.
You could even take a step back and say it's not so much the internet's fault as a piece of technology,
but it's our fault as human beings for what we put on the internet.
And I think that actually is a really important point is that we live in biased societies.
So regardless of where you're getting the data from,
those biases are likely to get put into the data.
That's absolutely, so that's probably the primary source of bias,
but it's not the whole story.
You can also have bias creep in because of model design,
not using the right model, not using a model that's equally accurate on multiple groups.
You can also have bias come in through the usage of the model,
like the example I gave earlier.
So while you should focus
on the data going in, because that's probably the main place,
you really have to look through the entire pipeline
to know where the problem's coming from.
You know, a great, let's, hey, while we're at it,
Dr. Muthana here with a comment, kind of a question as well.
So saying he's looking forward to learning the best strategies
and practices to keep AI bias and risk to a bare minimum.
Because this is a great point because, you know, kind of what you all are working on, Nick,
is you're working on big picture, right?
But for individual users, I mean, obviously it depends on the model that we use
or the large language model or whatever.
But what are some maybe best practices for users or can users, you know, end users,
individuals do anything to help avoid this bias?
Yeah, absolutely.
And I think that the first step is actually asking yourself if your model is fair for everyone.
And what that means is the data that goes into it, is it reasonable data?
Is it predictive of the outcome?
Are you using month of birth year to predict whether or not you're going to default on a loan?
People do things like that all the time.
They say, oh, the data is predictive, I'll put it in.
But that doesn't make sense.
And so asking yourself, are the data that are going into it okay, or as good as you can get?
Who are you excluding from the model?
And are you putting in the data that's sort of collectively punishing groups of people?
So, for example, zip code data, because there's so much segregation in the United States,
If you put information about where someone lives in a model, you're going to get biased results.
So asking those questions is the first step.
And building a good model is the first step.
Once you've got an idea that your model is fair for everyone, then you can start to think about
fairness for a given group.
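One way to check whether a feature such as zip code is standing in for group membership is to see how well the group can be guessed from that feature alone. The sketch below is a rough check on toy data, not a method described in the episode: the function names, the two made-up zip codes, and the group labels "X" and "Y" are assumptions for illustration.

```python
from collections import Counter, defaultdict


def proxy_strength(records):
    """Share of people whose group is guessed correctly from the feature alone,
    by predicting the most common group for each feature value."""
    by_value = defaultdict(Counter)
    for value, group in records:
        by_value[value][group] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in by_value.values())
    return correct / len(records)


def base_rate(records):
    """Accuracy of always guessing the overall most common group."""
    groups = Counter(group for _, group in records)
    return groups.most_common(1)[0][1] / len(records)


# Toy, highly segregated data: (zip code, group) pairs.
records = (
    [("zip_1", "X")] * 90 + [("zip_1", "Y")] * 10
    + [("zip_2", "X")] * 10 + [("zip_2", "Y")] * 90
)
print(proxy_strength(records), base_rate(records))
```

When the first number is well above the second, the feature carries a lot of information about group membership, so a model that uses it can treat groups differently even if the group itself is never an input.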
Yeah.
And I think that's a good point because there's obviously different use cases for AI.
And, you know, bias in AI goes back many decades, right?
Like, like AI is not new, I think, but for the average person, generative AI is very new, right?
The accessibility, the affordability, and the use cases are all kind of all swelling up together
at the same point.
But maybe even to ask that a little follow up on that more specifically, let's just say for
the average person, right, if they go into ChatGPT, if they go into Midjourney, you know,
they don't necessarily have any control over how the model was trained.
But are there any best practices or strategies to avoid bias for just, you know, the everyday person using the most popular kind of large language models?
Is there anything we can do to avoid that?
Or do we have to literally spell it out in the world's longest prompt?
Like, hey, I do not want ABC through a hundred.
So I think in the use case of, you know, I'm going in and trying to write a memo that's not offensive and not, you know, evil, or get other kinds of information.
The thing you have to rely on, and that a lot of machine learning researchers and AI people don't realize, is that you're still smarter than your algorithm.
You're still smarter and you know more and you generalize better.
And so if you are getting stuff out of a model, getting stuff out of generative AI, use a
critical eye and ask yourself, you know what the world looks like.
You know a lot about bias.
I mean, we live through this every day.
We see this.
If you're a critical thinker, you probably know that.
And so that's, you know, in a simple use case like just someone using ChatGPT,
it's really about not just trusting the output.
And when you're getting to the question of actually building the model or deploying the model,
that's where you can start to really get into kind of the quantitative solutions.
Adobe just introduced an entirely new way to create,
bringing the power and precision of its creative suite into one conversational experience.
Meet Firefly AI Assistant, now live in the Adobe Firefly app, the all-in-one creative AI studio.
Powered by Adobe's Creative Agent, Firefly AI Assistant lets you start with your vision,
just describe what you want, and shape the outcome as it takes form with the Assistant.
The Assistant orchestrates multi-step workflows, drawing on 60-plus pro-grade tools across Adobe Creative Cloud apps,
including Photoshop, Illustrator, Premiere, Lightroom, Express, and more to help bring your ideas
to life. You can also get started with creative skills, a growing library of pre-built workflows
for common creative tasks like batch editing photos, creating mood boards, portrait retouching, and
creating social variations. Every step the assistant takes is visible so you can refine,
redirect, or take over at any time. You stay in the driver's seat as the creative director.
Adobe Firefly AI Assistant, now in public beta. See it today at firefly.adobe.com.
You know, I'm curious, Nick.
And if you're just joining us midway through, thank you, welcome.
We have Nick Schmidt, the founder and CTO from Solas AI talking biases and models.
My question is like maybe we kind of skip to the end, but we also talk about the trend of where things are going because it seems like model development, large language models, generative AI is being developed at a breakneck pace, right?
Like even early on, there's, you know, thousands of, you know, CEOs and very recognizable people that signed a letter and everyone said pause AI development for six months.
And instead, we tripled down, right?
You know, big news. OpenAI just released their newest update to their model, GPT-4 Turbo.
It's supposed to be, you know, better, faster, bigger, better at decision making, all these things.
So if the big companies are just developing faster and faster, and I mean, I'm sure they're doing things to address their own model biases, but how can you actually fix this, right?
When the companies probably have so much pressure and feel they just have to get out more products, bigger models, more parameters, like how can you, you know, possibly fix these when it's just, everyone's just sprinting with their head down?
Well, to make a shameless marketing plug,
buy our licensed Solas AI software.
More seriously, I mean, that's true,
but more seriously, what I think is really important
is that we've designed software,
and we're certainly not the only company out there
that has designed software that can be used quickly
to mitigate these risks.
And so I think that,
and it certainly was true in the past, mitigating these risks was very difficult and
time-consuming. We're actually using AI to fix AI, and it's much faster as a result. And so
you can still innovate and make fairer models. And I would say that you can actually make
better models if you're innovating and considering fairness.
So that was actually on my mind, Nick, because so I'm glad you brought that up, right?
It's kind of ironic, I guess, that you're using AI to fix AI.
I guess my question is, why aren't the big companies doing this?
Would it slow their process down?
Would it slow their development down?
You know, why aren't they, you know, maybe creating a bigger internal focus on fixing this problem before it ever goes out in the wild?
Because once it does go out, it takes companies like yours many months or multiple years to really start to even make a dent, right?
Because everyone's using these models.
So maybe why aren't these big companies just fixing the problem internally first?
So I think that they're trying, often. But there's also a big split between companies that have been doing something for a very long time and ones that are just getting started. So actually in the financial sector, the big banks have been working on this, and they've been working with us, in my consulting role, for 25 years trying to make sure that algorithms are fairer.
And so there really is already an established practice within financial services and to some degree
employment.
The tech companies kind of came in and said, we're smarter than you guys.
We'll figure it out on our own.
And it's kind of worked out how it's worked out.
And I think that what's going to happen is there's going to be a lot of regulation that is going to actually
move companies more and more in the direction of what financial services companies are doing.
And so, yes, this is a problem. And why isn't it being done? Why isn't anything being done about it?
Well, there's a little bit, but they don't realize that there's already a good solution out there.
And I think that'll change. Yeah. So another great question from our audience.
And thank you everyone for getting these questions in. So Monica asking, can model output
change over time with users asking for different or more diverse outputs?
What a great question. Nick, is that a thing? Can the outputs change?
So it depends on, if I understand the question correctly, it depends on the type of model.
Most of the work I do is in financial services and healthcare and other industries where
the models are not dynamic. So they're not continually updating. You build a model,
put it into production for a while, and then you rebuild it later on.
And that kind of breaks that process of reinforced feedback on an immediate scale.
So in the work that I do, no, it generally doesn't happen.
But with some of the large language models, my understanding is that there is a lot of
reinforcement and back and forth that happens.
Yeah, that's a great point.
And also, you know, not to put Nick on the spot on all these things, but, you know,
I think most of us use the ChatGPTs of the world and the Anthropic Claudes and, you know,
Midjourney.
And that's not necessarily where you focus all of your time because you're working on helping
model fairness across the board and not just solely large language models.
But it's a good question that Monica brings up, you know, because I'm sure users en masse
are having that conversation with ChatGPT or with Google Bard, saying, like,
no, this is biased. Please reflect a more diverse output. But I believe over time that it would help
them train their models to make them more diverse and less biased.
Yeah, although there's actually a different way to frame the question that Monica has that
I think is important and really good and also shows the real benefit of moving to
algorithmic decisioning, which is that when you start modeling, you're usually starting by
modeling a human process. So when companies first start using underwriting models, they typically will
model what their human underwriters have done. And if there are biases in the decisions that
human underwriters make, then those biases immediately get transplanted into the algorithm. But what can happen
is over time, because the algorithm is not intrinsically biased, it can see that those biases are not
predictive. You know, it's giving a loan to an African American and they're repaying it, even though
it, you know, it assigned a higher probability of default to them or whatever. Well, so over time,
what can happen is that those biases can actually be pulled out of the system. And so in that way,
over time, the use of an algorithm can make things fair.
So, Nick, we've talked about a lot.
We've talked a little bit about how models work.
We've talked about some ongoing issues with biases and stereotypes that show up over and over
in these models.
So maybe as we wrap up today's episode, let's just go big picture here.
Like if you had to say in a very, you know, direct way, how can we fix biased AI?
Because we've attacked it from all different angles.
But what's the takeaway?
Aside from, yeah, like, you know, we can use Solas or products like this.
But aside from that, how can we fix the biased AI?
Make a start on it.
Don't get overwhelmed with all of the options.
Think about what your model is doing and make a decision and test it, see whether or not it's biased,
and then choose a metric and go with it, and then fix it, try to fix it.
And that's a first step.
And once you have that process in place, you can start refining it. Don't let the inability to completely
define the problem keep you from doing something about the problem.
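As one way to "make a start", the test-and-pick-a-metric step can be as small as computing a couple of per-group rates on a model's decisions. The sketch below is an assumption-laden example rather than a recommended standard: the two metrics (selection rate and true positive rate), the `rates_by_group` helper, and the toy rows are all chosen for illustration, and it assumes every group has at least one truly qualified member.

```python
def rates_by_group(rows):
    """rows are (group, predicted_positive, actually_positive) tuples.
    Returns each group's selection rate and true positive rate."""
    stats = {}
    for group in sorted({r[0] for r in rows}):
        members = [r for r in rows if r[0] == group]
        qualified = [r for r in members if r[2]]  # assumed non-empty per group
        stats[group] = {
            "selection_rate": sum(r[1] for r in members) / len(members),
            "true_positive_rate": sum(r[1] for r in qualified) / len(qualified),
        }
    return stats


# Toy decisions: both groups have 6 qualified people out of 10.
rows = (
    [("A", True, True)] * 5 + [("A", False, True)] * 1
    + [("A", True, False)] * 1 + [("A", False, False)] * 3
    + [("B", True, True)] * 3 + [("B", False, True)] * 3
    + [("B", False, False)] * 4
)
stats = rates_by_group(rows)
for group, values in stats.items():
    print(group, values)
```

Different metrics can tell different stories about the same model, which is why picking one, measuring it, and then refining is a workable first step.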
Such great advice. And I think that, hey, even if, you know, our listeners and our viewers,
even if you're not working on a model, I think that what we talked about today is so important
because it does take users. It takes consumers, the people paying, you know, to continue
to have this conversation, to let, you know, the big tech companies know, yes, this is a
problem. Yes, this is something that we care about. We care about fairness in our models.
And thank you, Nick, so much for coming on the show and helping us dive into this issue.
We very much appreciate your time. Thank you very much, Jordan. And hey, as a reminder,
we did cover a lot, as we always do. So make sure you go to YourEverydayAI.com.
Sign up for the free daily newsletter. We're going to have a lot more information from Nick and what we
talked about today as well as just about everything else that's always going on in the
generative AI world and more. So make sure you do that and make sure you join us tomorrow and every day
for more Everyday AI. Thanks y'all.
Meet Firefly AI Assistant, now live in Adobe Firefly,
the all-in-one creative AI studio. Just describe what you want to create in your own words and the assistant
handles the rest, orchestrating multi-step workflows across Adobe Creative Cloud apps,
including Photoshop, Premiere, Express, and more in one conversational interface. You direct the outcome
while the assistant accelerates execution.
Stay in control with the ability to step in and refine at any time.
See it today at firefly.adobe.com.
And that's a wrap for today's edition of Everyday AI.
Thanks for joining us.
If you enjoyed this episode, please subscribe and leave us a rating.
It helps keep us going.
For a little more AI magic, visit YourEverydayAI.com
and sign up to our daily newsletter so you don't get left behind.
Go break some barriers and we'll see you next time.
