Lex Fridman Podcast - Michael Kearns: Algorithmic Fairness, Bias, Privacy, and Ethics in Machine Learning
Episode Date: November 19, 2019
Michael Kearns is a professor at the University of Pennsylvania and a co-author of the new book Ethical Algorithm that is the focus of much of our conversation, including algorithmic fairness, bias, privacy, and ethics in general. But that is just one of many fields that Michael is a world-class researcher in, some of which we touch on quickly, including learning theory or theoretical foundations of machine learning, game theory, algorithmic trading, quantitative finance, computational social science, and more. This conversation is part of the Artificial Intelligence podcast. If you would like to get more information about this podcast go to https://lexfridman.com/ai or connect with @lexfridman on Twitter, LinkedIn, Facebook, Medium, or YouTube where you can watch the video versions of these conversations. If you enjoy the podcast, please rate it 5 stars on Apple Podcasts or support it on Patreon. This episode is sponsored by the Pessimists Archive podcast. Here's the outline with timestamps for this episode (on some players you can click on the timestamp to jump to that point in the episode):
00:00 - Introduction
02:45 - Influence from literature and journalism
07:39 - Are most people good?
13:05 - Ethical algorithm
24:28 - Algorithmic fairness of groups vs individuals
33:36 - Fairness tradeoffs
46:29 - Facebook, social networks, and algorithmic ethics
58:04 - Machine learning
59:19 - Algorithm that determines what is fair
1:01:25 - Computer scientists should think about ethics
1:05:59 - Algorithmic privacy
1:11:50 - Differential privacy
1:19:10 - Privacy by misinformation
1:22:31 - Privacy of data in society
1:27:49 - Game theory
1:29:40 - Nash equilibrium
1:30:35 - Machine learning and game theory
1:34:52 - Mutual assured destruction
1:36:56 - Algorithmic trading
1:44:09 - Pivotal moment in graduate school
Transcript
The following is a conversation with Michael Kearns. He's a professor at the University of Pennsylvania
and a co-author of the new book Ethical Algorithm that is the focus of much of this conversation.
It includes algorithmic fairness, bias, privacy, and ethics in general, but that is just one of
many fields that Michael is a world-class researcher in, some of which we touch on quickly, including
learning theory or the theoretical foundation of machine learning, game theory, quantitative
finance, computational social science and much more.
But on a personal note, when I was in undergrad, early on, I worked with Michael on an algorithmic
trading project and competition that he led.
That's when I first fell in love with algorithmic game theory.
While most of my research life has been in machine learning and human-robot interaction,
the systematic way that game theory reveals the beautiful structure in our competitive and
cooperating world of humans has been a continued inspiration to me.
So for that and other things, I'm deeply thankful to Michael and really
enjoyed having this conversation again in person after so many years. This is the Artificial
Intelligence Podcast. If you enjoy it, subscribe on YouTube, give it 5 stars on Apple Podcast,
support it on Patreon, or simply connect with me on Twitter, at Lex Fridman, spelled F-R-I-D-M-A-N.
This episode is supported by an amazing podcast called Pessimists Archive.
Jason, the host of the show, reached out to me, looking to support this podcast, and so
I listened to it, to check it out.
And by listen, I mean I went through it, Netflix binge style, at least five episodes in
a row.
It's now one of my favorite podcasts and I think it should be one of the top podcasts
in the world, frankly.
It's a history show about why people resist new things.
Each episode looks at a moment in history when something new was introduced, something
that today we think of as commonplace, like recorded music, umbrellas, bicycles, cars, chess, coffee,
the elevator, and the show explores why it freaked everyone out.
The latest episode on Mirrors and Vanity still stays with me as I think about Vanity in
the modern day of the Twitter world.
That's the fascinating thing about the show, is that stuff that happened long ago, especially
in terms of our fear of new things, repeats itself in the modern day, and so has many lessons for us to think
about in terms of human psychology and the role of technology in our society.
Anyway, you should subscribe and listen to Pessimists Archive.
I highly recommend it.
And now, here's my conversation with Michael Kearns. You mentioned reading Fear and Loathing in Las Vegas in high school and having a bit
more of a literary mind.
So what books, non-technical, non-computer science, would you say had the biggest impact
on your life, either intellectually or emotionally?
You've dug deep into my history, I see.
Went deep.
Yeah, I think my favorite novel is Infinite Jest by David
Foster Wallace, which actually coincidentally, much of it
takes place in the halls of buildings right around us here at MIT.
So that certainly had a big influence on me.
And as you noticed, when I was in high school,
I actually even started college as an English major.
So I was very influenced by that genre of journalism at the time and thought I wanted
to be a writer and then realize that an English major teaches you to read, but it doesn't
teach you how to write.
And then I became interested in math and computer science instead.
Well, in your new book, Ethical Algorithm, you kind of sneak up from an algorithmic perspective on these deep, profound, philosophical
questions of fairness, of privacy, and thinking about these topics.
How often do you return to that literary mind that you had?
Yeah, I'd like to claim there was a deeper connection, but I think both Aaron and I kind of came
at these topics first and foremost from a technical angle.
I mean, I'm kind of considered myself primarily and originally a machine learning researcher,
and I think as we just watched like the rest of the society, the field technically advance,
and then quickly on the heels of that kind of the buzz kill of all of the
anti-social behavior by algorithms just kind of realized there was an opportunity for us to do
something about it from a research perspective. You know, I more to the point in your question,
I mean, I do have an uncle who is literally a moral philosopher. And so in the early days of our
technical work on fairness topics, I would occasionally,
you know, run ideas by him.
So, I mean, I remember an early email I sent to him in which I said, like, oh, you know,
here's a specific definition of algorithmic fairness that we think is some sort of variant
of Rawlsian fairness.
What do you think?
And I thought I was asking a yes or no question, and I got back a kind of classical
philosopher's response.
Well, it depends.
If you look at it this way, then you might conclude this.
And that's when I realized that there was a real kind of rift
between the ways philosophers and others
had thought about things like fairness,
from sort of a humanitarian perspective,
and the way that you needed to think about it
as a computer scientist,
if you were going to kind of implement actual algorithmic solutions.
But I would say the algorithmic solutions take care of some of the low-hanging fruit.
Sort of, the problem is a lot of algorithms, when they don't consider fairness,
they are just terribly unfair.
And when they don't consider privacy, they terribly violate privacy. So the algorithmic
approach fixes the big problems.
But there is still, when you start pushing into the gray area, that's when you start getting
into this philosophy of what it means to be fair, starting from
Plato.
What is justice, kind of questions.
Yeah, I think that's right.
And I mean, I wouldn't even go as far as you and say that sort of the algorithmic
work in these areas is solving like the biggest problems.
And we discuss in the book the fact that really we are, there's a sense in which we're kind of looking where the light is in that, you know, for example, if police are
racist in who they decide to stop and frisk, and that goes into the data, there's sort
of no undoing that downstream by kind of clever algorithmic methods.
And I think especially in fairness, I mean, I think less so in privacy where we
feel like the community kind of really has settled on the right definition, which is differential
privacy. If you just look at the algorithmic fairness literature already, you can see it's
going to be much more of a mess. And you know, you've got these theorems saying here are
three entirely reasonable, desirable notions of fairness.
And, you know, here's a proof that you cannot simultaneously have all three of them.
So I think we know that algorithmic fairness compared to algorithmic privacy is going to be kind of a harder problem.
And it will have to revisit, I think, things that have been thought about by, you know, many generations of scholars before us.
So it's very early days for fairness, I think.
So before we get into the details of differential privacy and the fairness
side, let me linger on the philosophy a bit.
Do you think most people are fundamentally good or do most of us have both the
capacity for good and evil within us?
I mean, I'm an optimist. I tend to think that most people are good and want to do right,
and that deviations from that are, you know, kind of usually due to circumstance,
not due to people being bad at heart.
What about people with power: people at the heads of governments, people at the heads of companies,
people at the heads of, maybe, financial markets?
Do you think the distribution there is also, most people are good and have good intent?
Yeah, I do. I mean, my statement wasn't qualified to people not in positions of power.
I mean, I think there's a lot to the cliche about absolute power corrupting absolutely.
I mean, I think even short of that, having spent a lot of time on Wall Street and also in arenas
very, very different from Wall Street, like academia, one of the things I think I've benefited from by moving between two very different worlds
is you become aware that these worlds kind of develop their own social norms, and they
develop their own rationales for behavior, for instance, that might look unusual to outsiders,
but when you're in that world, it doesn't feel unusual at all.
And I think this is true of a lot of professional cultures,
for instance.
And so then, maybe slippery slope is too strong of a word,
but you're in some world where you're mainly around other people
with the same kind of viewpoints and training and world view as you.
And I think that's more of a source of, you know, kind of abuses of power than sort of, you know,
there being good people and evil people and that somehow the evil people are the ones that
somehow rise to power. That's really interesting. So it's that, within the social norms constructed by that particular group of people, you're all trying to do good.
But because, as a group, you might drift into something that, for the broader population, does not align with the values of society. That's the word.
Yeah, I mean, or not that you drift, but even that things that don't make sense to the outside world don't seem unusual to you.
So it's not sort of like a good or a bad thing, but, you know, like so, for instance, you
know, in the world of finance, right?
There are a lot of complicated types of activity where, if you are not immersed in that world, you
cannot see the purpose of that, you know, that activity, at all.
It just seems, you know, completely useless, people just pushing money around.
When you're in that world and you learn more, your view does become more nuanced.
You realize, okay, there is actually a function to this activity.
In some cases, you would conclude that actually, if magically we could eradicate
this activity tomorrow, it would come back because it actually is like serving some useful
purpose.
It's just a useful purpose
that's very difficult for outsiders to see.
And so I think, you know, lots of professional work environments or cultures, as I might
put it, kind of have these social norms
that don't make sense to the outside world.
Academia is the same, right?
I mean, lots of people look at academia
and say, you know, what the hell are
all of you people doing?
Why are you paid so much, in some cases
at taxpayer expense, to, you know,
publish papers that nobody reads?
But when you're in that world,
you come to see the value for it.
But even though you might not be able to explain it to, you know, the person on the street.
In the case of the financial sector, tools like credit might not make sense to people.
It's a good example of something that does seem to pop up and be useful, or just the power of markets and capitalism in general. Yeah, in finance, I think the primary example I would give is leverage.
So being allowed to borrow, to sort of use 10 times as much money as you actually have.
So that's an example of something that before I had any experience in financial markets,
I might have looked at and said, well, what is the purpose of that?
That just seems very dangerous, and it has proven dangerous.
But if the fact of the matter is that on some particular time
scale, you are holding positions that are very unlikely
to lose their value, your value at risk or variance is like one or five percent, then it kind of
makes sense that you would be allowed to use a little bit more than you have because
you have, you know, some confidence that you're not going to lose it all in a single
day. Now, of course, when that happens, we've seen what happens, you know, not too long ago.
But, you know, but the idea that it serves no useful economic purpose under any circumstances is definitely
not true.
We'll return to the other side of the coast, Silicon Valley, and the problems there as
we talk about privacy as we talk about fairness.
At the high level, and I'll ask some sort of basic questions with the hope to get at the
fundamental nature of reality.
But from a very high level, what is an ethical algorithm?
So I can say that an algorithm has a running time of, using big-O notation, n log n.
I can say that a machine learning algorithm classified cat versus dog with 97% accuracy.
Do you think there will one day be a way to measure
sort of in the same compelling way as the big-O notation,
that this algorithm is 97% ethical?
First of all, let me riff for a second
on your specific n log n example.
So because early in the book, when we're just kind of trying to describe algorithms
period, we say like, okay, you know, what's an example of an algorithm or an algorithmic
problem? First of all, like, it's sorting, right? You have a bunch of index cards with numbers
on them and you want to sort them. And we describe, you know, an algorithm that sweeps all the
way through, finds the smallest
number, puts it at the front, then sweeps through, again, finds the second smallest number.
So we make the point that this is an algorithm, and it's also a bad algorithm in the sense that
it's quadratic rather than n log n, which we know is optimal for sorting. And we make
the point that's sort of like, you know, so even within the confines of a very precisely specified
problem, there might be many, many different algorithms
for the same problem with different properties.
Some might be faster in terms of running time.
Some might use less memory.
Some might have better distributed implementations.
And so the point is that already we're used to, you know, in computer science, thinking
about trade-offs between different types of quantities and resources, and there being,
you know, better and worse algorithms.
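To make the sorting example concrete for readers following along, here is a minimal Python sketch (my illustration, not code from the book): the "sweep for the smallest card" procedure is selection sort, which takes on the order of n squared comparisons, versus the n log n of an optimal comparison sort like Python's built-in sorted().

```python
# Minimal sketch of the sorting example above (an illustration, not code from
# the book): repeatedly sweep the unsorted cards for the smallest one and move
# it to the front. This is selection sort, roughly n^2 comparisons, versus the
# n log n of an optimal comparison sort such as Python's built-in sorted().

def selection_sort(cards):
    """Sort a list by repeatedly finding the smallest remaining element."""
    cards = list(cards)                      # work on a copy
    for i in range(len(cards)):
        smallest = i
        for j in range(i + 1, len(cards)):   # sweep the unsorted suffix
            if cards[j] < cards[smallest]:
                smallest = j
        cards[i], cards[smallest] = cards[smallest], cards[i]
    return cards

if __name__ == "__main__":
    deck = [42, 7, 19, 3, 88, 1]
    assert selection_sort(deck) == sorted(deck)   # same answer, different cost
```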
And our book is about that part of algorithmic ethics that we know how to kind of put on
that same kind of quantitative footing right now.
So just to say something that our book is not about, our book is not about kind of broad,
fuzzy notions of fairness.
It's about very specific notions of fairness.
There's more than one of them.
There are tensions between them, right?
But if you pick one of them, you can do something akin to saying that this algorithm is 97% ethical.
You can say, for instance, for this lending model, the false rejection rate on black people and white people is within
3%.
Right.
So we might call that a 97% ethical algorithm, and a 100% ethical algorithm would mean
that that difference is 0%.
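To make that arithmetic concrete, here is a small illustrative sketch (mine, not from the book; the toy data and function names are made up) that audits a lending model by comparing false rejection rates across two groups and reporting the gap:

```python
# Illustrative sketch (mine, not from the book): audit a lending model by
# computing per-group false rejection rates and the gap between them.
# A false rejection = a creditworthy applicant (label 1) predicted as 0.

def false_rejection_rate(labels, predictions):
    creditworthy = [(y, p) for y, p in zip(labels, predictions) if y == 1]
    if not creditworthy:
        return 0.0
    return sum(1 for _, p in creditworthy if p == 0) / len(creditworthy)

def fairness_audit(labels, predictions, groups):
    """Return per-group false rejection rates and the largest gap between them."""
    rates = {}
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        rates[g] = false_rejection_rate([labels[i] for i in idx],
                                        [predictions[i] for i in idx])
    return rates, max(rates.values()) - min(rates.values())

# Toy example. In the framing above, a gap of 0.03 would be the "97% ethical"
# case and a gap of 0 the "100% ethical" case.
labels      = [1, 1, 1, 1, 1, 1, 0, 0]
predictions = [1, 0, 1, 1, 1, 1, 0, 1]
groups      = ["A", "A", "A", "B", "B", "B", "A", "B"]
print(fairness_audit(labels, predictions, groups))
```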
In that case, fairness is specified when two groups, however they're defined, are given
to you.
That's right.
And then you can start mathematically
describing the algorithm.
But nevertheless, the part where the two groups are given to you,
unlike running time,
we don't in computer science talk about how fast an algorithm feels when
it runs.
True.
We measure it. And ethical starts getting into feelings.
So, for example, an algorithm runs, you know, if it runs in the background, it doesn't
disturb the performance of my system.
It'll feel nice.
I'll be okay with it.
But if it overloads the system, it'll feel unpleasant.
So, in that same way, ethics, there's a feeling of how socially acceptable it is.
How does it represent the moral standards of our society today?
So in that sense, and sorry to linger on that first of a high-level philosophical question,
do you have a sense we'll be able to measure how ethical an algorithm is? First of all, I certainly didn't mean to give the impression that you can kind
of measure, you know, memory, speed, trade-offs, you know, and that there's a complete, you
know, mapping from that onto kind of fairness, for instance, or ethics and accuracy, for
example.
I mean, the type of fairness definitions that are largely the objects of study today
and starting to be deployed, you as the user of the definitions, you need to make some hard
decisions before you even get to the point of designing fair algorithms. One of them, for instance,
is deciding who it is that you're worried about protecting, who you're worried about being
harmed by, for instance, some notion of discrimination or unfairness.
And then you need to also decide what constitutes harm.
So for instance, in a lending application, maybe you decide that, you know, falsely rejecting
a creditworthy individual, you know, sort of a false negative,
is the real harm and that false positives,
i.e. people that are not credit worthy
or are not gonna repay your loan,
who get a loan, you might think of them as lucky.
And so that's not a harm, although it's not clear
that if you don't have the means to repay a loan
that being given a loan is not also a harm.
So the literature is so far quite limited in that you need to say, who do you want to
protect and what would constitute harm to that group?
When you ask questions like, will algorithms feel ethical?
One way in which they won't, under the definitions that I'm describing, is if, you know, you are an individual who is falsely denied a loan,
incorrectly denied a loan, all of these definitions basically say, like, well, you know, your compensation is the knowledge that
we are also falsely denying loans to other people, you know, in other groups at the same rate that we're doing it to you. And so there is actually this interesting,
even technical tension in the field right now
between these sort of group notions of fairness,
and notions of fairness that might actually feel
like real fairness to individuals, right?
They might really feel like their particular interests
are being protected or thought about
by the algorithm rather than just, you know, the groups that they happen to be members of.
Are there parallels to the big-O notation of worst-case analysis? So, is it important to
look at the worst violation of fairness for an individual, to minimize that for one individual? So, like, worst-case analysis? Is that something you think about?
I mean, I think we're not even at the point where we can sensibly think about that. So, first of all,
you know, we're talking here both about fairness applied at the group level, which is a relatively
weak thing, but it's better than nothing.
And also, the more ambitious thing of trying to give some individual promises.
But even that doesn't incorporate, I think, something that you're hinting at here, which is what
you might call subjective fairness.
So a lot of the definitions, I mean, all of the definitions in the algorithmic fairness
literature are what I would kind of call received wisdom definitions.
It's sort of, you know, somebody like me sits around and thinks like, okay, you know, I think here's a technical definition of fairness
that I think people should want or that they should, you know, think of as some notion of fairness, maybe not the only one,
maybe not the best one, maybe not the last one. But we really actually don't know from a subjective standpoint
what people really think is fair.
We just started doing a little bit of work in our group
at actually doing human subject experiments
in which we ask people about, we ask them questions about fairness, we survey
them, we show them pairs of individuals in, let's say, a criminal recidivism prediction setting,
and we ask them, do you think these two individuals should be treated the same as a matter of
fairness?
And to my knowledge, there's not a large literature in which ordinary
people are asked about, you know, they have sort of notions of their subjective fairness
elicited from them. It's mainly, you know, kind of scholars who think about fairness,
you know, kind of making up their own definitions. And I think this needs to change actually for many social norms, not just for fairness, right?
So there's a lot of discussion these days in the AI community
about interpretable AI or understandable AI.
And as far as I can tell, everybody
agrees that deep learning or at least the outputs of deep learning
are not very understandable.
And people might agree that sparse linear models with integer coefficients are more understandable.
But nobody's really asked people, you know, there's very little literature on, you know,
sort of showing people models and asking them to understand what the model is doing.
And I think that in all these topics
as these fields mature, we need to start doing more
behavioral work.
Yeah, which is, so one of my deep passions is psychology,
and I always thought computer scientists
will be the best future psychologists.
In a sense, that data is, especially in this modern world, the data is a really powerful
way to understand and study human behavior.
And you've explored that with your theory side of work as well.
Yeah.
I'd like to think that what you say is true about computer scientists and psychology. From
my own limited wandering into human subject experiments, we have a great deal to learn.
Not just computer science, but AI and machine learning more specifically.
I kind of think of as imperialist research communities, in that, you know, kind of like
physicists in an earlier generation, computer scientists kind of don't think of any scientific
topic as off limits to them.
They will like freely wander into areas that others have been thinking about for decades or longer.
And, you know, we usually tend to embarrass ourselves in those efforts for some amount of time. Like, you know, I think reinforcement learning is a good example, right?
So a lot of the early work in reinforcement learning, I have complete sympathy for the control theorist that looked at this and said,
like, okay, you are reinventing stuff that we've known since the 40s, right?
But in my view, eventually,
computer scientists have made significant contributions to that field,
even though we kind of embarrassed ourselves for the first decade.
So I think if computer scientists are going to start engaging in kind of psychology, human subjects,
type of research, we should expect to be embarrassing ourselves for a good 10 years or so,
and then hope that it turns out as well as some other areas that we've waded into.
So you kind of mentioned this,
just to linger on the idea of an ethical algorithm,
the idea of groups, sort of group thinking,
individual thinking, and how we're struggling with that,
one of the amazing things about algorithms
and your book and just this field of study is that
converting these ideas into algorithms
is forcing us to ask questions of ourselves,
as a human civilization.
So there's a lot of people now in public discourse
doing sort of group thinking, thinking like there's
particular sets of groups that we don't want to discriminate against and so on.
And then there are individuals, sort of the individual life stories, the struggles
they went through and so on. Now, like in philosophy, it's easier to do group thinking
because you don't, you know, it's very hard to think about individuals. There's so much
variability. But with data, you can start to actually say, you know, maybe group thinking
is too crude; you're actually doing more discrimination by thinking in terms of groups than individuals. Can you linger on that kind of idea of group versus
individual in ethics, and is it good to continue thinking in terms of groups in algorithms?
So, let me start by answering a very good high-level question with a slightly narrow technical
response, which is these group definitions of fairness.
Like here is a few groups, like different racial groups, maybe gender groups, maybe age,
what have you.
And let's make sure that, you know, for none of these groups do we, you know, have a false
negative rate which is much higher than
any other one of these groups, okay? So these are kind of classic group aggregate notions of fairness.
And, you know, but at the end of the day, an individual you can think of as a combination of all
their attributes, right? They're a member of a racial group. They have a gender,
they have an age, you know, and many other, you know, demographic properties that
are not biological, but that, you know, are still, you know, very strong determinants of
outcome and personality and the like. So one, I think, useful spectrum is to sort of
think about that array between the group and the specific individual, and to realize that in some ways,
asking for fairness at the individual level
is to sort of ask for group fairness simultaneously
for all possible combinations of groups.
So in particular, if I build
a predictive model that meets some definition of fairness
by race, by gender,
by age, by what have you, marginally, to get slightly technical, sort of independently,
I shouldn't expect that model to not discriminate against disabled Hispanic women over age
55, making less than $50,000 annually, even though I might have protected
each one of those attributes marginally.
So, the optimization, actually, that's a fascinating way to put it.
So one way to achieve fairness for individuals is to
just add more and more definitions of groups that each individual belongs to.
So, you know, at the end of the day, we could think of all of ourselves as groups of size one,
because eventually there's some attribute that separates you from me and everybody, from everybody else in the world.
Okay. And so, it is possible to put, you know, these incredibly coarse ways of thinking about fairness and these very, very
individualistic, specific ways on a common scale.
And one of the things we've worked on from a research perspective is, so we sort of know
how to, in relative terms, we know how to provide fairness guarantees at the coarser
end of the scale.
We don't know how to provide kind of sensible, tractable, realistic fairness guarantees
at the individual
level.
But maybe we could start creeping towards that by dealing with more refined subgroups.
I mean, we gave a name to this phenomenon where you protect, you enforce some definition
of fairness for a bunch of marginal attributes or features.
But then you find yourself discriminating against a combination of them.
We call that fairness gerrymandering,
because like political gerrymandering,
you're giving some guarantee at the aggregate level,
but when you look in a more granular way at what's going on,
you realize that you're achieving that aggregate guarantee
by favoring some groups and discriminating against other ones.
And so there are, you know, it's early days, but there are algorithmic approaches that
let you start creeping towards that, you know, individual end of the spectrum.
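A rough sketch of what such an audit could look like (my illustration; the actual fairness gerrymandering algorithms Kearns and Roth describe are more sophisticated and scale to many more features): enumerate combinations of protected attributes and find the intersection whose false negative rate deviates most from the overall rate.

```python
# Rough illustration (mine; the real fairness-gerrymandering algorithms are
# more sophisticated): audit a model not only on marginal groups but on
# intersections of protected attributes, and report the worst-off subgroup.

from itertools import combinations

def false_negative_rate(rows):
    positives = [r for r in rows if r["label"] == 1]
    if not positives:
        return None
    return sum(1 for r in positives if r["pred"] == 0) / len(positives)

def subgroup_audit(rows, attrs):
    """Find the attribute combination whose false negative rate deviates most
    from the overall false negative rate."""
    overall = false_negative_rate(rows)
    worst_gap, worst_group = 0.0, None
    for k in range(1, len(attrs) + 1):
        for combo in combinations(attrs, k):
            for values in {tuple(r[a] for a in combo) for r in rows}:
                members = [r for r in rows
                           if tuple(r[a] for a in combo) == values]
                fnr = false_negative_rate(members)
                if fnr is None:
                    continue
                if abs(fnr - overall) > worst_gap:
                    worst_gap = abs(fnr - overall)
                    worst_group = dict(zip(combo, values))
    return worst_gap, worst_group

# Tiny example: each marginal group looks similar, but one intersection is hit hardest.
rows = [
    {"race": "A", "gender": "F", "label": 1, "pred": 0},
    {"race": "A", "gender": "M", "label": 1, "pred": 1},
    {"race": "B", "gender": "F", "label": 1, "pred": 1},
    {"race": "B", "gender": "M", "label": 1, "pred": 1},
]
print(subgroup_audit(rows, ["race", "gender"]))
```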
Does there need to be human input in the form of weighing the importance of each kind of group?
So for example, take gender, say, crudely speaking male and female, and then
different races. Are we as humans supposed to put value on saying gender is 0.6 and race is 0.4 in
terms of a big optimization of achieving fairness?
Is that kind of what humans are supposed to do?
I mean, of course, I don't need to tell you that, of course, technically, one could incorporate
such weights if you
wanted to into a definition of fairness.
You know, fairness is an interesting topic in that, having worked on, and the book being
about, both fairness, privacy, and many other social norms,
fairness, of course, is a much, much more loaded topic.
So privacy, I mean, people want privacy, people don't like violations of privacy, violations
of privacy, cause damage, angst, and bad publicity for the companies that are victims of them.
But sort of everybody agrees, more data privacy would be better than less data privacy. You don't have these debates with privacy; somehow the discussions of fairness become politicized along
dimensions like race and gender, and you quickly find yourself
kind of revisiting topics that have been unresolved forever,
like affirmative action.
So, why are you protecting, some people will say,
why are you protecting this particular racial group?
And others will say, well, we need to do that
as a matter of retribution.
Other people will say it's a matter of economic opportunity, and
I don't know, you know, whether any of these are the right answers. But fairness
is sort of special in that as soon as you start talking about it you inevitably have to participate
in debates about fair to whom at what expense to who else. I mean, even in criminal justice, right, um,
you know, where people talk about fairness in criminal sentencing or, um, you know,
predicting failures to appear or making parole decisions or the like, they will, you know,
they'll point out that these definitions of fairness are all about fairness for the criminals,
and what about fairness for the victims, right?
So when I basically say something like, well,
the false incarceration rate for black people and white people needs to be roughly the same,
you know, there's no mention of potential victims of criminals
in such a fairness definition.
And that's the realm of public discourse.
I should actually recommend, to people listening, I just listened to
Intelligence Squared debates, US edition; they just had a debate.
They have this structure, Oxford style or whatever they're called,
debates, two versus two, and they talked about affirmative action. It is incredibly interesting
that there's really good points on every side of this issue, which is fascinating to listen to.
Yeah, yeah, I agree. And so it's interesting to be a researcher trying to do for the most part
technical algorithmic work, but Aaron and I both quickly learned you cannot do that and then go out
and talk about it and expect people to take it seriously if you're unwilling to engage
in these broader debates that are entirely extra algorithmic, right? They're not about algorithms and making
algorithms better. As you said, what should society be protecting in the first place?
When you discuss fairness, an algorithm that achieves fairness, whether in the constraints
or the objective function, there's an immediate kind of analysis you can perform,
which is saying, if you care about fairness in gender, this is the amount that you have to pay in terms of the performance of the system.
Like, is there a role for statements like that in a table and a paper, or do you want to really not touch that?
Like, we want to touch that and we do touch it.
So, I mean, just to make sure I'm not promising
your viewers more than we know how to provide.
But if you pick a definition of fairness,
like I'm worried about gender discrimination
and you pick a notion of harm,
like false rejection for a loan, for example,
and you give me a model.
I can definitely, first of all, go audit that model.
It's easy for me to go from data to say, like, OK,
your false rejection rate on women is this much higher
than it is on men.
But once you also put the fairness into your objective function,
I mean, I think the table that you're talking about is what we would call the Pareto curve.
You can literally trace out, and we give examples of such plots on real data sets in the
book.
You have two axes.
On the x-axis is your error.
On the y-axis is unfairness, by whatever measure, you know, if it's like the disparity between false rejection
rates between two groups.
And you know, your algorithm now has a knob that basically says, how strongly do I want
to enforce fairness?
And, you know, if the two axes are error and unfairness, we'd
like to be at zero zero.
We'd like zero error and zero unfairness simultaneously.
Anybody who works in machine learning knows that you're generally not going to get to zero error
period without any fairness constraint whatsoever, so that's not going to happen. But in general,
you'll get this, you'll get some kind of convex curve that specifies the numerical trade-off you face. If I want to go from
17% error down to 16% error, what will be the increase in unfairness that I experience
as a result of that? And so this curve kind of specifies the kind of undominated models.
Models that are off that curve can be strictly improved
in one or both dimensions.
You can either make the error better or the unfairness
better or both.
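Here is a toy sketch of how such a curve can be traced (my own illustration with synthetic data, not the book's code or datasets): a per-group score threshold classifier, with a knob lambda that penalizes the gap in false rejection rates; sweeping lambda produces the (error, unfairness) points along the trade-off.

```python
# Toy sketch (mine, with synthetic data; not the book's code): trace an
# error/unfairness trade-off by sweeping a "fairness knob" lambda that
# penalizes the gap in false rejection rates between two groups.

import random

random.seed(0)

# Synthetic (score, label, group) triples; group B's scores are shifted down,
# so a purely accuracy-driven choice of thresholds treats the groups unequally.
data = [(random.gauss(1.0 if y else 0.0, 1.0) - (0.5 if g == "B" else 0.0), y, g)
        for y in (0, 1) for g in ("A", "B") for _ in range(100)]

def evaluate(thresholds):
    """Return (error, unfairness) for per-group decision thresholds."""
    errors, rejected, positives = 0, {"A": 0, "B": 0}, {"A": 0, "B": 0}
    for score, y, g in data:
        pred = 1 if score >= thresholds[g] else 0
        errors += (pred != y)
        if y == 1:
            positives[g] += 1
            rejected[g] += (pred == 0)
    unfairness = abs(rejected["A"] / positives["A"] - rejected["B"] / positives["B"])
    return errors / len(data), unfairness

grid = [t / 5 for t in range(-10, 16)]          # candidate thresholds
for lam in (0.0, 0.5, 1.0, 2.0, 5.0):           # the fairness knob
    (err, unf), _ = min(((evaluate({"A": ta, "B": tb}), (ta, tb))
                         for ta in grid for tb in grid),
                        key=lambda x: x[0][0] + lam * x[0][1])
    print(f"lambda={lam}: error={err:.3f}  unfairness={unf:.3f}")
```

The undominated (error, unfairness) points produced by such a sweep are exactly the kind of Pareto frontier described above.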
And I think our view is that not only are these objects,
these Pareto curves, or efficient frontiers,
as you might call them.
Not only are they valuable scientific objects,
I actually think that they, in the near term,
might need to be the interface
between researchers working in the field
and stakeholders in given problems.
So, you know, you could really imagine telling a criminal jurisdiction look, if you're concerned
about racial fairness, but you're also concerned about accuracy.
You want to, you know, you want to release on parole people that are not going to recommit
a violent crime, and you don't want to release the ones who are.
So, you know, that's accuracy.
But if you also care about those, you know, the mistakes you make not being disproportionately
on one racial group or another, you can show this curve.
I'm hoping that in the near future, it'll be possible to explain these curves to non-technical
people, the ones that have to make the decision,
where do we want to be on this curve?
Like, what are the relative merits or value of having lower error versus lower unfairness?
You know, that's not something computer scientists should be deciding for society, right?
That, you know, the people in the field, so to speak, the policy makers, the regulators,
that's who should be making these decisions.
But I think and hope that they can be made to understand that these trade-offs generally
exist and that you need to pick a point and like, and ignoring the trade-off, you know,
you're implicitly picking a point anyway.
Right.
You just don't know it and you're not admitting it.
Just to linger on the point of trade-offs,
I think that's a really important thing
to sort of think about.
So you think when we start to optimize for fairness,
there's almost always, in most systems, going to be trade-offs.
Can you, like, what's the trade-off between?
Just to clarify, there have been some sort of
technical terms thrown around, but in a sort of perfectly fair world, why would somebody
be upset about that? The specific trade-off I talked about, just in order to make things very concrete, was between
numerical error and some numerical measure of unfairness.
What is numerical error in the case of...
Just like, say, predictive error.
The probability or frequency with which you release somebody on parole who then goes
on to recommit a violent crime, or keep incarcerated somebody who
would not have recommitted a violent crime.
So in the case of awarding somebody a parole, or giving somebody parole, or letting them
out on parole, you don't want them to recommit a crime.
So your system failed in prediction if they happen to commit a crime.
Okay, so that's the performance, that's one axis.
Right. And what's the fairness axis?
And so then the fairness axis might be the difference between racial groups in the kind of false positive predictions,
namely people that I kept incarcerated, predicting that they would recommit a violent crime when in fact they wouldn't have, right. And the unfairness of that, just to linger on it and
allow me to try to
eloquently describe why that's unfair, why the unfairness is there:
the unfairness you want to get rid of is that in the judge's
mind, the bias of having been brought up in society, the slight racial bias, the racism
that exists in the society, you want to remove that from the system. Another way that's
been debated is sort of equality of opportunity versus equality of outcome, and there's a
weird dance there that's really difficult to get right, and it's what affirmative action is
exploring, that space.
Right, and then this also quickly, you know,
bleeds into questions like, well,
Maybe if one group really does recommit crimes
at a higher rate, the reason for that is that
at some earlier point in the pipeline
or earlier in their lives,
they didn't receive the same resources
that the other group did.
And so, you know, there's always in kind of fairness
discussions, the possibility that the real injustice came
earlier, right?
Earlier in this individual's life, earlier in this group's history, et cetera, et cetera.
And so a lot of the fairness discussions, almost the goal is for it to be a corrective mechanism
to account for the injustice earlier in life.
By some definitions of fairness or some theories of fairness, yeah.
Others would say, like, look, it's, you know, it's not to correct that injustice,
it's just to kind of level the playing field right now and not, I mean,
incarcerate, falsely incarcerate, more people of one group than another group.
But I mean, I think it might be helpful just to demystify a little bit
the many ways in which bias or unfairness
can come into algorithms, especially in the machine learning era.
I think many of your viewers have probably heard these examples before, but let's say I'm
building a face recognition system.
So, I'm gathering lots of images of faces and trying to train the system to recognize
new faces of those individuals from a training set of faces of those individuals.
And it shouldn't surprise anybody, or certainly not anybody in the field of machine learning,
if my training data set was primarily white males, and I'm
training the model to maximize the overall accuracy on my training data set, the model
can reduce its error most by getting things right on the white males that constitute the
majority of the data set, even if that means that on other groups, they will be less accurate.
Okay. Now, there's a bunch of ways you could think about addressing this. One is to deliberately
put into the objective of the algorithm not to optimize the error at the expense of this
discrimination, and then you're kind of back in the land of these two
dimensional numerical trade-offs.
A valid counter-argument is to say, well, no, you don't have to.
There's no, you know, the notion of the tension
between fairness and accuracy here is a false one.
You could instead just go out and get much more data
on these other groups that are in the minority
and equalize your data set, or you could train a separate model on those subgroups and
have multiple models.
The point I think we would try to make in the book is that those things have cost too,
going out and gathering more data on groups that are relatively rare,
compared to your plurality or majority group, that it may not cost you in the accuracy of
the model, but it's going to cost the company developing this model more money to develop
that. And it also costs more money to build separate predictive models and to implement
and deploy them. So even if you can find a way to avoid the tension between fairness and accuracy in training
the model, you might push the cost somewhere else, like money, like development time, research
time and the like.
There are fundamentally difficult philosophical questions in fairness.
And we live in a very divisive political climate,
outraged culture.
There are alt-right folks on 4chan, trolls.
There are social justice warriors on Twitter.
There are very divisive, outraged folks on all sides of every kind of system. How do you, how do we, as engineers, build ethical algorithms in such a divisive culture?
Do you think they could be disjoint?
The human has to inject their values, and then you could optimize over those values.
But in our times, when you start actually applying these systems, things
get a little bit challenging for the public discourse.
How do you think we can proceed?
Yeah, I mean, for the most part, in the book, you know, a point that we try to take some
pains to make is that we don't view ourselves or people like us as being in the position of deciding for society,
what the right social norms are, what the right definitions of fairness are.
Our main point is to just show that if society or the relevant stakeholders in a particular domain
can come to agreement on those sorts of things, there's a way of encoding that into algorithms
in many
cases, not in all cases.
One other misconception that hopefully we definitely dispel is sometimes people read
the title of the book and I think not unnaturally fear that what we're suggesting is that the
algorithms themselves should decide what those social norms are and develop their own notions
of fairness and privacy or ethics.
And we're definitely not suggesting that.
The title of the book is ethical algorithm, by the way, and I didn't think of that interpretation
of the title.
That's interesting.
Yeah.
I mean, especially these days where people are concerned about the robots becoming our
overlords, the idea that the robots would also sort of develop their own social norms is
just one step away from that.
But I do think, you know, obviously, despite the disclaimer that people like us shouldn't
be making those decisions for society, we are kind of living in a world where, in many
ways, computer scientists have made some decisions that have fundamentally changed the nature
of our society and democracy and sort of civil discourse and
deliberation in ways that I think most people generally feel are bad these days, right?
So.
But they had to make, so if we look at people at the heads of companies and so on, they
had to make those decisions, right?
There has to be decisions.
So there's two options.
Either you kind of put your head in the sand and don't think
about these things and just let the algorithm do what it does, or you make decisions about what
you value, you know, injecting more values into the algorithm. Look, I don't, I never mean to be an
apologist for the tech industry, but I think it's, it's a little bit too far to sort of say that explicit decisions
were made about these things.
So let's, for instance, take social media platforms.
So like many inventions in technology and computer science, a lot of these platforms that we
now use regularly kind of started as curiosities.
I remember when things like Facebook came out and its predecessors, like Friendster,
which nobody even remembers now,
people really wonder like,
why would anybody want to spend time doing that?
I mean, even the web when it first came out,
when it wasn't populated with much content
and it was largely hobbyists building their own
kind of branch tackle websites,
a lot of people looked at this and said, like, what is the purpose of this thing? Why is this interesting? Who would want to do this?
And so even things like Facebook and Twitter, yes, technical decisions were made by engineers,
by scientists, by executives, and the design of those platforms. But I don't think 10 years ago anyone anticipated that those platforms, for instance, might kind of
acquire undue influence on political discourse or on the outcomes of elections.
And I think the scrutiny that these companies are getting now is entirely appropriate, but I think it's a little too harsh to kind of look at history and sort of say like, oh,
you should have been able to anticipate that this would happen with your platform.
And in the sort of gaming chapter of the book, one of the points we're making is that, you
know, these platforms, right, they don't operate in isolation.
So unlike the other topics we're discussing, like fairness and privacy,
like those are really cases where algorithms
can operate on your data and make decisions about you,
and you do not even are aware of it.
Okay, things like Facebook and Twitter,
these are systems, right?
These are social systems, and their evolution,
even their technical evolution because machine learning
is involved, is driven in no small part by the behavior of the users themselves and how the users decide to adopt
them and how to use them.
And so, you know, I'm kind of like, who really knew that, you know, until we saw it happen,
who knew that these things might be able to influence the outcome of elections, who knew that they might polarize political discourse because of the ability
to decide who you interact with on the platform and also with the platform naturally using
machine learning to optimize for your own interest, that they would further isolate us from
each other and feed us all basically just the stuff that we already agreed with.
And so we've come to that outcome, I think, largely without intending to. But are there algorithmic remedies to these kinds of things?
And again, these are big problems that are not going to be solved with somebody going
in and changing a few lines of code somewhere in a social media platform.
But I do think in many ways, there are definitely ways of making things better.
I mean, like an obvious recommendation that we make at some point in the book is like, look,
to the extent that we think that machine learning applied
for personalization purposes in things like newsfeed
or other platforms has led to polarization
and intolerance of opposing viewpoints.
As you know, right, these algorithms have models, right?
And they kind of place people in some kind of metric space
and they place content in that space,
and they sort of know the extent to which I have an affinity
for a particular type of content,
and by the same token, that same model
probably gives you a good idea of the stuff I'm likely to
violently disagree with or be offended by.
So in this case, there really is some knob you could tune that says, like, instead of showing
people only what they like and what they want, let's show them some stuff that we think
that they don't like, or that's a little bit further away.
And you could even imagine users being able to control this.
Just like everybody gets a slider.
And that slider says like, how much stuff do you want
to see that you might disagree with,
or is at least further from your interest?
It's almost like an exploration button.
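As a toy illustration of that slider (my sketch; not an actual Facebook or YouTube API), suppose the platform's model embeds the user and each piece of content in a common metric space; a diversity knob can then blend "closest to my interests" with "deliberately farther away":

```python
# Toy sketch (mine; not any platform's actual API): rank content by a blend of
# affinity (close to the user's embedding) and diversity (deliberately farther
# away), controlled by a user-set knob.

import math

def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rank_feed(user_vec, items, diversity=0.0):
    """diversity = 0.0 -> pure affinity ranking (closest first);
    diversity = 1.0 -> farthest first ("show me what I might disagree with")."""
    def score(item):
        d = distance(user_vec, item["embedding"])
        return (1 - diversity) * (-d) + diversity * d
    return sorted(items, key=score, reverse=True)

user = [0.9, 0.1]
items = [
    {"title": "stuff I already agree with", "embedding": [0.85, 0.15]},
    {"title": "mildly different take",      "embedding": [0.50, 0.50]},
    {"title": "opposing viewpoint",         "embedding": [0.10, 0.90]},
]
print([i["title"] for i in rank_feed(user, items, diversity=0.0)])
print([i["title"] for i in rank_feed(user, items, diversity=1.0)])
```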
So just to get your intuition.
Do you think engagement, so like you're staying on the platform, you're staying engaged.
Do you think fairness, ideas of fairness won't emerge?
Like how bad is it to just optimize for engagement?
Do you think we'll run into big trouble if we're just optimizing for how much you love
the platform?
Well, I mean, optimizing for engagement kind of got us where we are.
So do you, one, have faith that it's possible to do better, and two, if it is, how do we do
better?
I mean, it's definitely possible to do different, right?
And again, you know, it's not as if I think that doing something different than optimizing
for engagement won't cost these companies in real ways, including revenue and profitability,
potentially.
In the short term, at least.
Yeah, in the short term, right.
And again, you know, if I worked at these companies, I'm sure that it would have seemed the most natural thing in the world also to want to optimize engagement.
That's good for users in some sense. You want them to be vested in the platform and enjoying it and finding it useful, interesting and or productive.
But my point is that the idea that it's out of their hands, as you said,
or that there's nothing to do about it,
never say never, but that strikes me as implausible
as a machine learning person.
I mean, these companies are driven by machine learning
and this optimization of engagement
is essentially driven by machine learning.
It's driven by not just machine learning,
but very, very large-scale A/B experimentation, where you
tweak some element of the user interface or tweak some component of an algorithm or tweak
some component or feature of your click-through prediction model.
My point is that anytime you know how to optimize for something, by definition, that solution tells you how
not to optimize for it or to do something different.
Engagement can be measured.
Optimizing for minimizing divisiveness or maximizing intellectual growth over the lifetime
of a human being are very difficult to measure.
That's right. So I'm not claiming that doing something different
will immediately make it apparent that this is a good thing
for society. In particular, I think one way of thinking about
where we are on some of these social media platforms is that
it kind of feels a bit like we're in a bad equilibrium,
right, that these systems are helping us all kind of optimize
something myopically and selfishly for ourselves.
And of course, from an individual standpoint at any given moment,
like why would I want to see things in my news feed that I found
irrelevant, offensive, or the like.
But maybe by all of us having these platforms myopically optimize in our interests, we have
reached a collective outcome as a society that we're unhappy with in different ways.
Let's say with respect to things like political discourse and tolerance of opposing viewpoints.
If Mark Zuckerberg gave you a call and said, I'm thinking of taking a sabbatical, could
you run Facebook for me for six months?
What would you...
I think no thanks would be my first response, but there are many aspects of being the head
of the entire company that are entirely exogenous
to many of the things that we're discussing here.
And so I don't really think I would need to be CEO of Facebook to implement the more
limited set of solutions that I might imagine.
But I think one concrete thing they could do is they could experiment with letting people who choose to
see more stuff in their newsfeed that is not entirely kind of chosen to optimize for their
particular interests, beliefs, etc. So the kind of thing, I can speak to YouTube, but I think Facebook probably does something similar,
is they're quite effective at automatically finding what sorts of groups you belong
to, not based on race or gender or so on, but based on the kind of stuff you enjoy watching
in the case of YouTube.
So it's a difficult thing for Facebook or YouTube to then say, well, you know what, we're
going to show you something from a very different cluster, even though we believe
algorithmically, you're unlikely to enjoy that thing.
So that's a weird jump to make. There has to be a human, like at the very top of that
system, that says, well, that will be long term healthy for you.
That's more than an algorithmic decision.
Or that same person could say that'll be long term healthy
for the platform.
For the platform.
For the platform's influence on society outside
of the platform, right?
And it's easy for me to sit here and say these things,
but conceptually, I do not think that these are, or should be, kind of completely alien ideas.
But you know, you could try things like this, and it wouldn't be, you know, we wouldn't
have to invent entirely new science to do it, because if we're all already embedded
in some metric space and there's a notion of distance between you and me and every other piece of content, then exactly the
same model that dictates how to make me really happy also tells you how to make
me as unhappy as possible as well.
Right.
The focus in your book, and in algorithmic fairness research today in general, is on machine
learning, which, like we said, is about data.
But, and just even the entire AI field right now is captivated with machine learning,
with deep learning.
Do you think ideas in symbolic AI or totally other kinds of approaches are interesting
or useful in this space, have some promising ideas in
terms of fairness?
I haven't thought about that question specifically in the context of fairness.
I definitely would agree with that statement in the large, right?
I mean, I am one of many machine learning researchers who do believe that the great successes
that have been shown in machine learning
recently are great successes, but they're on a pretty narrow set of tasks.
I mean, I don't think we're kind of notably closer to general artificial intelligence
now than we were when I started my career.
I mean, there's been progress.
And I do think that we are kind of as a community,
maybe looking a bit where the light is,
but the light is shining pretty bright there right now
and we're finding a lot of stuff.
So I don't want to argue with the progress
that's been made in areas like deep learning, for example.
This touches another sort of related thing
that you've mentioned and that people might misinterpret
from the title of your book, Ethical Algorithm.
Is it possible for the algorithm
to automate some of those decisions?
Sort of higher level decisions of what kind of...
Like what should be fair, or what is fair?
The more you know about a field,
the more aware you are of its limitations.
And so I'm pretty leery of sort of trying that.
There's so much we already don't know in fairness.
Even when we're the ones picking the fairness definitions and, you know, comparing alternatives and thinking about the tensions between different definitions.
That the idea of kind of letting the algorithm start exploring as well.
I definitely think, you know, this is a much narrower statement. I definitely
think that kind of algorithmic auditing for different types of unfairness, right? So like in this
gerrymandering example, where I might want to prevent not just discrimination against very
broad categories, but against combinations of broad categories, you know, you quickly get to
a point where there are a lot of categories, a lot of combinations of n features, and you can use algorithmic techniques to sort of try
to find the subgroups on which you're discriminating the most and try to fix that.
That's actually kind of the form of one of the algorithms we developed for this fairness
gerrymandering problem.
But partly because of our scientific ignorance on these topics right now, and also partly just because these topics are so loaded emotionally for people, I just don't see the value. Again, never say never, but I just don't think we're at a moment where it's a great time for computer scientists to be rolling out the idea like, hey, not only have we kind of figured fairness out, but we think the algorithms should start deciding what's fair, or giving input on that decision. The cost-benefit analysis to the field of going there right now just doesn't seem worth it to me.
That said, I should say that I think computer scientists should enrich their thinking about these kinds of things philosophically. It's too often been used as an excuse, for roboticists working on autonomous vehicles, for example, to not think about the human factors or psychology or safety, in the same way that computer science and algorithms have sort of been used as an excuse. And I think it's time for basically everybody to become a computer scientist.
I was about to agree with everything you said except that last point. I think the other way of looking at it is that computer scientists, and many of us are doing this, need to wade out into the world more, right? I mean, the influence that computer science, and therefore computer scientists, have had on society at large has just exponentially magnified in the last 10 or 20 years or so. Before, when we were just tinkering around amongst ourselves, it didn't matter that much, and there was no need for computer scientists to be citizens of the world more broadly. I think those days need to be over very, very fast. And I'm not saying everybody needs to do it, but to me, the right way of doing this is not to sort of think that everybody else is going to become a computer scientist. But I think people are becoming more sophisticated about computer science, even lay people.
I think one of the reasons we decided to write this book is we thought, 10 years ago, I wouldn't have tried this, just because I didn't think that people's awareness of algorithms and machine learning in the general population would have been high enough.
You would have had to first write one of the many books, kind of just
explicating that topic to a lay audience first.
Now I think we're at the point where lots of people without any technical training at
all know enough about algorithms, machine learning that you can start getting to these
nuances of things like ethical algorithms.
I think we agree that there needs to be much more mixing.
But I think a lot of the onus of that mixing
needs to be on the computer science community.
Yeah, so just to linger on the disagreement, because I do disagree with you on that point. I think if you're a biologist, if you're a chemist, if you're an MBA business person, all of those things, if you learn to program, and not only program, if you learn to do machine learning, if you learn to do data science, you immediately become much more powerful in the kinds of things you can do. And the same goes for literature, for library sciences. So I think what you're saying holds true for the next few years, but long term, if you're interested in philosophy, say, you should learn to program, because then you can scrape data and study what people are thinking about on Twitter, and then start making philosophical conclusions about the meaning of life.
I just feel like the access to data, the digitization of whatever problem you're trying to solve, fundamentally changes what it means to be a computer scientist. A computer scientist in 20 or 30 years will go back to being a Donald Knuth style theoretical computer scientist, and everybody else will basically be exploring the kinds of ideas that you explore in your book. It won't be a computer science problem anymore.
Yeah, I mean, I don't think we disagree that much. I think that trend of more and more people, and more and more disciplines, adopting ideas from computer science, learning how to code, that trend seems firmly underway.
I mean, you know, like an interesting
digressive question along these lines is maybe in 50 years,
there won't be computer science departments anymore
because the field will just sort of be ambient in all of the different disciplines.
And people will look back and having a computer science department will look like having an electricity department or something.
So everybody uses this, it's just out there.
I mean, I do think there will always be that kind of Knuth style core to it.
But it's not an implausible path that we kind of get to the point where
the academic discipline of computer science becomes somewhat marginalized because of its
very success in kind of infiltrating all of science and society and the humanities, etc.
What is differential privacy, or more broadly, algorithmic privacy?
Algorithmic privacy, more broadly, is just the study, or the notion, of privacy definitions or norms being encoded inside of algorithms.
And so I think we count among this body of work just the literature and practice of things like data anonymization, which, at the beginning of our discussion of privacy, we kind of say, okay, this is sort of a notion of algorithmic privacy; it tells you something to go do with data. But our view, and I think this is now quite widespread, is that despite the fact that those notions of anonymization, kind of redacting and coarsening, are the most widely adopted technical solutions for data privacy, they are deeply, fundamentally flawed.
And so, to your first question, what is differential privacy?
Differential privacy seems to be a much, much better notion of privacy that kind of avoids a lot
of the weaknesses of anonymization notions while still letting us do useful stuff with data.
What's anonymization of data? So by anonymization, I'm kind of referring to techniques like I have a database, the
rows of that database are, let's say, individual people's medical records, okay?
And I want to let people use that data, maybe I want to let researchers access that data to build predictive models for some disease, but I'm worried that that will leak, you know, sensitive information about specific
people's medical records.
So anonymization broadly refers to a set of techniques where I say, okay, first I'm going to delete the column with people's names. That would be a redaction, right? I'm just redacting that information. Then I'm going to take ages, and I'm not going to say your exact age; I'm going to say whether you're zero to 10, 10 to 20, 20 to 30. I might keep the first three digits of your zip code, but not the last two, et cetera, et cetera. And so the idea is that through some series of operations like this on the data, I anonymize it. Another term of art that's used is removing personally identifiable information. And this is basically the most common way of providing data privacy, in a way that still lets people access some version of the data.
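A minimal sketch of the redaction and coarsening operations he describes, with invented field names:

```python
# Toy redaction/coarsening of a single medical record (illustrative field
# names; real anonymization pipelines are far more involved).
def anonymize(record):
    anon = dict(record)
    anon.pop("name", None)                         # redact the name column
    decade = (record["age"] // 10) * 10
    anon["age"] = f"{decade}-{decade + 10}"        # coarsen age to a decade bucket
    anon["zip"] = record["zip"][:3] + "**"         # keep only first 3 zip digits
    return anon

print(anonymize({"name": "Jane Doe", "age": 37, "zip": "19104", "diagnosis": "flu"}))
# {'age': '30-40', 'zip': '191**', 'diagnosis': 'flu'}
```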
So, to take a slightly broader picture, as you talk about in the book, what does anonymization mean when there are multiple databases, like with the Netflix Prize, when you can start combining stuff together?
So this is exactly the problem with these notions, right? With notions of anonymization, of removing personally identifiable information, the fundamental conceptual flaw is that these definitions kind of pretend as if the data set in question is the only data set that exists in the world, or that ever will exist in the future. And of course, with things like the Netflix Prize, and many, many other examples since the Netflix Prize (I think that was one of the earliest ones, though), you can re-identify people that were anonymized in the data set by taking that anonymized data set and combining it with other allegedly anonymized data sets, and maybe publicly available information about you.
You know, for people who don't know, the Netflix Prize data was publicly released. The names from those rows were removed, but it was released with the preferences, the ratings of which movies you like and don't like. And from that, combined with other things, I think forum posts and so on, you could start to figure out the names.
In that case, it was specifically the Internet Movie Database, where lots of Netflix users publicly rate their movie preferences, and so the anonymized Netflix data could be linked back to them. It's just this phenomenon that I think we've all come to realize in the last decade or so: just knowing a few apparently irrelevant, innocuous things about you can often act as a fingerprint. Like if I know what rating you gave to these 10 movies, and the dates on which you entered those ratings, that's almost like a fingerprint for you in the sea of all Netflix users.
There was just another paper on this in Science or Nature about a month ago, making a similar point with something like 18 attributes. I mean, my favorite example of this was actually a paper from several years ago now, where it was shown that just from your likes on Facebook, just from the things on which you clicked the thumbs up button on the platform, not using any demographic information, nothing about who your friends are, just knowing the content that you liked, was enough to, in the aggregate, accurately predict things like sexual orientation, drug and alcohol use, whether you were the child of divorced parents.
So we live in this era where even the apparently irrelevant data that we offer about ourselves on public platforms and forums, often unbeknownst to us, more or less acts as a signature or fingerprint, and if you can do a join between that kind of data and allegedly anonymized data, you have real trouble.
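A hypothetical sketch of that kind of join, with invented tables and column names, just to show how a few innocuous attributes can link an "anonymized" row back to a name:

```python
# Hypothetical linkage attack: re-identify rows of an "anonymized" ratings
# table by joining it with a public profile table on a handful of
# apparently innocuous shared attributes (all names and columns invented).
import pandas as pd

anon = pd.DataFrame({
    "user_id": [101, 102],
    "movie": ["Movie A", "Movie B"],
    "rating": [5, 2],
    "rating_date": ["2006-03-01", "2006-03-02"],
})
public = pd.DataFrame({
    "name": ["Alice", "Bob"],
    "movie": ["Movie A", "Movie B"],
    "rating": [5, 2],
    "rating_date": ["2006-03-01", "2006-03-02"],
})

# A few (movie, rating, date) triples act as a near-unique fingerprint.
linked = anon.merge(public, on=["movie", "rating", "rating_date"])
print(linked[["user_id", "name"]])   # pseudonymous IDs mapped back to names
```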
So, is there hope for any kind of privacy in a world where a few likes can identify you?
So there is differential privacy, right?
So what is differential privacy?
Yeah, so differential privacy basically is an alternate, much stronger notion of privacy
than these anonymization ideas.
And it's a technical definition,
but the spirit of it is,
we compare two alternate worlds.
Okay, so let's suppose I'm a researcher, and there's a database of medical records, and one of them is yours. And I want to use that database of medical records to build a predictive model for some disease. So based on people's symptoms and test results and the like, I want to build a model predicting the probability that people have the disease.
So, you know, this is the type of scientific research
that we would like to be allowed to continue.
And in differential privacy, you ask a very particular counterfactual question.
We basically compare two alternatives.
One is when I do this, I build this model on the database of medical records, including
your medical record. And the other one is where I do the same exercise with the same database, with just your medical
record removed.
So basically, you know, it's two databases, one with N records in it, and one with N minus
one records in it.
The N minus one records are the same, and the only one that's missing in the second
case is your medical record. So differential privacy basically says that any harms that might come to you from the analysis in which your data was included are essentially nearly identical to the harms that would have come to you if the same analysis had been done without your medical record included.
So in other words, this doesn't say
that bad things cannot happen to you
as a result of data analysis.
It just says that these bad things were going to happen
to you already, even if your data wasn't included.
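For readers who want it, the standard formal statement behind this counterfactual (the textbook definition of pure differential privacy, not a quote from the book), where M is the randomized analysis, D and D' are any two databases differing in one person's record, S is any set of possible outputs, and epsilon is the privacy parameter:

```latex
% A randomized algorithm M is \epsilon-differentially private if, for all
% databases D, D' differing in one person's record and all output sets S:
\Pr[\,M(D) \in S\,] \;\le\; e^{\epsilon}\,\Pr[\,M(D') \in S\,]
```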
And to give a very concrete example, we discuss at some length the study that was done in the 50s that established the link between smoking and lung cancer. And we make the point that, well, if your data was used in that analysis, and the world knew that you were a smoker, because there was no stigma associated with smoking before those findings, real harm might have come to you as a result of that study that your data was included in. In particular, your insurer might now have a higher posterior belief that you might have lung cancer and raise your premium. So you've suffered economic damage. But the point is that if the same analysis had been done with all the other N minus 1 medical records and just yours missing, the outcome would have been the same. Your data wasn't idiosyncratically crucial to establishing the link between smoking and lung cancer, because that link is a fact about the world that can be discovered with any sufficiently large database of medical records.
But that's a case with a very low level of harm.
Yeah, so it's showing that very little additional harm comes from your inclusion.
Great, but what is the mechanism of differential privacy?
So that's the kind of beautiful statement of it.
But what's the mechanism by which privacy is preserved?
Yeah, so it's basically by adding noise to computations.
So the basic idea is that every differentially private algorithm, first of all,
or every good, differentially private algorithm, every useful one,
is a probabilistic algorithm.
So it doesn't, on a given input, if you gave the algorithm
the same input multiple times,
it would give different outputs each time,
from some distribution.
And the way you achieve differential privacy
algorithmically is by kind of carefully and tastefully
adding noise to a computation in the right places.
And to give a very concrete example,
if I want to compute the average of a set of numbers,
the non-private way of doing that
is to take those numbers and average them
and release a numerically precise value for the average.
In differential privacy, you wouldn't do that. You would first compute that average to numerical precision, and then you'd add some noise to it, right? You'd add some kind of zero-mean Gaussian or exponential noise to it, so that the actual value you output is not the exact mean. It'll be close to the mean, but the noise that you add sort of guarantees that nobody can reverse engineer any particular value that went into the average.
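A minimal sketch of that noisy-average idea using the standard Laplace mechanism, assuming each value is clipped to a known range so the sensitivity of the mean is bounded:

```python
# Differentially private mean via the Laplace mechanism (a minimal sketch).
# Assumes each value lies in [lo, hi]; the sensitivity of the mean of n
# values is then (hi - lo) / n.
import numpy as np

def private_mean(values, lo, hi, epsilon, rng=np.random.default_rng()):
    values = np.clip(np.asarray(values, dtype=float), lo, hi)
    true_mean = values.mean()
    sensitivity = (hi - lo) / len(values)
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_mean + noise   # close to the true mean, but never exact

ages = [34, 29, 41, 50, 38, 27, 45]
print(private_mean(ages, lo=0, hi=100, epsilon=0.5))  # noisy, privacy-preserving estimate
```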
So noise is a savior. How many algorithms can be aided by adding noise?
Yeah, so I'm a relatively recent member of the differential privacy community. My co-author Aaron Roth is really one of the founders of the field and has done a great deal of work, and I've learned a tremendous amount working with him on it. But I came to it when the field was already grown up.
Yeah, it's a pretty mature field.
But I must admit, the first time I saw the definition of differential privacy, my reaction was like, well, that is a clever definition, and it's really making very strong promises. I first saw the definition in the much earlier days of the field, and my first worry about it was that it's a great definition of privacy, but it'll be so restrictive that we won't really be able to use it. Like, we won't be able to compute many things in a differentially private way.
So that's one of the great successes of the field,
I think, is in showing that the opposite is true,
and that most things that we know how to compute
absent any privacy considerations
can be computed in a differentially private way.
So for example, pretty much all of statistics
in machine learning can be done differentially privately.
So pick your favorite machine learning algorithm: back propagation in neural networks, CART for decision trees, support vector machines, boosting, you name it, as well as classic hypothesis testing and the like in statistics.
None of those algorithms are differentially private in their original form.
All of them have modifications that add noise to the computation in different places in
different ways that achieve differential privacy.
So this really means that to the extent that we've become a scientific community very
dependent on the use of machine learning and statistical modeling
and data analysis, we really do have a path to kind of provide privacy guarantees to those
methods and so we can still enjoy the benefits of kind of the data science era while providing
rather robust privacy guarantees to individuals.
So, perhaps a slightly crazy question, but if we take the ideas of differential privacy and take them to the nature of truth that's being explored currently: what's your most favorite and least favorite food?
Hmm, I'm not a real foodie, but I'm a big fan of spaghetti.
Spaghetti? And what do you really not like?
I really don't like cauliflower.
Well, I love cauliflower.
But is one way to protect your preference for spaghetti to have an information campaign of bloggers and bots saying that you like cauliflower? So, the same kind of noise idea.
I mean, if you think of it in our politics today,
there's this idea of Russia hacking our elections.
What's meant there, I believe, is bots spreading
different kinds of information.
Is that a kind of privacy, or is that too much of a stretch?
No, it's not a stretch, but I've not seen those ideas applied that way. That is not a technique that, to my knowledge, will provide differential privacy.
But to give one very specific example of what you're discussing: there was a very interesting project at NYU, I think led by Helen Nissenbaum, in which they basically built a browser plugin that tried to essentially obfuscate your Google searches. So to the extent that you're worried that Google is using your searches to build predictive models about you, to decide what ads to show you, which they might very reasonably want to do, but if you object to that, they built this widget you could plug in. And basically, whenever you put a query in to Google, it would send that query to Google, but in the background, all of the time, from your browser, it would also be sending this torrent of irrelevant queries to the search engine. So it's like a wheat and chaff thing. Out of every thousand queries, let's say, that Google was receiving from your browser, one of them was one that you put in, but the other 999 were not. Okay, so it's the same kind of idea, kind of privacy by obfuscation.
So I think that's an interesting idea, but it doesn't give you differential privacy.
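A toy sketch of that wheat-and-chaff idea, not how the actual plugin works; the decoy pool and the send function are placeholders, and as he says, this gives obfuscation rather than differential privacy:

```python
# Toy "wheat and chaff" query obfuscation: send the real query along with
# many decoys so an observer can't tell which one you meant.
import random

DECOY_POOL = ["weather tomorrow", "banana bread recipe", "local news",
              "how tall is mount everest", "used bicycles"]

def send_query(q):
    print("sent:", q)   # stand-in for an actual HTTP request

def obfuscated_search(real_query, chaff_per_real=9):
    queries = [real_query] + random.choices(DECOY_POOL, k=chaff_per_real)
    random.shuffle(queries)
    for q in queries:
        send_query(q)   # the observer sees 10 queries; only one is real

obfuscated_search("mountain biking trails near philadelphia")
```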
I was also actually talking to somebody at one of the large tech companies recently about the fact that there are times when the response to my data needs to be very specific to my data. Like, I type mountain biking into Google: I want results on mountain biking, and I really want Google to know that I typed in mountain biking. I don't want noise added to that. And so I think there are maybe even interesting technical questions around notions of privacy that are appropriate where it's not that my data is part of some aggregate, like medical records, where we're trying to discover important correlations and facts about the world at large, but rather that there's a service that I really want to pay attention to my specific data, yet I still want some kind of privacy guarantee. I think these kinds of obfuscation ideas are one way of getting at that, but maybe there are others as well.
So where do you think we'll land in this algorithm-driven society in terms of privacy? Sort of like China, as Kai-Fu Lee describes it, collecting a lot of data on its citizens, but in its best form actually able to protect human rights and provide a lot of amazing services, and in its worst forms able to violate those human rights and limit services. So where do you think we'll land? Because algorithms are powerful when they use data. So as a society, do you think we'll give over more data? Is it possible to protect the privacy of that data?
So I'm optimistic about the possibility of balancing the desire for individual privacy, and individual control of privacy, with kind of societally and commercially beneficial uses of data. Not unrelated to differential privacy are suggestions that say, well, individuals should have control of their data; they should be able to limit the uses of that data. There are even fledgling discussions going on in research circles about allowing people to selectively grant the use of their data and be compensated for it.
And then you get to sort of very interesting economic
questions like pricing, right?
And one interesting idea is that maybe differential privacy would also be a conceptual framework in which you could talk about the relative value of different people's data. To demystify this a little bit:
If I'm trying to build a predictive model for some rare disease, and I'm going to use
machine learning to do it, it's easy to get negative examples because the disease is rare,
right?
But I really want to have lots of people with the disease in my data set.
But somehow those people's data with respect to this application is much more valuable to me
than just the background population.
And so maybe they should be compensated more for it.
And so I think these are very, very fledgling conceptual questions that maybe will get some technical thought sometime in the coming years. But to more directly answer your question, I'm optimistic at this point, from what I've seen, that we will land at some better compromise than we're at right now, where privacy guarantees are few and far between and weak, and users have very, very little control. I'm optimistic that we'll land on something that provides better privacy overall and more individual control of data and privacy. But I think to get there, again, just like fairness, it's not going to be enough to propose algorithmic solutions. There's going to have to be a whole kind of regulatory, legal process that prods companies and other parties to adopt solutions.
And you've mentioned the word control a lot. I think giving people control is something they don't quite have in a lot of these algorithms, and that's a really interesting idea, giving them control. Some of that is actually literally an interface design question, sort of just enabling it, because I think it's good for everybody to give users control. It's almost not a trade-off, except that you have to hire people who are good at interface design.
Yeah, I mean, the other thing that has to be said, right, is that it's a cliche, but as the users of many systems, platforms, and apps, we are the product, we are not the customer. The customers are advertisers, and our data is the product, okay? So it's one thing to suggest more individual control of data and privacy and its uses, but if this happens to a sufficient degree, it will upend the entire economic model that has supported the internet to date, and some other economic model will have to replace it.
So the idea of markets you mentioned: by exposing the economic model to the people, they would then become participants in a market?
Participants in it. And this is not a weird idea, right? Because there are markets for data already; it's just that consumers are not participants in them. There are sort of publishers and content providers on one side that have inventory, and then there are the advertisers on the other. And Google and Facebook, pretty much their entire revenue stream comes from running two-sided markets between those parties, right? And so it's not a crazy idea that there would be a three-sided market, or that on one side of the market or the other, we would have proxies representing our interests. It's not a crazy technical idea, but it would have pretty extreme economic consequences.
Speaking of markets, a lot of fascinating aspects of this world arise not from individual humans, but from the interaction of human beings. You've done a lot of work in game theory. First, can you say what game theory is and how it helps us model and study those interactions?
Game theory, of course, let's give credit where it's due, comes from the economists first and foremost. But as I've mentioned before, computer scientists never hesitate to wander into other people's turf, and so there is now this 20-year-old field called algorithmic game theory. But game theory, first and foremost, is a mathematical framework for reasoning about collective outcomes in systems of interacting individuals.
Yeah.
So you need at least two people to get started in game theory. And many people are probably familiar with the prisoner's dilemma as kind of a classic example of game theory, and a classic example where everybody looking out for their own individual interests leads to a collective outcome that's worse for everybody than what might be possible if they cooperated. But cooperation is not an equilibrium in the prisoner's dilemma.
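With one standard textbook choice of payoffs (years in prison, so lower is better; the numbers are illustrative, not from the conversation), a quick check confirms that mutual defection is the only pair of mutual best responses:

```python
# Prisoner's dilemma with standard textbook payoffs (years in prison,
# lower is better). Entry [i][j] is (row's sentence, column's sentence)
# when row plays i and column plays j; 0 = cooperate, 1 = defect.
PAYOFF = [[(1, 1), (3, 0)],
          [(0, 3), (2, 2)]]

def is_nash(i, j):
    # Each player's action must minimize their own sentence given the other's.
    row_best = all(PAYOFF[i][j][0] <= PAYOFF[k][j][0] for k in (0, 1))
    col_best = all(PAYOFF[i][j][1] <= PAYOFF[i][k][1] for k in (0, 1))
    return row_best and col_best

for i in (0, 1):
    for j in (0, 1):
        print(("C", "D")[i], ("C", "D")[j], "Nash" if is_nash(i, j) else "")
# Only (D, D) is a Nash equilibrium, even though (C, C) is better for both.
```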
And so my work and the field of algorithmic game theory
more generally in these areas kind of looks at settings
in which the number of actors is potentially extraordinarily
large.
And their incentives might be quite complicated and kind of hard to model
directly, but you still want kind of algorithmic ways of kind of predicting what will happen
or influencing what will happen in the design of platforms.
So what to you is the most beautiful idea that you've encountered in Game Theory?
There's a lot of them. I'm a big fan of the field.
I mean, technical answers to that, of course, would include Nash's work just establishing that there's a competitive equilibrium under very, very general circumstances, which in many ways put the field on a firm conceptual footing. Because if you don't have equilibria, it's kind of hard to ever reason about what might happen, since there's just no stability.
So just the idea that stability can emerge when there are multiple players.
Or not that it necessarily emerges, just that it's possible, right? The existence of an equilibrium doesn't mean that sort of natural iterative behavior will necessarily lead to it.
In the real world, yes.
Yeah, maybe answering slightly less personally than you asked the question: I think within the field of algorithmic game theory, perhaps the single most important technical contribution that's been made is the realization of the close connections between machine learning and game theory, and in particular between game theory and the branch of machine learning that's known as no-regret learning. And this provides a very general framework in which a bunch of players interacting in a game or a system, each one doing something that's in their self-interest, will actually reach an equilibrium, and reach it in a rather small number of steps.
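A sketch of that connection for a two-player zero-sum game: both players run multiplicative weights, a classic no-regret algorithm, and their time-averaged strategies approach the minimax equilibrium. The game here is rock-paper-scissors, whose equilibrium is uniform play; the step size and starting weights are arbitrary illustrative choices.

```python
# No-regret learning in a zero-sum game: both players run multiplicative
# weights; the time-averaged strategies converge toward the minimax
# equilibrium (uniform play for rock-paper-scissors).
import numpy as np

# Row player's payoff matrix for rock-paper-scissors (column player gets -A).
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]], dtype=float)

eta, T = 0.05, 5000
w_row = np.array([1.0, 2.0, 3.0])   # arbitrary non-uniform starting weights
w_col = np.array([3.0, 1.0, 2.0])
avg_row = np.zeros(3)
avg_col = np.zeros(3)

for _ in range(T):
    p = w_row / w_row.sum()          # row player's current mixed strategy
    q = w_col / w_col.sum()          # column player's current mixed strategy
    avg_row += p / T
    avg_col += q / T
    # Update weights by each pure action's payoff against the opponent's mix.
    w_row *= np.exp(eta * (A @ q))       # row wants to maximize payoff
    w_col *= np.exp(-eta * (A.T @ p))    # column wants to minimize row's payoff

print(np.round(avg_row, 3), np.round(avg_col, 3))  # both approximately [1/3, 1/3, 1/3]
```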
So you kind of mentioned acting greedily
can somehow end up pretty good for everybody.
Or pretty bad.
Or pretty bad.
It will end up stable.
Yeah.
Right.
And stability or equilibrium by itself is not necessarily either a good thing or a bad thing.
So what's the connection between machine learning and these ideas of equilibria?
I mean, I think we kind of talked about these ideas already
in kind of a non-technical way, which is maybe
the more interesting way of understanding them first,
which is, you know, we have many systems, platforms,
and apps these days that work really hard to use our data
and the data of everybody else on the platform
to selfishly optimize on behalf of each user.
So let me give the cleanest example, which is just driving apps, navigation apps like Google Maps and Waze, where, miraculously, compared to when I was growing up at least, the objective is the same: you want to drive from point A to point B and spend the least time driving, not necessarily minimize the distance, but minimize the time.
And when I was growing up, the only resources you had to do that were maps in the car, which literally just told you what roads were available, and then maybe half-hourly traffic reports, just about the major freeways, not about side roads. So you were pretty much on your own. And now we've got these apps. You pull one out and you say, I want to go from point A to point B, and in response to what everybody else is doing, if you like, what all the other players in this game are doing right now, here's the route that minimizes your driving time. So it is really computing a selfish best response for each of us, in response to what all the rest of us are doing at any given moment. And so I think it's quite fair to think of these apps as driving or nudging us all towards the competitive or Nash equilibrium of that game. Now, you might ask, well, that sounds great, why is that a bad thing? Well, it's known, both in theory and from some limited studies of actual traffic data, that all of us being in this competitive equilibrium might cause our collective driving time to be higher, maybe significantly higher, than it would be under other solutions. And then you have to talk about what those other solutions might be, and what the algorithms that implement them are, which we do discuss in the game theory chapter of the book.
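A classic textbook illustration of that gap is Pigou's example (used here for illustration, not necessarily the example in the book): one road has fixed travel time 1, the other's travel time equals the fraction of drivers on it. Selfish routing puts everyone on the variable road at average cost 1, while splitting traffic would cut the average cost to 0.75.

```python
# Pigou's example of selfish routing: road 1 has constant travel time 1,
# road 2 has travel time equal to the fraction x of drivers using it.
# At the Nash equilibrium everyone takes road 2 (cost 1); the social
# optimum splits traffic and gets average cost 0.75.
import numpy as np

def avg_cost(x):                 # x = fraction of drivers on road 2
    return x * x + (1 - x) * 1.0

xs = np.linspace(0, 1, 1001)
costs = avg_cost(xs)
print("equilibrium (x=1):", avg_cost(1.0))          # 1.0
print("optimum:", xs[costs.argmin()], costs.min())  # x=0.5, cost 0.75
```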
But similarly, on social media platforms or on Amazon, all these algorithms that are essentially trying to optimize on our behalf are driving us, in a colloquial sense, towards some kind of competitive equilibrium. And one of the most important lessons of game theory is that just because we're at equilibrium doesn't mean there's not a solution in which some, or maybe even all, of us might be better off. And then the connection to machine learning, of course, is that in all these platforms I've mentioned, the optimization that they're doing on our behalf is driven by machine learning, like predicting where the traffic will be, predicting what products I'm going to like, predicting what would make me happy in my news feed.
Now, in terms of the stability and the promise of that, I have to ask, just out of curiosity: how stable are these mechanisms that game theory, that the economists, came up with? And we all know that economists don't live in the real world, just kidding. What do you think when we look at the fact that we haven't blown ourselves up, in light of the game-theoretic concept of mutually assured destruction? What are the odds that we destroy ourselves with nuclear weapons, as one example of a stable game-theoretic system?
Just to prime your viewers a little bit, I think you're referring to the fact that game theory was taken quite seriously back in the 60s as a tool for reasoning about Soviet-US nuclear armament, disarmament, detente, things like that.
I'll be honest: as huge a fan as I am of game theory and its rich history, it still surprises me that you had people at the RAND Corporation back in those days drawing up two-by-two tables, where the row player is the US and the column player is Russia, and that this was taken seriously. I'm sure if I had been there, maybe it wouldn't have seemed as naive as it does to me now.
You know, it seems to have worked, which is why it seems naive.
Well, we're still here.
We're still here in that sense.
Yeah, and even though I kind of laugh at those efforts, they were more sensible then than they would be now, right? Because there were only two nuclear powers at the time, and you didn't have to worry about deterring new entrants and who was developing the capacity. It's definitely a game with more players now, and more potential entrants. I'm not in general somebody who advocates using simple mathematical models when the stakes are as high as things like that and the complexities are very political and social. But we are still here.
You've worn many hats, and the one that first caused me to become a big fan of your work many years ago is algorithmic trading. I have to ask a question about this, because you have so much fascinating work there. In the 21st century, what role do you think algorithms have in the space of trading, investment, and the financial sector in general?
Yeah, it's a good question.
I mean, in the time I've spent on Wall Street and in finance, I've seen a clear progression, and I think it's a progression that kind of models the use of algorithms and automation more generally in society, which is that the things that get taken over by the algos first are the things that computers are obviously better at than people, right? So first of all, there needed to be this era of automation where financial exchanges became largely electronic, which then enabled the possibility of trading becoming more algorithmic, because once exchanges are electronic, an algorithm can submit an order through an API just as well as a human can.
It can monitor things quickly, it can read all the data.
Yeah.
I think the places where algorithmic trading has had the greatest inroads, and the first inroads, were in execution problems, kind of optimized execution problems. So what I mean by that is, at a large brokerage firm, for example, one of the lines of business might be, on behalf of large institutional clients, taking what we might consider difficult trades. So it's not like a mom and pop investor saying, I want to buy 100 shares of Microsoft; it's a large hedge fund saying, I want to buy a very, very large stake in Apple, and I want to do it over the span of a day.
And it's such a large volume that if you're not clever
about how you break that trade up, not just over time,
but over perhaps multiple different electronic exchanges
that all let you trade Apple on their platform,
you will move, you'll push prices around
in a way that hurts your execution.
So, you know, this is an optimization problem. This is a control problem.
And so, machines are better. We know how to design algorithms that are better at that kind of
thing than a person is going to be able to do because we can take volumes of historical and real-time
data to kind of optimize the
schedule with which we trade.
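A minimal sketch of that scheduling idea: split a large parent order across the day's time buckets in proportion to a hypothetical historical volume profile, VWAP-style. Real optimized-execution systems are far more sophisticated and adapt to real-time prices, liquidity across venues, and market impact models.

```python
# Toy VWAP-style schedule: split a large parent order across the day's
# time buckets in proportion to a historical intraday volume profile.
import numpy as np

def vwap_schedule(total_shares, volume_profile):
    weights = np.asarray(volume_profile, dtype=float)
    weights /= weights.sum()
    child_orders = np.floor(weights * total_shares).astype(int)
    child_orders[-1] += total_shares - child_orders.sum()  # absorb rounding
    return child_orders

# Hypothetical U-shaped intraday volume profile over 7 hourly buckets.
profile = [18, 12, 9, 8, 9, 14, 30]
print(vwap_schedule(1_000_000, profile))  # shares to trade in each bucket
```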
And similarly high-frequency trading, which is closely related but not the same as optimized execution, where you're just trying to spot very, very temporary mispricings between exchanges or within an asset itself, or just predict the directional movement of a stock from the very low-level, granular buying and selling data in the exchange: machines are good at this kind of stuff.
It's kind of like the mechanics of trading.
What about longer-term prediction? Can machines do that?
Yeah, so I think we are in an era where clearly there have been some very successful quant hedge funds that are still in what we would traditionally call the stat arb regime.
What's that?
Stat arb, referring to statistical arbitrage. But for the purposes of this conversation, what it really means is making directional predictions in asset price movements or returns, where your prediction about that directional movement is good for, you have a view that it's valid for, some period of time between a few seconds and a few days. And that's the amount of time that you're going to get into the position, hold it, and then hopefully be right about the directional movement, and buy low and sell high, as the cliche goes. So that is kind of a sweet spot, I think, for quant trading and investing right now, and has been for some time. When you really get to more Warren Buffett style timescales,
right, like my cartoon of Warren Buffett
is that Warren Buffett sits and thinks
what the long-term value of Apple really should be.
And he doesn't even look at what Apple is doing today.
He just decides, I think this is what its long-term value is, and it's far from that right now, and so I'm going to buy some Apple, or short some Apple, and I'm going to sit on that for 10 or 20 years, okay?
So when you're at that kind of time scale, or even anything more than just a few days, all kinds of other sources of risk and information come into play. Now you're talking about holding things through recessions and economic cycles. Wars can break out.
So there you have to understand politics, human nature, at some level.
Yeah, and you need to be able to ingest many, many more sources of data that are on wildly different timescales, right?
So if I'm an HFT, a high-frequency trader, my main source of data is just the data from the exchanges themselves about the activity in the exchanges. Maybe I need to keep an eye on the news, because the CEO getting caught in a scandal or getting run over by a bus can cause very sudden changes. But I don't need to understand economic cycles, I don't need to understand recessions, I don't need to worry about the political situation or war breaking out in this part of the world, because all I need to know is that, as long as that's not going to happen in the next 500 milliseconds, my model is good.
When you get to these longer time scales, you really have to worry about that kind of stuff, and people in the machine learning community are starting to think about this. We jointly sponsored a workshop at Penn with the Federal Reserve Bank of Philadelphia a little more than a year ago; I think the title was something like machine learning for macroeconomic prediction, macroeconomic referring specifically to these longer time scales. And it was an interesting conference, but it left me with greater confidence that we have a long way to go.
And so I think, in the grand scheme of things, if somebody asked me, well, whose job on Wall Street is safe from the bots, I think it's the people at that longer time scale, who have the appetite for all the risks involved in long-term investing, and who really need not just algorithms that can optimize from data, but views on stuff: views on the political landscape, economic cycles, and the like. And I think they're pretty safe for a while, as far as I can tell.
So Warren Buffett's job is safe for a little while.
Yeah, I'm not seeing a robo Warren Buffett anytime soon.
That'll give him comfort.
Last question. If there's a day in your life you could relive because it made you truly happy, maybe a day outside of family...
Yeah, otherwise we'd know.
...what day would it be? A day you look back on and remember being profoundly transformed in some way, or blissful.
I'll answer a slightly different question, which is, what's a day in my life or my career that was kind of a watershed moment. I went straight from undergrad to doctoral studies, and that's not at all atypical. And I'm also from an academic family: my dad was a professor, my uncle on his side is a professor, both my grandfathers were professors.
All kinds of majors too, philosophy and so on.
Yeah, they're all over the map. I was a grad student here, just up the river at Harvard, and came to study with Les Valiant, which was a wonderful experience.
But I remember my first year of graduate school, I was generally pretty unhappy.
And I was unhappy because at Berkeley as an undergraduate, yeah, I studied a lot of math
and computer science, but it was a huge school, first of all, and I took a lot of other courses as we discussed. I started
as an English major, and took history courses, and art history classes, and had friends that did
all kinds of different things. Harvard's a much smaller institution than Berkeley, and its computer
science department, especially at that time, was a much smaller place than it is now.
And I suddenly just felt very, you know,
like I'd gone from this very big world
to this highly specialized world.
And now all of the classes I was taking
were computer science classes,
and I was only in classes with math and computer science people.
And so I thought often in that first year of grad school about whether I really wanted to stick with it or not. I thought, oh, I could stop with a master's, I could go back to the Bay Area, to California. And this was one of the early periods where you could definitely get a relatively good-paying job at one of the companies that were the big tech companies back then. And so I distinctly remember a late spring day when I was sitting in Boston Common, really just chewing over what I wanted to do with my life.
And I realized, and I'm like, okay, and I think this is where my academic background helped
me a great deal.
I sort of realized, you know, yeah, you're not having a great time right now.
This feels really narrowing,
but you know that you're here for research eventually,
and to do something original,
and to try to carve out a career
where you kind of, you know,
choose what you want to think about,
you know, and have a great deal of independence.
And so, at that point, I really didn't have any real research experience yet. I was trying to think about some problems with very little success, but I knew that I hadn't really tried to do the thing that I knew I'd come to do. And so I thought, I'm going to stick it out through the summer. And that was very formative, because I went from contemplating quitting to, a year later, it being very clear to me that I was going to finish, even though I still had a ways to go. I started doing research. It was going well. It was really interesting.
It was a complete transformation. This is that transition that I think every doctoral student makes at some point, which is to go from being a student of what's been done before to doing your own thing, and figuring out what makes you interested, and what your strengths and weaknesses are as a researcher. And once I made that decision, on that particular day, at that particular moment in Boston Common, I'm glad I made it.
And also just accepting the painful nature of that journey.
Yeah, exactly. Exactly. In that moment I said, I'm going to stick it out. I'm going to stick around for a while.
Well, Michael, I've looked up to your work for a long time. It's really nice to talk to you. Thank you so much.
It's great to be back in touch with you too, and to see how great you're doing. Thanks a lot, I appreciate it. Thank you.