Microsoft Research Podcast - 122 - Econ2: Causal machine learning, data interpretability, and online platform markets featuring Hunt Allcott and Greg Lewis
Episode Date: June 2, 2021

In the world of economics, researchers at Microsoft are examining a range of complex systems—from those that impact the technologies we use to those that inform the laws and policies we create—through the lens of a social science that goes beyond the numbers to better understand people and society. In this episode, Senior Principal Researcher Dr. Hunt Allcott speaks with Microsoft Research New England office mate and Senior Principal Researcher Dr. Greg Lewis. Together, they cover the connection between causal machine learning and economics research, the motivations of buyers and sellers on e-commerce platforms, and how ad targeting and data practices could evolve to foster a more symbiotic relationship between customers and businesses. They also discuss EconML, a Python package for estimating heterogeneous treatment effects that Lewis has worked on as part of the ALICE (Automated Learning and Intelligence for Causation and Economics) project at Microsoft Research. https://www.microsoft.com/research
Transcript
I like causal ML software because it gives researchers fewer and fewer choices.
I think one of the downfalls of academic research for many reasons is that researchers have
to make a lot of individual decisions.
And sometimes for many reasons, sort of poor training or p-hacking, or just that the problem
is very hard, they just make bad choices.
Part of what's appealing about machine learning as a general idea is that it's an algorithm.
It's these are inputs, these are outputs.
There are no choices in between.
It just does a set of pre-programmed instructions.
And it's an end-to-end process.
In the case of what economists often do in practice,
sometimes it's just ordinary least squares
and then it's an algorithm.
But even then it's often I'll run it once
and then I'll add these variables,
delete these variables,
then I'll run it again,
and I'll run it again.
Now it looks like it's significant, so maybe that's the table I want to go with in practice.
And I think that there's something very nice about having tools that say,
yeah, I decided to let this machine go and figure out what the right model was,
and I cross-validated, and that's how I picked my hyperparameters, and we're done.
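The workflow described here, letting cross-validation pick the hyperparameters instead of the researcher, can be sketched with a toy ridge regression in Python. The data and penalty grid below are invented for illustration; this is not code from the episode:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 200, 20

# Toy data: 20 candidate regressors, only 3 of which actually matter.
X = rng.normal(0, 1, (n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.0, 0.5]
y = X @ beta + rng.normal(0, 1, n)

def ridge_fit(X, y, lam):
    """Closed-form ridge regression coefficients."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# 5-fold cross-validation: the data, not the researcher, picks the penalty.
folds = np.arange(n) % 5
grid = [0.01, 0.1, 1.0, 10.0, 100.0]
cv_err = []
for lam in grid:
    fold_err = []
    for k in range(5):
        tr, te = folds != k, folds == k
        b = ridge_fit(X[tr], y[tr], lam)
        fold_err.append(np.mean((y[te] - X[te] @ b) ** 2))
    cv_err.append(np.mean(fold_err))

best = grid[int(np.argmin(cv_err))]
print(f"cross-validated penalty: {best}")
```

Once the grid and the fold scheme are fixed, there are no researcher choices left in the loop, which is the point being made above.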
Welcome to the Microsoft Research Podcast,
where you get a front row seat to
cutting-edge conversations. I'm Hunt Allcott, an economist at Microsoft Research in Cambridge,
Massachusetts. And I'll be your host today as we speak with Greg Lewis, my colleague at Microsoft
Research. Greg is an economist and an expert on two important trends in the online economy. First, with the big data revolution,
there's now much more opportunity for businesses and governments
and other groups to target interventions at people who would benefit the most.
For example, advertisers targeting ads,
health systems targeting medical treatments,
schools offering tutoring, etc.
The second key trend is the rise of online platforms, such as Amazon and Uber,
that facilitate transactions between independent buyers and sellers.
In this new business model, there are now important questions about how different online market designs
and data sharing approaches
benefit or harm sellers and consumers. In addition to being an expert on these two trends,
Greg is my next door office neighbor in our offices in Cambridge, Massachusetts.
Greg Lewis, welcome to the Microsoft Research Podcast.
Why, thank you, Hunt. I'm so glad we're not recording this in the office,
otherwise we'd be disturbing each other.
Indeed, indeed.
So before we get to economics
or the economics that you study in your current research,
I want to start with your background.
You grew up in South Africa
and you majored in economics and statistics.
And what made you want to study economics?
Well, it's actually a pretty circuitous route.
I started off studying to become an actuary.
In South Africa, smart kids do one of two things.
They either go to medical school or they become actuaries.
So I started down that road and it was quite boring, ultimately.
A lot of it is about death, about when people are going to die, and how much you should charge them for insurance. No, it's not a lot of fun. And so at some point of my, I guess, my junior year of
college, I decided that although I'd been taking economics along the way as part of my degree, I
should, in fact, change that to be my major. And I kept statistics because that was part of actuarial science. So yeah, I became an
economist because I was turned off by death. That's great. And of course, ironic because
they call us the dismal science. It is indeed. So you're coming out of your undergrad with
these skills and trying to avoid death. And you could have worked in a lot of different areas.
You could have worked for a bank or a consulting company or done any number of things. So how
did you end up in a PhD program? I think a couple of things. One was peer pressure.
A lot of my friends decided to come do PhDs in the States, computer science and physics mainly,
but that was one thing. Is that what you wrote on your application essay?
No, no, no. Nobody writes that on their application essay. Actually, I wrote on my application essay that I wanted to
work with one of your advisors, Sendhil. Anyway, but yeah, no, I think what got me
excited, I was always excited about modeling. Even when I was a kid, I used to design casino games
to use statistics to cheat my brother out of money. And so I was always that kind of like math kid
who was like just interested in modeling.
When I discovered that you could connect the real world
and large economic systems with modeling,
I thought, oh, this is the field for me.
This is great.
And I got excited about studying that at a higher level
than I had in undergrad.
So you end up at the University of Michigan.
How did you end up specifically doing what we call industrial
organization? Yeah. The more you make me think about it, it all feels so very accidental.
So I started off as a theorist. I was jazzed on theory and I started working on a paper,
which eventually got published like a decade later. But in my second year of grad school,
I worked on a paper on college admissions and a game theoretic model of college admissions. And my advisor at the time, Melona
Smith, who's a well-known game theorist, asked me to prove a theorem and I just couldn't.
And at some point I got really frustrated and I thought maybe I should do something with data
instead. And the adjacent field to theory with data is industrial organization because it uses
a lot of theory. And then sometime later I worked out that the theorem I was asked to prove was false,
but that was like three years later. And then we proved it was false, and that
eventually got published. So that was part of it. And I had really good advisors.
I got to work with Jim Levinsohn, who's a giant in industrial organization, and with Pat Bajari, who's
similarly a giant. And so that was part of why I stuck with IO. I had really good opportunities working with those folks. So then you graduate and you went on to be an
assistant professor at Harvard. And that's where your two current interests really start to
solidify. And I want to start off by talking about causal machine learning. So what is causal machine learning?
So part of it is machine learning, which is, I'd say, the use of sort of standardized algorithms
to make sense of data, and in particular, to mostly do prediction. So to figure out,
is this a cat or is this a dog, typically on the internet, and more generally to do many,
many things that are very important. So machine translation, for example, figuring out which word comes next in a sentence.
Then there's a separate sort of topic, which we're very concerned with in economics,
which is causality. If I think about a job training program, does that really end up helping
workers or is it actually not a very cost-effective way of training our labor force?
Causal machine learning is bringing those two ideas together, is to say, can we use ideas from machine learning
in terms of algorithms and making sense of data,
especially large-scale data,
along with ideas from causality
to start automating processes
by which we can learn about causal relationships and data.
And so another way that I've often thought about
causal machine learning is just that it is a
more flexible way
of estimating heterogeneous treatment effects. So when I say heterogeneous treatment effects,
of course, what I mean is in your job training example, job training programs might be especially
useful for men, or for young men, or maybe for old women, or maybe for people with two kids, but not people
with three kids. And you could come up with any number of other examples of heterogeneity in your
favorite causal inference application. And so there are these traditional set of techniques
where you might say, well, let's just try to estimate the treatment effects separately for men versus women and see what the differences are. But that requires
the researcher to, of course, come in with some prior that men and women might be different and
then kind of manually do those separate estimates. And so I guess another way that one might think
about causal ML is that it provides a set of tools that allows us to more flexibly understand the richness in the differences or heterogeneity in the treatment effects of a job training program or better schools or a medical intervention across a wide variety of people.
Is that another way of saying it in your
view? I think that's a super narrow way of saying it. So I think, yeah, I think that that comes from
a very particular kind of economic tradition of thinking about what we can do with machine
learning. I think folks in the machine learning community would say, like, let's think about,
you know, Bayesian causal networks and start asking, like, which relationships do I think
I can discover in the data automatically?
I think econ is very focused on policy.
When you come from econ, as we both did, you often immediately gravitate towards these
questions like, will this work differently for different people?
But I think that the statistical richness of machine learning is much larger, and therefore
the things you can do in the causal space are much larger.
And even that's true in my own work.
So I work on some heterogeneous treatment effect estimation, but I also work on sort of
flexible methods for instrumental variables. And there are policy optimization questions that are
robust optimization questions. There are all these questions that are kind of in this general field.
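The heterogeneous treatment effects idea discussed above can be sketched in a few lines of Python. This is a toy simulated experiment with a hand-rolled "T-learner" (separate outcome models for treated and control), not a production estimator like those in EconML, and every number below is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated experiment: x is a covariate (say, standardized age),
# t is a randomized binary treatment (say, job training).
x = rng.uniform(-1, 1, n)
t = rng.integers(0, 2, n)

# The true treatment effect is heterogeneous: tau(x) = 1 + 2x.
y = 0.5 * x + t * (1 + 2 * x) + rng.normal(0, 1, n)

def fit_line(x, y):
    """Least-squares fit of y on [1, x]."""
    A = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

# T-learner: fit separate outcome models for treated and control,
# and read off the conditional effect as the difference in predictions.
b1 = fit_line(x[t == 1], y[t == 1])
b0 = fit_line(x[t == 0], y[t == 0])

def cate(x_new):
    return (b1[0] + b1[1] * x_new) - (b0[0] + b0[1] * x_new)

print(f"estimated effect at x=-0.5: {cate(-0.5):.2f}")  # truth is 0
print(f"estimated effect at x=+0.5: {cate(0.5):.2f}")   # truth is 2
</n```

The same two-model trick works with any flexible learner in place of the straight line, which is where the machine learning comes in.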
What are the main applications of causal machine learning? I wonder if in light of our conversation here, if you can break those
down into the traditional, what I had called heterogeneous treatment effect estimation,
looking at a policy intervention and looking at impacts on different people.
Maybe you could start with one of those applications and then move into
something else that you think illustrates these broader applications you're thinking of?
Great. So one kind of major set of applications is exactly this sort of context-dependent treatment
or heterogeneous treatment effects. And the easiest example to think of there is not one
I personally worked on, but one that's sort of a natural application is in drugs. You can imagine that as we get more and
more genetic data, we're going to want to tailor the drugs we administer for different medical
conditions to the genetic makeup of the individual person. So you could run experiments to go ahead
and figure out which drugs work on average, and already we do that. That's sort of the gold
standard for causal inference. But those techniques typically won't tell you very much about which drugs work well in which circumstances, especially when you're
trying to figure it out on the basis of somebody's genetic code, which think of it as just this giant
string of variables, right? And so you have no sense. I mean, obviously, experts in the bio
community will have some priors about which parts of the genome matter,
right? But it's not super easy to figure that out. And so you'd like the computer to do the work for you. And so one sort of major set of applications is around trying to figure out
which modifiers are statistically, you know, you can confirm are statistically modifying the
treatment effect. This drug works for these people who have this
set of markers, but otherwise it doesn't work. You'll see this even in the popular press talking
about COVID. I don't know this research personally, but every so often you'll read about this and you'll
go, yeah, this seems to work for most people, but there are this rare group of people for which it
just doesn't seem to be working very well. And maybe even here's why. This is what we can point
to in the data that shows us that this is the marker of somebody who's not going to respond well to this.
So those are some applications that resonate within the definition I set up of estimating
differences in treatment effects across people where you have different characteristics of
the person.
But then you were also alluding to more highbrow or novel applications.
And can you give an example of one of those?
Yeah, sure.
So the ones that I've been thinking of are nonlinearities.
So you're trying to find causal effects,
but the effects are not just do I turn something on or off,
which is what we think of in the treatment effect paradigm often.
But what if I give you 10 versus 20 versus 30?
So you could think about dosage would be an example of this.
And you might think about nonlinear effects in dosage, and you want to understand what those
effects look like, right? So that would be a second kind of application is trying to figure
out, okay, what's the dosage curve look like? A third kind of application, which is more highbrow
and is not my field of expertise, but I think is important to think about, is actually trying to
figure out what the causal structure of a complex system is. So you now have many variables moving around, and you're trying to understand which ones are
related to which other ones, and in what order. Does X cause Y, or does Y cause X,
and how's that related to Z? And so now, instead of what we typically think about in economics,
which is this paradigm in which there is an outcome variable, the thing I care about, do they recover from the disease or not? There's a treatment variable,
that's the drug. And then there are these control or modifier variables like your genetic history
or family background or your age or gender or whatever. Now let's just think about putting
those all into a pot and saying, actually, I'm not really going to tell you in advance
what I care about and how these things are related. I'm just interested in all the relationships between these things. And that's a much,
much harder problem for obvious reasons. But you can imagine there are many biological systems in
which you'd like to be able to figure that out if you could, because it's not obvious where to start
or what the right levers to pull are. And so now I feel like it's quite clear that this is not the
same as we had a job training program
and we want to know the average effect of that job training program on the average person.
So this is really exciting stuff.
Can you tell us about a couple of applications that you're working on right now?
Sure.
So one of the recent things we've been working on has been thinking a little bit about internal
investment decisions being made by a large company. And in that case, it really becomes a matching
problem that we were looking at. So you can imagine throughout the sales process that
you have many different kinds of ways you could interact with your customers to make
the experience better. So for example, you might assign additional salespeople, you might
assign technical support,
or you may make direct investments to your customers in various ways.
And for some customers, different kinds of investments may be more or less appropriate.
And so you're trying to figure out what's the right matching.
And in economics, the way we think about that is, okay, well, which of these investments is going to give me the highest ROI on which kind of customer?
If I assign technical support to a company, are they subsequently going to be able to grow the technical component of their business and be able to come back to us and buy more stuff
from us in the future? And you can imagine that the same solution isn't right for everybody.
People in different industries, companies of different sizes or parts of their company
lifecycle may want different kinds of
tools. They may have different needs. And so trying to figure that out is very much a causal
inference problem and one that uses a sort of causal machine learning.
I'm curious to hear a little bit more about the mechanics of what you're doing. I can imagine a
data structure where you have data on a bunch of customers. And then you saw that some of the customers got
tech support, and some of them got additional sales effort, and some of them got a third thing.
And you could correlate these different actions that the company took with how profitable the
account was, whether the client stayed with your large company, etc.
But of course, that correlation isn't what you want. So can you tell us a little bit more about
how you turn this into a causal inference problem? Yeah, right. So that's a great question. And of
course, the best answer would be to say, I ran an experiment, but I didn't. And the reason I didn't
is the reason most businesses don't. Nobody goes ahead and starts thinking, well, we'll randomize some large investments. And so you're in the imperfect world
of machine learning or causal machine learning without this nice experiment. And so what's the
best you can do? The best you can do is try and find a control group that's sort of relevant,
right? So if I have some group of people that are getting investment A, some group of people
are getting investment B, some group of people are getting investment C, and some group of people are getting nothing, can I find people in group D, the control group who's getting nothing, that look like the people in A and B and C?
And that's a procedure that's been around for a long time.
You're super familiar with it, the sort of matching estimators that we see in econometrics.
But what if I know a lot about
my customers? What if I have many, many, many variables, right? That gets us into some of the
way the machine learning approaches work well, and in particular, the so-called double machine
learning approaches. What do they look like? You're trying to find something that looks like
an experiment. So what you can do is you can take the set of people who get each of these different
kinds of things, you can say, okay, what part of that is unpredictable?
Let me find people, companies that were surprisingly likely or surprisingly unlikely, given everything
we know about them, to get a particular intervention.
So company A got a lot of technical support, but it's surprising they got a lot of technical support.
And company B, who I might have thought got technical support, didn't.
Okay.
So now I have these companies that look like I'm basically doing a coin flip.
Beforehand, I really had very little idea.
They looked to me like, you know, it's hard to tell whether it's A or B who's going to get the tech support.
And then one of them does get the tech support.
That kind of looks like an experiment. I didn't literally flip a coin,
but as far as I can tell from everything I knew about these companies, it looks to me like we
pretty much flipped a coin. And then I can use that as if it were an experiment.
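The "surprise" idea described here, predict both the treatment and the outcome from what you already know, keep only the unpredictable parts, and then relate those residuals, can be sketched on simulated data. This is a hand-rolled, deliberately simplified linear version of double machine learning; real implementations (for example in EconML) use flexible ML models and cross-fitting, and all the numbers below are invented:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Confounder: e.g. customer size. Bigger customers are both more likely
# to get tech support and more likely to grow anyway.
x = rng.normal(0, 1, n)
p = 1 / (1 + np.exp(-2 * x))               # propensity to receive support
t = (rng.uniform(0, 1, n) < p).astype(float)
y = 3.0 * x + 1.5 * t + rng.normal(0, 1, n)  # true effect of support: 1.5

# The naive treated-vs-untreated comparison is badly confounded.
naive = y[t == 1].mean() - y[t == 0].mean()

def fit_predict(x, target):
    """Predict a target from x by least squares (the nuisance model)."""
    A = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return A @ coef

# Keep only the "surprising" parts of treatment and outcome,
# then relate the two surprises to each other.
t_res = t - fit_predict(x, t)   # surprise in who got treated
y_res = y - fit_predict(x, y)   # surprise in the outcome
effect = (t_res @ y_res) / (t_res @ t_res)

print(f"naive difference: {naive:.2f}, residual-on-residual: {effect:.2f}")
```

The naive comparison badly overstates the effect because big customers grow regardless; the residual-on-residual regression recovers something close to the true 1.5.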
When you work in academia, that feels like a terrible approach for causal inference,
because you think to yourself,
well, if it looks like a coin flip, but actually they decided to pick something,
there was a reason they picked something and that's going to mess up all your estimates.
But once you're inside a company, you realize a lot more things look a whole lot more like
coin flips than you thought they did. And so it's maybe not as worrying as you thought. Yeah, I see.
That's very interesting.
So can you talk about the impact that your work has had?
The way that this sort of work plays out in large companies is in many ways.
One is it's informative a little bit as to this thing that we started off studying,
which is which kinds of customers should get which kinds of investments. And I think that's informative to people. It's not the only
source of information they have. And so they weighed it against a bunch of other factors
in thinking about how to make future sales investments. And also, just the raw numbers,
this particular kind of investment does very well with this kind of customer might
change the way you budget. So you might say, okay, I want to spend more money on this kind of investment in the future.
But then it also has much more sort of what I'd say micro implications in a large organization,
which is that there are teams that are not making budgeting decisions or big strategic
decisions that are just thinking, okay, what do I want to give my customer next?
What's the next thing I want to do with my customer?
There seems to be the study that's saying that often when I assign tech support to a customer
like that, it goes pretty well. And so maybe if my customer has recently gone from small to medium,
they've grown their business a lot, suddenly, according to the study, it looks like, oh,
now the next logical sequence of actions for me to take is to change my kinds
of interactions with that customer to a different kind of interaction. And this sort of customer
lifecycle journey is very much a kind of application for these sorts of tools. It's
trying to figure out what the next logical action is for each customer.
I see. This is super interesting. So I feel like I'm seeing a lot of applications these days of this type of flexible
approach to estimating heterogeneous treatment effects or causal machine learning that in
practice aren't delivering much value. I'll give you a couple of reasons in the applications I'm
seeing. So one is that at least in many academic studies, you don't have big data, you have small data.
So I've only got 2000 observations in my experiment.
And so I can be flexible
with my estimation of treatment effects,
but I just don't have enough data
to know reliably what heterogeneity there is.
The second is bad potential moderators. So it could be that
if we knew a lot about people, we could predict how they would respond differently to some drug
or a job training program or whatever. But in practice, we just don't know very much about
people or we may know a lot, but the stuff we know may not be relevant to predicting differences in
treatment effects. And the third
reason I feel like I'm seeing these approaches not live up to the excitement that I had is that
the world is often pretty smooth or pretty linear. And so doing fancy non-parametric stuff where I
can learn about, you know, the effects on women from Indiana who are age 34 just isn't as useful as just knowing
that the effects are different for women or the effects tend to be different for older people.
So do you share that big picture assessment? And are there specific examples where you think
there's been really high value added that you can point to?
Yeah. So the reasoning you offer is pretty sound. What I'd say is that in many applications,
certainly corporate applications, applications with large data sets,
you often can find heterogeneity because you just have such large feature sets,
histories of customer engagements. I'm thinking of Bing, for example, where we have a lot of data
and we can think about what kinds of ads there might be value in seeing.
And because there's a very large customer base, you can cut it many ways.
And also in these corporate applications, you're less interested in interpretability.
You are interested in statistical significance.
So you don't want to incorrectly infer that women in Indiana really want to see a particular
kind of ad and then show them this ad when in fact that's not true at all. So you want to get that right.
Statistical significance matters, but interpretability doesn't in some ways.
Whereas in academic applications, you often really want to be able to tell a story. Some of the business of academia is narrative construction, right? And so you want a compelling narrative.
In automated applications,
actually what you care about, say ads, is lift. Do I manage to lift the probability of somebody
clicking? And so in fact, I don't actually care why this happened. I'm never going to investigate
why this happened. All I want is a system that can reliably deliver higher lift. And if I have
enough scale in terms of data, I'll be able to figure out how to do that with causal machine
learning. And if I'm running large enough experiments.
So I think there are cases in industry where it's pretty compelling.
In academia, my sense is that often the returns are low, but I also think the cost is
very low.
What I'm hoping, for this group that I co-lead at Microsoft, this ALICE group that works
on software for estimating heterogeneous treatment effects, EconML, is that you'll just be able to deploy that,
and it'll be easy, and it'll give all the results that academics want. And if the result is, man,
it's pretty much linear, that's fine. As long as the times when it's nonlinear, or the times where
it does happen to be an interesting heterogeneous effect, that's statistically well isolated,
you can say, oh, in this particular application, actually, we can show you something different. And it didn't cost us very much more time to do that.
Yeah. And I certainly, as someone who is a natural user of the software you're developing,
I really appreciate all the work you all are doing on the hall here at Microsoft Research
New England. I actually want to follow up on something you said,
which is, I'd like to ask about interpretability. So you mentioned that part of academia is storytelling. And I think that's totally right. I would have said it just, we're doing work that
we want to make generally interesting and generally useful. And so nobody cares about
32-year-old women in Indiana and how they responded to my Bing ads treatment. But they
might care to have some insight in general about gender differences or age differences or whatever.
And so I think that's how I think about the storytelling and the value of storytelling. So what are the improvements that you're seeing
in our ability to make machine learning results more interpretable?
So I think you were very early there. And I actually think that's sort of
one of the weaknesses of this area. One of the things that we've worked on is tree-based interpreters.
By that I mean, let's take this complicated set of machine learning results, which actually
may say very specific things.
They may say that this particular user in Indiana, we think that they're really going
to like these ads.
Let's reduce that to a much simpler story by saying that all I really want to do is
divide my data into four groups.
People who respond a lot, people who respond a little, people who don't respond, people who actually respond negatively.
What is a very compact way of expressing that?
And it may be as simple as saying, oh, it's people who are young and male who are in category one.
And category two is a different disjoint set, right?
That's something that we've worked on a little bit and other people have proposed as well.
But I think there's a lot more to do here because, as you say, nobody cares about the specific thing.
People care about general patterns. And that's really, in some ways, another machine learning
problem. This is kind of a bit of a tangent, but it's kind of interesting. I once had a research assistant at Harvard.
He was an undergrad.
He had previously studied biology,
and he switched to economics.
And I asked him to study something.
And he came back with 100 pages of regression results,
which was incredibly impressive.
I was like, wow, this guy has already thought about this.
And he had cut the data like a thousand different ways.
And his name is Chris Sullivan,
and he's now a professor of economics at Wisconsin.
So it worked out well for him.
But then I said, I need a regression to summarize these.
I need some other technique
to make sense of all this data you've just thrown at me.
And it's the same thing
with these machine learning techniques often.
They give you this super fine grain data
about what you think is going to happen
for individual people. And then you say, okay, but now tell me a story with this.
And there are different machine learning tools you could stick on top of that to get some
interpretability, but I think we're still at the beginning of developing those.
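A very simple version of such an interpreter, a depth-one "tree" fit on top of per-person effect estimates, can be sketched as follows. The effect estimates here are simulated rather than produced by any real causal-ML model, and the age cutoff is invented:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000

# Pretend a causal-ML model already produced a per-person effect estimate
# (a CATE). Here we fabricate those estimates: younger users respond
# strongly, older users barely, plus estimation noise.
age = rng.uniform(18, 70, n)
cate_hat = np.where(age < 35, 2.0, 0.2) + rng.normal(0, 0.3, n)

def best_split(feature, target, min_leaf=50):
    """Depth-one tree: the threshold minimizing within-group variance."""
    order = np.argsort(feature)
    f, g = feature[order], target[order]
    m = len(f)
    best_sse, best_cut = np.inf, None
    for i in range(min_leaf, m - min_leaf):
        sse = g[:i].var() * i + g[i:].var() * (m - i)
        if sse < best_sse:
            best_sse, best_cut = sse, f[i]
    return best_cut

cut = best_split(age, cate_hat)
lo, hi = cate_hat[age < cut].mean(), cate_hat[age >= cut].mean()
print(f"split near age {cut:.1f}: effect {lo:.2f} below, {hi:.2f} above")
```

Thousands of individual estimates collapse into a two-group story ("young people respond, older people don't"), which is the compact summary being described above.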
That is a great story. So another direction that I think is interesting in this space is that in practice, some companies and analysts I talked to are using
prediction approaches when ideally they want to be using causal inference. So for example,
we just had a big election. And so an important thing that people talk about in election years
is targeting campaign ads on the internet.
And so a get out the vote organization, what they really want to know is the causal impact
of serving an ad to somebody on Facebook
on the chance that they'll donate
or the chance that they will vote for their candidate
instead of the other candidate.
But this is really hard for any number of reasons. And so instead, you might do some
prediction thing where you'd say, well, I'd like to, using some characteristics, predict who's going
to click on an ad or predict who is a moderate, and then take those predictions as some signal
of what the treatment effect of the ad would be
on some outcome that I care about. In clinical applications, right, people might use the
predictions of who's sickest to target some medical intervention on the grounds that the
sickest people might benefit the most. I'm curious if you have any examples from your work or talking with other folks of when that type of heuristic approach tends to work well versus when it can give misleading results.
I mean, I think in those applications, it's sort of like you're targeting at the people where you have good reason to believe it's valuable.
So I can imagine, you don't actually know what happens in the election example, but I can imagine you targeted the pivotal voters, right? You'd think, okay, these people are the people who
might actually switch their minds. So they're the people that should get the ads. Or maybe I need to
play to my base. I don't know. It depends on your preferences. But you can imagine that those
actually could be quite wrong, right? So you may be misinformed. And so one of the examples that
came up recently was a study by Susan Athey and some co-authors, where they were looking at targeting help with filling out financial aid applications to different kinds of students.
And their prior was that the people who were really bad at doing this by themselves, the
people who are least likely to apply for financial aid by themselves, were exactly the people
you wanted to help with financial aid applications.
And in fact, they didn't actually do that policy.
They randomized and then studied the results afterwards.
And when they studied the results, they found that really it was the people who were already most likely to do it who just needed a push over the edge.
And so that's an example where you could imagine that you have a strong belief that
it's the people who are the most vulnerable who need the most support.
But in fact, you can imagine easy stories in which they're actually so far in the
hole that it's not going to help.
And the places where you can do the most good are with the people who are much closer to the margin,
closer to daylight.
And so I think we need to study these things.
That's not to say that I don't think this is super valuable.
So often we encourage folks who are thinking about using some of these causal machine learning
tools to first develop a predictive model, which says something like, this person
is the sickest, and then ask, can I look for heterogeneity by that prediction?
These people I think are the sickest, and then let's look for treatment effects among
the sickest people, the middle people, and the most healthy people, and let's see what
those look like.
And you could see very quickly whether your prior is right, and then maybe you'll even
run with that in the future.
Maybe you'll say, I've done this in a few cases, and it always turns out to be the sickest people who benefit the most.
So maybe I don't need to do this whole causal machine learning thing anymore. But I think it's
worth going through the process of checking if you can.
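The check described here, predict first, then look for treatment effects within bins of that prediction, can be sketched in a few lines. This is a hypothetical simulation, not code from the show: the "sickness" score, effect sizes, and sample size are all made up to mirror the financial-aid story, where the group predicted most in need turns out to benefit least.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30_000

# Hypothetical randomized trial: a baseline "sickness" score and a coin-flip treatment.
sickness = rng.uniform(0, 1, n)
treated = rng.integers(0, 2, n)

# True (unknown to the analyst) effect: largest in the middle group and
# near zero for the sickest, echoing the financial-aid example above.
true_effect = np.where(sickness >= 2 / 3, 0.0,
                       np.where(sickness >= 1 / 3, 2.0, 0.5))
outcome = 5 * sickness + treated * true_effect + rng.normal(0, 1, n)

# Step 1: a predictive model of who is sickest. Here the score itself stands
# in for the prediction; in practice you would fit one on baseline data.
tercile = np.digitize(sickness, [1 / 3, 2 / 3])  # 0 = healthiest, 2 = sickest

# Step 2: estimate the treatment effect within each prediction bin via a
# difference in means, which is valid because treatment was randomized.
est = {}
for g, label in enumerate(["healthiest", "middle", "sickest"]):
    in_bin = tercile == g
    est[label] = (outcome[in_bin & (treated == 1)].mean()
                  - outcome[in_bin & (treated == 0)].mean())
    print(f"{label:10s} estimated effect: {est[label]:+.2f}")
```

Running this shows the prior ("target the sickest") failing in the simulated data: the middle bin has the largest estimated effect, and the sickest bin is near zero.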
So as part of your work at Microsoft Research, you've developed this set of tools, computer code, basically, for implementing causal machine learning approaches.
Tell us about that software and how that helps researchers.
Thanks for asking.
So that's the EconML package, which is available on GitHub.
A group of folks have worked on it, and I've actually written very little code, as you'll notice if you ever look at our contributions to the repo. So it's definitely
a group effort. That's code for estimating heterogeneous treatment effects. And it's very
agnostic as to what the right way to do that is. So some of the techniques that myself and some of
my colleagues have worked on are in that repo, but many other techniques from leading researchers are represented there.
We want people to have access to the best available statistical technology for estimating
heterogeneous treatment effects.
One of the things I like about this package, as I've worked on it and discussed the conceptual
issues that arise, is that we've really thought carefully about the statistics here.
And we've realized along the way that there are many, many difficult statistical issues
that arise.
And so I want people to use automated packages because I think that if you don't use automated
packages, it's easy to think you're doing the right thing and somehow make mistakes
because these mistakes can often be subtle.
And there are varying kinds of mistakes,
right? So one of the things that's appealing for policy evaluation of some of these causal
machine learning tools is that they automate model selection. And what do I mean by that?
Well, if I'm trying to figure something out about the world, about whether X causes Y,
there are many different ways I could approach that problem. I could just look at the raw correlation between X and Y. I could run an experiment.
I could look at the correlation between X and Y after controlling for, say, age and
demographics and maybe family history and medical history, lots of other things.
And I have a lot of choices to make along the way.
And what machine learning tools do is they try to figure out in the way that most
machine learning tries to figure out what the best forecast of something is, what the best
prediction of something is, they try to figure out what the best model is. And so that sort of ties
the researcher's hands a lot as to what they get to show as the set of results. And I think this is
good for a number of reasons. One, we might find the best model. But two, also, it sort of limits
the ability of
researchers to find significant results when there aren't any by just trying and trying and trying again until they find the thing they want to find, right? It's better in many ways if we just don't give the researcher that degree of freedom.
So you're like the Steve Jobs of econometrics data analysis packages, in the sense that you want everything bottled up and to not give the user a lot of choice in what to use or how to plug it in?
I mean, you know, there's a fine line
there. Discretion is useful, right? But I do think at the very least, we want to have the options
available to people to say, look, I did this completely standardized thing. So you should
believe me when I give you these results as opposed to, no, no, I didn't do the standardized thing because blah, blah, blah. And then you should
expect a little bit more scrutiny because you decided to do something sort of very bespoke.
Absolutely. Well, if you could make as much money as Steve Jobs with this software package, and if you have any to give away, you know where my office is.
Well, this is tremendous.
I want to shift gears to talk about another area where you're also an expert, which is the economics of platform markets.
And I want to just start very broad.
What makes e-commerce different from older ways of selling stuff?
Well, so I think there are a bunch of differences. The most obvious is that you're buying something online. And so this is something I worked on actually in my PhD thesis when I was looking
at people buying cars on eBay. So it's 2007. People were buying a lot of cars on eBay.
And everybody thought it was very surprising because
cars of all things are the case where you kind of want to go kick the tires, literally, right?
That's what you want to do. And yet people were comfortable with it. I was curious, and that got me really interested in this topic of what makes e-commerce work. And one thing is that
products are often standardized. And so you don't have to worry about kicking the tires in person. Another thing is that you have reviews. So you have many other users
telling you whether this is a good product or not a good product.
The third thing is that the platform itself, and actually let me back up a bit. Platform, Amazon, Etsy, these are places where we go to buy things. That's kind of a new word in many ways, right? It's been around for a while now, maybe 20 years. But before that, we didn't have platforms, right?
We had shops. Now we have these platforms, and they do lots of things that shops don't do.
They aggregate these reviews. They recommend things to you by showing you what's available.
They make it easy to find things because online, you don't have to go and actually wander through
aisles. You just type things into a keyword box and they magically appear for you. So this platform is like a shop, but it's like
a shop with a lot of other features. And those other features have made shopping online a pretty
good experience for many people. And so many people are buying there.
Platforms are, of course, not new to the last 20 years. There's a sense in which I could think of a singles bar as a platform, or the credit card networks as a platform. But certainly online platforms are new to the internet era.
Yeah. I mean, you're thinking of it in the classic sort of economic sense of a
platform market, which is, you know, so two-sided marketplace, right? Now I'm thinking of much more
in this sort of like e-commerce-y kind of like, oh yeah, this is where I go to buy things on the
internet. And you're right. The economics of the one category, the two-sided marketplace,
are old and super interesting and apply to, gosh, so many different industries.
But this sort of modern online phenomenon that I'm thinking of is, yeah, it's much newer.
So I want to talk specifically about the role of the platform in providing recommendations or product rankings to people.
What are platforms doing there?
And what does that mean for consumers and the benefits that consumers receive?
So let me actually start off by saying it's not obvious what they're doing, right?
This is one of the sort of opaque things about parts of
the internet. Back in the day, when you looked at Yelp and you asked, what did I get shown when I
typed in a search query for a restaurant, you got shown the restaurants rated highest by Yelp.
That is no longer the case unless you work pretty hard, right? That's not the default.
So what are they doing? One of them is they're determining relevance for you.
They're trying to figure out, given what you typed in, what you want. And that is pretty solidly aligned with your interests as
a consumer. If you're looking for an Indian restaurant, you don't want them to show you
Polish restaurants. The second thing they're trying to do is they're trying to figure out
the best of the options that are relevant. And then the question of what is best becomes a little
bit complicated. Because best for whom, right? On the one hand,
they want to serve their customers. So they want to give customers recommendations which are good.
But of course, their customer base is not one homogenous block. It's many different kinds of
people. And so they have to think a little bit about the diversity of the things that they
recommend. And on the other hand, they're taking in fees from merchants from the other side of the market,
from restaurants in the case of Yelp or from manufacturers in the case of Amazon.
And so they have to think a little bit about what they want to promote among the sellers,
like what they can do in terms of ranking recommendations to make sellers better sellers,
and also how much money they're getting from each seller. And so let me talk first about the virtuous side and then talk a little bit about the less virtuous
side. So the virtuous side of this is that platforms have this incredible opportunity
to discipline sellers by saying, I will not make your product very prominent
unless you are an attractive opportunity for my customer base.
Both Amazon and eBay have thought really hard
about free shipping. So Amazon does it through Prime, but eBay, people could give you shipping
or not give you shipping. They could charge you $50 for shipping and price their product at $2,
or they could charge you $2 for shipping and price it at $50, and they didn't care which one it was.
But then what they just started to do is they said, look, we're actually going to say that
if you have free shipping, you get to float right to the top of the search results.
And so suddenly everybody was like,
oh, free shipping.
That's something customers want.
That's something we will offer.
Amazon has a tool called the Buy Box.
You want to be at the top of their rankings.
You better be one of the cheapest
instantiations of that kind of product.
If you're very expensive
for what you're selling
and it's kind of a commodity product,
you won't be near the top of their rankings.
So again,
that creates incredible competition among sellers to drop their margins,
which means that Amazon customers get very good deals. So that's the sort of the good part of
the platform building or platform ecosystem. The sort of the less talked about part is that
it is also true that as a platform, if I have a better contractual deal with some of the seller side of the market, those are the people I want to put at the top of the rankings, because those are the people who are giving me the most money on every transaction. And so that's also obviously being taken into
account. Most platforms say they don't do very much of this, but it's actually quite hard to study, because it's hard to see their contractual obligations.
So I feel like another issue, which is really apparent in many searches on Amazon, is just the proliferation of listings for goods that may be very similar, or where there may be lots of differences across the sellers in terms of quality.
And so do you have a sense of why that's such a hard problem to fix?
Yeah, I don't know.
Maybe I can offer you some conjectures, but it's interesting to think about the intermediation
business, right?
Because there are a limited set of manufacturers and given any one manufactured product, it
can be marketed in many different ways, which is, I think, what you see in Amazon.
It's the same product being marketed by many different people. And then what is the differentiation
mechanism? Well, that's often whatever the search algorithm ranks as being important.
For a brief while, eBay actually made this visible to sellers. They gave them an opportunity to type
in the title of their listing, and eBay would tell them how good a title they had according
to their relevance algorithm.
So as you can imagine,
every seller spent a lot of time trying to game that.
And it turned out that their algorithm was kind of a machine learning algorithm
that wasn't very thoughtful.
So it turned out like free, free, free, free shipping
was better than just one free before shipping.
And so you've got these insane titles.
And so that's crazy.
But if you think about like the whole search engine optimization business, the same thing applies in retail.
If I can be the guy who cracks the Amazon code and figures out how to get my particular version of the exact same product stacked at the top of the rankings, I make a ton of money.
So you can have many different businesses in the business of figuring out how to get to the top and they can leapfrog each other.
And one goes up and makes some money and then they kind of don't do so well for a while. Then another goes up and they make some money. And how do you
get rid of that? I don't know. You have to basically figure out if you're Amazon, that this
is an exact match of the product. And I don't know if their incentives for that are that strong,
because maybe they think that the customer seeing five versions of the same product doesn't really
make them that unhappy. I don't know. So you talked about the good aspects and then the bad aspects of the
platform's ability to rank different sellers. And I'm curious, given those opportunities for
contractual relationships that the customer may not know about, or incentives in terms of how
else they might exercise market power, does any of this lead in a direction where you would advocate for regulation from the government? Or do you think
this is mostly a set of issues where the platforms will get close to what's best for society by just
acting in their own best interest? I think that's a pretty difficult question.
That's why you're on this podcast, Greg, to answer difficult questions.
Difficult questions. Where I get worried is about something that approaches close to monopoly power, right? And there are many platforms like this in different
domains. And so ultimately, if you think of these companies as being the final stop in any
journey from a product through to a customer, right, then if I'm the bottleneck at the end of some process,
I can demand a lot of the value of that entire process.
Now, many of these companies have said up until now, at least in the way they talk about it publicly, and I sort of believe them actually, that their incentives at the moment are just
to grow in scale. They just want as much of the market as possible. And so what
they're going to do is make the deal the best for the customer they can. So from an antitrust point
of view, they're in fact like antitrust heroes, right? They're out there to make customers happy.
And you can argue about whether that's the right definition of what we want for the economy,
but it's certainly good for customers. The problem is that once you have enough scale,
then you might want to exploit that scale.
And I think this is what everybody in regulation is thinking about right now.
What's to stop a company from starting to demand more and more of the pie?
Antitrust authorities can make all these contracts visible, right?
So they can launch an investigation and see entire chains.
So they can ask, is it the case that suddenly a large share of the vertical value between
sort of the cost of production and the final transaction price is going to one single entity?
That's where the interesting debate lies: if we think the economic principle is that you should not be able to demand monopoly-sized profits from any vertical chain, what's the
regulation that will allow us to actually get to the point
where we can enforce that? I want to shift gears a little bit.
One of the things we know about the modern internet economy is that companies are collecting
lots of data and selling that data. What concerns does that raise for you
as an economist? Yeah, that's a good question. I think that one of my big concerns around privacy
is that customers or users don't really know how much loss of privacy they're signing up for.
When a company can see my data, that feels like one thing,
but perhaps I didn't realize that often these companies are then selling that data on
to other sellers. And so now my mortgage company might, for example, be thinking about
me based on data that tracks my behavior on the internet.
And so your concern here sounds almost like a consumer protection type concern,
like it's my right to know about my data.
Is that right?
And or do you have any concerns about market efficiency?
Yeah, I mean, so on the one hand, I do think that there's a pure, like, creepiness kind of element, which, you know, a lot of our colleagues in other fields will think about.
I do think people have a right to know how the data is used, although I think that often,
back to our interpretability question, I can give you a data dump and then it has to be
interpretable.
And I think actually when you look at companies like Google, they've made it much easier to
see how your data has been used.
The interpretability of your browsing history is sort of quite useful.
But then there's also an economic question.
And the economic question is, am I going to get targeted offers based on my browsing history?
So in an economic context, am I going to get an offer to shop at Starbucks, for example, or to get a better rate on my mortgage? And when I get these targeted
offers, is that going to be for my benefit or for the benefit of the companies making those offers?
And economic theory tells us that in general, targeted offers need not be a bad thing.
Because for example, if it's knowable that somebody really can't afford to pay $2
for a cup of coffee at Starbucks, but they'd be willing to pay a dollar, maybe they'll
get an offer to buy it at 75 cents rather than $1.50, and that's good.
On the other hand, they might get targeted and offered at exactly a dollar, which means
Starbucks got to sell another cup of coffee that they wouldn't otherwise have sold, and
the customer's basically no better off than before. And so there's this question of both,
does it improve the efficiency of the market? That is to say, do we sell more cups of coffee,
roughly? But then also, who's benefiting from that? Is that Starbucks or is that the customer?
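The coffee arithmetic can be made explicit. The numbers below are the hypothetical ones from the example, plus an assumed $0.50 marginal cost; the point is how total surplus, and its split between firm and customer, moves as targeting sharpens.

```python
# Two buyer types from the coffee example: a $2 willingness to pay and a $1
# willingness to pay, with an assumed marginal cost of $0.50 per cup.
wtp = {"high": 2.00, "low": 1.00}
cost = 0.50

def outcome(prices):
    """Firm profit and consumer surplus when each type faces its own price."""
    profit = surplus = 0.0
    for typ, p in prices.items():
        if wtp[typ] >= p:                 # the buyer purchases at price p
            profit += p - cost
            surplus += wtp[typ] - p
    return profit, surplus

uniform = outcome({"high": 1.50, "low": 1.50})   # low type is priced out
discount = outcome({"high": 1.50, "low": 0.75})  # targeted 75-cent offer
extract = outcome({"high": 2.00, "low": 1.00})   # priced at exactly each WTP

print("uniform $1.50:   ", uniform)    # (profit, consumer surplus)
print("targeted $0.75:  ", discount)
print("full extraction: ", extract)
```

Targeting the discount raises both profit and consumer surplus, while targeting at exactly willingness to pay maximizes trade but leaves the customer with nothing, which is the distributional question in the discussion.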
And in which case is it the one or the other?
So in your example, the way that a customer's data is being used is to give her a different
price on the cup of coffee.
But my sense is that the primary use case of data is for product discovery or reminder
ads that have a fixed price, but just say, hey, remember that you were looking at Gucci
shoes the other
day and you should buy these, or, you know, have you thought about going to Starbucks today or
whatever? So it's not price discrimination. It's advertising something that has a fixed price.
Are the efficiency implications of data sharing different in that case?
Yeah. And let me say, I actually think they're kind of the same sort of topic; I think they both fall into the general category of marketing.
And the one kind of marketing is sort of a discounted targeted price offer.
But the other kind of marketing is just, hey, look at my product again, right? Like you say.
And I think that the efficiency implications, under what I guess you'd think of as standard economic models, are pretty positive, right? Because if I know a lot about you, I can remind you of things that are
actually really useful to you. You really had meant to buy that pair of shoes, right? It kind
of floated out of your head for a little while. I put it back in front of you. You go, oh yeah,
that's the pair of shoes. I really did want that pair of shoes. Great. I'm going to go ahead and
buy it, right? And so that's potentially a positive thing.
As we all know, as people who've been served these ads, oftentimes you buy the pair of
shoes and then you keep getting those ads for the shoes.
So suddenly this doesn't look quite so useful and it starts to look annoying.
And so that feels like a failure of ad technology, but there's a matching process.
The best case scenario is it's just improving matching.
The more complicated issues arise when we're now getting differential pricing.
I see. Do you agree with my assessment that most of targeted advertising, or sorry, most of the use of customer data, is for targeted advertising at a fixed price, as opposed to price discrimination, charging a different price across different customers?
Yes, I think so. I think that's by far the overwhelming use of that data.
So that's the overall optimistic view of the value of data sharing across companies
for targeted ads.
I think it is, but maybe with some sort of caveats, which is that I think the cases that
we think of as being potentially most problematic are high value cases. So like
mortgage applications, insurance applications, job applications, right? Where you might imagine
that the way in which this data is used is to differentiate between applicants in ways that
we may think are not fair potentially, or may lead to very differential offers that we wouldn't be entirely
comfortable with. I get the sense that the pricing angle as opposed to the matching angle
is not that prevalent, but maybe is more likely to be prevalent in exactly the cases where the
stakes are high. I see. So I guess the credit scoring example used for mortgage pricing, credit card pricing
seems to be a leading case study there. As you see, lenders in the US and around the world are using more and more data to assess whether somebody is a good credit risk, and then using those data to
offer different prices or just to offer credit or not offer credit
to different people.
So can you talk us through how you think the use
of increased customer data impacts the market,
especially in light of the fact that you also have these,
what you'd call adverse selection issues
that are at play in these markets
that are not at play in the buying shoes example
we talked about before.
Right. So yes, exactly. So this is definitely a more complicated set of issues.
So I would think about this sort of modulo adverse selection, without thinking through who's going to subsequently be able to repay a mortgage or not be able to repay a mortgage.
I'm now going to have targeted offers where I'm going to figure out what is, based on the data,
my guess as to the maximum willingness to accept. So what rate can I charge you so that I'm still
going to win your business and not lose to the competitors? I'm going to try and make that as
high as possible. And so for many customers, potentially, this is going to be very different than the rate I would
have charged you when I had to win every customer, if I figured out that you're the customer who doesn't care that much about their mortgage repayment, right? And again, we might hope that competition is
going to get rid of this, but suppose I'm some company that's got a very specific handle on you.
I figured you out better than my competition has. Maybe I'm able to give you a slightly different
offer, or maybe I'll change the terms of the offer in such a way that they look more
appealing to you, but actually profitable for me. And now that customer is going to be paying a
little bit more than they would have than in the case where I had to make an offer that appealed
not only to the Gucci customers, but to everybody else as well. So there's now this potential
inframarginal loss. On the other hand,
because I have much more data about who's going to repay, I've got way more information,
I'm also not going to be making as many loans that are simply complete losers, where I make the loan,
this person gets the credit and then can't do anything with it, and subsequently defaults.
That again is good for the firm. They're avoiding defaults; the bank's not eating any of these defaults. It might also in that case be good for society, because those same dollars could potentially be reallocated to somebody who's less of a credit risk, who might be able to put the same amount of dollars to productive use.
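The "fewer loser loans" point lends itself to a toy simulation. Everything here is assumed for illustration: applicants with a true default risk, a lender that only observes a noisy signal of it, and a fixed approval cutoff. Richer data simply means a less noisy signal.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
risk = rng.uniform(0, 1, n)                  # each applicant's true default risk

rates = {}
for name, noise in [("coarse data", 0.30), ("rich data", 0.05)]:
    signal = risk + rng.normal(0, noise, n)  # what the lender observes
    approved = signal < 0.5                  # lend to apparently safe applicants
    rates[name] = risk[approved].mean()      # realized default rate on the book
    print(f"{name}: default rate among approved = {rates[name]:.3f}")
```

With the coarser signal, many genuinely risky applicants slip through the cutoff, so the realized default rate on approved loans is noticeably higher than under the rich-data signal.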
So I think in those cases, the economics are a little bit more complicated. And you might think
that competition is going to do a lot of the work
for you here in making sure that no particular customer group gets exploited, unless there's
sort of differential access to information, so that some, say, banks have a preferential relationship with data providers that would allow them to sort of exploit customers more effectively.
Why, Greg, are ads so irrelevant despite all the data
that tech companies have about me? I am getting ads for products I would never think of buying.
And you mentioned the retargeting example where I am sometimes getting relevant stuff,
but it's only relevant because I just bought it yesterday and now it's not relevant anymore. In a world that is awash in data, and awash in concerns like the ones you're talking about, that companies might be misusing our data and micro-targeting in ways that are somehow bad for consumers, why is it that, as far as I can tell, the relevance of the ads I'm being served with my data is not very high?
I would point to two things.
The first is that, you know, you and the rest of us are complex, multidimensional people.
And so what you look like on one day is not the same as what you look like on the next day.
And so, you know, in some sense, there's sort of like a consistency issue, right? That in fact, the things that you find interesting today or care about
today are not the same things as you want tomorrow. That's one thing. But I think the second
thing is just the strategies are dumb, right? There's a, I'm awash with data and then there's,
how do I figure out what to do with that to give you exactly what would be relevant to you?
And not only that, I have to give you diversity. I can't figure out the one thing that you might care about and then just show it to you a thousand times. There's got to be diversity. And it's got to be the case that
that has to somehow be achieved across many different platforms. So there's no coordinating
mechanism, right? There are all these people who are trying
to send you messages, and they're not coordinated with respect to the set of messages. So that,
of course, also results in sort of a loss relative to what we might imagine would be the right sort
of ads to show you. And then there's a grouping and privacy question. So we've talked a lot about
privacy, but it is often the case that in reality, most ad targeting is done at the level of, like, a consumer segment. So they've tagged you as something like five keywords, and if I tagged you with hiking, then they're like, everything hiking, all the time hiking, as if the only thing I ever like to do is hiking. But it turns out, you know, five keywords is already quite a lot of keywords. I've already put you in a pretty fine group, you know, a segment group, right? And so there's no sense in which the computer models have a model of you. They have a model of people like you. And it turns out you are not just like all the people in your peer group. You're like a lot of other things too. And they can't figure that out.
Oh, I appreciate your recognition that I am a multi-dimensional person.
Although I resent you suggesting that I'm a flip-flopper
whose desires and opinions change from day to day.
I'm sure you're more consistent than the average bear.
And notwithstanding that,
the ads I get are not relevant.
Well, this is tremendous, Greg.
Is there anything else that you'd like to touch on today?
I'd like to mention briefly this idea that maybe customers would be better off if they could control their own data.
Yeah.
And so this is something I have worked on recently with my colleagues, Nageeb Ali, who's at Penn State, and Shoshana Vasserman, who's at Stanford.
And we've been thinking about what would happen if customers could control what companies
knew about them?
And could that make them better off?
And there's some really interesting economics here.
Because imagine that you get to say to the world,
look, I'm Hunt Allcott. I like these things and these things and these things and not these things.
Okay. Let me say something that's not in our paper, but I think is interesting and reminds
me of something that our colleague Nancy Baym has said before, which is, wouldn't that be nice? Because
then you could decide what it was that companies put in front of you. You would have suddenly this instantaneous control over what you wanted the world to think of you,
and maybe you'd get more relevant ads. Why don't we make this ad targeting not so much a data
acquisition and deployment kind of thing and more of a conversation? Tell me what you'd like to see
today, Hunt. We'll figure that out. In the paper, we say, look, suppose you can control your data. You might decide which groups of people to join, because it may be the case that
certain groups of people are going to get certain deals from companies, right? They've identified
with certain kind of interest groups, and that tends to give them certain kind of targeting
opportunities. And it may be to your advantage then to have an opportunity to self-declare which kind of group
you'd like to be in, right? Part of what's going on in our paper is a little bit more complicated
than this, because what we're interested in the paper is the game theoretic implications
of me saying I belong to some group and therefore the platform making deductions from the fact that
I've declared for a group or not declared for a group. If I say I'm not in some group, what should they then think about me? What kinds of offers
should they then make to me since I've decided not to declare for some group? Another part of
that research is around the packaging of groups. If I was a benevolent company, like a credit card
company, for example, that actually controls a lot of consumer data, could they package up the consumer's data in a way that would be favorable to the consumers?
So suppose I'm Visa. I have something very important. I have data that's actually data.
It's not like you declaring that you like something, which you might actually fake.
You might decide you like hiking because you want a discount from Arc'teryx.
So they actually have like past purchase data.
This person has in fact bought a lot of actual goods.
So it's verifiable.
That's key.
And a second part of it is that they have data from a large, large group of people.
And so one of the pitches that they could make as a reason to have a Visa credit card
is we're going to get you better deals from merchants than anybody else.
And so then instead of the customers having to self-declare and then the packaging happening,
you could imagine a credit card just deciding to do that by themselves.
They're going to package up their customer data with the express intent of extracting the best possible deals for their customers from upstream markets.
And that's a very different model.
And it's a model that only works if that
credit card company is one of the leading repositories of information about customers.
Because if there are 20 other places that Starbucks and Arc'teryx can get data from,
then why do they need to go through the credit card company? Why do they need to give their
credit card company people these very good offers? Because they already know a lot about you from
other sources. So they can just say, I decline to participate in any of this stuff, and I'll give you an offer based on all those other information sources.
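The game-theoretic deduction from non-disclosure that comes up in this discussion has a textbook flavor, the classic "unraveling" logic. The sketch below is not from the paper; it is a stylized toy in which group membership is verifiable, disclosure is voluntary, a seller charges each disclosed group its willingness to pay, and non-disclosers are charged the average willingness to pay of whoever the seller believes stays silent.

```python
# Stylized unraveling sketch (illustrative assumptions, not the paper's model).
wtp = {"budget": 1.0, "mid": 2.0, "premium": 3.0}

silent = set(wtp)  # start with every group declining to disclose
while True:
    # The seller's pooled price for non-disclosers: the average WTP
    # of the groups it believes are still in the silent pool.
    pool_price = sum(wtp[g] for g in silent) / len(silent)
    # A group whose WTP is strictly below the pooled price can't trade
    # while silent, so it prefers to disclose and get an acceptable offer.
    leavers = {g for g in silent if wtp[g] < pool_price}
    if not leavers:
        break
    silent -= leavers

print(sorted(silent))  # silence ends up revealing the highest-WTP group
```

Each round, staying silent pools you with a richer and richer group, so silence itself becomes a signal, which is exactly the kind of deduction from non-declaration described above.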
Well, we've now identified, I think, multiple business opportunities for Greg Lewis. So again,
when you become a billionaire, you know where my office is.
Greg Lewis, thank you for being a part of the Microsoft Research Podcast.
For more information on Greg's research, people can head to gregmlewis.com.
And thanks to everybody for listening to the Microsoft Research Podcast.
For more info on Microsoft Research, check out microsoft.com slash research.