Orchestrate all the Things - Salesforce's AI Economist research wants to explore the equilibrium between equality and productivity. Featuring Stephan Zheng, Salesforce Lead Research Scientist, Senior Manager, AI Economist Team
Episode Date: July 13, 2022
Economic theory is known to be constrained by a number of inefficiencies in its modeling. Salesforce researchers claim AI can help address that, leading to more robust economic policies. Article... published on ZDNet
Transcript
Welcome to the Orchestrate All the Things podcast.
I'm George Anadiotis and we'll be connecting the dots together.
Economic theory is known to be constrained by a number of inefficiencies in its modeling.
Salesforce researchers claim AI can help address that,
leading to more robust economic policies.
I hope you will enjoy the podcast.
If you like my work, you can follow Linked Data Orchestration on Twitter,
LinkedIn, and Facebook. Yeah, thanks so much, George, for having me on your podcast,
and nice to meet you too, it's the first time we meet. So I'm Stephan Zheng. I'm an AI researcher at Salesforce.
I lead the AI economist team here. I'm originally from the Netherlands, so I'm Dutch.
So I'm a fellow European.
My background is in physics and math.
And then during my PhD,
I got really interested in machine learning.
I started doing my PhD around the time
that deep learning exploded onto the world in 2014.
And the motivation for this work really is
to push the boundary of machine learning to discover the principles of general intelligence, but also to do social good.
And, you know, social economic issues are really one of the most critical issues of our time.
And so I'm in a very fortunate position that Salesforce really lives its values that allows me to
combine these two passions. So on one hand, push the boundary of AI and to find some useful
applications for it to make the world a better place.
Okay, well, great. Thanks for the introduction. And you already sort of touched upon something
that I intended to ask you about. So you did mention, well,
first of all, I have to say by way of introduction that these days, when most people hear about AI, they sort of automatically start thinking about machine learning too, which, you know, is kind of debatable in and of itself. But actually, within machine learning, which I know that you have used, when I first stumbled upon your work, let's say, I kind of assumed that you would be doing the typical machine learning approach, which is basically: you get a ton of data, then you do some feature engineering, you train your model, and so on.
It turns out, however, that that's not what you did.
You took a different path.
So a technique called reinforcement learning.
And so I thought it's actually a good opportunity to ask you to first sort of, you know, at
a very high level, explain reinforcement learning for the benefit of people who may be listening
and sort of
point out the differences and then also explain why you chose to apply this method in your
effort.
Yeah, that's a great question.
So yeah, in our research, we use reinforcement learning because we've shown that it's a really
powerful computational modeling
framework for economics. So the first variation of machine learning you described is, broadly speaking, supervised learning or variations thereof: somebody gives you a static data set, and you try to learn patterns in the data. In reinforcement learning, instead, you have a simulation, an interactive environment, and the algorithm learns by looking at the world and interacting with the simulation. From that, it can actually play around with the environment and change the way that the environment works. And so in that sense, reinforcement learning is more flexible. With a data set, you've basically been given a picture of the world, so to say, and that picture is static. But in reinforcement learning, you can actually try to change the world with your behavior, and so on.
And really there are three parts
to this reinforcement learning approach.
There's the simulation itself.
Then there's the optimization of the policy part.
And then the third one is data, too, because you can use data to inform how your simulation works. So in that sense, reinforcement learning is a bigger, more flexible framework, and it's a really useful tool to think about how the economy works and how you might optimize your policy in that world.
Okay, I see.
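The interaction loop Zheng describes, an agent acting in an environment and being rewarded, can be sketched in a few lines of Python. The `ToyEconomy` environment and its "work"/"rest" actions here are hypothetical stand-ins for illustration, not the paper's actual simulation:

```python
class ToyEconomy:
    """Hypothetical stand-in for the interactive environment described above:
    the agent acts, the world changes state, and a reward comes back."""

    def __init__(self):
        self.resources = 10  # finite resources the agent can gather

    def step(self, action):
        """Apply one action and return the reward it earned."""
        if action == "work" and self.resources > 0:
            self.resources -= 1  # acting changes the environment itself
            return 1.0           # income from gathering and building
        return 0.0               # resting (or an exhausted world) earns nothing


def run_episode(policy, steps=20):
    """Roll out a policy for a fixed number of steps and sum the rewards.
    A real RL algorithm would use these rewards to update the policy."""
    env = ToyEconomy()
    return sum(env.step(policy()) for _ in range(steps))
```

For instance, a fixed always-work policy, `run_episode(lambda: "work")`, exhausts the 10 resources over the episode. The point is only that the agent's actions change the environment it is learning from, which a static data set cannot capture.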
So, disclaimer here, I'm not an economist and not even close.
I don't have any background in that.
But from the little bit of self-education I have on the subject,
I think I was able to spot another sort of touch point, let's say.
So, much of economic theory is based on the notion of agents and modeling as well.
So there's different types of rational agent behavior that people use to model economic behavior.
And I think this is what you have also tried to do in your modeling as well. So I would say that this is another way in which this technique is fit for purpose for what you want to use it for.
However, having said that, obviously going into this level of detailed modeling for economic behavior is a notoriously hard problem. And I think this is also one of the issues that you identify in your work as well: that current economic modeling is really not up to the task of depicting, let's say, complex interactions and behavior.
And so you tried to improve on that. So how did you try to do that, exactly? What are the points that you think you were able to improve through your modeling?
Yeah, that's a great question. So the first paper that we published in Science Advances really looks at two of the three parts that I mentioned before. So we looked at optimizing policies and simulation of the economy.
And in the first part, when you talk about optimizing policies,
the beauty of reinforcement learning
is that you are very flexible.
So that means that if we look at income taxes, in reinforcement learning your search space is completely flexible: you can consider any tax model, any set of tax rates that you can think of. And that really is a big difference with how traditional economics thinks, because in traditional economics, if people want to optimize their policy, they typically need to make a lot of assumptions.
That means that they constrain the shape of the tax schedule. They make a lot of assumptions on how the world works.
For instance, they might say every year the world is more or less the same. Nothing really changes that much. And that's really constraining.
It means that a lot of these methods don't really find the best policy if you consider
the world in its full richness.
If you look at all the ways in which the world can change around you.
So the second part is simulation. And the reason that simulation is such a powerful component is that it allows you to think about what-if scenarios, right?
And if we go back to the data that I mentioned: if I give you historical data, that's kind of like a picture, right? The picture is static. I know what happened, but I don't really know what would have happened if I had done something different in the past, right? And so what a simulation lets you do is have this sort of time machine, where it says: let me go back to the past, let me change the policy, and then let me try to simulate what people would have done instead. Right?
And so this modeling of how people might have responded
in an alternative scenario
where the government might have done something different,
that is really the power of these simulations.
And in traditional economics,
this is also captured by something called
the Lucas critique.
It is named after a very well-known economist called Robert Lucas. He wrote this critique, essentially pointing out this issue of history being a picture, of economics not really being able to think about what-if scenarios. And really, our work is addressing that; in a way, you can think about it as addressing this issue.
Okay, so since we're talking about scenarios,
then I was wondering if you could very briefly outline
what sort of scenarios did you model and explore in your work?
As far as I was able to tell, it was mostly centered around taxation and actually even more specifically taxation of labor.
Unless I missed it, I didn't see modeling having to do with things such as taxing assets or other capital, or international trade, or this sort of thing. So can you give us a brief analysis of what you did include, how you ran the simulations, and what kind of different parameters you played with in your simulations?
Yeah, I'll try to give a brief overview.
So yeah, we only looked at income taxation, so no other forms of taxation.
And we built this spatiotemporal, two-dimensional world. And in this world, there are agents who can work: they can mine resources and then build houses and make money that way. The income that they earn through building houses is then taxed by the government. And what we asked the system to do is design a tax system that can maximize a combination of equality and productivity in this world.
And then we compared that with baseline tax policies.
And so these include the free market, a progressive tax system that is similar to the US federal tax system, and something called the Saez formula.
And we showed that the AI Economist actually finds higher, or better, combinations of equality and productivity in this world.
And so we essentially considered a lot of different layouts, spatial layouts in this world,
because the agents are actually moving around.
We considered different distributions of resources.
We also considered agents that have different skill sets
or skill levels.
And what that means is that some agents
might be really good at building high quality houses.
Some agents might not be as good at building houses, and similarly for gathering resources, and so on. And what you see is that this AI Economist, because again it's not constrained by the assumptions that these baselines make, about the shape of the tax schedule, for instance, or about how the world works, we find that this reinforcement learning framework of ours can then find these really well-performing tax policies, meaning they find higher equality and productivity combinations.
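The combined objective described here, a product of equality and productivity, can be sketched as follows. The paper's objective is, I believe, equality times productivity, with equality defined via a rescaled Gini coefficient; treat the exact normalization in this sketch as an assumption:

```python
def gini(incomes):
    """Gini coefficient: 0.0 is perfect equality; it approaches 1.0
    as a single agent earns everything."""
    n, total = len(incomes), sum(incomes)
    if total == 0:
        return 0.0
    pairwise = sum(abs(a - b) for a in incomes for b in incomes)
    return pairwise / (2 * n * total)


def social_welfare(incomes):
    """Equality times productivity, the combined objective the planner
    maximizes. Equality is rescaled so that an N-agent economy scores
    exactly 0.0 when one agent holds all the income."""
    n = len(incomes)
    equality = 1.0 - (n / (n - 1)) * gini(incomes)
    productivity = sum(incomes)
    return equality * productivity
```

For example, `social_welfare([10, 10, 10, 10])` is 40.0 (full equality, so welfare equals total output), while `social_welfare([40, 0, 0, 0])` is 0.0 despite identical total productivity, which is exactly the trade-off the planner is asked to balance.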
So how exactly did you run the comparisons
to those baseline policies?
Did you somehow code them into your system
and run your simulations based on those parameters?
No, yeah, exactly.
So for the free market, that is a baseline case where there are no taxes at all.
It is a base case, although I'll say that in the real world, nothing is really free.
There's always a little bit of tax.
In the progressive case, we took inspiration
from how the US tax system works, which means that as your income grows, your marginal tax
rate is higher. And the way you calibrate it is that the income distribution in this
simulation is similar to what you see in the US today. So you can look at the distribution of incomes. And then for the Saez tax, we essentially followed Saez's own papers, which talk about how you implement this formula if you're given data on people's behavior. So you need to do a so-called log-log regression.
So you have a bunch of tax rates that you've observed and a bunch of incomes that you've observed, and then you can regress the two. That gives you the parameters that you plug into a formula, and this formula is just the one that Saez derived using theory. So those are the three baselines that we used.
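The log-log regression mentioned here can be sketched as an ordinary least-squares fit of log income against the log net-of-tax rate; the slope is the elasticity that plugs into Saez's formula. For illustration, the top-bracket version of that formula, tau* = 1 / (1 + a * e) with Pareto parameter a, is shown; the paper applies Saez's full schedule, so treat this reduction to the top rate as a simplification:

```python
import math


def loglog_elasticity(net_of_tax_rates, incomes):
    """OLS slope of log(income) on log(1 - tax rate): the elasticity of
    taxable income with respect to the net-of-tax rate."""
    xs = [math.log(r) for r in net_of_tax_rates]
    ys = [math.log(z) for z in incomes]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var


def saez_top_rate(elasticity, pareto_a):
    """Saez's revenue-maximizing top marginal rate: 1 / (1 + a * e)."""
    return 1.0 / (1.0 + pareto_a * elasticity)
```

On synthetic data generated with a known elasticity of 0.5 (incomes proportional to the square root of the net-of-tax rate), the regression recovers 0.5 exactly; with a Pareto parameter of 2.0 the implied top rate is then 1 / (1 + 2.0 * 0.5) = 0.5.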
Another thing that drew my attention, let's say, when drilling down into the details of your work, was the fact that, unless I missed something, it seems like you only used a small number of agents in your simulation: 10, if I'm not mistaken. And I'm wondering, why did you make that choice? Was it pragmatic restrictions, like compute power, that constrained you, or something else? Because I would imagine that this doesn't come very close to a realistic economy when you only have, like, 10 laborers in the market, for example.
No, that's a great question. And I want to point out that this first paper we released is a proof of concept
where we really focused on the AI part of the problem.
Because one of the conceptual issues, again, here is that we want to get away
from this idea that you're just looking at a picture of the past.
You really want to think about what-if scenarios, where people are responding to your different policies. And so the key conceptual issue that we're addressing is that the government is trying to optimize its policy, but we can also use AI to model how the economy is going to respond in turn.
So this is something we call a two-level reinforcement learning problem.
There are really two levels of learning in the system.
And so the paper really looks at an algorithmic solution and a modeling solution to solve this two-level learning problem. And from that point of view, having ten agents in the economy, plus a government, is already quite challenging to solve. We really had to put a lot of work into finding the algorithm, finding the right mix of learning strategies, to actually make the system find these really good tax policy solutions.
And yeah, like I mean, in that sense, we can also look at other papers where we've seen similar
trends. If you look at how people play some types of video games or chess, right, these are already
really hard search and optimization problems, even though there's just two or 10 agents
in the world.
So in a similar vein, for us to solve the AI problem,
the really technical AI problem, that 10 agents is,
in some sense, already quite challenging.
But now that we're confident we have a good grasp on the learning part, like what the right strategy to learn is, we're in a great position to think about the future and to extend the work along the other dimensions as well.
Okay, so since you mentioned, well, using reinforcement learning
in other sorts of tasks, mostly games actually,
and I guess most people will be familiar with
things such as AlphaGo and so on. So do you have any sort of idea of the computational complexity
of what you're trying to do versus those scenarios? So how do they compare?
I would say that if I look at the reported public numbers and how long it took to train AlphaGo,
I think we're way more efficient with our simulation.
So I think our simulation is probably at least one or two orders of magnitude more efficient currently. Yeah, so I think we're more efficient than them.
Okay.
I think in some experiments, you also included human players.
And I was wondering if you can, well,
elaborate on the rationale for doing that
and what kind of conclusions that helped you draw
when you included human players as well.
Yeah, and that's a really good question. So in the first public version of the paper, this is back
in 2020, we ran a few human case studies where we asked people to play in a simulation too. I want
to point out that in the 2022 version of the paper, we did not
include that. We really focused on another aspect of the work. But in 2020, we certainly
were very interested to see what happens if the behavior of the AI agents is not completely
rational like a computer. Real people behave in ways that are more complex than how a computer might behave, which, at least in our case, was still more narrow than how a human behaves.
And we were really wondering if some of the solutions that we found
would actually also be effective when real people
were playing the game.
So we asked real people to build houses in this two-dimensional simulation.
We gave them a web browser, and then we actually paid them a small amount of money for every
house that they built.
It was very interesting to see how people then responded to that. And we found that you can indeed run these case studies, and that there is a weak signal that the AI Economist, again, does do better.
And I want to put a bit of a caveat around this
because the noise level is higher in the results.
And that is really because people behave in inconsistent ways. But nevertheless, we saw suggestions that the AI solutions do, again, achieve higher equality and productivity levels even when people are playing the game.
Okay, interesting. Another noteworthy thing to me, at least about
the work was that, well, again, you can correct me if I'm wrong, but I think that none of the
team of authors is actually an economist. So neither you nor your co-authors have any background
in this. However, I saw in the acknowledgments of the paper
that you did happen to consult one of the most famous,
I guess, economists in the world.
So Thomas Piketty, who also wrote a best-selling book a couple of years ago, I guess.
So I was wondering, well, how did that go, basically?
So who initiated the contact and what exactly did your consultations include?
And do you see that continuing? Is this a collaboration that you're going to keep up going forward with this work?
Right. Yeah, no, you're absolutely right.
When we first started out, we didn't have an economist on board. So we partnered with David Parkes, who sits both in computer science and economics.
And so over the course of the work, we did talk to economists and got their opinions, got their feedback.
We did have an exchange with Thomas Piketty. He's a very busy man, so I think he found the work interesting.
He also raised questions about, to some degree,
how the policies could be implemented.
And you can think of this from many dimensions.
But overall, he was interested in the work, and I think that reflects the broader response from the economics community: there's both interest and there are questions about whether this is implementable and what we would need to do that. It's food for thought for the economics community.
Okay, so I guess that also touches upon the last set of questions that I had,
which was basically around the way forward.
So from the looks of it,
and I guess you also just mentioned yourself earlier,
this is basically a proof of concept at this point.
So what do you see as the next steps, basically? What are your next goals, and what's the framework for implementing those next steps?
Basically, I mean, you did mention in the beginning that you are in a good position of having an organization that supports you in your endeavor.
However, I also wonder how this could be of practical applicability for Salesforce, basically.
Right.
So this work sits in the AI for Social Good or AI for Society, part of the AI research
organization. So I want to make clear that the AI Economist right now is purely meant for research and social good.
And the way forward is really to make this broadly useful
and have some positive social impact.
And so one of the directions that we're really going for
is thinking about how you can get closer
to the real world with this.
And that means that we want to build bigger
and better simulations that are more accurate,
more realistic, that go far beyond
what people have been doing before.
Because we believe that that will be a key component of a framework for economic modeling
and policy design.
And a big part of that for AI researchers also is to prove that you can actually trust
these methods.
So you want to show things like robustness, and you want to show things like explainability.
So we want to, for instance, tell everyone, like, here's the reasons why the AI recommended this or
that policy. And we also have a strong belief in this as an interdisciplinary problem. I think what's really
the opportunity here is for AI researchers to work together with economists, to work together
with policy experts in understanding not just the technical dimensions of their problem,
but also to understand how that technology can be useful for society.
How can you build trust in systems?
What are the requirements for that kind of system?
So I think there's a fair bit of education here where today economists are not trained
as computer scientists.
They typically are not taught programming in Python, for instance, in their education.
And things like reinforcement learning might also not be something that is part of their standard
curriculum and their way of thinking. And I think that there's a really big opportunity here for
interdisciplinary research. And I think that if we can work together, then we can build trust and we can understand really what both the social and
technical aspects are of the policy design problem.
And then we hope that with the improvements in technology, we can really make this into
a great policymaking framework for the future. Okay, if I may point out a couple of interesting facts to
add to what you just said. The first one is that, well, it seems that the goal setting for the
system is actually not part of the simulation. So it's done externally, so whoever uses it
chooses what they want to optimize for.
And that means that you can use it for many different scenarios. So that's the
first. And the second one is that it seems that you have published probably the
entirety of the code as open source on GitHub. So again that means that it's
open for people to experiment and even add and modify the code.
That's absolutely right.
So the cornerstone of this project and this idea is that
we want to have full transparency,
especially if in the future iterations of these types of systems,
if they are going to be used for social good,
then everyone should be able to inspect these systems,
to question the system, to critique the system,
and we strongly believe in full transparency.
That's why we open source all the code for the paper.
We've open sourced the code for the training routine.
We've open sourced the experimental data that our paper is
based on and we strongly encourage everyone who works in this field in these intersections to do
the same because that's the only way that we can have a broadly supported technology out there
And the second point, about the objectives, that's also a key feature. So we want to be very clear that the system does not make a choice on what combination of equality and productivity should be optimized for. That is still really up to people to decide. And so we really see this as a very powerful sort of testing ground for your ideas, you know, a very powerful advisory tool. If people want to evaluate a policy in the economy, then they could use the AI Economist framework to inspect it, to see what
the system recommends and why, and then have a very grounded and transparent debate about the
various trade-offs. Because ultimately, a key part of economics is that there are always trade-offs
that we have to make in the economy: how much equality, how much productivity. You can also think about sustainability and other angles. And so, at the end of the day, the way that technology can help you is by giving you insight into what the optimal trade-off is, right? What is the optimal Pareto frontier that you can find? And that is really the power
So having that flexibility again is a key part and a key benefit of the framework.
Great. Well, to be honest with you, I'm glad that you say so
because, as a fellow European and a fellow researcher in a past life,
I was also involved in similar types of efforts, let's say.
And so basically collective decision making,
you can come at it from two different angles.
One is like the top down, okay,
here's this perfect modeling of the world.
The other one is the bottom up,
like deliberating and kind of debating,
let's say around issues.
And it sounds like the approach that you want to take with this actually sort of combines those two, because without transparency, it will be really hard to implement those decisions in the real world, whenever, you know, the modeling becomes mature enough for that to actually happen.
No, exactly. I think the compelling part of this
is that what AI allows you to do is really grasp and try
to model the economy in its full complexity.
That's the bottom-up approach.
We can really think about the lowest, smallest
elements of an economy.
At the same time, yes, people can give these high-level objectives,
and that's sort of top-down view.
And the beautiful thing about AI is that because of the combination
of these things, it's really a new frontier to understand economic complexity.
And the fact that you can use AI
to have this full 360-degree view of the economy
without making a lot of assumptions or simplifications,
that's a really powerful way to think about the economy
And moving forward, there's going to be a lot of benefit from having this 360-degree view of the economy.
So I guess your immediate audience at least would be economists, obviously.
And I guess that part of making this effort more accessible to them would be to create some UI around it, or somehow make accessing it easier?
Right, so we are in constant
conversation with economists, and we're presenting this work in the scientific community. So we are doing rigorous science.
We have a number of exciting projects that are sort of ongoing
right now that I can't talk about publicly right now, but I'm very happy to once they
do become public. And yes, I think part of it is to do a bit of education to make people
familiar with this approach. It's possible that better
UIs could help with that for sure.
Yeah, and I think
there's a lot of exciting work here to do
to spread the word
and to educate people
and to engage
technical and non-technical
experts in this
subject matter.
I hope you enjoyed the podcast.
If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook.