Orchestrate all the Things - Salesforce's AI Economist research wants to explore the equilibrium between equality and productivity. Featuring Stephan Zheng, Salesforce Lead Research Scientist, Senior Manager, AI Economist Team
Episode Date: July 13, 2022
Economic theory is known to be constrained by a number of inefficiencies in its modeling. Salesforce researchers claim AI can help address that, leading to more robust economic policies. Article... published on ZDNet
Transcript
Welcome to the Orchestrate All the Things podcast.
I'm George Anadiotis and we'll be connecting the dots together.
Economic theory is known to be constrained by a number of inefficiencies in its modeling.
Salesforce researchers claim AI can help address that,
leading to more robust economic policies.
I hope you will enjoy the podcast.
If you like my work, you can follow Linked Data Orchestration on Twitter,
LinkedIn, and Facebook. Yeah, thanks so much, George, for having me on your podcast,
and nice to meet you too, it's the first time we meet. So I'm Stephan Zheng. I'm an AI researcher at Salesforce.
I lead the AI economist team here. I'm originally from the Netherlands, so I'm Dutch.
So I'm a fellow European.
My background is in physics and math.
And then during my PhD,
I got really interested in machine learning.
I started doing my PhD around the time
that deep learning exploded onto the world in 2014.
And the motivation for this work really is
to push the boundary of machine learning to discover the principles of general intelligence, but also to do social good.
And, you know, social economic issues are really one of the most critical issues of our time.
And so I'm in a very fortunate position that Salesforce really lives its values that allows me to
combine these two passions. So on one hand, push the boundary of AI and to find some useful
applications for it to make the world a better place.
Okay, well, great. Thanks for the introduction. And you already sort of touched upon something
that I intended to ask you about. So you did mention, well,
first of all, I have to say by way of introduction that these days, when most people hear about AI, they sort of automatically start thinking about machine learning too, which, you know, is kind of debatable in and of itself. But actually, within machine learning, which I know that you have used, when I first stumbled upon your work, let's say, I kind of assumed that you would be doing the typical machine learning approach, which is basically: you get a ton of data, then you do some feature engineering, you train your model, and so on.
It turns out, however, that that's not what you did.
You took a different path.
So a technique called reinforcement learning.
And so I thought it's actually a good opportunity to ask you to first sort of, you know, at
a very high level, explain reinforcement learning for the benefit of people who may be listening
and sort of
point out the differences and then also explain why you chose to apply this method in your
effort.
Yeah, that's a great question.
So yeah, in our research, we use reinforcement learning because we've shown that it's a really
powerful computational modeling
framework for economics. So the first variation of machine learning you described is, broadly speaking, supervised learning or variations thereof: somebody gives you a static data set, and you try to learn patterns in the data. In reinforcement learning, instead, you have a simulation, an interactive environment, and the algorithm learns by looking at the world and interacting with the simulation. From that, it can actually play around with the environment and change the way that the environment works. And so in that sense, reinforcement learning is more flexible. With a data set, you've basically been given a picture of the world, so to say, and that picture is static. But in reinforcement learning, you can actually try to change the world with your behavior, and so on.
And really there are three parts
to this reinforcement learning approach.
There's the simulation itself.
Then there's the optimization of the policy part.
And then the third one is data, too, because you can use data to inform how your simulation works. So in that sense, reinforcement learning is a bigger, more flexible framework, and it's a really useful tool to think about how the economy works and how you might optimize your policy in that world.
Okay, I see.
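The interaction loop Zheng describes, an agent acting in an environment and being rewarded, can be sketched in a few lines of Python. The `ToyEconomy` environment and its "work"/"rest" actions here are hypothetical stand-ins for illustration, not the paper's actual simulation:

```python
class ToyEconomy:
    """Hypothetical stand-in for the interactive environment described above:
    the agent acts, the world changes state, and a reward comes back."""

    def __init__(self):
        self.resources = 10  # finite resources the agent can gather

    def step(self, action):
        """Apply one action and return the reward it earned."""
        if action == "work" and self.resources > 0:
            self.resources -= 1  # acting changes the environment itself
            return 1.0           # income from gathering and building
        return 0.0               # resting (or an exhausted world) earns nothing


def run_episode(policy, steps=20):
    """Roll out a policy for a fixed number of steps and sum the rewards.
    A real RL algorithm would use these rewards to update the policy."""
    env = ToyEconomy()
    return sum(env.step(policy()) for _ in range(steps))
```

For instance, a fixed always-work policy, `run_episode(lambda: "work")`, exhausts the 10 resources over the episode. The point is only that the agent's actions change the environment it is learning from, which a static data set cannot capture.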
So, disclaimer here, I'm not an economist and not even close.
I don't have any background in that.
But from the little bit of self-education I have on the subject,
I think I was able to spot another sort of touch point, let's say.
So, much of economic theory is based on the notion of agents and modeling as well.
So there's different types of rational agent behavior that people use to model economic behavior.
And I think this is what you have also tried to do in your modeling as well. So I would say that this is another way in which this technique is fit for purpose for what you want to use it for.
However, having said that, obviously going into this level of detailed modeling for economic behavior is a notoriously hard problem. And I think this is also one of the issues that you identify in your work as well: that current economic modeling is really not up to the task of depicting, let's say, complex interactions and behavior.
And so you tried to improve on that. So how did you try to do that, exactly? What are the points that you think you were able to improve through your modeling?
Yeah, that's a great question. So the first paper that we published in Science Advances really looks at two of the three parts that I mentioned before. So we looked at optimizing policies and simulation of the economy.
And in the first part, when you talk about optimizing policies,
the beauty of reinforcement learning
is that you are very flexible.
So that means that if we look at income taxes, in reinforcement learning your search space is completely flexible: you can consider any tax model, any set of tax rates that you can think of. And that really is a big difference with how traditional economics thinks, because in traditional economics, if people want to optimize their policy, they typically need to make a lot of assumptions.
That means that they constrain the shape of the tax schedule. They make a lot of assumptions on how the world works.
For instance, they might say every year the world is more or less the same. Nothing really changes that much. And that's really constraining.
It means that a lot of these methods don't really find the best policy if you consider
the world in its full richness.
If you look at all the ways in which the world can change around you.
So the second part is simulation. And the reason that simulation is such a powerful component is that it allows you to think about what-if scenarios, right?
And if we go back to the data that I mentioned: if I give you historical data, that's kind of like a picture, right? The picture is static. I know what happened, but I don't really know what would have happened if I had done something different in the past, right? And so what a simulation lets you do is have this sort of time machine, where it says: let me go back to the past, let me change the policy, and then let me try to simulate what people would have done instead. Right?
And so this modeling of how people might have responded
in an alternative scenario
where the government might have done something different,
that is really the power of these simulations.
And in traditional economics,
this is also captured by something called
the Lucas critique.
It is named after a very well-known economist called Robert Lucas. He wrote this critique, essentially pointing out this issue of history being a picture, of economics not really being able to think about what-if scenarios. And really, our work is addressing that; in a way, you can think about it as addressing this issue.
Okay, so since we're talking about scenarios,
then I was wondering if you could very briefly outline
what sort of scenarios did you model and explore in your work?
As far as I was able to tell, it was mostly centered around taxation and actually even more specifically taxation of labor.
Unless I missed it, I didn't see modeling having to do with things such as taxing assets or other capital, or international trade, or this sort of thing. So can you give us a brief analysis of what you did include, how you ran the simulations, and what kind of different parameters you played with in your simulations?
Yeah, I'll try to give a brief overview.
So yeah, we only looked at income taxation, so no other forms of taxation.
And we built this spatiotemporal, two-dimensional world. And in this world, there are agents who can work: they can mine resources and then build houses and make money that way. The income that they earn through building houses is then taxed by the government. And what we asked the system to do is design a tax system that can maximize a combination of equality and productivity in this world.
And then we compared that with baseline tax policies.
And so these include the free market, a progressive tax system that is similar to the US federal tax system, and something called the Saez formula.
And we showed that the AI Economist actually finds higher, or better, combinations of equality and productivity in this world.
And so we essentially considered a lot of different layouts, spatial layouts in this world,
because the agents are actually moving around.
We considered different distributions of resources.
We also considered agents that have different skill sets
or skill levels.
And what that means is that some agents
might be really good at building high quality houses.
Some agents might not be as good at building houses, and similarly for gathering resources, and so on. And what you see is that this AI Economist, because again it's not constrained by the assumptions that these baselines make, about the shape of the tax schedule, for instance, or about how the world works, we find that this reinforcement learning framework of ours can then find these really well-performing tax policies, meaning they find higher equality and productivity combinations.
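The combined objective described here, a product of equality and productivity, can be sketched as follows. The paper's objective is, I believe, equality times productivity, with equality defined via a rescaled Gini coefficient; treat the exact normalization in this sketch as an assumption:

```python
def gini(incomes):
    """Gini coefficient: 0.0 is perfect equality; it approaches 1.0
    as a single agent earns everything."""
    n, total = len(incomes), sum(incomes)
    if total == 0:
        return 0.0
    pairwise = sum(abs(a - b) for a in incomes for b in incomes)
    return pairwise / (2 * n * total)


def social_welfare(incomes):
    """Equality times productivity, the combined objective the planner
    maximizes. Equality is rescaled so that an N-agent economy scores
    exactly 0.0 when one agent holds all the income."""
    n = len(incomes)
    equality = 1.0 - (n / (n - 1)) * gini(incomes)
    productivity = sum(incomes)
    return equality * productivity
```

For example, `social_welfare([10, 10, 10, 10])` is 40.0 (full equality, so welfare equals total output), while `social_welfare([40, 0, 0, 0])` is 0.0 despite identical total productivity, which is exactly the trade-off the planner is asked to balance.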
So how exactly did you run the comparisons
to those baseline policies?
Did you somehow code them into your system
and run your simulations based on those parameters?
No, yeah, exactly.
So for the free market, that is a baseline case where there are no taxes at all.
It is a base case, although I'll say that in the real world, nothing is really free.
There's always a little bit of tax.
In the progressive case, we took inspiration
from how the US tax system works, which means that as your income grows, your marginal tax
rate is higher. And the way you calibrate it is that the income distribution in this
simulation is similar to what you see in the US today. So you can look at the distribution of incomes. And then for the Saez tax, we essentially followed Saez's own papers, which talk about how you implement this formula if you're given data on people's behavior. So you need to do a so-called log-log regression.
So you have a bunch of tax rates that you've observed and a bunch of incomes that you've observed, and then you can regress the two. That gives you the parameters that you plug into a formula, and this formula is just the one that Saez derived using theory. So those are the three baselines that we used.
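The log-log regression mentioned here can be sketched as an ordinary least-squares fit of log income against the log net-of-tax rate; the slope is the elasticity that plugs into Saez's formula. For illustration, the top-bracket version of that formula, tau* = 1 / (1 + a * e) with Pareto parameter a, is shown; the paper applies Saez's full schedule, so treat this reduction to the top rate as a simplification:

```python
import math


def loglog_elasticity(net_of_tax_rates, incomes):
    """OLS slope of log(income) on log(1 - tax rate): the elasticity of
    taxable income with respect to the net-of-tax rate."""
    xs = [math.log(r) for r in net_of_tax_rates]
    ys = [math.log(z) for z in incomes]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var


def saez_top_rate(elasticity, pareto_a):
    """Saez's revenue-maximizing top marginal rate: 1 / (1 + a * e)."""
    return 1.0 / (1.0 + pareto_a * elasticity)
```

On synthetic data generated with a known elasticity of 0.5 (incomes proportional to the square root of the net-of-tax rate), the regression recovers 0.5 exactly; with a Pareto parameter of 2.0 the implied top rate is then 1 / (1 + 2.0 * 0.5) = 0.5.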
Another thing that drew my attention, let's say, when drilling down into the details of your work, was the fact that, unless I missed something, it seems like you only used a small number of agents in your simulation: 10, if I'm not mistaken. And I'm wondering, why did you make that choice? Was it pragmatic restrictions, like compute power, that constrained you, or something else? Because I would imagine that this doesn't come very close to a realistic economy when you only have, like, 10 laborers in the market, for example.
No, that's a great question. And I want to point out that this first paper we released is a proof of concept
where we really focused on the AI part of the problem.
Because one of the conceptual issues, again, here is that we want to get away
from this idea that you're just looking at a picture of the past.
You really want to think about what-if scenarios, where people are responding to your different policies. And so the key conceptual issue that we're addressing is that the government is trying to optimize its policy, but we can also use AI to model how the economy is going to respond in turn.
So this is something we call a two-level reinforcement learning problem.
There are really two levels of learning in the system.
And so the paper really looks at an algorithmic solution and a modeling solution to solve this two-level learning problem. And from that point of view, having ten agents in the economy, plus a government, is already quite challenging to solve. We really had to put a lot of work into finding the algorithm, finding the right mix of learning strategies, to actually make the system find these really good tax policy solutions.
And yeah, like I mean, in that sense, we can also look at other papers where we've seen similar
trends. If you look at how people play some types of video games or chess, right, these are already
really hard search and optimization problems, even though there's just two or 10 agents
in the world.
So in a similar vein, for us to solve the AI problem,
the really technical AI problem, that 10 agents is,
in some sense, already quite challenging.
But now that we're confident we have a good grasp on the learning part, like what the right strategy to learn is, we're in a great position to think about the future and to extend the work along the other dimensions as well.
Okay, so since you mentioned, well, using reinforcement learning
in other sorts of tasks, mostly games actually,
and I guess most people will be familiar with
things such as AlphaGo and so on. So do you have any sort of idea of the computational complexity
of what you're trying to do versus those scenarios? So how do they compare?
I would say that if I look at the reported public numbers and how long it took to train AlphaGo,
I think we're way more efficient with our simulation.
So I think our simulation is probably at least one or two orders of magnitude more efficient currently. Yeah, so I think we're more efficient than them.
Okay.
I think in some experiments, you also included human players.
And I was wondering if you can, well,
elaborate on the rationale for doing that
and what kind of conclusions that helped you draw
when you included human players as well.
Yeah, and that's a really good question. So in the first public version of the paper, this is back
in 2020, we ran a few human case studies where we asked people to play in a simulation too. I want
to point out that in the 2022 version of the paper, we did not
include that. We really focused on another aspect of the work. But in 2020, we certainly
were very interested to see what happens if the behavior of the AI agents is not completely
rational like a computer. Real people behave in ways that are more complex than how a computer might behave, which, at least in our case, was still more narrow than how a human behaves.
And we were really wondering if some of the solutions that we found
would actually also be effective when real people
were playing the game.
So we asked real people to build houses in this two-dimensional simulation.
We gave them a web browser, and then we actually paid them a small amount of money for every
house that they built.
It was very interesting to see how people then responded to that. And we found that you can indeed run these case studies, and that there is a weak signal that the AI Economist, again, does do better.
And I want to put a bit of a caveat around this
because the noise level is higher in the results.
And that is really because people behave in inconsistent ways. But nevertheless, we saw suggestions that the AI solutions do, again, achieve higher equality and productivity levels even when people are playing the game.
Okay, interesting. Another noteworthy thing to me, at least about
the work was that, well, again, you can correct me if I'm wrong, but I think that none of the
team of authors is actually an economist. So neither you nor your co-authors have any background
in this. However, I saw in the acknowledgments of the paper
that you did happen to consult one of the most famous,
I guess, economists in the world.
So Thomas Piketty, who also wrote a best-selling book a couple of years ago, I guess.
So I was wondering, well, how did that go, basically?
So who initiated the contact and what exactly did your consultations include?
And do you see that continuing? Is this a collaboration that you're going to keep up going forward with this work?
Right. Yeah, no, you're absolutely right.
When we first started out, we didn't have an economist on board. So we partnered with David Parkes, who sits both in computer science and economics.
And so over the course of the work, we did talk to economists and got their opinions, got their feedback.
We did have an exchange with Thomas Piketty. He's a very busy man, so I think he found the work interesting.
He also raised questions about, to some degree,
how the policies could be implemented.
And you can think of this from many dimensions.
But overall, he was interested in the work, and I think that reflects the broader response from the economics community: there's both interest and there are questions about whether this is implementable and what we would need to do that. It's food for thought for the economics community.
Okay, so I guess that also touches upon the last set of questions that I had,
which was basically around the way forward.
So from the looks of it,
and I guess you also just mentioned yourself earlier,
this is basically a proof of concept at this point.
So what do you see as the next steps, basically? What are your next goals, and what's the framework for implementing those next steps?
Basically, I mean, you did mention in the beginning that you are in a good position of having an organization that supports you in your endeavor.
However, I also wonder how this could be of practical applicability for Salesforce, basically.
Right.
So this work sits in the AI for Social Good or AI for Society, part of the AI research
organization. So I want to make clear that the AI Economist right now is purely meant for research and social good.
And the way forward is really to make this broadly useful
and have some positive social impact.
And so one of the directions that we're really going for
is thinking about how you can get closer
to the real world with this.
And that means that we want to build bigger
and better simulations that are more accurate,
more realistic, that go far beyond
what people have been doing before.
Because we believe that that will be a key component of a framework for economic modeling
and policy design.
And a big part of that for AI researchers also is to prove that you can actually trust
these methods.
So you want to show things like robustness, and you want to show things like explainability.
So we want to, for instance, tell everyone, like, here's the reasons why the AI recommended this or
that policy. And we also have a strong belief in this as an interdisciplinary problem. I think what's really
the opportunity here is for AI researchers to work together with economists, to work together
with policy experts in understanding not just the technical dimensions of their problem,
but also to understand how that technology can be useful for society.
How can you build trust in systems?
What are the requirements for that kind of system?
So I think there's a fair bit of education here where today economists are not trained
as computer scientists.
They typically are not taught programming in Python, for instance, in their education.
And things like reinforcement learning might also not be something that is part of their standard
curriculum and their way of thinking. And I think that there's a really big opportunity here for
interdisciplinary research. And I think that if we can work together, then we can build trust and we can understand really what both the social and
technical aspects are of the policy design problem.
And then we hope that with the improvements in technology, we can really make this into
a great policymaking framework for the future. Okay, if I may point out a couple of interesting facts to
add to what you just said. The first one is that, well, it seems that the goal setting for the
system is actually not part of the simulation. So it's done externally, so whoever uses it
chooses what they want to optimize for.
And that means that you can use it for many different scenarios. So that's the
first. And the second one is that it seems that you have published probably the
entirety of the code as open source on GitHub. So again that means that it's
open for people to experiment and even add and modify the code.
That's absolutely right.
So the cornerstone of this project and this idea is that
we want to have full transparency,
especially if in the future iterations of these types of systems,
if they are going to be used for social good,
then everyone should be able to inspect these systems,
to question the system, to critique the system,
and we strongly believe in full transparency.
That's why we open source all the code for the paper.
We've open sourced the code for the training routine.
We've open sourced the experimental data that our paper is
based on and we strongly encourage everyone who works in this field in these intersections to do
the same because that's the only way that we can have a broadly supported technology out there
And the second point, about the objectives, that's also a key feature. So we want to be very clear that the system does not make a choice on what combination of equality and productivity should be optimized for. That is still really up to people to decide. And so we really see this as a very powerful sort of testing ground for your ideas, you know, a very powerful advisory tool. If people want to evaluate a policy in the economy, then they could use the AI Economist framework to inspect it, to see what
the system recommends and why, and then have a very grounded and transparent debate about the
various trade-offs. Because ultimately, a key part of economics is that there are always trade-offs
that we have to make in the economy: how much equality, how much productivity. You can also think about sustainability and other angles. And so, at the end of the day, the way that technology can help you is by giving you insight into what the optimal trade-off is, right? What is the optimal Pareto frontier that you can find? And that is really the power
So having that flexibility again is a key part and a key benefit of the framework.
Great. Well, to be honest with you, I'm glad that you say so
because, as a fellow European and a fellow researcher in a past life,
I was also involved in similar types of efforts, let's say.
And so basically collective decision making,
you can come at it from two different angles.
One is like the top down, okay,
here's this perfect modeling of the world.
The other one is the bottom up,
like deliberating and kind of debating,
let's say around issues.
And it sounds like the approach that you want to take with this actually sort of combines those two, because without transparency, it will be really hard to implement those decisions in the real world, whenever, you know, the modeling becomes mature enough for that to actually happen.
No, exactly. I think the compelling part of this
is that what AI allows you to do is really grasp and try
to model the economy in its full complexity.
That's the bottom-up approach.
We can really think about the lowest, smallest
elements of an economy.
At the same time, yes, people can give these high-level objectives,
and that's sort of top-down view.
And the beautiful thing about AI is that because of the combination
of these things, it's really a new frontier to understand economic complexity.
And the fact that you can use AI
to have this full 360-degree view of the economy
without making a lot of assumptions or simplifications,
that's a really powerful way to think about the economy
And moving forward, there's going to be a lot of benefit from having this 360-degree view of the economy.
So I guess your immediate audience at least would be economists, obviously.
And I guess that part of making this effort more accessible to them would be to create some UI around it, or somehow make accessing it easier?
Right, so we are in constant
conversation with economists, and we're presenting this work in the scientific community. So we are doing rigorous science.
We have a number of exciting projects that are sort of ongoing
right now that I can't talk about publicly right now, but I'm very happy to once they
do become public. And yes, I think part of it is to do a bit of education to make people
familiar with this approach. It's possible that better
UIs could help with that for sure.
Yeah, and I think
there's a lot of exciting work here to do
to spread the word
and to educate people
and to engage
technical and non-technical
experts in this
subject matter.
I hope you enjoyed the podcast.
If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook.