Microsoft Research Podcast - 053r - Chasing convex bodies and other random topics with Dr. Sébastien Bubeck

Episode Date: December 25, 2019

Dr. Sébastien Bubeck is a mathematician and a senior researcher in the Machine Learning and Optimization group at Microsoft Research. He’s also a self-proclaimed “bandit” who claims that, despi...te all the buzz around AI, it’s still a science in its infancy. That’s why he’s devoted his career to advancing the mathematical foundations behind the machine learning algorithms behind AI. Today, Dr. Bubeck explains the difficulty of the multi-armed bandit problem in the context of a parameter- and data-rich online world. He also discusses a host of topics from randomness and convex optimization to metrical task systems and log n competitiveness to the surprising connection between Gaussian kernels and what he calls some of the most beautiful objects in mathematics. https://www.microsoft.com/research

Transcript
Discussion (0)
Starting point is 00:00:00 When I had Sebastian Buback in the booth a year ago, he talked in depth about his work in machine learning and optimization, and also shared his passion for the mathematics that support artificial intelligence. Whether you heard Sebastian's podcast last December, or you asked Santa for some math this Christmas, I know you'll enjoy episode 53 of the Microsoft Research Podcast, Chasing Convex Bodies and Other Random Topics. You want science to be based on scientific facts.
Starting point is 00:00:30 And that's what I like about the theoretical computer science community is that they have well-established, long-standing open problems. And if you do make progress on those problems, then people will listen to you rightfully because it means that you have some new techniques, some things that people before couldn't do. You're listening to the Microsoft Research Podcast, a show that brings you closer to the cutting edge of technology research
Starting point is 00:00:55 and the scientists behind it. I'm your host, Gretchen Huizinga. Dr. Sebastian Bubeck is a mathematician and a senior researcher in the Machine Learning and Optimization group at Microsoft Research. He's also a self-proclaimed bandit who claims that, despite all the buzz around AI, it's still a science in its infancy. That's why he's devoted his career to advancing the mathematical foundations behind the machine learning algorithms behind AI. Today, Dr. Bubek explains the difficulty of the multi-armed bandit problem in the context of a parameter and data-rich online world.
Starting point is 00:01:36 He also discusses a host of topics from randomness and convex optimization, to metrical task systems and log-in competitiveness, to the surprising connection between Gaussian kernels and what he calls some of the most beautiful objects in mathematics. That and much more on this episode of the Microsoft Research Podcast. Sebastian Bubeck, welcome to the podcast. Thank you. You're a senior researcher in the newly minted or relatively newly minted machine learning and optimization group at MSR, which used to be called the theory group. And now there's a couple of subgroups under this new umbrella.
Starting point is 00:02:22 So give us a quick picture of the org chart and who does what, and then tell us how we could best frame the work you do. What gets you up in the morning? Sure. So I will start with the new organization that we have here at Microsoft Research in Redmond. So I guess a little bit more than a year ago,
Starting point is 00:02:40 we started a new lab, which is called Microsoft Research AI for artificial intelligence. So this is following, of course, a year ago, we started a new lab, which is called Microsoft Research AI for Artificial Intelligence. So this is following, of course, all the hype and the real scientific excitement around machine learning. So we created Microsoft Research AI, and some of the members of the former Siri group decided to join this new endeavor. And that's when we decided to create the Machine Learning and Optimization group. And then the other half of the group went to do the Algorithms group in just a basic Microsoft Research Lab in Redmond.
Starting point is 00:03:10 So is that actually a group, the Algorithms group? There is a group called the Algorithms group. So of course, in Machine Learning and Optimization, we also do a lot of algorithms. In fact, it is the main focus of our group, but specifically around machine learning. So you're in the Machine Learning and optimization side of this group. Yes. And you also work on algorithms. But what gets you particularly up in the morning?
Starting point is 00:03:33 What questions are you asking? What problems are you solving? So I guess in general, I am really passionate about online decision making decision-making in general. I think no matter what you think will happen in the future, humans or robots or AI, there will always be a need for making decisions. And the questions, the mathematics that surround decision-making, I find them really fascinating, and they are still in their infancy.
Starting point is 00:03:58 So there is really a lot to be done there. And somehow, traditionally, pure mathematicians have not looked at those questions so much. And so there is really this gap between beautiful mathematics that has been developed, say, for physics or slightly more recently for computer science. But I would argue that decision making, online decision making, is just as important today as those more traditional topics. How does your group doing very theoretical work on the math of machine learning interact with other groups at Microsoft Research or even the
Starting point is 00:04:32 discipline writ large? Right. So, you know, these days in the media, we hear a lot about artificial intelligence. As much as we would want to have it, an artificial intelligence that, you know, make recommendations that tells us what we should cook for the next day, you know, help us sleep better, whatever. It doesn't exist yet. And what this means is that we are still very much working on the foundations. So we have a couple of techniques that do work great, including gradient descent, convolutional neural networks, things like that. They really do work very well. And naturally, big companies are investing a lot in those and trying to, you know, engineer
Starting point is 00:05:13 around them as much as possible, get the full stack, get the hardware ready, you know, all of those things. But there are many, many other things that we would want to do that are out of reach for the current techniques. So what we really need to do is to develop fundamentally new tools. And that's what our group is doing. Well, we couldn't talk about the mathematics of machine learning and sequential decision making without talking about the multi-armed bandit problem. It's known, as you say, to be simple. I'm making air quotes here, and yet incredibly rich.
Starting point is 00:05:49 So explain this essential problem and tell us what's new in the research right now. Right. So the multi-armed bandit problem is actually the first real research problem that I looked at now a little bit more than a decade ago. So it's a mathematical question that conceptualizes the problem of exploration versus exploitation. So, you know, in our everyday life, we have to deal with this problem. You know, do we order from the same restaurant that we know we love or do we try this new restaurant that we don't really know yet whether it's good or not? So maybe I should explain where does the name come from. That's a good idea. Yeah.
Starting point is 00:06:20 So in American slang, a one-armed bandit is just a slot machine at a casino. And you can imagine that in the multi-armed bandit, you're facing, you know, many of those one-armed bandits. And for every coin that you have, you have to choose which bandit you're going to put your coin in and, you know, pull the arm. So we refer to the different actions as different arms of the bandit. Okay. So that's where the name comes from. And it really got popularized in the 1950s by Herbert Robbins. And at the time, it's actually quite interesting. The multi-armed bandit problem was viewed as a negative result in the sense that in the 1930s, people developed classical statistics. So people like Fisher or Neiman and et cetera.
Starting point is 00:07:03 So they developed classical statistics, and then they wanted very naturally to move to online statistics. So what happens if the data is not given to you all at once, but it's coming to you sequentially? And then they realized that the questions were really tricky, really difficult. And then Robbins came along and said, look at this multi-armed bandit problem. It's so simple, yet we have no idea how to solve it. So it was more viewed as a negative problem.
Starting point is 00:07:27 But then what happened is that in the early 2000, a huge, a very important application appeared on the scene, which is ad placement on the internet. So ad placement on the internet is exactly a bandit problem. You can imagine each user coming to your website as, you know, one time step of your problem. And then for that user, you have to select which ad to present. So each ad is an arm of your bandit. And then when you present the ad, you see whether the person clicks or not. You see whether the treatment was efficient or not. And then you move on to the next one. So it's a new important
Starting point is 00:07:58 application of multi-armed bandit. So now what has happened since a hundred years ago? So what has happened is that now we really care about applications where the number of actions, the number of arms is huge. It's very, very large. Another application, in fact, is website display. So you can imagine that you have many different ways to display your website. And, you know, users are going to prefer certain ways. But this is a combinatorial problem.
Starting point is 00:08:24 You know, whether you place the ad at the top or at the bottom of the page, whether you use the color green or the color red. So you have many, many possible choices, which leads to a combinatorial explosion of possibilities. And in turn, this means that you have to really think very deeply on how to leverage the experience that you had with a certain arm to other arms. You have to understand what are the correlations between those different arms. So this is what modern multi-armed bandits is about. Let's talk about a project you're involved in on the foundations of optimization.
Starting point is 00:09:11 So optimization methods are, as you call them, the engine of machine learning algorithms. Give us an overview of the research directions in this arena, particularly convex optimization, and explain why they're important to the theory behind machine learning algorithms. So today's optimization problem in machine learning are really characterized by two key features. One of them is that they deal with optimization problem in very high dimension. So by that, what I mean is the number of parameters of your model. These are the number of variables that you're going to do optimization over. We are not talking about optimizing a single parameter in dimension one
Starting point is 00:09:51 or two or three parameters in dimension two or three that are easy to visualize for humans. We're talking about millions, tens of millions, probably billions in the future. That's the space we're dealing with in optimization today. So that's one of the big obstacles. The other one is just the sheer amount of data. So the data is so huge
Starting point is 00:10:11 that you have to be very careful about the operations that you do with them. You cannot visit your whole data set a million times. It's just not possible. You have to do that more parsimoniously. So this begs the question in turn, when you go back to the optimization literature, on how to make the most of the calculations that you have already done.
Starting point is 00:10:37 And what are the basic calculations that you can do to optimize a function? Well, the basic thing you can do is to calculate its derivative, its gradient. And now you're asking the question, given the gradients that you have observed so far, how to best combine them to calculate the next gradient. So this optimal usage of your information, it has a very long history in optimization, in particular in the Russian school. And they developed optimal methods in the 70s and the 80s, and we're only now starting to really understand the fundamental concept behind it. Really? Yes. So it's actually a quite interesting story.
Starting point is 00:11:11 At the end of the 70s, Arkady Nemirovsky, who is the leading figure in the Russian school in optimization, he discovered that when the function that you're optimizing is smooth, you can actually do better than the simple gradient descent technique. You can leverage the past information in a better way. So he designed an accelerated technique, which later Yuri Nesterov, who is now a legendary figure in machine learning, simplified dramatically and he made a very practical algorithm out of it, which is now called Nesterov's momentum technique. So this momentum technique was first designed in the context of convex optimization as a way to obtain an optimal method. And nowadays, it's one of the most practical techniques for training deep neural networks, which are non-convex functions. And, you know, we don't really understand why momentum is working there, but that's the story of that technique. So there are things you do that you don't totally understand, but they're working. So you keep doing them.
Starting point is 00:12:10 That's right. But what our group is actually doing is trying to understand why they are working. So in particular for this momentum technique, we worked a lot on it. And a couple of years ago with my good collaborator, Yin Tat-at Lee, who is now an assistant professor at the University of Washington, we found a way to understand the momentum from a geometric perspective. And this is very useful because when you start to be able to draw a picture, you get better intuition. So this was, at least for us, a very important step towards understanding better momentum. You have two big areas of interest. Well, you have a lot of areas of interest, let's be honest. But you've noted that two big areas of your interest are the interplay between
Starting point is 00:12:52 convexity and randomness in optimization, and also inference problems on random graphs. So I wonder if you could just unpack those a little bit. Why are you interested in them? What do they address? Why are they important enough for someone with a brain like yours to tackle? So as I said before, nowadays we really deal with very high dimensional spaces. And one key technique to explore those high dimensional spaces is to use randomness. And in classical convex optimization, randomness was not really a tool that people were looking at. They were more focused on trying to understand what can you do with deterministic methods.
Starting point is 00:13:33 So it seems to me that there was, and there still is to some extent, a real opportunity to bring in those randomized algorithm tools into convex optimization and hopefully get a couple of breakthroughs on the way. So that was for the first area. The second area actually was one of my areas of focus for a couple of years,
Starting point is 00:13:54 which comes from the observation that nowadays many data sets come in the form of a graph. And what you want to do is you want to do basic statistics with it. What are the basic And what you want to do is you want to do basic statistics with it. What are, you know, the basic things that you want to compute with those graphs? To my surprise, there were some questions that seemed really fundamental to me that were not looked at. So there are a couple of models which are very well known now
Starting point is 00:14:19 in random graph theory and in network analysis to explain, say, how the Facebook graph grew or, you know, how the Twitter graph is growing. And one basic question that I wondered about is, OK, so you have those evolution processes, but they start with some seed graph. So the seed is not empty. It's not like, you know, Facebook started with one user. It was actually seeded with hundreds of users or maybe more. So what I was curious about is that there was this discrepancy between the reality where people were designing very carefully a good seed graph in order to enhance the growth of the network
Starting point is 00:14:57 and the mathematical analysis, which was thinking about what if the seed is empty? So the first question that I asked was, is a theory without a seed relevant to the practice with a seed? And what we proved is that it's not. That, you know, if you have two different seeds, you can get two completely different graph evolution processes. Quelle surprise!
Starting point is 00:15:20 Yeah, it was actually a surprise, yeah. We didn't expect it. We thought that maybe, you know, in the very large sample regime, at the end of time, once everything has settled down, then maybe the seed doesn't matter anymore. But that's not the case. Interesting. You have a blog called I'm a Bandit, and it deals with random topics in optimization, probability and statistics. That's right. So first of all, I'm glad you're not knocking over banks. But the big topic here is bandit convex optimization. Talk about this in general, and then talk about your polynomial time algorithm for bandit convex optimization.
Starting point is 00:15:58 We'll go kind of deep here. I'd like you to get as technical as you like, but also we'll nod to the fact that you've got a great video on this on the website and three big blog posts on it that kind of goes deep into the weeds. Yeah. Yes. So banded convex optimization is the following problem. So as I said before, the new research on the multi-armed banded problem is really interested in the case where there are many, many actions. So now you can ask, okay, what about the limit case where there is actually an infinite number of actions, a continuum of actions? So then, of course, if you don't make any assumption
Starting point is 00:16:33 about how the rewards are shaped over this set of actions, there is no hope. You cannot hope to find what is the best action. But as we discussed already, one of the very basic assumptions which has found a lot of application in machine learning is convexity. So then the question becomes, in these banded scenarios where you have a continuum of actions and the loss function is shaped as a
Starting point is 00:16:57 convex function, what can you do? Can you still recover the optimal guarantees that we had before with just two or three actions? So this was an open problem that was published in 2004. And when I joined Microsoft Research five years ago, the theory group and some of the members in the theory group, in particular, Ofer Dekel, who is now the manager of the machine learning and optimization group, and Yuval Peres, who is a distinguished probabilist, they were talking and thinking about this problem. And when I joined, you know,
Starting point is 00:17:27 I was coming with my own expertise on bandits and I was really excited to join the group and start thinking about the questions that those guys cared about. And we were very fortunate that we made good progress on it. So our first stride into the problem was in fact to show something a little bit unique in machine learning, which was we showed that the problem is solvable, but we didn't actually give a technique to solve it.
Starting point is 00:17:54 So that was both interesting and disappointing. That don't make no sense. Yes, it was strange. We were all surprised by this result to some extent, but also very excited because it meant that there is something to be discovered there. The problem is solvable, but it is certainly not solvable with the current techniques that we have. I want you to drill in there for a second because you're actually saying something that baffles me. You could solve it, but you don't know how you solved it? So it's more as follows.
Starting point is 00:18:26 There was this very concrete question. Can you get square root T regret for banded convex optimization? What we showed is that the answer to this question is yes. Yes, it is in principle possible to get square root T regret for banded convex optimization, but we did not give an algorithm that actually achieves this regret guarantee. So then the question remained, can you actually get a polynomial time algorithm, like an efficient algorithm, something that would actually work on real data and that get this square root T type behavior?
Starting point is 00:19:00 So that was the state of this problem when Yintat Lee came for an internship with me in the summer 2016. And we started thinking about this question. And again, we were lucky to come up with a new approach based on kernels. So kernel methods have a long and distinguished history in machine learning. In fact, they were the most popular technique before deep learning overtook the field. And what we found is a connection between bandit type information and kernels, that there is really an interplay between the two. And while we were exploring this to our own bafflement, we discovered that there is a connection between
Starting point is 00:19:46 kernels and what are called Bernoulli convolutions. So Bernoulli convolutions, they are a mathematical object that were defined in the 1930s that Paul Erdos, the famous mathematician, thought of as one of the most beautiful objects in mathematics. And these Bernoulli convolutions, they are directly related to the Gaussian kernel in machine learning. And they are exactly what we needed to solve banded convex optimization. And it turned out that the world expert on this topic was, you know, my office mate sitting next door, Yuval Perez. So it was very convenient that I could go to him
Starting point is 00:20:26 and ask him all the questions around this. And then later, Ronan Eldan joined us on this project, who was also one of the members of the team with which we proved that there is a solution, but we don't know what is the solution. I just have the biggest grin on my face. Every podcast, I hear things that delight me, and nobody can see my face. Every podcast I hear things that delight me and nobody can see my face. Let's talk about task systems as a level set, tell us what they are and what they're
Starting point is 00:21:07 for and then talk about metrical task systems and how they're different. So we have discussed so far the multi-armed bandit problem, which is a very fundamental and important problem, but it's missing a notion of a state. So in most of the real life problems that we care about, it's not like you take an action and it has no consequence on the future. In fact, when you take action, often the state of the world or the state of the system that you're acting on is evolving, and you have to take that into account. So task systems are a very fundamental mathematical modelization of this problem,
Starting point is 00:21:40 and it goes like this. So in task systems, you have a finite state space. I'm going to say that n is the size of the state space. And you're again going to play a sequential game, just like in Multi-Arm Bandit. And at every time step, what you're presented with is a task. And what is a task? A task is a cost function on the state space.
Starting point is 00:21:59 So a task just tells you for every state how long it's going to take you to complete the task if you're sitting in that state. So now in the game you're sitting in a certain state, you know, because of what you have done in the past, you see the new task coming at you and the question is to which new state do you move to complete the task? Now of course stated like this you would just move to the state that has the minimal completion time for the task. But that's where the metrical part comes into play, the metrical task system. So the metrical part is that states, they can be far away from each other. It's not like you can easily move, you know, from a state where, say, you know no mathematics to a state where you're a world expert in mathematics.
Starting point is 00:22:42 This is going to cost you a lot to make this movement, you know, to change your own state. So now you can think of the state space as a metric space, meaning that there is a notion of distance between two states. And when you're sitting in a certain state and you get your new task, when you move to a new state, you're going to pay the distance to travel there
Starting point is 00:23:00 plus the completion time. So these metrical task systems were introduced in the 80s in the theoretical computer science community. And as I defined it, it's a very discrete problem. And as expected, people used discrete mathematics technique to try to solve this problem. So the big conjecture, the big open problem in that subfield is the following. So you say that an algorithm for the metrical task system problem is competitive if it has the following property. No matter what the task sequence is,
Starting point is 00:23:35 the algorithm's performance, the total cost of the algorithm, is going to be comparable to the best it could have done had it known in hindsight what was the sequence of tasks to be presented. That's what it means to be competitive. It basically means you can see into the future. So we're going to say that you're alpha competitive if not only you're comparable to the best you could have done, but you're at most paying alpha times what the best could have
Starting point is 00:23:57 been. So, you know, maybe you want to be two competitive or three competitive. And the conjecture in that field is that you can be log n competitive for any metrical task system. So we're not very far from proving this conjecture. We are basically a polynomial factor off. But the important thing is that people were really looking at this problem from a discrete mathematics perspective. And what we brought in a year ago with my great collaborators, Michael Cohen, Yin Tat-Li, and James Lee, is that we looked at it from a convex optimization perspective. So we view this very discrete problem in continuous space and continuous time. And once you make that modification,
Starting point is 00:24:39 then it's very natural to again talk about gradient descent. In fact, this other strategy, which is called mirror descent, and using this new continuous strategy, we were able to refine the guarantees that we know about this problem. So this was for the discrete metrical task system. Now there is a convex version of the metrical task system because just as for discrete multi-armed bandit, there is a lot of interest into going into the case where
Starting point is 00:25:05 there is a huge set of actions, set of arms. It's the same thing for metrical task system. You might be interested in a case where the set of states is in fact infinite. So then if n is infinity, you know, log n is also infinity and you're not competitive. So then the question, yes, indeed, indeed, this is very disappointing. And so the question is, just in metrical task system, when you task as a convex function over the state space, can you be competitive at all? So this was an open problem from the early 90s
Starting point is 00:25:38 from Lineal and Friedman. And what they proved is that in dimension two, you can be competitive. Very restrictive. I mean, certainly not talking about the millions of dimensions that we opened with. And what we just proved recently, we just put out the paper a month ago. We proved that indeed you can be competitive for convex metrical task system in any dimension. But the competitive ratio that you get is going to depend on the dimension.
Starting point is 00:26:04 And a great open question is whether you can get polynomial in the dimension scaling. That seems very difficult. All the rest of it seems very difficult also. And this then, as you referenced, solves a 30-year-old problem. Right. So in this literature, they don't talk about convex metrical task system. Instead, they talk about this problem that they call chasing convex bodies. So in chasing convex bodies, you are moving a point in Euclidean space. So let's say you're moving a
Starting point is 00:26:32 point with d parameters and every time step, you're given a convex set of constraints. And what you have to do is you have to move your set of parameters inside this convex set. So you have to chase the convex set. You have to go inside the convex set. And you see this sequence of convex set. You have to go inside the convex set. And you see this sequence of convex sets
Starting point is 00:26:48 that are given to you, and you keep moving. And then again, you want to be competitive. So you want to compare yourself to the best you could have done in hindsight had you known the sequence of convex bodies right from the start. And so indeed, we do show that you can be competitive to chase convex bodies. So going philosophical a little bit here,
Starting point is 00:27:30 we talked about some of the issues with what we call the current machine learning culture versus the culture of theoretical computer science, which is more traditional. What are some of the things we ought to be paying attention to right now? Is there anything that keeps you up at night? Anything we ought to be thinking about? Yeah, sure. So I personally believe that science is a fundamentally slow enterprise. It takes time to digest new concepts. It takes time to really recognize whether something that looks new on the face of it is actually new or it's just, you know, a slightly different point of view, a slightly different perspective.
Starting point is 00:27:57 Our time is a little bit different from, you know, say two decades ago. And one reason is just because the amount of data that we have. So we have this influx of data and people want to do something with it. So this, I believe, is not going away. And there will be a need for, you know, data scientists and people who really think deeply about how to use and how to leverage information that you can acquire from data. So that will not change. What I worry a little bit more about is the culture which is centered around claims which have no, as far as I can tell, no mathematical basis. You want science to be based on scientific facts.
Starting point is 00:28:35 And that's what I like about the theoretical computer science community is that they have well-established, long-standing open problems. And if you do make progress on those problems, then people will listen to you rightfully because it means that you have some new techniques, some things that people before couldn't do. So there are pros and cons for these very well-defined open problems. The cons is, of course, that those open problems, they are not solving burning practical questions that we need to resolve right now.
Starting point is 00:29:06 So then, you know, you come up with a solution and it's not like people are going to implement your algorithm right away. So these are maybe not the most important question if you really want to get new algorithm right away. But on the other end, progress on this question, this is deep progress. This is progress where you really develop new tools, which hopefully, maybe a couple of years from now, some more practical methods are going to emerge out of it. Tell us a bit about your history, academic and otherwise. It's always great to hear personal anecdotes about where people came from and how they ended up where they are. What's your story? Sure.
Starting point is 00:29:40 So I grew up in a small town around Strasbourg in France, next to Germany. What's it called? It's called Ober-Osbergen, which means the house at the bottom of the hill. And then, you know, there is a city next to it, which is Mitte-Osbergen, the house on the middle of the hill. And then there is Nieder-Osbergen, the house at the top of the hill. So these are the three towns where, you know, I was hanging out until I was 18. So during those first 18 years, yeah, I didn't know anything about mathematics, science. But then at 18, I went into this thing which is called Prépa.
Starting point is 00:30:18 So that's famous in France. That's where you train to go into the Grandes Écoles, Engineering Grandes Écoles or, you know, École Normale Supérieure, which is where I went, where you learn how to do mathematics. So that's where I really fell in love with mathematics. I really always remember it, like the very first lecture, I immediately thought, okay, this is the first time that somebody is saying something that really resonates with me. So two years of prepa, one year in École Normale Supérieure, and then I moved on to do my PhD. What I thought at the time I wanted to do was mathematics of the brain, but then I realized that there was no such thing.
Starting point is 00:30:59 But as I realized that, I also noticed this field of machine learning, which seemed very exciting. So I started to learn more about it. And then I decided to do a PhD in machine learning and specifically in the multi-armed bandit problem. So that's what I did in France. So how did you end up at Microsoft Research? After my PhD, I did one year in Barcelona for my postdoc, which was just amazing scientifically and otherwise. And then with my wife, we moved to Princeton, New Jersey, where I was an assistant professor in the OR department. I stayed there for three years.
Starting point is 00:31:36 And then I did a semester at the Simons Institute in Berkeley. And that's where I met Yuval Perez. And that's also where I discovered the field of theoretical computer science, which, again, I fell in love with. And that's really where I could see, you know, my second calling. Let's say after mathematics, really theoretical computer science, I thought the people in the field were amazing. I thought the questions they were looking at were fantastic. I saw their work ethic was very refreshing. So that's where I met Yuval Perez. And then after that, it's history. That's why I joined Microsoft Research. I came to visit and they made me
Starting point is 00:32:17 an offer and I decided to join. So as of two days ago, you have a pretty cool announcement to make. Right. So our paper, joint work with researchers in RIA in France and in Tatli, again in Tatli, just got awarded the best paper at NIPS. So it's pretty exciting. Tell us a little bit about the paper and what it's adding to the literature in the field. Right. So it's, in fact, tying back to one of my main area of interest, which is convexity and randomness. So the question we were looking at is the problem of distributed convex optimization. So what happens when the objective function that you want to optimize, which depends on your data, it's not all in a single place, but the data is distributed among many servers.
Starting point is 00:33:03 Then how do you optimize both the communication between the different servers as well as the work that each server is doing? So that's the question we were looking at. Of course, we weren't very far from being the first ones to look at. There are literally hundreds, if not thousands, of papers written on this topic. But somewhat surprisingly,
Starting point is 00:33:21 it seems that nobody has really looked at optimal methods, meaning provably optimal, both upper and lower bounds. So in classical serial optimization, this is again work done by the Russian school, Arkady Nemirovsky, Yuri Nesterov, and those people. But they didn't look so much at the distributed scenario. And one reason I think is because in this distributed scenario, in fact, randomness is key. Randomization is very, very important. And that's what our paper proves, is that if you use randomization, then you can get optimal guarantees.
Starting point is 00:33:56 That's awesome. Congratulations. Thank you. Sebastian, as we close, I always ask my guests to offer some parting advice to our listeners, whether it be in the form of inspiration for future research or challenges that remain to be solved. And your recent paper kind of teased that up a little bit. What would you say to researchers who might be just getting interested in this field? Maybe a good advice to any starting researcher
Starting point is 00:34:18 is to become an expert in a very narrow field or even narrow question, but really become the world expert in that question. And that will yield rewards for a very long time. So that's what I did with the multi-armed bandit problem. And I think this was very successful. And, you know, later on, I could see connections with this problem that I wouldn't have made if I didn't know all of this about the multi-armed bandit problem. Now, in terms of research directions, there are really many, many questions emerging. So one of them we already mentioned is in this chasing convex body problem.
Starting point is 00:34:50 We showed that you can be competitive, but our dependency on the dimension is very bad. It's exponential in the dimension. So a great open question, but difficult one is whether you can be polynomial. Another open question is this metrical task system to actually solve the longstanding conjecture or whether you can be log n competitive. This is a fantastic question that relates to how do you define a good notion of entropy
Starting point is 00:35:16 for general metric spaces? It's really a very deep and beautiful question. We don't know how to solve this. Related to this latest best Paper Award at NIPS, in fact, you can think about the multi-armed bandit version of this problem, which is exactly distributed bandit. This problem is basically entirely open. And I believe it's a very rich question.
Starting point is 00:35:39 And in fact, it ties into a broader question of how do you think about multi-agent problem? And this is a very rich field yet to emerge. into a broader question of how do you think about multi-agent problem? And this is, yeah, a very rich field yet to emerge. Sebastian Bubeck, thank you very much for coming on the podcast. It's been really, really interesting. My pleasure.
Starting point is 00:35:57 I really enjoyed the questions. To learn more about Dr. Sebastian Bubeck and how theoretical computer scientists are tackling the multi-armed bandit and other combinatorial conundrums, visit Microsoft.com slash research.

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.