Microsoft Research Podcast - 053 - Chasing convex bodies and other random topics with Dr. Sebastien Bubeck

Starting point is 00:00:00 You want science to be based on scientific facts. That's what I like about the theoretical computer science community is that they have well-established long-standing open problems. If you do make progress on those problems, then people will listen to you rightfully because it means that you have some new techniques, some things that people before couldn't do.

Starting point is 00:00:22 You're listening to the Microsoft Research Podcast, a show that brings you closer to the cutting edge of technology research and the scientists behind it. I'm your host, Gretchen Huizinga. Dr. Sebastian Bubeck is a mathematician and a senior researcher in the Machine Learning and Optimization Group at Microsoft Research. He's also a self-proclaimed bandit who claims that despite all the buzz around AI, it's still a science in its infancy. That's why he's devoted his career to advancing the mathematical foundations behind the machine learning algorithms behind AI. Today, Dr. Bubek

Starting point is 00:01:02 explains the difficulty of the multi-armed bandit problem in the context of a parameter and data-rich online world. He also discusses a host of topics from randomness and convex optimization, to metrical task systems and log-in competitiveness, to the surprising connection between Gaussian kernels and what he calls some of the most beautiful objects in mathematics. That and much more on this episode of the Microsoft Research Podcast. Sebastian Bubeck, welcome to the podcast. Thank you.

Starting point is 00:01:52 You're a senior researcher in the newly minted or relatively newly minted machine learning and optimization group at MSR, which used to be called the theory group. And now there's a couple of subgroups under this new umbrella. So give us a quick picture of the org chart and who does what. And then tell us how we could best frame the work you do. What gets you up in the morning? Sure. So I will start with the new organization that we have here at Microsoft Research in Redmond. So I guess a little bit more than a year ago,

Starting point is 00:02:13 we started a new lab, which is called Microsoft Research AI for artificial intelligence. So this is following, of course, all the hype and the real scientific excitement around machine learning. So we created Microsoft Research AI, and some of the members all the hype and the real scientific excitement around machine learning. So we created Microsoft Research AI, and some of the members of the former Siri group decided to join this new endeavor. And that's when we decided to create the Machine Learning and Optimization

Starting point is 00:02:35 group. And then the other half of the group went to do the algorithms group in just a basic Microsoft Research Lab in Redmond. So is that actually a group, the algorithms group? There is a group called the algorithms group. So of course, in machine learning and optimization, we also do a lot of algorithms. In fact, it is the main focus of our group, but specifically around machine learning. So you're in the machine learning and optimization side of this group. Yes. And you also work on algorithms. But what gets you particularly up in the morning?

Starting point is 00:03:06 What questions are you asking? What problems are you solving? So I guess in general, I am really passionate about online decision-making or decision-making in general. I think no matter what you think will happen in the future, humans or robots or AI, there will always be a need for making decisions.

Starting point is 00:03:24 And the questions, the mathematics that surround decision-making, I find them really fascinating, and they are still in their infancy. So there is really a lot to be done there. And somehow, traditionally, pure mathematicians have not looked at those questions so much. So there is really this gap between beautiful mathematics that has been developed, say, for physics or slightly more recently for computer science. But I would argue that decision making, online decision making is just as important today as those more traditional topics.

Starting point is 00:03:57 How does your group doing very theoretical work on the math of machine learning interact with other groups at Microsoft Research or even the discipline writ large. Right. So, you know, these days in the media, we hear a lot about artificial intelligence. As much as we would want to have it, an artificial intelligence that, you know, make recommendations,

Starting point is 00:04:19 that tells us what we should cook for the next day, you know, help us sleep better, whatever, it doesn't exist yet. And what this means is that we are still very much working on the foundations. So we have a couple of techniques that do work great, including gradient descent, convolutional neural networks, things like that. They really do work very well. And naturally, big companies are investing a lot in those and trying to, you know, engineer around them as much as possible, get the full stack, get the hardware ready, you know, all of those things.

Starting point is 00:04:52 But there are many, many other things that we would want to do that are out of reach for the current techniques. So what we really need to do is to develop fundamentally new tools. And that's what our group is doing. Well, we couldn't talk about the mathematics of machine learning and sequential decision-making without talking about the multi-armed bandit problem. It's known, as you say, to be simple, I'm making air quotes here, and yet incredibly rich. So explain this essential problem and tell us what's new in the research right now. Right. So the multi-armed bandit problem is actually the first real research problem that I looked at now a little bit more than a decade ago. So it's a mathematical question

Starting point is 00:05:32 that conceptualizes the problem of exploration versus exploitation. So, you know, in our everyday life, we have to deal with this problem, you know, do we order from the same restaurants that we know we love or do we try this new restaurant that we don't really know yet whether it's good or not? So maybe I should explain where does the name come from? That's a good idea. Yeah. So in American slang, a one-armed bandit is just a slot machine at a casino. And you can imagine that in the multi-armed bandit, you're facing, you know, many of those one-armed bandits. And for every coin that you have, you have to choose which bandit you're going to put your coin in and, you know, pull the arm. So we refer to the different actions as different arms of the bandit.

Starting point is 00:06:16 OK, so that's where the name comes from. And it really got popularized in the 1950s by Herbert Robbins. And at the time, it's actually quite interesting. The multi-armed bandit problem was viewed as a negative result in the sense that in the 1930s, people developed classical statistics. So people like Fisher or Neiman and et cetera. So they developed classical statistics

Starting point is 00:06:38 and then they wanted very naturally to move to online statistics. So what happens if the data is not given to you all at once, but it's coming to you sequentially? And then they realized that the questions were really tricky, really difficult. And then Robbins came along and said, look at this multi-armed bandit problem. It's so simple, yet we have no idea how to solve it.

Starting point is 00:06:58 So it was more viewed as a negative problem. But then what happened is that in the early 2000, a huge, a very important application appeared on the scene, which is ad placement on the internet. So ad placement on the internet is exactly a bandit problem. You can imagine each user coming to your website as, you know, one time step of your problem. And then for that user, you have to select which ad to present. So each ad is an arm of your bandit. And then when you present the ad, you see whether the person clicks or not. You see whether the treatment was efficient or not.

Starting point is 00:07:30 And then you move on to the next one. So it's a new important application of multi-armed bandit. So now what has happened since 100 years ago? So what has happened is that now we really care about applications where the number of actions, the number of arms is huge. It's very, very large. Another application, in fact, is a website display. So you can imagine that you have many

Starting point is 00:07:50 different ways to display your website and, you know, users are going to prefer certain ways. But this is a combinatorial problem. You know, whether you place the ad at the top or at the bottom of the page, whether you use the color green or the color red. So you have many, many possible choices, which leads to a combinatorial explosion of possibilities. And in turn, this means that you have to really think very deeply on how to leverage the experience that you had with a certain arm to other arms. You have to understand what are the correlations between those different arms. So this is what modern multi-armed bandits is about.

Starting point is 00:08:39 Let's talk about a project you're involved in on the foundations of optimization. So optimization methods are, as you call them, the engine of machine learning algorithms. Give us an overview of the research directions in this arena, particularly convex optimization, and explain why they're important to the theory behind machine learning algorithms. So today's optimization problem in machine learning are really characterized by two key features. One of them is that they deal with optimization problem in very high dimension. So by that, what I mean is the number of parameters of your model. These are the number of variables that you're going to do optimization over.

Starting point is 00:09:21 We are not talking about optimizing a single parameter in dimension one or, you know, two or three parameters in dimension two or three that are easy to visualize for humans. We're talking about millions, tens of millions, probably billions in the future. That's the space we're dealing with in optimization today. So that's one of the big obstacles. The other one is just the sheer amount of data. So the data is so huge that you have to be very careful about the operations that you do with them. You cannot visit your whole data set a million times. It's just not possible. You have to do that more parsimoniously. So this begs the question, in turn, when you go back to the optimization literature,

Starting point is 00:10:01 on how to make the most of the calculations that you have already done. And what are the basic calculations that you can do to optimize a function? Well, the basic thing you can do is to calculate its derivative, its gradient. And now you're asking the question, given that the gradients that you have observed so far, how to best combine them to calculate the next gradient. So this optimal usage of your information, it has a very long history in optimization, in particular in the Russian school. And they developed optimal methods in the 70s and the 80s.

Starting point is 00:10:36 And we're only now starting to really understand the fundamental concept behind it. Really? Yes. So it's actually a quite interesting story. At the end of the 70s, Arkady Nemirovsky, who is the leading figure in the Russian school in optimization, he discovered that when the function that you're optimizing is smooth, you can actually do better than the simple gradient descent technique.

Starting point is 00:11:00 You can leverage the past information in a better way. So he designed an accelerated technique, which later Yuri Nesterov, who is now a legendary figure in machine learning, simplified dramatically and he made a very practical algorithm out of it, which is now called Nesterov's momentum technique. So this momentum technique was first designed in the context of convex optimization

Starting point is 00:11:24 as a way to obtain an optimal method. And nowadays, it's one of the most practical techniques for training deep neural networks, which are non-convex functions. And, you know, we don't really understand why momentum is working there, but that's the story of that technique. So there are things you do that you don't totally understand, but they're working, so you keep doing them. That's right. But what our group is actually doing is trying to understand why they are working. So in particular for this momentum technique, we worked a lot on it. And a couple of years ago with my good collaborator, Yin-Tat Lee, who is now an assistant professor at the University of Washington, we found a way to understand the momentum from a geometric

Starting point is 00:12:05 perspective. And this is very useful because when you start to be able to draw a picture, you get better intuition. So this was, at least for us, a very important step towards understanding better momentum. You have two big areas of interest. Well, you have a lot of areas of interest, let's be honest. But you've noted that two big areas of your interest are the interplay between convexity and randomness in optimization, and also inference problems on random graphs. So I wonder if you could just unpack those a little bit. Why are you interested in them? What do they address? Why are they important enough for someone with a brain like yours to tackle? So as I said before, nowadays we really deal with very high dimensional spaces. And one key technique to explore those high dimensional spaces is to use randomness. And

Starting point is 00:12:55 in classical convex optimization, randomness was not really a tool that people were looking at. They were more focused on trying to understand what can you do with deterministic methods. So it seems to me that there was, and there still is to some extent, a real opportunity to bring in those randomized algorithm tools into convex optimization and hopefully get a couple of breakthroughs on the way. So that was for the first area. The second area actually was one of my areas of focus for a couple of years, which comes from the observation that nowadays many data sets come in the form of a graph.

Starting point is 00:13:35 And what you want to do is you want to do basic statistics with it. What are, you know, the basic things that you want to compute with those graphs? To my surprise, there were some questions that seemed really fundamental to me that were not looked at. So there are a couple of models which are very well known now in random graph theory and in network analysis to explain, say, how the Facebook graph grew or, you know, how the Twitter graph is growing. And one basic

Starting point is 00:14:02 question that I wondered about is, okay, so you have those evolution processes, but they start with some seed graph. So the seed is not empty. It's not like, you know, Facebook started with one user. It was actually seeded with hundreds of users or maybe more. So what I was curious about is that there was this discrepancy between the reality where people were designing very carefully a good seed graph in order to enhance the growth of the network and the mathematical analysis, which was thinking about what if the seed is empty. So the first question that I asked was, is a theory without a seed relevant to the practice with a seed? And what we proved is that it's not. That,

Starting point is 00:14:45 you know, if you have two different seeds, you can get two completely different graph evolution processes. Quelle surprise. Yeah, it was actually a surprise. Yeah. We didn't expect it. We thought that maybe, you know, in the very large sample regime, at the end of time, once everything has settled down, then maybe the seed doesn't matter anymore. But that's not the case. Interesting. You have a blog called I'm a Bandit, and it deals with random topics in optimization, probability and statistics.

Starting point is 00:15:16 That's right. So first of all, I'm glad you're not knocking over banks. But the big topic here is bandit convex optimization. Talk about this in general and then talk about your polynomial time algorithm for bandit convex optimization. We'll go kind of deep here. I'd like you to get as technical as you like, but also we'll nod to the fact that you've got a great video on this on the website and three big blog posts on it that kind of goes deep into the weeds. Yeah. Yes. So banded convex optimization is the following problem. So as I said before,

Starting point is 00:15:50 the new research on the multi-armed banded problem is really interested in the case where there are many, many actions. So now you can ask, okay, what about the limit case where there is actually an infinite number of actions, a continuum of actions? So then, of course, if you don't make any assumption about how the rewards are shaped over this set of actions,

Starting point is 00:16:10 there is no hope. You cannot hope to find, you know, what is the best action. But as we discussed already, one of the very basic assumptions which has found a lot of application in machine learning is convexity. So then the question becomes, in these bandit scenarios, where you have a continuum of actions and the loss function is shaped as a convex function, what can you do?

Starting point is 00:16:31 Can you still recover the optimal guarantees that we had before with just two or three actions? So this was an open problem that was published in 2004. And when I joined Microsoft Research five years ago, the theory group and some of the members in the theory group, in particular, Ofer Dekel, who is now the manager of the machine learning and optimization group, and Yuval Peres, who is a distinguished probabilist, they were talking and thinking about this problem.

Starting point is 00:16:59 And when I joined, you know, I was coming with my own expertise on bandits, and I was really excited to join the group and start thinking about the questions that those guys cared about. Yeah. And we were very fortunate that we made good progress on it. So our first stride into the problem was, in fact, to show something a little bit unique in machine learning, which was we showed that the problem is solvable, but we didn't actually give a technique to solve it. So that was both interesting and disappointing. That don't make no sense.

Starting point is 00:17:34 Yes, it was strange. We were all surprised by this result to some extent, but also very excited because it meant that there is something to be discovered there. The problem is solvable, but it is certainly not solvable with the current techniques that we have. I want you to drill in there for a second because you're actually saying something that baffles me. You could solve it, but you don't know how you solved it? So it's more as follows. There was this very concrete question.

Starting point is 00:18:02 Can you get square root T regret for banded convex optimization? What we showed is that the answer to this question is yes. Yes, it is in principle possible to get square root T regret for banded convex optimization, but we did not give an algorithm that actually achieves this regret guarantee.

Starting point is 00:18:22 So then the question remained, can you actually get a polynomial time algorithm, like an efficient algorithm, something that would actually work on real data and that gets this quality type behavior? Okay. So that was the state of this problem when Yin-Tat Lee came for an internship with me

Starting point is 00:18:40 in the summer 2016. And we started thinking about this question. And again, we were lucky to come up with a new approach based on kernels. So kernel methods have a long and distinguished history in machine learning. In fact, they were the most popular technique before deep learning overtook the field. And what we found is a

Starting point is 00:19:05 connection between bandit-type information and kernels, that there is really an interplay between the two. And while we were exploring this to our own bafflement, we discovered that there is a connection between kernels and what are called Bernoulli convolution. So Bernoulli convolutions, they are a mathematical object that were defined in the 1930s that Paul Erdos, the famous mathematician, thought of as one of the most beautiful objects in mathematics. And these Bernoulli convolutions, they are directly related to the Gaussian kernel in machine learning. And they are exactly what we needed to solve banded convex optimization. And it turned out that the world expert on this topic was, you know, my office mate sitting next door, Yuval Perez. So it was very convenient that I could go to him and ask him all the questions around

Starting point is 00:20:01 this. And then later, Ronan Eldan joined us on this project, who was also one of the members of the team with which we proved that there is a solution, but we don't know what is the solution. I just have the biggest grin on my face. Every podcast I hear things that delight me and nobody can see my face. Let's talk about task systems. As a level set, tell us what they are and what they're for, and then talk about

Starting point is 00:20:44 metrical task systems and how they're different. So we have discussed so far the multi-armed bandit problem, which is a very fundamental and important problem, but it's missing a notion of a state. So in most of the real-life problems that we care about, it's not like you take an action and it has no consequence on the future. In fact, when you take action, often the state of the world or the state of the system that you're acting on is evolving. And you have to take that into account. So task systems are a very fundamental mathematical modelization of this problem. And it goes like this. So in task system, you have a finite state space.

Starting point is 00:21:17 I'm going to say that n is the size of the state space. And you're again going to play a sequential game, just like in Multi-Arm Bandit. And at every time step, what you're presented with is a task. And what is a task? A task is a cost function on the state space. So a task just tells you for every state, how long it's going to take you to complete the task if you're sitting in that state. So now in the game, you're sitting in a certain state, you know, because of what you have done in the past, you see the new task coming at you, and the question is, to which new state do you move to complete the task?

Starting point is 00:21:51 Now, of course, stated like this, you would just move to the state that has the minimal completion time for the task. But that's where the metrical part comes into play, the metrical task system. So the metrical part is that play, the metrical task system. So the metrical part is that states, they can be far away from each other. It's not like you can easily move, you know, from a state where, say, you know no mathematics to a state where you're a world expert in mathematics.

Starting point is 00:22:15 This is going to cost you a lot to make this movement, you know, to change your own state. So now you can think of the state space as a metric space, meaning that there is a notion of distance between two states. And when you're sitting in a certain state and you get your new task, when you move to a new state, you're going to pay the distance to travel there, plus the completion time. So these metrical task systems were introduced in the 80s in the theoretical computer science community. And as I defined it, it's a very discrete problem. And as expected, people used discrete mathematics technique to try to solve this problem. So the big conjecture, the big open problem in that subfield is the following.

Starting point is 00:22:57 So you say that an algorithm for the metrical task system problem is competitive if it has the following property, no matter what the task sequence is, the algorithm's performance, the total cost of the algorithm is going to be comparable to the best it could have done had it known in hindsight what was the sequence of tasks to be presented. That's what it means to be competitive.

Starting point is 00:23:19 It basically means you can see into the future. So we're going to say that you're alpha competitive if not only you're comparable to the best you could have done, but you're at most paying alpha times what the best could have been. So, you know, maybe you want to be two competitive or three competitive. And the conjecture in that field is that you can be log n competitive for any metrical task system. So we're not very far from proving this conjecture.

Starting point is 00:23:44 We are basically a polynomial factor off. But the important thing is that people were really looking at this problem from a discrete mathematics perspective. And what we brought in a year ago with my great collaborators, Michael Cohen, Yin-Tat Li, and James Li, is that we looked at it from a convex optimization perspective. So we view this very discrete problem in continuous space and continuous time. And once you make that modification, then it's very natural to again talk about gradient descent. In fact, this other strategy, which is called mirror descent.

Starting point is 00:24:19 And using this new continuous strategy, we were able to refine the guarantees that we know about this problem. So this was for the discrete metrical task system. Now there is a convex version of the metrical task system. Because just as for discrete multi-armed bandits, you know, there is a lot of interest into going into the case where there is a huge set of actions, set of arms. It's the same thing for metrical task system. You might be interested in a case where the set of states is set of arms. It's the same thing for metrical task system. You might be interested in a case

Starting point is 00:24:45 where the set of states is in fact infinite. So then if n is infinity, you know log n is also infinity and you're not competitive. So then the question, yes, indeed, indeed, this is very disappointing. And so the question is, just in metrical task system,

Starting point is 00:25:02 when you task as a convex function over the state space, can you be competitive at all? So this was an open problem from the early 90s, from Lineal and Friedman. And what they proved is that in dimension two, you can be competitive. Very restrictive.

Starting point is 00:25:18 I mean, certainly not talking about the millions of dimensions that we opened with. And what we just proved recently, we just put out the paper a month ago, we proved that indeed you can be competitive for convex metrical task system in any dimension, but the competitive ratio that you get is going to depend on the dimension.

Starting point is 00:25:37 And a great open question is whether you can get polynomial in the dimension scaling. That seems very difficult. All the rest of it seems very difficult also. And this then, as you referenced, solves a 30-year-old problem. Right. So in this literature,

Starting point is 00:25:54 they don't talk about convex metrical task system. Instead, they talk about this problem that they call chasing convex bodies. So in chasing convex bodies, you are moving a point in Euclidean space. So let's say you're moving a point with D parameters. And every time step, you're given a convex set of constraints. And what you have to do is you have to move your set of parameters inside this convex set.

Starting point is 00:26:15 So you have to chase the convex set. You have to go inside the convex set. And you see this sequence of convex sets that are given to you and you keep moving. And then again, you want to be competitive. So you want to compare yourself to the best you could have done in hindsight had you known the sequence of convex bodies right from the start. And so indeed, we do show some of the issues with what we call the current machine learning culture versus the culture of theoretical computer science,

Starting point is 00:27:02 which is more traditional. What are some of the things we ought to be paying attention to right now? Is there anything that keeps you up at night? Anything we ought to be thinking about? Yeah, sure. So I personally believe that science is a fundamentally slow enterprise. It takes time to digest new concepts. It takes time to really recognize whether something that looks new on the face of it is actually new or it's just, you know, a slightly different point of view, a slightly different perspective. Our time is a little bit different from, you know, say two decades ago. And one reason is just because the amount of data that we have. So we have this influx of data and

Starting point is 00:27:39 people want to do something with it. So this, I believe, is not going away. And there will be a need for, you know, data scientists and people who really think deeply about how to use and how to leverage information that you can acquire from data. So that will not change. What I worry a little bit more about is the culture which is centered around claims which have no, as far as I can tell, no mathematical basis. You want science to be based on scientific facts. And that's what I like about the theoretical computer science community is that they have well-established, longstanding open problems. And if you do make progress on those problems, then people will listen to you rightfully

Starting point is 00:28:22 because it means that you have some new techniques, some things that people before couldn't do. So there are pros and cons for these very well-defined open problems. The cons is, of course, that those open problems, they are not solving burning practical questions that we need to resolve right now. So then, you know, you come up with a solution and it's not like people are going to implement your algorithm right away. So these are maybe not the most important question if you really want to get new algorithm right away. But on the other end, progress on this question, this is deep progress. This is progress where you really develop new tools, which hopefully maybe a couple of years from now, some more practical methods are going to emerge out of it. Tell us a bit about your history, academic and otherwise.

Starting point is 00:29:06 It's always great to hear personal anecdotes about where people came from and how they ended up where they are. What's your story? Sure. So I grew up in a small town around Strasbourg in France, next to Germany. What's it called? It's called Oberhausbergen, which means the house at the bottom of the hill. And then, you know, there is a city next to it, which is Mittel-Osbergen, the house on the middle of the hill.

Starting point is 00:29:33 And then there is Nieder-Osbergen, the house at the top of the hill. So these are the three towns where, you know, I was hanging out until I was 18. So during those first 18 years, yeah, I didn't know anything about mathematics, science. But then at 18, I went into this thing which is called PREPA. So that's famous in France. That's where you train to go into the Grandes Ecoles, Engineering Grandes Ecoles or, you know, Ecole Normale Supérieure, which is where I went, where you learn how to do mathematics. So that's where I really fell in love with mathematics. I really always remember it like the very first lecture.

Starting point is 00:30:10 I immediately thought, OK, this is the first time that somebody is saying something that really resonates with me. So two years of prepa, one year in École Normale Supérieure, and then I moved on to do my PhD. What I thought at the time I wanted to do was mathematics of the brain, but then I realized that there was no such thing. But as I realized that, I also noticed this field of machine learning, which seemed very exciting. So I started to learn more about it and then I decided to do a PhD in machine learning and specifically in the multi-armed bandit problem. So that's to learn more about it. And then I decided to do a PhD in machine learning and specifically in the multi-armed bandit problem. So that's what I did in France. So how did you end up at Microsoft Research?

Starting point is 00:30:52 After my PhD, I did one year in Barcelona for my postdoc, which was just amazing scientifically and otherwise. And then with my wife, we moved to Princeton, New Jersey, where I was an assistant professor in the OR department. I stayed there for three years. And then I did a semester at the Simons Institute in Berkeley. And that's where I met Yuval Perez. And that's also where I discovered the field of theoretical computer science, which, again, I fell in love with. And that's really where I could see, you know, my second calling. Let's say after mathematics, really theoretical computer science, I thought the people in the field were amazing.

Starting point is 00:31:34 I thought the questions they were looking at were fantastic. I thought their work ethic was very refreshing. So that's where I met Yuval Perez. And then after that, it's history. That's why I joined Microsoft Research. I came to visit and they made me an offer and I decided to join. So as of two days ago,

Starting point is 00:31:55 you have a pretty cool announcement to make. Right. So our paper, joint work with researchers in RIA in France and in Tatli, again in Tatli, just got awarded the best paper at NIPS. So it's pretty exciting. Tell us a little bit about the paper and what it's adding to the literature in the field. Right. So it's in fact tying back to one of my main area of interest,

Starting point is 00:32:19 which is convexity and randomness. So the question we were looking at is the problem of distributed convex optimization. So what happens when the objective function that you want to optimize, which depends on your data, it's not all in a single place, but the data is distributed among many servers. Then how do you optimize both the communication between the different servers, as well as the work that each server is doing? So that's the question we were looking at. Of course, we weren't very far from being the first ones to look at. There are literally hundreds, if not thousands of papers written on this topic. But somewhat surprisingly, it seems that nobody has really looked at optimal methods,

Starting point is 00:32:57 meaning provably optimal, both upper and lower bounds. So in classical serial optimization, this is again work done by the Russian school, Arkady Nemirovsky, Yuri Nesterov, and those people. But they didn't look so much at the distributed scenario. And one reason, I think, is because in this distributed scenario, in fact, randomness is key. Randomization is very, very important. And that's what our paper proves,

Starting point is 00:33:21 is that if you use randomization, then you can get optimal guarantees. That's awesome. Congratulations. Thank you. Sébastien, as we close, I always ask my guests to offer some parting advice to our listeners, whether it be in the form of inspiration for future research or challenges that remain to be solved. And your recent paper kind of teased that up a little bit. What would you say to researchers who might be just getting interested in this field?

Starting point is 00:33:49 Maybe a good advice to any starting researcher is to become an expert in a very narrow field or even narrow question, but really become the world expert in that question. And that will yield rewards for a very long time. So that's what I did with the multi-armed bandit problem. And I think this was very successful. And, you know, later on, I could see connections with this problem that I wouldn't have made

Starting point is 00:34:10 if I didn't know all of this about the multi-armed bandit problem. Now, in terms of research directions, there are really many, many questions emerging. So one of them we already mentioned is in this chasing convex body problem. We showed that you can be competitive, but our dependency on the dimension is very bad. It's exponential in the dimension. So a great open question, but difficult one, is whether you can be polynomial.

Starting point is 00:34:35 Another open question is this metrical task system to actually solve the longstanding conjecture of whether you can be log n competitive. This is a fantastic question that relates to how do you define a good notion of entropy for general metric spaces. It's really a very deep and beautiful question. We don't know how to solve this. Related to this latest best paper award at NIPS, in fact, you can think about the multi-armed bandit version of this problem, which is exactly

Starting point is 00:35:05 distributed bandit. This problem is basically entirely open. And I believe it's a very rich question. And in fact, it ties into a broader question of how do you think about multi-agent problem? And this is a very rich field yet to emerge. Sebastian Bubeck, thank you very much for coming on the podcast. It's been really, really interesting. My pleasure. I really enjoyed the questions. To learn more about Dr. Sebastian Bubeck and how theoretical computer scientists are tackling the

Starting point is 00:35:37 multi-armed bandit and other combinatorial conundrums, visit microsoft.com slash research.

Your Ad Here

Microsoft Research Podcast - 053 - Chasing convex bodies and other random topics with Dr. Sebastien Bubeck

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.